CN109862422A - Video processing method and apparatus, computer-readable storage medium, and computer device - Google Patents
Video processing method and apparatus, computer-readable storage medium, and computer device
- Publication number
- CN109862422A (application number CN201910150506.3A)
- Authority
- CN
- China
- Prior art keywords
- role
- original video
- video frame
- frame
- file
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Studio Circuits (AREA)
Abstract
This application relates to a video processing method and apparatus, a computer-readable storage medium, and a computer device. The method includes: obtaining an audio clip from an audio file; performing voiceprint recognition on the audio clip to obtain a sound type corresponding to the audio clip; obtaining subtitle text corresponding to the audio clip from a subtitle file; obtaining an original video frame sequence corresponding to the audio clip from an original video file; identifying, based on the original video frames in the sequence, a character corresponding to the sound type; and generating, according to the character's position in the original video frame, a subtitle image frame containing the subtitle text, the subtitle image frame being used for compositing with the original video frame into a target video frame. The scheme provided by this application can improve a video's ability to convey information.
Description
Technical field
This application relates to the field of computer technology, and in particular to a video processing method and apparatus, a computer-readable storage medium, and a computer device.
Background art
Most videos today are produced by combining a video file, an audio file, and a subtitle file according to an agreed protocol, yielding a complete playable video that is presented to the user. Subtitles render the acoustic information in the audio file as on-screen text, so users can visually follow the dialogue between characters and better understand the content of the video.

However, subtitles are in most cases placed at a fixed position in the video picture. As a result, the connection between a subtitle and the character appearing in the picture is loose, and the video's ability to convey information is limited.
Summary of the invention
In view of the technical problem that fixed subtitle placement limits a video's ability to convey information, it is necessary to provide a video processing method and apparatus, a computer-readable storage medium, and a computer device.
A video processing method, comprising:

obtaining an audio clip from an audio file;

performing voiceprint recognition on the audio clip to obtain a sound type corresponding to the audio clip;

obtaining subtitle text corresponding to the audio clip from a subtitle file;

obtaining an original video frame sequence corresponding to the audio clip from an original video file;

identifying, based on the original video frames in the original video frame sequence, a character corresponding to the sound type; and

generating, according to the position of the character in the original video frame, a subtitle image frame containing the subtitle text, the subtitle image frame being used for compositing with the original video frame into a target video frame.
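The sequence of steps above can be sketched on toy data. All names, data structures, and the head-offset below are illustrative choices for this sketch, not from the patent; the voiceprint and face-recognition results are replaced by pre-filled dictionaries.

```python
# Toy stand-ins for the audio file, subtitle file, original video file,
# and a pre-built sound-type-to-character mapping (all invented).
voice_to_character = {"voice_1": "Alice"}
subtitles = {0: "Hello there"}                # clip id -> subtitle text
frames = {0: [{"Alice": (120, 40)}]}          # clip id -> per-frame character positions

def make_subtitle_frames(clip_id, sound_type):
    """Look up the character for the recognized sound type, then anchor
    the subtitle text near that character's position in each frame."""
    character = voice_to_character[sound_type]
    text = subtitles[clip_id]
    out = []
    for positions in frames[clip_id]:
        x, y = positions[character]
        # Place the caption slightly above the character (offset is arbitrary).
        out.append({"text": text, "anchor": (x, y - 20)})
    return out

result = make_subtitle_frames(0, "voice_1")
```

Compositing each such subtitle frame over its original frame then yields the target video frame.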
A video processing apparatus, the apparatus comprising:

an audio clip obtaining module, configured to obtain an audio clip from an audio file;

a sound type identification module, configured to perform voiceprint recognition on the audio clip to obtain a sound type corresponding to the audio clip;

a subtitle text obtaining module, configured to obtain subtitle text corresponding to the audio clip from a subtitle file;

an original video frame sequence obtaining module, configured to obtain an original video frame sequence corresponding to the audio clip from an original video file;

a character identification module, configured to identify, based on the original video frames in the original video frame sequence, a character corresponding to the sound type; and

a subtitle image frame generation module, configured to generate, according to the position of the character in the original video frame, a subtitle image frame containing the subtitle text, the subtitle image frame being used for compositing with the original video frame into a target video frame.
A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the above video processing method.

A computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the above video processing method.
With the above video processing method, apparatus, computer-readable storage medium, and computer device, a correspondence between sound types and characters across the entire video is established in advance. The character matching the recognized sound type is located in the original video frame, and a subtitle image frame containing the subtitle text is generated according to that character's position in the frame. The generated subtitle image frame is composited with the original video frame to obtain a target video frame in which the subtitle text sits close to the character. Because the subtitle in the composited target video frame is displayed next to the character in the picture, the presentation is richer, the video picture conveys more information to the user, and the user can understand the video content more easily.
Brief description of the drawings
Fig. 1 is a diagram of the application environment of the video processing method in one embodiment;
Fig. 2 is an overall flow diagram of the video processing method in one embodiment;
Fig. 3 is a flow diagram of the video processing method in one embodiment;
Fig. 4 is a flow diagram of establishing the correspondence between sound types and characters in one embodiment;
Fig. 5 is a diagram of subtitle image frames corresponding to original video frames in one embodiment;
Fig. 6 is a flow diagram of a terminal obtaining a subtitle video file from a server in one embodiment;
Fig. 7 is a flow diagram of the video processing method in a specific embodiment;
Fig. 8 is a structural block diagram of the video processing apparatus in one embodiment;
Fig. 9 is a structural block diagram of the video processing apparatus in another embodiment;
Fig. 10 is a structural block diagram of the computer device in one embodiment.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of this application clearer, the application is further described below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the application and are not intended to limit it.
Fig. 1 is a diagram of the application environment of the video processing method in one embodiment. Referring to Fig. 1, the video processing method is applied to a video processing system. The video processing system includes a terminal 110 and a server 120, connected through a network. The terminal 110 may specifically be a desktop terminal or a mobile terminal, and the mobile terminal may specifically be at least one of a mobile phone, a tablet computer, a notebook computer, an Internet TV, and the like. A client for video playback, such as a live-streaming client or a video client, may be installed on the terminal 110. The client running on the terminal 110 can interact with the server 120; after the client triggers the server 120 to obtain a composite video file, the terminal 110 can play and display the target video frames in that composite video file. The server 120 may be implemented as an independent server or as a server cluster composed of multiple servers.
Fig. 2 is an overall flow diagram of the video processing method provided by this application in one embodiment. Referring to Fig. 2, the server generates a subtitle video file from the subtitle file, audio file, and original video file of a complete film; the subtitle video file consists of subtitle image frames. The server can also composite the subtitle video frames in the subtitle video file with the original video frames in the original video file to obtain target video frames in which the subtitle text sits close to the character; these target video frames form a composite video file. In response to a request from the terminal, the server can send the composite video file together with the film's audio file to the terminal, and the terminal plays them. Because the subtitle text in each target video frame of the composite video file is close to the character in the picture, the target video frames strengthen the connection between subtitle and character and can convey more information to the user. In one embodiment, the terminal can obtain a request triggered by the user through a locally installed client and, based on that request, obtain the composite video file from the server.
As shown in Fig. 3, in one embodiment, a video processing method is provided. This embodiment is mainly illustrated by applying the method to the server 120 in Fig. 1. Referring to Fig. 3, the video processing method specifically includes the following steps:
S302: obtain an audio clip from an audio file.

Here, the audio file is a file containing all the voice data of a complete film, which may be the dubbing data of all the characters in that film. An audio clip is one unit of the audio file as divided into individual voice segments. Specifically, the server may obtain the audio file of the complete film, take the first audio clip of the audio file according to the film's play order, and process the audio clips in sequence starting from that first clip.
S304: perform voiceprint recognition on the audio clip to obtain the sound type corresponding to the audio clip.

Here, a sound type is a category of sound. Within one complete film, if different characters use different dubbing voices, their sound types differ, and sound types and characters are in a one-to-one correspondence. If multiple characters share the same dubbing voice, those characters have the same sound type, and the correspondence between sound types and characters is many-to-one. If one character uses multiple dubbing voices, that character corresponds to multiple sound types, and the correspondence is one-to-many. The server can perform voiceprint recognition on the audio clip currently being processed to obtain its sound type. For example, different sound types can be labeled "voice 1", "voice 2", and so on, and a character can be represented by its name in the plot.
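The three cardinalities between sound types and characters can be made concrete with a few invented observations; the pair data and names below are purely illustrative.

```python
from collections import defaultdict

# Invented (sound_type, character) pairs illustrating the cardinalities:
observations = [
    ("voice_1", "RoleA"),                         # voice_1 maps to one character
    ("voice_2", "RoleB"), ("voice_2", "RoleC"),   # shared dub: many characters, one type
    ("voice_3", "RoleA"),                         # RoleA also has a second dub: one-to-many
]

type_to_characters = defaultdict(set)
character_to_types = defaultdict(set)
for sound_type, character in observations:
    type_to_characters[sound_type].add(character)
    character_to_types[character].add(sound_type)
```

Inspecting the two maps recovers each case: `voice_2` fans out to two characters, while `RoleA` fans out to two sound types.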
In one embodiment, before step S304, the video processing method further includes a step of establishing and storing the correspondence between the sound types and the characters in a complete film: traverse each audio clip in the audio file; perform voiceprint recognition on the current audio clip to obtain its sound type; if a character matching that sound type already exists, move on to the next audio clip; if no matching character exists, obtain the original video frame sequence corresponding to the current audio clip from the original video file, perform face recognition on each original video frame in the sequence to obtain the character in the frames, and store the sound type in correspondence with the identified character.
Here, the original video file is a file containing all the image data of a complete film. In general, a complete film is obtained by combining an audio file, a video file, and a subtitle file according to a preset protocol, and within the same film an audio clip in the audio file, a video frame sequence in the video file, and subtitle text in the subtitle file correspond to one another. For example, one piece of dubbing corresponds to one continuous stretch of video picture and, at the same time, to one line of subtitle.
Specifically, when establishing the correspondence between the sound types and the characters of a complete film, the server may traverse each audio clip in the audio file and perform voiceprint recognition on the clip currently traversed to obtain its sound type. If that sound type already appeared in a previously traversed clip, that is, a character corresponding to it has already been identified, the server moves on to the next clip. If the sound type has not appeared before, that is, no character has yet been associated with it, the server obtains the original video frame sequence corresponding to the clip from the original video file and performs face recognition on that sequence to identify the character corresponding to the sound type, thereby establishing the correspondence between the sound type and the character.
In some embodiments, even if the sound type appeared in a previously traversed clip, the server may still obtain the original video frame sequence corresponding to the current clip and perform face recognition on it. If the identified character is the same as the character already stored for that sound type, the server traverses the next audio clip; if not, the sound type may correspond to multiple characters, and the server can flag the sound type for manual confirmation.
Fig. 4 is a flow diagram of establishing the correspondence between sound types and characters in one embodiment. The steps in this flow are executed by the server. Referring to Fig. 4, the flow includes the following steps: S402, the server traverses the audio clips in the audio file; S404, perform voiceprint recognition on the clip currently traversed to obtain its sound type; S406, determine whether a character corresponding to that sound type already exists; if so, go to step S412, otherwise go to step S408; S408, perform face recognition on the original video frame sequence in the original video file corresponding to the current clip to obtain the character corresponding to the sound type; S410, store the sound type in correspondence with that character; S412, traverse the next audio clip.
In the above embodiments, if multiple faces appear in the picture when face recognition is performed on the original video frame sequence, the speaker in the picture can be further determined from the open/closed state of each face's mouth across the frames, and the speaker is then taken as the character corresponding to the sound type.
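The traversal of steps S402–S412 can be sketched as a single pass. The `recognize_voice` and `recognize_faces` callables stand in for the voiceprint and face recognizers and, like the toy clip data, are assumptions of this sketch; when several faces are found and no mouth-state cue resolves them, the sound type is flagged for manual review as described above.

```python
def build_voice_character_map(clips, recognize_voice, recognize_faces):
    """One pass over the audio clips, mirroring steps S402-S412."""
    mapping = {}
    flagged = []                                # sound types needing manual confirmation
    for clip in clips:
        sound_type = recognize_voice(clip)      # S404: voiceprint recognition
        if sound_type in mapping:               # S406: character already known
            continue                            # S412: next clip
        characters = recognize_faces(clip)      # S408: faces in the matching frames
        if len(characters) == 1:
            mapping[sound_type] = characters[0] # S410: store the correspondence
        else:
            flagged.append(sound_type)          # ambiguous: flag for manual review
    return mapping, flagged

# Toy recognizers that simply read invented per-clip annotations.
clips = [{"voice": "v1", "faces": ["Alice"]},
         {"voice": "v1", "faces": ["Alice", "Bob"]},
         {"voice": "v2", "faces": ["Alice", "Bob"]}]
mapping, flagged = build_voice_character_map(
    clips, lambda c: c["voice"], lambda c: c["faces"])
```

The second `v1` clip is skipped because the mapping already holds it, while `v2` is flagged since two faces are present.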
S306: obtain the subtitle text corresponding to the audio clip from a subtitle file.

Here, the subtitle file is a file containing all the subtitle text of a complete film. The audio file, original video file, and subtitle file of the same film can be aligned with one another by timestamp. The server can obtain, from the subtitle file, the subtitle text corresponding to the audio clip currently being processed; the subtitle text is the line of dialogue corresponding to the voice data in the clip. In this embodiment, step S306 may also be executed after step S308 or after step S310.
S308: obtain the original video frame sequence corresponding to the audio clip from the original video file.

Specifically, after obtaining the sound type of the audio clip currently being processed, the server can look up the character corresponding to that sound type in the pre-established correspondence between sound types and characters. The server then obtains the original video frame sequence corresponding to the audio clip from the original video file, in order to determine whether that sequence contains the same character as the one looked up. An original video frame sequence consists of multiple consecutive original video frames; the number of original video frames corresponding to the current audio clip is positively correlated with the clip's duration and with the frame rate of the original video file.
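The frame count is simple arithmetic; the function name is an invention for this sketch. The patent's own worked example later in the description (a 2-second clip at 30 frames per second spanning 60 frames) is reproduced in the test.

```python
def frames_for_clip(duration_s, fps):
    """Number of original video frames covered by an audio clip:
    positively correlated with both the clip duration and the
    original video file's frame rate."""
    return round(duration_s * fps)
```

A lookup over the original video file then returns that many consecutive frames starting at the clip's timestamp.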
S310: identify, based on the original video frames in the original video frame sequence, the character corresponding to the sound type.

Specifically, the server can perform face recognition on the obtained original video frame sequence and identify, in the original video frames, the character corresponding to the sound type. In one embodiment, if the identified character is the same as the character looked up for the sound type, the server can determine the character's position in the original video frame; that position can be expressed as the character's coordinates within the frame.
S312: generate, according to the character's position in the original video frame, a subtitle image frame containing the subtitle text; the subtitle image frame is used for compositing with the original video frame into a target video frame.

Specifically, after the server identifies the character corresponding to the sound type in the original video frame, it can generate a subtitle image frame of the same size as the original video frame. The subtitle image frame contains the subtitle text obtained in step S306, placed at a position in the subtitle image frame corresponding to the identified character's position in the original video frame; for example, the subtitle text is placed close to the character's head.
The subtitle image frame is a video frame with a transparent background. In some embodiments, the subtitle text can be rendered in a uniform color, such as white or black, or of course another color. In other embodiments, the font color need not be uniform; for example, it can be adjusted according to the background color of the original video frame at the position where the subtitle text is displayed, as long as the text remains legible. As another example, the font color can be a color that contrasts with the area close to the character in the original video frame, making the text more conspicuous: if the subtitle text is placed close to the character's head and the color there is black, the subtitle text can be white. The subtitle text can be displayed horizontally or vertically.
In one embodiment, the subtitle image frame may also contain a subtitle pointer that directs the subtitle text toward the character. The video processing method then further includes: determining the character's position in the original video frame; and, when the character's position changes across the original video frames in the sequence, correspondingly changing the subtitle pointer's position in the subtitle image frame, so that in the continuous video picture formed by the composited frames, the subtitle pointer moves along with the character.

Here, the subtitle pointer is a pointer used to direct the subtitle text toward the character corresponding to the sound type. The subtitle text and the subtitle pointer in the subtitle image frame work together: after the subtitle image frame is composited with the original video frame into a new video frame, the resulting picture makes the connection between the subtitle and the character explicit. The user can see intuitively which character's line it is, and more video information can be conveyed to hearing-impaired users.
Specifically, after identifying the character corresponding to the sound type in the original video frame sequence, the server can determine the character's position in each original video frame and, from the changes in that position, determine how the character moves within the picture over the duration corresponding to the audio clip. For example, if the current audio clip is 2 seconds long and the frame rate of the original video file is 30 frames per second, the obtained original video frame sequence contains 60 original video frames; for each of them the server determines the position of the identified character, thereby determining the character's position changes in the picture over those 2 seconds.
In one embodiment, if the identified character's position does not change over the duration corresponding to the audio clip, that is, the character is stationary in the picture, the subtitle pointer's position in the subtitle image frame also stays unchanged. If, over that duration, the position changes indicate that the character moves in the picture, for example linearly in the horizontal or vertical direction, the subtitle pointer's position in the subtitle image frame changes correspondingly, so that the video frames obtained by overlaying the subtitle image frames on the original video frames show the subtitle pointer moving along with the character.
In one embodiment, the subtitle text is displayed in a bubble text box at a position in the subtitle image frame close to the character's position in the original video frame. A bubble text box is a text box in which the subtitle text is wrapped in a bubble shape. In some embodiments, subtitle text carrying strong emotion can also be displayed in the subtitle image frame in a more visually striking artistic font.
Fig. 5 is a diagram of subtitle image frames corresponding to original video frames in one embodiment. On the left of Fig. 5 are three original video frames from the original video frame sequence corresponding to an audio clip. In the middle of Fig. 5, corresponding subtitle image frames are generated from the subtitle text corresponding to the audio clip and the position of the character identified in the original video frames on the left; the subtitle text in each subtitle image frame is displayed in a bubble text box, whose bubble pointer points at the character according to the character's position. The new video frames, obtained by compositing the original video frames with the subtitle image frames, are shown on the right of Fig. 5, where it can be seen that the pointer's position changes with the character's position.
In some embodiments, when the character's positions indicate that the character moves within the original video picture, the position of the bubble text box in the generated subtitle image frames is not changed, that is, the subtitle text stays put, and only the subtitle pointer's position changes correspondingly.
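This fixed-bubble, moving-pointer behavior reduces to per-frame pointer geometry. The tuple layout, coordinates, and function name below are illustrative: each frame pairs the fixed bubble anchor with the character's current position, which the pointer tip follows.

```python
def pointer_tracks(character_positions, bubble_anchor):
    """Per-frame pointer geometry: the bubble text box stays at a fixed
    anchor while the pointer tip follows the character's position."""
    return [(bubble_anchor, pos) for pos in character_positions]

# A character moving horizontally across three frames of a clip:
tracks = pointer_tracks([(100, 200), (110, 200), (120, 200)],
                        bubble_anchor=(60, 80))
```

Rendering each segment from anchor to tip yields a pointer that sweeps with the character while the bubble holds still.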
The above video processing method establishes in advance the correspondence between sound types and characters across the entire video, finds in the original video frame the character corresponding to the recognized sound type, and generates, according to that character's position in the frame, a subtitle image frame containing the subtitle text. The generated subtitle image frame is composited with the original video frame to obtain a target video frame in which the subtitle text sits close to the character. Because the subtitle in the composited target video frame is displayed next to the character in the picture, the presentation is richer, the video picture conveys more information to the user, and the user understands the video content more easily.
In one embodiment, the video processing method further includes: when no character corresponding to the sound type exists in the original video frame, determining a character name corresponding to the character, and generating a subtitle image frame containing the character name, the subtitle text, and a subtitle pointer, with the subtitle pointer pointing toward the edge of the subtitle image frame.

Specifically, after the server looks up the character corresponding to the sound type in the correspondence, if face recognition on the original video frames in the sequence does not identify a character matching the one looked up, that is, the character does not appear in the original video frame, the server can display the subtitle text in the generated subtitle image frame together with the character name of the looked-up character. In that subtitle image frame, the subtitle pointer points toward the image edge, and the character name can be prepended to the subtitle text. In this way, after the subtitle image frame is composited with the original video frame into a new video frame, the connection between the subtitle text and the speaker's identity can still be conveyed to the user.
In some embodiments, when the sound type of the current audio clip cannot be recognized because several people are speaking at once, the server can flag the current audio clip and determine its sound type through manual confirmation.

In some embodiments, for complex fast-motion pictures on which face recognition cannot be performed, the original video frames corresponding to such audio clips can be flagged for manual confirmation, so that the corresponding subtitle video frames can still be generated.
In one embodiment, the above video processing method further includes: determining the original video frames in the original video file for which no corresponding audio clip exists; and generating blank image frames to serve as the subtitle image frames corresponding to those original video frames.

Specifically, for original video frames with no corresponding audio clip, such as scenery shots, the server can pair those frames with blank image frames, so that the subtitle image frames in the generated subtitle video file correspond one-to-one with the original video frames in the original video file, that is, the subtitle video file and the original video file have the same duration. In this way, when compositing a new video, each subtitle video frame in the subtitle video file only needs to be overlaid on the corresponding original video frame in the original video file to obtain the new video frame. Alternatively, for original video frames with no corresponding audio clip, the original video frames themselves can be used directly and added to the subtitle video file together with the other subtitle image frames.
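The frame-for-frame alignment can be sketched as building a subtitle track the same length as the original file; the `None` placeholder standing in for a blank transparent frame, and the names, are choices made for this sketch.

```python
def build_subtitle_track(total_frames, captioned):
    """Frame-for-frame subtitle track: `captioned` maps a frame index to
    its subtitle image frame; every other index gets a blank placeholder,
    keeping the subtitle file and original file the same length."""
    return [captioned.get(i) for i in range(total_frames)]

# Frames 1 and 2 carry a bubble; the rest are blank (e.g., scenery shots).
track = build_subtitle_track(5, {1: "bubble_A", 2: "bubble_A"})
```

Compositing then becomes a positional zip of the two equally long tracks, with blank entries leaving the original frame untouched.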
In one embodiment, the above video processing method further includes: generating a subtitle video file from the subtitle video frames corresponding to each original video frame in the original video file, and storing the subtitle video file in correspondence with the video identifier of the original video file.

Specifically, after generating the subtitle video frame corresponding to each original video frame in an original video frame sequence, the server can obtain and process the next audio clip, generating the subtitle video frames for each original video frame in that clip's corresponding sequence, and so on until all audio clips in the audio file have been processed, thereby obtaining all subtitle video frames corresponding to the entire audio file. Original video frames with no corresponding audio clip are paired with blank image frames. The subtitle video frames corresponding to the entire original video file thus obtained constitute the subtitle video file, which the server can store in correspondence with the original video file's video identifier. Because the subtitle video file does not contain the content of the original video file, it can be composited with original video files of different resolutions or formats to obtain new composite video files; compared with generating a separate subtitle video file for every resolution or format of the original video file, this saves storage space.
In one embodiment, the above video processing method further includes: receiving a subtitle-on request sent by the terminal, carrying a video identifier and the current play-time node; obtaining the original video file and the subtitle video file corresponding to the video identifier; performing video compositing starting from the original video frame corresponding to the current play-time node in the original video file and the subtitle video frame corresponding to the current play-time node in the subtitle video file, to obtain a composite video file; and, in response to the subtitle-on request, feeding the composite video file back to the terminal. The composite video file fed back to the terminal instructs the terminal to play the video frames in order, starting from the video frame corresponding to the current play-time node.

Here, when the subtitle text in the subtitle video file is displayed in bubble text boxes, the subtitle-on request can be a bubble-subtitle-on request. The terminal can obtain a subtitle-on instruction triggered by the user, generate the corresponding subtitle-on request from the video identifier of the currently playing video and the current play-time node, and send it to the server to request the video file with this special subtitle. After receiving the subtitle-on request, the server extracts the video identifier and the current play-time node, obtains the corresponding subtitle video file and original video file, and composites them starting from the subtitle image frame and original video frame corresponding to the current play-time node to obtain the composite video file, which it feeds back to the terminal. The terminal can then play the composite video file, showing the video picture from the current play-time node onward and presenting to the user a picture containing subtitle text close to the characters. The server can also feed the audio file corresponding to the video identifier back to the terminal together with the composite video file, and the terminal can play the audio file and the composite video file together.
In some embodiments, if bullet comments (danmaku) are enabled while the terminal is playing the video, the terminal may automatically close the bullet comments after obtaining the user-triggered subtitle-on instruction, ensuring that the subtitles are not covered by bullet comments.
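A minimal sketch of this behavior, with the player state reduced to a dictionary; the key names are invented for illustration.

```python
def on_subtitle_enabled(player_state):
    """Turning subtitles on also closes bullet comments (danmaku),
    so the comments cannot cover the subtitle text."""
    state = dict(player_state)  # leave the caller's state untouched
    state["subtitles_on"] = True
    if state.get("danmaku_on"):
        state["danmaku_on"] = False
    return state

state = on_subtitle_enabled({"danmaku_on": True, "subtitles_on": False})
```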
FIG. 6 is a schematic flowchart of a terminal obtaining a subtitle video file from a server in one embodiment. Referring to FIG. 6, the flow includes the following steps. S602: the user clicks to enable bubble subtitles. After detecting this operation, the terminal performs S604: sending a bubble-subtitle-on request carrying the video identifier to the server. The server obtains the original video file and the subtitle video file according to the video identifier and performs S606: synthesizing the original video file and the subtitle video file to obtain a synthesized video file. The server then performs S608: returning the synthesized video file to the terminal. S610: the user can watch the video with bubble subtitles that the terminal plays from the synthesized video file.
As shown in FIG. 7, in one specific embodiment the video processing method includes the following steps:
S702: traverse each audio fragment in the audio file.
S704: perform voiceprint recognition on the current audio fragment to obtain the sound type corresponding to the audio fragment.
S706: when a matched role already exists for the sound type, traverse the next audio fragment.
S708: when no matched role exists for the sound type, obtain the original video frame sequence corresponding to the current audio fragment in the original video file, perform face recognition on each original video frame in the sequence to obtain the role in the original video frames, and store the sound type in correspondence with the identified role.
S710: obtain the audio fragment currently being processed in the audio file, and perform voiceprint recognition on the audio fragment to obtain the sound type corresponding to the audio fragment.
S712: obtain the subtitle text corresponding to the audio fragment being processed from the subtitle file.
S714: obtain the original video frame sequence corresponding to the audio fragment in the original video file.
S716: perform face recognition based on the original video frames in the original video frame sequence.
S718: when the role corresponding to the sound type of the currently processed audio fragment is recognized, determine the position of the role in the original video frame.
S720: according to the position of the role in the original video frame, generate a subtitle image frame including the subtitle text and a subtitle pointer pointing at the role; the subtitle image frame is used to synthesize a new video frame with the original video frame.
S722: when the position of the role changes across the original video frames of the sequence, correspondingly change the position of the subtitle pointer in the subtitle image frame, so that in the continuous video pictures formed by the new video frames the subtitle pointer moves with the movement of the role.
S724: when no role corresponding to the sound type of the currently processed audio fragment exists in the original video frames, determine the role name corresponding to the role, and generate a subtitle image frame including the role name, the subtitle text, and a subtitle pointer; the subtitle pointer points at an image edge of the subtitle image frame.
S726: determine original video frames in the original video file that have no corresponding audio fragment, generate a blank image frame, and use the blank image frame as the subtitle image frame corresponding to each such original video frame.
S728: generate a subtitle video file from the subtitle video frames corresponding to the original video frames in the original video file.
S730: store the subtitle video file in correspondence with the video identifier of the original video file.
S732: receive a subtitle-on request sent by a terminal, the request carrying a video identifier and a current playback time node.
S734: obtain the original video file and the subtitle video file corresponding to the video identifier.
S736: perform video synthesis starting from the original video frame corresponding to the current playback time node in the original video file and the subtitle video frame corresponding to the current playback time node in the subtitle video file, to obtain a synthesized video file.
S738: in response to the subtitle-on request, feed the synthesized video file back to the terminal; the synthesized video file instructs the terminal to play video frames in order starting from the video frame corresponding to the current playback time node in the synthesized video file.
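The subtitle-frame generation in steps S716–S724 can be sketched as below. Voiceprint and face recognition are replaced by precomputed per-frame lookups, since the patent does not fix particular recognition algorithms; all function names, the bounding-box format, and the 10-pixel pointer offset are assumptions of this sketch.

```python
def make_subtitle_frame(caption, role_bbox, frame_size, role_name):
    """Build one subtitle image frame description (steps S718-S724)."""
    width, height = frame_size
    if role_bbox is not None:
        x, y, w, h = role_bbox
        # S720: the subtitle pointer is anchored just above the role's face
        return {"text": caption, "pointer": (x + w // 2, max(0, y - 10))}
    # S724: the speaker is not in the picture - prepend the role name and
    # point the subtitle pointer at the image edge instead
    return {"text": f"{role_name}: {caption}",
            "pointer": (width - 1, height // 2)}

def subtitle_frames_for_fragment(caption, role, per_frame_faces, frame_size, names):
    """One subtitle frame per original frame of the fragment; because the
    pointer is recomputed per frame, it moves with the role (S722).
    per_frame_faces: per-frame {role: bounding box} from face recognition."""
    return [make_subtitle_frame(caption, faces.get(role), frame_size,
                                names.get(role))
            for faces in per_frame_faces]

# Role "A" is visible and moves right; in the last frame it leaves the picture.
per_frame_faces = [{"A": (100, 50, 40, 40)}, {"A": (120, 50, 40, 40)}, {}]
frames = subtitle_frames_for_fragment(
    "Hello!", "A", per_frame_faces, (640, 360), {"A": "Alice"})
```

In the example, the pointer follows the role from frame to frame, and the final frame falls back to the named, edge-pointing form of S724.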
The above video processing method establishes the correspondence between sound types and roles across the entire video, finds the role corresponding to the recognized sound type in the original video frames, and generates a subtitle image frame including the subtitle text according to the position of the role in the original video frame. The generated subtitle image frame is synthesized with the original video frame to obtain a target video frame in which the subtitle text is close to the role. Because the subtitle in the synthesized target video frame is displayed close to the role in the video picture, the form of expression is richer, which improves the ability of the video picture to convey information to the user and helps the user better understand the video content.
FIG. 7 is a schematic flowchart of the video processing method in one embodiment. It should be understood that although the steps in the flowchart of FIG. 7 are shown in sequence as indicated by the arrows, they are not necessarily executed in that sequence. Unless expressly stated otherwise herein, there is no strict ordering restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in FIG. 7 may include multiple sub-steps or stages, which need not be completed at the same moment but may be executed at different times; their execution order is also not necessarily sequential, and they may be executed in turn or alternately with other steps, or with sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 8, a video processing apparatus 800 is provided. The apparatus includes an audio fragment acquisition module 802, a sound type recognition module 804, a subtitle text acquisition module 806, an original video frame sequence acquisition module 808, a role recognition module 810, and a subtitle image frame generation module 812, in which:
the audio fragment acquisition module 802 is configured to obtain an audio fragment in an audio file;
the sound type recognition module 804 is configured to perform voiceprint recognition on the audio fragment to obtain the sound type corresponding to the audio fragment;
the subtitle text acquisition module 806 is configured to obtain subtitle text corresponding to the audio fragment from a subtitle file;
the original video frame sequence acquisition module 808 is configured to obtain the original video frame sequence corresponding to the audio fragment in an original video file;
the role recognition module 810 is configured to identify, based on the original video frames in the original video frame sequence, the role corresponding to the sound type; and
the subtitle image frame generation module 812 is configured to generate, according to the position of the role in the original video frame, a subtitle image frame including the subtitle text; the subtitle image frame is used to synthesize a target video frame with the original video frame.
In one embodiment, the subtitle image frame further includes a subtitle pointer pointing from the subtitle text to the role. The video processing apparatus 800 further includes a subtitle pointer processing module, configured to determine the position of the role in the original video frame and, when the position of the role changes across the original video frames of the sequence, correspondingly change the position of the subtitle pointer in the subtitle image frame, so that in the continuous video pictures formed by the video frames the subtitle pointer moves correspondingly with the movement of the role.
In one embodiment, the subtitle image frame generation module 812 is further configured to, when no role corresponding to the sound type exists in the original video frame, determine the role name corresponding to the role and generate a subtitle image frame including the role name, the subtitle text, and a subtitle pointer; the subtitle pointer points at an image edge of the subtitle image frame.
In one embodiment, the subtitle image frame generation module 812 is further configured to determine original video frames in the original video file that have no corresponding audio fragment, generate a blank image frame, and use the blank image frame as the subtitle image frame corresponding to such an original video frame.
In one embodiment, the video processing apparatus 800 further includes a correspondence storage module, configured to: traverse each audio fragment in the audio file; perform voiceprint recognition on the current audio fragment to obtain the sound type corresponding to the audio fragment; when a matched role already exists for the sound type, traverse the next audio fragment; and when no matched role exists for the sound type, obtain the original video frame sequence corresponding to the current audio fragment in the original video file, perform face recognition on each original video frame in the sequence to obtain the role in the original video frames, and store the sound type in correspondence with the identified role.
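The traversal performed by this module can be sketched as follows, with voiceprint and face recognition stubbed out as lookup functions since the patent does not fix particular recognition algorithms; the function names, and the simplification of taking the first recognized role, are assumptions of this sketch.

```python
def build_sound_type_role_map(fragments, voiceprint, faces_for_fragment):
    """Traverse the audio fragments and store each new sound type together
    with the role recognized in the corresponding original video frames."""
    mapping = {}
    for fragment in fragments:
        sound_type = voiceprint(fragment)      # stub for voiceprint recognition
        if sound_type in mapping:
            continue                           # matched role exists: next fragment
        roles = faces_for_fragment(fragment)   # stub for face recognition
        if roles:
            # simplification for this sketch: take the first recognized role
            mapping[sound_type] = roles[0]
    return mapping

voice = {"f1": "voice-A", "f2": "voice-B", "f3": "voice-A"}.get
faces = {"f1": ["Alice"], "f2": ["Bob"], "f3": ["Alice", "Bob"]}.get
mapping = build_sound_type_role_map(["f1", "f2", "f3"], voice, faces)
```

Fragment `f3` is skipped because its sound type already has a matched role, mirroring the "traverse the next audio fragment" branch.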
In one embodiment, the video processing apparatus 800 further includes a subtitle video file storage module, configured to generate a subtitle video file from the subtitle video frames corresponding to the original video frames in the original video file, and to store the subtitle video file in correspondence with the video identifier of the original video file.
In one embodiment, as shown in FIG. 9, the video processing apparatus 800 further includes a subtitle-on request receiving module 902, a synthesis module 904, and a sending module 906. The subtitle-on request receiving module 902 is configured to receive a subtitle-on request sent by a terminal, the request carrying a video identifier and a current playback time node. The synthesis module 904 is configured to obtain the original video file and the subtitle video file corresponding to the video identifier, and to perform video synthesis starting from the original video frame corresponding to the current playback time node in the original video file and the subtitle video frame corresponding to the current playback time node in the subtitle video file, obtaining a synthesized video file. The sending module 906 is configured to, in response to the subtitle-on request, feed the synthesized video file back to the terminal; the synthesized video file fed back to the terminal instructs the terminal to play video frames in order starting from the video frame corresponding to the current playback time node in the synthesized video file.
In one embodiment, the subtitle text in the subtitle image frame is displayed in a bubble text box close to the position corresponding to the role in the original video frame.
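Placing the bubble text box near the role while keeping it inside the frame can be sketched as below; the box size, the vertical offset, and the function name are invented values for illustration, not part of the patent.

```python
def bubble_position(role_bbox, frame_size, box_size=(120, 40)):
    """Anchor the bubble text box above the role's face, clamped so the
    whole box stays inside the video frame."""
    x, y, w, h = role_bbox
    bw, bh = box_size
    fw, fh = frame_size
    bx = x + w // 2 - bw // 2        # centered horizontally over the role
    by = y - bh - 8                  # just above the face
    bx = min(max(bx, 0), fw - bw)    # clamp to the frame horizontally
    by = min(max(by, 0), fh - bh)    # clamp to the frame vertically
    return bx, by

pos = bubble_position((20, 10, 40, 40), frame_size=(640, 360))
```

For a role near the top-left corner, as above, the clamp pushes the bubble back inside the picture instead of letting it overflow the frame edge.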
The above video processing apparatus 800 pre-establishes the correspondence between sound types and roles across the entire video, finds the role corresponding to the recognized sound type in the original video frames, and generates a subtitle image frame including the subtitle text according to the position of the role in the original video frame. The generated subtitle image frame is synthesized with the original video frame to obtain a target video frame in which the subtitle text is close to the role. Because the subtitle in the synthesized target video frame is displayed close to the role in the video picture, the form of expression is richer, which improves the ability of the video picture to convey information to the user and helps the user better understand the video content.
FIG. 10 shows an internal structure diagram of a computer device in one embodiment. The computer device may specifically be the server 120 in FIG. 1. As shown in FIG. 10, the computer device 1000 includes a processor 1004, a memory 1006, and a network interface 1008 connected by a system bus 1002. The memory 1006 includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device 1000 stores an operating system and may also store a computer program which, when executed by the processor 1004, causes the processor 1004 to implement the video processing method. The internal memory may also store a computer program which, when executed by the processor 1004, causes the processor 1004 to perform the video processing method.
Those skilled in the art will understand that the structure shown in FIG. 10 is merely a block diagram of the partial structure relevant to the solution of this application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, the video processing apparatus 800 provided by this application may be implemented in the form of a computer program, and the computer program can run on the computer device shown in FIG. 10. The memory of the computer device may store the program modules that make up the video processing apparatus, for example, the audio fragment acquisition module 802, the sound type recognition module 804, the subtitle text acquisition module 806, the original video frame sequence acquisition module 808, the role recognition module 810, and the subtitle image frame generation module 812 shown in FIG. 8. The computer program composed of these program modules causes the processor to perform the steps of the video processing method of the embodiments of this application described in this specification.
For example, the computer device shown in FIG. 10 may perform step S302 through the audio fragment acquisition module 802 of the video processing apparatus 800 shown in FIG. 8, step S304 through the sound type recognition module 804, step S306 through the subtitle text acquisition module 806, step S308 through the original video frame sequence acquisition module 808, step S310 through the role recognition module 810, and step S312 through the subtitle image frame generation module 812.
In one embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the above video processing method. Here, the steps of the video processing method may be the steps in the video processing method of each of the above embodiments.
In one embodiment, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor, causes the processor to perform the steps of the above video processing method. Here, the steps of the video processing method may be the steps in the video processing method of each of the above embodiments.
Those of ordinary skill in the art will understand that all or part of the processes of the above-described embodiment methods can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above-described methods. Any reference to memory or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments have been described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of this application, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of this patent. It should be pointed out that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of this application, and these all belong to the protection scope of this application. Therefore, the protection scope of this application patent shall be subject to the appended claims.
Claims (15)
1. A video processing method, comprising:
obtaining an audio fragment in an audio file;
performing voiceprint recognition on the audio fragment to obtain a sound type corresponding to the audio fragment;
obtaining subtitle text corresponding to the audio fragment from a subtitle file;
obtaining an original video frame sequence corresponding to the audio fragment in an original video file;
identifying, based on original video frames in the original video frame sequence, a role corresponding to the sound type; and
generating, according to a position of the role in the original video frame, a subtitle image frame comprising the subtitle text, the subtitle image frame being used to synthesize a target video frame with the original video frame.
2. The method according to claim 1, wherein the subtitle image frame further comprises a subtitle pointer pointing from the subtitle text to the role, and the method further comprises:
determining the position of the role in the original video frame; and
when the position of the role in the original video frames of the original video frame sequence changes, correspondingly changing the position of the subtitle pointer in the subtitle image frame, so that in continuous video pictures formed by the target video frames the subtitle pointer moves correspondingly with the movement of the role.
3. The method according to claim 1, wherein the method further comprises:
when no role corresponding to the sound type exists in the original video frame,
determining a role name corresponding to the role; and
generating a subtitle image frame comprising the role name, the subtitle text, and a subtitle pointer, the subtitle pointer pointing at an image edge of the subtitle image frame.
4. The method according to claim 1, wherein the method further comprises:
determining an original video frame in the original video file for which no corresponding audio fragment exists; and
generating a blank image frame and using the blank image frame as the subtitle image frame corresponding to the original video frame.
5. The method according to claim 1, wherein the method further comprises:
traversing each audio fragment in the audio file;
performing voiceprint recognition on a current audio fragment to obtain a sound type corresponding to the audio fragment;
when a matched role exists for the sound type, traversing a next audio fragment; and
when no matched role exists for the sound type, obtaining an original video frame sequence corresponding to the current audio fragment in the original video file, performing face recognition on each original video frame in the original video frame sequence to obtain a role in the original video frames, and storing the sound type in correspondence with the identified role.
6. The method according to claim 1, wherein the method further comprises:
generating a subtitle video file according to subtitle video frames corresponding to the original video frames in the original video file; and
storing the subtitle video file in correspondence with a video identifier of the original video file.
7. The method according to any one of claims 1 to 6, wherein the method further comprises:
receiving a subtitle-on request sent by a terminal, the request carrying a video identifier and a current playback time node;
obtaining an original video file and a subtitle video file corresponding to the video identifier;
performing video synthesis starting from the original video frame corresponding to the current playback time node in the original video file and the subtitle video frame corresponding to the current playback time node in the subtitle video file, to obtain a synthesized video file; and
in response to the subtitle-on request, feeding the synthesized video file back to the terminal, the synthesized video file fed back to the terminal instructing the terminal to play video frames in order starting from the video frame corresponding to the current playback time node in the synthesized video file.
8. The method according to any one of claims 1 to 6, wherein the subtitle text in the subtitle image frame is correspondingly displayed in a bubble text box close to the position of the role in the original video frame.
9. A video processing apparatus, comprising:
an audio fragment acquisition module, configured to obtain an audio fragment in an audio file;
a sound type recognition module, configured to perform voiceprint recognition on the audio fragment to obtain a sound type corresponding to the audio fragment;
a subtitle text acquisition module, configured to obtain subtitle text corresponding to the audio fragment from a subtitle file;
an original video frame sequence acquisition module, configured to obtain an original video frame sequence corresponding to the audio fragment in an original video file;
a role recognition module, configured to identify, based on original video frames in the original video frame sequence, a role corresponding to the sound type; and
a subtitle image frame generation module, configured to generate, according to a position of the role in the original video frame, a subtitle image frame comprising the subtitle text, the subtitle image frame being used to synthesize a target video frame with the original video frame.
10. The apparatus according to claim 9, wherein the subtitle image frame further comprises a subtitle pointer pointing from the subtitle text to the role; the apparatus further comprises a subtitle pointer processing module, configured to determine the position of the role in the original video frame and, when the position of the role in the original video frames of the original video frame sequence changes, correspondingly change the position of the subtitle pointer in the subtitle image frame, so that in continuous video pictures formed by the target video frames the subtitle pointer moves correspondingly with the movement of the role.
11. The apparatus according to claim 9, wherein the subtitle image frame generation module is further configured to, when no role corresponding to the sound type exists in the original video frame, determine a role name corresponding to the role and generate a subtitle image frame comprising the role name, the subtitle text, and a subtitle pointer, the subtitle pointer pointing at an image edge of the subtitle image frame.
12. The apparatus according to claim 9, wherein the apparatus further comprises a correspondence storage module, configured to: traverse each audio fragment in the audio file; perform voiceprint recognition on a current audio fragment to obtain a sound type corresponding to the audio fragment; when a matched role exists for the sound type, traverse a next audio fragment; and when no matched role exists for the sound type, obtain an original video frame sequence corresponding to the current audio fragment in the original video file, perform face recognition on each original video frame in the original video frame sequence to obtain a role in the original video frames, and store the sound type in correspondence with the identified role.
13. The apparatus according to claim 9, wherein the apparatus further comprises a synthesized video file sending module, configured to: receive a subtitle-on request sent by a terminal, the request carrying a video identifier and a current playback time node; obtain an original video file and a subtitle video file corresponding to the video identifier; perform video synthesis starting from the original video frame corresponding to the current playback time node in the original video file and the subtitle video frame corresponding to the current playback time node in the subtitle video file, to obtain a synthesized video file; and, in response to the subtitle-on request, feed the synthesized video file back to the terminal, the synthesized video file fed back to the terminal instructing the terminal to play video frames in order starting from the video frame corresponding to the current playback time node in the synthesized video file.
14. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method according to any one of claims 1 to 8.
15. A computer device, comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910150506.3A CN109862422A (en) | 2019-02-28 | 2019-02-28 | Method for processing video frequency, device, computer readable storage medium and computer equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109862422A true CN109862422A (en) | 2019-06-07 |
Family
ID=66899259
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910150506.3A Pending CN109862422A (en) | 2019-02-28 | 2019-02-28 | Method for processing video frequency, device, computer readable storage medium and computer equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109862422A (en) |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110427930A (en) * | 2019-07-29 | 2019-11-08 | 中国工商银行股份有限公司 | Multimedia data processing method and device, electronic equipment and readable storage medium storing program for executing |
CN110446062A (en) * | 2019-07-18 | 2019-11-12 | 平安科技(深圳)有限公司 | Receiving handling method, electronic device and the storage medium of large data files transmission |
CN110572691A (en) * | 2019-08-01 | 2019-12-13 | 浙江大华技术股份有限公司 | Video reading method, device, equipment and storage medium |
CN111309963A (en) * | 2020-01-22 | 2020-06-19 | 百度在线网络技术(北京)有限公司 | Audio file processing method and device, electronic equipment and readable storage medium |
CN111601174A (en) * | 2020-04-26 | 2020-08-28 | 维沃移动通信有限公司 | Subtitle adding method and device |
CN112153461A (en) * | 2020-09-25 | 2020-12-29 | 北京百度网讯科技有限公司 | Method and device for positioning sound production object, electronic equipment and readable storage medium |
CN112312196A (en) * | 2020-11-13 | 2021-02-02 | 深圳市前海手绘科技文化有限公司 | Video subtitle making method |
CN112380922A (en) * | 2020-10-23 | 2021-02-19 | 岭东核电有限公司 | Method and device for determining compound video frame, computer equipment and storage medium |
CN112383809A (en) * | 2020-11-03 | 2021-02-19 | Tcl海外电子(惠州)有限公司 | Subtitle display method, device and storage medium |
CN112752165A (en) * | 2020-06-05 | 2021-05-04 | 腾讯科技(深圳)有限公司 | Subtitle processing method, subtitle processing device, server and computer-readable storage medium |
CN112750184A (en) * | 2019-10-30 | 2021-05-04 | 阿里巴巴集团控股有限公司 | Data processing, action driving and man-machine interaction method and equipment |
CN112820265A (en) * | 2020-09-14 | 2021-05-18 | 腾讯科技(深圳)有限公司 | Speech synthesis model training method and related device |
CN113660536A (en) * | 2021-09-28 | 2021-11-16 | 北京七维视觉科技有限公司 | Subtitle display method and device |
CN113992972A (en) * | 2021-10-28 | 2022-01-28 | 维沃移动通信有限公司 | Subtitle display method and device, electronic equipment and readable storage medium |
CN114007116A (en) * | 2022-01-05 | 2022-02-01 | 凯新创达(深圳)科技发展有限公司 | Video processing method and video processing device |
CN114339081A (en) * | 2021-12-22 | 2022-04-12 | 腾讯音乐娱乐科技(深圳)有限公司 | Subtitle generating method, electronic equipment and computer readable storage medium |
CN114342353A (en) * | 2019-09-10 | 2022-04-12 | 华为技术有限公司 | Method and system for video segmentation |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH11261890A (en) * | 1998-03-11 | 1999-09-24 | Nippon Telegr & Teleph Corp <Ntt> | Video caption inserting method and device, and recording medium recording the inserting method |
CN101518055A (en) * | 2006-09-21 | 2009-08-26 | 松下电器产业株式会社 | Subtitle generation device, subtitle generation method, and subtitle generation program |
US20130141551A1 (en) * | 2011-12-02 | 2013-06-06 | Lg Electronics Inc. | Mobile terminal and control method thereof |
CN103533256A (en) * | 2013-10-28 | 2014-01-22 | 广东威创视讯科技股份有限公司 | Method and device for processing subtitle and subtitle display system |
CN105979169A (en) * | 2015-12-15 | 2016-09-28 | 乐视网信息技术(北京)股份有限公司 | Video subtitle adding method, device and terminal |
CN106021496A (en) * | 2016-05-19 | 2016-10-12 | 海信集团有限公司 | Video search method and video search device |
CN108419141A (en) * | 2018-02-01 | 2018-08-17 | 广州视源电子科技股份有限公司 | A kind of method, apparatus, storage medium and the electronic equipment of subtitle position adjustment |
CN108540845A (en) * | 2018-03-30 | 2018-09-14 | 优酷网络技术(北京)有限公司 | Barrage method for information display and device |
CN109376145A (en) * | 2018-11-19 | 2019-02-22 | 深圳Tcl新技术有限公司 | The method for building up of movie dialogue database establishes device and storage medium |
2019-02-28: CN application CN201910150506.3A filed (publication CN109862422A); status: active, Pending
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH11261890A (en) * | 1998-03-11 | 1999-09-24 | Nippon Telegr & Teleph Corp <Ntt> | Video caption inserting method and device, and recording medium recording the inserting method |
CN101518055A (en) * | 2006-09-21 | 2009-08-26 | 松下电器产业株式会社 | Subtitle generation device, subtitle generation method, and subtitle generation program |
US20130141551A1 (en) * | 2011-12-02 | 2013-06-06 | Lg Electronics Inc. | Mobile terminal and control method thereof |
CN103533256A (en) * | 2013-10-28 | 2014-01-22 | 广东威创视讯科技股份有限公司 | Method and device for processing subtitle and subtitle display system |
CN105979169A (en) * | 2015-12-15 | 2016-09-28 | 乐视网信息技术(北京)股份有限公司 | Video subtitle adding method, device and terminal |
CN106021496A (en) * | 2016-05-19 | 2016-10-12 | 海信集团有限公司 | Video search method and video search device |
CN108419141A (en) * | 2018-02-01 | 2018-08-17 | 广州视源电子科技股份有限公司 | Method, apparatus, storage medium and electronic device for subtitle position adjustment |
CN108540845A (en) * | 2018-03-30 | 2018-09-14 | 优酷网络技术(北京)有限公司 | Bullet-screen comment information display method and device |
CN109376145A (en) * | 2018-11-19 | 2019-02-22 | 深圳Tcl新技术有限公司 | Method and device for building a movie dialogue database, and storage medium |
Non-Patent Citations (1)
Title |
---|
周恕义 (Zhou Shuyi): "Practical Tutorial on Multimedia CAI Development" (《多媒体CAI开发实用教程》), China Water & Power Press, 30 April 1999 *
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110446062A (en) * | 2019-07-18 | 2019-11-12 | 平安科技(深圳)有限公司 | Receiving and processing method for large data file transmission, electronic device and storage medium |
CN110446062B (en) * | 2019-07-18 | 2022-11-25 | 平安科技(深圳)有限公司 | Receiving processing method for big data file transmission, electronic device and storage medium |
CN110427930A (en) * | 2019-07-29 | 2019-11-08 | 中国工商银行股份有限公司 | Multimedia data processing method and device, electronic device and readable storage medium |
CN110572691A (en) * | 2019-08-01 | 2019-12-13 | 浙江大华技术股份有限公司 | Video reading method, device, equipment and storage medium |
CN110572691B (en) * | 2019-08-01 | 2022-05-20 | 浙江大华技术股份有限公司 | Video reading method, device, equipment and storage medium |
CN114342353A (en) * | 2019-09-10 | 2022-04-12 | 华为技术有限公司 | Method and system for video segmentation |
CN112750184B (en) * | 2019-10-30 | 2023-11-10 | 阿里巴巴集团控股有限公司 | Method and equipment for data processing, action driving and man-machine interaction |
CN112750184A (en) * | 2019-10-30 | 2021-05-04 | 阿里巴巴集团控股有限公司 | Data processing, action driving and man-machine interaction method and equipment |
CN111309963A (en) * | 2020-01-22 | 2020-06-19 | 百度在线网络技术(北京)有限公司 | Audio file processing method and device, electronic equipment and readable storage medium |
CN111309963B (en) * | 2020-01-22 | 2023-07-04 | 百度在线网络技术(北京)有限公司 | Audio file processing method and device, electronic equipment and readable storage medium |
CN111601174A (en) * | 2020-04-26 | 2020-08-28 | 维沃移动通信有限公司 | Subtitle adding method and device |
CN112752165B (en) * | 2020-06-05 | 2023-09-01 | 腾讯科技(深圳)有限公司 | Subtitle processing method, subtitle processing device, server and computer readable storage medium |
CN112752165A (en) * | 2020-06-05 | 2021-05-04 | 腾讯科技(深圳)有限公司 | Subtitle processing method, subtitle processing device, server and computer-readable storage medium |
CN112820265B (en) * | 2020-09-14 | 2023-12-08 | 腾讯科技(深圳)有限公司 | Speech synthesis model training method and related device |
CN112820265A (en) * | 2020-09-14 | 2021-05-18 | 腾讯科技(深圳)有限公司 | Speech synthesis model training method and related device |
CN112153461B (en) * | 2020-09-25 | 2022-11-18 | 北京百度网讯科技有限公司 | Method and device for positioning sound production object, electronic equipment and readable storage medium |
CN112153461A (en) * | 2020-09-25 | 2020-12-29 | 北京百度网讯科技有限公司 | Method and device for positioning sound production object, electronic equipment and readable storage medium |
CN112380922B (en) * | 2020-10-23 | 2024-03-22 | 岭东核电有限公司 | Method, device, computer equipment and storage medium for determining multiple video frames |
CN112380922A (en) * | 2020-10-23 | 2021-02-19 | 岭东核电有限公司 | Method and device for determining compound video frame, computer equipment and storage medium |
CN112383809A (en) * | 2020-11-03 | 2021-02-19 | Tcl海外电子(惠州)有限公司 | Subtitle display method, device and storage medium |
CN112312196A (en) * | 2020-11-13 | 2021-02-02 | 深圳市前海手绘科技文化有限公司 | Video subtitle making method |
CN113660536A (en) * | 2021-09-28 | 2021-11-16 | 北京七维视觉科技有限公司 | Subtitle display method and device |
CN113992972A (en) * | 2021-10-28 | 2022-01-28 | 维沃移动通信有限公司 | Subtitle display method and device, electronic equipment and readable storage medium |
CN114339081A (en) * | 2021-12-22 | 2022-04-12 | 腾讯音乐娱乐科技(深圳)有限公司 | Subtitle generating method, electronic equipment and computer readable storage medium |
CN114007116A (en) * | 2022-01-05 | 2022-02-01 | 凯新创达(深圳)科技发展有限公司 | Video processing method and video processing device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109862422A (en) | Video processing method, device, computer-readable storage medium and computer equipment | |
US10733574B2 (en) | Systems and methods for logging and reviewing a meeting | |
JP4310916B2 (en) | Video display device | |
CN109155865A (en) | Advanced signaling of most-interested regions in pictures | |
US20100085363A1 (en) | Photo Realistic Talking Head Creation, Content Creation, and Distribution System and Method | |
JPH0823530A (en) | Method and apparatus for processing stream of audio signal and video signal | |
US8386909B2 (en) | Capturing and presenting interactions with image-based media | |
CN106210703A (en) | Utilization and display method and system for close-up shots in a VR environment | |
EP3742742A1 (en) | Method, apparatus and system for synchronously playing message stream and audio/video stream | |
KR20070084471A (en) | Captioned still image content creating device, captioned still image content creating program and captioned still image content creating system | |
CN106383576A (en) | Method and system for displaying parts of bodies of experiencers in VR environment | |
US20140139619A1 (en) | Communication method and device for video simulation image | |
US20200186887A1 (en) | Real-time broadcast editing system and method | |
CN1732687A (en) | Method, system and apparatus for telepresence communications | |
CN1112326A (en) | A picture communication apparatus | |
EP2352290A1 (en) | Method and apparatus for matching audio and video signals during a videoconference | |
CN110740283A (en) | Method for converting voice into text based on video communication | |
US20110141106A1 (en) | Method and apparatus for identifying speakers and emphasizing selected objects in picture and video messages | |
CN107483872A (en) | Video call system and video call method | |
CN106161871B (en) | Synchronization signal processing method and processing device | |
JP5813542B2 (en) | Image communication system, AR (Augmented Reality) video generation device, and program | |
CN108111781A (en) | Method and device for producing trial video subtitles | |
KR102417084B1 (en) | Method And System for Transmitting and Receiving Multi-Channel Media | |
CN105791964A (en) | Cross-platform media file playing method and system | |
JP5894505B2 (en) | Image communication system, image generation apparatus, and program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 2019-06-07 |