CN103945140B - The generation method and system of video caption - Google Patents

Method and system for generating video captions

Info

Publication number
CN103945140B
CN103945140B (application CN201310018669.9A)
Authority
CN
China
Prior art keywords
video caption
video
caption
information
captions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310018669.9A
Other languages
Chinese (zh)
Other versions
CN103945140A (en)
Inventor
Zhao Yonggang (赵永刚)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lenovo Beijing Ltd filed Critical Lenovo Beijing Ltd
Priority to CN201310018669.9A priority Critical patent/CN103945140B/en
Publication of CN103945140A publication Critical patent/CN103945140A/en
Application granted granted Critical
Publication of CN103945140B publication Critical patent/CN103945140B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a method and system for generating video captions: detecting video caption play-type control information; obtaining the video caption play information that matches the play-type control information; determining the video caption animation model corresponding to that play information; extracting the video caption text information; and finally converting the text information with the animation model to generate the video caption. Because the generated caption carries a caption animation model, the purpose of giving video captions a dynamic effect is achieved.

Description

Method and system for generating video captions
Technical field
The present invention relates to the technical field of data processing, and more specifically to a method and system for generating video captions.
Background art
Video, including film and television, improves the viewing experience and has therefore become rapidly popular.
However, in the prior art, video captions can only be displayed as static, flat text; dynamic display cannot be achieved.
Summary of the invention
In view of this, the present invention provides a method for generating video captions, so as to generate video captions with a dynamic effect.
To achieve this goal, the proposed scheme is as follows:
A video caption generation method, including:
detecting video caption play-type control information;
obtaining the video caption play information that matches the play-type control information;
determining the video caption animation model corresponding to the play information;
extracting video caption text information;
converting the text information with the animation model to generate the video caption.
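As a rough illustration, the steps above amount to two lookups followed by a conversion. The table contents, names, and "models" in the following sketch are invented for illustration only; the patent does not prescribe any concrete representation of control information, play information, or animation models.

```python
# Illustrative sketch of the claimed steps as one lookup pipeline.
# Every concrete value here is an assumption, not part of the patent.

PLAY_INFO = {            # play-type control information -> play information
    "happy": "bounce",
    "angry": "shake",
}
ANIMATION_MODEL = {      # play information -> animation model (a format string here)
    "bounce": "<bounce>{}</bounce>",
    "shake": "<shake>{}</shake>",
}

def generate_caption(control_info: str, text: str) -> str:
    play_info = PLAY_INFO[control_info]      # obtain matching play information
    model = ANIMATION_MODEL[play_info]       # determine the animation model
    return model.format(text)                # convert the text, generate the caption
```

For example, `generate_caption("happy", "hello")` would yield a caption wrapped in the bounce model.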
Preferably, detecting the video caption play-type control information includes:
collecting facial expression information of the speaker corresponding to the captions in the video.
Preferably, detecting the video caption play-type control information includes:
receiving video caption play-type control information input by a user.
Preferably, detecting the video caption play-type control information includes:
collecting the tone of the speaker corresponding to the captions in the video;
calculating the tone variation over a preset time period, and determining the video caption play-type control information corresponding to that variation.
Preferably, extracting the video caption text information includes:
collecting the voice information of the speaker corresponding to the captions in the video;
recognizing the voice information and generating the text information corresponding to the voice.
Preferably, before the video caption is generated, the method also includes:
collecting the speech volume of the speaker corresponding to the captions in the video;
adjusting the parameters of the video caption animation model according to the speech volume.
A video caption generation system, including:
a detector, for detecting video caption play-type control information;
a processor, for obtaining the video caption play information that matches the play-type control information; determining the video caption animation model corresponding to the play information; extracting video caption text information; and converting the text information with the animation model to generate the video caption.
Preferably, the detector is an image collector, for collecting facial expression information of the speaker corresponding to the captions in the video.
Preferably, the detector is a receiver, for receiving video caption play-type control information input by a user.
Preferably, the detector is a voice collector, for collecting the tone of the speaker corresponding to the captions in the video;
the processor is further configured to obtain the tone, calculate the tone variation over a preset time period, and determine the video caption play-type control information corresponding to that variation.
Preferably, the way the processor extracts the video caption text information includes:
collecting the voice information of the speaker corresponding to the captions in the video;
recognizing the voice information and generating the text information corresponding to the voice.
Preferably, the processor is further configured to collect, before the video caption is generated, the speech volume of the speaker corresponding to the captions in the video, and to adjust the parameters of the video caption animation model according to the speech volume.
It can be seen from the above technical scheme that, in the video caption generation method disclosed by the invention, the generated caption carries a caption animation model, so the dynamic effect of video captions is achieved.
Brief description of the drawings
In order to describe the technical schemes of the embodiments of the present invention or of the prior art more clearly, the accompanying drawings needed in the description are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from them without creative work.
Fig. 1 is a flow chart of a video caption generation method disclosed in an embodiment of the present invention;
Fig. 2 is a flow chart of a video caption generation method disclosed in another embodiment of the present invention;
Fig. 3 is a flow chart of a video caption generation method disclosed in yet another embodiment of the present invention;
Fig. 4 is a flow chart of a video caption generation method disclosed in yet another embodiment of the present invention;
Fig. 5 is a flow chart of a video caption generation method disclosed in yet another embodiment of the present invention;
Fig. 6 is a structural diagram of a video caption generation system disclosed in another embodiment of the present invention.
Detailed description of the embodiments
The technical schemes in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the scope of protection of the present invention.
An embodiment of the present invention provides a method for generating video captions with a dynamic effect.
Referring to Fig. 1, the video caption generation method disclosed in this embodiment of the present invention includes the steps:
S101, detecting video caption play-type control information;
here, the play-type control information controls the generation style of the video caption, and when the video caption is played it is played in that generation style.
S102, obtaining the video caption play information that matches the play-type control information;
specifically, the correspondence between play-type control information and play information is stored in advance; after the play-type control information is obtained, the matching play information is looked up in that correspondence.
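Step S102 can be pictured as a table lookup over the pre-stored correspondence. The keys and values in the sketch below are invented for illustration; a fallback covers control information with no stored entry, which the patent text leaves unspecified.

```python
# Hypothetical pre-stored correspondence for step S102; all entries invented.
CONTROL_TO_PLAY_INFO = {
    "cheerful": "caption-bounce",
    "tense": "caption-shake",
}

def match_play_info(control_info: str) -> str:
    # Look up the play information matching the play-type control
    # information; fall back to a neutral style if nothing matches.
    return CONTROL_TO_PLAY_INFO.get(control_info, "caption-static")
```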
S103, determining the video caption animation model corresponding to the play information;
S104, extracting video caption text information;
specifically, the caption text information may be stored in advance and retrieved when a caption needs to be generated; alternatively, the caption text information may be received as input when a caption needs to be generated.
S105, converting the text information with the animation model to generate the video caption.
Here, when a video caption with an animation effect needs to be generated, it must be generated according to the video caption animation model.
In the video caption generation method disclosed in this embodiment, the generated caption carries a caption animation model, so the dynamic effect of video captions is achieved.
Preferably, in the method disclosed in this embodiment, the following steps may also be performed before step S105:
collecting the speech volume of the speaker corresponding to the captions in the video;
adjusting the parameters of the video caption animation model according to the speech volume.
Specifically, the parameters in the animation model control the degree of the animation effect of the generated caption; when captions with different degrees of animation effect are needed, the parameters of the model can be adjusted.
While the video plays, the words spoken by the speaker in the video correspond to the captions; the speaker's speech volume is collected, and the parameters of the animation model are adjusted according to that volume, generating captions with different degrees of animation effect.
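One simple way to realize the volume-to-parameter adjustment described above is a clamped linear mapping. The decibel thresholds below are invented for illustration; the patent does not specify how volume maps to the animation-degree parameter.

```python
def animation_degree(volume_db: float,
                     quiet_db: float = 40.0,
                     loud_db: float = 90.0) -> float:
    # Map the collected speech volume onto the model's animation-degree
    # parameter: louder speech -> stronger animation, clamped to [0, 1].
    # The quiet/loud thresholds are illustrative assumptions.
    frac = (volume_db - quiet_db) / (loud_db - quiet_db)
    return min(1.0, max(0.0, frac))
```

A shout near 90 dB would thus drive the animation to its full degree, while quiet speech leaves the caption nearly static.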
Another embodiment of the present invention also discloses a video caption generation method which, as shown in Fig. 2, includes the steps:
S201, collecting facial expression information of the speaker corresponding to the captions in the video;
specifically, while the video is displayed, the words spoken by the currently shown speaker can be identical to the captions. Moreover, the speaker's facial expression changes with the scene of the video; collecting the facial expression information of the currently shown speaker ensures that the animation effect of the generated caption matches the current scene of the video.
The facial expression information may include single features such as the pupil distance of the eyes, the outline of the eyes, and the mouth shape, or it may include all features that can reflect changes in facial expression, including the eyes, the corners of the mouth, the eyebrows, and so on.
S202, obtaining the video caption play information that matches the facial expression information;
specifically, after the facial expression information is collected, the current scene of the video is inferred by recognizing that information. Because the obtained play information matches the facial expression information, the generated caption is guaranteed to meet the demands of the video scene.
For example, when the collected facial expression information shows that the currently shown speaker is very happy, the current scene of the video is a cheerful one; when it shows that the speaker is very angry, the current scene is a tense one.
When the facial expression information is the pupil distance of the eyes, the size of the pupil distance can be analyzed to determine the mood of the currently shown speaker; when it is the outline of the eyes, the trend of that outline can be analyzed to determine the mood; when it is the mouth shape, the trend of the mouth shape can likewise be analyzed to determine the mood of the currently shown speaker.
When the facial expression information is integrated information that includes all features reflecting expression changes, the facial expression formed by those features can be matched against several basic facial expression templates; the mood indicated by the basic template with the highest matching degree is the mood of the currently shown speaker.
Alternatively, a neural network analysis method can be used: the basic facial expressions (generally six of them) serve as the output neurons and the collected facial expression information as the input neurons; the network computes the facial expression type corresponding to that information, thereby determining the mood of the currently shown speaker.
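The template-matching variant described above can be sketched as a nearest-template classifier over a small feature vector. The feature choices (pupil distance, eye openness, mouth curvature) and all template values below are invented for illustration and stand in for whatever features and basic expression templates an implementation would actually use.

```python
# Toy nearest-template matcher for the expression-matching idea: each face
# is a feature vector; the basic template with the smallest squared
# distance (i.e. the highest matching degree) wins. All values invented.

TEMPLATES = {
    "happy":   (0.30, 0.60, 0.80),
    "angry":   (0.25, 0.90, -0.70),
    "neutral": (0.28, 0.70, 0.00),
}

def classify_expression(features):
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(TEMPLATES, key=lambda name: dist(TEMPLATES[name], features))
```

The neural-network variant would replace the distance computation with a learned classifier over the same inputs, with the six basic expressions as outputs.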
S203, determining the video caption animation model corresponding to the play information;
here, different play information corresponds to different animation models; after the play information is obtained, the corresponding animation model must be determined.
For example, when the play information reflects that the mood of the currently shown speaker is happy, a model with a cheerful caption-bounce effect can be determined; when it reflects that the speaker is angry, a model with a destructive effect can be determined.
S204, extracting video caption text information;
as in the previous embodiment, the caption text information may be stored in advance and retrieved when a caption needs to be generated, or it may be received as input when a caption needs to be generated.
S205, converting the text information with the animation model to generate the video caption.
In the video caption generation method disclosed by the invention, the generated caption carries a caption animation model, achieving the dynamic effect of video captions; moreover, the caption's animation model also corresponds to the facial expression information of the speaker, so the dynamic effect of the captions matches the speaker's facial expression and the expressiveness of the picture on screen is enhanced.
As in the previous embodiment, this embodiment may also include, before step S205, the steps:
collecting the speech volume of the speaker corresponding to the captions in the video;
adjusting the parameters of the video caption animation model according to the speech volume.
Specifically, the parameters in the animation model control the degree of the animation effect of the generated caption; when captions with different degrees of animation effect are needed, the parameters can be adjusted.
For example, when the determined animation model is the model with a cheerful caption-bounce effect, the collected speech volume adjusts the parameters of that model, thereby determining the amplitude of the bouncing captions.
Another embodiment of the present invention also discloses a video caption generation method which, as shown in Fig. 3, includes the steps:
S301, receiving video caption play-type control information input by a user;
specifically, when the play type of the generated caption needs to be controlled manually, play-type control information can be input.
S302, obtaining the video caption play information that matches the play-type control information;
likewise, the correspondence between play-type control information and play information is stored in advance; after the play-type control information is obtained, the matching play information is looked up in that correspondence.
S303, determining the video caption animation model corresponding to the play information;
S304, extracting video caption text information;
S305, converting the text information with the animation model to generate the video caption.
The detailed process of this embodiment is covered in the two embodiments above and is not repeated here.
In the method disclosed in this embodiment, the animation model is ultimately determined from the play-type control information input by the user, and the caption text information is then converted with that model to generate the video caption; in this way, video captions can be generated according to the user's requirements.
Another embodiment of the present invention also discloses a video caption generation method which, as shown in Fig. 4, includes the steps:
S401, collecting the tone of the speaker corresponding to the captions in the video;
specifically, while the video is displayed, different scenes put the speaker in different moods and give the speech different tones; by collecting the tone of the speaker corresponding to the captions over a period of time, the current mood of the speaker can be judged.
S402, calculating the tone variation over a preset time period, and determining the video caption play-type control information corresponding to that variation;
specifically, the time period is set according to actual needs, the variation of the tone collected during that period is calculated, and the play-type control information is determined from the variation.
Generally, when the tone of the preset time period varies quickly, the speaker's mood is excited or angry, and the determined play-type control information can be control information that gives the video caption a violent animation effect;
when the tone varies little, or not at all, the speaker's mood is calm, and the determined play-type control information can be control information that gives the video caption a gentle animation effect.
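The decision rule in steps S401–S402 can be sketched as a threshold on the pitch variation within the preset window. The use of frequency range as the "variation" measure and the 30 Hz threshold are illustrative assumptions; the patent only requires that large, fast variation select the violent effect and small variation the gentle one.

```python
def control_info_from_pitch(pitches_hz, threshold_hz: float = 30.0) -> str:
    # Compute the tone variation over the preset time window (here: the
    # range of sampled fundamental frequencies) and pick the control info:
    # large variation -> excited/angry speaker -> "violent" animation;
    # small variation -> calm speaker -> "gentle" animation.
    variation = max(pitches_hz) - min(pitches_hz)
    return "violent" if variation > threshold_hz else "gentle"
```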
S403, obtaining the video caption play information that matches the play-type control information;
specifically, the correspondence between play-type control information and play information is stored in advance; after the play-type control information is obtained, the matching play information is looked up in that correspondence.
S404, determining the video caption animation model corresponding to the play information;
S405, extracting video caption text information;
S406, converting the text information with the animation model to generate the video caption.
In this embodiment, the video caption is generated according to the speaker's tone variation, so the dynamic effect of the caption matches that variation, likewise enhancing the expressiveness of the picture on screen.
In the embodiments corresponding to Fig. 3 and Fig. 4, preferably, the following steps can be performed before the video caption is generated:
collecting the speech volume of the speaker corresponding to the captions in the video;
adjusting the parameters of the video caption animation model according to the speech volume.
The specific process is covered in the embodiments corresponding to Fig. 1 and Fig. 2 and is not repeated here.
Referring to Fig. 5, yet another embodiment of the present invention discloses a video caption generation method including the steps:
S501, detecting video caption play-type control information;
S502, obtaining the video caption play information that matches the play-type control information;
S503, determining the video caption animation model corresponding to the play information;
S504, collecting the voice information of the speaker corresponding to the captions in the video;
S505, recognizing the voice information and generating the text information corresponding to the voice;
S506, converting the text information with the animation model to generate the video caption.
In this embodiment, the voice information is collected and recognized while the video plays, and the corresponding text information is generated; caption text neither needs to be stored in advance nor obtained separately, which is simpler and more convenient.
The detailed process of this embodiment is covered in all the embodiments above and is not repeated here.
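Steps S504–S505 replace pre-stored caption text with recognized speech. Real recognition would call an ASR engine; in the sketch below a stub recognizer stands in for it, so the whole example is illustrative only and makes no claim about any particular recognition API.

```python
# Stand-in for a speech recognizer: a real implementation would feed the
# collected audio to an ASR engine; here each "frame" is already a word.

def recognize(audio_frames) -> str:
    return " ".join(audio_frames)

def extract_caption_text(audio_frames) -> str:
    # Steps S504-S505: collect the speaker's voice, recognize it, and
    # return the corresponding text -- no pre-stored caption text needed.
    return recognize(audio_frames)
```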
Another embodiment of the present invention also discloses a video caption generation system which, referring to Fig. 6, includes:
a detector 101, for detecting video caption play-type control information;
a processor 102, for obtaining the video caption play information that matches the play-type control information; determining the video caption animation model corresponding to the play information; extracting video caption text information; and converting the text information with the animation model to generate the video caption.
Specifically, after the detector 101 detects the play-type control information it transmits it to the processor 102, in which the correspondence between play-type control information and play information is stored in advance. After the processor 102 receives the play-type control information, it looks up the matching play information in that correspondence, then determines the corresponding animation model, extracts the caption text information, and finally converts the text information with the animation model to generate the video caption.
The processor 102 may store the caption text information in advance and retrieve it when a caption needs to be generated, or it may receive the caption text information as input when a caption needs to be generated.
In the system disclosed in this embodiment, the detector 101 detects the play-type control information and sends it to the processor 102; the processor 102 obtains the matching play information, determines the corresponding animation model, extracts the caption text information, and converts it with the animation model to generate the video caption. In this way, the caption generated by the processor 102 carries a caption animation model, achieving the purpose of a dynamic caption effect.
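The division of labor between detector 101 and processor 102 can be sketched with two small classes. The class names, table contents, and "models" below are invented for illustration; the patent describes hardware components, not any particular software structure.

```python
# Minimal object sketch of the Fig. 6 system: a Detector hands the detected
# play-type control information to a processor, which does the matching,
# model lookup, and caption generation. All concrete values are invented.

class Detector:
    def __init__(self, control_info: str):
        self._control_info = control_info

    def detect(self) -> str:
        return self._control_info


class CaptionProcessor:
    PLAY_INFO = {"cheerful": "bounce"}          # control info -> play info
    MODELS = {"bounce": "<bounce>{}</bounce>"}  # play info -> animation model

    def generate(self, detector: Detector, text: str) -> str:
        play_info = self.PLAY_INFO[detector.detect()]
        return self.MODELS[play_info].format(text)
```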
Preferably, the detector 101 in the above embodiment can be an image collector, for collecting facial expression information of the speaker corresponding to the captions in the video.
Specifically, the image collector can be a camera that photographs the face of the speaker on screen; it may photograph the whole face, or only part of it, such as the eyes or the mouth.
The processor obtains the image captured by the camera, recognizes it, determines the speaker's current mood, and obtains the video caption play information that matches the facial expression information.
The process of recognizing the image to determine the speaker's current mood is covered in the embodiment corresponding to Fig. 2 and is not repeated here.
Alternatively, and preferably, the detector 101 in the above embodiment is a receiver, for receiving video caption play-type control information input by a user.
Specifically, the receiver can be connected to a communication interface through which the processor communicates with an external device; the user inputs the play-type control information on the human-computer interaction interface of the external device, and the information is transmitted to the processor through the communication interface.
Again alternatively, and preferably, the detector 101 in the above embodiment is a voice collector, for collecting the tone of the speaker corresponding to the captions in the video;
specifically, the voice collector can be a speech sensor that collects the frequency of the speaker's voice, i.e., the tone. The processor obtains the frequency collected by the speech sensor, calculates the tone variation over a preset time period, and determines the video caption play-type control information corresponding to that variation.
The time period is set according to actual needs; the variation of the tone collected during the period is calculated, and the play-type control information is determined from the variation.
The processor determines the play-type control information from the speed of the tone variation; the detailed process is covered in the embodiment corresponding to Fig. 4 and is not repeated here.
In all the embodiments above, the processor may extract the video caption text information by storing it in advance and retrieving it when a caption needs to be generated, or by receiving it as input when a caption needs to be generated.
It may also proceed as follows: while the video plays, the processor collects the voice information of the speaker corresponding to the captions, recognizes it, and generates the text information corresponding to the voice. In this way there is no need to store caption text information separately or to receive it as extra input; it is converted directly from the video speech, which is simple and convenient.
Moreover, in all the embodiments disclosed above, before the processor converts the caption text information with the animation model to generate the video caption, it can also perform the following operations:
collecting the speech volume of the speaker corresponding to the captions in the video;
adjusting the parameters of the video caption animation model according to the speech volume.
Specifically, the parameters in the animation model control the degree of the animation effect of the generated caption; when captions with different degrees of animation effect are needed, the parameters can be adjusted.
While the video plays, the words spoken by the speaker in the video correspond to the captions; the speaker's speech volume is collected, and the parameters of the animation model are adjusted according to that volume, generating captions with different degrees of animation effect.
Finally, it should be noted that, herein, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply that any such actual relation or order exists between those entities or operations. Moreover, the terms "comprise", "include", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes it.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the others, and for the parts that the embodiments have in common reference can be made between them.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein can be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (12)

  1. A video caption generation method, characterized by comprising:
    detecting video caption play type control information of a speech provider corresponding to captions in a video;
    obtaining video caption broadcast information matching the video caption play type control information;
    determining a video caption animation model corresponding to the video caption broadcast information, wherein video caption broadcast information reflecting different moods of the current speech provider corresponds to different video caption animation models, and a parameter in the video caption animation model is used to control the degree of animation effect of the generated video caption, so that video captions with different degrees of animation effect can be generated by adjusting the parameter of the video caption animation model;
    extracting video caption text information; and
    converting the video caption text information using the video caption animation model to generate the video caption.
  2. The method according to claim 1, characterized in that detecting the video caption play type control information comprises:
    collecting facial expression information of the speech provider corresponding to the captions in the video.
  3. The method according to claim 1, characterized in that detecting the video caption play type control information comprises:
    receiving video caption play type control information input by a user.
  4. The method according to claim 1, characterized in that detecting the video caption play type control information comprises:
    collecting the tone of the speech provider corresponding to the captions in the video; and
    calculating the tone variation over a preset time period, and determining the video caption play type control information corresponding to the tone variation.
  5. The method according to claim 1, characterized in that extracting the video caption text information comprises:
    collecting voice information of the speech provider corresponding to the captions in the video; and
    recognizing the voice information to generate text information corresponding to the voice.
  6. The method according to any one of claims 1-5, characterized by further comprising, before generating the video caption:
    collecting the speech volume of the speech provider corresponding to the captions in the video; and
    adjusting the parameter of the video caption animation model according to the speech volume.
  7. A video caption generation system, characterized by comprising:
    a detector for detecting video caption play type control information of a speech provider corresponding to captions in a video; and
    a processor for obtaining video caption broadcast information matching the video caption play type control information, determining a video caption animation model corresponding to the video caption broadcast information, extracting video caption text information, and converting the video caption text information using the video caption animation model to generate the video caption;
    wherein video caption broadcast information reflecting different moods of the current speech provider corresponds to different video caption animation models, and a parameter in the video caption animation model is used to control the degree of animation effect of the generated video caption, so that video captions with different degrees of animation effect can be generated by adjusting the parameter of the video caption animation model.
  8. The system according to claim 7, characterized in that the detector is an image collector for collecting facial expression information of the speech provider corresponding to the captions in the video.
  9. The system according to claim 7, characterized in that the detector is a receiver for receiving video caption play type control information input by a user.
  10. The system according to claim 7, characterized in that the detector is a voice collector for collecting the tone of the speech provider corresponding to the captions in the video;
    and the processor is further configured to obtain the tone, calculate the tone variation over a preset time period, and determine the video caption play type control information corresponding to the tone variation.
  11. The system according to claim 7, characterized in that the processor extracts the video caption text information by:
    collecting voice information of the speech provider corresponding to the captions in the video; and
    recognizing the voice information to generate text information corresponding to the voice.
  12. The system according to any one of claims 7-11, characterized in that the processor is further configured to, before generating the video caption, collect the speech volume of the speech provider corresponding to the captions in the video, and adjust the parameter of the video caption animation model according to the speech volume.
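The claimed pipeline (detect play type control information from expression, user input, or tone change; match broadcast information; select an animation model; attach recognized text) can be sketched end to end. Every function and mapping below is a hypothetical stand-in: the claims name no concrete APIs for expression analysis, speech recognition, or rendering, and the mood-to-style table is invented for illustration.

```python
# Hypothetical mood -> animation-model style table (broadcast information matching).
MOOD_TO_MODEL = {
    "excited": "bounce",
    "sad": "fade",
    "neutral": "static",
}

def detect_play_type(facial_expression=None, user_input=None, tone_change=None):
    """Play type control information may come from any of the three claimed
    detector kinds; here explicit user input is given priority."""
    return user_input or facial_expression or tone_change or "neutral"

def generate_caption(control_info, recognized_text):
    """Match broadcast information, pick an animation model, attach the text
    (recognized_text stands in for the speech-recognition output of claim 5)."""
    mood = control_info if control_info in MOOD_TO_MODEL else "neutral"
    return {"text": recognized_text, "animation": MOOD_TO_MODEL[mood]}

caption = generate_caption(detect_play_type(facial_expression="excited"), "Hello!")
```

A real system would replace the table lookup with the parameterized animation models of claims 1 and 7, whose degree is further adjusted by speech volume per claims 6 and 12.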
CN201310018669.9A 2013-01-17 2013-01-17 The generation method and system of video caption Active CN103945140B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310018669.9A CN103945140B (en) 2013-01-17 2013-01-17 The generation method and system of video caption

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310018669.9A CN103945140B (en) 2013-01-17 2013-01-17 The generation method and system of video caption

Publications (2)

Publication Number Publication Date
CN103945140A CN103945140A (en) 2014-07-23
CN103945140B true CN103945140B (en) 2017-11-28

Family

ID=51192596

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310018669.9A Active CN103945140B (en) 2013-01-17 2013-01-17 The generation method and system of video caption

Country Status (1)

Country Link
CN (1) CN103945140B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104392633B (en) * 2014-11-12 2020-08-25 国家电网公司 Explanation control method for power system simulation training
CN108419141B (en) * 2018-02-01 2020-12-22 广州视源电子科技股份有限公司 Subtitle position adjusting method and device, storage medium and electronic equipment
CN111507143B (en) 2019-01-31 2023-06-02 北京字节跳动网络技术有限公司 Expression image effect generation method and device and electronic equipment
CN110990623B (en) * 2019-12-04 2024-03-01 广州酷狗计算机科技有限公司 Audio subtitle display method and device, computer equipment and storage medium
CN111814540B (en) * 2020-05-28 2024-08-27 维沃移动通信有限公司 Information display method, information display device, electronic equipment and readable storage medium
CN113301428A (en) * 2021-05-14 2021-08-24 上海樱帆望文化传媒有限公司 Live caption device for electric competition events
CN118104241A (en) * 2021-10-27 2024-05-28 海信视像科技股份有限公司 Display apparatus

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1711756A (en) * 2002-11-15 2005-12-21 汤姆森许可贸易公司 Method and apparatus for composition of subtitles
CN1908965A (en) * 2005-08-05 2007-02-07 索尼株式会社 Information processing apparatus and method, and program
CN101309390A (en) * 2007-05-17 2008-11-19 华为技术有限公司 Visual communication system, apparatus and subtitle displaying method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20100044477A (en) * 2008-10-22 2010-04-30 삼성전자주식회사 Display apparatus and control method thereof

Also Published As

Publication number Publication date
CN103945140A (en) 2014-07-23

Similar Documents

Publication Publication Date Title
CN103945140B (en) The generation method and system of video caption
CN110531860B (en) Animation image driving method and device based on artificial intelligence
US11380316B2 (en) Speech interaction method and apparatus
CN109087669B (en) Audio similarity detection method and device, storage medium and computer equipment
CN111194465B (en) Audio activity tracking and summarization
CN106604125B (en) A kind of determination method and device of video caption
CN109637518A (en) Virtual newscaster's implementation method and device
CN112040263A (en) Video processing method, video playing method, video processing device, video playing device, storage medium and equipment
EP2595031A2 (en) Display apparatus and control method thereof
CN107316642A (en) Video file method for recording, audio file method for recording and mobile terminal
CN110097890A (en) A kind of method of speech processing, device and the device for speech processes
WO2020134926A1 (en) Video quality evaluation method, apparatus and device, and storage medium
CN103873919B (en) A kind of information processing method and electronic equipment
CN107918726A (en) Apart from inducing method, equipment and storage medium
CN109040641A (en) A kind of video data synthetic method and device
CN107809654A (en) System for TV set and TV set control method
CN107770598A (en) A kind of detection method synchronously played, mobile terminal
CN113301372A (en) Live broadcast method, device, terminal and storage medium
CN108364635A (en) A kind of method and apparatus of speech recognition
CN103414720A (en) Interactive 3D voice service method
JP2019082982A (en) Cooking support device, cooking information generation device, cooking support system, cooking support method, and program
WO2022041192A1 (en) Voice message processing method and device, and instant messaging client
KR101119867B1 (en) Apparatus for providing information of user emotion using multiple sensors
JP6305538B2 (en) Electronic apparatus, method and program
CN110164438A (en) A kind of audio recognition method, device and electronic equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant