CN103607635A - Method, device and terminal for caption identification - Google Patents

Method, device and terminal for caption identification

Info

Publication number
CN103607635A
Authority
CN
China
Prior art keywords
captions
caption identification
caption
identification
probability
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310463870.8A
Other languages
Chinese (zh)
Inventor
李鹏
孙熙
崇伟峰
章志坚
高鹏程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
KUYUN INTERACTIVE TECHNOLOGY LIMITED
Original Assignee
Very (beijing) Mdt Infotech Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Very (beijing) Mdt Infotech Ltd filed Critical Very (beijing) Mdt Infotech Ltd
Priority to CN201310463870.8A priority Critical patent/CN103607635A/en
Publication of CN103607635A publication Critical patent/CN103607635A/en
Pending legal-status Critical Current

Landscapes

  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention discloses a caption identification method. The method comprises the steps of extracting multiple frames of images containing captions from a current video stream; performing caption identification on the multiple frames of images to obtain a plurality of caption identification results; detecting, among the obtained plurality of caption identification results, at least two caption identification results that belong to the same caption; and determining a final caption identification result based on the at least two caption identification results that belong to the same caption. According to the invention, the accuracy of character recognition results for captions in video is greatly improved, subsequent computing tasks are facilitated, and the efficiency of the character recognition process is improved. The invention also discloses a device and a terminal for implementing the above method.

Description

Subtitle recognition method, device and terminal
Technical field
The present invention relates to the field of video recognition technology, and in particular to a subtitle recognition method, device and terminal.
Background art
With the continuing spread of smart televisions and the steadily increasing intelligence of set-top boxes, information providers expect a push mode that is related either to the programme content currently being broadcast or to the characteristics of the audience watching the programme. Many TV programmes carry subtitles corresponding to the sound, and recognizing these subtitles is of great value to an information delivery system in determining what the television is currently showing.
OCR (Optical Character Recognition) technology can be applied to text detection in video screenshots. However, because the text in TV programmes is usually superimposed directly on varied video content, the character recognition result is significantly degraded when the background colour is very close to the text colour, making the recognition result inefficient to use in subsequent computing tasks.
Summary of the invention
The embodiments of the present invention provide a subtitle recognition method, device and terminal that can significantly improve the accuracy of character recognition results.
To achieve the above object, the embodiments of the present invention adopt the following technical solutions:
A subtitle recognition method, the method comprising:
extracting multiple frames of images containing subtitles from a current video stream;
performing subtitle recognition on each of the multiple frames of images to obtain a plurality of subtitle recognition results;
detecting, among the obtained plurality of subtitle recognition results, at least two subtitle recognition results that belong to the same subtitle; and
determining a final subtitle recognition result according to the at least two subtitle recognition results that belong to the same subtitle.
By extracting multiple frames of images containing subtitles to obtain a plurality of subtitle recognition results and merging the recognition results that belong to the same subtitle, the accuracy of character recognition for subtitles in video is greatly improved, subsequent computing tasks are facilitated, and the efficiency of the character recognition process is improved.
Detecting, among the obtained plurality of subtitle recognition results, at least two subtitle recognition results that belong to the same subtitle comprises: determining the at least two subtitle recognition results that belong to the same subtitle according to one or more of the time interval information of the recognized subtitles, the difference in content length between subtitle recognition results, and the proportion of identical text between subtitle recognition results.
Combining the three detection methods effectively compensates for the blind spots of each method when used alone, so that the accuracy of the detection result is significantly improved.
Determining a final subtitle recognition result according to the at least two subtitle recognition results that belong to the same subtitle comprises: performing matching processing on the at least two subtitle recognition results that belong to the same subtitle, and retaining the subtitle text that is identical across the at least two matched recognition results; for the text that differs among the at least two recognition results, determining the probability of each differing candidate according to a probability model, and selecting the candidate with the greater probability; and merging the selected greater-probability candidates with the retained identical text, in the order in which they appear in the recognition results, into the final subtitle recognition result. Determining, from the plurality of recognition results and the probability model, the subtitle combination with the highest probability of occurrence ensures the accuracy of the recognition result.
Before determining, according to the probability model, which of the differing candidates has the higher probability of occurrence, the method further comprises: obtaining content category information of the current video stream; and determining the corresponding probability model according to the content category information.
For video subtitles of different content categories, the recognition result can thus be determined with a probability model corresponding to that category, so that the subtitle recognition algorithm is highly customized to the video content category, further improving the accuracy of the recognition result.
Extracting multiple frames of images containing subtitles from the current video stream comprises: extracting, at a preset frequency, consecutive frames of images containing subtitles from the current video stream. This does not reduce the overall subtitle recognition effect, and the processing speed is more stable.
The method further comprises: sending the final subtitle recognition result to a server; and receiving information pushed by the server according to the final subtitle recognition result. The more accurate the recognition result that is sent, the closer the pushed information is to the video content and the more likely it is to interest the viewer.
A subtitle recognition device, comprising:
an extraction module, configured to extract multiple frames of images containing subtitles from a current video stream;
an identification module, configured to perform subtitle recognition on each of the multiple frames of images to obtain a plurality of subtitle recognition results;
a detection module, configured to detect, among the obtained plurality of subtitle recognition results, at least two subtitle recognition results that belong to the same subtitle;
a determination module, configured to determine a final subtitle recognition result according to the at least two subtitle recognition results that belong to the same subtitle.
The detection module comprises:
a first determining unit, configured to determine the at least two subtitle recognition results that belong to the same subtitle according to one or more of the time interval information of the recognized subtitles, the difference in content length between subtitle recognition results, and the proportion of identical text between subtitle recognition results.
The determination module comprises:
a processing unit, configured to perform matching processing on the at least two subtitle recognition results that belong to the same subtitle, and to retain the subtitle text that is identical across the at least two matched recognition results;
a second determining unit, configured to determine, for the text that differs among the at least two recognition results, the probability of each differing candidate according to a probability model, and to select the candidate with the greater probability;
a merging unit, configured to merge the selected greater-probability candidates with the retained identical text, in the order in which they appear in the recognition results, into the final subtitle recognition result.
The determination module further comprises:
an acquiring unit, configured to obtain content category information of the current video stream;
a third determining unit, configured to determine the corresponding probability model according to the content category information.
The extraction module is configured to extract, at a preset frequency, consecutive frames of images containing subtitles from the current video stream.
The subtitle recognition device further comprises:
a sending module, configured to send the final subtitle recognition result to a server;
a receiving module, configured to receive the information pushed by the server according to the final subtitle recognition result.
A subtitle recognition terminal comprises any one of the above subtitle recognition devices.
Other features and advantages of the present invention will be set forth in the following description, will in part become apparent from the description, or may be learned by practising the invention. The objects and other advantages of the invention may be realized and obtained by the structure particularly pointed out in the written description, the claims and the accompanying drawings.
The technical solutions of the present invention are described in further detail below with reference to the drawings and embodiments.
Brief description of the drawings
The accompanying drawings are provided to give a further understanding of the present invention and form part of the specification; together with the embodiments, they serve to explain the present invention and are not to be construed as limiting it. In the drawings:
Fig. 1 is a flowchart of a subtitle recognition method provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of the method of determining the final subtitle recognition result in Embodiment 1;
Fig. 3 is a flowchart of a subtitle recognition method provided by Embodiment 2 of the present invention;
Fig. 4 is a flowchart of the method of determining the final subtitle recognition result in Embodiment 2;
Fig. 5 is a flowchart of a subtitle recognition method provided by Embodiment 3 of the present invention;
Fig. 6 is a structural schematic diagram of a subtitle recognition device provided by Embodiment 1 of the present invention;
Fig. 7 is a structural schematic diagram of the detection module of Embodiment 1;
Fig. 8 is a structural schematic diagram of the determination module of Embodiment 1;
Fig. 9 is a structural schematic diagram of a subtitle recognition device provided by Embodiment 2 of the present invention;
Fig. 10 is a structural schematic diagram of the detection module of Embodiment 2;
Fig. 11 is a structural schematic diagram of the determination module of Embodiment 2;
Fig. 12 is a structural schematic diagram of a subtitle recognition device provided by Embodiment 3 of the present invention.
Detailed description of the embodiments
The preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described herein are intended only to describe and explain the present invention, not to limit it.
Fig. 1 shows a subtitle recognition method provided by Embodiment 1 of the present invention. The method comprises:
S101: extracting multiple frames of images containing subtitles from a current video stream.
The current video stream is read, and multiple frames of images containing subtitles are extracted from it. The embodiments of the present invention can be applied in several scenarios, for example in a television set, where the video stream can be the live stream of the TV programme being broadcast, the stream of a programme played on demand on the television, or the stream of an Internet video resource obtained through a broadcast mode. They can also be applied to a player installed on a terminal device such as a computer, tablet or mobile phone, which can perform subtitle recognition on a subtitled video stream while it is playing.
Key-frame recognition technology can be used to extract the multiple frames containing subtitles. Among a series of consecutive video frames, key frames with intact video images are collected, and the key frames with characteristic features (such as subtitle features) are then identified. For example, when extracting the multiple frames containing subtitles from the current video stream, a plurality of key frames with intact video images are first collected from the video stream; each key frame is then subjected to colour-distribution processing and matched against a specific subtitle appearance model (such as font, colour and position distribution) to detect the key frames that contain subtitles, and the images of these key frames are extracted.
Alternatively, consecutive frames containing subtitles can be extracted from the current video stream at a preset frequency: screenshots are taken according to the preset frequency, and the consecutive frames containing subtitles among them are extracted. This does not reduce the overall subtitle recognition effect, and the processing speed is more stable.
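As an illustration of sampling frames at a preset frequency, a minimal sketch using OpenCV (cv2) is given below; the capture source, sampling interval and fallback frame rate are assumptions for illustration and are not part of the patent disclosure.

```python
# Minimal sketch: sample frames from a video stream at a preset frequency.
# Assumes OpenCV (cv2) is available; the source and interval are illustrative.
import cv2

def sample_frames(source, interval_seconds=0.5):
    """Yield (timestamp_seconds, frame) pairs taken every interval_seconds."""
    cap = cv2.VideoCapture(source)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0   # fall back to 25 fps if unknown
    step = max(int(fps * interval_seconds), 1)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % step == 0:
            yield index / fps, frame
        index += 1
    cap.release()
```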
S102: performing subtitle recognition on each of the multiple frames of images to obtain a plurality of subtitle recognition results.
OCR technology can be used to perform subtitle recognition on each of the extracted frames, yielding a plurality of preliminary subtitle recognition results, one per frame.
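A minimal sketch of this per-frame step follows, assuming a Tesseract-based OCR engine via pytesseract; the engine choice, the 'chi_sim' language pack and the bottom-band crop are illustrative assumptions, since the patent only requires that some OCR technology is applied to each frame.

```python
# Minimal sketch: run OCR on each sampled frame and collect preliminary results.
# pytesseract and the 'chi_sim' language pack are assumptions for illustration.
import pytesseract

def recognize_frames(timed_frames, lang="chi_sim"):
    """Return a list of (timestamp, text) preliminary subtitle recognition results."""
    results = []
    for timestamp, frame in timed_frames:
        h = frame.shape[0]
        subtitle_band = frame[int(h * 0.8):, :]   # assume subtitles sit in the bottom 20%
        text = pytesseract.image_to_string(subtitle_band, lang=lang).strip()
        if text:
            results.append((timestamp, text))
    return results
```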
S103: detecting, among the obtained plurality of subtitle recognition results, at least two results that belong to the same subtitle.
For the large number of recognition results obtained in S102, it is detected which of them belong to the same subtitle. The detection methods include at least the following three:
(1) For each recognition result, a judgement is made according to the time interval information of the recognized subtitle: the time information of the result is obtained, and it is judged whether the interval from the previous result is within a preset duration. The time for which the same subtitle is displayed in the video is not long; if the interval between the current result and the previous result is outside the preset duration (for example 2 seconds), it can be determined that they do not belong to the same subtitle displayed in the video. If the interval is within the preset duration, it can be preliminarily determined that they belong to the same subtitle.
(2) For each recognition result, it is judged whether the difference in content length from the previous result is within a preset length. Recognition results of the same subtitle should have the same content length, although errors caused by the quality of the extracted images and by the recognition process cannot be excluded. Therefore, if the difference in content length between the current result and the previous result is outside the preset length (for example 2 characters), it can be determined that they do not belong to the same subtitle displayed in the video. If the difference is within the preset length, it can be preliminarily determined that they belong to the same subtitle.
(3) For each recognition result, it is judged whether the proportion of text it shares with the previous result is higher than a preset ratio. Recognition results of the same subtitle should contain the same text, although errors caused by the quality of the extracted images and by the recognition process cannot be excluded. Therefore, if the proportion of identical text between the current result and the previous result is lower than the preset ratio (for example 60%), it can be determined that they do not belong to the same subtitle displayed in the video. If the proportion is higher than the preset ratio, it can be preliminarily determined that they belong to the same subtitle.
Any one of the above three methods can quickly detect the recognition results that belong to the same subtitle. To improve detection accuracy, two or all three of the methods can be combined for a better detection effect; using all three together detects the results belonging to the same subtitle most accurately. For example, it is first judged whether the time interval from the previous result is within the preset duration; if not, the results do not belong to the same subtitle; if so, it is then judged whether the difference in content length from the previous result is within the preset length; if not, they do not belong to the same subtitle; if so, it is further judged whether the proportion of identical text with the previous result is higher than the preset ratio; if not, they do not belong to the same subtitle; if so, it can be determined that the current result and the previous result belong to the same subtitle. Combining the three detection methods effectively compensates for the blind spots of each method when used alone, so that the accuracy of the detection result is significantly improved.
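A sketch combining the three checks described above is given below; the thresholds mirror the examples in the text (2 seconds, 2 characters, 60%), and the function names and the simple shared-character ratio are illustrative assumptions.

```python
# Minimal sketch: decide whether two consecutive recognition results belong to
# the same subtitle by combining the three checks described above.
MAX_GAP_SECONDS = 2.0     # preset duration
MAX_LENGTH_DIFF = 2       # preset length, in characters
MIN_SHARED_RATIO = 0.6    # preset ratio of identical text

def shared_ratio(a, b):
    """Fraction of characters of the shorter string that also occur in the other."""
    if not a or not b:
        return 0.0
    shorter, longer = (a, b) if len(a) <= len(b) else (b, a)
    shared = sum(1 for ch in shorter if ch in longer)
    return shared / len(shorter)

def same_subtitle(prev, cur):
    """prev and cur are (timestamp, text) results from consecutive frames."""
    (t_prev, text_prev), (t_cur, text_cur) = prev, cur
    if t_cur - t_prev > MAX_GAP_SECONDS:
        return False
    if abs(len(text_cur) - len(text_prev)) > MAX_LENGTH_DIFF:
        return False
    return shared_ratio(text_prev, text_cur) >= MIN_SHARED_RATIO
```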
S104: determining a final subtitle recognition result according to the at least two recognition results that belong to the same subtitle. The method of determining the final result is shown in Fig. 2 and comprises:
S104a: performing matching processing on the at least two recognition results that belong to the same subtitle, and retaining the text that is identical across the at least two matched results.
A recognition result can be regarded as a string of characters, so an existing string-matching algorithm can be used to match the at least two recognition results that belong to the same subtitle. The at least two results can be matched pairwise in sequence, or several of them can be matched simultaneously.
For example, the first recognition result is "__ Yuan Ping and Li Tianjiang held a meeting-talk", where "__" denotes a space, i.e. the first character has been missed at the beginning of the result ("meeting-talk" renders the two-character word for "talks" character by character);
the second recognition result is "Wang Yuan Ping and Li Tianjiang held a remaining-talk", in which the character "remaining" appears in place of the character "meeting".
The matching algorithm aligns the portions of the two recognition results in which identical characters occur consecutively. After alignment, the text that is identical in both results is retained, namely "Yuan Ping and Li Tianjiang held a" and "talk".
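The alignment of two recognition results can be sketched with an off-the-shelf string-matching algorithm; the sketch below uses Python's difflib purely as one illustration of the "existing string-matching algorithm" mentioned above, not as the method prescribed by the patent. In the example above, the identical runs would be the retained text and the differing pairs would be (space, "Wang") and ("meeting", "remaining").

```python
# Minimal sketch: align two recognition results, keeping the runs of identical
# characters and collecting the differing segments as candidate pairs.
from difflib import SequenceMatcher

def align(result_a, result_b):
    """Return (identical_runs, differing_pairs) for two recognition results."""
    matcher = SequenceMatcher(None, result_a, result_b, autojunk=False)
    identical, differing = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            identical.append(result_a[i1:i2])
        else:   # 'replace', 'delete' or 'insert': the two results disagree here
            differing.append((result_a[i1:i2], result_b[j1:j2]))
    return identical, differing
```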
S104b: for the text that differs among the at least two recognition results, determining the probability of each differing candidate according to a probability model, and selecting the candidate with the greater probability.
The probability model is used to judge which recognition result is more likely to occur. Most directly, a probability model can be obtained by counting, over a large corpus of text, the probabilities with which characters follow one another. A more accurate model can further take into account semantic features such as the adjacent characters before and after a given character (one, two or even more neighbours on each side), the position at which the character appears in the sentence, and its part of speech.
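A minimal sketch of the most direct form of such a model — character-successor probabilities counted over a text corpus — follows; the class name, data structures and add-one smoothing are illustrative assumptions rather than the patent's definition.

```python
# Minimal sketch: estimate P(next_char | current_char) from a large text corpus.
from collections import Counter, defaultdict

class BigramModel:
    def __init__(self, corpus_lines):
        self.follow = defaultdict(Counter)   # follow[c][d] = count of d right after c
        self.totals = Counter()
        for line in corpus_lines:
            for cur, nxt in zip(line, line[1:]):
                self.follow[cur][nxt] += 1
                self.totals[cur] += 1

    def prob(self, cur, nxt):
        """Probability that nxt follows cur, with add-one smoothing."""
        vocab = max(len(self.follow[cur]), 1)
        return (self.follow[cur][nxt] + 1) / (self.totals[cur] + vocab)
```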
According to the probability model, the more probable of the differing candidates in the two recognition results is determined. In the example above, when the two characters "Yuan Ping" occur together, the probability that the character "Wang" appears before them is 95%, while the probability that a space appears before them is 6%; since the probability of "Wang" is greater than that of a space or of no character at all, the character "Wang" is selected. Between "held a" and "talk", the probability of the character "meeting" is 85% and the probability of the character "remaining" is 14%; since the probability of "meeting" is greater than that of "remaining", the character "meeting" is selected.
S104c: merging the selected greater-probability candidates with the retained identical text, in the order in which they appear in the recognition results, into the final subtitle recognition result.
For example, the characters "Wang" and "meeting", determined according to the probability model to have the greater probability, are merged with the retained identical text "Yuan Ping and Li Tianjiang held a" and "talk" in the order in which they appear in the recognition results, giving "Wang" + "Yuan Ping and Li Tianjiang held a" + "meeting" + "talk".
The recognition result obtained by matching the first recognition result with the second is thus "Wang Yuanping and Li Tianjiang held talks".
If the pairwise matching approach is adopted, the result obtained by matching the first recognition result with the second is then matched with the third recognition result, and so on until all recognition results belonging to the same subtitle have been matched in turn, yielding the final subtitle recognition result.
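The selection-and-merge steps (S104b/S104c) and the pairwise fold over all results can be sketched on top of the align and BigramModel sketches above; scoring a candidate by the probability of its junction with the preceding retained character is one simple, assumed realisation of "determining the probability of the differing candidates according to the probability model", not the only one.

```python
# Minimal sketch: pick the more probable candidate at each differing position and
# merge it with the retained identical text, in order. `model` is a BigramModel
# instance as sketched earlier; the scoring rule is an illustrative assumption.
from difflib import SequenceMatcher

def merge_pair(result_a, result_b, model):
    merged = []
    matcher = SequenceMatcher(None, result_a, result_b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            merged.append(result_a[i1:i2])
            continue
        cand_a, cand_b = result_a[i1:i2], result_b[j1:j2]
        prev_char = result_a[i1 - 1] if i1 > 0 else ""

        def score(cand):
            # Score by the probability of the junction with the preceding retained
            # character; an empty candidate (missing character) scores lowest.
            if not cand:
                return 0.0
            if not prev_char:
                return 0.5  # no left context at the very start of the string
            return model.prob(prev_char, cand[0])

        merged.append(cand_a if score(cand_a) >= score(cand_b) else cand_b)
    return "".join(merged)

def merge_all(results, model):
    """Fold pairwise matching over all results that belong to the same subtitle."""
    merged = results[0]
    for nxt in results[1:]:
        merged = merge_pair(merged, nxt, model)
    return merged
```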
Alternatively, several of the at least two recognition results belonging to the same subtitle can be matched simultaneously. For example:
the first recognition result is "__ Yuan Ping and Li Tianjiang held a meeting-talk";
the second recognition result is "Wang Yuan Ping and Li Tianjiang held a remaining-talk";
the third recognition result is "Wang Xuan Ping and Li Tianjiang held a remaining-talk".
After alignment, the identical text is retained, namely "Ping and Li Tianjiang held a" and "talk".
The differing candidates are "Wang" versus "__" (a space), "Yuan" versus "Xuan", and "meeting" versus "remaining". According to the probability model, for the character immediately preceding "Ping and Li Tianjiang", the probability of "Yuan" is 75% and the probability of "Xuan" is 15%; for the character two positions before "Ping and Li Tianjiang", the probability of "Wang" is 70% and the probability of a space is 35%. Moreover, when the two characters "Yuan Ping" occur together, the probability that the preceding character is "Wang" is 95% and the probability that it is a space is 6%, whereas when the two characters "Xuan Ping" occur together, the probability that the preceding character is "Wang" is 45% and the probability that it is a space is 12%. From these probabilities, the character immediately preceding "Ping and Li Tianjiang" is determined to be "Yuan", and the character before that is determined to be "Wang". Between "held a" and "talk", the probability of the character "meeting" is 85% and that of "remaining" is 14%; since "meeting" is more probable, the character "meeting" is selected.
The characters "Wang", "Yuan" and "meeting", determined according to the probability model to have the greater probability, are merged with the retained identical text "Ping and Li Tianjiang held a" and "talk" in the order in which they appear in the recognition results, giving "Wang" + "Yuan" + "Ping and Li Tianjiang held a" + "meeting" + "talk", i.e. "Wang Yuanping and Li Tianjiang held talks".
The above is an example of matching three recognition results simultaneously; more recognition results, or even all the recognition results belonging to the same subtitle, can of course be matched simultaneously.
In Embodiment 1 of the present invention, a plurality of subtitle recognition results are obtained by extracting multiple frames of images containing subtitles, and the recognition results that belong to the same subtitle are merged according to a probability model. This greatly improves the accuracy of character recognition for subtitles in video, facilitates subsequent computing tasks, and improves the efficiency of the character recognition process.
Fig. 3 shows a subtitle recognition method provided by Embodiment 2 of the present invention. The method comprises:
S301: extracting multiple frames of images containing subtitles from a current video stream. The implementation is the same as S101 and is not repeated here.
S302: performing subtitle recognition on each of the multiple frames of images to obtain a plurality of subtitle recognition results. The implementation is the same as S102 and is not repeated here.
S303: detecting, among the obtained plurality of subtitle recognition results, at least two results that belong to the same subtitle. The implementation is the same as S103 and is not repeated here.
S304: determining a final subtitle recognition result according to the at least two recognition results that belong to the same subtitle. The method of determining the final result is shown in Fig. 4 and comprises:
S304a: obtaining content category information of the current video stream.
The category of the video content, such as news, variety, music, finance or sports, is determined from the images, sound and broadcast time of the current video stream.
S304b: determining the corresponding probability model according to the content category information.
According to the obtained content category information, the corresponding probability model is determined; the probability model corresponds closely to the content category. A probability model contains the probabilities of occurrence of words and phrases common in that field, and the probability of a given word may differ between models. For example, when the content category is variety, the probabilities of performers' names, stars' names and film titles are larger in the variety-category model than in other models; the probability of the three characters "Guo Degang" occurring together is greater in the variety-category model than that of the three characters "Zhang Dejiang", whereas in the news-category model the probability of "Zhang Dejiang" occurring together is greater than that of "Guo Degang". For video subtitles of different content categories, the recognition result can therefore be determined with the probability model corresponding to that category, so that the subtitle recognition algorithm is highly customized to the video content category, further improving the accuracy of the recognition result.
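A minimal sketch of selecting a category-specific probability model is given below, assuming one corpus per content category and reusing the BigramModel sketched earlier; the category names and corpus paths are illustrative assumptions.

```python
# Minimal sketch: keep one probability model per content category and pick the
# model that matches the category of the current video stream.
CATEGORY_CORPORA = {            # illustrative corpus files, one per category
    "news":    "corpus/news.txt",
    "variety": "corpus/variety.txt",
    "sports":  "corpus/sports.txt",
}

_models = {}

def model_for_category(category):
    """Lazily build and cache the BigramModel for the given content category."""
    if category not in _models:
        path = CATEGORY_CORPORA.get(category, CATEGORY_CORPORA["news"])
        with open(path, encoding="utf-8") as f:
            _models[category] = BigramModel(f.read().splitlines())
    return _models[category]
```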
S304c: performing matching processing on the at least two recognition results that belong to the same subtitle, and retaining the text that is identical across the at least two matched results. The implementation is the same as S104a and is not repeated here.
S304d: for the text that differs among the at least two recognition results, determining the probability of each differing candidate according to the probability model, and selecting the candidate with the greater probability. The probabilities of the differing candidates are determined according to the probability model corresponding to the video content category information determined in S304b, and the candidate with the greater probability is then selected.
S304e: merging the selected greater-probability candidates with the retained identical text, in the order in which they appear in the recognition results, into the final subtitle recognition result. The implementation is the same as S104c and is not repeated here.
In Embodiment 2 of the present invention, a plurality of subtitle recognition results are obtained by extracting multiple frames of images containing subtitles, and the recognition results that belong to the same subtitle are merged according to a probability model that is determined from the video content category information. This greatly improves the accuracy of character recognition for subtitles in video, facilitates subsequent computing tasks, and improves the efficiency of the character recognition process. Because the subtitles of videos of different content categories are recognized with the probability model corresponding to that category, the subtitle recognition algorithm is highly customized to the video content category, which further improves the accuracy of the recognition result.
Fig. 5 shows a subtitle recognition method provided by Embodiment 3 of the present invention. The method comprises:
S501: extracting multiple frames of images containing subtitles from a current video stream. The implementation is the same as S101 and is not repeated here.
S502: performing subtitle recognition on each of the multiple frames of images to obtain a plurality of subtitle recognition results. The implementation is the same as S102 and is not repeated here.
S503: detecting, among the obtained plurality of subtitle recognition results, at least two results that belong to the same subtitle. The implementation is the same as S103 and is not repeated here.
S504: determining a final subtitle recognition result according to the at least two recognition results that belong to the same subtitle. The implementation is the same as S104 or S304 and is not repeated here.
S505: sending the final subtitle recognition result to a server.
The final subtitle recognition result is sent to the server in order to query information related to it, for example related entertainment news, videos of the same category, or product advertisements.
S506: receiving the information pushed by the server according to the final subtitle recognition result.
The information pushed by the server according to the final subtitle recognition result, such as related entertainment news, videos of the same category or product advertisements, is received and displayed for the programme viewer to browse, select or watch.
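A minimal sketch of the client side of S505/S506 follows, assuming an HTTP interface via the requests library; the endpoint URL and JSON field names are illustrative placeholders, since the patent does not specify a transport protocol.

```python
# Minimal sketch: send the final recognition result to the server and receive
# the information it pushes back. Endpoint and field names are illustrative.
import requests

SERVER_URL = "https://example.com/api/subtitle"   # placeholder endpoint

def exchange_with_server(final_subtitle, timeout=5):
    response = requests.post(
        SERVER_URL,
        json={"subtitle": final_subtitle},
        timeout=timeout,
    )
    response.raise_for_status()
    # Expected to contain related news, same-category videos, adverts, etc.
    return response.json()
```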
In Embodiment 3 of the present invention, a plurality of subtitle recognition results are obtained by extracting multiple frames of images containing subtitles, and the recognition results that belong to the same subtitle are merged according to a probability model determined from the video content category information. This greatly improves the accuracy of character recognition for subtitles in video, facilitates subsequent computing tasks, and improves the efficiency of the character recognition process. Because the more accurate recognition result is sent to the server for querying, the pushed information that is received is closer to the video content and more likely to interest the viewer.
Fig. 6 shows a subtitle recognition device corresponding to Embodiment 1 of the present invention, comprising:
an extraction module 51, configured to extract multiple frames of images containing subtitles from a current video stream;
an identification module 52, configured to perform subtitle recognition on each of the multiple frames of images to obtain a plurality of subtitle recognition results;
a detection module 53, configured to detect, among the obtained plurality of subtitle recognition results, at least two subtitle recognition results that belong to the same subtitle;
a determination module 54, configured to determine a final subtitle recognition result according to the at least two subtitle recognition results that belong to the same subtitle.
As shown in Fig. 7, the detection module 53 comprises:
a first determining unit 533, configured to determine the at least two subtitle recognition results that belong to the same subtitle according to one or more of the time interval information of the recognized subtitles, the difference in content length between recognition results, and the proportion of identical text between recognition results.
As shown in Fig. 8, the determination module 54 comprises:
a processing unit 540, configured to perform matching processing on the at least two subtitle recognition results that belong to the same subtitle, and to retain the text that is identical across the at least two matched recognition results;
a second determining unit 541, configured to determine, for the text that differs among the at least two recognition results, the probability of each differing candidate according to a probability model, and to select the candidate with the greater probability;
a merging unit 542, configured to merge the selected greater-probability candidates with the retained identical text, in the order in which they appear in the recognition results, into the final subtitle recognition result.
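One way to read the module structure of Figs. 6-8 is as a simple pipeline; the sketch below composes the functions sketched earlier in this description into one object purely for illustration, and is not a structure defined by the patent.

```python
# Minimal sketch: the extraction / identification / detection / determination
# modules of Fig. 6 composed into one pipeline (reuses the earlier sketches:
# sample_frames, recognize_frames, same_subtitle, merge_all, BigramModel).
class SubtitleRecognizer:
    def __init__(self, model):
        self.model = model                       # probability model used by the determination module

    def run(self, source):
        frames = sample_frames(source)           # extraction module
        results = recognize_frames(frames)       # identification module
        groups, current = [], []
        for item in results:                     # detection module: group same-subtitle results
            if current and not same_subtitle(current[-1], item):
                groups.append(current)
                current = []
            current.append(item)
        if current:
            groups.append(current)
        # determination module: merge each group into one final recognition result
        return [merge_all([text for _, text in g], self.model) for g in groups]
```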
Fig. 9 shows a subtitle recognition device corresponding to Embodiment 2 of the present invention, comprising:
an extraction module 51, configured to extract multiple frames of images containing subtitles from a current video stream;
an identification module 52, configured to perform subtitle recognition on each of the multiple frames of images to obtain a plurality of subtitle recognition results;
a detection module 53, configured to detect, among the obtained plurality of subtitle recognition results, at least two subtitle recognition results that belong to the same subtitle;
a determination module 54, configured to determine a final subtitle recognition result according to the at least two subtitle recognition results that belong to the same subtitle.
As shown in Fig. 10, the detection module 53 comprises:
a first determining unit 533, configured to determine the at least two subtitle recognition results that belong to the same subtitle according to one or more of the time interval information of the recognized subtitles, the difference in content length between recognition results, and the proportion of identical text between recognition results.
As shown in Fig. 11, the determination module 54 comprises:
an acquiring unit 543, configured to obtain content category information of the current video stream;
a third determining unit 544, configured to determine the corresponding probability model according to the content category information;
a processing unit 540, configured to perform matching processing on the at least two subtitle recognition results that belong to the same subtitle, and to retain the text that is identical across the at least two matched recognition results;
a second determining unit 541, configured to determine, for the text that differs among the at least two recognition results, the probability of each differing candidate according to the probability model, and to select the candidate with the greater probability;
a merging unit 542, configured to merge the selected greater-probability candidates with the retained identical text, in the order in which they appear in the recognition results, into the final subtitle recognition result.
Fig. 12 shows a subtitle recognition device corresponding to Embodiment 3 of the present invention, comprising:
an extraction module 51, configured to extract multiple frames of images containing subtitles from a current video stream;
an identification module 52, configured to perform subtitle recognition on each of the multiple frames of images to obtain a plurality of subtitle recognition results;
a detection module 53, configured to detect, among the obtained plurality of subtitle recognition results, at least two subtitle recognition results that belong to the same subtitle;
a determination module 54, configured to determine a final subtitle recognition result according to the at least two subtitle recognition results that belong to the same subtitle;
a sending module 55, configured to send the final subtitle recognition result to a server;
a receiving module 56, configured to receive the information pushed by the server according to the final subtitle recognition result.
The embodiments of the present invention also provide a subtitle recognition terminal comprising any one of the subtitle recognition devices described in Embodiment 1, Embodiment 2 or Embodiment 3.
The more accurate the subtitle recognition result sent by the subtitle recognition terminal, the closer the information pushed by the server is to the video content and the more likely it is to interest the viewer.
Those skilled in the art will appreciate that embodiments of the present invention may be provided as a method, a system or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It should be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. If such modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to include them as well.

Claims (13)

1. A subtitle recognition method, characterized in that the method comprises:
extracting multiple frames of images containing subtitles from a current video stream;
performing subtitle recognition on each of the multiple frames of images to obtain a plurality of subtitle recognition results;
detecting, among the obtained plurality of subtitle recognition results, at least two subtitle recognition results that belong to the same subtitle; and
determining a final subtitle recognition result according to the at least two subtitle recognition results that belong to the same subtitle.
2. The method of claim 1, characterized in that detecting, among the obtained plurality of subtitle recognition results, at least two subtitle recognition results that belong to the same subtitle comprises:
determining the at least two subtitle recognition results that belong to the same subtitle according to one or more of the time interval information of the recognized subtitles, the difference in content length between subtitle recognition results, and the proportion of identical text between subtitle recognition results.
3. The method of claim 1, characterized in that determining a final subtitle recognition result according to the at least two subtitle recognition results that belong to the same subtitle comprises:
performing matching processing on the at least two subtitle recognition results that belong to the same subtitle, and retaining the subtitle text that is identical across the at least two matched recognition results;
for the text that differs among the at least two recognition results, determining the probability of each differing candidate according to a probability model, and selecting the candidate with the greater probability; and
merging the selected greater-probability candidates with the retained identical text, in the order in which they appear in the recognition results, into the final subtitle recognition result.
4. The method of claim 3, characterized in that before determining, according to the probability model, which of the differing candidates among the at least two recognition results has the higher probability of occurrence, the method further comprises:
obtaining content category information of the current video stream; and
determining the corresponding probability model according to the content category information.
5. The method of claim 1, characterized in that extracting multiple frames of images containing subtitles from the current video stream comprises:
extracting, at a preset frequency, consecutive frames of images containing subtitles from the current video stream.
6. The method of claim 1, characterized in that the method further comprises:
sending the final subtitle recognition result to a server; and
receiving information pushed by the server according to the final subtitle recognition result.
7. A subtitle recognition device, characterized in that it comprises:
an extraction module, configured to extract multiple frames of images containing subtitles from a current video stream;
an identification module, configured to perform subtitle recognition on each of the multiple frames of images to obtain a plurality of subtitle recognition results;
a detection module, configured to detect, among the obtained plurality of subtitle recognition results, at least two subtitle recognition results that belong to the same subtitle; and
a determination module, configured to determine a final subtitle recognition result according to the at least two subtitle recognition results that belong to the same subtitle.
8. The device of claim 7, characterized in that the detection module comprises:
a first determining unit, configured to determine the at least two subtitle recognition results that belong to the same subtitle according to one or more of the time interval information of the recognized subtitles, the difference in content length between subtitle recognition results, and the proportion of identical text between subtitle recognition results.
9. The device of claim 7, characterized in that the determination module comprises:
a processing unit, configured to perform matching processing on the at least two subtitle recognition results that belong to the same subtitle, and to retain the subtitle text that is identical across the at least two matched recognition results;
a second determining unit, configured to determine, for the text that differs among the at least two recognition results, the probability of each differing candidate according to a probability model, and to select the candidate with the greater probability; and
a merging unit, configured to merge the selected greater-probability candidates with the retained identical text, in the order in which they appear in the recognition results, into the final subtitle recognition result.
10. The device of claim 9, characterized in that the determination module further comprises:
an acquiring unit, configured to obtain content category information of the current video stream; and
a third determining unit, configured to determine the corresponding probability model according to the content category information.
11. The device of claim 7, characterized in that the extraction module is configured to extract, at a preset frequency, consecutive frames of images containing subtitles from the current video stream.
12. The device of claim 7, characterized in that the device further comprises:
a sending module, configured to send the final subtitle recognition result to a server; and
a receiving module, configured to receive information pushed by the server according to the final subtitle recognition result.
13. A subtitle recognition terminal, characterized in that it comprises the subtitle recognition device of any one of claims 7 to 12.
CN201310463870.8A 2013-10-08 2013-10-08 Method, device and terminal for caption identification Pending CN103607635A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310463870.8A CN103607635A (en) 2013-10-08 2013-10-08 Method, device and terminal for caption identification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310463870.8A CN103607635A (en) 2013-10-08 2013-10-08 Method, device and terminal for caption identification

Publications (1)

Publication Number Publication Date
CN103607635A true CN103607635A (en) 2014-02-26

Family

ID=50125832

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310463870.8A Pending CN103607635A (en) 2013-10-08 2013-10-08 Method, device and terminal for caption identification

Country Status (1)

Country Link
CN (1) CN103607635A (en)


Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105930836A (en) * 2016-04-19 2016-09-07 北京奇艺世纪科技有限公司 Identification method and device of video text
CN106295592A (en) * 2016-08-17 2017-01-04 北京金山安全软件有限公司 Method and device for identifying subtitles of media file and electronic equipment
CN106604125B (en) * 2016-12-29 2019-06-14 北京奇艺世纪科技有限公司 A kind of determination method and device of video caption
CN106604125A (en) * 2016-12-29 2017-04-26 北京奇艺世纪科技有限公司 Video subtitle determining method and video subtitle determining device
CN106658196A (en) * 2017-01-11 2017-05-10 北京小度互娱科技有限公司 Method and device for embedding advertisement based on video embedded captions
CN107124642A (en) * 2017-03-08 2017-09-01 宏祐图像科技(上海)有限公司 The detection method and system of captions in continuous moving image
CN106973333B (en) * 2017-03-27 2019-11-12 山东浪潮商用系统有限公司 Method and device based on the video caption wrong word word error correction compared
CN106973333A (en) * 2017-03-27 2017-07-21 山东浪潮商用系统有限公司 Method and device based on the video caption wrong word word error correction compared
CN107846622A (en) * 2017-10-27 2018-03-27 北京雷石天地电子技术有限公司 A kind of method and device for detecting captions definition
CN107846622B (en) * 2017-10-27 2020-04-28 北京雷石天地电子技术有限公司 Method and device for detecting definition of subtitles
CN109858427A (en) * 2019-01-24 2019-06-07 广州大学 A kind of corpus extraction method, device and terminal device
CN112926371A (en) * 2019-12-06 2021-06-08 中国移动通信集团设计院有限公司 Road surveying method and system
CN112926371B (en) * 2019-12-06 2023-11-03 中国移动通信集团设计院有限公司 Road survey method and system
CN111193965A (en) * 2020-01-15 2020-05-22 北京奇艺世纪科技有限公司 Video playing method, video processing method and device

Similar Documents

Publication Publication Date Title
CN103607635A (en) Method, device and terminal for caption identification
US10142679B2 (en) Content processing apparatus, content processing method thereof, server information providing method of server and information providing system
US10375451B2 (en) Detection of common media segments
US10304458B1 (en) Systems and methods for transcribing videos using speaker identification
US20230336837A1 (en) Detection of common media segments
CN103581705A (en) Method and system for recognizing video program
CN105808182B (en) Display control method and system, advertisement breach judging device and video and audio processing device
CN202998337U (en) Video program identification system
CN104429091A (en) Methods and apparatus for identifying media
CN104822074A (en) Television program recommending method and device thereof
WO2015169141A1 (en) Channel classification method and device
CN103002328A (en) Method and device for identifying channels
CN110691281B (en) Video playing processing method, terminal device, server and storage medium
WO2018019028A1 (en) Advertisement information pushing method and apparatus, and set-top box
CN104038785A (en) Recommending system and method of live-program-related content of smart television
CN104135671A (en) Television video content interactive question and answer method
EP2691845A2 (en) Semantic enrichment by exploiting top-k processing
CN103593356A (en) Method and system for information searching on basis of multimedia information fingerprint technology and application
CN103414934A (en) Method and system for terminal to display television program information
US11153651B2 (en) Method, apparatus, and device for obtaining play data, and storage medium
CN103023923A (en) Information transmission method and information transmission device
CN113115103A (en) System and method for realizing real-time audio-to-text conversion in network live broadcast
CN107465946B (en) Video playing method, device, system and terminal equipment
CN105791973A (en) Resolving method and resolving device based on sound wave watermark
CN112312208A (en) Multimedia information processing method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20160411

Address after: 100007, room 1509, building B, building 28, No. 1 East Main Street, Dongcheng District, Beijing, Andingmen

Applicant after: KUYUN INTERACTIVE TECHNOLOGY LIMITED

Address before: 100022, room 20, floor 15, building 39, No. 2306 East Third Ring Road, Beijing, Chaoyang District

Applicant before: Very (Beijing) Mdt InfoTech Ltd

RJ01 Rejection of invention patent application after publication

Application publication date: 20140226

RJ01 Rejection of invention patent application after publication