CN106295592A - Method and device for identifying subtitles of media file and electronic equipment - Google Patents

Method and device for identifying subtitles of media file and electronic equipment

Info

Publication number
CN106295592A
Authority
CN
China
Prior art keywords
frame picture
caption
unique
media file
obtains
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610681287.8A
Other languages
Chinese (zh)
Inventor
田昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Internet Security Software Co Ltd
Original Assignee
Beijing Kingsoft Internet Security Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Internet Security Software Co Ltd
Priority to CN201610681287.8A
Publication of CN106295592A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/60: Type of objects
    • G06V20/62: Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/635: Overlay text, e.g. embedded captions in a TV program

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Television Systems (AREA)

Abstract

An embodiment of the invention provides a method and a device for identifying subtitles of a media file, and electronic equipment. The method comprises: screening out, from the frame pictures of a media file, the to-be-processed frame pictures that have subtitle content; de-duplicating the to-be-processed frame pictures according to subtitle content to obtain the unique frame picture corresponding to each subtitle content; recognizing the unique frame picture to obtain the text information corresponding to it; and processing the text information to generate subtitle information. Compared with prior-art schemes, the several frame pictures corresponding to the same subtitle content no longer need to be recognized repeatedly; for each subtitle content only one corresponding frame picture is recognized to obtain the text information, which improves the efficiency of subtitle recognition.

Description

Method and device for identifying subtitles of a media file, and electronic equipment
Technical field
The present invention relates to the field of electronic technology, and in particular to a method and a device for identifying subtitles of a media file, and to electronic equipment.
Background
With the development of networks, and of mobile networks in particular, network bandwidth has increased greatly and video transmission has become very convenient. According to statistics from the well-known video website YouTube, the total duration of video played on the site each month exceeds 4 billion hours. Given such a large volume of video data and user demand, extending the functions that build on the text of video subtitles is especially important. However, the subtitles of many videos are not stored separately but are embedded in the video frames, so the subtitle content in the video frames has to be recognized as text information before such functions can be built.
Most existing video subtitle recognition techniques obtain the frame pictures of a video, recognize each frame picture directly to obtain text information, and then combine the recognized text information with the timestamp information of the video's frame pictures to obtain subtitle information.
Because the prior art processes the frame pictures of the video directly, subtitle recognition is inefficient.
Summary of the invention
The present invention proposes a method and a device for identifying subtitles of a media file, and electronic equipment. Multiple frame pictures are compared, and when different frame pictures correspond to the same subtitle content, one frame picture is taken as the unique frame picture of that subtitle content. The text information of the unique frame picture is then recognized and subtitle information is generated. This avoids recognizing several frame pictures for the same subtitle content and improves the efficiency of subtitle recognition.
In one aspect, an embodiment of the present invention provides a method for identifying subtitles of a media file, the method comprising:
screening out, from the frame pictures of the media file, the to-be-processed frame pictures that have subtitle content;
de-duplicating the to-be-processed frame pictures according to subtitle content to obtain the unique frame picture corresponding to each subtitle content;
recognizing the unique frame picture to obtain the text information corresponding to the unique frame picture;
processing the text information to generate subtitle information.
Screening out the to-be-processed frame pictures that have subtitle content from the frame pictures of the media file specifically comprises:
obtaining a frame picture of the media file every fixed number of frames;
converting the frame picture into a grayscale image;
counting the gray value of each pixel in the grayscale image to obtain the gray-level histogram of the frame picture;
choosing a first threshold and a second threshold that bound a range of gray values, and calculating the local information entropy of the gray-level histogram;
screening out the frame pictures whose local information entropy is greater than a third threshold as the to-be-processed frame pictures.
Alternatively, screening out the to-be-processed frame pictures that have subtitle content from the frame pictures of the media file specifically comprises:
obtaining a frame picture of the media file every fixed number of frames and, when several frame pictures have been obtained, processing the several frame pictures with multiple threads, the processing steps of each thread comprising:
converting the frame picture into a grayscale image;
counting the gray value of each pixel in the grayscale image to obtain the gray-level histogram of the frame picture;
choosing a first threshold and a second threshold that bound a range of gray values, and calculating the local information entropy of the gray-level histogram;
screening out the frame pictures whose local information entropy is greater than a third threshold as the to-be-processed frame pictures.
De-duplicating the to-be-processed frame pictures according to subtitle content to obtain the unique frame picture corresponding to each subtitle content specifically comprises:
Step 1: obtain the first frame picture among the to-be-processed frame pictures as the current frame picture and the second frame picture as the comparison frame picture;
Step 2: judge whether the subtitle content has changed between the current frame picture and the comparison frame picture; if it has changed, perform step 3; if it has not changed, perform step 4;
Step 3: extract the current frame picture as a unique frame picture, take the comparison frame picture as the current frame picture, obtain the frame picture following the comparison frame picture as the new comparison frame picture, and perform step 2;
Step 4: take either the current frame picture or the comparison frame picture as the current frame picture, obtain the frame picture following the comparison frame picture as the new comparison frame picture, and perform step 2.
If unique frame pictures corresponding to multiple subtitle contents are obtained, recognizing the unique frame picture to obtain the text information corresponding to the unique frame picture specifically comprises:
performing multithreaded optical character recognition on the unique frame pictures respectively corresponding to the multiple subtitle contents, to obtain the text information corresponding to each unique frame picture.
Recognizing the unique frame picture to obtain the text information corresponding to the unique frame picture specifically comprises:
performing optical character recognition on the unique frame picture to obtain the text information corresponding to the unique frame picture; or
sending the unique frame picture to a remote server and receiving the text information recognized and returned by the remote server.
Processing the text information to obtain subtitle information specifically comprises:
obtaining the timestamp information of the unique frame picture;
generating the subtitle information from the text according to the timestamp information.
Preferably, after the text information is processed to obtain the subtitle information, the method further comprises:
importing the subtitle information into the media file and synchronously displaying the text in the subtitle information.
Preferably, after the text information is processed to obtain the subtitle information, the method further comprises:
sending the subtitle information to a remote server, so that the remote server reviews, calibrates and saves the subtitle information; when the subtitles of the media file need to be identified again, the calibrated subtitle information is retrieved from the remote server.
In another aspect, an embodiment of the present invention provides a device for identifying subtitles of a media file, the device comprising a screening module, a de-duplication module, an identification module and a subtitle generation module;
the screening module is configured to screen out, from the frame pictures of the media file, the to-be-processed frame pictures that have subtitle content;
the de-duplication module is configured to de-duplicate the to-be-processed frame pictures according to subtitle content to obtain the unique frame picture corresponding to each subtitle content;
the identification module is configured to recognize the unique frame picture to obtain the text information corresponding to the unique frame picture;
the subtitle generation module is configured to process the text information to generate subtitle information.
The screening module comprises a first acquiring unit, a converting unit, a statistics unit, a computing unit and a screening unit, wherein:
the first acquiring unit is configured to obtain a frame picture of the media file every fixed number of frames;
the converting unit is configured to convert the frame picture into a grayscale image;
the statistics unit is configured to count the gray value of each pixel in the grayscale image to obtain the gray-level histogram of the frame picture;
the computing unit is configured to choose a first threshold and a second threshold that bound a range of gray values, and to calculate the local information entropy of the gray-level histogram;
the screening unit is configured to screen out the frame pictures whose local information entropy is greater than a third threshold as the to-be-processed frame pictures.
Alternatively, the screening module comprises a first acquiring unit and multiple processing modules, wherein:
the first acquiring unit is configured to obtain a frame picture of the media file every fixed number of frames;
each processing module comprises a converting unit, a statistics unit, a computing unit and a screening unit;
the converting unit is configured to convert the frame picture into a grayscale image;
the statistics unit is configured to count the gray value of each pixel in the grayscale image to obtain the gray-level histogram of the frame picture;
the computing unit is configured to choose a first threshold and a second threshold that bound a range of gray values, and to calculate the local information entropy of the gray-level histogram;
the screening unit is configured to screen out the frame pictures whose local information entropy is greater than a third threshold as the to-be-processed frame pictures.
The de-duplication module comprises a second acquiring unit, a judging unit, an extraction unit, a first frame picture determination unit and a second frame picture determination unit, wherein:
the second acquiring unit is configured to obtain the first frame picture among the to-be-processed frame pictures as the current frame picture and the second frame picture as the comparison frame picture;
the judging unit is configured to judge whether the subtitle content has changed between the current frame picture and the comparison frame picture;
the extraction unit is configured to extract the current frame picture as a unique frame picture when the judging unit judges that the subtitle content has changed between the current frame picture and the comparison frame picture;
the first frame picture determination unit is configured, when the judging unit judges that the subtitle content has changed between the current frame picture and the comparison frame picture, to take the comparison frame picture as the current frame picture and to obtain the frame picture following the comparison frame picture as the new comparison frame picture;
the second frame picture determination unit is configured, when the judging unit judges that the subtitle content has not changed between the current frame picture and the comparison frame picture, to take either the current frame picture or the comparison frame picture as the current frame picture and to obtain the frame picture following the comparison frame picture as the new comparison frame picture.
The identification module comprises multiple recognition units and is specifically configured to perform multithreaded optical character recognition on the unique frame pictures respectively corresponding to multiple subtitle contents, to obtain the text information corresponding to each unique frame picture.
The identification module is specifically configured to recognize the subtitle content as subtitle text; or
is specifically configured to send the unique frame picture to a remote server and to receive the text information recognized and returned by the remote server.
The subtitle generation module comprises a third acquiring unit and a subtitle generating unit, wherein:
the third acquiring unit is configured to obtain the timestamp information of the unique frame picture;
the subtitle generating unit is configured to generate the subtitle information from the text according to the timestamp information.
Preferably, the device further comprises a subtitle display module, configured to import the subtitle information into the media file and to synchronously display the text in the subtitle information.
Preferably, the device further comprises a review module, configured to send the subtitle information to a remote server; when the subtitles of the media file need to be identified again, the calibrated subtitle information is retrieved from the remote server.
In another aspect, an embodiment of the present invention provides a terminal comprising the device for identifying subtitles of a media file described above.
In another aspect, an embodiment of the present invention provides electronic equipment comprising: a housing, a processor, a memory, a display screen, a circuit board and a power supply circuit, wherein the circuit board is arranged inside the space enclosed by the housing, the processor and the memory are arranged on the circuit board, and the display screen is embedded in the outside of the housing and connected to the circuit board; the power supply circuit is configured to supply power to each circuit or device of the electronic equipment; the memory is configured to store executable program code and data; and the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform the following steps:
screening out, from the frame pictures of a media file, the to-be-processed frame pictures that have subtitle content;
de-duplicating the to-be-processed frame pictures according to subtitle content to obtain the unique frame picture corresponding to each subtitle content;
recognizing the unique frame picture to obtain the text information corresponding to the unique frame picture;
processing the text information to generate subtitle information.
The above scheme of the present invention has at least the following beneficial effect:
For each subtitle content, the present invention recognizes only the unique frame picture corresponding to that subtitle content. Compared with prior-art schemes, the several frame pictures corresponding to the same subtitle content no longer need to be recognized repeatedly; for each subtitle content only one corresponding frame picture is recognized to obtain the text information, which improves the efficiency of subtitle recognition.
Brief description of the drawings
Specific embodiments of the present invention are described below with reference to the accompanying drawings, in which:
Fig. 1 is a schematic diagram of the method for identifying subtitles of a media file in embodiment one of the present invention;
Fig. 2 is a schematic diagram of the method for identifying subtitles of a media file in embodiment two of the present invention;
Fig. 3 is a schematic diagram of the method, in embodiment two of the present invention, of de-duplicating the to-be-processed frame pictures according to subtitle content to obtain the unique frame picture corresponding to each subtitle content;
Fig. 4 is a structural diagram of the device for identifying subtitles of a media file in embodiment three of the present invention;
Fig. 5 is a structural diagram of the device for identifying subtitles of a media file in embodiment four of the present invention;
Fig. 6 is a structural diagram of the screening module in embodiment four of the present invention;
Fig. 7 is a structural diagram of the de-duplication module in embodiment four of the present invention;
Fig. 8 is a structural diagram of the identification module in embodiment four of the present invention;
Fig. 9 is a structural diagram of the electronic equipment in embodiment five of the present invention.
Detailed description
To make the technical solutions and advantages of the present invention clearer, exemplary embodiments of the present invention are described in more detail below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention and not an exhaustive list of all embodiments. Where no conflict arises, the embodiments in this description and the features in the embodiments may be combined with one another.
Embodiments of the present invention provide a method and a device for identifying subtitles of a media file, and electronic equipment. For each subtitle content, only the unique frame picture corresponding to that subtitle content is recognized. Compared with prior-art schemes, the several frame pictures corresponding to the same subtitle content no longer need to be recognized repeatedly; for each subtitle content only one corresponding frame picture is recognized to obtain the text information, which improves the efficiency of subtitle recognition.
In the embodiments of the present invention, the media file may be a video file or a video stream, whose sources include but are not limited to: (1) video files saved in a storage device; (2) live video streams, such as live television video streams and live network video streams.
Embodiment one
Fig. 1 is a schematic flowchart of the first embodiment of the method for identifying subtitles of a media file provided by the present invention. The method provided by embodiment one comprises:
Step 101: screen out, from the frame pictures of the media file, the to-be-processed frame pictures that have subtitle content;
Step 102: de-duplicate the to-be-processed frame pictures according to subtitle content to obtain the unique frame picture corresponding to each subtitle content;
Step 103: recognize the unique frame picture to obtain the text information corresponding to the unique frame picture;
Step 104: process the text information to generate subtitle information.
For each subtitle content, this embodiment recognizes only the unique frame picture corresponding to that subtitle content. Compared with prior-art schemes, the several frame pictures corresponding to the same subtitle content no longer need to be recognized repeatedly; for each subtitle content only one corresponding frame picture is recognized to obtain the text information, which improves the efficiency of subtitle recognition.
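Steps 101 to 104 compose into a simple pipeline. The following is a minimal Python sketch of the control flow only; the `Frame` type and the three callables (`has_subtitle`, `deduplicate`, `recognize`) are hypothetical placeholders introduced for illustration, and the result is the raw pairing of timestamp and text rather than a finished subtitle file.

```python
from typing import Callable, Iterable, List, NamedTuple, Tuple


class Frame(NamedTuple):
    timestamp: float   # seconds from the start of the media file (assumed unit)
    image: object      # decoded frame picture; representation left open


def identify_subtitles(
    frames: Iterable[Frame],
    has_subtitle: Callable[[Frame], bool],               # step 101 test (hypothetical)
    deduplicate: Callable[[List[Frame]], List[Frame]],   # step 102 (hypothetical)
    recognize: Callable[[Frame], str],                   # step 103, e.g. OCR (hypothetical)
) -> List[Tuple[float, str]]:
    pending = [f for f in frames if has_subtitle(f)]     # step 101: screening
    unique = deduplicate(pending)                        # step 102: one frame per subtitle content
    texts = [recognize(f) for f in unique]               # step 103: one recognition per unique frame
    return [(f.timestamp, t) for f, t in zip(unique, texts)]  # step 104: text plus timestamp
```

The point of the structure is that `recognize`, typically the most expensive call, runs once per subtitle content instead of once per frame.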
Embodiment two
Fig. 2 is a schematic flowchart of the second embodiment of the method for identifying subtitles of a media file provided by the present invention. The method provided by embodiment two comprises:
Step 201: screen out, from the frame pictures of the media file, the to-be-processed frame pictures that have subtitle content;
This embodiment provides two methods for screening out the to-be-processed frame pictures that have subtitle content from the frame pictures of the media file. The first method processes several frame pictures with multiple threads; the second method applies the same steps to the frame pictures without multithreading. In both cases the screened frame pictures are subsequently compared by subtitle content so that each subtitle content corresponds to one unique frame picture. The details are as follows:
In the first method of this embodiment, a frame picture of the media file is obtained every fixed number of frames and, when several frame pictures have been obtained, the several frame pictures are processed with multiple threads. The processing steps of each thread comprise:
converting the frame picture into a grayscale image;
counting the gray value of each pixel in the grayscale image to obtain the gray-level histogram of the frame picture;
choosing a first threshold and a second threshold that bound a range of gray values, and calculating the local information entropy of the gray-level histogram; screening out the frame pictures whose local information entropy is greater than a third threshold as the to-be-processed frame pictures.
In the second method of this embodiment, a frame picture of the media file is obtained every fixed number of frames; the frame picture is converted into a grayscale image; the gray value of each pixel in the grayscale image is counted to obtain the gray-level histogram of the frame picture; a first threshold and a second threshold that bound a range of gray values are chosen, the second threshold being greater than the first threshold, and the local information entropy of the gray-level histogram is calculated; the frame pictures whose local information entropy is greater than the third threshold are screened out, these being the to-be-processed frame pictures that have subtitle content.
For example, from the gray value i (i ∈ [0, 255]) of each pixel in the grayscale image, the gray-level histogram H[i] of the frame picture is obtained. Gray values are chosen in the range first threshold θ_ep1 ≤ i ≤ second threshold θ_ep2.
The gray-level histogram is normalized, and the local information entropy ep1 of the histogram is then obtained over this range.
If ep1 ≥ the third threshold EPL, the frame picture is considered to have subtitle content.
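A minimal sketch of this screening test in plain Python, assuming the usual Shannon form: the histogram is normalized over the selected gray range only, p(i) = H[i] / sum of H[j] for j in the range, and the local entropy is -sum of p(i) ln p(i). The normalization range, the natural logarithm, the BT.601 grayscale weights and all function names are assumptions of the sketch, not details stated here.

```python
import math
from typing import List, Sequence, Tuple


def to_gray(pixels: Sequence[Tuple[int, int, int]]) -> List[int]:
    # Assumed conversion: ITU-R BT.601 luma weights applied to (R, G, B) pixels.
    return [min(255, round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in pixels]


def gray_histogram(gray: Sequence[int]) -> List[int]:
    # H[i] = number of pixels whose gray value is i, for i in [0, 255].
    hist = [0] * 256
    for value in gray:
        hist[value] += 1
    return hist


def local_entropy(hist: Sequence[int], lo: int, hi: int) -> float:
    # Assumption: normalize over the selected range [lo, hi] only.
    window = hist[lo:hi + 1]
    total = sum(window)
    if total == 0:
        return 0.0
    return -sum((h / total) * math.log(h / total) for h in window if h > 0)


def has_subtitle(hist: Sequence[int], lo: int, hi: int, entropy_threshold: float) -> bool:
    # Mirrors the example criterion: local entropy >= third threshold.
    return local_entropy(hist, lo, hi) >= entropy_threshold
```

All three thresholds would have to be tuned on real footage; the sketch only fixes the order of operations (grayscale, histogram, entropy over a range, threshold test).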
Step 202: de-duplicate the to-be-processed frame pictures according to subtitle content to obtain the unique frame picture corresponding to each subtitle content;
In this embodiment, the method of comparing the subtitle content of the frame pictures that have subtitle content, so that each subtitle content corresponds to one unique frame picture, is shown in Fig. 3 and comprises:
Step 2021: obtain the first frame picture among the to-be-processed frame pictures as the current frame picture and the second frame picture as the comparison frame picture;
Step 2022: judge whether the subtitle content has changed between the current frame picture and the comparison frame picture; if it has changed, perform step 2023; if it has not changed, perform step 2024;
Step 2023: extract the current frame picture as a unique frame picture, take the comparison frame picture as the current frame picture, obtain the frame picture following the comparison frame picture as the new comparison frame picture, and perform step 2022;
Step 2024: take either the current frame picture or the comparison frame picture as the current frame picture, obtain the frame picture following the comparison frame picture as the new comparison frame picture, and perform step 2022.
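Steps 2021 to 2024 amount to a single pass over the to-be-processed frame pictures. The sketch below writes that loop in Python; `changed` is a hypothetical predicate standing in for the subtitle comparison of step 2022, and emitting the frame of the final run after the loop ends is an assumption, since the steps do not say how the loop terminates.

```python
from typing import Callable, List, Sequence, TypeVar

F = TypeVar("F")


def unique_frames(pending: Sequence[F], changed: Callable[[F, F], bool]) -> List[F]:
    if not pending:
        return []
    unique: List[F] = []
    current = pending[0]                  # step 2021: first frame is the current frame
    for comparison in pending[1:]:        # each later frame becomes the comparison frame in turn
        if changed(current, comparison):  # step 2022: has the subtitle content changed?
            unique.append(current)        # step 2023: current frame represents the finished run
            current = comparison
        # step 2024: unchanged; this sketch keeps `current` (either frame is allowed)
        # and simply advances to the next comparison frame
    unique.append(current)                # assumption: the last run also yields a unique frame
    return unique
```

Because each frame is compared only with its predecessor's representative, the pass is linear in the number of to-be-processed frame pictures.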
Whether the subtitle content has changed between the current frame and the comparison frame can be judged from whether the stroke directions change. Edge detection is first performed on the frame picture, and the histogram formed by the gradient directions of the edge pixels is then computed. The OpenCV function compareHist produces a value expressing the similarity between the gradient direction histograms of the edge pixels of the current frame picture and the comparison frame picture. A threshold is determined, and if the value is not less than the threshold, the subtitle content is considered unchanged.
For example, let the gradient direction histogram of the edge pixels of the current frame picture be H1 and that of the comparison frame picture be H2. The OpenCV function compareHist generates the comparison measure d(H1, H2):
$$d(H_1, H_2) = \sum_i \frac{\left(H_1(i) - H_2(i)\right)^2}{H_1(i) + H_2(i)}$$
where i is the pixel value, i ∈ [0, 255].
If d(H1, H2) ≥ the threshold D(H1, H2), the subtitle content is considered unchanged.
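The comparison measure d(H1, H2) can also be evaluated directly, without OpenCV, which makes its behavior easy to inspect. The sketch below implements the sum of (H1(i) - H2(i))^2 / (H1(i) + H2(i)) on plain Python lists. Note that this quantity is a distance: it is 0 for identical histograms and grows as they diverge. The sketch therefore treats a value at or below a limit as "unchanged"; that comparison direction and the parameter `max_distance` are assumptions of the sketch, and the direction would be reversed for a similarity-type score.

```python
from typing import Sequence


def chi_square_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    # d(H1, H2) = sum_i (H1(i) - H2(i))^2 / (H1(i) + H2(i)).
    # Bins that are empty in both histograms are skipped to avoid dividing by zero.
    total = 0.0
    for a, b in zip(h1, h2):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return total


def subtitle_unchanged(h1: Sequence[float], h2: Sequence[float], max_distance: float) -> bool:
    # Assumption: small distance means the stroke-direction histograms match,
    # so the subtitle content is treated as unchanged.
    return chi_square_distance(h1, h2) <= max_distance
```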
Step 203: recognize the unique frame picture to obtain the text information corresponding to the unique frame picture;
This embodiment provides three methods for recognizing the subtitle content in the unique frame picture as subtitle text:
In the first method, multithreaded optical character recognition is performed on the unique frame pictures respectively corresponding to multiple subtitle contents, to obtain the text information corresponding to each unique frame picture.
In the second method, optical character recognition is performed on the unique frame picture to obtain the text information corresponding to the unique frame picture.
In the third method, the unique frame picture is sent to a remote server, and the text information recognized and returned by the remote server is received.
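The first and third methods combine naturally: a pool of worker threads can call whatever recognizer is available, local or remote. The following is a minimal sketch using Python's standard `concurrent.futures`; `recognize` is a hypothetical callable standing in for an OCR engine or a request to the remote server, and the pool size is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

F = TypeVar("F")


def recognize_all(
    unique_frames: Sequence[F],
    recognize: Callable[[F], str],   # hypothetical: local OCR call or remote request
    max_workers: int = 4,            # assumed pool size
) -> List[str]:
    # Executor.map yields results in input order, so each text stays aligned
    # with the unique frame picture it came from.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(recognize, unique_frames))
```

In Python, threads help most when the recognizer spends its time waiting on the network or in native code; a CPU-bound pure-Python recognizer would likely need processes instead.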
Step 204: process the text information to generate subtitle information.
In this embodiment, processing the subtitle text to obtain subtitle information specifically comprises: adding the subtitle text to a text file and then, according to the content and timestamps of the text file, generating the subtitle information in the format of one time code followed by one subtitle; that is, the text is written into the subtitle information as pairs of one time code and one subtitle.
There are many kinds of subtitles. The commonly used subtitle formats fall into two classes, graphic formats and text formats. Compared with graphic-format subtitles, text-format subtitles are small in size, simple in format, and easy to create and modify. Text-format subtitles include utf, idx, sub, srt, smi, rt, txt, ssa, aq, jss, js and ass, of which srt text subtitles are the most widely used; they are compatible with common media players, and players such as MPC and QQ Player can load this type of subtitle automatically. Therefore, the subtitle information in this embodiment uses the srt format. The embodiment does not, of course, limit the format of the subtitle information, as long as the media player in use supports it.
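An srt file is a sequence of numbered blocks, each holding a time range of the form `HH:MM:SS,mmm --> HH:MM:SS,mmm` followed by the subtitle text and a blank line. The sketch below builds such a file from (start time, text) pairs. Since only the timestamp of each unique frame picture is available, the end time of each subtitle is an assumption: it is taken to be the start of the next subtitle, and the last subtitle gets a hypothetical default duration `last_duration`.

```python
from typing import List, Sequence, Tuple


def srt_time(seconds: float) -> str:
    # Format seconds as the srt time code HH:MM:SS,mmm.
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(entries: Sequence[Tuple[float, str]], last_duration: float = 2.0) -> str:
    # entries: (start time in seconds, recognized text), sorted by start time.
    # Assumption: each subtitle lasts until the next one starts.
    blocks: List[str] = []
    for n, (start, text) in enumerate(entries, start=1):
        end = entries[n][0] if n < len(entries) else start + last_duration
        blocks.append(f"{n}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n")
    return "\n".join(blocks)
```

For example, `to_srt([(1.0, "Hello"), (3.5, "World")])` yields two blocks, the first running from `00:00:01,000` to `00:00:03,500`.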
Step 205: import the subtitle information into the media file and synchronously display the text in the subtitle information.
In this embodiment, the subtitle information is stored in the folder where the media file is located; when the media file is played, the subtitle information is imported automatically and displayed synchronously.
In addition, to optimize the display of the subtitles, longer sentences in the subtitle information can be split across several lines.
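Splitting a long sentence across lines can be sketched with Python's standard `textwrap` module. The width `max_chars` is a hypothetical display limit; text without whitespace (for example Chinese) is cut at the width because `textwrap.wrap` breaks over-long words by default.

```python
import textwrap


def split_long_line(text: str, max_chars: int = 20) -> str:
    # Wrap at whitespace where possible; fall back to the original text
    # when there is nothing to wrap (for example an empty string).
    return "\n".join(textwrap.wrap(text, width=max_chars)) or text
```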
Step 206, by caption information send to remote server, make remote server caption information is carried out examine calibration And preserve, when again needing to identify media file caption, the caption information after remote server calls calibration.
The embodiment of the present invention carries out the screening of multithreading and obtains comprising the frame of caption content and draw the frame picture of media file Face, obtains, by duplicate removal, unique frame picture that each same caption content is corresponding, only the most corresponding only to multiple caption content One frame picture carries out multithreading identification operation, and carries out caption information examining calibration.Compared to the scheme of prior art, this The bright frame picture to media file carries out the screening of multithreading, decreases and obtains the frame picture required time comprising caption content; Pass through duplicate removal, it is no longer necessary to repeatedly being identified by several corresponding for same caption content frame pictures, same caption content only needs A corresponding width frame picture is identified, obtains Word message, improve the efficiency of subtitle recognition;To in multiple captions Hold unique frame picture corresponding respectively and carry out the identification of multithreading, further increase the efficiency of subtitle recognition;To caption information Carry out examining calibration, improve speed and the accuracy of the media file caption again obtained.
Based on the same inventive concept, an embodiment of the present invention further provides a device for identifying captions of a media file. Because the principle by which the device solves the problem is similar to that of the method for identifying captions of a media file, the implementation of the device may refer to the implementation of the method, and repeated details are not described again.
As shown in Figure 4, an embodiment of the present invention provides a device for identifying captions of a media file. The device may include:
Screening module 301, configured to filter out, from the frame pictures of the media file, the pending frame pictures that contain caption content;
Deduplication module 302, configured to deduplicate the pending frame pictures according to caption content, obtaining the unique frame picture corresponding to each caption content;
Identification module 303, configured to recognize the unique frame picture, obtaining the text information corresponding to the unique frame picture;
Caption generation module 304, configured to process the text information to generate caption information.
In this embodiment of the present invention, the deduplication module obtains the unique frame picture corresponding to each caption content, and the identification module recognizes only that unique frame picture. Compared with prior-art schemes, the caption identification device of the present invention no longer needs to repeatedly recognize the several frame pictures that correspond to the same caption content: for each caption content, only one corresponding frame picture is recognized to obtain the text information, which improves the efficiency of caption identification.
As shown in Figure 5, an embodiment of the present invention provides another device for identifying captions of a media file. The device may include:
Screening module 401, configured to filter out, from the frame pictures of the media file, the pending frame pictures that contain caption content;
In this embodiment, as shown in Figure 6, the screening module 401 includes a first acquiring unit and multiple processing modules, wherein:
The first acquiring unit is configured to obtain a frame picture of the media file every fixed number of frames;
Each processing module includes a converting unit, a statistic unit, a computing unit, and a screening unit;
The converting unit is configured to convert the frame picture into a gray-level image;
The statistic unit is configured to count the gray value of each pixel in the gray-level image, obtaining the gray-level histogram of the frame picture;
The computing unit is configured to choose a first threshold and a second threshold that bound the gray-value range, and to calculate the local information entropy of the gray-level histogram;
The screening unit is configured to filter out the frame pictures whose local information entropy is greater than a third threshold as pending frame pictures.
This embodiment also provides another screening module, which may include a first acquiring unit 4011, a converting unit 4012, a statistic unit 4013, a computing unit 4014, and a screening unit 4015, wherein:
First acquiring unit 4011, configured to obtain a frame picture of the media file every fixed number of frames;
Converting unit 4012, configured to convert the frame picture into a gray-level image;
Statistic unit 4013, configured to count the gray value of each pixel in the gray-level image, obtaining the gray-level histogram of the frame picture;
Computing unit 4014, configured to choose a first threshold and a second threshold that bound the gray-value range, and to calculate the local information entropy of the gray-level histogram;
Screening unit 4015, configured to filter out the frame pictures whose local information entropy is greater than a third threshold.
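The screening units described above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the source does not define the local information entropy precisely, so the sketch takes it to be the Shannon entropy of the histogram restricted to gray values between the first and second thresholds. The function names, the luma weights used for gray conversion, and the in-memory frame representation (rows of RGB tuples) are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]          # (R, G, B), each 0..255
Frame = Sequence[Sequence[Pixel]]     # rows of pixels


def to_gray(frame: Frame) -> List[List[int]]:
    """Convert an RGB frame to gray levels (assumed BT.601 luma weights)."""
    return [[min(255, round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
            for row in frame]


def gray_histogram(gray: Sequence[Sequence[int]]) -> List[int]:
    """Count how many pixels take each of the 256 gray values."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    return hist


def local_entropy(hist: Sequence[int], low: int, high: int) -> float:
    """Shannon entropy (bits) of the histogram restricted to [low, high]."""
    window = hist[low:high + 1]
    total = sum(window)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in window if c > 0)


def select_pending(frames: Sequence[Frame], step: int,
                   low: int, high: int, min_entropy: float) -> List[int]:
    """Sample every `step`-th frame and keep indices whose entropy exceeds the third threshold."""
    pending = []
    for index in range(0, len(frames), step):
        hist = gray_histogram(to_gray(frames[index]))
        if local_entropy(hist, low, high) > min_entropy:
            pending.append(index)
    return pending
```

Because each sampled frame is scored independently, the loop body in `select_pending` is the part that could be handed to several worker threads, as the multi-module variant of the screening module suggests.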
Deduplication module 402, configured to compare the caption content of the frame pictures that contain caption content, so that each caption content corresponds to a unique frame picture;
In this embodiment, as shown in Figure 7, the deduplication module 402 includes a second acquiring unit 4021, a judging unit 4022, an extraction unit 4023, a first frame picture determination unit 4024, and a second frame picture determination unit 4025, wherein:
Second acquiring unit 4021, configured to obtain the first frame picture among the pending frame pictures as the current frame picture, and the second frame picture as the contrast frame picture;
Judging unit 4022, configured to judge whether the caption content changes between the current frame picture and the contrast frame picture;
Extraction unit 4023, configured to extract the current frame picture as a unique frame picture when the judging unit 4022 judges that the caption content changes between the current frame picture and the contrast frame picture;
First frame picture determination unit 4024, configured to, when the judging unit 4022 judges that the caption content changes between the current frame picture and the contrast frame picture, take the contrast frame picture as the new current frame picture and obtain the next frame picture after the contrast frame picture as the new contrast frame picture;
Second frame picture determination unit 4025, configured to, when the judging unit 4022 judges that the caption content does not change between the current frame picture and the contrast frame picture, take either of the current frame picture and the contrast frame picture as the current frame picture, and obtain the next frame picture after the contrast frame picture as the new contrast frame picture.
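The interplay of the units in the deduplication module can be sketched as a single pass over the pending frame pictures. This is a minimal sketch: how a caption change is detected is left to a caller-supplied predicate `changed` (hypothetical), and emitting the final current frame picture once the pending frame pictures run out is an assumption, since the source does not describe how the loop terminates.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def deduplicate(pending: Sequence[Frame],
                changed: Callable[[Frame, Frame], bool]) -> List[Frame]:
    """Keep one frame picture per run of frame pictures with the same caption content."""
    if not pending:
        return []
    unique: List[Frame] = []
    current = pending[0]                  # second acquiring unit: first frame is current
    for contrast in pending[1:]:          # each later frame serves as the contrast frame
        if changed(current, contrast):    # judging unit
            unique.append(current)        # extraction unit
            current = contrast            # first frame picture determination unit
        # Unchanged: either frame may serve as current; this sketch keeps `current`.
    unique.append(current)                # assumption: the last caption is also emitted
    return unique
```

For example, if each frame picture is represented only by its caption string, `deduplicate(["a", "a", "b", "b", "c"], lambda x, y: x != y)` keeps one frame picture per caption.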
Identification module 403, configured to recognize the caption content in the unique frame picture as caption text;
In this embodiment, as shown in Figure 8, the identification module 403 may include multiple recognition units, specifically configured to perform multithreaded optical character recognition on the unique frame pictures corresponding to the respective caption contents, obtaining the text information corresponding to each unique frame picture.
This embodiment also provides another identification module 403, configured to recognize the caption content as caption text, or to send the unique frame picture to a remote server and receive the text information that the remote server recognizes and returns.
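Multithreaded recognition of the unique frame pictures can be sketched with a thread pool from the Python standard library. The recognizer itself is passed in as a callable, `recognize`, which is a hypothetical stand-in for either a local optical character recognition engine or a request to the remote server; neither is specified by the source, and the worker count is an arbitrary choice.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def recognize_all(unique_frames: Sequence[Frame],
                  recognize: Callable[[Frame], str],
                  max_workers: int = 4) -> List[str]:
    """Run `recognize` on every unique frame picture using a pool of worker threads.

    The returned texts are in the same order as the input frame pictures,
    because Executor.map yields results in input order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(recognize, unique_frames))
```

Keeping the result order aligned with the frame order matters for the next stage, which pairs each recognized text with the timestamp of its frame picture.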
Caption generation module 404, configured to process the caption text to obtain caption information.
In this embodiment, the caption generation module 404 includes a third acquiring unit and a caption generating unit, wherein:
Third acquiring unit, configured to obtain the timestamp information of the unique frame picture;
Caption generating unit, configured to generate the caption information from the text according to the timestamp information.
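Combining the recognized text with the timestamps can be sketched as below. The source does not name a caption file format, so this sketch assumes SubRip (.srt) style output; it also assumes that each caption lasts until the next unique frame picture appears and that the last caption is shown for a fixed default duration. The function names and the `last_duration` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS,mmm (SubRip style)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def build_srt(entries: Sequence[Tuple[float, str]], last_duration: float = 2.0) -> str:
    """Build caption text from (start time in seconds, recognized text) pairs.

    Assumption: a caption ends when the next one starts; the final caption
    is displayed for `last_duration` seconds.
    """
    blocks: List[str] = []
    for number, (start, text) in enumerate(entries, start=1):
        if number < len(entries):
            end = entries[number][0]      # start time of the following caption
        else:
            end = start + last_duration
        blocks.append(
            f"{number}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)
```

Writing the returned string to a file next to the media file would match the storage location described for step 205.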
Caption display module 405, configured to import the caption information into the media file and synchronously display the text in the caption information;
Review module 406, configured to send the caption information to a remote server, so that when the captions of the media file need to be identified again, the calibrated caption information is retrieved from the remote server.
In this embodiment of the present invention, the screening module screens the frame pictures of the media file with multiple threads to obtain the frame pictures that contain caption content; the deduplication module obtains the unique frame picture corresponding to each caption content; the identification module performs multithreaded recognition only on the unique frame pictures corresponding to the respective caption contents; and the review module has the caption information reviewed and calibrated. Compared with prior-art schemes, the caption identification device of the present invention screens the frame pictures of the media file with multiple threads, which reduces the time needed to obtain the frame pictures that contain caption content. Through deduplication, the several frame pictures that correspond to the same caption content no longer need to be recognized repeatedly: for each caption content, only one corresponding frame picture is recognized to obtain the text information, which improves the efficiency of caption identification. Recognizing the unique frame pictures of the respective caption contents with multiple threads further improves the efficiency of caption identification. Reviewing and calibrating the caption information improves the speed and accuracy with which the captions of the media file are obtained again later.
As shown in Figure 9, an embodiment of the present invention further provides an electronic device, including: a housing 501, a processor 502, a memory 503, a display screen (not shown), a circuit board 504, and a power circuit 505. The circuit board 504 is placed inside the space enclosed by the housing 501; the processor 502 and the memory 503 are arranged on the circuit board 504; the display screen is embedded in the outside of the housing 501 and connected to the circuit board 504; the power circuit 505 supplies power to each circuit or component of the electronic device; the memory 503 stores executable program code and data; and the processor 502 runs the program corresponding to the executable program code by reading the executable program code stored in the memory 503, so as to perform the following steps:
Filtering out, from the frame pictures of the media file, the pending frame pictures that contain caption content;
Deduplicating the pending frame pictures according to caption content, obtaining the unique frame picture corresponding to each caption content;
Recognizing the unique frame picture, obtaining the text information corresponding to the unique frame picture;
Processing the text information to generate caption information.
The electronic device in this embodiment of the present invention filters out the frame pictures that contain caption content, obtains the unique frame picture corresponding to each caption content, and recognizes only that unique frame picture. Compared with prior-art schemes, the electronic device of the present invention no longer needs to repeatedly recognize the several frame pictures that correspond to the same caption content: for each caption content, only one corresponding frame picture is recognized to obtain the text information, which improves the efficiency of caption identification.
For convenience of description, the parts of the above device are divided by function into modules or units and described separately. Of course, when implementing the present invention, the functions of the modules or units may be realized in one or more pieces of software or hardware.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce a means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction means, the instruction means implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
Although preferred embodiments of the present invention have been described, those skilled in the art can make further changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the present invention.

Claims (10)

1. A method for identifying captions of a media file, characterized by comprising:
Filtering out, from the frame pictures of the media file, the pending frame pictures that contain caption content;
Deduplicating the pending frame pictures according to caption content, obtaining the unique frame picture corresponding to each caption content;
Recognizing the unique frame picture, obtaining the text information corresponding to the unique frame picture;
Processing the text information to generate caption information.
2. the method for claim 1, it is characterised in that described in filter out described media file frame picture have captions in The pending frame picture held, particularly as follows:
The frame picture of described media file is obtained every fixing frame number;
Described frame picture is converted into gray level image;
Add up the gray value of each pixel in described gray level image, obtain the grey level histogram of described frame picture;
Choose first threshold and the Second Threshold of intensity value ranges, calculate the local message entropy of described grey level histogram;
Filter out the local message entropy frame picture more than the 3rd threshold value as pending frame picture.
3. The method of claim 1, characterized in that filtering out, from the frame pictures of the media file, the pending frame pictures that contain caption content specifically comprises:
Obtaining a frame picture of the media file every fixed number of frames and, when several frame pictures have been obtained, processing the several frame pictures with multiple threads, the processing steps of each thread comprising:
Converting the frame picture into a gray-level image;
Counting the gray value of each pixel in the gray-level image, obtaining the gray-level histogram of the frame picture;
Choosing a first threshold and a second threshold that bound the gray-value range, and calculating the local information entropy of the gray-level histogram;
Filtering out the frame pictures whose local information entropy is greater than a third threshold as pending frame pictures.
4. the method for claim 1, it is characterised in that described according to caption content, described pending frame picture is carried out Duplicate removal, obtains unique frame picture that same caption content is corresponding, particularly as follows:
Step 401 obtains the first frame picture in described pending frame picture as present frame picture, and the second frame picture is as right Ratio frame picture;
Step 402 judges whether the caption content of described present frame picture and described contrast frame picture changes, if judging Change execution step 403, if judging the execution step 404 that do not changes;
Step 403 extracts described present frame frame picture as unique frame picture, and is drawn as present frame by described contrast frame picture Face, obtains the next frame frame picture as a comparison of described contrast frame picture, performs step 402;
Any frame picture in described present frame picture and described contrast frame picture as present frame picture, is obtained by step 404 The next frame picture frame picture as a comparison of described contrast frame picture, performs step 402.
5. the method for claim 1, it is characterised in that draw if getting unique frame corresponding to multiple caption content Face, the most described is identified described unique frame picture, obtains the Word message that described unique frame picture is corresponding, particularly as follows:
Unique frame picture that the multiple caption content got are corresponding respectively is carried out multithreading optical character recognition, obtains every width The Word message that unique frame picture is corresponding.
6. the method for claim 1, it is characterised in that described be identified described unique frame picture, obtains described The Word message that unique frame picture is corresponding, particularly as follows:
Described unique frame picture is carried out optical character recognition, obtains the Word message that described unique frame picture is corresponding;Or
Described unique frame picture is sent to remote server, receives the Word message that described remote server identification returns.
7. the method for claim 1, it is characterised in that described described Word message is carried out process obtain captions letter Breath, particularly as follows:
Obtain the timestamp information of described unique frame picture;
Described word is generated caption information according to described timestamp information.
8. The method of any one of claims 1-7, characterized in that, after processing the text information to obtain caption information, the method further comprises:
Importing the caption information into the media file, and synchronously displaying the text in the caption information.
9. The method of claim 8, characterized in that, after processing the text information to obtain caption information, the method further comprises:
Sending the caption information to a remote server, so that the remote server reviews, calibrates, and saves the caption information; when the captions of the media file need to be identified again, retrieving the calibrated caption information from the remote server.
10. A device for identifying captions of a media file, characterized by comprising: a screening module, a deduplication module, an identification module, and a caption generation module;
The screening module is configured to filter out, from the frame pictures of the media file, the pending frame pictures that contain caption content;
The deduplication module is configured to deduplicate the pending frame pictures according to caption content, obtaining the unique frame picture corresponding to each caption content;
The identification module is configured to recognize the unique frame picture, obtaining the text information corresponding to the unique frame picture;
The caption generation module is configured to process the text information to generate caption information.
CN201610681287.8A 2016-08-17 2016-08-17 Method and device for identifying subtitles of media file and electronic equipment Pending CN106295592A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610681287.8A CN106295592A (en) 2016-08-17 2016-08-17 Method and device for identifying subtitles of media file and electronic equipment


Publications (1)

Publication Number Publication Date
CN106295592A true CN106295592A (en) 2017-01-04

Family

ID=57679560

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610681287.8A Pending CN106295592A (en) 2016-08-17 2016-08-17 Method and device for identifying subtitles of media file and electronic equipment

Country Status (1)

Country Link
CN (1) CN106295592A (en)



Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170104
