CN106295592A - Method and device for identifying subtitles of media file and electronic equipment - Google Patents
- Publication number
- CN106295592A (CN201610681287.8A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
- G06V20/635—Overlay text, e.g. embedded captions in a TV program
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Television Systems (AREA)
Abstract
Embodiments of the invention provide a method and a device for recognizing the subtitles of a media file, and an electronic device. The method comprises: screening the frame pictures of a media file for frame pictures to be processed that contain subtitle content; deduplicating the frame pictures to be processed according to their subtitle content to obtain a unique frame picture for each same subtitle content; recognizing the unique frame picture to obtain the text information corresponding to it; and processing the text information to generate subtitle information. Compared with prior-art schemes, the method and device do not need to recognize the several frame pictures that correspond to the same subtitle content repeatedly; for each subtitle content only one corresponding frame picture is recognized to obtain the text information, which improves the efficiency of subtitle recognition.
Description
Technical field
The present invention relates to the field of electronic technology, and in particular to a method and a device for recognizing the captions of a media file, and to an electronic device.
Background art
With the development of networks, and of mobile networks in particular, network bandwidth has increased greatly and video transmission has become very convenient. According to statistics from the well-known video website YouTube, the total duration of video played on the site exceeds 4 billion hours per month. Given such a huge volume of video data and user demand, extending what can be done with the text of video captions is particularly important. However, the captions of many videos are not stored separately from the video but are embedded in each video frame, so the caption content in the video frames must be recognized as text information before such extensions are possible.
Most existing techniques for recognizing video captions obtain the frame pictures of the video, recognize each frame picture directly to obtain text information, and then combine the recognized text information with the timestamp information of the frame pictures to obtain caption information.
Because the prior art processes the frame pictures of the video directly, the efficiency of subtitle recognition is low.
Summary of the invention
The present invention proposes a method and a device for recognizing the captions of a media file, and an electronic device. Multiple frame pictures are compared, and when different frame pictures correspond to the same caption content, one of them is taken as the unique frame picture of that caption content. Only the text information of this unique frame picture is recognized and used to generate caption information, which avoids recognizing several frame pictures for the same caption content and improves the efficiency of subtitle recognition.
In one aspect, an embodiment of the present invention provides a method for recognizing the captions of a media file, the method comprising:
screening the frame pictures of the media file for pending frame pictures that contain caption content;
deduplicating the pending frame pictures according to caption content to obtain the unique frame picture corresponding to each same caption content;
recognizing the unique frame picture to obtain the text information corresponding to the unique frame picture;
processing the text information to generate caption information.
Wherein screening the frame pictures of the media file for pending frame pictures that contain caption content specifically comprises:
obtaining a frame picture of the media file every fixed number of frames;
converting the frame picture into a grey-level image;
counting the grey value of each pixel in the grey-level image to obtain the grey-level histogram of the frame picture;
selecting a first threshold and a second threshold that bound a range of grey values, and calculating the local information entropy of the grey-level histogram;
selecting the frame pictures whose local information entropy is greater than a third threshold as pending frame pictures.
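The first three screening steps can be sketched as follows. This is a minimal illustration, not the patent's implementation: frames are assumed to be nested lists of (r, g, b) tuples, the function names are hypothetical, and the BT.601 luma weights are an assumption because the text does not say how the grey-level conversion is done.

```python
def sample_frames(frames, step):
    """Keep one frame picture every `step` frames (the fixed number of frames)."""
    return frames[::step]


def to_gray(frame):
    """Convert an RGB frame (rows of (r, g, b) tuples) to 8-bit grey values.

    Assumption: BT.601 luma weights; the source names no conversion formula.
    """
    return [
        [min(255, round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
        for row in frame
    ]


def gray_histogram(gray):
    """Count how many pixels take each grey value 0..255."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    return hist
```

The histogram returned by `gray_histogram` is the input to the local information entropy calculation described in the next step.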
Alternatively, screening the frame pictures of the media file for pending frame pictures that contain caption content specifically comprises:
obtaining a frame picture of the media file every fixed number of frames and, once several frame pictures have been obtained, processing them in multiple threads, where each thread performs the following steps:
converting the frame picture into a grey-level image;
counting the grey value of each pixel in the grey-level image to obtain the grey-level histogram of the frame picture;
selecting a first threshold and a second threshold that bound a range of grey values, and calculating the local information entropy of the grey-level histogram;
selecting the frame pictures whose local information entropy is greater than a third threshold as pending frame pictures.
Wherein deduplicating the pending frame pictures according to caption content to obtain the unique frame picture corresponding to each same caption content specifically comprises:
Step 1: take the first frame picture among the pending frame pictures as the current frame picture and the second frame picture as the contrast frame picture;
Step 2: judge whether the caption content changes between the current frame picture and the contrast frame picture; if it changes, perform step 3; if it does not change, perform step 4;
Step 3: extract the current frame picture as a unique frame picture, take the contrast frame picture as the new current frame picture, obtain the frame picture following the contrast frame picture as the new contrast frame picture, and perform step 2;
Step 4: take either of the current frame picture and the contrast frame picture as the current frame picture, obtain the frame picture following the contrast frame picture as the new contrast frame picture, and perform step 2.
Wherein, if unique frame pictures corresponding to multiple caption contents are obtained, recognizing the unique frame picture to obtain the corresponding text information specifically comprises:
performing multi-threaded optical character recognition on the unique frame pictures corresponding to the respective caption contents to obtain the text information corresponding to each unique frame picture.
Wherein recognizing the unique frame picture to obtain the text information corresponding to the unique frame picture specifically comprises:
performing optical character recognition on the unique frame picture to obtain the text information corresponding to the unique frame picture; or
sending the unique frame picture to a remote server and receiving the text information that the remote server recognizes and returns.
Wherein processing the text information to obtain caption information specifically comprises:
obtaining the timestamp information of the unique frame picture;
generating caption information from the text according to the timestamp information.
Preferably, after the text information is processed to obtain caption information, the method further comprises:
importing the caption information into the media file and displaying the text in the caption information synchronously.
Preferably, after the text information is processed to obtain caption information, the method further comprises:
sending the caption information to a remote server so that the remote server reviews, calibrates and saves the caption information; when the captions of the media file need to be recognized again, the calibrated caption information is called from the remote server.
In another aspect, an embodiment of the present invention provides a device for recognizing the captions of a media file, the device comprising: a screening module, a deduplication module, an identification module and a caption generation module;
the screening module is configured to screen the frame pictures of the media file for pending frame pictures that contain caption content;
the deduplication module is configured to deduplicate the pending frame pictures according to caption content to obtain the unique frame picture corresponding to each same caption content;
the identification module is configured to recognize the unique frame picture to obtain the text information corresponding to the unique frame picture;
the caption generation module is configured to process the text information to generate caption information.
Wherein the screening module includes a first acquiring unit, a converting unit, a statistic unit, a computing unit and a screening unit, wherein:
the first acquiring unit is configured to obtain a frame picture of the media file every fixed number of frames;
the converting unit is configured to convert the frame picture into a grey-level image;
the statistic unit is configured to count the grey value of each pixel in the grey-level image to obtain the grey-level histogram of the frame picture;
the computing unit is configured to select a first threshold and a second threshold that bound a range of grey values, and to calculate the local information entropy of the grey-level histogram;
the screening unit is configured to select the frame pictures whose local information entropy is greater than a third threshold as pending frame pictures.
Alternatively, the screening module includes a first acquiring unit and multiple processing modules, wherein:
the first acquiring unit is configured to obtain a frame picture of the media file every fixed number of frames;
each processing module includes a converting unit, a statistic unit, a computing unit and a screening unit;
the converting unit is configured to convert the frame picture into a grey-level image;
the statistic unit is configured to count the grey value of each pixel in the grey-level image to obtain the grey-level histogram of the frame picture;
the computing unit is configured to select a first threshold and a second threshold that bound a range of grey values, and to calculate the local information entropy of the grey-level histogram;
the screening unit is configured to select the frame pictures whose local information entropy is greater than a third threshold as pending frame pictures.
Wherein the deduplication module includes a second acquiring unit, a judging unit, an extraction unit, a first frame picture determination unit and a second frame picture determination unit, wherein:
the second acquiring unit is configured to take the first frame picture among the pending frame pictures as the current frame picture and the second frame picture as the contrast frame picture;
the judging unit is configured to judge whether the caption content changes between the current frame picture and the contrast frame picture;
the extraction unit is configured to extract the current frame picture as a unique frame picture when the judging unit judges that the caption content of the current frame picture and the contrast frame picture changes;
the first frame picture determination unit is configured, when the judging unit judges that the caption content of the current frame picture and the contrast frame picture changes, to take the contrast frame picture as the current frame picture and to obtain the frame picture following the contrast frame picture as the new contrast frame picture;
the second frame picture determination unit is configured, when the judging unit judges that the caption content of the current frame picture and the contrast frame picture does not change, to take either of the current frame picture and the contrast frame picture as the current frame picture and to obtain the frame picture following the contrast frame picture as the new contrast frame picture.
Wherein the identification module includes multiple recognition units and is specifically configured to perform multi-threaded optical character recognition on the unique frame pictures corresponding to the respective caption contents to obtain the text information corresponding to each unique frame picture.
Wherein the identification module is specifically configured to recognize the caption content as caption text; or
is specifically configured to send the unique frame picture to a remote server and to receive the text information that the remote server recognizes and returns.
Wherein the caption generation module includes a third acquiring unit and a caption generating unit, wherein:
the third acquiring unit is configured to obtain the timestamp information of the unique frame picture;
the caption generating unit is configured to generate caption information from the text according to the timestamp information.
Preferably, the device further includes a caption display module configured to import the caption information into the media file and to display the text in the caption information synchronously.
Preferably, the device further includes a review module configured to send the caption information to a remote server; when the captions of the media file need to be recognized again, the caption information calibrated by the remote server is called.
In another aspect, an embodiment of the present invention provides a terminal that includes the device for recognizing the captions of a media file described above.
In another aspect, an embodiment of the present invention provides an electronic device comprising a housing, a processor, a memory, a display screen, a circuit board and a power circuit, wherein the circuit board is placed in the space enclosed by the housing, the processor and the memory are arranged on the circuit board, and the display screen is embedded in the outside of the housing and connected to the circuit board; the power circuit supplies power to each circuit or component of the electronic device; the memory stores executable program code and data; the processor runs the program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform the following steps:
screening the frame pictures of the media file for pending frame pictures that contain caption content;
deduplicating the pending frame pictures according to caption content to obtain the unique frame picture corresponding to each same caption content;
recognizing the unique frame picture to obtain the text information corresponding to the unique frame picture;
processing the text information to generate caption information.
The above scheme of the present invention has at least the following beneficial effect:
For each same caption content, the present invention performs recognition only on the unique frame picture corresponding to that caption content. Compared with prior-art schemes, the present invention does not need to recognize the several frame pictures corresponding to the same caption content repeatedly; for each caption content only one corresponding frame picture is recognized to obtain the text information, which improves the efficiency of subtitle recognition.
Brief description of the drawings
Specific embodiments of the present invention are described below with reference to the accompanying drawings, in which:
Fig. 1 is a schematic diagram of the method for recognizing the captions of a media file in embodiment one of the present invention;
Fig. 2 is a schematic diagram of the method for recognizing the captions of a media file in embodiment two of the present invention;
Fig. 3 is a schematic diagram of the method, in embodiment two of the present invention, of deduplicating the pending frame pictures according to caption content to obtain the unique frame picture corresponding to each same caption content;
Fig. 4 is a structural diagram of the device for recognizing the captions of a media file in embodiment three of the present invention;
Fig. 5 is a structural diagram of the device for recognizing the captions of a media file in embodiment four of the present invention;
Fig. 6 is a structural diagram of the screening module in embodiment four of the present invention;
Fig. 7 is a structural diagram of the deduplication module in embodiment four of the present invention;
Fig. 8 is a structural diagram of the identification module in embodiment four of the present invention;
Fig. 9 is a structural diagram of the electronic device in embodiment five of the present invention.
Detailed description of the invention
To make the technical scheme and advantages of the present invention clearer, exemplary embodiments of the present invention are described in more detail below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not an exhaustive list of all embodiments. Where no conflict arises, the embodiments in this description and the features in the embodiments may be combined with each other.
Embodiments of the present invention provide a method and a device for recognizing the captions of a media file, and an electronic device. For each same caption content, recognition is performed only on the unique frame picture corresponding to that caption content. Compared with prior-art schemes, the present invention no longer needs to recognize the several frame pictures corresponding to the same caption content repeatedly; for each caption content only one corresponding frame picture is recognized to obtain the text information, which improves the efficiency of subtitle recognition.
In embodiments of the present invention, the media file may be a video file or a video stream. The sources of the video file or video stream include, but are not limited to: (1) video files stored on a storage device; (2) live video streams, such as live television video streams and network live video streams.
Embodiment one
Fig. 1 is a schematic flowchart of a first embodiment of the method for recognizing the captions of a media file provided by the present invention. The method provided by embodiment one includes:
Step 101: screen the frame pictures of the media file for pending frame pictures that contain caption content;
Step 102: deduplicate the pending frame pictures according to caption content to obtain the unique frame picture corresponding to each same caption content;
Step 103: recognize the unique frame picture to obtain the text information corresponding to the unique frame picture;
Step 104: process the text information to generate caption information.
For each same caption content, this embodiment performs recognition only on the unique frame picture corresponding to that caption content. Compared with prior-art schemes, the present invention no longer needs to recognize the several frame pictures corresponding to the same caption content repeatedly; for each caption content only one corresponding frame picture is recognized to obtain the text information, which improves the efficiency of subtitle recognition.
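Steps 101 to 104 can be read as a four-stage pipeline. The sketch below is one hedged way to wire the stages together; `has_caption`, `same_caption`, `recognize` and `build` are hypothetical callables standing in for the screening test, the caption comparison, the character recognition and the caption generation, none of which are defined by the source at this point.

```python
def recognize_subtitles(frames, has_caption, same_caption, recognize, build):
    """Run the four steps of embodiment one over a list of frame pictures."""
    # Step 101: keep only frame pictures that contain caption content.
    pending = [frame for frame in frames if has_caption(frame)]

    # Step 102: keep one frame picture per run of identical caption content.
    unique = []
    for frame in pending:
        if not unique or not same_caption(unique[-1], frame):
            unique.append(frame)

    # Step 103: recognize each unique frame picture exactly once.
    texts = [recognize(frame) for frame in unique]

    # Step 104: turn the recognized text into caption information.
    return build(unique, texts)
```

The point of the structure is that `recognize`, usually the most expensive call, runs once per distinct caption rather than once per frame picture.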
Embodiment two
Fig. 2 is a schematic flowchart of a second embodiment of the method for recognizing the captions of a media file provided by the present invention. The method provided by embodiment two includes:
Step 201: screen the frame pictures of the media file for pending frame pictures that contain caption content;
This embodiment provides two methods for screening the frame pictures of the media file for pending frame pictures that contain caption content. The first method processes several frame pictures in multiple threads; the second method processes the frame pictures without multiple threads. In both cases the frame pictures that contain caption content are later compared by caption content so that each same caption content corresponds to one unique frame picture. The details are as follows:
In the first method of this embodiment for screening the frame pictures of the media file for pending frame pictures that contain caption content, a frame picture of the media file is obtained every fixed number of frames and, once several frame pictures have been obtained, they are processed in multiple threads. Each thread performs the following steps:
converting the frame picture into a grey-level image;
counting the grey value of each pixel in the grey-level image to obtain the grey-level histogram of the frame picture;
selecting a first threshold and a second threshold that bound a range of grey values, and calculating the local information entropy of the grey-level histogram; selecting the frame pictures whose local information entropy is greater than a third threshold as pending frame pictures.
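A minimal sketch of the multi-threaded screening, assuming a hypothetical per-frame test `has_caption` that performs the grey-level conversion, histogram and entropy steps and returns True for frame pictures to keep. The standard-library thread pool is an assumption; the source does not name a threading mechanism.

```python
from concurrent.futures import ThreadPoolExecutor


def screen_frames(frames, has_caption, workers=4):
    """Test every frame picture in a pool of threads and keep the pending ones.

    `pool.map` returns results in input order, so the pending frame pictures
    stay in playback order, which the later deduplication step relies on.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        flags = list(pool.map(has_caption, frames))
    return [frame for frame, keep in zip(frames, flags) if keep]
```

In CPython, threads only shorten the wall-clock time when the per-frame work releases the interpreter lock, as native image libraries typically do; for pure-Python work a process pool would be the comparable choice.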
In the second method of this embodiment for screening the frame pictures of the media file for pending frame pictures that contain caption content, a frame picture of the media file is obtained every fixed number of frames; the frame picture is converted into a grey-level image; the grey value of each pixel in the grey-level image is counted to obtain the grey-level histogram of the frame picture; a first threshold and a second threshold that bound a range of grey values are selected, the second threshold being greater than the first threshold, and the local information entropy of the grey-level histogram is calculated; the frame pictures whose local information entropy is greater than a third threshold are selected, and these are the pending frame pictures that contain caption content.
For example, from the grey value i (i ∈ [0, 255]) of each pixel in the grey-level image, the grey-level histogram H[i] of the frame picture is obtained, and the range of grey values is chosen as first threshold θ_ep1 ≤ i ≤ second threshold θ_ep2.
The grey-level histogram is normalized, and the local information entropy ep1 of the histogram over this range is then obtained.
If ep1 ≥ the third threshold EPL, the frame picture is considered to contain caption content.
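The entropy test can be sketched as below. The exact formulas are not available in this text, so the code assumes the usual Shannon form: p[i] = H[i] / (sum of H over the selected range) and ep1 = -sum(p[i] * log2(p[i])) for θ_ep1 ≤ i ≤ θ_ep2. Normalizing over the selected range and using a base-2 logarithm are both assumptions, and the function names are hypothetical.

```python
import math


def local_entropy(hist, lo, hi):
    """Shannon entropy of the histogram restricted to grey values lo..hi."""
    window = hist[lo:hi + 1]
    total = sum(window)
    if total == 0:
        return 0.0  # no pixels in the range, so no information
    entropy = 0.0
    for count in window:
        if count:
            p = count / total  # normalized histogram value
            entropy -= p * math.log2(p)
    return entropy


def has_caption(hist, lo, hi, threshold):
    """Apply the third-threshold test (ep1 >= EPL) from the example."""
    return local_entropy(hist, lo, hi) >= threshold
```

A range in which all pixels share one grey value gives an entropy of zero, while pixels spread evenly over many grey values in the range give a high entropy, which is the property the threshold test relies on.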
Step 202: deduplicate the pending frame pictures according to caption content to obtain the unique frame picture corresponding to each same caption content;
In this embodiment, the frame pictures that contain caption content are compared by caption content so that each same caption content corresponds to one unique frame picture. The method is shown in Fig. 3 and includes:
Step 2021: take the first frame picture among the pending frame pictures as the current frame picture and the second frame picture as the contrast frame picture;
Step 2022: judge whether the caption content changes between the current frame picture and the contrast frame picture; if it changes, perform step 2023; if it does not change, perform step 2024;
Step 2023: extract the current frame picture as a unique frame picture, take the contrast frame picture as the new current frame picture, obtain the frame picture following the contrast frame picture as the new contrast frame picture, and perform step 2022;
Step 2024: take either of the current frame picture and the contrast frame picture as the current frame picture, obtain the frame picture following the contrast frame picture as the new contrast frame picture, and perform step 2022.
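Steps 2021 to 2024 amount to a single pass over the pending frame pictures. In the sketch below, `changed` is a hypothetical predicate standing in for the caption comparison of step 2022. Emitting the last current frame picture when the input runs out is an assumption, since the steps as written do not say how the loop ends.

```python
def deduplicate(frames, changed):
    """Return one frame picture per run of unchanged caption content."""
    if not frames:
        return []
    unique = []
    current = frames[0]                 # step 2021: first frame is current
    for contrast in frames[1:]:         # each later frame is the contrast frame
        if changed(current, contrast):  # step 2022
            unique.append(current)      # step 2023: extract the current frame
            current = contrast
        # step 2024: caption unchanged; either frame may serve as current,
        # so this sketch simply keeps the existing one.
    unique.append(current)              # assumption: flush the final run
    return unique
```

Because each frame picture is compared only with the current one, the pass is linear in the number of pending frame pictures.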
Whether the caption content of the current frame picture and the contrast frame picture has changed can be judged from whether the stroke directions change. Edge detection is first performed on each frame picture, and a histogram of the gradient directions of the edge pixels is then accumulated. The OpenCV function compareHist is used to produce a value that expresses the similarity between the gradient direction histograms of the edge pixels of the current frame picture and of the contrast frame picture, and a threshold is determined; if the value is not less than the threshold, the caption content is considered unchanged.
For example, let the gradient direction histogram of the edge pixels of the current frame picture be H1 and that of the contrast frame picture be H2. The OpenCV function compareHist generates the comparison measure d(H1, H2), in which i denotes the pixel value, i ∈ [0, 255].
If d(H1, H2) ≥ the threshold D(H1, H2), the caption content is considered unchanged.
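The comparison step can be illustrated without OpenCV. The source does not say which comparison method is passed to compareHist; the sketch assumes the correlation measure, d = Σ(H1[i] - mean1)(H2[i] - mean2) / sqrt(Σ(H1[i] - mean1)² · Σ(H2[i] - mean2)²), because a larger value then means greater similarity, which matches the "not less than the threshold" test. The threshold value 0.9 is purely illustrative.

```python
import math


def correlation(h1, h2):
    """Correlation between two histograms of equal length (1.0 = same shape)."""
    n = len(h1)
    mean1 = sum(h1) / n
    mean2 = sum(h2) / n
    num = sum((a - mean1) * (b - mean2) for a, b in zip(h1, h2))
    den = math.sqrt(
        sum((a - mean1) ** 2 for a in h1) * sum((b - mean2) ** 2 for b in h2)
    )
    if den == 0:
        # Assumption: a flat histogram only matches an identical histogram.
        return 1.0 if list(h1) == list(h2) else 0.0
    return num / den


def caption_unchanged(h1, h2, threshold=0.9):
    """Treat the caption as unchanged when the similarity reaches the threshold."""
    return correlation(h1, h2) >= threshold
```

With OpenCV available, the corresponding call would be `cv2.compareHist(h1, h2, cv2.HISTCMP_CORREL)` on float32 histograms.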
Step 203: recognize the unique frame picture to obtain the text information corresponding to the unique frame picture;
This embodiment provides three methods for recognizing the caption content in a unique frame picture as caption text:
In the first method, multi-threaded optical character recognition is performed on the unique frame pictures corresponding to the respective caption contents to obtain the text information corresponding to each unique frame picture.
In the second method, optical character recognition is performed on the unique frame picture to obtain the text information corresponding to the unique frame picture.
In the third method, the unique frame picture is sent to a remote server, and the text information that the remote server recognizes and returns is received.
Step 204: process the text information to generate caption information.
In this embodiment, processing the caption text to obtain caption information specifically means adding the caption text to a text file and then, according to the content of the text file and the timestamps, generating the caption information in the form of one time code followed by one caption, that is, writing the text into the caption information as pairs of a time code and a caption.
There are many kinds of captions. The caption formats in common use fall into two classes, graphic formats and text formats. Compared with graphic-format captions, text-format captions are small, simple in format, and easy to make and modify. Text-format captions include utf, idx, sub, srt, smi, rt, txt, ssa, aq, jss, js and ass, of which text captions in srt format are the most widely used; they are compatible with the common media players, and players such as MPC and QQ Player can load captions of this type automatically. In this embodiment the caption information therefore uses the srt format. This embodiment does not, of course, limit the format of the caption information, as long as the format is supported by the media player used.
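Writing the recognized text as srt cues can be sketched as follows. The srt layout itself (a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time code, the text, then a blank line) is standard; the input shape is an assumption: each cue is taken to be a `(start_ms, end_ms, text)` tuple, with the end time supplied by the caller, for example from the timestamp of the next unique frame picture.

```python
def srt_time(ms):
    """Format a millisecond offset as an srt time code, HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(cues):
    """Render (start_ms, end_ms, text) tuples as the body of an srt file."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(f"{index}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n")
    return "\n".join(blocks)  # a blank line separates consecutive cues
```

Saving the result next to the media file under the same base name with an `.srt` extension is the convention many players use for automatic loading.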
Step 205: import the caption information into the media file and display the text in the caption information synchronously.
In this embodiment, the caption information is stored in the folder where the media file is located; when the media file is played, the caption information can be imported automatically and displayed synchronously.
In addition, to optimize the display of the captions, longer sentences in the caption information can be displayed on several lines.
Step 206: send the caption information to a remote server so that the remote server reviews, calibrates and saves the caption information; when the captions of the media file need to be recognized again, the calibrated caption information is called from the remote server.
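Splitting a long caption over several lines can be done with the standard library. This is a sketch under assumptions: the width of 20 characters is illustrative, and the function name is hypothetical.

```python
import textwrap


def split_caption(text, width=20):
    """Break a long caption into lines of at most `width` characters."""
    return "\n".join(textwrap.wrap(text, width=width))
```

`textwrap.wrap` breaks at whitespace and, by default, cuts a run without spaces at the width limit. It counts characters rather than display columns, so full-width characters may need a smaller width.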
This embodiment screens the frame pictures of the media file in multiple threads to obtain the frame pictures that contain caption content, obtains the unique frame picture corresponding to each same caption content by deduplication, performs multi-threaded recognition only on the unique frame pictures corresponding to the respective caption contents, and has the caption information reviewed and calibrated. Compared with prior-art schemes, screening the frame pictures of the media file in multiple threads reduces the time needed to obtain the frame pictures that contain caption content. Deduplication removes the need to recognize the several frame pictures corresponding to the same caption content repeatedly; for each caption content only one corresponding frame picture is recognized to obtain the text information, which improves the efficiency of subtitle recognition. Recognizing the unique frame pictures of the respective caption contents in multiple threads improves the efficiency of subtitle recognition further. Reviewing and calibrating the caption information improves the speed and accuracy with which the captions of the media file are obtained again.
Based on the same inventive concept, an embodiment of the present invention also provides a device for recognizing the captions of a media file. Because the principle by which the device solves the problem is similar to that of the method for recognizing the captions of a media file, the implementation of the device may refer to the implementation of the method, and the common parts are not repeated.
As shown in Figure 4, providing the identification device of a kind of media file caption in the embodiment of the present invention, device can wrap
Include:
Screening module 301, for filtering out the pending frame picture having caption content in media file frame picture;
Deduplication module 302, for pending frame picture being carried out duplicate removal according to caption content, obtains same caption content pair
The unique frame picture answered;
Identification module 303, for being identified unique frame picture, obtains the Word message that unique frame picture is corresponding;
Captions generation module 304, generates caption information for carrying out Word message processing.
In this embodiment of the present invention, the deduplication module obtains the unique frame picture corresponding to the same subtitle content, and the recognition module recognizes only that unique frame picture. Compared with the prior-art scheme, the subtitle identification device of the present invention no longer needs to recognize the several frame pictures corresponding to the same subtitle content repeatedly; only one corresponding frame picture needs to be recognized for each subtitle content to obtain the text information, which improves the efficiency of subtitle recognition.
As shown in Fig. 5, an embodiment of the present invention provides another device for identifying subtitles of a media file. The device may include:
Screening module 401, configured to screen out, from the frame pictures of the media file, the frame pictures to be processed that have subtitle content;
In this embodiment, as shown in Fig. 6, the screening module 401 includes a first acquisition unit and a plurality of processing modules, wherein:
The first acquisition unit is configured to acquire a frame picture of the media file every fixed number of frames;
Each processing module includes a conversion unit, a statistics unit, a calculation unit, and a screening unit;
The conversion unit is configured to convert the frame picture into a grayscale image;
The statistics unit is configured to count the gray value of each pixel in the grayscale image to obtain the gray-level histogram of the frame picture;
The calculation unit is configured to select a first threshold and a second threshold of the gray value range and to calculate the local information entropy of the gray-level histogram;
The screening unit is configured to screen out the frame pictures whose local information entropy is greater than a third threshold as the frame pictures to be processed.
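As a non-limiting illustration, the interval-based frame acquisition and the parallel processing modules described above might be sketched in Python as follows. The names sample_frames, screen_parallel, and has_subtitle, as well as the worker count, are hypothetical and do not appear in the embodiment; has_subtitle stands in for whatever per-frame test each processing module applies.

```python
from concurrent.futures import ThreadPoolExecutor


def sample_frames(frames, step):
    """Take one frame picture every `step` frames (the fixed frame interval)."""
    return frames[::step]


def screen_parallel(frames, step, has_subtitle, workers=4):
    """Screen the sampled frame pictures in a thread pool.

    `has_subtitle` is a hypothetical callable returning True when a frame
    picture is judged to contain subtitle content.
    """
    sampled = sample_frames(frames, step)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in the same order as its inputs.
        flags = list(pool.map(has_subtitle, sampled))
    return [frame for frame, keep in zip(sampled, flags) if keep]
```

Because the results keep the input order, the frame pictures to be processed stay in playback order, which the later deduplication step relies on.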
This embodiment further provides another screening module, which may include a first acquisition unit 4011, a conversion unit 4012, a statistics unit 4013, a calculation unit 4014, and a screening unit 4015, wherein:
First acquisition unit 4011, configured to acquire a frame picture of the media file every fixed number of frames;
Conversion unit 4012, configured to convert the frame picture into a grayscale image;
Statistics unit 4013, configured to count the gray value of each pixel in the grayscale image to obtain the gray-level histogram of the frame picture;
Calculation unit 4014, configured to select a first threshold and a second threshold of the gray value range and to calculate the local information entropy of the gray-level histogram;
Screening unit 4015, configured to screen out the frame pictures whose local information entropy is greater than a third threshold.
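The conversion, statistics, calculation, and screening units above might be sketched as follows. This is only one plausible reading: it assumes that the "local information entropy" is the Shannon entropy of the histogram bins lying between the first and second thresholds, renormalized within that range, which this section does not spell out. The function names and the use of BT.601 luma weights for the grayscale conversion are likewise assumptions.

```python
import math


def to_gray(pixels):
    """Convert (r, g, b) pixels to 0-255 gray values (BT.601 luma weights)."""
    return [min(255, round(0.299 * r + 0.587 * g + 0.114 * b))
            for r, g, b in pixels]


def gray_histogram(gray):
    """Count how many pixels take each of the 256 gray values."""
    hist = [0] * 256
    for value in gray:
        hist[value] += 1
    return hist


def local_entropy(hist, low, high):
    """Shannon entropy in bits of the bins in [low, high] (assumed definition)."""
    window = hist[low:high + 1]
    total = sum(window)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in window if c)


def is_candidate(pixels, low, high, entropy_threshold):
    """True when the local entropy exceeds the third threshold."""
    hist = gray_histogram(to_gray(pixels))
    return local_entropy(hist, low, high) > entropy_threshold
```

Here low and high play the role of the first and second thresholds and entropy_threshold the role of the third; suitable values would have to be chosen for the material at hand.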
Deduplication module 402, configured to compare the subtitle content of the frame pictures that have subtitle content, so that the same subtitle content corresponds to a unique frame picture;
In this embodiment, as shown in Fig. 7, the deduplication module 402 includes a second acquisition unit 4021, a judging unit 4022, an extraction unit 4023, a first frame picture determination unit 4024, and a second frame picture determination unit 4025, wherein:
Second acquisition unit 4021, configured to acquire the first frame picture among the frame pictures to be processed as the current frame picture and the second frame picture as the comparison frame picture;
Judging unit 4022, configured to judge whether the subtitle content changes between the current frame picture and the comparison frame picture;
Extraction unit 4023, configured to extract the current frame picture as a unique frame picture when the judging unit 4022 judges that the subtitle content of the current frame picture and the comparison frame picture has changed;
First frame picture determination unit 4024, configured to, when the judging unit 4022 judges that the subtitle content of the current frame picture and the comparison frame picture has changed, take the comparison frame picture as the current frame picture and acquire the next frame of the comparison frame picture as the comparison frame picture;
Second frame picture determination unit 4025, configured to, when the judging unit 4022 judges that the subtitle content of the current frame picture and the comparison frame picture has not changed, take either of the current frame picture and the comparison frame picture as the current frame picture and acquire the next frame picture of the comparison frame picture as the comparison frame picture.
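The cooperation of units 4021 to 4025 amounts to a single pass over the frame pictures to be processed, which might be sketched as follows. The names deduplicate and subtitle_changed are hypothetical; subtitle_changed stands in for the comparison performed by the judging unit. Emitting the final current frame picture once the input is exhausted is an assumption, since the text above does not say how the last subtitle is handled.

```python
def deduplicate(frames, subtitle_changed):
    """Keep one frame picture for each run of identical subtitle content.

    `subtitle_changed(current, compare)` is a hypothetical predicate that
    returns True when the two frame pictures show different subtitles.
    """
    if not frames:
        return []
    unique = []
    current = frames[0]              # first frame picture is the current frame
    for compare in frames[1:]:       # each later frame is the comparison frame
        if subtitle_changed(current, compare):
            unique.append(current)   # changed: extract the current frame
            current = compare        # the comparison frame becomes current
        # unchanged: keep the current frame and move to the next comparison
    unique.append(current)           # assumption: flush the final run
    return unique
```

When the subtitle is unchanged the sketch always keeps the existing current frame picture, which is one of the two choices the second frame picture determination unit permits.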
Recognition module 403, configured to recognize the subtitle content in the unique frame picture as subtitle text;
In this embodiment, as shown in Fig. 8, the recognition module 403 may include a plurality of recognition units, specifically configured to perform multi-threaded optical character recognition on the unique frame pictures respectively corresponding to multiple subtitle contents, and to obtain the text information corresponding to each unique frame picture.
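A minimal sketch of the multi-threaded recognition, assuming each unique frame picture is carried as a (timestamp, image) pair and that recognize is a hypothetical callable wrapping whatever optical character recognition engine is used; neither the pair layout nor the function names come from the embodiment.

```python
from concurrent.futures import ThreadPoolExecutor


def recognize_all(unique_frames, recognize, workers=4):
    """Recognize every unique frame picture in a thread pool.

    `unique_frames` is a list of (timestamp, image) pairs and `recognize`
    is a hypothetical OCR callable taking an image and returning its text.
    Returns (timestamp, text) pairs in the original order.
    """
    timestamps = [timestamp for timestamp, _ in unique_frames]
    images = [image for _, image in unique_frames]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        texts = list(pool.map(recognize, images))
    return list(zip(timestamps, texts))
```

Keeping the timestamp next to each recognized text means the result can be handed directly to a step that builds timed subtitle entries.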
This embodiment further provides another recognition module 403, configured to recognize the subtitle content as subtitle text, or configured to send the unique frame picture to a remote server and to receive the text information recognized and returned by the remote server.
Subtitle generation module 404, configured to process the subtitle text to obtain subtitle information.
In this embodiment, the subtitle generation module 404 includes a third acquisition unit and a subtitle generation unit, wherein:
Third acquisition unit, configured to acquire the timestamp information of the unique frame picture;
Subtitle generation unit, configured to generate the subtitle information from the text according to the timestamp information.
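The embodiment does not name a subtitle file format. Purely as an illustration, the sketch below assumes SubRip (SRT) output and assumes that each subtitle lasts until the next unique frame picture begins; format_timestamp, build_srt, and the end_time parameter are hypothetical names.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def build_srt(cues, end_time):
    """Build SRT text from (start_seconds, text) pairs sorted by start time.

    Assumption: a cue ends where the next one starts, and `end_time`
    closes the last cue.
    """
    blocks = []
    for index, (start, text) in enumerate(cues):
        end = cues[index + 1][0] if index + 1 < len(cues) else end_time
        blocks.append(
            f"{index + 1}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text}\n"
        )
    return "\n".join(blocks)
```

For example, build_srt([(1.0, "Hello"), (2.5, "Bye")], 4.0) yields two numbered entries, the first running from 00:00:01,000 to 00:00:02,500.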
Subtitle display module 405, configured to import the subtitle information into the media file and to display the text in the subtitle information synchronously;
Review module 406, configured to send the subtitle information to a remote server, so that when the subtitles of the media file need to be identified again, the calibrated subtitle information is called from the remote server.
In this embodiment of the present invention, the screening module screens the frame pictures of the media file with multiple threads to obtain the frame pictures containing subtitle content, the deduplication module obtains the unique frame picture corresponding to the same subtitle content, the recognition module performs multi-threaded recognition only on the unique frame pictures respectively corresponding to multiple subtitle contents, and the review module reviews and calibrates the subtitle information. Compared with the prior-art scheme, the subtitle identification device of the present invention screens the frame pictures of the media file with multiple threads, which reduces the time required to obtain the frame pictures containing subtitle content. Through deduplication, the several frame pictures corresponding to the same subtitle content no longer need to be recognized repeatedly; only one corresponding frame picture needs to be recognized for each subtitle content to obtain the text information, which improves the efficiency of subtitle recognition. Recognizing the unique frame pictures respectively corresponding to multiple subtitle contents with multiple threads further improves the efficiency of subtitle recognition. Reviewing and calibrating the subtitle information improves the speed and accuracy with which the subtitles of the media file are obtained again.
As shown in Fig. 9, an embodiment of the present invention further provides an electronic device, including: a housing 501, a processor 502, a memory 503, a display screen (not shown), a circuit board 504, and a power supply circuit 505. The circuit board 504 is placed inside the space enclosed by the housing 501; the processor 502 and the memory 503 are arranged on the circuit board 504; the display screen is embedded in the outside of the housing 501 and is connected to the circuit board 504; the power supply circuit 505 is configured to supply power to each circuit or component of the electronic device; the memory 503 is configured to store executable program code and data; and the processor 502 runs the program corresponding to the executable program code by reading the executable program code stored in the memory 503, so as to perform the following steps:
Screening out, from the frame pictures of the media file, the frame pictures to be processed that have subtitle content;
Deduplicating the frame pictures to be processed according to the subtitle content, and obtaining the unique frame picture corresponding to the same subtitle content;
Recognizing the unique frame picture, and obtaining the text information corresponding to the unique frame picture;
Processing the text information to generate subtitle information.
The electronic device in this embodiment of the present invention screens out the frame pictures that have subtitle content, obtains the unique frame picture corresponding to the same subtitle content, and recognizes only that unique frame picture. Compared with the prior-art scheme, the electronic device of the present invention no longer needs to recognize the several frame pictures corresponding to the same subtitle content repeatedly; only one corresponding frame picture needs to be recognized for each subtitle content to obtain the text information, which improves the efficiency of subtitle recognition.
For convenience of description, the parts of the above system are described separately as modules or units divided by function. Of course, when the present invention is implemented, the functions of the modules or units may be realized in one or more pieces of software or hardware.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Therefore, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, the device (system), and the computer program product according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce a computer-implemented process, and the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make further changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be construed as including the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Claims (10)
1. A method for identifying subtitles of a media file, characterized by comprising:
Screening out, from the frame pictures of the media file, the frame pictures to be processed that have subtitle content;
Deduplicating the frame pictures to be processed according to the subtitle content, and obtaining the unique frame picture corresponding to the same subtitle content;
Recognizing the unique frame picture, and obtaining the text information corresponding to the unique frame picture;
Processing the text information to generate subtitle information.
2. The method of claim 1, characterized in that screening out, from the frame pictures of the media file, the frame pictures to be processed that have subtitle content specifically comprises:
Acquiring a frame picture of the media file every fixed number of frames;
Converting the frame picture into a grayscale image;
Counting the gray value of each pixel in the grayscale image to obtain the gray-level histogram of the frame picture;
Selecting a first threshold and a second threshold of the gray value range, and calculating the local information entropy of the gray-level histogram;
Screening out the frame pictures whose local information entropy is greater than a third threshold as the frame pictures to be processed.
3. The method of claim 1, characterized in that screening out, from the frame pictures of the media file, the frame pictures to be processed that have subtitle content specifically comprises:
Acquiring a frame picture of the media file every fixed number of frames and, when several frame pictures have been acquired, processing the several frame pictures with multiple threads, the processing steps of each thread comprising:
Converting the frame picture into a grayscale image;
Counting the gray value of each pixel in the grayscale image to obtain the gray-level histogram of the frame picture;
Selecting a first threshold and a second threshold of the gray value range, and calculating the local information entropy of the gray-level histogram;
Screening out the frame pictures whose local information entropy is greater than a third threshold as the frame pictures to be processed.
4. The method of claim 1, characterized in that deduplicating the frame pictures to be processed according to the subtitle content and obtaining the unique frame picture corresponding to the same subtitle content specifically comprises:
Step 401: acquiring the first frame picture among the frame pictures to be processed as the current frame picture and the second frame picture as the comparison frame picture;
Step 402: judging whether the subtitle content changes between the current frame picture and the comparison frame picture; if it is judged to have changed, performing step 403; if it is judged not to have changed, performing step 404;
Step 403: extracting the current frame picture as a unique frame picture, taking the comparison frame picture as the current frame picture, acquiring the next frame of the comparison frame picture as the comparison frame picture, and performing step 402;
Step 404: taking either of the current frame picture and the comparison frame picture as the current frame picture, acquiring the next frame picture of the comparison frame picture as the comparison frame picture, and performing step 402.
5. The method of claim 1, characterized in that, if unique frame pictures corresponding to multiple subtitle contents are obtained, recognizing the unique frame picture and obtaining the text information corresponding to the unique frame picture specifically comprises:
Performing multi-threaded optical character recognition on the obtained unique frame pictures respectively corresponding to the multiple subtitle contents, and obtaining the text information corresponding to each unique frame picture.
6. The method of claim 1, characterized in that recognizing the unique frame picture and obtaining the text information corresponding to the unique frame picture specifically comprises:
Performing optical character recognition on the unique frame picture to obtain the text information corresponding to the unique frame picture; or
Sending the unique frame picture to a remote server, and receiving the text information recognized and returned by the remote server.
7. The method of claim 1, characterized in that processing the text information to obtain subtitle information specifically comprises:
Acquiring the timestamp information of the unique frame picture;
Generating the subtitle information from the text according to the timestamp information.
8. The method of any one of claims 1-7, characterized in that, after processing the text information to obtain subtitle information, the method further comprises:
Importing the subtitle information into the media file, and synchronously displaying the text in the subtitle information.
9. The method of claim 8, characterized in that, after processing the text information to obtain subtitle information, the method further comprises:
Sending the subtitle information to a remote server, so that the remote server reviews, calibrates, and saves the subtitle information, and, when the subtitles of the media file need to be identified again, calling the calibrated subtitle information from the remote server.
10. A device for identifying subtitles of a media file, characterized by comprising: a screening module, a deduplication module, a recognition module, and a subtitle generation module;
The screening module is configured to screen out, from the frame pictures of the media file, the frame pictures to be processed that have subtitle content;
The deduplication module is configured to deduplicate the frame pictures to be processed according to the subtitle content, and to obtain the unique frame picture corresponding to the same subtitle content;
The recognition module is configured to recognize the unique frame picture and to obtain the text information corresponding to the unique frame picture;
The subtitle generation module is configured to process the text information to generate subtitle information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610681287.8A CN106295592A (en) | 2016-08-17 | 2016-08-17 | Method and device for identifying subtitles of media file and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106295592A true CN106295592A (en) | 2017-01-04 |
Family
ID=57679560
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610681287.8A Pending CN106295592A (en) | 2016-08-17 | 2016-08-17 | Method and device for identifying subtitles of media file and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106295592A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20170104 |