CN108093314A - News video splitting method and device - Google Patents
- Publication number
- CN108093314A CN108093314A CN201711371733.6A CN201711371733A CN108093314A CN 108093314 A CN108093314 A CN 108093314A CN 201711371733 A CN201711371733 A CN 201711371733A CN 108093314 A CN108093314 A CN 108093314A
- Authority
- CN
- China
- Prior art keywords
- shot
- time point
- headline
- news
- video
- Prior art date
- Legal status: Granted (an assumption, not a legal conclusion)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/845—Structuring of content, e.g. decomposing content into time segments
- H04N21/8456—Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/73—Querying
- G06F16/738—Presentation of query results
- G06F16/739—Presentation of query results in form of a video summary, e.g. the video summary being a video sequence, a composite still image or having synthesized frames
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/488—Data services, e.g. news ticker
- H04N21/4884—Data services, e.g. news ticker for displaying subtitles
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/845—Structuring of content, e.g. decomposing content into time segments
- H04N21/8455—Structuring of content, e.g. decomposing content into time segments involving pointers to the content, e.g. pointers to the I-frames of the video stream
Abstract
The invention discloses a news video splitting method and device. The method includes: decomposing a news video to be processed into at least one shot; recording the start time point and end time point of each shot in the news video; extracting m key frames from each shot at a preset time interval; analyzing the m key frames of a shot to obtain the anchor classification information of the shot; detecting news headlines and recording the start time point and end time point of each headline; generating headline label information that marks each shot; and splitting the news video into N news items according to a preset splitting rule, based on the start time point and end time point of each shot in the news video, the anchor classification information of the shot, and the headline label information of the shot. The invention can automatically split a news video based on the anchor information and headline information in the video, improving the efficiency of news video splitting.
Description
Technical field
The present invention relates to the technical field of video processing, and more specifically to a news video splitting method and device.
Background art
News videos contain a large amount of up-to-date information and are of great value to video websites and news applications. Such applications need to split each day's broadcast news programs into individual stories and publish them online, so that users can click and watch the stories that interest them. Because the country has a very large number of television stations, including not only satellite channels but also all kinds of local stations, splitting all news videos manually requires a great deal of labor. At the same time, because news is time-sensitive, the speed requirements for news video segmentation are also very strict, which puts even greater pressure on manual segmentation: most news programs are broadcast within a narrow time window (for example, from 12:00 to 12:30 at noon), and to guarantee timeliness the entire news program must be cut into independent news entries as soon as possible before a deadline, rather than processed later as a backlog.
In summary, the prior art lacks a technical solution for automatically splitting news videos; a large number of staff are needed to split news videos manually, and the labor cost is high. Therefore, how to split news videos automatically, quickly, and effectively is an urgent problem to be solved.
Summary of the invention
In view of this, an object of the present invention is to provide a news video splitting method that can automatically split a news video based on the anchor information and headline information in the video, improving the efficiency of news video splitting.
To achieve the above object, the present invention provides the following technical solution. A news video splitting method, the method including:
decomposing a news video to be processed into at least one shot by clustering the video frames in the news video;
recording the start time point and end time point of each shot in the news video;
calculating the length of each shot from its start time point and end time point, and extracting m key frames from the shot at a preset time interval;
analyzing the m key frames of the shot to obtain the anchor classification information of the shot;
performing headline detection on the news video, and, when a headline is contained in the news video, recording the start time point and end time point of the headline;
generating headline label information that marks the shot, based on the start time point and end time point of the headline and the start time point and end time point of the shot in the news video;
splitting the news video into N news items according to a preset splitting rule, based on the start time point and end time point of each shot in the news video, the anchor classification information of the shot, and the headline label information of the shot, where N is greater than or equal to 1.
Preferably, after performing headline detection on the news video and recording the start time point and end time point of the headline when the news video contains one, the method further includes:
performing a deduplication operation on the detected headlines of the news video, and recording the start time point and end time point of each headline remaining after deduplication.
Correspondingly, generating the headline label information that marks the shot includes:
generating the headline label information that marks the shot, based on the start time point and end time point of each headline remaining after deduplication and the start time point and end time point of the shot in the news video.
Preferably, analyzing the m key frames of the shot to obtain the anchor classification information of the shot includes:
inputting each key frame of the shot into a classifier trained in advance to generate the anchor class of each key frame;
counting the anchor classes of all key frames of the shot, and determining the most frequent anchor class as the anchor classification information of the shot.
Preferably, performing headline detection on the news video and recording the start time point and end time point of the headline when the news video contains one includes:
determining a preset region of the video frames of the news video as a candidate region;
tracking the image in the candidate region to generate a tracking result;
judging whether the candidate region is a headline region based on the tracking result, and if so, determining the time point at which the headline region appears as the start time point of the headline and the time point at which the headline region disappears as the end time point of the headline.
Preferably, generating the headline label information that marks the shot, based on the start time point and end time point of the headline and the start time point and end time point of the shot in the news video, includes:
comparing the start time point and end time point of the headline with the start time point and end time point of the shot in the news video;
generating first headline label information when the start time point and end time point of the headline fall within the period formed by the start time point and end time point of the shot in the news video;
generating second headline label information when the start time point and end time point of the headline do not fall within the period formed by the start time point and end time point of the shot in the news video.
Preferably, splitting the news video into N news items according to the preset splitting rule, based on the start time point and end time point of each shot in the news video, the anchor classification information of the shot, and the headline label information of the shot, includes:
splitting the news video according to the information sequence V = {S_i}, where i = 0, 1, ..., N, S_i = {T_i, A_i, C_i, Cs_i}, T_i represents the start time point and end time point of a shot in the video, A_i represents the anchor classification information contained in the shot, C_i represents the headline label information of the shot, and Cs_i represents whether the headline is a new headline.
A news video splitting device, including:
a decomposition module, configured to decompose a news video to be processed into at least one shot by clustering the video frames in the news video;
a first recording module, configured to record the start time point and end time point of each shot in the news video;
an extraction module, configured to calculate the length of each shot from its start time point and end time point, and to extract m key frames from the shot at a preset time interval;
an analysis module, configured to analyze the m key frames of the shot to obtain the anchor classification information of the shot;
a second recording module, configured to perform headline detection on the news video and, when a headline is contained in the news video, to record the start time point and end time point of the headline;
a generation module, configured to generate headline label information that marks the shot, based on the start time point and end time point of the headline and the start time point and end time point of the shot in the news video;
a splitting module, configured to split the news video into N news items according to a preset splitting rule, based on the start time point and end time point of each shot in the news video, the anchor classification information of the shot, and the headline label information of the shot, where N is greater than or equal to 1.
Preferably, the device further includes:
a deduplication module, configured to perform a deduplication operation on the detected headlines of the news video, and to record the start time point and end time point of each headline remaining after deduplication.
Correspondingly, the generation module is configured to generate the headline label information that marks the shot, based on the start time point and end time point of each headline remaining after deduplication and the start time point and end time point of the shot in the news video.
Preferably, the analysis module is specifically configured to:
input each key frame of the shot into a classifier trained in advance to generate the anchor class of each key frame;
count the anchor classes of all key frames of the shot, and determine the most frequent anchor class as the anchor classification information of the shot.
Preferably, the second recording module is specifically configured to:
determine a preset region of the video frames of the news video as a candidate region;
track the image in the candidate region to generate a tracking result;
judge whether the candidate region is a headline region based on the tracking result, and if so, determine the time point at which the headline region appears as the start time point of the headline and the time point at which the headline region disappears as the end time point of the headline.
Preferably, the generation module is specifically configured to:
compare the start time point and end time point of the headline with the start time point and end time point of the shot in the news video;
generate first headline label information when the start time point and end time point of the headline fall within the period formed by the start time point and end time point of the shot in the news video;
generate second headline label information when the start time point and end time point of the headline do not fall within the period formed by the start time point and end time point of the shot in the news video.
Preferably, the splitting module is specifically configured to:
split the news video according to the information sequence V = {S_i}, where i = 0, 1, ..., N, S_i = {T_i, A_i, C_i, Cs_i}, T_i represents the start time point and end time point of a shot in the video, A_i represents the anchor classification information contained in the shot, C_i represents the headline label information of the shot, and Cs_i represents whether the headline is a new headline.
It can be seen from the above technical solution that the invention discloses a news video splitting method. When splitting a news video, the news video to be processed is first decomposed into at least one shot by clustering its video frames; the start time point and end time point of each shot in the news video are recorded; the length of each shot is calculated from its start time point and end time point, and m key frames are extracted from the shot at a preset time interval; the m key frames of the shot are analyzed to obtain the anchor classification information of the shot; headline detection is performed on the news video, and when a headline is contained in the news video, its start time point and end time point are recorded; headline label information that marks the shot is generated based on the start time point and end time point of the headline and the start time point and end time point of the shot in the news video; and the news video is split into N news items according to a preset splitting rule, based on the start time point and end time point of each shot in the news video, the anchor classification information of the shot, and the headline label information of the shot, where N is greater than or equal to 1. The invention can automatically split a news video based on the anchor information and headline information in the video, improving the efficiency of news video splitting.
Description of the drawings
In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a flowchart of Embodiment 1 of a news video splitting method disclosed by the present invention;
Fig. 2 is a flowchart of Embodiment 2 of a news video splitting method disclosed by the present invention;
Fig. 3 is a structural diagram of Embodiment 1 of a news video splitting device disclosed by the present invention;
Fig. 4 is a structural diagram of Embodiment 2 of a news video splitting device disclosed by the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.
As shown in Fig. 1, which is a flowchart of Embodiment 1 of the news video splitting method disclosed by the invention, the method includes the following steps:
S101, decomposing the news video to be processed into at least one shot by clustering the video frames in the news video;
When the news video needs to be split, similar video frames in the news video are first clustered and merged into shots. When decomposing the video into shots, the RGB color histogram H[i] of each video frame of the news video is computed, and the Euclidean distance between the color histograms H[i] of temporally adjacent video frames is calculated. If this Euclidean distance exceeds a preset threshold Th1, a hard cut is considered to have occurred in the shot, and all video frames between the recorded start position and end position form one shot. The distance between the color histogram H[i] of the current video frame and that of the frame n frames before it is also calculated; if this distance exceeds a preset threshold Th2, a gradual transition is considered to have occurred there, and all video frames between the recorded start position and end position form one shot. If neither a hard cut nor a gradual transition occurs, the video is considered to still be inside the same shot.
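The histogram-based cut and gradual-transition test above can be sketched as follows. This is a minimal illustration, not the patent's implementation; the threshold values th1 and th2 and the look-back distance n are assumed, and histogram distance is plain Euclidean distance on normalized per-channel histograms.

```python
import numpy as np

def rgb_histogram(frame, bins=16):
    """Normalized per-channel RGB histogram, concatenated into one vector."""
    hists = [np.histogram(frame[..., c], bins=bins, range=(0, 256))[0]
             for c in range(3)]
    h = np.concatenate(hists).astype(float)
    return h / h.sum()

def detect_shots(frames, th1=0.5, n=10, th2=0.7):
    """Split a frame sequence into (start, end) shot index pairs.

    A hard cut is declared when the histogram distance between adjacent
    frames exceeds th1; a gradual transition when the distance to the frame
    n frames earlier exceeds th2. th1, n, and th2 are illustrative values,
    not taken from the patent.
    """
    hists = [rgb_histogram(f) for f in frames]
    shots, start = [], 0
    for i in range(1, len(frames)):
        hard_cut = np.linalg.norm(hists[i] - hists[i - 1]) > th1
        gradual = i - n >= start and np.linalg.norm(hists[i] - hists[i - n]) > th2
        if hard_cut or gradual:
            # all frames between the start and end positions form one shot
            shots.append((start, i - 1))
            start = i
    shots.append((start, len(frames) - 1))
    return shots
```

On a synthetic sequence of five black frames followed by five white frames, the adjacent-frame distance spikes at the color change and the function returns two shots.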
S102, recording the start time point and end time point of each shot in the news video;
After the news video to be processed is decomposed into at least one shot, the start time point and end time point of each shot in the news video are recorded.
S103, calculating the length of each shot from its start time point and end time point, and extracting m key frames from the shot at a preset time interval;
The length of the shot is calculated from the recorded start time point and end time point, and the number m of key frames to extract is set. The rule can be described as: when the shot length is less than 2 s, m = 1; when the shot length is less than 4 s, m = 2; when the shot length is less than 10 s, m = 3; and when the shot length is greater than 10 s, m = 4 (these parameters can be adjusted). The m frames are extracted from the shot as representative frames: the extraction interval is calculated as gap = (end position - start position)/(m + 1), and video frames are extracted from the start of the shot at intervals of gap as the key frames.
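The key-frame count rule and the gap sampling can be written out directly. A small sketch, with shot boundaries expressed in seconds for simplicity (the text speaks of frame positions; the arithmetic is the same):

```python
def keyframe_count(length_s):
    """Number of key frames m as a function of shot length, per the rule in
    the text (the boundary values are stated to be adjustable)."""
    if length_s < 2:
        return 1
    if length_s < 4:
        return 2
    if length_s < 10:
        return 3
    return 4

def keyframe_positions(start, end):
    """Sample m positions at interval gap = (end - start) / (m + 1),
    starting one gap after the shot start."""
    m = keyframe_count(end - start)
    gap = (end - start) / (m + 1)
    return [start + gap * (k + 1) for k in range(m)]
```

For a 10-second shot starting at 0, m = 4 and the key frames fall at 2, 4, 6, and 8 seconds, evenly spaced away from both shot boundaries.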
S104, analyzing the m key frames of the shot to obtain the anchor classification information of the shot;
Each key frame is then analyzed separately to obtain the anchor classification information of the shot.
S105, performing headline detection on the news video, and recording the start time point and end time point of the headline when the news video contains a headline;
Meanwhile, headline detection and analysis are performed on the news video to judge whether the news video contains a headline. When the news video contains a headline, the start time point and end time point of the headline are recorded.
S106, generating headline label information that marks the shot, based on the start time point and end time point of the headline and the start time point and end time point of the shot in the news video;
Headline label information that marks the shot is then generated according to the recorded start time point and end time point of the headline and the start time point and end time point of the shot in the news video, i.e., marking whether the shot contains a headline.
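The containment test behind the first/second headline label can be sketched in a few lines; the numeric labels 1 and 2 standing for the "first" and "second" headline label information are an illustrative choice:

```python
def headline_label(title_span, shot_span):
    """Return 1 (first headline label) if the headline's [start, end] lies
    entirely within the shot's [start, end], else 2 (second label)."""
    (title_start, title_end), (shot_start, shot_end) = title_span, shot_span
    if shot_start <= title_start and title_end <= shot_end:
        return 1
    return 2
```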
S107, splitting the news video into N news items according to a preset splitting rule, based on the start time point and end time point of each shot in the news video, the anchor classification information of the shot, and the headline label information of the shot, where N is greater than or equal to 1;
Finally, the news video to be processed is split into N news items according to the obtained start time point and end time point of each shot in the news video, the anchor classification information of the shot, and the headline label information of the shot, where N is greater than or equal to 1.
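The concrete splitting rule is left "preset" in the text. As a sketch only, the following groups the per-shot tuples S_i = (T_i, A_i, C_i, Cs_i) into news items under one plausible rule: a new item begins at an anchor shot that follows non-anchor footage, or when Cs_i marks a new headline. The rule, the 'none' anchor value, and the tuple layout are all assumptions for illustration.

```python
def split_news(sequence):
    """Group per-shot tuples (T_i, A_i, C_i, Cs_i) into news items.

    Assumed rule (not specified by the patent): start a new item when the
    shot carries a new headline (Cs_i is True), or when an anchor shot
    (A_i != 'none') directly follows a non-anchor shot.
    """
    items, current = [], []
    for shot in sequence:
        t_span, anchor, label, is_new_title = shot
        starts_item = bool(current) and (
            is_new_title or (anchor != 'none' and current[-1][1] == 'none'))
        if starts_item:
            items.append(current)
            current = []
        current.append(shot)
    if current:
        items.append(current)
    return items
```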
In conclusion in the above-described embodiments, when needing to generate the poster figure of news video, first by new to target
The video frame heard in video is clustered, and targeted news video is decomposed at least one camera lens, each camera lens is then recorded and exists
Point and end time point at the beginning of in the targeted news video;Point and end time at the beginning of based on camera lens
The length for the camera lens that point calculates, the m frame key frames of camera lens are extracted according to prefixed time interval, record each key frame in mesh
Point and end time point, respectively handle each key frame, generate key frame at the beginning of marking in news video
Host's label information, at the same to targeted news video carry out headline detection, when in targeted news video include news mark
During topic, the start time point of headline and end time point, start time point and end based on headline are recorded
Time point and key frame in targeted news video at the beginning of point and end time point, generate and key frame be marked
Headline label information, be finally based on host's label information of all key frames and headline label information, it is raw
It, can be based on the host's information and headline Automatic generation of information in news-video into the poster figure of targeted news video
The poster figure of news-video content can be characterized, the poster figure generation form of news-video in the prior art effectively solved is single,
The problem of poor user experience.
As shown in Fig. 2, which is a flowchart of Embodiment 2 of the news video splitting method disclosed by the invention, the present embodiment, on the basis of Embodiment 1 above, further includes the following after performing headline detection on the news video and recording the start time point and end time point of the headline when the news video contains one:
S201, performing a deduplication operation on the detected headlines of the news video, and recording the start time point and end time point of each headline remaining after deduplication;
Observation of news data shows that the same headline is often displayed repeatedly within a single news item. If the appearance of a headline alone were relied upon to split the news, the news would be over-segmented. Therefore, a deduplication operation can further be performed on the detected headlines of the news video, and the start time point and end time point of each headline remaining after deduplication are recorded.
When deduplicating the detected headlines of the news video, suppose the n-th title has start and end frame positions t1 and t2 and position CRn(x, y, w, h) in the video frame; denote this title Cn[t1, t2]. The two titles before it are Cn-1[t3, t4] and Cn-2[t5, t6], with positions CRn-1 and CRn-2 in the video frame.
Step 1: compare the current title Cn with the preceding title Cn-1 by the ratio of their repeated (overlapping) region in the video, i.e., calculate the overlap ratio R1 of CRn and CRn-1. If R1 >= Thr, the two titles are considered to need a deduplication comparison; go to step 2. Otherwise, continue by comparing the region overlap R2 of Cn and Cn-2; if R2 >= Thr, the two titles are considered to need a deduplication comparison, and go to step 2; otherwise Cn is considered not to be a repeated title.
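The overlap ratio of two title rectangles can be computed as below. The patent does not define the ratio's denominator; this sketch assumes intersection area over the smaller rectangle's area.

```python
def overlap_ratio(a, b):
    """Overlap ratio of two title rectangles given as (x, y, w, h).

    Assumed definition: intersection area divided by the smaller
    rectangle's area (the patent leaves the exact ratio unspecified).
    """
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = min(ax + aw, bx + bw) - max(ax, bx)
    inter_h = min(ay + ah, by + bh) - max(ay, by)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # rectangles do not overlap
    return (inter_w * inter_h) / min(aw * ah, bw * bh)
```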
Step 2: for the two input titles, choose one frame each to represent its content. For Cn, choose the video frame at time (t1+t2)/2; within CRn(x, y, w, h), set a contrast region rect:
rect.x = x + w*R1;
rect.y = y + h*R2;
rect.w = w*R3;
rect.h = h*R4;
where R1, R2, R3, and R4 are preset parameters.
The image inside rect in the selected video frame is taken as IMG1. For Cn-1 (or Cn-2), choose the video frame at time (t3+t4)/2 (or (t5+t6)/2), take the image in the same region rect, and denote it IMG2.
Step 3: convert the two input images from RGB color space to grayscale or to any luminance-chrominance separated color space (such as YUV, HSV, HSL, or LAB). For gray space, the conversion formula is:
Gray = R*0.299 + G*0.587 + B*0.114;
For a luminance-chrominance separated space, taking HSL as an example, the conversion formula for the lightness L is:
L = (max(R, G, B) + min(R, G, B))/2.
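Both conversion formulas above are one-liners per pixel:

```python
def to_gray(r, g, b):
    """Grayscale value per the text's formula (the ITU-R BT.601 luma weights)."""
    return r * 0.299 + g * 0.587 + b * 0.114

def hsl_lightness(r, g, b):
    """HSL lightness: mean of the maximum and minimum channel values."""
    return (max(r, g, b) + min(r, g, b)) / 2
```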
Step 4: calculate the segmentation threshold. For the grayscale or luminance image of IMG1, the segmentation threshold is calculated with the OTSU method, which can be described as:
(1) Assume the grayscale image I can be divided into N gray levels (N <= 256); an N-bin grey-level histogram H can be extracted for these N levels, with the gray value corresponding to bin i given by x(i) = i*256/N.
(2) For each t (0 <= t < N) in the histogram, compute the between-class variance sigma^2(t) = w0(t)*w1(t)*(mu0(t) - mu1(t))^2, where w0(t) and w1(t) are the proportions of pixels below and at-or-above level t, and mu0(t) and mu1(t) are the mean gray levels of the two classes, all obtained from H. (The formula in the original is garbled; this is the standard OTSU between-class variance.)
(3) The x(t) corresponding to the t that maximizes sigma^2(t) is used as the segmentation threshold Th.
Step 5: binarize IMG1 and IMG2. For each pixel (x, y) of image IMG1 or IMG2, the corresponding pixel of the binary image B is: if I(x, y) < Th, then B(x, y) = 0; if I(x, y) >= Th, then B(x, y) = 255.
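Steps 4 and 5 can be sketched with NumPy. This is a plain exhaustive-search OTSU over the N-bin histogram, maximizing the between-class variance, followed by the thresholding rule from step 5:

```python
import numpy as np

def otsu_threshold(img, n_levels=256):
    """OTSU: choose t maximizing w0*w1*(mu0 - mu1)^2, return x(t) = t*256/N."""
    hist, _ = np.histogram(img, bins=n_levels, range=(0, 256))
    p = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, n_levels):
        w0, w1 = p[:t].sum(), p[t:].sum()
        if w0 == 0 or w1 == 0:
            continue  # one class empty: variance undefined, skip
        mu0 = (np.arange(t) * p[:t]).sum() / w0
        mu1 = (np.arange(t, n_levels) * p[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t * 256 / n_levels

def binarize(img, th):
    """B(x, y) = 0 if I(x, y) < Th, else 255 (step 5)."""
    return np.where(img < th, 0, 255)
```

On a synthetic image whose left half is gray level 50 and right half is 200, the threshold lands between the two modes and binarization splits the image cleanly.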
Step 6: take the binary images B1 and B2 of IMG1 and IMG2, compute the point-by-point difference, and calculate the mean difference Diff = (1/(W*H)) * sum over (x, y) of |B1(x, y) - B2(x, y)|, where W and H are the width and height of the rect region. (The summation formula is missing in the original; this is the mean absolute difference implied by the surrounding text.)
Step 7 compares Diff and preset threshold value, then think if less than threshold value two it is entitled identical
Title, by associated camera lens in the time range [t1, t2] of Cn labeled as identical subtitle, otherwise labeled as different subtitles.
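Steps 6 and 7 together reduce to a mean-absolute-difference test; the threshold value below is illustrative, not from the patent:

```python
import numpy as np

def titles_match(b1, b2, diff_th=30.0):
    """Mean absolute difference of two binarized title images (step 6),
    compared against a preset threshold (step 7). diff_th is assumed."""
    diff = np.abs(b1.astype(float) - b2.astype(float)).mean()
    return diff < diff_th
```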
Correspondingly, S202: generating headline label information that marks the shot, based on the start time point and end time point of each headline remaining after deduplication and the start time point and end time point of the shot in the news video;
Headline label information that marks the shot is then generated according to the recorded start time point and end time point of each headline remaining after deduplication and the start time point and end time point of the shot in the news video, marking whether the shot contains a headline.
S203, splitting the news video into N news items according to the preset splitting rule, based on the start time point and end time point of each shot in the news video, the anchor classification information of the shot, and the headline label information of the shot, where N is greater than or equal to 1;
Finally, the news video to be processed is split into N news items according to the obtained start time point and end time point of each shot in the news video, the anchor classification information of the shot, and the headline label information of the shot, where N is greater than or equal to 1.
In conclusion in the above-described embodiments, it, can be further pending to what is detected on the basis of embodiment 1
The headline of news video carry out deduplication operation, the problem of effectively preventing the over-segmentation of news.
Specifically, in the above embodiments, one implementation of analyzing the m key frames of a shot to obtain the anchor classification information of the shot can be:
Each key frame of the shot is input into a classifier trained in advance to generate the anchor class of each key frame; the anchor classes of all key frames of the shot are counted, and the most frequent anchor class is determined as the anchor classification information of the shot.
That is, the several key frames chosen for each shot are input into the pre-trained classifier for anchor classification, the results of the several frames are voted on, and the class with the most votes is chosen as the class of the shot.
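The per-shot vote is a plain majority count over the key-frame predictions:

```python
from collections import Counter

def shot_anchor_class(keyframe_classes):
    """Majority vote: the most frequent anchor class among a shot's key
    frames becomes the shot's anchor classification information."""
    return Counter(keyframe_classes).most_common(1)[0][0]
```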
The training process of the classifier is as follows: a certain number of video frames are extracted from news program videos of different channels and different programs, and these video frames are manually categorized into four classes: double-anchor seated, single-anchor seated, single-anchor standing, and non-anchor (four classes are used for illustration here, but the classes are not limited to these four). A corresponding classifier is trained using a deep learning method; the training module refers to the process of training a network model according to an open-source deep learning network training method and model structure.
Training process:The deep learning frame progress model increased income using caffe is instructed again (can also be used other depth of increasing income
Learning framework is trained) specific training process is BP neural algorithms, i.e., before to during transmission, export in layer, if output layer
Obtained result has difference then to carry out back transfer with desired value, according to its error with gradient descent method come update its weight and
Threshold values, repeated several times, until error function reaches global minimum, specific algorithm is complicated, and is not original algorithm, belongs to one
As universal method, repeat no more detailed process.By above-mentioned training process, the network model classified is available for.
Classification process: each key frame obtained for each camera lens after shot segmentation is input into the trained model. Following the same model structure and the trained parameters, convolution, pooling and ReLU operations are applied to the image layer by layer, until confidence probabilities P1, P2, P3 and P4 of the image belonging to the dual-host seated, single-host seated, single-host standing and no-host classes are finally output. The class corresponding to the largest of these is selected as the class category of the unknown image; for example, if P1 is the maximum of (P1, P2, P3, P4), the image belongs to the dual-host seated class. For each camera lens, the number of key frames belonging to each category is counted, and the category with the most key frames is taken as the classification of the camera lens.
Specifically, in the above embodiments, one way to perform headline detection on the pending news video and, when the pending news video contains a headline, record the start time point and end time point of the headline is:
A preset area of the video frames of the pending news video is determined as a candidate region; the image in the candidate region is tracked and a tracking result is generated; based on the tracking result, it is judged whether the candidate region is a headline region. If it is, the time point at which the headline region appears is determined as the start time point of the headline, and the time point at which the headline region disappears is determined as the end time point of the headline.
That is, the idea of the title-detection algorithm is to perform temporally stable headline detection on each video frame of the input news video, obtaining the frame numbers of the first and last frames in which the headline appears in the whole news programme. The temporal position in the video of each camera lens obtained in module A is then compared with the interval in which the headline appears: if the camera lens lies within that interval, it is considered to have a title; otherwise it is considered not to.
Judging in this way, rather than looking for a title in a single image, serves to distinguish rolling captions that may be present: rolling captions in news are usually displayed in a style very similar to the headline, so deciding from a single image alone whether it is a headline would produce errors and degrade the quality of the generated poster image.
The specific algorithm is:
1. Selecting potential candidate regions:
(1) The image in the bottom area of the key frame is chosen as the image to be detected (the bottom area is where most news headlines appear; restricting detection to this region reduces computation and improves detection accuracy). The bottom area is chosen as follows: let the width and height of the key frame be W and H; then the bottom area Rect(rect.x, rect.y, rect.w, rect.h) (a rectangle given by its starting coordinates in the key frame and its width and height) is positioned in the key-frame image as:
Rect.x=0;
Rect.y=H*cut_ratio;
Rect.w=W;
Rect.h=H*(1-cut_ratio);
where cut_ratio is a preset coefficient.
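Under the assumption that cut_ratio is, say, 0.7 (the patent leaves it as a preset coefficient), the bottom-area computation can be sketched as:

```python
def bottom_region(W, H, cut_ratio=0.7):
    """Bottom strip Rect(x, y, w, h) of a W x H key frame, following
    the four Rect formulas above; cut_ratio=0.7 is an assumed value."""
    return (0, round(H * cut_ratio), W, round(H * (1 - cut_ratio)))
```

For a 1920x1080 frame this selects the lowest 30% of the image as the detection strip.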
(2) The selected image to be detected is converted from RGB colour space to grayscale or to any space that separates luminance from colour (such as YUV, HSV, HSL or LAB). For grayscale the conversion formula is:
Gray=R*0.299+G*0.587+B*0.114
For a luminance-colour-separated space, taking HSL as an example, the conversion formula for the lightness L (Lightness) is:
L=(max(R, G, B)+min(R, G, B))/2
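The two conversion formulas translate directly into code; this sketch operates on a single pixel:

```python
def to_gray(r, g, b):
    # Gray = R*0.299 + G*0.587 + B*0.114 (BT.601 luma weights)
    return r * 0.299 + g * 0.587 + b * 0.114

def to_lightness(r, g, b):
    # HSL lightness: mean of the largest and smallest channel values
    return (max(r, g, b) + min(r, g, b)) / 2
```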
(3) Edge features are extracted from the grayscale or luminance image. There are many ways to extract edges, such as the Sobel and Canny operators; this embodiment takes the Sobel operator as an example. The horizontal and vertical edge-gradient operators are each convolved with the grayscale/luminance image to obtain the horizontal edge map Eh and the vertical edge map Ev, and finally the edge-strength map Eall is computed: for any point on the edge map,
Eall(x, y)=sqrt(Ev(x, y)^2+Eh(x, y)^2)
The Sobel operator serves here as the example of the horizontal and vertical edge-gradient operators; other operators are equally applicable.
(4) Eall is compared with a preset threshold The1 and the edge map is binarized: if Eall(x, y)>The1 then E(x, y)=1, else E(x, y)=0.
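Steps (3)-(4) can be sketched as a plain-Python Sobel pass over the interior pixels; a real implementation would use an image library, and the 3x3 kernels below are the standard Sobel operators the text assumes:

```python
import math

SOBEL_H = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient
SOBEL_V = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient

def conv_at(img, x, y, kernel):
    """3x3 convolution of the kernel centred on pixel (x, y)."""
    return sum(img[y + j - 1][x + i - 1] * kernel[j][i]
               for j in range(3) for i in range(3))

def edge_strength(img):
    """Eall(x, y) = sqrt(Eh(x, y)^2 + Ev(x, y)^2) over interior pixels."""
    h, w = len(img), len(img[0])
    return [[math.sqrt(conv_at(img, x, y, SOBEL_H) ** 2 +
                       conv_at(img, x, y, SOBEL_V) ** 2)
             for x in range(1, w - 1)]
            for y in range(1, h - 1)]
```

Binarization against The1 is then a pointwise comparison on the returned map.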
(5) Step (3) is executed separately on each of the RGB channels of the image to be detected, yielding the edge-strength maps Er, Eg and Eb of the three channels.
(6) Er, Eg and Eb are compared with a preset threshold The2 and binarized; taking one channel as an example: if Er(x, y)>The2 then Er(x, y)=1, else Er(x, y)=0. The2 and The1 may be equal or different: if the bottom of the headline box is of a gradual (fade) type, a higher threshold cannot detect the edge of the headline box, and the edges detected with a lower threshold must be used for enhancement; therefore in general The2<The1.
(7) Edge enhancement is performed on the obtained edge image E: E(x, y)=E(x, y) | Er(x, y) | Eg(x, y) | Eb(x, y), giving the final edge map. Steps (5)-(7) are an enhancement stage that can be used or skipped as needed; one channel or all three channels may be enhanced. The purpose is to prevent detection failure when the caption area contains a colour gradient.
(8) A horizontal projection is computed on the final edge map: for each row i, the number Numedge of pixels meeting the following conditions is counted; if Numedge>Thnum, then histogram H[i]=1, otherwise H[i]=0. The conditions are: if at least one pixel among the pixel and its neighbouring pixels has value 1, the edge value of the pixel is considered 1; at the same time, only runs of consecutive edge pixels to the left and right of the pixel whose length exceeds the threshold Thlen are counted (the purpose is to guarantee the presence of a continuous straight line).
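A simplified sketch of the horizontal projection in step (8); it keeps only the run-length condition and omits the neighbourhood smoothing of the edge values:

```python
def row_projection(E, th_num, th_len):
    """H[i] = 1 when row i of the binary edge map E contains more than
    th_num edge pixels lying in horizontal runs longer than th_len
    (the 'continuous straight line' requirement)."""
    H = []
    for row in E:
        count, run = 0, 0
        for v in row:
            if v:
                run += 1
            else:
                if run > th_len:
                    count += run  # only long runs contribute
                run = 0
        if run > th_len:
            count += run
        H.append(1 if count > th_num else 0)
    return H
```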
(9) The histogram H[i] is traversed and the spacing between rows with H[i]==1 is examined; if a spacing exceeds the threshold Throw, the edge-image region between the two rows is taken as a first-stage candidate region. If there is none, processing continues with the next key frame.
(10) For each first-stage candidate region, the vertical edge-projection histogram V is computed: for any column i, if the number of edge pixels equal to 1 in the column exceeds Thv, then V[i]=1, otherwise V[i]=0; V[0]=1 and V[W-1]=1 are forced. The pair with V[i]==1 && V[j]==1 && V[k]==0 for k ∈ (i, j) that maximizes the gap j-i is found in V, and this span is taken as the left and right boundaries of the caption area. The original image of this region is selected as the second-stage candidate region. The method of computing a column's edge pixels is the same as that for a row's edge pixels.
(11) The left and right boundaries of the second-stage candidate region are refined: the original image of the second-stage candidate region is scanned with a sliding window of a certain size (for example 32*32); the colour histogram inside each window is computed, and the number numcolor of non-zero bins in that histogram is counted. Positions of monochrome areas or colour-complex background, i.e. numcolor<Thcolor1 || numcolor>Thcolor2, are found, and the centres of the windows meeting this condition become the new vertical-direction boundaries.
(12) The rectangular area CandidateRect determined by the above method is checked against constraints, which include but are not limited to: the location of the starting point of CandidateRect must lie within a certain image range, the height of CandidateRect must lie within a certain range, and so on. If the constraints are met, the area is considered a headline candidate region. If this candidate region is not already being tracked, processing enters tracking module B; otherwise detection continues in module A.
2. Tracking the found candidate region:
(1) It is determined whether this is the first time the region is tracked: from the processing of the previous moment, this embodiment knows whether one or more regions are in tracking, have completed tracking, or have failed tracking. If there is a region in tracking, its position is compared with the present candidate region; if the two regions have a high degree of overlap in position, the region is known to be in tracking; otherwise the region is determined to be tracked for the first time. 'First tracking' of a region here can mean that the region is tracked for the very first time, or that it is tracked again after a previous tracking has ended. If it is the first tracking, go to (2); if not, exit the method steps of this embodiment.
(2) For a region tracked for the first time, a tracking range within the key frame is set (the candidate region of the input key frame may contain additional background, i.e. area not belonging to the headline, so a tracking area is set to improve tracking accuracy). The setting method is: if the position of the headline candidate region of the key frame is CandidateRect(x, y, w, h) (starting point x, y in the key frame and the corresponding width w and height h), the tracking area track(x, y, w, h) is set as:
Track.x=CandidateRect.x+CandidateRect.w*Xratio1;
Track.y=CandidateRect.y+CandidateRect.h*Yratio1;
Track.w=CandidateRect.w*Xratio2;
Track.h=CandidateRect.h*Yratio2;
where Xratio1, Xratio2, Yratio1 and Yratio2 are preset parameters.
(3) The image in the tracking area of the key frame is chosen and converted from RGB colour space to grayscale or any luminance-colour-separated space (such as YUV, HSV, HSL or LAB). For grayscale the conversion formula is:
Gray=R*0.299+G*0.587+B*0.114
For a luminance-colour-separated space, taking HSL as an example, the conversion formula for the lightness L (Lightness) is:
L=(max(R, G, B)+min(R, G, B))/2
(4) A segmentation threshold is computed: for the grayscale or luminance image, the intensity segmentation threshold is calculated with the OTSU method, which can be described as follows. Assume the gray image I can be divided into N gray levels (N<=256); for these N levels the N-bin gray histogram H of the image is extracted. Each t in the histogram (0<=t<N) corresponds to a gray value
x(t)=t*256/N
The t that maximizes the between-class variance of the two classes separated by t is selected, and the corresponding x(t) is used as the segmentation threshold Thtrack.
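A sketch of the OTSU selection used in step (4), maximizing the between-class variance over an N-bin histogram and mapping the winning bin back through x(t) = t*256/N (the original formula is garbled in the source; standard OTSU is assumed):

```python
def otsu_threshold(hist):
    """Return the segmentation threshold x(t) for an N-bin gray
    histogram, where t maximizes the between-class variance."""
    N, total = len(hist), sum(hist)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0.0
    for t in range(N):
        w0 += hist[t]          # pixels at or below level t
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (total_sum - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t * 256 // N
```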
(5) The image is binarized: for a pixel (x, y) in image I, the corresponding pixel of the reference binary image Bref is: if I(x, y)<Thtrack, Bref(x, y)=0; if I(x, y)>=Thtrack, Bref(x, y)=255.
(6) The colour histogram Href of the image in the tracking area is computed.
(7) The input key frame is converted from RGB colour space to grayscale or any luminance-colour-separated space (such as YUV, HSV, HSL or LAB), with the same formulas as above: Gray=R*0.299+G*0.587+B*0.114 for grayscale and, taking HSL as an example, L=(max(R, G, B)+min(R, G, B))/2 for the lightness.
(8) The grayscale image in the tracking area of the key frame is chosen and binarized: for a pixel (x, y) in image I, the corresponding pixel of the binary image Bcur is: if I(x, y)<Thtrack, Bcur(x, y)=0; if I(x, y)>=Thtrack, Bcur(x, y)=255, where Thtrack is the result obtained in step (4) at the first tracking.
(9) The binary image Bcur of the current frame is differenced point by point against the reference binary image Bref, and the average value Diffbinary of the difference is computed:
Diffbinary=(1/(W*H))*ΣxΣy|Bcur(x, y)-Bref(x, y)|
where W and H are the width and height of the tracking-area image.
(10) The colour histogram Hcur of the present image in the tracking area is computed, and its distance Diffcolor from Href is obtained.
(11) The obtained Diffbinary and Diffcolor are compared with preset thresholds: if Diffbinary<Thbinary && Diffcolor<Thcolor, the region remains in the tracking state and the tracking counter is incremented (tracking_num++); otherwise lost_num is incremented (lost_num++). It should be noted that of the two tracking criteria, colour histogram and binarization, either one alone may be used, or both may be combined.
(12) If lost_num>Thlost, the tracking-finished state is returned together with the frame number of the current key frame (this frame is recorded as the time point at which the headline disappears); otherwise the region remains in tracking. lost_num is introduced to tolerate occasional video-signal interference that distorts images and makes matching fail: with lost_num, the algorithm can survive a discrete number of key frames for which tracking fails.
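The point-by-point comparison of step (9) reduces to a normalised mean absolute difference of the two 0/255 binary images; a sketch:

```python
def binary_diff(b_ref, b_cur):
    """Diffbinary: mean absolute difference of two equally sized
    binary (0/255) tracking-area images, normalised to [0, 1]."""
    H, W = len(b_ref), len(b_ref[0])
    total = sum(abs(b_cur[y][x] - b_ref[y][x])
                for y in range(H) for x in range(W))
    return total / (255.0 * W * H)
```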
3. Judging whether the tracking area is a title area:
When tracking of a candidate region has ended, tracking_num is compared with the preset threshold Thtracking_num; if tracking_num>=Thtracking_num, the image is judged to be a headline region, otherwise a non-headline region.
Specifically, in the above embodiments, one way to generate the headline label information marking a camera lens, based on the start time point and end time point of the headline and the start time point and end time point of the camera lens in the news video, is:
The start time point and end time point of the headline are compared with the start time point and end time point of the camera lens in the news video. When the start time point and end time point of the headline are contained in the period formed by the camera lens's start time point and end time point in the news video, first headline label information is generated; when they are not contained in that period, second headline label information is generated.
Specifically, in the above embodiments, one way to split the news video into N pieces of news information according to the preset splitting rule, based on the start time point and end time point of each camera lens in the news video, the host's classification information of the camera lens and the headline label information of the camera lens, can be:
The news video is split according to the information sequence V={Si}, where i=0, 1, …, N and Si={Ti, Ai, Ci, Csi}; Ti represents the start time point and end time point of the camera lens in the video, Ai represents the host's classification information contained in the camera lens, Ci represents the headline label information of the camera lens, and Csi represents whether the title is a new one.
That is:
Step 1: for each camera lens, if the news starting point is empty, the camera lens is set as the news starting point and processing turns to the next camera lens; if a news starting point has been set, go to step 2.
Step 2: if Ai of Si belongs to the dual-host class, the end point of Ti-1 of camera lens Si-1 is taken as the end point of the news being split; meanwhile, Si itself forms an independent news item whose starting point and end point are the beginning and end of Ti. Two split results are returned, the news starting point is set to empty, and processing turns to the next camera lens.
Step 3: if Ai of Si belongs to the single-host seated or standing class, the end point of Ti-1 of camera lens Si-1 is taken as the end point of the news being split; meanwhile, Si becomes the starting point of a new news item. One split result is returned and processing turns to the next camera lens.
Step 4: if Ai of Si belongs to the no-host class, and Ci indicates a caption with Csi marking it as a new caption, the end point of Ti-1 of camera lens Si-1 is taken as the end point of the news being split; meanwhile, Si becomes the starting point of a new news item. One split result is returned and processing turns to the next camera lens.
Step 5: if none of the above conditions is met, camera lens Si is added to the current news item and processing turns to the next camera lens.
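Steps 1-5 can be sketched as a single pass over the camera-lens sequence; the class labels and the tuple layout (Ti as a (start, end) pair, then Ai, Ci, Csi) are illustrative, not fixed by the patent:

```python
def split_news(shots):
    """Each element of shots is (t, a, c, cs): t = (start, end) time
    points, a = host class, c = has-headline flag, cs = new-headline
    flag. Returns the (start, end) pairs of the split news items."""
    pieces, start, prev_end = [], None, None
    for t, a, c, cs in shots:
        if start is None:                      # step 1: open an item
            start = t[0]
        elif a == "dual-host":                 # step 2
            pieces.append((start, prev_end))
            pieces.append(t)                   # dual-host shot is its own item
            start = None
        elif a in ("single-host-seated", "single-host-standing"):
            pieces.append((start, prev_end))   # step 3
            start = t[0]
        elif a == "no-host" and c and cs:      # step 4
            pieces.append((start, prev_end))
            start = t[0]
        # step 5: otherwise the camera lens joins the current item
        prev_end = t[1]
    if start is not None:
        pieces.append((start, prev_end))
    return pieces
```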
As shown in Figure 3, a structural diagram of Embodiment 1 of a news-video splitting device disclosed by the invention, the device includes:
A decomposing module 301, for decomposing the pending news video into at least one camera lens by clustering the video frames in the pending news video.
When a news video needs to be split, similar video frames in the pending news video are first clustered and merged into camera lenses. In decomposing the video into camera lenses, the RGB colour histogram H[i] of each video frame of the pending news video is computed, and the Euclidean distance between the colour histograms H[i] of temporally adjacent video frames is calculated; if this Euclidean distance exceeds a preset threshold Th1, a hard cut (shear) is considered to have occurred, and all video frames between the recorded start position and end position form one camera lens. The distance between the colour histogram H[i] of the current video frame and that of the video frame n frames before it is also calculated; if this distance exceeds a preset threshold Th2, a gradual transition is considered to have occurred there, and all video frames between the recorded start position and end position form one camera lens. If neither a hard cut nor a gradual transition occurs, the frames are considered to still lie within one camera lens.
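The hard-cut test can be sketched with a coarse per-channel histogram; the bin count and threshold are illustrative, and the gradual-transition test is the same comparison taken n frames apart against Th2:

```python
import math

def rgb_histogram(pixels, bins=8):
    """Coarse concatenated per-channel histogram of (r, g, b) pixels."""
    h = [0] * (3 * bins)
    for r, g, b in pixels:
        h[r * bins // 256] += 1
        h[bins + g * bins // 256] += 1
        h[2 * bins + b * bins // 256] += 1
    return h

def is_cut(prev_pixels, cur_pixels, th1):
    """Shear test: Euclidean distance between adjacent frames'
    colour histograms exceeds the preset threshold Th1."""
    hp, hc = rgb_histogram(prev_pixels), rgb_histogram(cur_pixels)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(hp, hc))) > th1
```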
A first logging module 302, for recording the start time point and end time point of each camera lens in the news video.
After the pending news video has been decomposed into at least one camera lens, the start time point and end time point of each camera lens in the pending news video are recorded.
An abstraction module 303, for extracting m key frames of a camera lens at preset time intervals, based on the length of the camera lens calculated from its start time point and end time point.
According to the length of the camera lens calculated from the recorded start and end time points, the number m of key frames to extract is set; the rule can be described as: when the camera-lens length is less than 2 s, m=1; less than 4 s, m=2; less than 10 s, m=3; more than 10 s, m=4 (these parameters can be adjusted). m frames are extracted from the camera lens as representative frames: the extraction interval is gap=(end position - start position)/(m+1), and video frames are extracted at intervals of gap from the start of the camera lens as key frames.
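The m-selection rule and the gap-based sampling can be sketched as:

```python
def keyframe_count(length_s):
    """Number of key frames m for a camera lens of the given length in
    seconds, using the example thresholds (adjustable parameters)."""
    if length_s < 2:
        return 1
    if length_s < 4:
        return 2
    if length_s < 10:
        return 3
    return 4

def keyframe_positions(start, end):
    """Sample m frames at interval gap = (end - start) / (m + 1)."""
    m = keyframe_count(end - start)
    gap = (end - start) / (m + 1)
    return [start + gap * (i + 1) for i in range(m)]
```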
An analysis module 304, for analysing the m key frames of a camera lens to obtain the host's classification information of the camera lens.
Each key frame is then analysed separately to derive the host's classification information of the camera lens.
A second logging module 305, for performing headline detection on the pending news video and, when the pending news video contains a headline, recording the start time point and end time point of the headline.
Meanwhile, headline detection and analysis are performed on the pending news video to judge whether it contains a headline; when it does, the start time point and end time point of the headline are recorded.
A generation module 306, for generating the headline label information marking a camera lens, based on the start time point and end time point of the headline and the start time point and end time point of the camera lens in the news video.
According to the recorded start and end time points of the headline and the start and end time points of the camera lens in the pending news video, the headline label information marking the camera lens is generated, i.e. whether the camera lens contains a headline is marked.
A splitting module 307, for splitting the news video into N pieces of news information according to the preset splitting rule, based on the start time point and end time point of each camera lens in the news video, the host's classification information of the camera lens and the headline label information of the camera lens, where N is greater than or equal to 1.
Finally, the pending news video is split into N pieces of news information according to the obtained start and end time points of each camera lens in the news video, the host's classification information of each camera lens and the headline label information of each camera lens, where N is greater than or equal to 1.
In conclusion, in the above embodiments, when a poster image of a news video needs to be generated, the targeted news video is first decomposed into at least one camera lens by clustering the video frames in the targeted news video, and the start time point and end time point of each camera lens in the targeted news video are recorded. Based on the camera-lens length calculated from these time points, m key frames of the camera lens are extracted at preset time intervals, the start and end time points of each key frame in the targeted news video are recorded, and each key frame is processed to generate the host label information marking it. Meanwhile, headline detection is performed on the targeted news video; when the targeted news video contains a headline, the start time point and end time point of the headline are recorded, and headline label information marking each key frame is generated based on the headline's start and end time points and the key frame's start and end time points in the targeted news video. Finally, a poster image of the targeted news video is generated based on the host label information and headline label information of all key frames. A poster image characterising the content of the news video can thus be generated automatically from the host information and headline information in the news video, which effectively solves the prior-art problems that news-video poster images are generated in a single fixed form and give a poor user experience.
As shown in Figure 4, a structural diagram of Embodiment 2 of a news-video splitting device disclosed by the invention, this embodiment, on the basis of Embodiment 1 above, further includes, after headline detection has been performed on the pending news video and the start time point and end time point of the headline have been recorded when the pending news video contains a headline:
A deduplication module 401, for performing a deduplication operation on the detected headlines of the pending news video, and recording the start time point and end time point of the headlines remaining after deduplication.
Observation of news data shows that within one news item the same headline is often displayed repeatedly. If the mere appearance of a headline were relied on to split the news, the news would be over-segmented. Therefore, a deduplication operation can further be performed on the detected headlines of the pending news video, and the start and end time points of the headlines remaining after deduplication are recorded.
When the headlines of the pending news video are deduplicated, assume the n-th title has start and end frame positions t1 and t2 and position CRn(x, y, w, h) in the video frame; denote this title Cn[t1, t2]. The two titles before it are Cn-1[t3, t4] and Cn-2[t5, t6], with positions CRn-1 and CRn-2 in the video frame.
Step 1: the current title Cn is compared with the preceding title Cn-1 for the proportion of overlapping region in the video, i.e. the overlap ratio R1 of CRn and CRn-1 is computed. If R1>=Thr, the two titles are deemed to need a deduplication comparison; go to step 2. Otherwise the region overlap R2 of Cn and Cn-2 is compared; if R2>=Thr, the two titles are deemed to need a deduplication comparison, go to step 2; otherwise Cn is considered not to be a repeated title.
Step 2: for each of the two input titles, a frame representing its content is chosen. For Cn, the video frame at moment (t1+t2)/2 is chosen; within CRn(x, y, w, h) a contrast district rect is set:
Rect.x=x+w*R1;
Rect.y=y+h*R2;
Rect.w=w*R3;
Rect.h=h*R4;
where R1, R2, R3 and R4 are preset parameters.
The image inside rect of that video frame is taken as IMG1. For Cn-1 (or Cn-2), the video frame at moment (t3+t4)/2 (or (t5+t6)/2) is chosen, and the image in the same area rect is taken, denoted IMG2.
Step 3: the two input images are converted from RGB colour space to grayscale or any luminance-colour-separated space (such as YUV, HSV, HSL or LAB). For grayscale the conversion formula is:
Gray=R*0.299+G*0.587+B*0.114;
For a luminance-colour-separated space, taking HSL as an example, the conversion formula for the lightness L (Lightness) is:
L=(max(R, G, B)+min(R, G, B))/2.
Step 4: the segmentation threshold is computed. For the grayscale or luminance image of IMG1, the intensity segmentation threshold is calculated with the OTSU method, described as:
(1) Assume the gray image I can be divided into N gray levels (N<=256); for these N levels the N-bin gray histogram H of the image is extracted.
(2) Each t in the histogram (0<=t<N) corresponds to a gray value x(t)=t*256/N.
(3) The t that maximizes the between-class variance of the two classes separated by t is selected, and the corresponding x(t) is used as the segmentation threshold Th.
Step 5: IMG1 and IMG2 are binarized. For a pixel (x, y) of IMG1 or IMG2, the corresponding pixel of its binary image B is: if I(x, y)<Th, B(x, y)=0; if I(x, y)>=Th, B(x, y)=255.
Step 6: the binary images B1 and B2 of IMG1 and IMG2 are differenced point by point, and the average value Diff of the difference is computed:
Diff=(1/(W*H))*ΣxΣy|B1(x, y)-B2(x, y)|
where W and H are the width and height of the rect region.
Step 7: Diff is compared with a preset threshold; if it is below the threshold, the two titles are considered identical, and the camera lenses associated with the time range [t1, t2] of Cn are marked as having the same caption; otherwise they are marked as different captions.
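The step-1 overlap gate and the step-7 decision can be sketched as follows; the overlap is measured here relative to the current title's box, and the thresholds are illustrative:

```python
def overlap_ratio(a, b):
    """Fraction of title box a = (x, y, w, h) covered by box b."""
    ix = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    return ix * iy / float(a[2] * a[3])

def is_repeat_title(cr_n, cr_prev, diff, thr=0.8, th_diff=0.1):
    """Repeat-title test: the boxes must overlap by at least thr, and
    the mean binary difference `diff` from step 6 must stay below
    th_diff."""
    return overlap_ratio(cr_n, cr_prev) >= thr and diff < th_diff
```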
A generation module 402, for generating the headline label information marking a camera lens, based on the start time point and end time point of the headlines remaining after deduplication and the start time point and end time point of the camera lens in the news video.
According to the recorded post-deduplication start and end time points of the headlines and the start and end time points of the camera lens in the pending news video, the headline label information marking the camera lens is generated, i.e. whether the camera lens contains a headline is marked.
A splitting module 403, for splitting the news video into N pieces of news information according to the preset splitting rule, based on the start time point and end time point of each camera lens in the news video, the host's classification information of the camera lens and the headline label information of the camera lens, where N is greater than or equal to 1.
Finally, the pending news video is split into N pieces of news information according to the obtained start and end time points of each camera lens in the news video, the host's classification information of each camera lens and the headline label information of each camera lens, where N is greater than or equal to 1.
In conclusion, in the above embodiments, on the basis of Embodiment 1, a deduplication operation can further be performed on the detected headlines of the pending news video, which effectively prevents over-segmentation of the news.
Specifically, in the above embodiment, the analysis module is specifically used for:
Inputting each key frame of the camera lens into the pre-trained classifier, which outputs a host class category for each key frame; counting the host class categories of all key frames of the camera lens; and determining the category with the largest count as the host's classification information of the camera lens.
That is, the key frames chosen earlier for each camera lens are input into the pre-trained classifier for host-category classification, the per-frame results are treated as votes, and the category with the most votes is chosen as the class of the camera lens.
The classifier is trained as follows: a certain number of video frames are extracted from the videos of different channels and different news programs, and these frames are manually sorted into four categories: dual-host seated, single-host seated, single-host standing, and no host (four classes are used here for illustration; the method is not limited to these four). A corresponding classifier is trained with a deep-learning method; the training module follows open-source deep-learning network training methods and model structures to train the network model.
Training process: the model is trained with the open-source caffe deep-learning framework (other open-source deep-learning frameworks can also be used). The concrete procedure is back-propagation: in the forward pass the output of each layer is computed; if the result at the output layer differs from the expected value, the error is propagated backwards, and the weights and thresholds are updated by gradient descent according to the error. This is repeated until the error function reaches a minimum. The algorithm is involved but is a general-purpose method rather than an original one, so its details are not repeated here. The training process yields a network model that can be used for classification.
Classification process: each key frame obtained for each shot after shot segmentation is input into the trained model. Following the same model structure and the trained parameters, convolution, pooling and ReLU operations are applied to the image layer by layer, until the confidence probabilities P1, P2, P3 and P4 that the image belongs to the dual-anchor seated, single-anchor seated, single-anchor standing and non-anchor classes are finally output. The class corresponding to the largest probability is selected as the classification of this unknown image; for example, if P1 is the maximum of (P1, P2, P3, P4), the image belongs to the dual-anchor seated class. For a shot, the number of key frames belonging to each class is counted, and the class with the most key frames is taken as the class of the shot.
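A minimal sketch of this selection-and-voting logic, assuming the per-frame confidences (P1, P2, P3, P4) have already been produced by the trained network (the class names are descriptive placeholders, not identifiers from the patent):

```python
# Per-frame argmax over confidences, then majority vote per shot.
CLASSES = ["dual_anchor_seated", "single_anchor_seated",
           "single_anchor_standing", "non_anchor"]

def classify_frame(probs):
    """Return the class whose confidence (P1..P4) is highest for one key frame."""
    return CLASSES[max(range(len(probs)), key=lambda i: probs[i])]

def classify_shot(per_frame_probs):
    """Label the shot with the class that the most key frames voted for."""
    votes = [classify_frame(p) for p in per_frame_probs]
    return max(set(votes), key=votes.count)
```

For instance, a shot whose key frames are classified twice as dual-anchor seated and once as non-anchor would be labeled dual-anchor seated.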
Specifically, in the above embodiments, the second recording module is configured to: determine a preset area of the video frames of the news video to be processed as a candidate region; perform tracking processing on the image in the candidate region to generate a tracking result; and judge, based on the tracking result, whether the candidate region is a headline region. If it is, the time point at which the headline region appears is determined as the start time point of the headline, and the time point at which the headline region disappears is determined as the end time point of the headline.
That is, the idea of the title-detection algorithm is to perform temporally stable headline detection on each video frame of the input news video, obtaining the frame numbers of the first and last frames in which the headline appears in the whole news program. The temporal position of each shot obtained in module A is then compared with the interval in which the headline appears: if the shot falls within that interval, the shot is considered to have a title; otherwise it is considered to have none. Judging in this way, rather than looking for a title in a single image, serves to distinguish rolling captions that may be present: rolling captions in news are usually displayed in a style extremely similar to headlines, so judging whether a single image contains a headline in isolation can produce errors and degrade the quality of poster generation.
The specific algorithm is:
1. Select potential candidate regions:
(1) Choose the image in the bottom area of the key frame as the image to be detected (the bottom area is where most news headlines appear; restricting the region reduces computation and improves detection accuracy). The bottom area is chosen as follows:
Assume the key frame has width W and height H. The bottom region Rect(rect.x, rect.y, rect.w, rect.h) (a rectangle given by its starting coordinates within the key frame together with its width and height) is positioned in the key-frame image as:
Rect.x=0;
Rect.y=H*cut_ratio;
Rect.w=W;
Rect.h=H* (1-cut_ratio);
where cut_ratio is a preset coefficient.
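A minimal sketch of the bottom-region formulas above (the cut_ratio default of 0.7 is an illustrative value; the patent only states it is a preset coefficient):

```python
def bottom_region(W, H, cut_ratio=0.7):
    """Return Rect(x, y, w, h) of the caption candidate strip:
    x = 0, y = H*cut_ratio, w = W, h = H*(1 - cut_ratio)."""
    x = 0
    y = int(H * cut_ratio)
    w = W
    h = int(H * (1 - cut_ratio))
    return (x, y, w, h)
```

For a 1920x1080 frame with cut_ratio = 0.75, the candidate strip is the bottom quarter of the frame.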
(2) Convert the selected image to be detected from RGB color space to grayscale or to any color space that separates luminance from chrominance (such as YUV, HSV, HSL or LAB). For grayscale the conversion formula is:
Gray = R*0.299 + G*0.587 + B*0.114
For a luminance-chrominance space, taking HSL as an example, the conversion formula for lightness L is:
L = (max(R, G, B) + min(R, G, B)) / 2
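Both conversion formulas above translate directly into code:

```python
def to_gray(r, g, b):
    """Grayscale conversion: Gray = R*0.299 + G*0.587 + B*0.114."""
    return r * 0.299 + g * 0.587 + b * 0.114

def hsl_lightness(r, g, b):
    """HSL lightness: L = (max(R, G, B) + min(R, G, B)) / 2."""
    return (max(r, g, b) + min(r, g, b)) / 2
```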
(3) Extract the edge features of the grayscale or luminance image. Many edge-extraction methods exist, such as the Sobel and Canny operators; this embodiment takes the Sobel operator as an example. The grayscale/luminance image is convolved with the horizontal edge-gradient operator and the vertical edge-gradient operator to obtain the horizontal edge map Eh and the vertical edge map Ev, and the edge-strength map Eall is finally computed, i.e. for any point on the edge map:
Eall(x, y) = sqrt(Eh(x, y)^2 + Ev(x, y)^2)
The horizontal and vertical edge-gradient operators are illustrated here with the Sobel operator; other operators are equally applicable.
(4) Compare Eall with a preset threshold The1 and binarize the edge map, i.e. if Eall(x, y) > The1 then E(x, y) = 1, else E(x, y) = 0.
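As a concrete illustration of steps (3) and (4), the following sketch convolves a grayscale image with the two 3x3 Sobel kernels, combines them into Eall(x, y) = sqrt(Eh(x, y)^2 + Ev(x, y)^2), and binarizes against The1. Leaving border pixels at zero is a simplifying assumption, not part of the patent:

```python
import math

SOBEL_H = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to vertical edges
SOBEL_V = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to horizontal edges

def convolve3x3(img, k):
    """Valid 3x3 convolution; borders stay 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(img[y + dy][x + dx] * k[dy + 1][dx + 1]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
    return out

def edge_strength(gray):
    """Eall(x, y) = sqrt(Eh(x, y)^2 + Ev(x, y)^2)."""
    eh = convolve3x3(gray, SOBEL_H)
    ev = convolve3x3(gray, SOBEL_V)
    h, w = len(gray), len(gray[0])
    return [[math.sqrt(eh[y][x] ** 2 + ev[y][x] ** 2) for x in range(w)]
            for y in range(h)]

def binarize(eall, the1):
    """E(x, y) = 1 if Eall(x, y) > The1, else 0."""
    return [[1 if v > the1 else 0 for v in row] for row in eall]
```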
(5) Perform the operation of step (3) separately on each of the R, G and B channels of the image to be detected, obtaining the per-channel edge-strength maps Er, Eg and Eb.
(6) Compare Er, Eg and Eb with a preset threshold The2 and binarize the edge maps, e.g. for one channel: if Er(x, y) > The2 then Er(x, y) = 1, else Er(x, y) = 0. The2 and The1 may or may not be equal. If the bottom of the headline box is rendered as a gradient, a higher threshold will fail to detect the edge of the headline box, so edges detected with a lower threshold must be used for enhancement; therefore, in general, The2 < The1.
(7) Perform edge enhancement on the obtained edge image E: E(x, y) = E(x, y) | Er(x, y) | Eg(x, y) | Eb(x, y), giving the final edge map. Steps (5) to (7) are enhancement steps and may be used or omitted as needed; one channel or all three channels may be enhanced. Their purpose is to prevent detection failure when the caption region contains a gradient.
(8) Project the final edge map horizontally: for each row i, count the number Numedge of pixels meeting the conditions below; if Numedge > Thnum then histogram H[i] = 1, otherwise H[i] = 0. The conditions are: if at least one pixel among the pixel and its neighbors has value 1, the pixel's edge value is considered 1; at the same time, runs of consecutive pixels to the left and right of the pixel whose edge value is 1 are counted, and only pixels in runs whose length exceeds the threshold Thlen are kept (the purpose is to guarantee a continuous straight line).
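Step (8) can be sketched as follows; the neighborhood test is paraphrased here as a run-length filter on each row of the binarized edge map, which is an interpretation of the garbled original rather than a verbatim implementation:

```python
def row_histogram(edge_rows, thlen, thnum):
    """H[i] = 1 when row i contains more than thnum edge pixels that sit
    inside a horizontal run of 1s longer than thlen, else H[i] = 0."""
    hist = []
    for row in edge_rows:
        count, run = 0, 0
        for v in row + [0]:          # trailing 0 flushes the final run
            if v == 1:
                run += 1
            else:
                if run > thlen:      # keep only sufficiently long runs
                    count += run
                run = 0
        hist.append(1 if count > thnum else 0)
    return hist
```

A solid caption border produces a long run of edge pixels and sets H[i] = 1, while scattered noise pixels in short runs are discarded.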
(9) Traverse the histogram H[i] and examine the spacing between rows with H[i] == 1. If the spacing exceeds the threshold Throw, the edge-image region between the two rows is taken as a first-stage candidate region; if there is none, processing continues with the next key frame.
(10) For each first-stage candidate region, compute the vertical edge-projection histogram V: for any column i, if the number of edge pixels with value 1 in that column exceeds Thv, then V[i] = 1, otherwise V[i] = 0; V[0] = 1 and V[W-1] = 1 are forced. Find i and j in V with V[i] == 1 and V[j] == 1 and V[k] == 0 for all k in (i, j), maximizing (j - i); this pair is taken as the left and right boundary of the caption region. The original image of this region is selected as the second-stage candidate region. The method for computing a column's edge pixels is the same as that for a row's.
(11) Refine the left and right boundaries of the second-stage candidate region: scan the original image of the second-stage candidate region with a sliding window of a certain size (for example 32*32), compute the color histogram inside each window, and count the number numcolor of non-zero bins in that histogram. Find the positions of single-color areas or areas of complex color, i.e. numcolor < Thcolor1 or numcolor > Thcolor2; the centers of the windows meeting this condition become the new vertical boundaries.
(12) Judge the rectangular region CandidateRect determined by the above method using constraints. The constraints include, but are not limited to: the starting point of CandidateRect must lie within a certain image range, the height of CandidateRect must be within a certain range, and so on. If the conditions are met, the region is considered a headline candidate region. If the candidate region is not already being tracked, tracking module B is entered; otherwise detection continues in module A.
2. Track the candidate region found:
(1) Determine whether this region is being tracked for the first time. After the processing of the previous moment, it is known whether there are one or more regions being tracked, tracked to completion, or whose tracking failed. If there is a region being tracked, its position is compared with the present candidate region; if the two regions have a high degree of overlap, the region is known to be in tracking, otherwise the region is determined to be tracked for the first time. "First tracking" of a region here may mean the region is tracked for the very first time, or tracked again after a previous tracking has ended. If it is a first tracking, go to (2); if not, exit the method steps of this embodiment.
(2) For a region tracked for the first time, set a tracking range within the key frame (since the candidate region of the input key frame may contain additional background, i.e. area not belonging to the headline, a tracking area is set to improve tracking accuracy). The setting method is: if the position of the key frame's headline candidate region is CandidateRect(x, y, w, h) (starting point x, y in the key frame and corresponding width w and height h), the tracking area track(x, y, w, h) is:
Track.x=CandidateRect.x+CandidateRect.w*Xratio1;
Track.y=CandidateRect.y+CandidateRect.h*Yratio1;
Track.w=CandidateRect.w*Xratio2;
Track.h=CandidateRect.h*Yratio2;
where Xratio1, Xratio2, Yratio1 and Yratio2 are preset parameters.
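The four formulas above translate directly into code; the ratio defaults below are illustrative placeholders, since the patent only states they are preset parameters:

```python
def tracking_area(cand, xr1=0.1, yr1=0.1, xr2=0.8, yr2=0.8):
    """Shrink the candidate title rect (x, y, w, h) to the tracking rect:
    track.x = x + w*Xratio1, track.y = y + h*Yratio1,
    track.w = w*Xratio2,     track.h = h*Yratio2."""
    x, y, w, h = cand
    return (x + w * xr1, y + h * yr1, w * xr2, h * yr2)
```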
(3) Select the image within the key frame's tracking area and convert it from RGB color space to grayscale or to any luminance-chrominance space (such as YUV, HSV, HSL or LAB). For grayscale the conversion formula is:
Gray = R*0.299 + G*0.587 + B*0.114
For a luminance-chrominance space, taking HSL as an example, the conversion formula for lightness L is:
L = (max(R, G, B) + min(R, G, B)) / 2
(4) Compute the segmentation threshold: for the grayscale or luminance image, the segmentation threshold is computed with the OTSU method, which is described as follows. Assume the gray image I can be divided into N gray levels (N <= 256), for which the N-level gray histogram H of the image can be extracted. For each t (0 <= t < N) in the histogram, compute:
x(i) = i * 256 / N
and the between-class variance of the two classes split at t, i.e. w0(t) * w1(t) * (mu0(t) - mu1(t))^2, where w0 and w1 are the class proportions and mu0 and mu1 the class means. The x(t) corresponding to the t that maximizes this variance is used as the segmentation threshold Thtrack.
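A standard Otsu implementation matching the description in step (4) might look like the following; the between-class-variance criterion is the usual OTSU formulation, which the garbled original is assumed to intend, and the final rescaling follows the x(i) = i * 256 / N formula above:

```python
def otsu_threshold(hist, n_levels=256):
    """Return x(t) = t*256/n_levels for the t maximizing the between-class
    variance w0 * w1 * (mu0 - mu1)^2 over the n_levels-bin histogram."""
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0 = sum0 = 0
    best_t, best_var = 0, -1.0
    for t in range(n_levels):
        w0 += hist[t]                 # pixels in class 0 (levels <= t)
        if w0 == 0:
            continue
        w1 = total - w0               # pixels in class 1 (levels > t)
        if w1 == 0:
            break
        sum0 += t * hist[t]
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t * 256 // n_levels
```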
(5) Binarize the image: for pixel (x, y) in image I, the pixel of the corresponding reference binary image Bref is: if I(x, y) < Thtrack, Bref(x, y) = 0; if I(x, y) >= Thtrack, Bref(x, y) = 255.
(6) Compute the color histogram Href of the image in the tracking area.
(7) Convert the input key frame from RGB color space to grayscale or to any luminance-chrominance space (such as YUV, HSV, HSL or LAB). For grayscale the conversion formula is:
Gray = R*0.299 + G*0.587 + B*0.114
For a luminance-chrominance space, taking HSL as an example, the conversion formula for lightness L is:
L = (max(R, G, B) + min(R, G, B)) / 2
(8) Select the grayscale image within the key frame's tracking area and binarize it: for pixel (x, y) in image I, the pixel of the corresponding binary image Bcur is: if I(x, y) < Thtrack, Bcur(x, y) = 0; if I(x, y) >= Thtrack, Bcur(x, y) = 255, where Thtrack is the result obtained in step (4) at the first tracking.
(9) Compute the point-by-point difference between the binary image Bcur of the current frame and the reference binary image Bref, and compute the average difference Diffbinary:
Diffbinary = (1 / (W * H)) * sum over all (x, y) of |Bcur(x, y) - Bref(x, y)|
where W and H are the width and height of the tracking-area image.
(10) Compute the color histogram Hcur of the current image in the tracking area, and compute its distance Diffcolor from Href.
(11) Compare the obtained Diffbinary and Diffcolor with preset thresholds. If Diffbinary < Thbinary and Diffcolor < Thcolor, the in-tracking status is returned and the tracking counter tracking_num is incremented; otherwise lost_num is incremented. Note that of the two tracking criteria, color histogram and binarization, either one may be used alone, or both may be combined.
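Steps (9) to (11) can be sketched as below; the L1 histogram distance used for Diffcolor is an illustrative choice, since the patent does not specify which histogram distance is used:

```python
def diff_binary(bref, bcur):
    """Diffbinary: mean absolute pointwise difference of the two
    binarized title crops, (1/(W*H)) * sum |Bcur(x,y) - Bref(x,y)|."""
    h, w = len(bref), len(bref[0])
    return sum(abs(bcur[y][x] - bref[y][x])
               for y in range(h) for x in range(w)) / (w * h)

def diff_color(href, hcur):
    """Diffcolor: L1 distance between the two color histograms
    (one plausible distance; the patent leaves the choice open)."""
    return sum(abs(a - b) for a, b in zip(href, hcur))

def still_tracking(db, dc, thbinary, thcolor):
    """In-tracking when both differences stay below their thresholds."""
    return db < thbinary and dc < thcolor
```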
(12) If lost_num > Thlost, the tracking-finished status is returned together with the frame number of the current key frame (this frame is recorded as the time point at which the headline disappears); otherwise the in-tracking status is returned. The purpose of lost_num is to tolerate interference in individual video signals, which can distort images and cause matching to fail; with lost_num, the algorithm tolerates tracking failure on a small number of scattered key frames.
3. Judge whether the tracking area is a title region:
If tracking of the candidate region has ended, compare tracking_num with the preset threshold Thtracking_num. If tracking_num >= Thtracking_num, the image is judged to be a headline region; otherwise it is a non-headline region.
Specifically, in the above embodiments, the generation module is configured to: compare the start and end time points of the headline with the start and end time points of the shot in the news video; when the start and end time points of the headline fall within the period formed by the start and end time points of the shot in the news video, generate first headline label information; and when they do not, generate second headline label information.
Specifically, in the above embodiments, the splitting module is configured to split the news video according to an information sequence V = {Si}, where i = 0, 1, ..., N and Si = {Ti, Ai, Ci, Csi}; Ti represents the start and end time points of the shot in the video, Ai represents the anchor class information contained in the shot, Ci represents the headline label information of the shot, and Csi represents whether the title is new.
That is:
Step 1: for each shot, if the news starting point is empty, set this shot as the news starting point and turn to processing the next shot; if a news starting point has been set, turn to Step 2.
Step 2: if Ai in Si belongs to the dual-anchor class, take the end point of Ti-1 of shot Si-1 as the end point of the split news item; meanwhile, treat Si as an independent news item whose start and end points are the start and end of Ti. Two split results are returned, the news starting point is set to empty, and processing turns to the next shot.
Step 3: if Ai in Si belongs to the single-anchor seated or single-anchor standing class, take the end point of Ti-1 of shot Si-1 as the end point of the split news item; meanwhile, take Si as the starting point of a new news item. One split result is returned, and processing turns to the next shot.
Step 4: if Ai in Si belongs to the non-anchor class, Ci indicates a caption is present and Csi indicates a new caption, take the end point of Ti-1 of shot Si-1 as the end point of the split news item; meanwhile, take Si as the starting point of a new news item. One split result is returned, and processing turns to the next shot.
Step 5: if none of the above conditions is met, add shot Si to the current news item and turn to processing the next shot.
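The five steps above can be sketched as a single pass over the shot sequence. The field names and class labels below are illustrative assumptions, and the handling of a final unterminated item is not specified by the patent:

```python
def split_news(shots):
    """shots: list of dicts with keys 'span' (start, end), 'anchor'
    ('dual_seated' | 'single_seated' | 'single_standing' | 'non_anchor'),
    'has_title' (bool) and 'new_title' (bool).
    Returns the (start, end) spans of the split news items."""
    items, start = [], None
    for i, s in enumerate(shots):
        if start is None:                        # step 1: open a news item
            start = s['span'][0]
            continue
        prev_end = shots[i - 1]['span'][1]
        if s['anchor'] == 'dual_seated':         # step 2: dual-anchor shot
            items.append((start, prev_end))      # close the running item
            items.append(s['span'])              # the shot itself is one item
            start = None
        elif s['anchor'] in ('single_seated', 'single_standing'):  # step 3
            items.append((start, prev_end))
            start = s['span'][0]
        elif s['has_title'] and s['new_title']:  # step 4: non-anchor, new title
            items.append((start, prev_end))
            start = s['span'][0]
        # step 5: otherwise the shot joins the current item
    if start is not None:                        # flush the trailing item
        items.append((start, shots[-1]['span'][1]))
    return items
```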
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the others, and for identical or similar parts the embodiments may refer to one another.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (12)
1. A news-video splitting method, characterized in that the method comprises:
decomposing a news video to be processed into at least one shot by clustering the video frames in the news video to be processed;
recording a start time point and an end time point of each shot in the news video;
calculating the length of the shot based on its start time point and end time point, and extracting m key frames of the shot at a preset time interval;
analyzing the m key frames of the shot to obtain anchor class information of the shot;
performing headline detection on the news video to be processed, and when the news video to be processed contains a headline, recording a start time point and an end time point of the headline;
generating headline label information marking the shot, based on the start and end time points of the headline and the start and end time points of the shot in the news video;
splitting the news video into N news items according to a preset splitting rule, based on the start and end time points of each shot in the news video, the anchor class information of the shot, and the headline label information of the shot, where N is greater than or equal to 1.
2. The method according to claim 1, characterized in that after performing headline detection on the news video to be processed and, when the news video to be processed contains a headline, recording the start time point and end time point of the headline, the method further comprises:
performing a deduplication operation on the detected headlines of the news video to be processed, and recording the start and end time points of the headlines remaining after deduplication;
correspondingly, generating the headline label information marking the shot, based on the start and end time points of the headline and the start and end time points of the shot in the news video, comprises:
generating the headline label information marking the shot, based on the start and end time points of the headlines remaining after deduplication and the start and end time points of the shot in the news video.
3. The method according to claim 1 or 2, characterized in that analyzing the m key frames of the shot to obtain the anchor class information of the shot comprises:
inputting each key frame of the shot into a pre-trained classifier to generate the anchor classification category corresponding to each key frame;
counting the anchor classification categories of all key frames of the shot, and determining the anchor classification category with the largest count as the anchor class information of the shot.
4. The method according to claim 1 or 2, characterized in that performing headline detection on the news video to be processed and, when the news video to be processed contains a headline, recording the start time point and end time point of the headline comprises:
determining a preset area of the video frames of the news video to be processed as a candidate region;
performing tracking processing on the image in the candidate region to generate a tracking result;
judging, based on the tracking result, whether the candidate region is a headline region, and if so, determining the time point at which the headline region appears as the start time point of the headline, and the time point at which the headline region disappears as the end time point of the headline.
5. The method according to claim 1 or 2, characterized in that generating the headline label information marking the shot, based on the start and end time points of the headline and the start and end time points of the shot in the news video, comprises:
comparing the start and end time points of the headline with the start and end time points of the shot in the news video;
generating first headline label information when the start and end time points of the headline fall within the period formed by the start and end time points of the shot in the news video;
generating second headline label information when the start and end time points of the headline do not fall within the period formed by the start and end time points of the shot in the news video.
6. The method according to claim 1 or 2, characterized in that splitting the news video into N news items according to the preset splitting rule, based on the start and end time points of each shot in the news video, the anchor class information of the shot and the headline label information of the shot, comprises:
splitting the news video according to an information sequence V = {Si}, where i = 0, 1, ..., N and Si = {Ti, Ai, Ci, Csi}; Ti represents the start and end time points of the shot in the video, Ai represents the anchor class information contained in the shot, Ci represents the headline label information of the shot, and Csi represents whether the title is new.
7. A news-video splitting device, characterized by comprising:
a decomposition module, configured to decompose a news video to be processed into at least one shot by clustering the video frames in the news video to be processed;
a first recording module, configured to record a start time point and an end time point of each shot in the news video;
an extraction module, configured to calculate the length of the shot based on its start and end time points, and to extract m key frames of the shot at a preset time interval;
an analysis module, configured to analyze the m key frames of the shot to obtain anchor class information of the shot;
a second recording module, configured to perform headline detection on the news video to be processed and, when the news video to be processed contains a headline, to record a start time point and an end time point of the headline;
a generation module, configured to generate headline label information marking the shot, based on the start and end time points of the headline and the start and end time points of the shot in the news video;
a splitting module, configured to split the news video into N news items according to a preset splitting rule, based on the start and end time points of each shot in the news video, the anchor class information of the shot and the headline label information of the shot, where N is greater than or equal to 1.
8. The device according to claim 7, characterized by further comprising:
a deduplication module, configured to perform a deduplication operation on the detected headlines of the news video to be processed, and to record the start and end time points of the headlines remaining after deduplication;
correspondingly, the generation module is configured to generate the headline label information marking the shot, based on the start and end time points of the headlines remaining after deduplication and the start and end time points of the shot in the news video.
9. The device according to claim 7 or 8, characterized in that the analysis module is specifically configured to:
input each key frame of the shot into a pre-trained classifier to generate the anchor classification category corresponding to each key frame;
count the anchor classification categories of all key frames of the shot, and determine the anchor classification category with the largest count as the anchor class information of the shot.
10. The device according to claim 7 or 8, characterized in that the second recording module is specifically configured to:
determine a preset area of the video frames of the news video to be processed as a candidate region;
perform tracking processing on the image in the candidate region to generate a tracking result;
judge, based on the tracking result, whether the candidate region is a headline region, and if so, determine the time point at which the headline region appears as the start time point of the headline, and the time point at which the headline region disappears as the end time point of the headline.
11. The device according to claim 7 or 8, characterized in that the generation module is specifically configured to:
compare the start and end time points of the headline with the start and end time points of the shot in the news video;
generate first headline label information when the start and end time points of the headline fall within the period formed by the start and end time points of the shot in the news video;
generate second headline label information when the start and end time points of the headline do not fall within the period formed by the start and end time points of the shot in the news video.
12. The device according to claim 7 or 8, characterized in that the splitting module is specifically configured to:
split the news video according to an information sequence V = {Si}, where i = 0, 1, ..., N and Si = {Ti, Ai, Ci, Csi}; Ti represents the start and end time points of the shot in the video, Ai represents the anchor class information contained in the shot, Ci represents the headline label information of the shot, and Csi represents whether the title is new.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711371733.6A CN108093314B (en) | 2017-12-19 | 2017-12-19 | Video news splitting method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108093314A true CN108093314A (en) | 2018-05-29 |
CN108093314B CN108093314B (en) | 2020-09-01 |
Family
ID=62177211
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711371733.6A Active CN108093314B (en) | 2017-12-19 | 2017-12-19 | Video news splitting method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108093314B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020126143A1 (en) * | 2001-03-09 | 2002-09-12 | Lg Electronics, Inc. | Article-based news video content summarizing method and browsing system |
CN101616264A (en) * | 2008-06-27 | 2009-12-30 | 中国科学院自动化研究所 | News video categorization and system |
CN102547139A (en) * | 2010-12-30 | 2012-07-04 | 北京新岸线网络技术有限公司 | Method for splitting news video program, and method and system for cataloging news videos |
CN103546667A (en) * | 2013-10-24 | 2014-01-29 | 中国科学院自动化研究所 | Automatic news splitting method for volume broadcast television supervision |
CN104778230A (en) * | 2015-03-31 | 2015-07-15 | 北京奇艺世纪科技有限公司 | Video data segmentation model training method, video data segmenting method, video data segmentation model training device and video data segmenting device |
CN104780388A (en) * | 2015-03-31 | 2015-07-15 | 北京奇艺世纪科技有限公司 | Video data partitioning method and device |
CN107087211A (en) * | 2017-03-30 | 2017-08-22 | 北京奇艺世纪科技有限公司 | A kind of anchor shots detection method and device |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111314775A (en) * | 2018-12-12 | 2020-06-19 | 华为终端有限公司 | Video splitting method and electronic equipment |
CN111314775B (en) * | 2018-12-12 | 2021-09-07 | 华为终端有限公司 | Video splitting method and electronic equipment |
US11902636B2 (en) | 2018-12-12 | 2024-02-13 | Petal Cloud Technology Co., Ltd. | Video splitting method and electronic device |
CN110267061A (en) * | 2019-04-30 | 2019-09-20 | 新华智云科技有限公司 | A kind of news demolition method and system |
CN110610500A (en) * | 2019-09-06 | 2019-12-24 | 北京信息科技大学 | News video self-adaptive strip splitting method based on dynamic semantic features |
CN110941594A (en) * | 2019-12-16 | 2020-03-31 | 北京奇艺世纪科技有限公司 | Splitting method and device of video file, electronic equipment and storage medium |
CN110941594B (en) * | 2019-12-16 | 2023-04-18 | 北京奇艺世纪科技有限公司 | Splitting method and device of video file, electronic equipment and storage medium |
CN111277859A (en) * | 2020-01-15 | 2020-06-12 | 腾讯科技(深圳)有限公司 | Method and device for acquiring score, computer equipment and storage medium |
CN111277859B (en) * | 2020-01-15 | 2021-12-14 | 腾讯科技(深圳)有限公司 | Method and device for acquiring score, computer equipment and storage medium |
CN113810782A (en) * | 2020-06-12 | 2021-12-17 | 阿里巴巴集团控股有限公司 | Video processing method and device, server and electronic device |
CN113810782B (en) * | 2020-06-12 | 2022-09-27 | 阿里巴巴集团控股有限公司 | Video processing method and device, server and electronic device |
CN113807085A (en) * | 2021-11-19 | 2021-12-17 | 成都索贝数码科技股份有限公司 | Method for extracting title and subtitle aiming at news scene |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108093314A (en) | News video splitting method and device | |
CN107977645A (en) | News video poster image generation method and device | |
CN104063883B (en) | Surveillance video summary generation method combining objects and key frames |
CN109431523B (en) | Autism primary screening device based on non-social voice stimulation behavior paradigm | |
Yang et al. | Lecture video indexing and analysis using video ocr technology | |
CN103258232B (en) | Public place crowd estimation method based on dual cameras |
CN105844621A (en) | Method for detecting quality of printed matter | |
CN104077577A (en) | Trademark detection method based on convolutional neural network | |
EP0821318A3 (en) | Apparatus and method for segmenting image data into windows and for classifying windows as image types | |
Nasir et al. | Automatic passenger counting system using image processing based on skin colour detection approach | |
CN108629319B (en) | Image detection method and system | |
CN102306307B (en) | Positioning method of fixed point noise in color microscopic image sequence | |
CN103714314B (en) | Television video station caption identification method combining edge and color information | |
WO2012005461A2 (en) | Method for automatically calculating information on clouds | |
CN108108733A (en) | News caption detection method and device |
KR20090111939A (en) | Method and apparatus for separating foreground and background from image, Method and apparatus for substituting separated background | |
CN108256508A (en) | News primary and secondary title detection method and device |
CN106570885A (en) | Background modeling method based on brightness and texture fusion threshold value | |
CN106709438A (en) | People counting method based on video conferencing |
CN113065568A (en) | Target detection, attribute identification and tracking method and system | |
CN111242096B (en) | Crowd differentiation method based on people-number gradient |
CN101827224B (en) | Method for detecting anchor shots in news video |
CN108446603B (en) | News title detection method and device | |
CN113012192A (en) | Video total segmentation counting method based on panoramic segmentation multi-target tracking | |
CN102625028B (en) | Method and apparatus for detecting static logos present in video |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||