CN108093314A - News video splitting method and device - Google Patents

News video splitting method and device

Info

Publication number
CN108093314A
CN108093314A (application CN201711371733.6A)
Authority
CN
China
Prior art keywords
shot
time point
news title
news
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711371733.6A
Other languages
Chinese (zh)
Other versions
CN108093314B (en)
Inventor
刘楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd filed Critical Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201711371733.6A priority Critical patent/CN108093314B/en
Publication of CN108093314A publication Critical patent/CN108093314A/en
Application granted granted Critical
Publication of CN108093314B publication Critical patent/CN108093314B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73Querying
    • G06F16/738Presentation of query results
    • G06F16/739Presentation of query results in form of a video summary, e.g. the video summary being a video sequence, a composite still image or having synthesized frames
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/488Data services, e.g. news ticker
    • H04N21/4884Data services, e.g. news ticker for displaying subtitles
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • H04N21/8455Structuring of content, e.g. decomposing content into time segments involving pointers to the content, e.g. pointers to the I-frames of the video stream

Abstract

The invention discloses a news video splitting method and device. The method comprises: decomposing a news video to be processed into at least one shot; recording the start time point and end time point of each shot in the news video; extracting m key frames from each shot at a preset time interval; analyzing the m key frames of a shot to obtain the anchor category information of the shot; detecting news titles and recording the start time point and end time point of each news title; generating news title label information that marks each shot; and splitting the news video into N news items according to a preset splitting rule, based on the start and end time points of each shot in the news video, the anchor category information of the shot, and the news title label information of the shot. The invention can split news video automatically based on the anchor information and news title information in the video, improving the efficiency of news video splitting.

Description

News video splitting method and device
Technical field
The present invention relates to the technical field of video processing, and more specifically to a news video splitting method and device.
Background technology
News video contains a large amount of up-to-date information and is of great value to video websites and news applications. A video website or news application needs to split the complete news video broadcast every day and publish the result, so that users can click on and watch whichever news items interest them. Since there are a large number of TV stations across the country, including all kinds of local stations in addition to the satellite channels, splitting all news video requires a great deal of manpower. At the same time, because news is time-sensitive, the speed requirements for news video segmentation are very strict, which puts even more pressure on manual segmentation: most news video is broadcast within a short window (for example, noon to half past twelve), and to guarantee timeliness the entire news program must be cut into independent news entries as soon as possible, rather than processed as a backlog.
In conclusion in the prior art, the technical solution that news-video is split automatically can often not needed Substantial amounts of staff is wanted manually to split news-video, human cost is higher.Therefore, how quickly and effectively to regarding It is a urgent problem to be solved that frequency news, which carries out automatic split,.
Summary of the invention
In view of this, an object of the present invention is to provide a news video splitting method that can split news video automatically based on the anchor information and news title information in the video, improving the efficiency of news video splitting.
To achieve the above object, the present invention provides the following technical solution. A news video splitting method, the method comprising:
decomposing a news video to be processed into at least one shot by clustering the video frames in the news video;
recording the start time point and end time point of each shot in the news video;
calculating the length of each shot from its start time point and end time point, and extracting m key frames from the shot at a preset time interval;
analyzing the m key frames of the shot to obtain the anchor category information of the shot;
performing news title detection on the news video to be processed, and, when the news video contains a news title, recording the start time point and end time point of the news title;
generating news title label information that marks the shot, based on the start time point and end time point of the news title and the start time point and end time point of the shot in the news video;
splitting the news video into N news items according to a preset splitting rule, based on the start time point and end time point of each shot in the news video, the anchor category information of the shot, and the news title label information of the shot, where N is greater than or equal to 1.
Preferably, after performing news title detection on the news video to be processed and recording the start time point and end time point of a detected news title, the method further comprises:
performing a deduplication operation on the detected news titles, and recording the start time point and end time point of each news title remaining after deduplication;
correspondingly, generating the news title label information that marks the shot comprises:
generating the news title label information based on the start time point and end time point of each news title remaining after deduplication and the start time point and end time point of the shot in the news video.
Preferably, analyzing the m key frames of the shot to obtain the anchor category information of the shot comprises:
feeding each key frame of the shot into a pre-trained classifier to obtain the anchor category of that key frame;
counting the anchor categories of all key frames of the shot, and taking the most frequent category as the anchor category information of the shot.
Preferably, performing news title detection on the news video to be processed and recording the start time point and end time point of a news title comprises:
determining a preset area of the video frames of the news video as a candidate region;
tracking the image in the candidate region and generating a tracking result;
judging, based on the tracking result, whether the candidate region is a news title region; if so, taking the time point at which the news title region appears as the start time point of the news title, and the time point at which it disappears as the end time point of the news title.
Preferably, generating the news title label information that marks the shot comprises:
comparing the start time point and end time point of the news title with the start time point and end time point of the shot in the news video;
when the start time point and end time point of the news title fall within the period formed by the start time point and end time point of the shot in the news video, generating first news title label information;
when they do not fall within that period, generating second news title label information.
Preferably, splitting the news video into N news items according to the preset splitting rule comprises:
splitting the news video according to the information sequence V = {Si}, where i = 0, 1, ..., N and Si = {Ti, Ai, Ci, Csi}; Ti represents the start time point and end time point of the shot in the video, Ai represents the anchor category information of the shot, Ci represents the news title label information of the shot, and Csi represents whether the title is a new title.
A news video splitting device, comprising:
a decomposition module, configured to decompose a news video to be processed into at least one shot by clustering the video frames in the news video;
a first recording module, configured to record the start time point and end time point of each shot in the news video;
an extraction module, configured to calculate the length of each shot from its start time point and end time point and extract m key frames from the shot at a preset time interval;
an analysis module, configured to analyze the m key frames of the shot to obtain the anchor category information of the shot;
a second recording module, configured to perform news title detection on the news video and, when the news video contains a news title, record the start time point and end time point of the news title;
a generation module, configured to generate news title label information that marks the shot, based on the start time point and end time point of the news title and the start time point and end time point of the shot in the news video;
a splitting module, configured to split the news video into N news items according to a preset splitting rule, based on the start time point and end time point of each shot in the news video, the anchor category information of the shot, and the news title label information of the shot, where N is greater than or equal to 1.
Preferably, the device further comprises:
a deduplication module, configured to perform a deduplication operation on the detected news titles and record the start time point and end time point of each news title remaining after deduplication;
correspondingly, the generation module is configured to generate the news title label information based on the start time point and end time point of each news title remaining after deduplication and the start time point and end time point of the shot in the news video.
Preferably, the analysis module is specifically configured to:
feed each key frame of the shot into a pre-trained classifier to obtain the anchor category of that key frame;
count the anchor categories of all key frames of the shot, and take the most frequent category as the anchor category information of the shot.
Preferably, the second recording module is specifically configured to:
determine a preset area of the video frames of the news video as a candidate region;
track the image in the candidate region and generate a tracking result;
judge, based on the tracking result, whether the candidate region is a news title region; if so, take the time point at which the news title region appears as the start time point of the news title, and the time point at which it disappears as the end time point of the news title.
Preferably, the generation module is specifically configured to:
compare the start time point and end time point of the news title with the start time point and end time point of the shot in the news video;
when the start time point and end time point of the news title fall within the period formed by the start time point and end time point of the shot in the news video, generate first news title label information;
when they do not fall within that period, generate second news title label information.
Preferably, the splitting module is specifically configured to:
split the news video according to the information sequence V = {Si}, where i = 0, 1, ..., N and Si = {Ti, Ai, Ci, Csi}; Ti represents the start time point and end time point of the shot in the video, Ai represents the anchor category information of the shot, Ci represents the news title label information of the shot, and Csi represents whether the title is a new title.
It can be seen from the above technical solution that the invention discloses a news video splitting method. When splitting news video, the news video to be processed is first decomposed into at least one shot by clustering its video frames; the start time point and end time point of each shot in the news video are recorded; the length of each shot is calculated from these time points, and m key frames are extracted from the shot at a preset time interval; the m key frames are analyzed to obtain the anchor category information of the shot; news title detection is performed on the news video, and when the news video contains a news title, its start time point and end time point are recorded; news title label information that marks each shot is generated from the title's time points and the shot's time points; finally, based on the start and end time points of each shot, the anchor category information of the shot and the news title label information of the shot, the news video is split into N news items according to a preset splitting rule, where N is greater than or equal to 1. The invention can split news video automatically based on the anchor information and news title information in the video, improving the efficiency of news video splitting.
Description of the drawings
In order to explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a flowchart of embodiment 1 of a news video splitting method disclosed in the present invention;
Fig. 2 is a flowchart of embodiment 2 of a news video splitting method disclosed in the present invention;
Fig. 3 is a structural diagram of embodiment 1 of a news video splitting device disclosed in the present invention;
Fig. 4 is a structural diagram of embodiment 2 of a news video splitting device disclosed in the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.
As shown in Fig. 1, which is a flowchart of embodiment 1 of a news video splitting method disclosed in the present invention, the method comprises the following steps:
S101, decomposing a news video to be processed into at least one shot by clustering the video frames in the news video;
When news video needs to be split, similar video frames in the news video are first clustered and merged into a single shot. To decompose the video into shots, the RGB color histogram H[i] of each video frame is computed, and the Euclidean distance between the color histograms H[i] of temporally adjacent video frames is calculated; if this distance exceeds a preset threshold Th1, a hard cut is considered to have occurred, and all video frames between the recorded start position and end position form one shot. The distance between the color histogram of the current video frame and that of the frame n frames before it is also calculated; if this distance exceeds a preset threshold Th2, a gradual transition is considered to have occurred there, and again all video frames between the start position and end position form one shot. If neither a hard cut nor a gradual transition occurs, the video is still inside the same shot.
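To make the step concrete, the following is a minimal sketch of the histogram-based cut and gradual-transition test described above, assuming OpenCV; the function names, histogram binning, and threshold defaults are illustrative placeholders rather than values from the patent.

```python
import cv2
import numpy as np

def rgb_histogram(frame, bins=16):
    # Per-channel RGB histogram, flattened and L1-normalized (H[i] above).
    hist = cv2.calcHist([frame], [0, 1, 2], None,
                        [bins, bins, bins], [0, 256] * 3).flatten()
    return hist / (hist.sum() + 1e-9)

def detect_shots(video_path, th1=0.5, th2=0.4, n=10):
    """Split a video into shots by comparing color histograms.

    A distance > th1 between adjacent frames is treated as a hard cut;
    a distance > th2 between the current frame and the frame n frames
    earlier is treated as a gradual transition. th1, th2 and n play
    the roles of Th1, Th2 and n in the text (values are placeholders).
    """
    cap = cv2.VideoCapture(video_path)
    hists, shots, start, idx = [], [], 0, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hists.append(rgb_histogram(frame))
        if idx > 0:
            cut = np.linalg.norm(hists[idx] - hists[idx - 1]) > th1
            grad = (idx >= n and
                    np.linalg.norm(hists[idx] - hists[idx - n]) > th2)
            if cut or grad:
                shots.append((start, idx - 1))  # start/end frame of one shot
                start = idx
        idx += 1
    cap.release()
    if idx > start:
        shots.append((start, idx - 1))
    return shots
```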
S102, recording the start time point and end time point of each shot in the news video;
After the news video has been decomposed into at least one shot, the start time point and end time point of each shot in the news video are recorded.
S103, calculating the length of each shot from its recorded start time point and end time point, and extracting m key frames from the shot at a preset time interval;
The number m of key frames to extract is set according to the shot length computed from the recorded start and end time points. The rule can be stated as: when the shot is shorter than 2 s, m = 1; shorter than 4 s, m = 2; shorter than 10 s, m = 3; longer than 10 s, m = 4 (these parameters are adjustable). The m frames are extracted from the shot as representative frames: the sampling interval is gap = (end position - start position) / (m + 1), and video frames are taken every gap frames starting from the beginning of the shot as key frames.
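A short sketch of the sampling rule, assuming a fixed frame rate (25 fps here, purely as a placeholder) to convert the frame span into seconds; the function names are illustrative.

```python
def keyframe_count(duration_s):
    # m as a function of shot length, per the rule above (adjustable).
    if duration_s < 2:
        return 1
    if duration_s < 4:
        return 2
    if duration_s < 10:
        return 3
    return 4

def keyframe_positions(start, end, fps=25.0):
    """Frame indices of the m key frames of a shot [start, end].

    gap = (end - start) / (m + 1); frames are sampled every gap frames
    starting from the shot start, as described above.
    """
    m = keyframe_count((end - start) / fps)
    gap = (end - start) / (m + 1)
    return [int(start + gap * (k + 1)) for k in range(m)]
```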
S104, analyzing the m key frames of the shot to obtain the anchor category information of the shot;
Each key frame is then analyzed separately, and the anchor category information of the shot is derived.
S105, performing news title detection on the news video to be processed, and, when the news video contains a news title, recording the start time point and end time point of the news title;
Meanwhile, news title detection and analysis are performed on the news video to judge whether it contains a news title; when it does, the start time point and end time point of the news title are recorded.
S106, generating news title label information that marks the shot, based on the start time point and end time point of the news title and the start time point and end time point of the shot in the news video;
According to the recorded start and end time points of the news title and the start and end time points of the shot in the news video, news title label information is generated for the shot, marking whether the shot contains a news title.
S107, splitting the news video into N news items according to a preset splitting rule, based on the start time point and end time point of each shot in the news video, the anchor category information of the shot, and the news title label information of the shot, where N is greater than or equal to 1.
Finally, according to the start and end time points of each shot in the news video, the anchor category information of each shot and the news title label information of each shot, the news video is split into N news items, where N is greater than or equal to 1.
In conclusion in the above-described embodiments, when needing to generate the poster figure of news video, first by new to target The video frame heard in video is clustered, and targeted news video is decomposed at least one camera lens, each camera lens is then recorded and exists Point and end time point at the beginning of in the targeted news video;Point and end time at the beginning of based on camera lens The length for the camera lens that point calculates, the m frame key frames of camera lens are extracted according to prefixed time interval, record each key frame in mesh Point and end time point, respectively handle each key frame, generate key frame at the beginning of marking in news video Host's label information, at the same to targeted news video carry out headline detection, when in targeted news video include news mark During topic, the start time point of headline and end time point, start time point and end based on headline are recorded Time point and key frame in targeted news video at the beginning of point and end time point, generate and key frame be marked Headline label information, be finally based on host's label information of all key frames and headline label information, it is raw It, can be based on the host's information and headline Automatic generation of information in news-video into the poster figure of targeted news video The poster figure of news-video content can be characterized, the poster figure generation form of news-video in the prior art effectively solved is single, The problem of poor user experience.
As shown in Fig. 2, which is a flowchart of embodiment 2 of a news video splitting method disclosed in the present invention, this embodiment builds on embodiment 1 above. After performing news title detection on the news video and recording the start time point and end time point of a detected news title, the method further comprises:
S201, performing a deduplication operation on the detected news titles of the news video, and recording the start time point and end time point of each news title remaining after deduplication;
Observation of news data shows that the same news title is often displayed repeatedly within a single news item. If splitting relied only on the appearance of a news title, the news would be over-segmented. Therefore, a deduplication operation can further be performed on the detected news titles, and the start and end time points of the news titles remaining after deduplication are recorded.
When deduplicating the detected news titles, suppose the n-th title has start and end frame positions t1 and t2 and position CRn(x, y, w, h) in the video frame; denote this title Cn[t1, t2]. The two titles before it are Cn-1[t3, t4] and Cn-2[t5, t6], with positions CRn-1 and CRn-2 in the video frame.
Step 1: compare the overlap of the current title Cn with the preceding title Cn-1 in the video frame, i.e. compute the overlap ratio R1 of CRn and CRn-1. If R1 >= Thr, the two titles need a deduplication comparison; go to step 2. Otherwise compare the region overlap R2 of Cn with Cn-2; if R2 >= Thr, the two titles need a deduplication comparison, go to step 2; otherwise Cn is considered not to be a repeated title.
Step 2: for the two input titles, choose one frame each that represents its content. For Cn, take the video frame at time (t1 + t2)/2; within its region CRn(x, y, w, h), set the comparison area rect as:
rect.x = x + w * R1;
rect.y = y + h * R2;
rect.w = w * R3;
rect.h = h * R4;
where R1, R2, R3, R4 are preset parameters.
Take the image inside rect of this video frame as IMG1. For Cn-1 (or Cn-2), take the video frame at time (t3 + t4)/2 (or (t5 + t6)/2) and the image in the same area rect, denoted IMG2.
Step 3: convert the two input images from RGB color space to grayscale or to any luminance-chrominance separated space (such as YUV, HSV, HSL, LAB). For grayscale the conversion formula is:
Gray = R * 0.299 + G * 0.587 + B * 0.114;
For a luminance-chrominance separated space, taking HSL as an example, the conversion formula for the lightness L is:
L = (max(R, G, B) + min(R, G, B)) / 2.
Step 4: compute the segmentation threshold. For the grayscale or luminance image of IMG1, the threshold is computed with the OTSU method, which can be described as:
(1) Suppose the gray image I is quantized into N gray levels (N <= 256); the N-level gray histogram H of the image is extracted for these levels.
(2) For each t (0 <= t < N) in the histogram, compute:
x(i) = i * 256 / N
(3) The t that maximizes the between-class variance sigma^2(t) = w0(t) * w1(t) * (mu0(t) - mu1(t))^2 is selected, and the corresponding x(t) is used as the segmentation threshold Th.
Step 5: binarize IMG1 and IMG2. For a pixel (x, y) of IMG1 or IMG2, the corresponding pixel of its binary image B is: if I(x, y) < Th, B(x, y) = 0; if I(x, y) >= Th, B(x, y) = 255.
Step 6: take the point-by-point difference of the binary images B1 and B2 of IMG1 and IMG2, and compute the average difference Diff = (1 / (W * H)) * sum over (x, y) of |B1(x, y) - B2(x, y)|,
where W and H are the width and height of the rect region.
Step 7: compare Diff with a preset threshold. If it is below the threshold, the two titles are considered identical, and the shots associated with the time range [t1, t2] of Cn are marked as having the same title; otherwise they are marked as having different titles.
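Steps 2 to 7 can be sketched as follows, assuming OpenCV and letting cv2's built-in OTSU routine stand in for the hand-rolled threshold computation of step 4; the intersection-over-union overlap test, crop handling and threshold default are illustrative assumptions.

```python
import cv2
import numpy as np

def overlap_ratio(a, b):
    # Intersection-over-union of two (x, y, w, h) title rectangles;
    # one plausible reading of the "repeat region ratio" of step 1.
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2 = min(a[0] + a[2], b[0] + b[2])
    y2 = min(a[1] + a[3], b[1] + b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def same_title(img1, img2, diff_th=10.0):
    """Steps 3-7: binarize both representative crops (threshold Th taken
    from IMG1, as in steps 4-5) and compare them point by point."""
    g1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
    th, b1 = cv2.threshold(g1, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    _, b2 = cv2.threshold(g2, th, 255, cv2.THRESH_BINARY)
    diff = np.mean(np.abs(b1.astype(np.float32) - b2.astype(np.float32)))
    return diff < diff_th  # diff_th plays the role of the preset threshold
```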
Correspondingly, S202, generating news title label information that marks the shot, based on the start time point and end time point of each news title remaining after deduplication and the start time point and end time point of the shot in the news video;
According to the recorded start and end time points of the news titles remaining after deduplication and the start and end time points of the shot in the news video, news title label information is generated for the shot, marking whether the shot contains a news title.
S203, splitting the news video into N news items according to the preset splitting rule, based on the start time point and end time point of each shot in the news video, the anchor category information of the shot, and the news title label information of the shot, where N is greater than or equal to 1.
Finally, according to the start and end time points of each shot in the news video, the anchor category information of each shot and the news title label information of each shot, the news video is split into N news items, where N is greater than or equal to 1.
In conclusion in the above-described embodiments, it, can be further pending to what is detected on the basis of embodiment 1 The headline of news video carry out deduplication operation, the problem of effectively preventing the over-segmentation of news.
Specifically, in the above embodiments, one way to analyze the m key frames of a shot to obtain the anchor category information of the shot is:
feeding each key frame of the shot into a pre-trained classifier to obtain the anchor category of each key frame; counting the anchor categories of all key frames of the shot; and taking the most frequent category as the anchor category information of the shot.
That is, the key frames chosen for each shot are input to the pre-trained classifier for anchor category classification, the results of the frames are put to a vote, and the category with the most votes is chosen as the category of the shot.
The training process of the classifier is: a certain number of video frames are extracted from different channels and different news programs, and these frames are manually classified into four categories: dual-anchor seated, single-anchor seated, single-anchor standing, and non-anchor (four classes are used for illustration here; the method is not limited to these four). A corresponding classifier is trained using deep learning methods; training here refers to the process of training a network model according to open-source deep-learning training methods and model structures.
Training process: the model is trained with the open-source caffe deep learning framework (other open-source deep learning frameworks can also be used). The specific procedure is the back-propagation algorithm: during forward propagation, the signal is passed layer by layer to the output; if the output differs from the expected value, the error is propagated backwards, and the weights and thresholds are updated by gradient descent according to the error. This is repeated until the error function reaches a global minimum. The algorithm is involved but standard and not original here, so the details are not repeated. Through this training process, a network model usable for classification is obtained.
Classification process: each key frame obtained for each shot after shot segmentation is input into the trained model. Following the model structure and the trained parameters, convolution, pooling and ReLU operations are performed layer by layer until the confidence probabilities P1, P2, P3, P4 that the image belongs to the dual-anchor seated, single-anchor seated, single-anchor standing and non-anchor categories are output; the category corresponding to the largest of these is selected as the category of this unknown image. For example, if P1 is the maximum of (P1, P2, P3, P4), the image belongs to the dual-anchor seated class. For each shot, the number of key frames belonging to each category is counted, and the category with the most key frames is taken as the category of the shot.
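The voting step can be sketched as follows; the category names and the `classify` callable (assumed to return the four confidences P1-P4 of the trained network) are illustrative placeholders.

```python
import numpy as np
from collections import Counter

# The four anchor categories used for illustration above.
CLASSES = ["dual_anchor_seated", "single_anchor_seated",
           "single_anchor_standing", "non_anchor"]

def shot_anchor_class(keyframes, classify):
    """Majority vote over per-keyframe classifier outputs.

    `classify(frame)` is assumed to return (P1, P2, P3, P4); the argmax
    is the frame's category, and the most frequent category wins the shot.
    """
    votes = [CLASSES[int(np.argmax(classify(f)))] for f in keyframes]
    return Counter(votes).most_common(1)[0][0]
```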
Specifically, in the above embodiments, one way to perform news title detection on the news video and, when the news video contains a news title, record the start time point and end time point of the news title is:
determining a preset area of the video frames of the news video as a candidate region; tracking the image in the candidate region and generating a tracking result; judging, based on the tracking result, whether the candidate region is a news title region; and if so, taking the time point at which the news title region appears as the start time point of the news title and the time point at which it disappears as the end time point of the news title.
That is, the idea of the title detection algorithm is to perform temporally stable news title detection on each video frame of the input news video, obtaining the frame numbers of the starting and ending frames of each news title in the whole program. The temporal position of each shot obtained in module A is compared with the position where the news title appears: if the shot falls within the range where the title appears, the shot is considered to have a title; otherwise it is considered to have no title.
Judging in this way, rather than looking for a title in a single image, is intended to distinguish possible rolling captions: rolling captions in news are generally displayed in a style extremely similar to that of news titles, and judging a single image in isolation would produce errors and degrade the quality of the result.
The specific algorithm is:
1. Selecting potential candidate regions:
(1) The image in the bottom area of the key frame can be chosen as the image to be detected (the bottom area is where most news titles appear; the purpose of region selection is to reduce computation and improve detection accuracy). The bottom area is chosen as follows: suppose the width and height of the key frame are W and H; then the bottom region Rect(rect.x, rect.y, rect.w, rect.h) (the starting coordinates of the rectangle in the key frame together with its width and height) is positioned in the key frame image as:
rect.x = 0;
rect.y = H * cut_ratio;
rect.w = W;
rect.h = H * (1 - cut_ratio);
where cut_ratio is a preset coefficient.
(2) The selected image to be detected is converted from RGB color space to grayscale or to any luminance-chrominance separated space (such as YUV, HSV, HSL, LAB). For grayscale the conversion formula is:
Gray = R * 0.299 + G * 0.587 + B * 0.114
For a luminance-chrominance separated space, taking HSL as an example, the conversion formula for the lightness L is:
L = (max(R, G, B) + min(R, G, B)) / 2
(3) For the grayscale or luminance image, the edge features of the image are extracted. There are many ways to extract edges, such as the Sobel operator, the Canny operator and so on; this embodiment uses the Sobel operator as an example (see the sketch after this list):
the grayscale/luminance image is convolved with the horizontal edge gradient operator and the vertical edge gradient operator to obtain the horizontal edge map Eh and the vertical edge map Ev, and finally the edge strength map Eall is computed, i.e. for any point on the edge map, Eall(x, y) = sqrt(Ev(x, y)^2 + Eh(x, y)^2).
The horizontal and vertical edge gradient operators here are Sobel operators by way of example; other operators are equally applicable.
(4) Eall is compared with a preset threshold The1 and the edge map is binarized: if Eall(x, y) > The1, E(x, y) = 1, else E(x, y) = 0.
(5) Step (3) is performed on each of the R, G, B channels of the image to be detected, giving the per-channel edge strength maps Er, Eg, Eb.
(6) Er, Eg, Eb are compared with a preset threshold The2 and binarized, i.e. (for one channel) if Er(x, y) > The2, Er(x, y) = 1, else Er(x, y) = 0. The2 and The1 may be the same or different: if the bottom of the title box is of a gradient type, a higher threshold cannot detect the edge of the title box, and the edges detected with a lower threshold must be used for enhancement; therefore generally The2 < The1.
(7) Edge enhancement is performed on the obtained edge image E: E(x, y) = E(x, y) | Er(x, y) | Eg(x, y) | Eb(x, y), giving the final edge map. Steps (5)-(7) are enhancement steps and may be used or skipped as needed; one channel or all three channels can be enhanced, the purpose being to prevent detection failure when the caption area contains a gradient.
(8) A horizontal projection is performed on the final edge map: the number Numedge of pixels satisfying the following conditions in each row i is counted; if Numedge > Thnum, then histogram H[i] = 1, otherwise H[i] = 0. The conditions are: if at least one pixel among the pixel and its neighbors has value 1, the edge value of that pixel is counted as 1; at the same time, the consecutive pixels to the left and right of the pixel whose edge values are 1 are counted, and only runs longer than the threshold Thlen contribute to the total (the purpose is to guarantee there is a continuous straight line).
(9) The histogram H[i] is traversed and the row spacing between rows with H[i] == 1 is examined; if the spacing is greater than a threshold Throw, the edge image region between these two rows is taken as a first-stage candidate region; if there is none, processing continues with the next key frame.
(10) For each first-stage candidate region, the vertical edge projection histogram V is computed: for any column i, if the number of edge pixels in this column is greater than Thv, V[i] = 1, otherwise V[i] = 0, with V[0] = 1 and V[W-1] = 1 forced. Find in V the pattern V[i] == 1 && V[j] == 1 && V[k] == 0 for k in (i, j) with argmax(i - j); this region is taken as the left and right boundary of the caption area. The original image of this region is selected as the second-stage candidate region. The method for computing the edge pixels of a column is the same as that for a row.
(11) The left and right boundaries of the second-stage candidate region are refined: the original image of the second-stage candidate region is scanned with a sliding window of a certain size (for example 32x32), the color histogram in each window is computed, and the number numcolor of non-zero bins in the histogram is counted to find positions that are monochrome areas or color-complex background areas, i.e. numcolor < Thcolor1 || numcolor > Thcolor2; the centers of the windows meeting this condition become the new vertical boundaries.
(12) The rectangular region CandidateRect determined by the above method is checked against constraints, which include but are not limited to: the starting point of CandidateRect must lie within a certain image range, the height of CandidateRect must be within a certain range, and so on. If the conditions are met, the region is considered a news title candidate region. If this candidate region is not already being tracked, tracking module B is entered; otherwise detection continues in module A.
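A condensed sketch of steps (1)-(4) and (8)-(9) of the candidate search, assuming OpenCV; the run-length test is a simplified stand-in for the neighborhood condition of step (8), and all defaults are placeholders for cut_ratio, The1, Thnum and Thlen.

```python
import cv2
import numpy as np

def title_candidate_rows(frame, cut_ratio=0.7, the1=100, th_num=80, th_len=40):
    """Find rows of the bottom band that carry long horizontal edge runs,
    i.e. rows with H[i] = 1 in the projection of step (8)."""
    h = frame.shape[0]
    band = frame[int(h * cut_ratio):, :]              # (1) bottom region
    gray = cv2.cvtColor(band, cv2.COLOR_BGR2GRAY)     # (2) grayscale
    ev = cv2.Sobel(gray, cv2.CV_32F, 1, 0)            # (3) vertical edges
    eh = cv2.Sobel(gray, cv2.CV_32F, 0, 1)            #     horizontal edges
    eall = np.sqrt(ev ** 2 + eh ** 2)                 #     edge strength Eall
    e = (eall > the1).astype(np.uint8)                # (4) binarized edge map
    rows = []
    for i in range(e.shape[0]):
        run, total = 0, 0
        for v in e[i]:                                # count pixels that lie
            if v:                                     # in runs >= th_len
                run += 1
            else:
                if run >= th_len:
                    total += run
                run = 0
        if run >= th_len:
            total += run
        if total > th_num:
            rows.append(i)                            # candidate row, H[i] = 1
    # Pairs of candidate rows farther apart than Throw bound a
    # first-stage candidate region, as in step (9).
    return rows
```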
2. Tracking the found candidate region:
(1) Determine whether this region is being tracked for the first time. From the processing of the previous moment it is known whether there are one or more regions in the tracking, tracking-completed or tracking-failed state. If there is a region in tracking, its position is compared with the present candidate region; if the two regions have a high degree of overlap in position, this region is known to be in tracking; otherwise the region is determined to be tracked for the first time (where "first tracking" can mean the region is tracked for the very first time, or tracked again after a previous tracking ended). If it is the first tracking, go to (2); if not, exit the method steps of this embodiment.
(2) For a region tracked for the first time, a tracking range within the key frame is set (since the input candidate region may contain additional background, i.e. area not belonging to the news title, a tracking area must be set to improve tracking accuracy). The setting method is: if the position of the news title candidate region in the key frame is CandidateRect(x, y, w, h) (starting point x, y and corresponding width and height w, h), the tracking area track(x, y, w, h) is:
track.x = CandidateRect.x + CandidateRect.w * Xratio1;
track.y = CandidateRect.y + CandidateRect.h * Yratio1;
track.w = CandidateRect.w * Xratio2;
track.h = CandidateRect.h * Yratio2;
where Xratio1, Xratio2, Yratio1, Yratio2 are preset parameters.
(3) The image in the tracking area of the key frame is chosen and converted from RGB color space to grayscale or to any luminance-chrominance separated space (such as YUV, HSV, HSL, LAB). For grayscale the conversion formula is:
Gray = R * 0.299 + G * 0.587 + B * 0.114
For a luminance-chrominance separated space, taking HSL as an example, the conversion formula for the lightness L is:
L = (max(R, G, B) + min(R, G, B)) / 2
(4) The segmentation threshold is computed. For the grayscale or luminance image, the threshold is computed with the OTSU method described above: suppose the gray image I is quantized into N gray levels (N <= 256) and the N-level gray histogram H is extracted; for each t (0 <= t < N) compute:
x(i) = i * 256 / N
and select the t that maximizes the between-class variance; the corresponding x(t) is used as the segmentation threshold Thtrack.
(5) The image is binarized: for a pixel (x, y) in image I, the corresponding pixel of its reference binary image Bref is: if I(x, y) < Thtrack, Bref(x, y) = 0; if I(x, y) >= Thtrack, Bref(x, y) = 255.
(6) The color histogram Href of the image in the tracking area is computed.
(7) Each input key frame is converted from RGB color space to grayscale or to any luminance-chrominance separated space (such as YUV, HSV, HSL, LAB), with the same grayscale formula Gray = R * 0.299 + G * 0.587 + B * 0.114 or, for HSL, L = (max(R, G, B) + min(R, G, B)) / 2.
(8) The grayscale image in the tracking area of the key frame is binarized: for a pixel (x, y) in image I, the corresponding pixel of its binary image Bcur is: if I(x, y) < Thtrack, Bcur(x, y) = 0; if I(x, y) >= Thtrack, Bcur(x, y) = 255, where Thtrack is the result obtained in step (4) at the first tracking.
(9) The binary image Bcur of the current frame is differenced point by point with the reference binary image Bref, and the average difference is computed as Diffbinary = (1 / (W * H)) * sum over (x, y) of |Bcur(x, y) - Bref(x, y)|, where W and H are the width and height of the tracking area image.
(10) The color histogram Hcur of the current image in the tracking area is computed, and its distance Diffcolor to Href is computed.
(11) The obtained Diffbinary and Diffcolor are compared with preset thresholds. If Diffbinary < Thbinary && Diffcolor < Thcolor, the region remains in the tracking state and the tracking counter tracking_num is incremented; otherwise lost_num is incremented. Note that of the two tracking criteria, color histogram and binarization, either one can be used alone or both can be combined.
(12) If lost_num > Thlost, the tracking-finished state is returned together with the frame number of the current key frame (this frame is recorded as the time point at which the news title disappears); otherwise the region remains in tracking. The purpose of lost_num is to avoid individual video signal disturbances that distort the image and cause matching to fail: by introducing lost_num, the algorithm tolerates a discrete number of key frames for which tracking fails.
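The tracking loop of steps (7)-(12) can be sketched as follows, assuming OpenCV; bref is assumed to be a float array, and the defaults are placeholders for Thbinary, Thcolor and Thlost. Whether lost_num should reset on a successful match is not specified in the text, so this sketch simply accumulates it.

```python
import cv2
import numpy as np

def color_hist(img, bins=16):
    # Flattened, L1-normalized color histogram of an image crop.
    h = cv2.calcHist([img], [0, 1, 2], None, [bins] * 3, [0, 256] * 3).flatten()
    return h / (h.sum() + 1e-9)

def track_title(frames, rect, th, bref, href,
                th_binary=15.0, th_color=0.3, th_lost=5):
    """Follow a title region over successive key frames.

    `frames` yields (frame_number, image) pairs after first detection;
    `rect` is the tracking area track(x, y, w, h); `th` is the OTSU
    threshold Thtrack of step (4); bref/href are the reference
    binarization and color histogram of steps (5)-(6).
    """
    x, y, w, h = rect
    tracking_num, lost_num = 0, 0
    for idx, frame in frames:
        crop = frame[y:y + h, x:x + w]
        gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
        bcur = np.where(gray >= th, 255.0, 0.0)                # step (8)
        diff_binary = np.mean(np.abs(bcur - bref))             # step (9)
        diff_color = np.linalg.norm(color_hist(crop) - href)   # step (10)
        if diff_binary < th_binary and diff_color < th_color:
            tracking_num += 1                                  # step (11)
        else:
            lost_num += 1                                      # tolerated losses
            if lost_num > th_lost:
                return tracking_num, idx   # step (12): title disappearance point
    return tracking_num, None
```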
3. Judging whether the tracking area is a title region:
If tracking of the candidate region has ended, tracking_num is compared with a preset threshold Thtracking_num. If tracking_num >= Thtracking_num, this image region is judged to be a news title region; otherwise it is a non-title region.
Specifically, in the above embodiments, one way to generate the news title label information that marks the shot, based on the start time point and end time point of the news title and the start time point and end time point of the shot in the news video, is:
comparing the start time point and end time point of the news title with the start time point and end time point of the shot in the news video; when the start time point and end time point of the news title fall within the period formed by the start time point and end time point of the shot, generating first news title label information; and when they do not, generating second news title label information.
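As a sketch, the label decision is a simple interval-containment test; the boolean return standing in for the first/second label information is an illustrative simplification.

```python
def title_label(shot, titles):
    """True (first label) if some title's [start, end] lies inside the
    shot's own time span; False (second label) otherwise."""
    s_start, s_end = shot
    for t_start, t_end in titles:
        if s_start <= t_start and t_end <= s_end:
            return True
    return False
```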
Specifically, in the above embodiments, one way to split the news video into N news items according to the preset splitting rule, based on the start time point and end time point of each shot, the anchor category information of the shot and the news title label information of the shot, is:
splitting the news video according to the information sequence V = {Si}, where i = 0, 1, ..., N and Si = {Ti, Ai, Ci, Csi}; Ti represents the start time point and end time point of the shot in the video, Ai represents the anchor category information of the shot, Ci represents the news title label information of the shot, and Csi represents whether the title is a new title.
That is, step 1: for each shot in turn, if the news starting point is empty, this shot is set as the news starting point and processing moves to the next shot; if a news starting point has been set, go to step 2.
Step 2: if Ai in Si belongs to the dual-anchor category, the end of Ti-1 of shot Si-1 is taken as the end point of the news item being split; at the same time Si itself is an independent news item whose start and end are the beginning and end of Ti. Two split results are returned, the news starting point is set to empty, and processing moves to the next shot.
Step 3: if Ai in Si belongs to the single-anchor seated or single-anchor standing category, the end of Ti-1 of shot Si-1 is taken as the end point of the news item being split; at the same time Si becomes the starting point of a new news item. One split result is returned, and processing moves to the next shot.
Step 4: if Ai in Si belongs to the non-anchor category, and Ci indicates a title is present and Csi indicates a new title, the end of Ti-1 of shot Si-1 is taken as the end point of the news item being split; at the same time Si becomes the starting point of a new news item. One split result is returned, and processing moves to the next shot.
Step 5: if none of the above conditions is met, shot Si is added to the current news item, and processing moves to the next shot.
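The five rules form a small state machine over the shot sequence. The sketch below is one reading of them, with hypothetical category constants; it is an interpretation of the text, not a verbatim transcription of the patent's logic.

```python
DUAL, SINGLE_SEATED, SINGLE_STANDING, NON_ANCHOR = range(4)

def split_news(sequence):
    """Walk V = {Si} and emit news items as (start, end) time spans.

    Each element of `sequence` is (Ti, Ai, Ci, Csi): the shot's
    (start, end) span, anchor class, title label, and new-title flag.
    """
    items, start = [], None
    for i, (ti, ai, ci, csi) in enumerate(sequence):
        if start is None:
            start = ti[0]                       # step 1: open a news item
            continue
        prev_end = sequence[i - 1][0][1]        # end of Ti-1
        if ai == DUAL:                          # step 2
            items.append((start, prev_end))     # close the running item
            items.append(ti)                    # the dual-anchor shot itself
            start = None
        elif ai in (SINGLE_SEATED, SINGLE_STANDING):   # step 3
            items.append((start, prev_end))
            start = ti[0]
        elif ai == NON_ANCHOR and ci and csi:          # step 4
            items.append((start, prev_end))
            start = ti[0]
        # step 5: otherwise the shot joins the current item
    if start is not None:
        items.append((start, sequence[-1][0][1]))
    return items
```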
As shown in Fig. 3, which is a structural diagram of embodiment 1 of a news video splitting device disclosed in the present invention, the device comprises:
a decomposition module 301, configured to decompose a news video to be processed into at least one shot by clustering the video frames in the news video;
When news video needs to be split, similar video frames in the news video are first clustered and merged into a single shot. To decompose the video into shots, the RGB color histogram H[i] of each video frame is computed, and the Euclidean distance between the color histograms H[i] of temporally adjacent video frames is calculated; if this distance exceeds a preset threshold Th1, a hard cut is considered to have occurred, and all video frames between the recorded start position and end position form one shot. The distance between the color histogram of the current video frame and that of the frame n frames before it is also calculated; if this distance exceeds a preset threshold Th2, a gradual transition is considered to have occurred there, and again all video frames between the start position and end position form one shot. If neither a hard cut nor a gradual transition occurs, the video is still inside the same shot.
a first recording module 302, configured to record the start time point and end time point of each shot in the news video;
After the news video has been decomposed into at least one shot, the start time point and end time point of each shot in the news video are recorded.
an extraction module 303, configured to calculate the length of each shot from its start time point and end time point and extract m key frames from the shot at a preset time interval;
The number m of key frames to extract is set according to the shot length computed from the recorded start and end time points. The rule can be stated as: when the shot is shorter than 2 s, m = 1; shorter than 4 s, m = 2; shorter than 10 s, m = 3; longer than 10 s, m = 4 (these parameters are adjustable). The m frames are extracted from the shot as representative frames: the sampling interval is gap = (end position - start position) / (m + 1), and video frames are taken every gap frames starting from the beginning of the shot as key frames.
an analysis module 304, configured to analyze the m key frames of the shot to obtain the anchor category information of the shot;
Each key frame is then analyzed separately, and the anchor category information of the shot is derived.
a second recording module 305, configured to perform news title detection on the news video and, when the news video contains a news title, record the start time point and end time point of the news title;
Meanwhile, news title detection and analysis are performed on the news video to judge whether it contains a news title; when it does, the start time point and end time point of the news title are recorded.
a generation module 306, configured to generate news title label information that marks the shot, based on the start time point and end time point of the news title and the start time point and end time point of the shot in the news video;
According to the recorded start and end time points of the news title and the start and end time points of the shot in the news video, news title label information is generated for the shot, marking whether the shot contains a news title.
a splitting module 307, configured to split the news video into N news items according to a preset splitting rule, based on the start time point and end time point of each shot in the news video, the anchor category information of the shot, and the news title label information of the shot, where N is greater than or equal to 1.
Finally, according to the start and end time points of each shot in the news video, the anchor category information of each shot and the news title label information of each shot, the news video is split into N news items, where N is greater than or equal to 1.
In conclusion in the above-described embodiments, when needing to generate the poster figure of news video, first by new to target The video frame heard in video is clustered, and targeted news video is decomposed at least one camera lens, each camera lens is then recorded and exists Point and end time point at the beginning of in the targeted news video;Point and end time at the beginning of based on camera lens The length for the camera lens that point calculates, the m frame key frames of camera lens are extracted according to prefixed time interval, record each key frame in mesh Point and end time point, respectively handle each key frame, generate key frame at the beginning of marking in news video Host's label information, at the same to targeted news video carry out headline detection, when in targeted news video include news mark During topic, the start time point of headline and end time point, start time point and end based on headline are recorded Time point and key frame in targeted news video at the beginning of point and end time point, generate and key frame be marked Headline label information, be finally based on host's label information of all key frames and headline label information, it is raw It, can be based on the host's information and headline Automatic generation of information in news-video into the poster figure of targeted news video The poster figure of news-video content can be characterized, the poster figure generation form of news-video in the prior art effectively solved is single, The problem of poor user experience.
As shown in Fig. 4, a structural diagram of embodiment 2 of the video news splitting device disclosed by the present invention, this embodiment builds on embodiment 1. After headline detection has been performed on the pending news video and the start and end time points of detected headlines have been recorded, it further includes:
Deduplication module 401 performs a deduplication operation on the detected headlines of the pending news video and records the start time points and end time points of the headlines remaining after deduplication.
Observation of news data shows that the same headline is often displayed several times within a single news story. If the mere appearance of a headline were used to split the news, over-segmentation would result; therefore, the detected headlines of the pending news video are further deduplicated, and the start and end time points of the headlines remaining after deduplication are recorded.
During deduplication, suppose the n-th detected title starts and ends at frames t1 and t2 and occupies the region CRn(x, y, w, h) in the video frame; denote this title Cn[t1, t2]. The two titles before it are Cn-1[t3, t4] and Cn-2[t5, t6], with frame regions CRn-1 and CRn-2 respectively.
Step 1: compare the current title Cn with the preceding title Cn-1 by the ratio of their repeated region in the frame, i.e., compute the overlap ratio R1 of CRn and CRn-1. If R1 >= Thr, the two titles need a deduplication comparison; go to step 2. Otherwise compute the region overlap R2 of Cn and Cn-2; if R2 >= Thr, go to step 2; otherwise Cn is considered not to be a repeated title.
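The patent does not pin down how the overlap ratio is normalized; one plausible reading, used in the sketch below, is intersection area over the smaller region's area:

```python
def overlap_ratio(a, b):
    """Overlap of two title regions a, b = (x, y, w, h), taken here as
    intersection area over the smaller region's area (an assumption;
    the text leaves the exact normalization unspecified)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0, min(ay + ah, by + bh) - max(ay, by))
    return (iw * ih) / min(aw * ah, bw * bh)
```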
Step 2: for each of the two input titles, choose one frame that represents its content. For Cn, take the video frame at time (t1 + t2)/2 and, within its region CRn(x, y, w, h), set the comparison region rect as:
Rect.x=x+w*R1;
Rect.y=y+h*R2;
Rect.w=w*R3;
Rect.h=h*R4;
where R1, R2, R3 and R4 are preset parameters (distinct from the overlap ratios R1 and R2 of step 1).
The image inside rect of the selected frame is taken as IMG1. For Cn-1 (or Cn-2), take the video frame at time (t3 + t4)/2 (or (t5 + t6)/2), crop the same region rect, and denote it IMG2.
Step 3: convert both input images from RGB color space to grayscale or to any luminance-chrominance color space (e.g., YUV, HSV, HSL, LAB). For grayscale the conversion formula is:
Gray = R*0.299 + G*0.587 + B*0.114.
For a luminance-chrominance space, taking HSL as the example, the lightness L is:
L = (max(R, G, B) + min(R, G, B))/2.
Step 4: compute the segmentation threshold. For the grayscale or luminance image of IMG1, the threshold is computed with the OTSU method, described as follows:
(1) Assume the gray-level image I is quantized into N gray levels (N <= 256); an N-bin gray-level histogram H is extracted for these N levels.
(2) For each t (0 <= t < N), compute the between-class variance
sigma_b^2(t) = w0(t) * w1(t) * (mu0(t) - mu1(t))^2,
where w0(t) and w1(t) are the fractions of pixels below and at-or-above level t in H, and mu0(t), mu1(t) are the mean gray levels of the two classes. Bin indices map back to gray values by x(i) = i*256/N.
(3) The t that maximizes sigma_b^2(t) gives the segmentation threshold Th = x(t).
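A self-contained sketch of this OTSU computation (numpy only; bin count and the bin-to-gray mapping follow the description above):

```python
import numpy as np

def otsu_threshold(gray, n_levels=256):
    """Otsu's method: pick the threshold maximizing between-class variance."""
    hist, _ = np.histogram(gray, bins=n_levels, range=(0, 256))
    p = hist.astype(float) / hist.sum()
    best_t, best_var = 0, 0.0
    for t in range(1, n_levels):
        w0, w1 = p[:t].sum(), p[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * p[:t]).sum() / w0
        mu1 = (np.arange(t, n_levels) * p[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t * 256 // n_levels  # map bin index back to a gray value
```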
Step 5: binarize IMG1 and IMG2. For each pixel (x, y) of IMG1 or IMG2, the corresponding pixel of its binary image B is: if I(x, y) < Th, B(x, y) = 0; if I(x, y) >= Th, B(x, y) = 255.
Step 6: take the binary images B1 and B2 of IMG1 and IMG2, compute their point-by-point difference, and average it:
Diff = (1/(W*H)) * sum over (x, y) of |B1(x, y) - B2(x, y)|,
where W and H are the width and height of the rect region.
Step 7: compare Diff with a preset threshold. If Diff is below the threshold, the two titles are considered identical, and the shots associated with the time range [t1, t2] of Cn are marked as carrying the same subtitle; otherwise they are marked as different subtitles.
Generation module 402 generates the headline label information that marks each shot, based on the start time points and end time points of the headlines remaining after deduplication and the start time point and end time point of the shot in the news video;
that is, the recorded start and end time points of the deduplicated headlines are compared with the start and end time points of each shot in the pending news video, and a headline label marking whether the shot contains a headline is generated.
Splitting module 403 splits the news video into N news stories according to the preset splitting rule, based on the start time point and end time point of each shot in the news video, the anchor category information of the shot, and the headline label information of the shot, where N is greater than or equal to 1.
Finally, using the start and end time points of every shot in the news video, the anchor category information of each shot, and the headline label of each shot, the pending news video is split into N news stories, where N is greater than or equal to 1.
To summarize, on the basis of embodiment 1, the above embodiment additionally deduplicates the detected headlines of the pending news video, which effectively prevents over-segmentation of the news.
Specifically, in the above embodiments, the analysis module is used to:
input each key frame of the shot into a classifier formed by training in advance to generate the anchor classification category corresponding to each key frame, count the anchor classification categories over all key frames of the shot, and determine the category with the largest count as the anchor category information of the shot.
That is, the key frames previously chosen for each shot are fed into the pre-trained classifier for anchor category classification, the per-frame results are treated as votes, and the category with the most votes is taken as the category of the shot.
The classifier is trained as follows: a certain number of video frames are extracted from news programs of different channels, and these frames are manually labeled into four categories: dual-anchor seated, single-anchor seated, single-anchor standing, and non-anchor (four classes are used here for illustration, but the method is not limited to these four). A corresponding classifier is trained with a deep learning method; the training module follows open-source deep learning network training methods and model structures to train the network model.
Training process: model training uses the open-source caffe deep learning framework (other open-source deep learning frameworks can also be used). The concrete procedure is standard backpropagation: the signal is passed forward layer by layer; if the output differs from the expected value, the error is propagated backward, and the weights and thresholds are updated by gradient descent. This is repeated until the error function reaches its minimum. The algorithm is a general-purpose method rather than an original one, so its details are not repeated here. This training yields a network model usable for classification.
Classification process: each key frame obtained for each shot after shot segmentation is fed into the trained model. Following the model structure and trained parameters, convolution, pooling and ReLU operations are applied to the image layer by layer until confidence probabilities P1, P2, P3, P4 are output for the four categories dual-anchor seated, single-anchor seated, single-anchor standing, and non-anchor. The category with the largest probability is taken as the category of the frame; for example, if P1 is the maximum of (P1, P2, P3, P4), the frame belongs to the dual-anchor seated class. For a shot, the number of key frames falling in each category is counted, and the category with the most key frames is taken as the category of the shot.
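A sketch of the per-shot voting, assuming an already-trained classifier; the `model.predict` interface is illustrative, not the caffe API:

```python
import numpy as np

CLASSES = ["dual_anchor_seated", "single_anchor_seated",
           "single_anchor_standing", "non_anchor"]

def classify_shot(keyframes, model):
    """Vote over per-keyframe predictions (P1..P4 per frame) and
    return the majority category as the shot category."""
    votes = np.zeros(len(CLASSES), dtype=int)
    for frame in keyframes:
        probs = model.predict(frame)       # hypothetical classifier call
        votes[int(np.argmax(probs))] += 1  # per-frame argmax is one vote
    return CLASSES[int(np.argmax(votes))]
```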
Specifically, in the above embodiments, the second logging module is used to:
determine a preset area of the video frames of the pending news video as a candidate region; track the image in the candidate region to generate a tracking result; judge from the tracking result whether the candidate region is a headline region; and if so, determine the appearance time point of the headline region as the start time point of the headline and its disappearance time point as the end time point of the headline.
That is, the idea of the title detection algorithm is to run temporally stable headline detection on every frame of the input news video and obtain the start and end frame numbers of each headline in the whole program. The time position of each shot obtained in module A is then compared with the time range in which a headline appears: if the shot falls within the title's appearance range, the shot is considered titled; otherwise it is considered untitled.
Judging in this way, rather than looking for a title in a single image, serves to distinguish possible rolling captions: rolling captions in news are usually displayed in a style very similar to headlines, so judging a single image in isolation would produce errors and degrade the quality of the generated poster.
The specific algorithm is:
1. Select potential candidate regions:
(1) Take the image in the bottom area of the key frame as the image to be detected (the bottom area is where most news headlines appear; restricting detection to this region reduces computation and improves detection accuracy). The bottom area is chosen as follows:
Assume the key frame has width W and height H. Then the bottom area Rect(rect.x, rect.y, rect.w, rect.h) (the starting coordinates of the rectangular region in the key frame together with its width and height) is positioned in the key frame image as:
Rect.x=0;
Rect.y=H*cut_ratio;
Rect.w=W;
Rect.h=H* (1-cut_ratio);
where cut_ratio is a preset coefficient. A sketch of this region computation follows.
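A direct transcription of the formulas above; the default cut_ratio of 0.7 is an assumed example value, not specified by the text:

```python
def bottom_region(W, H, cut_ratio=0.7):
    """Candidate strip at the bottom of a W x H keyframe, returned as
    (x, y, w, h). Most headlines appear in this strip."""
    return (0, int(H * cut_ratio), W, int(H * (1 - cut_ratio)))
```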
(2) Convert the selected image to be detected from RGB color space to grayscale or to any luminance-chrominance color space (e.g., YUV, HSV, HSL, LAB). For grayscale the conversion formula is:
Gray = R*0.299 + G*0.587 + B*0.114.
For a luminance-chrominance space, taking HSL as the example, the lightness L is:
L = (max(R, G, B) + min(R, G, B))/2.
(3) Extract the edge features of the grayscale or luminance image. There are many edge extraction methods, such as the Sobel and Canny operators; this embodiment uses the Sobel operator as the example:
Convolve the grayscale/luminance image with the horizontal edge gradient operator and the vertical edge gradient operator to obtain the horizontal edge map Eh and the vertical edge map Ev, then compute the edge strength map Eall: for any point on the edge map, Eall(x, y) = sqrt(Ev(x, y)^2 + Eh(x, y)^2).
The horizontal and vertical gradient operators are illustrated with the Sobel kernels; other operators are equally applicable.
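The kernel figures were not reproduced in this text, so the sketch below assumes the standard 3x3 Sobel kernels:

```python
import numpy as np
from scipy.ndimage import convolve

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])   # x-gradient: responds to vertical edges
SOBEL_Y = SOBEL_X.T                # y-gradient: responds to horizontal edges

def edge_strength(gray):
    """Edge strength map Eall = sqrt(Ev^2 + Eh^2) from step (3)."""
    eh = convolve(gray.astype(float), SOBEL_Y)  # horizontal edge map Eh
    ev = convolve(gray.astype(float), SOBEL_X)  # vertical edge map Ev
    return np.sqrt(eh ** 2 + ev ** 2)
```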
(4) Compare Eall with a preset threshold The1 and binarize the edge map: if Eall(x, y) > The1, E(x, y) = 1; else E(x, y) = 0.
(5) Apply operation (3) to each RGB channel of the image to be detected, obtaining the per-channel edge strength maps Er, Eg, Eb.
(6) Compare Er, Eg and Eb with a preset threshold The2 and binarize each edge map, e.g., for one channel: if Er(x, y) > The2, Er(x, y) = 1; else Er(x, y) = 0. The2 and The1 may be equal or different: if the bottom of the headline box fades in gradually, a higher threshold will miss its edge, and edges detected with a lower threshold must be used for enhancement; hence generally The2 < The1.
(7) Enhance the edge image E: E(x, y) = E(x, y) | Er(x, y) | Eg(x, y) | Eb(x, y), giving the final edge map. Steps (5)-(7) are enhancement steps and may be used or skipped as needed; one channel or all three channels may be enhanced. Their purpose is to keep detection from failing when the caption area contains a gradient.
(8) Project the final edge map horizontally: for each row i, count the number Numedge of pixels meeting the following conditions; if Numedge > Thnum, the histogram entry H[i] = 1, otherwise H[i] = 0. The conditions are: a pixel's edge value is counted as 1 if the pixel or at least one of its neighbors has value 1, and the run of consecutive such pixels to its left and right must be longer than the threshold Thlen (this guarantees that continuous straight lines are present). A sketch of this row test follows.
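A simplified reading of condition (8), counting only pixels that belong to horizontal runs longer than th_len (the neighbor-dilation part is omitted for brevity):

```python
import numpy as np

def row_histogram(E, th_num, th_len):
    """H[i] = 1 if row i of binary edge map E contains enough edge
    pixels arranged in runs longer than th_len."""
    H = np.zeros(E.shape[0], dtype=int)
    for i, row in enumerate(E):
        count, run = 0, 0
        for v in row:
            if v:
                run += 1
            else:
                if run > th_len:
                    count += run  # only long runs contribute
                run = 0
        if run > th_len:
            count += run
        if count > th_num:
            H[i] = 1
    return H
```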
(9) Traverse the histogram H[i] and measure the spacing between rows with H[i] == 1. If the spacing exceeds the threshold Throw, take the edge image region between the two rows as a first-stage candidate region; if there is none, continue with the next key frame.
(10) For each first-stage candidate region, compute the vertical edge projection histogram V: for any column i, V[i] = 1 if the number of edge pixels equal to 1 in the column exceeds Thv, otherwise V[i] = 0; V[0] = 1 and V[W-1] = 1 are forced. Find in V the pair V[i] == 1 && V[j] == 1 with V[k] == 0 for all k in (i, j) and maximal width j - i; columns i and j are taken as the left and right boundaries of the caption area. The original image inside this region is selected as the second-stage candidate region. Column edge pixels are computed the same way as row edge pixels.
(11) Refine the left and right boundaries of the second-stage candidate region: scan its original image with a sliding window of a certain size (e.g., 32*32), compute the color histogram inside each window, and count the number numcolor of non-zero histogram bins. Find the positions of monochrome areas or color-complex background areas, i.e., windows with numcolor < Thcolor1 || numcolor > Thcolor2, and take the centers of such windows as the new vertical boundaries.
(12) Judge the rectangular area CandidateRect determined above with constraints, including but not limited to: the starting point of CandidateRect must lie within a certain image range, the height of CandidateRect must be within a certain range, and so on. If the conditions are met, the area is considered a headline candidate region. If the candidate region is not already being tracked, it is passed to tracking module B; otherwise detection continues in module A.
2. Track the found candidate region:
(1) Determine whether this region is being tracked for the first time. From the processing of the previous moment, this embodiment knows whether there are regions in tracking, tracked to completion, or failed. If a region is in tracking, compare its position with the present candidate region; if the two regions have a high positional overlap, the region is known to be in tracking; otherwise the region is judged to be tracked for the first time (where "first time" can mean tracked for the very first time, or tracked anew after a previous tracking ended). If it is a first-time track, proceed to (2); otherwise exit the method steps of this embodiment.
(2) For a first-time tracked region, set a tracking range inside the key frame (the input key-frame candidate region may contain additional background, i.e., area not belonging to the headline, so a tracking area is set to improve tracking accuracy). The setting method: if the position of the headline candidate region in the key frame is CandidateRect(x, y, w, h) (starting point x, y with width w and height h), the tracking area track(x, y, w, h) is:
Track.x=CandidateRect.x+CandidateRect.w*Xratio1;
Track.y=CandidateRect.y+CandidateRect.h*Yratio1;
Track.w=CandidateRect.w*Xratio2;
Track.h=CandidateRect.h*Yratio2;
where Xratio1, Xratio2, Yratio1 and Yratio2 are preset parameters.
(3) Take the image inside the tracking area of the key frame and convert it from RGB color space to grayscale or to any luminance-chrominance color space (e.g., YUV, HSV, HSL, LAB). For grayscale the conversion formula is:
Gray = R*0.299 + G*0.587 + B*0.114.
For a luminance-chrominance space, taking HSL as the example, the lightness L is:
L = (max(R, G, B) + min(R, G, B))/2.
(4) Compute the segmentation threshold for the grayscale or luminance image with the OTSU method described above: quantize the gray-level image I into N gray levels (N <= 256) and extract its N-bin gray-level histogram H; for each t (0 <= t < N) compute the between-class variance sigma_b^2(t) = w0(t) * w1(t) * (mu0(t) - mu1(t))^2, with bin-to-gray mapping x(i) = i*256/N; the x(t) of the maximizing t is the segmentation threshold Thtrack.
(5) Binarize the image: for pixel (x, y) in image I, the pixel of its reference binary image Bref is: if I(x, y) < Thtrack, Bref(x, y) = 0; if I(x, y) >= Thtrack, Bref(x, y) = 255.
(6) Compute the color histogram Href of the image in the tracking area.
(7) Convert the input key frame from RGB color space to grayscale or to any luminance-chrominance color space (e.g., YUV, HSV, HSL, LAB), with the same conversion formulas as above:
Gray = R*0.299 + G*0.587 + B*0.114;
L = (max(R, G, B) + min(R, G, B))/2.
(8) Take the grayscale image inside the tracking area of the key frame and binarize it: for pixel (x, y) in image I, the pixel of its binary image Bcur is: if I(x, y) < Thtrack, Bcur(x, y) = 0; if I(x, y) >= Thtrack, Bcur(x, y) = 255. Thtrack is the value obtained in step (4) at the first tracking.
(9) Compute the point-by-point difference between the binary image Bcur of the current frame and the reference binary image Bref, and average it:
Diffbinary = (1/(W*H)) * sum over (x, y) of |Bcur(x, y) - Bref(x, y)|,
where W and H are the width and height of the tracking-area image.
(10) Compute the color histogram Hcur of the current image in the tracking area and its distance Diffcolor to Href.
(11) Compare the obtained Diffbinary and Diffcolor with preset thresholds. If Diffbinary < Thbinary && Diffcolor < Thcolor, the region stays in the tracking state and the tracking counter tracking_num is incremented; otherwise lost_num is incremented. Note that of the two tracking criteria, color histogram and binarization, either one may be used alone or both may be used in combination.
(12) If lost_num > Thlost, return the tracking-finished state together with the frame number of the current key frame (this frame records the time point at which the headline disappears); otherwise the region remains in tracking. lost_num exists to tolerate interference in individual video signals that distorts images and makes matching fail: thanks to lost_num, the algorithm allows a discrete number of key frames to fail tracking. A sketch of this update follows.
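A sketch of steps (11)-(12) as one per-keyframe update; resetting lost_num on a successful match is an assumption, since the text does not say how misses interact with later hits:

```python
def update_track(diff_bin, diff_color, state, th_binary, th_color, th_lost):
    """One tracking step per keyframe. `state` holds the counters
    tracking_num and lost_num; returns the tracking status."""
    if diff_bin < th_binary and diff_color < th_color:
        state["tracking_num"] += 1
        state["lost_num"] = 0          # assumption: a hit clears recent misses
        return "TRACKING"
    state["lost_num"] += 1
    if state["lost_num"] > th_lost:
        return "DONE"                  # title disappeared at this frame
    return "TRACKING"
```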
3. Judge whether the tracked area is a title area:
when tracking of a candidate region has finished, compare tracking_num with the preset threshold Thtracking_num; if tracking_num >= Thtracking_num, the region is judged to be a headline region, otherwise a non-headline region.
Specifically, in the above embodiments, the generation module is used to:
compare the start time point and end time point of the headline with the start time point and end time point of the shot in the news video. When the headline's start and end time points fall within the period formed by the shot's start and end time points in the news video, first headline label information is generated; when they do not, second headline label information is generated.
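This containment test is simple enough to state in a few lines; the label names are illustrative:

```python
def title_label(shot_t, title_t):
    """First label if the title interval lies within the shot interval,
    second label otherwise. Intervals are (start, end) time points."""
    (s0, s1), (t0, t1) = shot_t, title_t
    return "HAS_TITLE" if s0 <= t0 and t1 <= s1 else "NO_TITLE"
```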
Specifically, in the above embodiments, the splitting module is used to:
split the news video according to the information sequence V = {Si}, where i = 0, 1, ..., N and Si = {Ti, Ai, Ci, Csi}; Ti denotes the start and end time points of the shot in the video, Ai the anchor category information contained in the shot, Ci the headline label information of the shot, and Csi whether the title is new.
That is, step 1: for each shot, if the news starting point is empty, set this shot as the news starting point and turn to process the next shot; if a news starting point is already set, turn to step 2.
Step 2: if Ai of Si belongs to the dual-anchor category, take the end point of Ti-1 of shot Si-1 as the end of the news story being split; at the same time, treat Si as an independent news story whose start and end are those of Ti. Return two split results, set the news starting point to empty, and turn to process the next shot.
Step 3: if Ai of Si belongs to the single-anchor seated or standing category, take the end point of Ti-1 of shot Si-1 as the end of the news story, take Si as the start of a new news story, return one split result, and turn to process the next shot.
Step 4: if Ai of Si belongs to the non-anchor category, Ci indicates a subtitle is present, and Csi indicates a new subtitle, take the end point of Ti-1 of shot Si-1 as the end of the news story, take Si as the start of a new news story, return one split result, and turn to process the next shot.
Step 5: if none of the above conditions is met, add shot Si to the current news story and turn to process the next shot. A sketch of this procedure follows.
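A compact sketch of split steps 1-5; the shot field names (t, anchor, has_title, is_new_title) are illustrative stand-ins for Ti, Ai, Ci and Csi:

```python
def split_news(shots):
    """Split a shot sequence into (start, end) story intervals,
    following steps 1-5 above."""
    stories, start = [], None
    for i, s in enumerate(shots):
        if start is None:                 # step 1: open a new story
            start = s["t"][0]
            continue
        prev_end = shots[i - 1]["t"][1]
        if s["anchor"] == "dual":         # step 2: close story; shot is its own story
            stories.append((start, prev_end))
            stories.append(s["t"])
            start = None
        elif s["anchor"] in ("single_seated", "single_standing"):
            stories.append((start, prev_end))   # step 3
            start = s["t"][0]
        elif s["has_title"] and s["is_new_title"]:
            stories.append((start, prev_end))   # step 4 (non-anchor, new title)
            start = s["t"][0]
        # step 5: otherwise the shot simply joins the current story
    if start is not None:                 # flush the last open story
        stories.append((start, shots[-1]["t"][1]))
    return stories
```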
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the others, and identical or similar parts of the embodiments may be cross-referenced.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (12)

1. A video news splitting method, characterized in that the method comprises:
decomposing a pending news video into at least one shot by clustering the video frames in the pending news video;
recording the start time point and end time point of each shot in the news video;
extracting m key frames of the shot at a preset time interval, according to the shot length computed from the start time point and end time point of the shot;
analyzing the m key frames of the shot to obtain anchor category information of the shot;
performing headline detection on the pending news video and, when the pending news video contains a headline, recording the start time point and end time point of the headline;
generating headline label information that marks the shot, based on the start time point and end time point of the headline and the start time point and end time point of the shot in the news video;
splitting the news video into N news stories according to a preset splitting rule, based on the start time point and end time point of each shot in the news video, the anchor category information of the shot, and the headline label information of the shot, wherein N is greater than or equal to 1.
2. The method according to claim 1, characterized in that, after performing headline detection on the pending news video and, when the pending news video contains a headline, recording the start time point and end time point of the headline, the method further comprises:
performing a deduplication operation on the detected headlines of the pending news video, and recording the start time points and end time points of the headlines remaining after deduplication;
correspondingly, generating the headline label information that marks the shot based on the start time point and end time point of the headline and the start time point and end time point of the shot in the news video comprises:
generating the headline label information that marks the shot based on the start time points and end time points of the headlines remaining after deduplication and the start time point and end time point of the shot in the news video.
3. The method according to claim 1 or 2, characterized in that analyzing the m key frames of the shot to obtain the anchor category information of the shot comprises:
inputting each key frame of the shot into a classifier formed by training in advance, and generating the anchor classification category corresponding to each key frame;
counting the anchor classification categories of all key frames of the shot, and determining the anchor classification category with the largest count as the anchor category information of the shot.
4. The method according to claim 1 or 2, characterized in that performing headline detection on the pending news video and, when the pending news video contains a headline, recording the start time point and end time point of the headline comprises:
determining a preset area of the video frames of the pending news video as a candidate region;
tracking the image in the candidate region to generate a tracking result;
judging whether the candidate region is a headline region based on the tracking result, and if so, determining the appearance time point of the headline region as the start time point of the headline and the disappearance time point of the headline region as the end time point of the headline.
5. The method according to claim 1 or 2, characterized in that generating the headline label information that marks the shot based on the start time point and end time point of the headline and the start time point and end time point of the shot in the news video comprises:
comparing the start time point and end time point of the headline with the start time point and end time point of the shot in the news video;
when the start time point and end time point of the headline fall within the period formed by the start time point and end time point of the shot in the news video, generating first headline label information;
when the start time point and end time point of the headline do not fall within the period formed by the start time point and end time point of the shot in the news video, generating second headline label information.
6. The method according to claim 1 or 2, characterized in that splitting the news video into N news stories according to the preset splitting rule, based on the start time point and end time point of each shot in the news video, the anchor category information of the shot, and the headline label information of the shot, comprises:
splitting the news video according to the information sequence V = {Si}, wherein i = 0, 1, ..., N and Si = {Ti, Ai, Ci, Csi}; Ti represents the start time point and end time point of the shot in the video, Ai represents the anchor category information contained in the shot, Ci represents the headline label information of the shot, and Csi represents whether the title is new.
7. A video news splitting device, characterized by comprising:
a decomposition module, configured to decompose a pending news video into at least one shot by clustering the video frames in the pending news video;
a first logging module, configured to record the start time point and end time point of each shot in the news video;
an extraction module, configured to extract m key frames of the shot at a preset time interval, according to the shot length computed from the start time point and end time point of the shot;
an analysis module, configured to analyze the m key frames of the shot to obtain anchor category information of the shot;
a second logging module, configured to perform headline detection on the pending news video and, when the pending news video contains a headline, record the start time point and end time point of the headline;
a generation module, configured to generate headline label information that marks the shot, based on the start time point and end time point of the headline and the start time point and end time point of the shot in the news video;
a splitting module, configured to split the news video into N news stories according to a preset splitting rule, based on the start time point and end time point of each shot in the news video, the anchor category information of the shot, and the headline label information of the shot, wherein N is greater than or equal to 1.
8. The device according to claim 7, characterized by further comprising:
a deduplication module, configured to perform a deduplication operation on the detected headlines of the pending news video, and record the start time points and end time points of the headlines remaining after deduplication;
correspondingly, the generation module is configured to generate the headline label information that marks the shot based on the start time points and end time points of the headlines remaining after deduplication and the start time point and end time point of the shot in the news video.
9. The device according to claim 7 or 8, characterized in that the analysis module is specifically configured to:
input each key frame of the shot into a classifier formed by training in advance, and generate the anchor classification category corresponding to each key frame;
count the anchor classification categories of all key frames of the shot, and determine the anchor classification category with the largest count as the anchor category information of the shot.
10. The device according to claim 7 or 8, characterized in that the second logging module is specifically configured to:
determine a preset area of the video frames of the pending news video as a candidate region;
track the image in the candidate region to generate a tracking result;
judge whether the candidate region is a headline region based on the tracking result, and if so, determine the appearance time point of the headline region as the start time point of the headline and the disappearance time point of the headline region as the end time point of the headline.
11. The device according to claim 7 or 8, characterized in that the generation module is specifically configured to:
compare the start time point and end time point of the headline with the start time point and end time point of the shot in the news video;
when the start time point and end time point of the headline fall within the period formed by the start time point and end time point of the shot in the news video, generate first headline label information;
when the start time point and end time point of the headline do not fall within the period formed by the start time point and end time point of the shot in the news video, generate second headline label information.
12. The device according to claim 7 or 8, characterized in that the splitting module is specifically configured to:
split the news video according to the information sequence V = {Si}, wherein i = 0, 1, ..., N and Si = {Ti, Ai, Ci, Csi}; Ti represents the start time point and end time point of the shot in the video, Ai represents the anchor category information contained in the shot, Ci represents the headline label information of the shot, and Csi represents whether the title is new.