CN102098449A - Method for realizing automatic inside segmentation of TV programs by utilizing mark detection - Google Patents
Method for realizing automatic inside segmentation of TV programs by utilizing mark detection
- Publication number
- CN102098449A
- Authority
- CN
- China
- Prior art keywords
- subgraph
- camera lens
- sign
- mark
- video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Image Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the technical field of image processing and pattern recognition and provides a method for automatically segmenting a TV program into its internal sections by means of mark detection. There is an urgent need for automatic internal segmentation of TV programs, and the temporal discontinuity of the mark (the station logo or program logo) within a program gives the program a strong structure. The method comprises the following steps: (1) segmenting the program video into shots and extracting, from each keyframe of each shot, the sub-image of the area in which the mark is located; (2) extracting a feature vector from each sub-image and classifying it with a support vector machine (SVM) classifier trained for mark detection; (3) tallying the classification results and assigning a mark attribute to each shot; and (4) segmenting the video at the boundaries between adjacent shots whose mark attributes differ. Because mark detection processes only keyframes, the method is efficient. It applies to any TV program in which the mark appears discontinuously and places no requirement on the type of program content, which broadens its applicability.
Description
Technical field
The invention belongs to the field of image processing and pattern recognition, and specifically relates to a method for automatically segmenting a TV program into its internal sections by means of mark detection.
Background technology
At present, radio and television broadcasters produce massive amounts of video every day and provide electronic program guides. With the spread of Internet TV and digital TV, many TV programs try to offer a better viewing experience by dividing programs into internal sections and providing viewing guidance within the program. Internal segmentation of a program is also a prerequisite for further content analysis and retrieval. Given the volume of video, manual annotation and segmentation cannot meet timeliness requirements, so automatic machine segmentation is urgently needed. Video structure analysis refers to processing a video stream by shot segmentation, keyframe extraction, scene segmentation and the like, so as to obtain structured information about the video. Scene segmentation mainly relies on methods such as scene clustering, repeated-video detection and shot similarity comparison, which are often rather complex. Increasingly, TV programs pay attention to intellectual property when displaying their own station logo or program logo: video sections inside the program that the program does not own, such as advertisements or quoted film clips, do not carry these marks, whereas sections that do carry the mark are usually the opening or closing titles, interview parts, or other footage recorded by the program itself. The discontinuity of the mark over time gives a TV program a strong structure and provides a basis for its internal segmentation.
Summary of the invention
For TV programs whose station logo or program logo (hereinafter referred to as the mark) is discontinuous in time, the invention provides a method for automatically segmenting such programs into internal sections quickly and accurately.
The main steps of the automatic internal segmentation method of the invention are as follows:
Step 1: segment the TV program video into shots using an existing shot segmentation technique, obtaining a temporally continuous shot sequence;
Step 2: take 5 keyframes from each shot at equal time intervals, and extract from every keyframe the sub-image of the rectangular area where the mark is located;
Step 3: extract the image feature vectors of all sub-images in the training set, with sub-images that contain the mark as positive samples and sub-images that do not as negative samples, and train an SVM classifier for mark detection;
Step 4: for the program video to be segmented, obtain all sub-images through Steps 1 and 2, extract the same image feature vectors as in Step 3, and classify them with the SVM classifier obtained in Step 3 to get a classification result for each sub-image;
Step 5: label the mark attribute of each shot: if at least 3 sub-images in a shot are judged to contain the mark, label the shot a mark shot; otherwise label it a non-mark shot;
Step 6: segment the program video internally, using the boundaries between adjacent shots with different mark attributes as split points to divide the video into sections.
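Steps 5 and 6 amount to a vote over each shot's keyframes followed by a scan for label changes. The following is a minimal Python sketch of that logic, assuming the per-keyframe classification results are already available as booleans; the function names `label_shots`, `split_points` and `segments` are hypothetical and do not come from the patent.

```python
from typing import List, Tuple


def label_shots(keyframe_votes: List[List[bool]], min_hits: int = 3) -> List[bool]:
    """Step 5: a shot is a mark shot if at least `min_hits` of its
    keyframe sub-images were classified as containing the mark."""
    return [sum(votes) >= min_hits for votes in keyframe_votes]


def split_points(shot_labels: List[bool]) -> List[int]:
    """Step 6: index i is returned when the boundary between shot i-1
    and shot i separates shots with different mark attributes."""
    return [i for i in range(1, len(shot_labels))
            if shot_labels[i] != shot_labels[i - 1]]


def segments(shot_labels: List[bool]) -> List[Tuple[int, int, bool]]:
    """Group consecutive shots with the same label into
    (first_shot, last_shot, is_mark_section) tuples."""
    if not shot_labels:
        return []
    cuts = [0] + split_points(shot_labels) + [len(shot_labels)]
    return [(a, b - 1, shot_labels[a]) for a, b in zip(cuts, cuts[1:])]
```

Given one list of 5 booleans per shot, `segments(label_shots(votes))` yields the program sections as ranges of shot indices; mapping shot indices back to timestamps is left to the shot segmentation of Step 1.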
Description of drawings
Fig. 1 is an example diagram of the TV program structure addressed by the invention.
Fig. 2 is the basic flowchart of the method of the invention.
Embodiment
As shown in the flowchart of Fig. 2, the method comprises two stages: offline training of the classifier and online processing of the video to be segmented. The two stages share two steps: shot segmentation, and extraction of 5 keyframes per shot together with their mark-region sub-images. The method is implemented as follows.
(1) The shot segmentation step uses an existing shot segmentation algorithm, for example one based on histograms, one based on motion, or one designed for compressed video, to cut the TV program video into a temporally continuous shot sequence.
(2) Each shot is divided into 6 segments of equal duration, and the 5 frames at the boundaries between adjacent segments are taken as keyframes. For the given TV program, the rectangular area containing the known mark is determined such that the rectangle just encloses the mark completely; the rectangle is given by (x, y, w, h), where x and y are the horizontal and vertical coordinates of its top-left corner and w and h are its width and height. This rectangle is extracted from every keyframe and is called a sub-image.
(3) Three kinds of image feature vectors are extracted from every sub-image: an HSV color histogram, an edge gradient histogram, and a histogram of SIFT feature points based on the bag-of-words model. The three features are then concatenated to form the final image feature vector. The feature extraction methods are as follows:
1. Color histogram extraction
An HSV color histogram is computed for the sub-image, with the H channel divided into 8 intervals, the S channel into 3 intervals and the V channel into 3 intervals; the histogram is normalized to form a 72-dimensional feature vector (8 × 3 × 3 = 72);
2. Edge gradient histogram extraction
An edge gradient histogram is computed for the sub-image, with one interval per 5 degrees; the gradients falling within each interval are accumulated and the histogram is normalized to form a 72-dimensional feature vector (360 / 5 = 72);
3. Extraction of the SIFT feature point histogram based on the bag-of-words model
SIFT feature vectors are extracted from all sub-images. The SIFT feature vectors of the training set are clustered with the K-means algorithm to obtain 64 cluster centers, which serve as the codebook of the bag-of-words model. All SIFT feature vectors of each sub-image are projected onto the codebook to form a 64-dimensional histogram, which is normalized to give the feature vector;
4. The three feature vectors above are concatenated to form the final 208-dimensional feature vector (72 + 72 + 64 = 208).
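The three histograms and their concatenation can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the patent's implementation: the function names are hypothetical, the bin ordering inside the 72-bin color histogram is arbitrary, gradients are taken as central differences weighted by magnitude, "projection onto the codebook" is read as nearest-codeword assignment, and the SIFT descriptors and the 64-entry codebook are assumed to have been computed elsewhere.

```python
import colorsys
import math


def normalize(hist):
    """Scale a histogram so that its bins sum to 1 (all-zero stays all-zero)."""
    total = sum(hist)
    return [v / total for v in hist] if total > 0 else list(hist)


def hsv_histogram(pixels):
    """pixels: iterable of (r, g, b) in 0..255. Returns 8 x 3 x 3 = 72 bins."""
    hist = [0.0] * 72
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hi, si, vi = min(int(h * 8), 7), min(int(s * 3), 2), min(int(v * 3), 2)
        hist[(hi * 3 + si) * 3 + vi] += 1.0  # assumed bin layout
    return normalize(hist)


def gradient_histogram(gray):
    """gray: 2-D list of intensities. Returns 72 bins of 5 degrees each,
    accumulating gradient magnitude (an assumption) per direction bin."""
    hist = [0.0] * 72
    for y in range(1, len(gray) - 1):
        for x in range(1, len(gray[0]) - 1):
            gx = gray[y][x + 1] - gray[y][x - 1]
            gy = gray[y + 1][x] - gray[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 360.0
            hist[min(int(angle // 5), 71)] += mag
    return normalize(hist)


def bow_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest codeword (squared Euclidean
    distance) and count assignments: one bin per codeword."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)),
                      key=lambda k: sum((a - b) ** 2 for a, b in zip(d, codebook[k])))
        hist[nearest] += 1.0
    return normalize(hist)


def feature_vector(pixels_rgb, gray, descriptors, codebook):
    """Concatenate the three histograms: 72 + 72 + 64 = 208 dimensions
    when the codebook has 64 entries."""
    return (hsv_histogram(pixels_rgb) + gradient_histogram(gray)
            + bow_histogram(descriptors, codebook))
```

A production version would more likely use array libraries for speed; the nested lists here only keep the sketch dependency-free.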
(4) The SVM classifier for mark detection is trained offline: the image feature vectors of the positive and negative samples in the training set are fed into an SVM tool for training. Here the positive and negative sets each contain more than 1000 samples, and the SVM uses a kernel function based on the chi-square distance.
(5) For the sub-images of the video to be segmented, the same image feature vectors as in step (3) are extracted, 208 dimensions in total. The codebook needed to form the SIFT histogram feature vector is the one used in step (3), obtained from the training set by K-means clustering.
(6) The SVM classifier obtained in step (4) classifies the feature vectors obtained in step (5); the classification result indicates whether each sub-image contains the mark.
(7) Using the results of step (6), the number of keyframes containing the mark is counted for each shot; if it is greater than or equal to 3, the shot is labeled a mark shot, otherwise a non-mark shot.
(8) The shot labels of the video to be segmented are checked shot by shot; if two adjacent shots have different labels, the boundary between them is taken as a split point. Once all adjacent shot pairs have been checked in order, the internal segmentation of the program video is complete.
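Two details of the embodiment lend themselves to a short illustration: the keyframe sampling and sub-image cropping of step (2), and the chi-square kernel of step (4). The Python sketch below is hedged accordingly. The function names are hypothetical, frames are assumed to be addressed by integer index, and the text only says the kernel is "based on the chi-square distance", so the exponential form and the `gamma` parameter used here are assumptions.

```python
import math


def keyframe_indices(first_frame, last_frame, n_keyframes=5):
    """Split the shot [first_frame, last_frame] into n_keyframes + 1 equal
    time segments and return the frames at the interior boundaries."""
    length = last_frame - first_frame
    return [first_frame + (length * k) // (n_keyframes + 1)
            for k in range(1, n_keyframes + 1)]


def crop(frame, x, y, w, h):
    """frame: 2-D list of pixel rows. Return the w-by-h sub-image whose
    top-left corner is at column x, row y."""
    return [row[x:x + w] for row in frame[y:y + h]]


def chi2_kernel(a, b, gamma=1.0):
    """Exponential chi-square kernel (assumed form):
    exp(-gamma * sum((a_i - b_i)^2 / (a_i + b_i))), skipping empty bins."""
    dist = sum((p - q) ** 2 / (p + q) for p, q in zip(a, b) if p + q > 0)
    return math.exp(-gamma * dist)
```

For very short shots several of the returned indices may coincide; how such shots should be handled is not stated in the text.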
Claims (1)
1. A method for automatically segmenting a TV program into its internal sections by means of mark detection, characterized by comprising the following steps:
Step 1: segment the TV program video into shots using a shot segmentation technique, obtaining a temporally continuous shot sequence;
Step 2: take 5 keyframes from each shot at equal time intervals, and extract from every keyframe the sub-image of the rectangular area where the mark is located;
Step 3: extract the image feature vectors of all sub-images in the training set, with sub-images that contain the mark as positive samples and sub-images that do not as negative samples, and train an SVM classifier for mark detection;
Step 4: for the program video to be segmented, obtain all sub-images through Steps 1 and 2, extract the same image feature vectors as in Step 3, and classify them with the SVM classifier obtained in Step 3 to get a classification result for each sub-image;
Step 5: label the mark attribute of each shot: if at least 3 sub-images in a shot are judged to contain the mark, label the shot a mark shot; otherwise label it a non-mark shot;
Step 6: segment the program video internally, using the boundaries between adjacent shots with different mark attributes as split points to divide the video into sections;
wherein Step 2 specifically comprises:
Step 2.1: divide each shot into 6 segments of equal duration and take the 5 frames at the boundaries between adjacent segments as keyframes;
Step 2.2: for the given TV program, determine the rectangular area containing the known mark such that the rectangle just encloses the mark completely, the rectangle being given by (x, y, w, h), where x and y are the horizontal and vertical coordinates of its top-left corner and w and h are its width and height;
Step 2.3: extract this rectangle from every keyframe; the result is called a sub-image;
and wherein Step 3 specifically comprises:
Step 3.1: compute an HSV color histogram for the sub-image, with the H channel divided into 8 intervals, the S channel into 3 intervals and the V channel into 3 intervals, and normalize the histogram to form a 72-dimensional feature vector;
Step 3.2: compute an edge gradient histogram for the sub-image, with one interval per 5 degrees, accumulate the gradients falling within each interval, and normalize the histogram to form a 72-dimensional feature vector;
Step 3.3: extract SIFT feature vectors from all sub-images, cluster the SIFT feature vectors of the training set with the K-means algorithm to obtain 64 cluster centers as the codebook, project all SIFT feature vectors of each sub-image onto the codebook to form a 64-dimensional histogram, and normalize it to obtain the feature vector;
Step 3.4: concatenate the three feature vectors above to form the final 208-dimensional feature vector;
Step 3.5: train the SVM classifier for mark detection offline by feeding the feature vectors of the positive and negative training samples into an SVM tool, where the positive and negative sets each contain more than 1000 samples and the SVM uses a kernel function based on the chi-square distance.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201010574074.8A CN102098449B (en) | 2010-12-06 | 2010-12-06 | Method for realizing automatic inside segmentation of TV programs by utilizing mark detection |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201010574074.8A CN102098449B (en) | 2010-12-06 | 2010-12-06 | Method for realizing automatic inside segmentation of TV programs by utilizing mark detection |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102098449A (en) | 2011-06-15 |
CN102098449B (en) | 2016-08-03 |
Family
ID=44131296
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201010574074.8A | Expired - Fee Related | CN102098449B (en) | 2010-12-06 | 2010-12-06 | Method for realizing automatic inside segmentation of TV programs by utilizing mark detection |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102098449B (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5146336A (en) * | 1989-09-25 | 1992-09-08 | Le Groupe Videotron Ltee | Sync control for video overlay |
CN101867729A (en) * | 2010-06-08 | 2010-10-20 | 上海交通大学 | Method for detecting news video formal soliloquy scene based on features of characters |
Non-Patent Citations (1)
Title |
---|
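高联雄 (Gao Lianxiong) et al.: "Traffic sign detection based on support vector machines and invariant moments" (基于支持向量机和不变矩的交通标志检测), Computer Engineering and Applications (《计算机工程与应用》), no. 31, 1 November 2008 (2008-11-01), pages 233 - 238 * |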
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102436575A (en) * | 2011-09-22 | 2012-05-02 | Tcl集团股份有限公司 | Method for automatically detecting and classifying station captions |
CN102799637A (en) * | 2012-06-27 | 2012-11-28 | 北京邮电大学 | Method for automatically generating main character abstract in television program |
CN103034860A (en) * | 2012-12-14 | 2013-04-10 | 南京思创信息技术有限公司 | Scale-invariant feature transform (SIFT) based illegal building detection method |
CN104185088A (en) * | 2014-03-03 | 2014-12-03 | 无锡天脉聚源传媒科技有限公司 | Video processing method and device |
CN104185088B (en) * | 2014-03-03 | 2017-05-31 | 无锡天脉聚源传媒科技有限公司 | A kind of method for processing video frequency and device |
CN105868768A (en) * | 2015-01-20 | 2016-08-17 | 阿里巴巴集团控股有限公司 | Method and system for recognizing whether picture carries specific marker |
CN108270946A (en) * | 2016-12-30 | 2018-07-10 | 央视国际网络无锡有限公司 | A kind of computer-aided video editing device in feature based vector library |
CN107223344A (en) * | 2017-01-24 | 2017-09-29 | 深圳大学 | The generation method and device of a kind of static video frequency abstract |
WO2018137126A1 (en) * | 2017-01-24 | 2018-08-02 | 深圳大学 | Method and device for generating static video abstract |
CN109525901A (en) * | 2018-11-27 | 2019-03-26 | Oppo广东移动通信有限公司 | Method for processing video frequency, device, electronic equipment and computer-readable medium |
US11601630B2 (en) | 2018-11-27 | 2023-03-07 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Video processing method, electronic device, and non-transitory computer-readable medium |
Also Published As
Publication number | Publication date |
---|---|
CN102098449B (en) | 2016-08-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20160803; Termination date: 20211206 |