CN102098449A - Method for realizing automatic inside segmentation of TV programs by utilizing mark detection - Google Patents

Method for realizing automatic inside segmentation of TV programs by utilizing mark detection

Info

Publication number
CN102098449A
Authority
CN
China
Prior art keywords
subgraph
camera lens
sign
mark
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2010105740748A
Other languages
Chinese (zh)
Other versions
CN102098449B (en)
Inventor
董远
肖国锐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN201010574074.8A
Publication of CN102098449A
Application granted
Publication of CN102098449B
Legal status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of image processing and pattern recognition, and provides a method for automatic internal segmentation of TV programs using mark detection. There is currently an urgent demand for internal segmentation of TV programs, and the temporal discontinuity of the mark within a program gives the program a clear structure. The method comprises the following steps: (1) segmenting the program video into shots and extracting, from the key frames of each shot, the sub-image of the area in which the mark is located; (2) extracting a feature vector from each sub-image and classifying it with a support vector machine (SVM) classifier trained for mark detection; (3) tallying the classification results and labeling the mark attribute of each shot; and (4) segmenting the video at the boundaries between adjacent shots whose mark attributes differ. Because only key frames are processed during mark detection, the method is efficient; it applies to TV programs whose marks are discontinuous within the program and places no requirement on the type of program content, which makes it widely applicable.

Description

A method for automatic internal segmentation of TV programs using mark detection
Technical field
The invention belongs to the technical field of image processing and pattern recognition, and specifically relates to a method for automatic internal segmentation of TV programs using mark detection.
Background art
At present, radio and television broadcasters produce massive amounts of video every day and provide electronic program guides. With the widespread adoption of Internet TV and digital TV, many TV programs attempt to divide their content into internal sections in order to offer a better viewing experience and to guide viewers within a program. Internal segmentation of a program is also a prerequisite for further content analysis and retrieval. Given the massive volume of video, manual annotation and segmentation cannot meet timeliness requirements, so automatic segmentation by machine is urgently needed. Video structure analysis refers to processing a video stream by shot segmentation, key frame extraction, and scene segmentation in order to obtain structured information about the video. Scene segmentation work concentrates mainly on methods such as scene clustering, repeated-video detection, and shot similarity comparison, which are often rather complex. Currently, more and more TV programs pay attention to intellectual property when displaying a station logo or their own program logo: video sections inside the program that the program does not own, such as advertisements or quoted film clips, do not carry these marks, whereas the sections that do carry the mark are usually the opening or closing titles, interview segments, or other segments recorded by the program itself. The discontinuity of the mark over time gives the TV program a strong structure and provides a basis for its internal segmentation.
Summary of the invention
For TV programs whose station logo or program logo (hereinafter collectively called the mark) is discontinuous in time, the invention provides an automatic internal segmentation method that segments such programs quickly and accurately.
The main steps of the automatic internal segmentation method of the invention are as follows:
Step 1: perform shot segmentation on the TV program video using an existing shot segmentation technique, obtaining a temporally continuous shot sequence;
Step 2: take 5 key frames from each shot, evenly spaced in time, and extract from every key frame the sub-image of the rectangular region where the mark is located;
Step 3: extract the image feature vectors of all sub-images in the training set, with sub-images that contain the mark as positive samples and sub-images that do not contain the mark as negative samples, and train an SVM classifier for mark detection;
Step 4: for the program video to be segmented, obtain all sub-images through Step 1 and Step 2, extract the same image feature vectors as in Step 3, and classify them with the SVM classifier obtained in Step 3 to obtain a classification result for each sub-image;
Step 5: label the mark attribute of each shot: if at least 3 sub-images in a shot are judged to contain the mark, label the shot as a mark shot; otherwise label it as a non-mark shot;
Step 6: segment the program video internally, taking the boundaries between adjacent shots with different mark attributes as cut points and dividing the video into sections.
Brief description of the drawings
Fig. 1 is an example diagram of the TV program structure addressed by the invention.
Fig. 2 is the basic flow chart of the method of the invention.
Detailed description of the embodiments
As shown in the flow chart of Fig. 2, the method of the invention comprises two stages: offline training of the classifier and online processing of the video to be segmented. The steps common to both stages are shot segmentation and the extraction of 5 key frames and their mark-region sub-images. The embodiment of the method is as follows.
(1) The shot segmentation step uses an existing shot segmentation algorithm, such as a histogram-based algorithm, a motion-based algorithm, or an algorithm for compressed video, to cut the TV program video into a temporally continuous shot sequence.
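The description leaves the choice of shot segmentation algorithm open. As an illustration of the histogram-based family only, the following minimal sketch declares a shot boundary wherever the L1 distance between the normalized histograms of consecutive frames exceeds a threshold. The function name `detect_shots`, the L1 distance, and the threshold value 0.5 are assumptions made for this example and are not taken from the patent.

```python
def detect_shots(frame_histograms, threshold=0.5):
    """Split a frame sequence into shots, returned as half-open (start, end) index ranges.

    frame_histograms: one normalized histogram (list of floats) per frame.
    threshold: hypothetical cut threshold on the L1 histogram distance.
    """
    if not frame_histograms:
        return []
    shots = []
    start = 0
    for i in range(1, len(frame_histograms)):
        prev, cur = frame_histograms[i - 1], frame_histograms[i]
        # L1 distance between the histograms of consecutive frames.
        diff = sum(abs(a - b) for a, b in zip(prev, cur))
        if diff > threshold:
            shots.append((start, i))  # close the current shot before frame i
            start = i
    shots.append((start, len(frame_histograms)))
    return shots
```

A production system would more likely rely on an established shot detector; the sketch only shows the shape of the output (a temporally continuous list of shots) that the later steps consume.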
(2) Each shot is divided into 6 segments by time, and the 5 frames where adjacent segments meet are taken as key frames. For the TV program in question, the rectangular region containing the known mark is determined; this rectangle just fully encloses the mark, and its coordinates are (x, y, w, h), where x and y are the horizontal and vertical coordinates of the rectangle's top-left corner and w and h are its width and height. This rectangle is extracted from every key frame and is called a sub-image.
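The key frame selection and sub-image extraction in step (2) can be sketched as follows. The function names, the half-open frame range convention, and the representation of a frame as a list of pixel rows are assumptions made for illustration.

```python
def key_frame_indices(start, end, num_key_frames=5):
    """Return the frame indices at the internal boundaries of
    num_key_frames + 1 equal time segments of the shot [start, end)."""
    length = end - start
    segments = num_key_frames + 1
    return [start + (k * length) // segments for k in range(1, segments)]


def crop_sub_image(frame, rect):
    """Extract the mark region from a frame.

    frame: list of pixel rows; rect: (x, y, w, h) with (x, y) the top-left corner.
    """
    x, y, w, h = rect
    return [row[x:x + w] for row in frame[y:y + h]]
```

For a 60-frame shot starting at frame 0, `key_frame_indices(0, 60)` gives frames 10, 20, 30, 40, and 50. Shots shorter than 6 frames would yield repeated indices, a case the description does not address.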
(3) Three kinds of image feature vectors are extracted from every sub-image: an HSV color histogram, an edge gradient histogram, and a bag-of-words histogram of SIFT feature points. The three features are then concatenated to form the final image feature vector. The specific feature extraction methods are as follows:
1. Color histogram extraction
An HSV color histogram is computed for the sub-image, with the H channel divided into 8 intervals, the S channel into 3 intervals, and the V channel into 3 intervals (8 x 3 x 3 = 72 bins); the histogram is normalized to form a 72-dimensional feature vector;
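A minimal sketch of the 72-bin HSV histogram, using only the Python standard library. The input format (a flat list of 8-bit RGB tuples), the bin ordering, and the choice of L1 normalization are assumptions; the description only states that the histogram is normalized.

```python
import colorsys

H_BINS, S_BINS, V_BINS = 8, 3, 3  # 8 x 3 x 3 = 72 bins


def hsv_histogram(pixels):
    """pixels: iterable of (r, g, b) tuples with 8-bit channel values."""
    hist = [0.0] * (H_BINS * S_BINS * V_BINS)
    for r, g, b in pixels:
        # colorsys returns h, s, v each in the range [0, 1].
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hb = min(int(h * H_BINS), H_BINS - 1)
        sb = min(int(s * S_BINS), S_BINS - 1)
        vb = min(int(v * V_BINS), V_BINS - 1)
        hist[(hb * S_BINS + sb) * V_BINS + vb] += 1.0
    total = sum(hist)
    # L1 normalization (an assumption; the norm is not specified in the text).
    return [c / total for c in hist] if total else hist
```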
2. Edge gradient histogram extraction
An edge gradient histogram is computed for the sub-image, with one interval per 5 degrees (360 / 5 = 72 intervals); the gradients falling within each interval are accumulated and the histogram is normalized to form a 72-dimensional feature vector;
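One way to realize such an orientation histogram is sketched below. The central-difference gradient operator, the accumulation of gradient magnitude (rather than a simple count), and the L1 normalization are assumptions for illustration; the description does not name a specific gradient operator.

```python
import math


def edge_gradient_histogram(gray, bin_degrees=5):
    """gray: grayscale image as a list of rows of numbers.

    Returns a normalized histogram of gradient orientation over [0, 360)
    with one bin per bin_degrees, weighted by gradient magnitude.
    """
    n_bins = 360 // bin_degrees
    hist = [0.0] * n_bins
    rows, cols = len(gray), len(gray[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            # Central differences on interior pixels (assumed operator).
            gx = gray[y][x + 1] - gray[y][x - 1]
            gy = gray[y + 1][x] - gray[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 360.0
            hist[int(angle // bin_degrees) % n_bins] += mag
    total = sum(hist)
    return [v / total for v in hist] if total else hist
```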
3. Bag-of-words SIFT feature point histogram extraction
SIFT feature vectors are extracted from all sub-images; the SIFT feature vectors of the training set are clustered with the K-means algorithm to obtain 64 cluster centers, which serve as the codebook of the bag-of-words model; all SIFT feature vectors of each sub-image are projected onto the codebook to form a 64-dimensional histogram, which is normalized to obtain the feature vector;
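The projection onto the codebook can be sketched as a nearest-neighbor assignment followed by counting. SIFT extraction and K-means training are not shown here; the sketch assumes the descriptors and the codebook are already available as lists of numbers, and the use of squared Euclidean distance and L1 normalization is an assumption.

```python
def bow_histogram(descriptors, codebook):
    """Quantize each descriptor to its nearest code word and count occurrences.

    descriptors: list of feature vectors from one sub-image.
    codebook: list of cluster centers (64 in the patent's embodiment).
    """
    hist = [0.0] * len(codebook)
    for d in descriptors:
        # Index of the nearest code word by squared Euclidean distance.
        best = min(
            range(len(codebook)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(d, codebook[k])),
        )
        hist[best] += 1.0
    total = sum(hist)
    return [v / total for v in hist] if total else hist
```

A sub-image with no detected feature points yields an all-zero histogram in this sketch; how the original method treats that case is not stated.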
4. The three feature vectors above are concatenated to form the final 208-dimensional feature vector (72 + 72 + 64 = 208).
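The concatenation itself is simple, but a dimension check helps catch mismatched inputs early. The function and constant names below are hypothetical.

```python
COLOR_DIMS = 8 * 3 * 3    # H x S x V intervals
GRADIENT_DIMS = 360 // 5  # one interval per 5 degrees
SIFT_DIMS = 64            # codebook size


def concatenate_features(color_hist, gradient_hist, sift_hist):
    """Join the three histograms into one 208-dimensional feature vector."""
    actual = (len(color_hist), len(gradient_hist), len(sift_hist))
    if actual != (COLOR_DIMS, GRADIENT_DIMS, SIFT_DIMS):
        raise ValueError("unexpected feature dimensions: %r" % (actual,))
    return list(color_hist) + list(gradient_hist) + list(sift_hist)
```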
(4) The SVM classifier for mark detection is trained offline: the image feature vectors of the positive and negative training samples are fed to an SVM tool for training. Here, the positive and negative sets each contain more than 1000 samples, and the SVM uses a kernel function based on the chi-square distance.
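The description does not give the exact form of the chi-square kernel. A common choice for histogram features, assumed here, is the exponential chi-square kernel k(x, y) = exp(-gamma * sum_i (x_i - y_i)^2 / (x_i + y_i)). The sketch below computes this kernel and the resulting Gram matrix in plain Python; the parameter `gamma` and the function names are assumptions.

```python
import math


def chi2_distance(x, y):
    """Chi-square distance between two non-negative histograms."""
    total = 0.0
    for a, b in zip(x, y):
        if a + b > 0:  # skip bins that are empty in both histograms
            total += (a - b) ** 2 / (a + b)
    return total


def chi2_kernel(x, y, gamma=1.0):
    """Exponential chi-square kernel (assumed form, see lead-in)."""
    return math.exp(-gamma * chi2_distance(x, y))


def gram_matrix(samples, gamma=1.0):
    """Pairwise kernel values for a list of feature vectors."""
    return [[chi2_kernel(a, b, gamma) for b in samples] for a in samples]
```

A Gram matrix of this kind can be passed to an SVM implementation that accepts precomputed kernels; which SVM tool the inventors used is not stated in the text.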
(5) For the sub-images of the video to be segmented, the same image feature vectors as in step (3) are extracted, 208 dimensions in total. The codebook needed to form the SIFT histogram feature vector is the one used in step (3), obtained from the training set by K-means clustering.
(6) The SVM classifier obtained in step (4) classifies the feature vectors obtained in step (5); the classification result indicates whether each sub-image contains the mark.
(7) Using the results of step (6), count the key frames that contain the mark in each shot; if the count is greater than or equal to 3, label the shot as a mark shot, otherwise label it as a non-mark shot.
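The shot labeling rule is a majority vote over the 5 key frames of a shot. A minimal sketch, with a hypothetical function name and booleans standing in for the per-sub-image classifier outputs:

```python
def label_shot(sub_image_predictions, min_positive=3):
    """Return True (mark shot) if at least min_positive key-frame
    sub-images were classified as containing the mark."""
    positives = sum(1 for p in sub_image_predictions if p)
    return positives >= min_positive
```

Requiring 3 of 5 positive key frames means that up to two misclassified sub-images per shot do not change the shot's label.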
(8) Examine the shot labels of the video to be segmented shot by shot; if two adjacent shots have different labels, take the boundary between them as a cut point. Once all adjacent shots have been examined in order, the internal segmentation of the program video is complete.
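The final pass over the shot labels can be sketched as merging runs of adjacent shots that share a label. The function name and the tuple layout are assumptions; shots are taken to be half-open (start, end) frame ranges in temporal order.

```python
def segment_program(shots, shot_labels):
    """Merge adjacent shots with the same mark label into sections.

    shots: list of (start, end) frame ranges in temporal order.
    shot_labels: list of booleans, True for a mark shot.
    Returns a list of (start, end, has_mark) sections.
    """
    sections = []
    for (start, end), label in zip(shots, shot_labels):
        if sections and sections[-1][2] == label:
            # Same label as the previous shot: extend the current section.
            sections[-1] = (sections[-1][0], end, label)
        else:
            # Label changed: the shot boundary becomes a cut point.
            sections.append((start, end, label))
    return sections
```

The cut points are the start frames of every section after the first, for example `[s[0] for s in sections[1:]]`.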

Claims (1)

1. A method for automatic internal segmentation of TV programs using mark detection, characterized by comprising the following steps:
Step 1: perform shot segmentation on the TV program video using a shot segmentation technique, obtaining a temporally continuous shot sequence;
Step 2: take 5 key frames from each shot, evenly spaced in time, and extract from every key frame the sub-image of the rectangular region where the mark is located;
Step 3: extract the image feature vectors of all sub-images in the training set, with sub-images that contain the mark as positive samples and sub-images that do not contain the mark as negative samples, and train an SVM classifier for mark detection;
Step 4: for the program video to be segmented, obtain all sub-images through Step 1 and Step 2, extract the same image feature vectors as in Step 3, and classify them with the SVM classifier obtained in Step 3 to obtain a classification result for each sub-image;
Step 5: label the mark attribute of each shot: if at least 3 sub-images in a shot are judged to contain the mark, label the shot as a mark shot; otherwise label it as a non-mark shot;
Step 6: segment the program video internally, taking the boundaries between adjacent shots with different mark attributes as cut points and dividing the video into sections.
Wherein said Step 2 specifically comprises:
Step 1: divide each shot into 6 segments by time, and take the 5 frames where adjacent segments meet as key frames;
Step 2: for the TV program in question, determine the rectangular region containing the known mark; this rectangle just fully encloses the mark, and its coordinates are (x, y, w, h), where x and y are the horizontal and vertical coordinates of the rectangle's top-left corner and w and h are its width and height;
Step 3: extract this rectangle from every key frame; the result is called a sub-image.
Wherein said Step 3 specifically comprises:
Step 1: compute an HSV color histogram for the sub-image, with the H channel divided into 8 intervals, the S channel into 3 intervals, and the V channel into 3 intervals, and normalize the histogram to form a 72-dimensional feature vector;
Step 2: compute an edge gradient histogram for the sub-image, with one interval per 5 degrees, accumulate the gradients falling within each interval, and normalize the histogram to form a 72-dimensional feature vector;
Step 3: extract the SIFT feature vectors of all sub-images, cluster the SIFT feature vectors of the training set with the K-means algorithm to obtain 64 cluster centers as the codebook, project all SIFT feature vectors of each sub-image onto the codebook to form a 64-dimensional histogram, and normalize it to obtain the feature vector;
Step 4: concatenate the three feature vectors above to form the final 208-dimensional feature vector;
Step 5: train the SVM classifier for mark detection offline by feeding the feature vectors of the positive and negative training samples to an SVM tool, wherein the positive and negative sets used in training each contain more than 1000 samples and the SVM uses a kernel function based on the chi-square distance.
CN201010574074.8A 2010-12-06 2010-12-06 Method for automatic internal segmentation of TV programs using mark detection Expired - Fee Related CN102098449B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010574074.8A CN102098449B (en) 2010-12-06 2010-12-06 Method for automatic internal segmentation of TV programs using mark detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010574074.8A CN102098449B (en) 2010-12-06 2010-12-06 Method for automatic internal segmentation of TV programs using mark detection

Publications (2)

Publication Number Publication Date
CN102098449A (en) 2011-06-15
CN102098449B (en) 2016-08-03

Family

ID=44131296

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010574074.8A Expired - Fee Related CN102098449B (en) 2010-12-06 2010-12-06 Method for automatic internal segmentation of TV programs using mark detection

Country Status (1)

Country Link
CN (1) CN102098449B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5146336A (en) * 1989-09-25 1992-09-08 Le Groupe Videotron Ltee Sync control for video overlay
CN101867729A (en) * 2010-06-08 2010-10-20 上海交通大学 Method for detecting news video formal soliloquy scene based on features of characters

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
高联雄等: "基于支持向量机和不变矩的交通标志检测", 《计算机工程与应用》, no. 31, 1 November 2008 (2008-11-01), pages 233 - 238 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102436575A (en) * 2011-09-22 2012-05-02 Tcl集团股份有限公司 Method for automatically detecting and classifying station captions
CN102799637A (en) * 2012-06-27 2012-11-28 北京邮电大学 Method for automatically generating main character abstract in television program
CN103034860A (en) * 2012-12-14 2013-04-10 南京思创信息技术有限公司 Scale-invariant feature transform (SIFT) based illegal building detection method
CN104185088A (en) * 2014-03-03 2014-12-03 无锡天脉聚源传媒科技有限公司 Video processing method and device
CN104185088B (en) * 2014-03-03 2017-05-31 无锡天脉聚源传媒科技有限公司 A kind of method for processing video frequency and device
CN105868768A (en) * 2015-01-20 2016-08-17 阿里巴巴集团控股有限公司 Method and system for recognizing whether picture carries specific marker
CN108270946A (en) * 2016-12-30 2018-07-10 央视国际网络无锡有限公司 A kind of computer-aided video editing device in feature based vector library
CN107223344A (en) * 2017-01-24 2017-09-29 深圳大学 The generation method and device of a kind of static video frequency abstract
WO2018137126A1 (en) * 2017-01-24 2018-08-02 深圳大学 Method and device for generating static video abstract
CN109525901A (en) * 2018-11-27 2019-03-26 Oppo广东移动通信有限公司 Method for processing video frequency, device, electronic equipment and computer-readable medium
US11601630B2 (en) 2018-11-27 2023-03-07 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Video processing method, electronic device, and non-transitory computer-readable medium

Also Published As

Publication number Publication date
CN102098449B (en) 2016-08-03

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160803

Termination date: 20211206
