CN104166685B - Method and apparatus for detecting a video segment
- Publication number
- CN104166685B CN201410357297.7A
- Authority
- CN
- China
- Prior art keywords
- video
- key frame
- frame
- video segment
- candidate
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
Abstract
An embodiment of the invention discloses a method and apparatus for detecting a video segment, belonging to the field of image processing. The method takes the currently input video file or real-time video stream as the object to be detected, uses video analysis and retrieval to detect and recognize whether a video from the video library appears in the current video stream, locates the exact position of that video segment within the stream to be detected, and outputs the detection result. The features extracted by the invention fuse global and local image information, which lowers the matching error rate while maintaining a high recall rate and remains robust to changes in image content. Video segments in real-time or offline television or video programs can therefore be located accurately, and accurate detection and recognition are maintained when the video stream drops frames, contains mosaic blocks, differs in video resolution or frame rate, or changes in content.
Description
Technical field
The present invention relates to the field of image processing, and in particular to a method and apparatus for detecting a video segment.
Background
With advances in the hardware and software of video recording equipment, more and more video is being shot and produced by professionals and ordinary people alike. In the radio, film, and television industry, the number and data volume of video materials and television program videos are very large. When managing such massive video data, quickly retrieving one or several videos according to a specific need has become a common and important operation, for example finding video material, program content, or advertising segments that share the same video content.
Meanwhile, with the rise of Internet video sites, hundreds of thousands of videos are uploaded every day. The video data held by each site is massive and the storage overhead is high; if users upload similar videos, storage space is wasted and costs rise. Individuals and families also keep videos that record their lives, so searching these videos for similar video segments has likewise become a practical need.
Current content-based video retrieval techniques fall into two main classes: 1) video detection based on key frames; 2) video detection based on sequence segment matching.
Key-frame-based video detection first extracts a representative set of key frames from a video to characterize it, and then performs matching and retrieval between videos by matching their key frames. After the key frames are extracted, image features that are more compact and easier to compute are extracted from them. Image features fall into two main classes: global image features and local image features. Common global features include color features (color histograms, color moments, etc.), texture features (gray-level co-occurrence matrices, LBP, Gabor, etc.), and shape and edge features (edge histograms, shape context, etc.). Common local features include local feature detectors such as Harris, Laplace, DoG, and Hessian, and local feature descriptors such as MSER, SIFT, and ORB. Global features are fast to compute and detect, but they are easily affected by image changes. Local image features are invariant to illumination, rotation, translation, and scale and partially invariant to affine transformation, but they are more expensive to compute and place higher demands on the computing power of the machine.
Video detection based on sequence segment matching uses the temporal and spatial continuity between video frames to detect and locate video. It consists of two main parts: image feature extraction and sequence analysis. In this class of methods, the image features are mainly simple global image features (such as color features, texture features, and spatial ordinal features) or object motion trajectory features. During matching, these methods perform sequence matching analysis over multiple consecutive frames, either as a whole or with a sliding window (for example, edit distance or the matrix diagonal method), to determine whether the target video segment appears in the video to be detected. Most of the computation time is spent on sequence analysis, and the temporal order between frames is critical: once the order is disrupted, detection performance suffers. A further drawback is that these algorithms cannot accurately determine the start and end times of a video segment and give poor results on short videos, so they are better suited to judging the overall similarity of videos. In real-time stream detection, frame loss, loss of audio information, and mosaic images are also likely to occur.
Summary of the invention
Embodiments of the invention provide a method and apparatus for detecting a video segment, which address the problem of accurately detecting and recognizing video segments in a video file or real-time video stream.
To achieve the above object, the following technical solution is adopted:
The invention discloses a method for detecting a video segment, comprising the following steps:
Extracting the key frames corresponding to all videos in a video library, and performing feature extraction on the key frames to obtain the local feature vectors and global feature vector corresponding to each key frame;
Building an index over the global feature vectors of the key frames using a hash table, and building an index over the local features of the key frames using an inverted index, to form an index library;
Extracting key frames to be detected from a video segment, performing retrieval matching in the index library for each key frame to be detected, and returning the set of candidate frames of video segments similar to that key frame;
Analyzing the candidate frames corresponding to each key frame in the video segment and merging them into video streams, and determining, according to the similarity between the video segment and the video streams, whether the video segment is a video in the video library.
Further, when extracting the key frames corresponding to all videos in the video library, the block chroma histogram of the current image frame is extracted as an image feature vector and its similarity to the image feature vector of the previous key frame is computed. If the two are dissimilar, the current image frame is added to the key frame set as a new key frame; otherwise, it is determined whether the counter value equals the sampling interval value, and if so, the current image frame is added to the key frame set as a new key frame.
Further, when performing feature extraction on a key frame, the color information of the key frame is normalized and the image is divided into blocks; statistical features are computed separately for the three channels of each image block, and the feature vectors of the three channels are concatenated to produce the global feature vector.
Further, when building the index over the global feature vectors of the key frames using a hash table, the normalized global feature vector G_norm is hash-mapped to a hash value H, in which ω_i is the weight corresponding to each feature component.
Further, when performing retrieval matching in the index library for each key frame to be detected, matching is performed in the index library separately on the local features and on the global feature of the key frame to be detected, and the intersection of the resulting video key frames is taken as the candidate frame set.
Further, when analyzing the candidate frames corresponding to each key frame in the video segment and merging them into video streams:
The candidate frames that belong to the same video number id and satisfy an adjacent temporal relationship are merged into a candidate sequence V_id;
The similarity VS_id between the video segment Q = {..., F_Q, ...} and each candidate sequence V_id is calculated,
where Q and V_id denote the video segment and the candidate sequence respectively, MF is the number of key frames successfully matched between Q and V_id, FS is the similarity between each pair of successfully matched video key frames, F_Q is a key frame in the video segment, and the frame matched against F_Q is a candidate frame in the candidate sequence;
The set of similarities between the video segment Q and all candidate sequences is thus obtained: VS = {..., VS(Q, V_id), ...};
The VS(Q, V_id) with the maximum similarity in the similarity set is taken as the detection result for the video segment Q; if this similarity VS(Q, V_id) is greater than the matching threshold, the video segment Q is recognized as the video V_id in the video library.
The invention also discloses an apparatus for detecting a video segment, comprising the following modules:
An extraction module, configured to extract the key frames corresponding to all videos in a video library and to perform feature extraction on the key frames to obtain the local feature vectors and global feature vector corresponding to each key frame;
An index module, configured to build an index over the global feature vectors of the key frames using a hash table and to build an index over the local features of the key frames using an inverted index, forming an index library;
A matching module, configured to extract key frames to be detected from a video segment, perform retrieval matching in the index library for each key frame to be detected, and return the set of candidate frames of video segments similar to that key frame;
An analysis module, configured to analyze the candidate frames corresponding to each key frame in the video segment and merge them into video streams, and to determine, according to the similarity between the video segment and the video streams, whether the video segment is a video in the video library.
Further, the extraction module is specifically configured to extract the block chroma histogram of the current image frame as an image feature vector and to compute its similarity to the image feature vector of the previous key frame. If the two are dissimilar, the current image frame is added to the key frame set as a new key frame; otherwise, it is determined whether the counter value equals the sampling interval value, and if so, the current image frame is added to the key frame set as a new key frame.
Further, the extraction module is specifically configured to normalize the color information of a key frame and divide the image into blocks, compute statistical features separately for the three channels of each image block, and concatenate the feature vectors of the three channels to produce the global feature vector.
Further, the index module is specifically configured to hash-map the normalized global feature vector G_norm to a hash value H, in which ω_i is the weight corresponding to each feature component.
Further, the matching module is specifically configured to perform matching in the index library separately on the local features and on the global feature of the key frame to be detected, and to take the intersection of the resulting video key frames as the candidate frame set.
Further, the analysis module comprises:
A merging unit, configured to merge the candidate frames that belong to the same video number id and satisfy an adjacent temporal relationship into a candidate sequence V_id;
A similarity calculation unit, configured to calculate the similarity VS_id between the video segment Q = {..., F_Q, ...} and each candidate sequence V_id,
where Q and V_id denote the video segment and the candidate sequence respectively, MF is the number of key frames successfully matched between Q and V_id, FS is the similarity between each pair of successfully matched video key frames, F_Q is a key frame in the video segment, and the frame matched against F_Q is a candidate frame in the candidate sequence;
The set of similarities between the video segment Q and all candidate sequences is thus obtained: VS = {..., VS(Q, V_id), ...};
An evaluation unit, configured to take the VS(Q, V_id) with the maximum similarity in the similarity set as the detection result for the video segment Q; if this similarity VS(Q, V_id) is greater than the matching threshold, the video segment Q is recognized as the video V_id in the video library.
The method and apparatus for detecting a video segment disclosed by the invention take the currently input video file or real-time video stream as the object to be detected, use video analysis and retrieval to detect and recognize whether a video from the video library appears in the current video stream, locate the exact position of that video segment within the stream to be detected, and output the detection result. The features extracted by the invention fuse global and local image information, which lowers the matching error rate while maintaining a high recall rate and remains robust to changes in image content. Video segments in real-time or offline television or video programs can therefore be located accurately, and accurate detection and recognition are maintained when the video stream drops frames, contains mosaic blocks, differs in video resolution or frame rate, or changes in content (brightness changes or object motion within a frame).
Brief description of the drawings
Fig. 1 is a flow chart of the method for detecting a video segment provided by Embodiment 1 of the invention;
Fig. 2 is a schematic diagram of the hash table in the method for detecting a video segment provided by Embodiment 1 of the invention;
Fig. 3 is a schematic diagram of the inverted index in the method for detecting a video segment provided by Embodiment 1 of the invention;
Fig. 4 is a module structure diagram of the apparatus for detecting a video segment provided by Embodiment 1 of the invention.
Detailed description
To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the accompanying drawings.
The invention uses content-based video analysis and retrieval to detect and recognize video segments in television and video programs. The currently input video file or real-time video stream is taken as the object to be detected; video analysis and retrieval are used to detect and recognize whether a video from the video library appears in the current video stream, the exact position of that video segment within the stream to be detected is located, and the detection result is output. A video segment here may be a sub-segment of a video or a complete video.
The method for detecting a video segment according to the invention, as shown in Fig. 1, comprises the following steps:
Step 101: Extract the key frames corresponding to all videos in the video library, and perform feature extraction on the key frames to obtain the local feature vectors and global feature vector corresponding to each key frame;
When a specific video segment appears in a video file or video stream, there are image differences between the content of its start and end frames and the content of the adjacent frames of the surrounding stream. Exploiting this property, and to ensure that video segments can be matched and located accurately, this embodiment extracts the key frames corresponding to all videos in the video library.
When extracting the key frames corresponding to all videos in the video library, the block chroma histogram of the current image frame is extracted as an image feature vector and its similarity to the image feature vector of the previous key frame is computed. If the two are dissimilar, the current image frame is added to the key frame set as a new key frame; otherwise, it is determined whether the counter value equals the sampling interval value, and if so, the current image frame is added to the key frame set as a new key frame. Specifically:
A. Extract the block chroma histogram of the current image frame as the image feature vector;
B. Compute the similarity between this feature vector and the image feature vector of the previous key frame based on cosine distance (other distance functions may of course be used). If the two are judged dissimilar, the current image frame is regarded as a new key frame and added to the key frame set; otherwise, go to the next step;
C. Increment the key frame sampling interval counter by 1 and determine whether the counter value equals the key frame sampling interval value. If the two are equal, the frame is regarded as a key frame and added to the key frame set;
Extracting key frames for each video in this way captures the start and end frames of a video segment within television and video programs and is unaffected by the duration and frame rate of the segment: whether the video is long or short, enough meaningful key frames can be extracted.
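Steps A to C can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the per-frame feature vectors stand in for block chroma histograms, the names `select_key_frames`, `sim_threshold`, and `interval` and their default values are hypothetical, the first frame is assumed to be a key frame, and the counter is assumed to restart whenever a key frame is added (the text does not say when it resets).

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def select_key_frames(features, sim_threshold=0.9, interval=25):
    """Return the indices of the frames chosen as key frames.

    features: one histogram vector per frame (stand-in for the block
    chroma histogram). sim_threshold and interval are assumed values.
    """
    key_frames = []
    last_key = None
    counter = 0
    for idx, feat in enumerate(features):
        if last_key is None or cosine_similarity(feat, last_key) < sim_threshold:
            # Step B: dissimilar to the previous key frame, so a new key frame.
            is_key = True
        else:
            # Step C: similar frame, fall back to interval sampling.
            counter += 1
            is_key = counter == interval
        if is_key:
            key_frames.append(idx)
            last_key = feat
            counter = 0  # assumption: restart the interval after each key frame
    return key_frames
```

With this structure, a scene change always yields a key frame immediately, while a static scene still contributes one key frame every `interval` frames.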
After the key frames are extracted, image feature vectors that characterize the image properties and information of these key frames need to be extracted. Feature extraction on the key frames is therefore designed around the characteristics of video segments as they appear in television and video programs. Specifically:
1. Normalize the image size;
2. Extract local image features to describe the local image information of the key frame: first obtain a set of feature points with the Difference-of-Gaussian local feature detector, then describe the image neighborhood of each feature point with a 128-dimensional SIFT feature vector;
3. Extract global image features to describe the global image information of the key frame: first normalize the color information of the image and divide the image into blocks; then compute statistical features, including the mean and standard deviation, separately for the three channels of each image block; finally concatenate the feature vectors of the three channels to produce the final global image feature vector;
In this embodiment, the extracted features fuse the global and local information of the image, so that they remain robust to changes in image content.
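The global feature in item 3 can be illustrated with a short sketch. It assumes the image is already size- and color-normalized and is given as rows of three-channel tuples; the function name `global_feature`, the `grid` parameter (blocks per side), the use of the population standard deviation, and the ordering of the concatenated values are all illustrative assumptions rather than details stated in the text.

```python
import math

def global_feature(image, grid=2):
    """Block-wise mean and standard deviation for each of three channels.

    image: list of rows, each row a list of (c0, c1, c2) tuples whose
    values are assumed to be color-normalized already. grid is the
    assumed number of blocks per side; the image must be at least
    grid x grid pixels.
    """
    height, width = len(image), len(image[0])
    feature = []
    for channel in range(3):
        # One sub-vector per channel; the three are concatenated in order.
        for by in range(grid):
            for bx in range(grid):
                rows = range(by * height // grid, (by + 1) * height // grid)
                cols = range(bx * width // grid, (bx + 1) * width // grid)
                values = [image[y][x][channel] for y in rows for x in cols]
                mean = sum(values) / len(values)
                var = sum((v - mean) ** 2 for v in values) / len(values)
                feature.extend([mean, math.sqrt(var)])
    return feature
```

The resulting vector has length 3 × grid × grid × 2, so its size is fixed by the block layout and does not depend on the image resolution.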
Step 102: Build an index over the global feature vectors of the key frames using a hash table, and build an index over the local features of the key frames using an inverted index, to form an index library;
After the set of feature vectors characterizing the videos is obtained, a fast index is built from these feature vectors to speed up feature vector matching during real-time detection. Because global and local feature vectors have different characteristics, two different indexing schemes are used to achieve fast matching of global and local image feature vectors respectively.
For global image features, this embodiment builds a fast index based on a hash table, as follows:
1) Normalize the global feature vector G = (g_1, g_2, ..., g_n) to obtain G_norm, where B is a constant that controls the quantization precision.
2) Hash-map the normalized global feature vector G_norm to a hash value H, in which ω_i is the weight corresponding to each feature component.
3) After the global image feature vectors of the key frames of all videos in the video library have been processed by the above steps, store each resulting <H, G> pair in a hash table T, where H is the key and G is the value, as shown in Fig. 2.
For local image features, this embodiment builds a fast index using an inverted index, as follows:
a) Cluster the 128-dimensional SIFT feature vectors extracted from all key frames to generate K cluster centers C = {C_1, ..., C_i, ..., C_K}, each cluster center C_i being a 128-dimensional vector. To speed up clustering, the distance between a vector and the cluster centers is computed using approximate nearest neighbor search;
b) For each key frame image, compare the distance from each of its local feature vectors (128-dimensional SIFT) to the K cluster centers, and assign the class number C_i of the nearest cluster center to that local feature vector, so that every local image feature vector in the key frame image has a corresponding class number;
c) Build the inverted index I: the class numbers C_i of the K cluster centers serve as the entries of the inverted index, and the inverted list for each entry records the information of all key frames that contain that entry. Each index item in an inverted list records one occurrence of class number C_i: the key frame containing class number C_i, and the position (x, y), position code (pos), scale (scale), and principal orientation (orient) of the feature point. The resulting inverted index is shown in Fig. 3.
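Steps b) and c) can be sketched as below. The sketch assumes the cluster centers from step a) are already available and uses a brute-force nearest-center search in place of the approximate nearest neighbor search mentioned in the text; the names `nearest_center` and `build_inverted_index` and the tuple layout of each posting are hypothetical.

```python
from collections import defaultdict

def nearest_center(descriptor, centers):
    """Index of the cluster center closest to descriptor (brute force)."""
    def sq_dist(center):
        return sum((d - c) ** 2 for d, c in zip(descriptor, center))
    return min(range(len(centers)), key=lambda i: sq_dist(centers[i]))

def build_inverted_index(key_frames, centers):
    """Map each class number to the postings of key frames containing it.

    key_frames: {frame_id: [(descriptor, x, y, pos, scale, orient), ...]}
    Each posting records one occurrence of a class number in a key frame.
    """
    index = defaultdict(list)
    for frame_id, points in key_frames.items():
        for descriptor, x, y, pos, scale, orient in points:
            class_no = nearest_center(descriptor, centers)
            index[class_no].append((frame_id, x, y, pos, scale, orient))
    return index
```

Looking up a class number then returns every key frame occurrence of that visual word in a single dictionary access, which is what makes the later voting step cheap.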
Step 103: Extract key frames to be detected from the video segment, perform retrieval matching in the index library for each key frame to be detected, and return the set of candidate frames of video segments similar to that key frame;
Once the index library for the video library has been built, the index is used to perform fast real-time detection of similar library video segments appearing in the video stream Q to be detected, and to accurately locate the start and end times at which they appear.
A key frame set is extracted from the video stream Q to be detected to characterize it; the key frame extraction algorithm and image feature extraction algorithm are the same as those used in the offline index-building stage.
To support detection on live video streams, each key frame is used as a query image for retrieval in the index library as soon as it is extracted; image frame similarity is computed and ranked, and the candidate frame set of key frames from similar video segments is returned. The retrieval process is as follows:
1. Extract the newest key frame F_Q from the current input video stream with the key frame extraction algorithm, and extract its global image feature G_Q and local image features L_Q, where each local feature is a 128-dimensional vector, m is the number of local feature points contained in the key frame, and n is the length of the global feature;
2. Quantize the local image features L_Q into V_Q, each element of which is the class number assigned to the corresponding local feature after quantization;
3. For every class number in V_Q, take the key frames of all video segments from the corresponding inverted list as first candidates, merge them by key frame, and count the number of class numbers each key frame contains. Sort all retrieved key frames by this count in descending order to obtain the first ordered set Set_L;
4. Normalize and quantize the global image feature G_Q to generate the hash key H_Q, query the hash table T with H_Q for related global feature vectors, and take the candidate frames corresponding to those global feature vectors as second candidates to obtain the second ordered set Set_H;
5. Take the intersection of the first ordered set Set_L and the second ordered set Set_H; the resulting candidate frame set is the set of candidate matching video key frames: Set = Set_L ∩ Set_H;
6. Treat feature points with the same C_i in the key frame F_Q and in each candidate frame of Set as a matching point pair, each point being given by its image coordinates (x, y). Because of quantization error, some of these matching point pairs are wrong, so they must be screened for validity:
● Check whether the position codes pos of the two points are the same; if they differ, the pair is regarded as a mismatch and deleted; otherwise, go to the next step;
● Check whether the principal orientations orient of the two points differ greatly; if so, the pair is regarded as a mismatch and deleted; otherwise, go to the next step;
● Check whether the image region scaling factors of the two points differ greatly; if so, the pair is regarded as a mismatch and deleted;
● After all matching point pairs between the key frame F_Q and the candidate frame have passed the above checks, a reduced set of matching point pairs is obtained. The coordinates of the feature points in F_Q and the candidate frame are used to compute the affine transformation that most of the matching point pairs satisfy. In this embodiment, the RANSAC (RANdom SAmple Consensus) algorithm is used for this selection.
● Delete the matching point pairs that do not satisfy the affine transformation between F_Q and the candidate frame, yielding the set of matching points between F_Q and the candidate frame;
7. Sort all candidate key frames in the candidate frame set in descending order of the number of point pairs successfully matched between each candidate frame and the key frame F_Q.
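The first three screening checks in step 6 can be sketched as a simple filter. The text only says the orientations or scaling factors must not "differ greatly", so the thresholds `max_orient_diff` and `max_scale_ratio`, the dictionary layout of a point, and the function name `screen_matches` are all assumptions made for illustration; the RANSAC affine check is not covered here.

```python
def screen_matches(pairs, max_orient_diff=30.0, max_scale_ratio=2.0):
    """Drop matching point pairs that fail the pos, orient, or scale checks.

    pairs: list of (query_point, candidate_point); each point is a dict
    with keys "pos", "orient" (degrees), and "scale" (positive). Both
    thresholds are assumed values.
    """
    kept = []
    for q, c in pairs:
        if q["pos"] != c["pos"]:
            continue  # different position codes
        diff = abs(q["orient"] - c["orient"]) % 360.0
        if min(diff, 360.0 - diff) > max_orient_diff:
            continue  # principal orientations differ too much
        ratio = max(q["scale"], c["scale"]) / min(q["scale"], c["scale"])
        if ratio > max_scale_ratio:
            continue  # scaling factors differ too much
        kept.append((q, c))
    return kept
```

Running the cheap checks first means the more expensive affine estimation only has to process pairs that are already plausible.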
Through the above process, the key frames in videos similar to the current query key frame F_Q are obtained as the candidate frame set. Because fast indexing is used, matching between key frames is efficient. In addition, the feature matching method fuses global and local feature vectors, which lowers the matching error rate while maintaining a high recall rate.
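The voting and intersection in retrieval steps 3 to 5 can be sketched as follows, assuming an inverted index that maps each class number to postings whose first element is a key frame identifier, and a list of candidates already returned by the hash table lookup. The function names, the tie-breaking by frame identifier, and the choice to count each shared class number once per key frame are illustrative assumptions.

```python
from collections import Counter

def rank_by_shared_classes(query_classes, inverted_index):
    """Key frames sorted by how many query class numbers they contain."""
    votes = Counter()
    for class_no in set(query_classes):
        # Each key frame gets at most one vote per shared class number.
        frames = {posting[0] for posting in inverted_index.get(class_no, [])}
        votes.update(frames)
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [frame for frame, _ in ranked]

def candidate_frames(query_classes, inverted_index, hash_candidates):
    """Intersect the local-feature ranking with the hash-table candidates."""
    allowed = set(hash_candidates)
    ranked = rank_by_shared_classes(query_classes, inverted_index)
    return [frame for frame in ranked if frame in allowed]
```

Keeping the local-feature ranking order in the intersection means the most strongly supported candidates are verified first in the later point-pair screening.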
Step 104: Analyze the candidate frames corresponding to each key frame in the video segment and merge them into video streams, and determine, according to the similarity between the video segment and the video streams, whether the video segment is a video in the video library.
For the video segment currently being detected, each time a key frame is extracted, the previous step is used to retrieve its matching candidate frame set. On this basis, the retrieval results of the key frames in the video stream to be detected are analyzed and merged according to the inter-frame temporal relationship and the number of the library video they belong to, in order to determine whether a candidate video is recognized successfully and, if so, to locate it accurately. The detailed process is as follows:
First, the similarity FS between each successfully matched pair of video key frame and candidate frame is calculated,
where MP is the number of point pairs successfully matched between F_Q and the candidate frame; S, R, and T are respectively the image scaling factor, the image rotation angle, and the image translation distance in pixels between F_Q and the candidate frame; and FS is normalized to [0, 1];
Then, the matching candidate frames that belong to the same video number and satisfy an adjacent temporal relationship are merged into a candidate sequence V_id;
The similarity VS_id between the video segment to be detected Q = {..., F_Q, ...} and each candidate sequence V_id is calculated,
where Q and V_id denote the video segment and the candidate sequence respectively, MF is the number of key frames successfully matched between Q and V_id, FS is the similarity between each pair of successfully matched video key frames, F_Q is a key frame in the video segment, and the frame matched against F_Q is a candidate frame in the candidate sequence.
Finally, VS is normalized to [0, 1], which gives the set of similarities between the video segment Q and all candidate sequences: VS = {..., VS(Q, V_id), ...}.
In this embodiment, the VS(Q, V_id) with the maximum similarity in the current time segment is taken as the detection result for the video segment. If VS(Q, V_id) > T (0 < T ≤ 1; a larger T demands higher matching confidence), the match is considered reliable, the video segment Q is recognized as the video V_id in the video library, and the start and end frame times of Q give the position [start, end] at which Q appears in the video stream being detected.
Through the above steps, video segment detection and recognition can be performed on the currently input video file or real-time stream. When a video segment similar to one in the video library is detected in the current video stream and recognized successfully, it is located accurately according to the matching result of the video frame sequence, and the start and end times of the recognized video segment in the detected video stream are reported together with the confidence of the recognition result.
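The merging and decision logic of step 104 can be sketched as follows. The sketch takes the sequence similarities VS as precomputed inputs and only illustrates the grouping and thresholding; `max_gap` is an assumed reading of "adjacent temporal relationship" in terms of key frame numbers within a library video, and the function names are hypothetical.

```python
from collections import defaultdict

def merge_candidates(matches, max_gap=1):
    """Group matched candidate frames into per-video candidate sequences.

    matches: list of (video_id, frame_no) for matched candidate frames.
    Frames of one video whose numbers differ by at most max_gap (an
    assumed reading of "adjacent") fall into the same sequence.
    """
    by_video = defaultdict(list)
    for video_id, frame_no in matches:
        by_video[video_id].append(frame_no)
    sequences = []
    for video_id, frames in by_video.items():
        frames = sorted(set(frames))
        current = [frames[0]]
        for frame_no in frames[1:]:
            if frame_no - current[-1] <= max_gap:
                current.append(frame_no)
            else:
                sequences.append((video_id, current))
                current = [frame_no]
        sequences.append((video_id, current))
    return sequences

def detect(similarities, threshold):
    """Return the video id with the highest similarity if it exceeds T."""
    if not similarities:
        return None
    video_id, score = max(similarities.items(), key=lambda kv: kv[1])
    return video_id if score > threshold else None
```

For example, `detect({"v1": 0.8, "v2": 0.4}, 0.6)` selects `"v1"`, while a best score at or below the threshold yields no detection.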
The invention also discloses an apparatus for detecting a video segment, as shown in Fig. 4, comprising the following modules:
An extraction module 401, configured to extract the key frames corresponding to all videos in a video library and to perform feature extraction on the key frames to obtain the local feature vectors and global feature vector corresponding to each key frame;
An index module 402, configured to build an index over the global feature vectors of the key frames using a hash table and to build an index over the local features of the key frames using an inverted index, forming an index library;
A matching module 403, configured to extract key frames to be detected from a video segment, perform retrieval matching in the index library for each key frame to be detected, and return the set of candidate frames of video segments similar to that key frame;
An analysis module 404, configured to analyze the candidate frames corresponding to each key frame in the video segment and merge them into video streams, and to determine, according to the similarity between the video segment and the video streams, whether the video segment is a video in the video library.
Further, the extraction module is specifically configured to extract the block chroma histogram of the current image frame as an image feature vector and to calculate its similarity to the image feature vector of the previous key frame. If the two are dissimilar, the current image frame is added to the key frame set as a new key frame; otherwise it is judged whether the counter value equals the sampling interval value, and if so the current image frame is added to the key frame set as a new key frame.
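The key frame selection rule can be sketched as below. The sketch takes precomputed, normalized block chroma histograms as input; the histogram-intersection similarity, the default thresholds, and the counter reset after each new key frame are assumptions, while the increment of the sampling interval counter on similar frames follows claim 2.

```python
def histogram_intersection(h1, h2):
    """Similarity of two normalized histograms: 1.0 means identical (assumed measure)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def select_key_frames(histograms, sim_threshold=0.8, sampling_interval=25):
    """Return indices of frames chosen as key frames.

    histograms: per-frame block chroma histograms, each flattened and
                normalized to sum to 1 (computing them is out of scope here).
    """
    key_frames = []
    last_key_hist = None
    counter = 0
    for index, hist in enumerate(histograms):
        if last_key_hist is None:
            is_key = True  # the first frame has nothing to compare against
        elif histogram_intersection(hist, last_key_hist) < sim_threshold:
            is_key = True  # content changed: dissimilar to the previous key frame
        else:
            counter += 1   # similar frame: advance the sampling-interval counter
            is_key = counter == sampling_interval
        if is_key:
            key_frames.append(index)
            last_key_hist = hist
            counter = 0    # assumed reset after every new key frame
    return key_frames
```

The interval check guarantees that a long static shot still contributes key frames at a fixed rate, so a query segment cut from the middle of such a shot can still be matched.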
Further, the extraction module is specifically configured to perform local feature analysis on the key frame using the Difference-of-Gaussian local feature extraction operator, extracting a set of feature points, and to describe the image neighborhood of each feature point with a 128-dimensional SIFT feature vector.
Further, the extraction module is specifically configured to normalize the color information of the key frame and divide it into blocks, to compute statistical features separately for the three channels of each image block, and to concatenate the feature vectors of the three channels into the global feature vector.
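A minimal sketch of this global feature follows. The text does not say which statistics are computed or how the image is partitioned, so the mean and standard deviation per channel and the `grid` x `grid` block layout used here are assumptions, as is the plain nested-list image representation.

```python
from statistics import mean, pstdev


def global_feature(image, grid=2):
    """Build a global feature vector from an image.

    image: list of rows, each row a list of (c0, c1, c2) tuples with 8-bit values;
           it must be at least grid x grid pixels.
    grid:  the image is split into grid x grid blocks (assumed layout).
    """
    height, width = len(image), len(image[0])
    feature = []
    for by in range(grid):
        for bx in range(grid):
            rows = range(by * height // grid, (by + 1) * height // grid)
            cols = range(bx * width // grid, (bx + 1) * width // grid)
            for channel in range(3):
                # Normalize 8-bit values to [0, 1] before computing statistics.
                values = [image[y][x][channel] / 255.0 for y in rows for x in cols]
                feature.append(mean(values))
                feature.append(pstdev(values))
    return feature
```

With these assumptions the vector has `grid * grid * 3 * 2` components, ordered block by block and channel by channel.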
Further, the index module is specifically configured to apply a hash mapping to the standardized global feature vector $G_{norm}$; the hash value H is
[equation not reproduced in this text]
where $\omega_i$ is the weight corresponding to each feature component.
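A hash-table index over global features might look like the sketch below. The hash expression itself is not reproduced in the text, so the mapping used here (a weighted average of the components, quantized into `num_buckets` buckets) is purely an assumption, and the class and function names are hypothetical.

```python
from collections import defaultdict


def hash_value(g_norm, weights, num_buckets=1024):
    """Assumed mapping: weighted average of the standardized components,
    quantized into one of num_buckets integer buckets."""
    if len(g_norm) != len(weights):
        raise ValueError("one weight per feature component is required")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    score = sum(w * g for w, g in zip(weights, g_norm)) / total_weight
    score = min(max(score, 0.0), 1.0)  # clamp, assuming components in [0, 1]
    return min(int(score * num_buckets), num_buckets - 1)


class GlobalFeatureIndex:
    """Hash table from hash value H to the key frames stored under it."""

    def __init__(self, weights, num_buckets=1024):
        self.weights = weights
        self.num_buckets = num_buckets
        self.table = defaultdict(list)

    def add(self, video_id, frame_index, g_norm):
        h = hash_value(g_norm, self.weights, self.num_buckets)
        self.table[h].append((video_id, frame_index))

    def query(self, g_norm):
        h = hash_value(g_norm, self.weights, self.num_buckets)
        return list(self.table.get(h, []))
```

A single-bucket lookup like this misses near neighbors that fall just across a bucket boundary; probing adjacent buckets is one common way to soften that, though the text does not say whether the invention does so.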
Further, the matching module is specifically configured to match in the index library separately according to the local features and the global features of the key frame to be detected, and to take the intersection of the video key frames returned by the two matches as the candidate frame set.
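The intersection step can be illustrated as follows. The sketch assumes that local SIFT descriptors have already been quantized into integer visual word ids, which the text does not specify, and the `min_votes` rule for accepting a local match is likewise hypothetical.

```python
from collections import Counter, defaultdict


class InvertedIndex:
    """Posting lists from a visual word id to the key frames containing it.
    Quantizing SIFT descriptors into word ids is assumed to happen elsewhere."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, frame_key, word_ids):
        for word in word_ids:
            self.postings[word].add(frame_key)

    def query(self, word_ids, min_votes=2):
        """Return key frames sharing at least min_votes distinct words with the query."""
        votes = Counter()
        for word in set(word_ids):
            for frame_key in self.postings.get(word, ()):
                votes[frame_key] += 1
        return {key for key, count in votes.items() if count >= min_votes}


def candidate_frames(local_hits, global_hits):
    """Candidate frame set: key frames returned by both retrieval paths."""
    return sorted(set(local_hits) & set(global_hits))
```

Requiring agreement between the two retrieval paths trades some recall for a lower false-match rate, which is consistent with the stated goal of reducing matching errors.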
Further, the analysis module includes:
a merging unit, configured to merge the candidate frames that belong to the same video number id and satisfy an adjacent temporal relationship into one candidate sequence $V_{id}$;
a similarity calculation unit, configured to calculate the similarity $VS_{id}$ between the video segment $Q = \{\ldots, F_Q, \ldots\}$ and each candidate sequence $V_{id}$:
[equation not reproduced in this text]
where $Q$ and $V_{id}$ denote the video segment and the candidate sequence respectively, and MF is the number of key frames successfully matched between $Q$ and $V_{id}$; the formula also uses the similarity between each pair of successfully matched video key frames, where $F_Q$ is a key frame in the video segment and its counterpart is a candidate frame in the candidate sequence;
thereby obtaining the set of similarities between the video segment $Q$ and all candidate sequences, $VS = \{\ldots, VS(Q, V_{id}), \ldots\}$;
an evaluation unit, configured to take the $VS(Q, V_{id})$ with the maximum similarity in the similarity set as the detection result of the video segment $Q$ and, if its similarity $VS(Q, V_{id})$ to the candidate sequence is greater than the matching threshold, to identify the video segment $Q$ as the video $V_{id}$ in the video library.
The method and apparatus for detecting a video segment disclosed by the invention take the currently input video file or real-time video stream as the object to be detected, detect and identify by video analysis and retrieval whether a video in the video library appears in the current video stream, locate the exact position of the video segment in the stream to be detected, and output the detection result. The features extracted by the invention combine the global and local information of the image, which reduces the matching error rate while maintaining high recall and keeps the method robust to changes in image content. Video segments in real-time or offline television or video programs can therefore be located accurately, and accurate detection and identification are maintained when the video stream drops frames, contains mosaic blocks, differs in resolution or frame rate, or changes in content (brightness changes in the frame image, object motion).
The above is only a specific embodiment of the invention, but the protection scope of the invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall be covered by the protection scope of the invention. Therefore, the protection scope of the invention shall be defined by the scope of the claims.
Claims (10)
1. A method for detecting a video segment, characterized by comprising the following steps:
extracting the key frames corresponding to all videos in a video library, and performing feature extraction on the key frames to obtain the local feature vectors and global feature vectors corresponding to the key frames;
building an index on the global feature vectors of the key frames using a hash table, and building an index on the local features of the key frames using an inverted index, to form an index library;
extracting key frames to be tested from a video segment, performing retrieval and matching in the index library for each key frame to be tested, and returning the set of candidate frames of video segments similar to the key frame to be tested;
analyzing and merging the candidate frames corresponding to each key frame in the video segment into video streams, and judging, according to the similarity between the video segment and the video streams, whether the video segment is a video in the video library, including:
merging the candidate frames that belong to the same video number id and satisfy an adjacent temporal relationship into one candidate sequence $V_{id}$;
calculating the similarity $VS_{id}$ between the video segment $Q = \{\ldots, F_Q, \ldots\}$ and each candidate sequence $V_{id}$:
[equation not reproduced in this text]
wherein $Q$ and $V_{id}$ denote the video segment and the candidate sequence respectively, and MF is the number of key frames successfully matched between $Q$ and $V_{id}$; the formula also uses the similarity between each pair of successfully matched video key frames, where $F_Q$ is a key frame in the video segment and its counterpart is a candidate frame in the candidate sequence;
thereby obtaining the set of similarities between the video segment $Q$ and all candidate sequences, $VS = \{\ldots, VS(Q, V_{id}), \ldots\}$;
taking the $VS(Q, V_{id})$ with the maximum similarity in the similarity set as the detection result of the video segment $Q$ and, if its similarity $VS(Q, V_{id})$ to the candidate sequence is greater than a matching threshold, identifying the video segment $Q$ as the video $V_{id}$ in the video library.
2. The method according to claim 1, characterized in that: when extracting the key frames corresponding to all videos in the video library, the block chroma histogram of the current image frame is extracted as an image feature vector and its similarity to the image feature vector of the previous key frame is calculated; if the two are dissimilar, the current image frame is added to the key frame set as a new key frame; otherwise the key frame sampling interval counter is incremented by 1 and it is judged whether the counter value equals the sampling interval value, and if so the current image frame is added to the key frame set as a new key frame.
3. The method according to claim 1, characterized in that: when performing feature extraction on a key frame, the color information of the key frame is normalized and divided into blocks, statistical features are computed separately for the three channels of each image block, and the feature vectors of the three channels are concatenated into the global feature vector.
4. The method according to claim 1, characterized in that: when building the index on the global feature vectors of the key frames using a hash table, a hash mapping is applied to the standardized global feature vector $G_{norm}$, the hash value H being
[equation not reproduced in this text]
wherein $\omega_i$ is the weight corresponding to each feature component.
5. The method according to claim 1, characterized in that: when performing retrieval and matching in the index library for each key frame to be tested, matching is carried out in the index library separately according to the local features and the global features of the key frame to be detected, and the intersection of the video key frames so obtained is taken as the candidate frame set.
6. A device for detecting a video segment, characterized by comprising the following modules:
an extraction module, configured to extract the key frames corresponding to all videos in a video library and to perform feature extraction on the key frames, obtaining the local feature vectors and global feature vectors corresponding to the key frames;
an index module, configured to build an index on the global feature vectors of the key frames using a hash table and to build an index on the local features of the key frames using an inverted index, forming an index library;
a matching module, configured to extract key frames to be tested from a video segment, to perform retrieval and matching in the index library for each key frame to be tested, and to return the set of candidate frames of video segments similar to the key frame to be tested;
an analysis module, configured to analyze and merge the candidate frames corresponding to each key frame in the video segment into video streams and to judge, according to the similarity between the video segment and the video streams, whether the video segment is a video in the video library, including:
a merging unit, configured to merge the candidate frames that belong to the same video number id and satisfy an adjacent temporal relationship into one candidate sequence $V_{id}$;
a similarity calculation unit, configured to calculate the similarity $VS_{id}$ between the video segment $Q = \{\ldots, F_Q, \ldots\}$ and each candidate sequence $V_{id}$:
[equation not reproduced in this text]
wherein $Q$ and $V_{id}$ denote the video segment and the candidate sequence respectively, and MF is the number of key frames successfully matched between $Q$ and $V_{id}$; the formula also uses the similarity between each pair of successfully matched video key frames, where $F_Q$ is a key frame in the video segment and its counterpart is a candidate frame in the candidate sequence;
thereby obtaining the set of similarities between the video segment $Q$ and all candidate sequences, $VS = \{\ldots, VS(Q, V_{id}), \ldots\}$;
an evaluation unit, configured to take the $VS(Q, V_{id})$ with the maximum similarity in the similarity set as the detection result of the video segment $Q$ and, if its similarity $VS(Q, V_{id})$ to the candidate sequence is greater than a matching threshold, to identify the video segment $Q$ as the video $V_{id}$ in the video library.
7. The device according to claim 6, characterized in that: the extraction module is specifically configured to extract the block chroma histogram of the current image frame as an image feature vector and to calculate its similarity to the image feature vector of the previous key frame; if the two are dissimilar, the current image frame is added to the key frame set as a new key frame; otherwise the key frame sampling interval counter is incremented by 1 and it is judged whether the counter value equals the sampling interval value, and if so the current image frame is added to the key frame set as a new key frame.
8. The device according to claim 6, characterized in that: the extraction module is specifically configured to normalize the color information of the key frame and divide it into blocks, to compute statistical features separately for the three channels of each image block, and to concatenate the feature vectors of the three channels into the global feature vector.
9. The device according to claim 6, characterized in that: the index module is specifically configured to apply a hash mapping to the standardized global feature vector $G_{norm}$, the hash value H being
[equation not reproduced in this text]
wherein $\omega_i$ is the weight corresponding to each feature component.
10. The device according to claim 6, characterized in that: the matching module is specifically configured to match in the index library separately according to the local features and the global features of the key frame to be detected, and to take the intersection of the video key frames so obtained as the candidate frame set.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410357297.7A CN104166685B (en) | 2014-07-24 | 2014-07-24 | A kind of method and apparatus for detecting video segment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104166685A (en) | 2014-11-26 |
CN104166685B (en) | 2017-07-11 |
Family
ID=51910498
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201410357297.7A | Expired - Fee Related | CN104166685B (en) | 2014-07-24 | 2014-07-24 | A kind of method and apparatus for detecting video segment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104166685B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108900895A (en) * | 2018-08-23 | 2018-11-27 | 深圳码隆科技有限公司 | The screen method and its device of the target area of a kind of pair of video flowing |
Families Citing this family (35)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9418296B1 (en) * | 2015-03-17 | 2016-08-16 | Netflix, Inc. | Detecting segments of a video program |
CN106354736A (en) * | 2015-07-23 | 2017-01-25 | 无锡天脉聚源传媒科技有限公司 | Judgment method and device of repetitive video |
CN106375847A (en) * | 2015-07-23 | 2017-02-01 | 无锡天脉聚源传媒科技有限公司 | Template generation method, template generation device, video updating method and video updating device |
CN106372092A (en) * | 2015-07-23 | 2017-02-01 | 无锡天脉聚源传媒科技有限公司 | Method and device for generating template, and video update method and device |
CN106708876B (en) * | 2015-11-16 | 2020-04-21 | 任子行网络技术股份有限公司 | Similar video retrieval method and system based on Lucene |
CN105678254B (en) * | 2016-01-04 | 2019-05-31 | 深圳市茁壮网络股份有限公司 | A kind of video detecting method and device |
CN107180056B (en) * | 2016-03-11 | 2020-11-06 | 阿里巴巴集团控股有限公司 | Method and device for matching segments in video |
CN107451156B (en) * | 2016-05-31 | 2021-08-20 | 杭州华为企业通信技术有限公司 | Image re-identification method and identification device |
CN106557545B (en) * | 2016-10-19 | 2020-08-07 | 北京小度互娱科技有限公司 | Video retrieval method and device |
CN106570165B (en) * | 2016-11-07 | 2019-09-13 | 北京航空航天大学 | A kind of content based video retrieval system method and device |
US10482126B2 (en) * | 2016-11-30 | 2019-11-19 | Google Llc | Determination of similarity between videos using shot duration correlation |
CN106792203A (en) * | 2016-12-29 | 2017-05-31 | 深圳Tcl数字技术有限公司 | The video broadcasting method of intelligent television, video play device and cloud server |
CN108319888B (en) * | 2017-01-17 | 2023-04-07 | 阿里巴巴集团控股有限公司 | Video type identification method and device and computer terminal |
CN107483985A (en) * | 2017-07-20 | 2017-12-15 | 北京中科火眼科技有限公司 | A kind of advertisement accurately localization method |
CN107750015B (en) * | 2017-11-02 | 2019-05-07 | 腾讯科技(深圳)有限公司 | Detection method, device, storage medium and the equipment of video copy |
CN109977738B (en) * | 2017-12-28 | 2023-07-25 | 深圳Tcl新技术有限公司 | Video scene segmentation judging method, intelligent terminal and storage medium |
CN108595600B (en) * | 2018-04-18 | 2023-12-15 | 努比亚技术有限公司 | Photo classification method, mobile terminal and readable storage medium |
CN108595679B (en) * | 2018-05-02 | 2021-04-27 | 武汉斗鱼网络科技有限公司 | Label determining method, device, terminal and storage medium |
CN108694737B (en) * | 2018-05-14 | 2019-06-14 | 星视麒(北京)科技有限公司 | The method and apparatus for making image |
CN108769731B (en) * | 2018-05-25 | 2021-09-24 | 北京奇艺世纪科技有限公司 | Method and device for detecting target video clip in video and electronic equipment |
CN110895570A (en) * | 2018-08-24 | 2020-03-20 | 北京搜狗科技发展有限公司 | Data processing method and device and data processing device |
CN109389088B (en) * | 2018-10-12 | 2022-05-24 | 腾讯科技(深圳)有限公司 | Video recognition method, device, machine equipment and computer readable storage medium |
CN109492127A (en) * | 2018-11-12 | 2019-03-19 | 网易传媒科技(北京)有限公司 | Data processing method, device, medium and calculating equipment |
CN109857907B (en) * | 2019-02-25 | 2021-11-30 | 百度在线网络技术(北京)有限公司 | Video positioning method and device |
CN110083742B (en) * | 2019-04-29 | 2022-12-06 | 腾讯科技(深圳)有限公司 | Video query method and device |
CN111027419B (en) * | 2019-11-22 | 2023-10-20 | 腾讯科技(深圳)有限公司 | Method, device, equipment and medium for detecting video irrelevant content |
CN111182364B (en) * | 2019-12-27 | 2021-10-19 | 杭州小影创新科技股份有限公司 | Short video copyright detection method and system |
CN111309962B (en) * | 2020-01-20 | 2023-05-16 | 抖音视界有限公司 | Method and device for extracting audio clips and electronic equipment |
CN111538858B (en) * | 2020-05-06 | 2023-06-23 | 英华达(上海)科技有限公司 | Method, device, electronic equipment and storage medium for establishing video map |
CN111741325A (en) * | 2020-06-05 | 2020-10-02 | 咪咕视讯科技有限公司 | Video playing method and device, electronic equipment and computer readable storage medium |
CN112307883B (en) * | 2020-07-31 | 2023-11-07 | 北京京东尚科信息技术有限公司 | Training method, training device, electronic equipment and computer readable storage medium |
CN112597335B (en) * | 2020-12-21 | 2022-08-19 | 北京华录新媒信息技术有限公司 | Output device and output method for selecting drama |
CN112580569B (en) * | 2020-12-25 | 2023-06-09 | 山东旗帜信息有限公司 | Vehicle re-identification method and device based on multidimensional features |
CN113407780B (en) * | 2021-05-20 | 2022-07-05 | 桂林电子科技大学 | Target retrieval method, device and storage medium |
CN113422981B (en) * | 2021-06-30 | 2023-03-10 | 北京华录新媒信息技术有限公司 | Method and device for identifying opera based on ultra-high definition opera video |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101833650A (en) * | 2009-03-13 | 2010-09-15 | 清华大学 | Video copy detection method based on contents |
CN102508923A (en) * | 2011-11-22 | 2012-06-20 | 北京大学 | Automatic video annotation method based on automatic classification and keyword marking |
CN102982165A (en) * | 2012-12-10 | 2013-03-20 | 南京大学 | Large-scale human face image searching method |
CN103294676A (en) * | 2012-02-24 | 2013-09-11 | 北京明日时尚信息技术有限公司 | Content duplicate detection method of network image based on GIST (generalized search tree) global feature and SIFT (scale-invariant feature transform) local feature |
CN103581705A (en) * | 2012-11-07 | 2014-02-12 | 深圳新感易搜网络科技有限公司 | Method and system for recognizing video program |
CN103631932A (en) * | 2013-12-06 | 2014-03-12 | 中国科学院自动化研究所 | Method for detecting repeated video |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040101042A1 (en) * | 2002-11-25 | 2004-05-27 | Yi-Kai Chen | Method for shot change detection for a video clip |
Non-Patent Citations (2)
Title |
---|
"一种基于内容相似性的重复视频片段检测方法";刘守群等;《中国科学技术大学学报》;20101130;第40卷(第11期);第1130页右栏第2段-第1135页左栏第4段,图1-4 * |
"基于内容的大规模近似重复视频检索研究";祝春多;《中国优秀硕士学位论文全文数据库 信息科技辑》;20140415(第04期);I138-915 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 2017-07-11; termination date: 2021-07-24 |