CN103617233B - Method and device for detecting repeated video based on semantic content multilayer expression - Google Patents


Info

Publication number
CN103617233B
CN103617233B · Application CN201310611187.4A
Authority
CN
China
Prior art keywords
query
video
key frame
high-dimensional feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310611187.4A
Other languages
Chinese (zh)
Other versions
CN103617233A (en)
Inventor
刘大伟
徐伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yantai Zhong Ke Network Technical Institute
Original Assignee
Yantai Zhong Ke Network Technical Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yantai Zhong Ke Network Technical Institute
Priority to CN201310611187.4A
Publication of CN103617233A
Application granted
Publication of CN103617233B
Legal status: Active (current)
Anticipated expiration


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/71Indexing; Data structures therefor; Storage structures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method for detecting repeated video based on a multi-layer representation of semantic content. The method comprises the following steps: a feature database is established from the information of the index videos; shot detection is performed on the query video to be checked; key frames are extracted from each query video segment; each query key frame is processed with a feature extraction algorithm; each query high-dimensional feature vector is hashed; each query feature tag is associated with the corresponding query high-dimensional feature vector identifier, query key frame identifier, query video segment identifier and query video identifier, and the feature tags are retrieved from the feature database; feature filtering is applied to each group of similar feature tags obtained by retrieval; similarity matching is performed on the feature vectors in each candidate feature vector set, yielding the repeated-video detection result. The method avoids the distance computation over high-dimensional feature vectors that constitutes the performance bottleneck, guarantees detection accuracy, and effectively improves the processing speed of repeated-video detection.

Description

A method and device for detecting repeated video based on a multi-layer representation of semantic content
Technical field
The present invention relates to a video detection method, and more particularly to a method and device for detecting repeated video based on a multi-layer representation of semantic content.
Background technology
With the rapid development of networked digital video applications, large-scale repeated-video detection has become a research concern for protecting and managing video content. Repeated-video detection methods fall broadly into two classes: digital watermarking and content-based duplicate detection. Digital watermarking embeds hidden data (the watermark) into images and video and detects it later. Content-based methods, by contrast, use video content analysis algorithms to generate video signatures or key-frame features for retrieval, and offer higher processing efficiency and accuracy. Most research therefore focuses on content-based repeated-video retrieval.
The general procedure of existing methods can be divided into the following three steps:
First, the video is divided into segments by a shot segmentation algorithm, and one or more key frames are extracted from each segment;
Then, a group of high-dimensional feature vectors is generated for each key frame using a feature extraction algorithm;
Finally, a similarity between videos is defined with temporal and spatial matching algorithms over the feature vectors, and detection is performed on that basis.
First come shot segmentation and key-frame extraction algorithms. Shot segmentation is also called shot boundary detection. A shot is the sequence of video frames captured by a camera between two operations, from start to stop. Existing shot segmentation algorithms generally fall into two classes. The first class comprises threshold-based methods: when the similarity between two frames drops below a predefined threshold, a boundary is declared; the threshold may be global, adaptive, or a combination of both. The second class comprises methods based on statistical learning, including supervised and unsupervised approaches: supervised algorithms use models such as SVM and AdaBoost, while unsupervised algorithms are mainly clustering methods such as k-means and fuzzy k-means. Key-frame extraction selects from each shot the frames that best represent its content; the features considered include color, edges, shape, MPEG-7 motion descriptors, and so on. These algorithms mainly fall into two classes: frame-sequence comparison methods and global comparison methods.
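The first, threshold-based class can be illustrated with a minimal sketch. The histogram representation, the intersection similarity, and the fixed global threshold below are illustrative assumptions, not the patent's algorithm:

```python
def histogram_similarity(h1, h2):
    """Intersection similarity of two normalized color histograms (in [0, 1])."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def detect_shot_boundaries(frame_histograms, threshold=0.6):
    """Declare a shot boundary wherever the similarity between two
    consecutive frames drops below a predefined global threshold."""
    return [i for i in range(1, len(frame_histograms))
            if histogram_similarity(frame_histograms[i - 1],
                                    frame_histograms[i]) < threshold]

# Toy clip: two near-identical frames, an abrupt content change, then stability.
frames = [
    [0.50, 0.30, 0.20],
    [0.48, 0.32, 0.20],
    [0.10, 0.10, 0.80],  # cut: the third frame starts a new shot
    [0.12, 0.08, 0.80],
]
print(detect_shot_boundaries(frames))  # → [2]
```

An adaptive variant would replace the fixed threshold with one derived from a sliding window of recent similarities, matching the adaptive-threshold option mentioned above.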
After the preprocessing of shot segmentation and key-frame extraction, the basic object of indexing and retrieval is the feature representation of the key frame, i.e. of an image. Such representations fall into two classes, global features and local features, corresponding to different choices of video content representation algorithm and similarity measure. Yeh et al. proposed a 16-dimensional partition descriptor at the global key-frame level together with a corresponding sequence matching algorithm. Chiu et al. combined global and local feature descriptors and detected repeated video with min-hashing and spatio-temporal registration. Shang et al. proposed a binary global spatio-temporal feature and an inverted-file-based method for indexing and fast detection. Pan et al. proposed a joint spatio-temporal feature based on DCT analysis and designed a video copy detection framework on top of it. Wu et al. further considered the motion of local key points, abstracted a trajectory behavior feature, and used Markov-chain models for representation and matching. Liu et al. proposed a repeated-video detection framework combining local SIFT features with the locality-sensitive hashing (LSH) algorithm and the random sample consensus (RANSAC) algorithm. Avrithis et al. expressed local features as visual words and detected duplicates with a RANSAC-like matching algorithm.
SURF is a detector proposed in recent years that represents digital images based on an approximate Hessian; experiments have shown it to be superior in computational efficiency to other local feature representations such as SIFT and PCA-SIFT. The present invention uses SURF features to optimize the corresponding index: the sign of the Laplacian, an intermediate result of the feature computation given by the trace of the Hessian matrix, partitions the bucket space generated by the hash index, and the positions of the interest points are used for feature vector filtering.
Locality-sensitive hashing (LSH) is an efficient algorithm for approximate nearest-neighbor search in high-dimensional spaces. An LSH function family has the following property: objects that are close together collide with higher probability than objects that are far apart. Different LSH function families correspond to different distance metrics.
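For the Euclidean metric, such a family is conventionally built from projections onto 2-stable (Gaussian) random vectors. A minimal sketch; the dimension, bucket width, and seed below are illustrative choices, not the patent's parameters:

```python
import math
import random

def make_lsh_function(dim, w, seed=None):
    """One hash function from the Euclidean LSH family:
    h(p) = floor((a . p + b) / w), where a is drawn from a 2-stable
    (Gaussian) distribution and b is uniform in [0, w)."""
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, w)
    def h(p):
        return math.floor((sum(ai * pi for ai, pi in zip(a, p)) + b) / w)
    return h

h = make_lsh_function(dim=4, w=4.0, seed=7)
p = [0.10, 0.20, 0.30, 0.40]
q = [0.10, 0.20, 0.30, 0.40 + 1e-6]  # a very close neighbour of p
print(h(p) == h(q))  # close points fall into the same bucket with high probability
```

In a full index, k such functions are concatenated into a bucket label and L independent tables are queried, which is the construction the invention adapts below.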
Methods based on local features are more robust than methods based on global features, particularly against transformed videos involving color adjustment, cropping, caption insertion, and transcoding, but they pay a higher computational cost.
In the retrieval procedure of the basic LSH algorithm used by local-feature methods, a query point is hashed into the corresponding buckets of several different hash tables; the distances between the query point and all points in those buckets are then computed, and the closest feature vectors are returned as the retrieval result. We believe that the Euclidean distance computation over high-dimensional feature vectors (e.g. 64-dimensional SURF descriptors) during retrieval consumes a large amount of time and is the performance bottleneck of existing LSH-based algorithms. Network application scenarios impose strict real-time requirements, and repeated-video detection based on multi-layer content analysis must process massive numbers of high-dimensional feature vectors, so processing speed matters more than marginal accuracy. In addition, compared with global-feature algorithms that describe a key frame with a single integrated high-dimensional vector, local-feature algorithms represent each key frame as hundreds of high-dimensional vectors. How to effectively filter and reduce the candidate feature vector sets, and thus the computational load, is therefore an important problem.
Summary of the invention
The technical problem to be solved by the invention is to provide a method and device for detecting repeated video based on a multi-layer representation of semantic content, which index and retrieve the SURF feature vectors of video frames through adaptive locality-sensitive hashing (ADLSH) and effectively estimate the average number of feature vectors in each bucket through parameter learning.
The technical scheme by which the present invention solves the above technical problem is as follows. A method for detecting repeated video based on a multi-layer representation of semantic content comprises the following steps:
Step 1: establish a feature database from the information of the index videos;
Step 2: perform shot detection on the query video to be checked, obtaining multiple query video segments; the query video carries a query video identifier, and each query video segment carries a query video segment identifier;
Step 3: extract key frames from each query video segment, obtaining multiple query key frames, each carrying a query key frame identifier;
Step 4: process each query key frame with the feature extraction algorithm, obtaining a group of query high-dimensional feature vectors, each carrying a query high-dimensional feature vector identifier;
Step 5: hash each query high-dimensional feature vector, obtaining a group of query feature tags;
Step 6: associate each query feature tag with the corresponding query high-dimensional feature vector identifier, query key frame identifier, query video segment identifier and query video identifier, taking these identifiers as the association items of the query feature tag; retrieve the query feature tags and their association items in the feature database, obtaining multiple groups of similar feature tags;
Step 7: according to the position information of each group of feature tags, apply feature filtering to every group of similar feature tags obtained by retrieval, obtaining candidate feature vector sets each containing multiple feature vectors;
Step 8: according to the query key frame identifiers and query video segment identifiers, perform similarity matching on the feature vectors in each candidate feature vector set, obtaining the repeated-video detection result.
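The steps above can be sketched as a single dataflow in which every stage is a pluggable stand-in. The string-based "videos", single-character "features", and majority-vote score below are toy assumptions for illustration, not the patent's algorithms:

```python
def detect_repeats(query_video, feature_db, segment, extract_keyframes,
                   extract_features, hash_tag, filter_candidates, score):
    """Steps 2-8 in one pass: shot segmentation -> key frames -> feature
    vectors -> hashed feature tags -> candidate lookup in the feature
    database -> filtering -> similarity scoring."""
    candidates = []
    for seg in segment(query_video):                        # step 2
        for frame in extract_keyframes(seg):                # step 3
            for vec in extract_features(frame):             # step 4
                tag = hash_tag(vec)                         # step 5
                candidates.extend(feature_db.get(tag, []))  # step 6
    return score(filter_candidates(candidates))             # steps 7-8

# Toy run: "videos" are strings, shots are separated by "|", every
# character is one "feature", and the tag is the character itself.
db = {"a": ["vid1"], "b": ["vid1"], "c": ["vid2"]}
result = detect_repeats(
    "ab|ac", db,
    segment=lambda v: v.split("|"),
    extract_keyframes=lambda s: [s],      # one key frame per shot
    extract_features=lambda f: list(f),
    hash_tag=lambda v: v,
    filter_candidates=lambda c: c,        # no-op filter
    score=lambda c: max(set(c), key=c.count) if c else None,
)
print(result)  # → vid1
```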
The beneficial effects of the invention are as follows. The invention studies repeated-video detection based on a multi-layer representation of semantic content, uses SURF descriptors as local features, and designs a new LSH-based index structure. The index exploits the internal characteristics of the SURF descriptor and reduces the computation consumed during retrieval through parameter learning and adaptive settings, while maintaining the scalability and robustness of retrieval. A simple and effective filtering algorithm and a two-layer matching algorithm are applied to the retrieved feature vector sets, further reducing the size of the candidate feature vector sets and generating a relevance score for the whole video; repeated-video detection is performed by thresholding this score.
The algorithm indexes and retrieves the SURF feature vectors of video frames through adaptive locality-sensitive hashing (ADLSH) and effectively estimates the average number of feature vectors in each bucket through parameter learning, thereby avoiding the distance computation over high-dimensional feature vectors that causes the performance bottleneck. Feature filtering and two-layer matching then complete the multi-layer matching from feature vectors to key frames to videos, and the relevance score is returned as the detection result. The algorithm effectively improves the processing speed of repeated-video detection while guaranteeing detection accuracy, outperforming other current algorithms based on locality-sensitive hashing.
On the basis of the above technical scheme, the present invention can be further improved as follows.
Further, step 1 specifically comprises the following steps:
Step 1.1: perform shot detection on each index video, obtaining multiple video segments; each video segment carries a video segment identifier, and the index video carries an index video identifier;
Step 1.2: extract key frames from each video segment, obtaining multiple key frames, each carrying a key frame identifier;
Step 1.3: process each key frame with the feature extraction algorithm, obtaining a group of high-dimensional feature vectors, each carrying a high-dimensional feature vector identifier;
Step 1.4: hash each high-dimensional feature vector, obtaining a group of feature tags;
Step 1.5: associate each feature tag with the corresponding high-dimensional feature vector identifier, key frame identifier, video segment identifier and index video identifier, and store all the associated feature tags in the feature database.
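Steps 1.1 through 1.5 amount to building an inverted index from feature tags to association items. A minimal sketch, with toy stand-ins for segmentation, key-frame extraction, feature extraction, and hashing:

```python
from collections import defaultdict

def build_feature_db(index_videos, segment, extract_keyframes,
                     extract_features, hash_tag):
    """Store every hashed feature tag together with its association
    items (vector id, key frame id, segment id, video id)."""
    db = defaultdict(list)
    for video_id, video in index_videos.items():
        for seg_id, seg in enumerate(segment(video)):                   # 1.1
            for kf_id, frame in enumerate(extract_keyframes(seg)):      # 1.2
                for vec_id, vec in enumerate(extract_features(frame)):  # 1.3
                    tag = hash_tag(vec)                                 # 1.4
                    db[tag].append((vec_id, kf_id, seg_id, video_id))   # 1.5
    return db

# Toy run with string stand-ins: one indexed video, two shots.
db = build_feature_db(
    {"vid1": "ab|c"},
    segment=lambda v: v.split("|"),
    extract_keyframes=lambda s: [s],
    extract_features=lambda f: list(f),
    hash_tag=lambda v: v,
)
print(dict(db))  # → {'a': [(0, 0, 0, 'vid1')], 'b': [(1, 0, 0, 'vid1')], 'c': [(0, 0, 1, 'vid1')]}
```

A production store would keep these associations in a database table keyed by the bucket label rather than an in-memory dict.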
Further, step 5 specifically comprises the following steps:
Step 5.1: represent each query high-dimensional feature vector with the following sign function:
where p is a 64-dimensional feature vector, and the Hessian matrix is an intermediate result of the feature extraction algorithm;
Step 5.2: the hash function of each query high-dimensional feature vector takes the standard p-stable form
h_{a,b}(p) = ⌊(a·p + b)/W⌋
where a is a 64-dimensional random vector drawn independently from a 2-stable distribution, b is a real number chosen uniformly from [0, W], and the parameter W takes 4 or 8 as its optimal value;
Step 5.3: map each 64-dimensional query high-dimensional feature vector p to one bucket in each of L hash tables:
g_j(p), j = 1, ..., L
The label of each bucket is a k-dimensional vector corresponding to k randomly chosen hash functions:
g_j(p) = (|h_{1,j}(p)|, ..., |h_{k,j}(p)|)
Step 5.4: randomly select m pairs of query high-dimensional feature vectors from those extracted from the query video; the average collision probability over the m pairs is:
The number of query high-dimensional feature vectors in each bucket is:
N_bucket = Σ_n p(c_e)^k ≈ n · E_p(c)^k
where N_bucket · L = n · L · E_p(c)^k ≤ Ratio · n, with Ratio equal to 0.1%;
L is expressed as a function of k:
Solving yields the unique optimal values of k and L.
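The parameter learning of step 5.4 can be illustrated as follows: estimate the average collision probability from m sampled pairs under one hash function, then pick the smallest k for which the expected fraction of vectors sharing a k-dimensional bucket label drops below the target Ratio. The single-coordinate rounding hash and the sample pairs below are toy assumptions, and the sketch solves only for k, not the joint (k, L) optimum:

```python
import math

def estimate_collision_prob(pairs, h):
    """Empirical collision probability of m sampled feature-vector
    pairs under one hash function h."""
    return sum(1 for p, q in pairs if h(p) == h(q)) / len(pairs)

def choose_k(collision_prob, ratio=0.001):
    """Smallest k with collision_prob**k <= ratio: the expected number
    of vectors per k-dimensional bucket label, n * E[p(c)]**k, then
    stays below ratio * n (n cancels out of the inequality)."""
    return math.ceil(math.log(ratio) / math.log(collision_prob))

# Toy stand-in hash over 1-D "features": round to the nearest integer.
h = lambda v: math.floor(v[0] + 0.5)
pairs = [([0.1], [0.2]), ([0.9], [1.1]), ([0.4], [0.6])]
prob = estimate_collision_prob(pairs, h)  # 2/3: only the last pair splits
print(choose_k(prob))  # → 18, since (2/3)**18 <= 0.001 < (2/3)**17
```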
Further, step 7 specifically comprises the following steps:
Step 7.1: during query key frame extraction, store the intermediate results as the position information of each feature point;
Step 7.2: treat each query feature tag obtained by hashing as one feature point, and compute, according to the position information of each feature point, the relative distance between every two feature points in the two-dimensional space;
Step 7.3: perform classified statistics according to the query key frame identifiers, obtaining the mean and standard deviation of the relative distances of all feature points in two corresponding key frame images;
Step 7.4: remove as noise points the feature points whose relative distance exceeds the mean by much more than the standard deviation.
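Steps 7.1 through 7.4 can be sketched as a simple statistical outlier filter over feature-point positions. The Euclidean relative distance and the one-standard-deviation cutoff below are illustrative readings of "much more than the standard deviation":

```python
import math
import statistics

def filter_noise_points(points, sigma_factor=1.0):
    """Drop feature points whose mean relative distance to the other
    points exceeds the overall mean by more than sigma_factor standard
    deviations; such distant points are treated as noise (step 7.4)."""
    mean_dists = [
        statistics.mean(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    cutoff = statistics.mean(mean_dists) + sigma_factor * statistics.stdev(mean_dists)
    return [p for p, d in zip(points, mean_dists) if d <= cutoff]

# Four clustered interest points plus one far-away noise point.
pts = [(0, 0), (1, 0), (0, 1), (1, 1), (50, 50)]
print(filter_noise_points(pts))  # → [(0, 0), (1, 0), (0, 1), (1, 1)]
```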
Further, step 8 specifically comprises the following steps:
Step 8.1: according to the query key frame identifiers, hash each matching feature vector in each candidate feature vector set again, and find the key frames that match the query key frames by linear scanning: a key frame whose number of matching feature vectors exceeds a predetermined threshold is a matching key frame;
Step 8.2: for each key frame f_i^q identified by a query video segment identifier, the similarity with a matching key frame is:
where N_m is the number of feature vectors in one bucket corresponding to the matching key frame, and w_{i,j} is the weight of the corresponding bucket, specifically w_{i,j} = 1/N_bucket;
Step 8.3: the relevance score between the query video v_q and an index video v_c is:
where N_frame is the total number of query key frames extracted from the query video; if the relevance score score_c between an index video and the query video exceeds a predetermined threshold S_t, the index video is regarded as a repeated video.
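The two-layer matching of steps 8.2 and 8.3 can be sketched numerically. The bucket-normalised keyframe similarity follows the stated weight w_{i,j} = 1/N_bucket; the simple summation into a video-level score and the example threshold are assumptions:

```python
def keyframe_similarity(matches_per_bucket, bucket_sizes):
    """Per-keyframe similarity: each bucket contributes its number of
    matching vectors N_m weighted by w = 1 / N_bucket."""
    return sum(n_m / n_bucket
               for n_m, n_bucket in zip(matches_per_bucket, bucket_sizes))

def video_score(keyframe_sims, n_query_keyframes):
    """Relevance score of an index video: per-keyframe similarities
    aggregated over the N_frame query key frames."""
    return sum(keyframe_sims) / n_query_keyframes

# Query with two key frames; only the first finds matches in the index video.
sims = [keyframe_similarity([3, 1], [10, 5]),   # 3/10 + 1/5 = 0.5
        keyframe_similarity([0, 0], [10, 5])]   # 0.0
score = video_score(sims, n_query_keyframes=2)
print(score, score > 0.2)  # → 0.25 True  (a duplicate at threshold S_t = 0.2)
```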
Further, a device for detecting repeated video based on a multi-layer representation of semantic content comprises an establishing module, a shot detection module, a key-frame extraction module, a feature extraction module, a hash processing module, an association module, a feature filtering module and a similarity matching module;
the establishing module is used for establishing a feature database from the information of the index videos;
the shot detection module is used for performing shot detection on the query video to be checked, obtaining multiple query video segments; each query video segment carries a query video segment identifier, and the query video carries a query video identifier;
the key-frame extraction module is used for extracting key frames from each query video segment, obtaining multiple query key frames, each carrying a query key frame identifier;
the feature extraction module is used for processing each query key frame with the feature extraction algorithm, obtaining a group of query high-dimensional feature vectors, each carrying a query high-dimensional feature vector identifier;
the hash processing module is used for hashing each query high-dimensional feature vector, obtaining a group of query feature tags;
the association module is used for associating each query feature tag with the corresponding query high-dimensional feature vector identifier, query key frame identifier, query video segment identifier and query video identifier, taking these identifiers as the association items of the query feature tag, and retrieving the query feature tags and their association items in the feature database, obtaining multiple groups of similar feature tags;
the feature filtering module is used for applying feature filtering, according to the position information of each group of feature tags, to every group of similar feature tags obtained by retrieval, obtaining candidate feature vector sets each containing multiple feature vectors;
the similarity matching module is used for performing similarity matching, according to the query key frame identifiers and query video segment identifiers, on the feature vectors in each candidate feature vector set, obtaining the repeated-video detection result.
Further, the establishing module further comprises a detection submodule, a key-frame extraction submodule, a feature extraction submodule, a hash submodule and an association submodule;
the detection submodule is used for performing shot detection on each index video, obtaining multiple video segments; each video segment carries a video segment identifier, and the index video carries an index video identifier;
the key-frame extraction submodule is used for extracting key frames from each video segment, obtaining multiple key frames, each carrying a key frame identifier;
the feature extraction submodule is used for processing each key frame with the feature extraction algorithm, obtaining a group of high-dimensional feature vectors, each carrying a high-dimensional feature vector identifier;
the hash submodule is used for hashing each high-dimensional feature vector, obtaining a group of feature tags;
the association submodule is used for associating each feature tag with the corresponding high-dimensional feature vector identifier, key frame identifier, video segment identifier and index video identifier, and storing all the associated feature tags in the feature database.
Further, the hash processing module further comprises a high-dimensional vector submodule, a hash function submodule, a mapping submodule and an extraction submodule;
the high-dimensional vector submodule is used for representing each query high-dimensional feature vector with the following sign function:
where p is a 64-dimensional feature vector, and the Hessian matrix is an intermediate result of the feature extraction algorithm;
the hash function submodule is used for expressing the hash function of each query high-dimensional feature vector in the standard p-stable form
h_{a,b}(p) = ⌊(a·p + b)/W⌋
where a is a 64-dimensional random vector drawn independently from a 2-stable distribution (for the Euclidean distance, a Gaussian distribution), b is a real number chosen uniformly from [0, W], and the parameter W takes 4 or 8 as its optimal value;
the mapping submodule is used for mapping each 64-dimensional query high-dimensional feature vector p to one bucket in each of L hash tables:
g_j(p), j = 1, ..., L
The label of each bucket is a k-dimensional vector corresponding to k randomly chosen hash functions:
g_j(p) = (|h_{1,j}(p)|, ..., |h_{k,j}(p)|)
the extraction submodule is used for randomly selecting m pairs of query high-dimensional feature vectors from those extracted from the query video; the average collision probability over the m pairs is:
The number of query high-dimensional feature vectors in each bucket is:
N_bucket = Σ_n p(c_e)^k ≈ n · E_p(c)^k
where N_bucket · L = n · L · E_p(c)^k ≤ Ratio · n, with Ratio equal to 0.1%;
L is expressed as a function of k:
Solving yields the unique optimal values of k and L.
Further, the feature filtering module further comprises an intermediate storage submodule, a distance calculation submodule, a classified statistics submodule and a removal submodule;
the intermediate storage submodule is used for storing, during query key frame extraction, the intermediate results as the position information of each feature point;
the distance calculation submodule is used for treating each query feature tag obtained by hashing as one feature point, and computing, according to the position information of each feature point, the relative distance between every two feature points in the two-dimensional space;
the classified statistics submodule is used for performing classified statistics according to the query key frame identifiers, obtaining the mean and standard deviation of the relative distances of all feature points in two corresponding key frame images;
the removal submodule is used for removing as noise points the feature points whose relative distance exceeds the mean by much more than the standard deviation.
Further, the similarity matching module further comprises a traversal submodule, a similarity submodule and a relevance submodule;
the traversal submodule is used for hashing again, according to the query key frame identifiers, each matching feature vector in each candidate feature vector set, and finding by linear scanning the key frames that match the query key frames: a key frame whose number of matching feature vectors exceeds a predetermined threshold is a matching key frame;
the similarity submodule is used for computing, for each key frame f_i^q identified by a query video segment identifier, the similarity with a matching key frame:
where N_m is the number of feature vectors in one bucket corresponding to the matching key frame, and w_{i,j} is the weight of the corresponding bucket, specifically w_{i,j} = 1/N_bucket;
the relevance submodule is used for computing the relevance score between the query video v_q and an index video v_c:
where N_frame is the total number of query key frames extracted from the query video; if the relevance score score_c between an index video and the query video exceeds a predetermined threshold S_t, the index video is regarded as a repeated video.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of the method of the invention;
Fig. 2 is a structural diagram of the device of the invention.
In the drawings, the parts represented by the reference numerals are as follows:
1, establishing module; 1-1, detection submodule; 1-2, key-frame extraction submodule; 1-3, feature extraction submodule; 1-4, hash submodule; 1-5, association submodule; 2, shot detection module; 3, key-frame extraction module; 4, feature extraction module; 5, hash processing module; 5-1, high-dimensional vector submodule; 5-2, hash function submodule; 5-3, mapping submodule; 5-4, extraction submodule; 6, association module; 7, feature filtering module; 7-1, intermediate storage submodule; 7-2, distance calculation submodule; 7-3, classified statistics submodule; 7-4, removal submodule; 8, similarity matching module; 8-1, traversal submodule; 8-2, similarity submodule; 8-3, relevance submodule.
Specific embodiments
The principles and features of the present invention are described below with reference to the drawings; the examples serve only to explain the present invention and are not intended to limit its scope.
As shown in the drawings, Fig. 1 is the flow chart of the steps of the method of the invention and Fig. 2 is the structural diagram of the device of the invention.
Embodiment 1
A method for detecting repeated video based on a multi-layer representation of semantic content comprises the following steps:
Step 1: establish a feature database from the information of the index videos;
Step 2: perform shot detection on the query video to be checked, obtaining multiple query video segments; the query video carries a query video identifier, and each query video segment carries a query video segment identifier;
Step 3: extract key frames from each query video segment, obtaining multiple query key frames, each carrying a query key frame identifier;
Step 4: process each query key frame with the feature extraction algorithm, obtaining a group of query high-dimensional feature vectors, each carrying a query high-dimensional feature vector identifier;
Step 5: hash each query high-dimensional feature vector, obtaining a group of query feature tags;
Step 6: associate each query feature tag with the corresponding query high-dimensional feature vector identifier, query key frame identifier, query video segment identifier and query video identifier, taking these identifiers as the association items of the query feature tag; retrieve the query feature tags and their association items in the feature database, obtaining multiple groups of similar feature tags;
Step 7: according to the position information of each group of feature tags, apply feature filtering to every group of similar feature tags obtained by retrieval, obtaining candidate feature vector sets each containing multiple feature vectors;
Step 8: according to the query key frame identifiers and query video segment identifiers, perform similarity matching on the feature vectors in each candidate feature vector set, obtaining the repeated-video detection result.
Step 1 specifically comprises the following steps:
Step 1.1: perform shot detection on each index video, obtaining multiple video segments; each video segment carries a video segment identifier, and the index video carries an index video identifier;
Step 1.2: extract key frames from each video segment, obtaining multiple key frames, each carrying a key frame identifier;
Step 1.3: process each key frame with the feature extraction algorithm, obtaining a group of high-dimensional feature vectors, each carrying a high-dimensional feature vector identifier;
Step 1.4: hash each high-dimensional feature vector, obtaining a group of feature tags;
Step 1.5: associate each feature tag with the corresponding high-dimensional feature vector identifier, key frame identifier, video segment identifier and index video identifier, and store all the associated feature tags in the feature database.
Step 5 specifically includes the following steps:
Step 5.1: Represent each query high-dimensional feature vector with the following sign function:
sgn(p) = +1 if tr(H) ≥ 0, and −1 otherwise,
where p is a 64-dimensional feature vector and the Hessian matrix H is an intermediate result of the feature extraction algorithm.
Step 5.2: The hash function of each query high-dimensional feature vector is expressed as:
h_{a,b}(p) = ⌊(a·p + b)/W⌋,
where a is a 64-dimensional random vector whose entries are drawn independently from a 2-stable distribution (for Euclidean distance, the Gaussian distribution), b is a real number drawn uniformly from [0, W], and the optimal value of the parameter W is chosen as 4 or 8.
Step 5.3: Map each 64-dimensional query high-dimensional feature vector p to L buckets of L hash tables:
g_j(p), j = 1, ..., L
The label of each bucket is a k-dimensional vector corresponding to k randomly chosen hash functions:
g_j(p) = (|h_{1,j}(p)|, ..., |h_{k,j}(p)|)
Step 5.4: Randomly select m pairs from the query high-dimensional feature vectors extracted from the query video; the average collision probability between the m pairs of query high-dimensional feature vectors is:
Ep(c) = p(c_e),
where c_e is the average Euclidean distance of the m sampled pairs. The number of query high-dimensional feature vectors in each bucket is:
N_bucket = Σ_n p(c_e)^k ≈ n·Ep(c)^k,
subject to N_bucket·L = n·L·Ep(c)^k ≤ Ratio·n, where Ratio is 0.1%.
L is expressed as a function of k:
L(k) = ⌈ln δ / ln(1 − p(1)^k)⌉,
and solving yields the unique optimal values of k and L.
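The hash of step 5.2 and the bucket labels of step 5.3 can be sketched as follows; the code assumes the standard p-stable LSH form h_{a,b}(p) = ⌊(a·p + b)/W⌋ with Gaussian a and uniform b, and the concrete W, k values are illustrative:

```python
import math
import random

random.seed(0)
W = 4        # quantization width, chosen as 4 or 8 in the text
DIM = 64     # SURF descriptor dimensionality
k = 8        # hash functions per bucket label (illustrative value)

def make_hash():
    # a: 64-dim Gaussian (2-stable) random vector, b: uniform on [0, W)
    a = [random.gauss(0.0, 1.0) for _ in range(DIM)]
    b = random.uniform(0.0, W)
    return lambda p: math.floor((sum(x * y for x, y in zip(a, p)) + b) / W)

def bucket_label(p, funcs):
    # Step 5.3: k-dimensional bucket label g_j(p) = (|h_1(p)|, ..., |h_k(p)|)
    return tuple(abs(h(p)) for h in funcs)

funcs = [make_hash() for _ in range(k)]
label = bucket_label([0.1] * DIM, funcs)
```

In the full scheme, L independent groups of k such functions form the L hash tables.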
Step 7 specifically includes the following steps:
Step 7.1: During query key frame extraction, store the intermediate results as the position information of each feature point.
Step 7.2: Treat each query feature label obtained by hashing as one feature point, and compute the relative distance between every two feature points in two-dimensional space from the position information of each feature point.
Step 7.3: Classify and count the distances according to the query key frame identifiers to obtain the mean and standard deviation of the relative distances of all feature points in two corresponding key frame images.
Step 7.4: Remove as noise points the feature points whose relative distance exceeds the mean by far more than the standard deviation.
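Steps 7.2–7.4 amount to a simple statistical outlier filter on matched point positions. A minimal sketch, assuming a cutoff of mean + 2×stdev (the text only says "far more than the standard deviation", so the factor 2 is an assumption):

```python
import math
import statistics

def filter_noise_points(matches):
    # matches: list of ((x1, y1), (x2, y2)) matched feature-point positions
    dists = [math.dist(p, q) for p, q in matches]
    mean = statistics.mean(dists)
    std = statistics.pstdev(dists)
    # Step 7.4: drop pairs whose relative distance exceeds the mean by far
    # more than the standard deviation (cutoff factor 2 is an assumption).
    return [m for m, d in zip(matches, dists) if d <= mean + 2 * std]

pairs = [((0, 0), (1, 0))] * 9 + [((0, 0), (50, 0))]   # one obvious outlier
kept = filter_noise_points(pairs)
```

The outlier pair at distance 50 is removed while the nine consistent pairs survive.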
Step 8 specifically includes the following steps:
Step 8.1: According to the query key frame identifiers, hash each matching feature vector in each candidate feature vector set again, and find the key frames matching the query key frames by linear scanning: a key frame whose number of matching feature vectors exceeds a preset threshold is a matching key frame.
Step 8.2: For each key frame f_i^q identified by a query video segment identifier and a matching key frame f_j^c, the similarity is:
sim(f_i^q, f_j^c) = (Σ_{N_d} Σ_L w_{i,j}·N_m) / N_d,
where N_m is the number of feature vectors corresponding to the matching key frame in one bucket, and w_{i,j} is the weight of the corresponding bucket, specifically w_{i,j} = 1/N_bucket.
Step 8.3: The relevance score between the query video v_q and an index video v_c is:
score_c = (Σ_{N_frame} sim(f_i^q, f_j^c)) / N_frame,
where N_frame is the total number of query key frames extracted from the query video. If the relevance score score_c of an index video with respect to the query video exceeds a preset threshold S_t, the index video is regarded as a duplicate video.
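The two scoring levels of steps 8.2–8.3 can be illustrated with a small numeric sketch. The per-bucket match counts, N_bucket, N_d and the threshold 0.6 below are made-up example inputs, not outputs of the actual pipeline, and the sketch assumes the similarity is normalized by N_d:

```python
def keyframe_similarity(matches_per_bucket, n_bucket, n_d):
    # Step 8.2: sim = (sum over buckets of w * N_m) / N_d, with w = 1/N_bucket
    return sum(n_m / n_bucket for n_m in matches_per_bucket) / n_d

def video_score(keyframe_sims):
    # Step 8.3: score_c = (sum of per-key-frame similarities) / N_frame
    return sum(keyframe_sims) / len(keyframe_sims)

sims = [keyframe_similarity([10, 10], n_bucket=10, n_d=2),   # -> 1.0
        keyframe_similarity([5], n_bucket=10, n_d=1)]        # -> 0.5
score = video_score(sims)                                    # -> 0.75
is_duplicate = score > 0.6        # preset threshold S_t (example value)
```

The weight 1/N_bucket damps buckets that collect many vectors, as described in the two-layer matching section.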
A duplicate video detection device based on multi-layer semantic content representation includes a database building module 1, a shot detection module 2, a key frame extraction module 3, a feature extraction module 4, a hashing module 5, an association module 6, a feature filtering module 7 and a similarity matching module 8.
The database building module 1 is used for building the feature database from the information of the index videos.
The shot detection module 2 is used for performing shot detection on the query video to be checked, obtaining multiple query video segments; each query video segment is provided with a query video segment identifier, and the query video is provided with a query video identifier.
The key frame extraction module 3 is used for extracting key frames from each query video segment, obtaining multiple query key frames, each provided with a query key frame identifier.
The feature extraction module 4 is used for processing each query key frame with the feature extraction algorithm, obtaining a group of query high-dimensional feature vectors, each provided with a query high-dimensional feature vector identifier.
The hashing module 5 is used for hashing each query high-dimensional feature vector separately, obtaining a group of query feature labels.
The association module 6 is used for associating each query feature label with its corresponding query high-dimensional feature vector identifier, query key frame identifier, query video segment identifier and query video identifier, taking these identifiers as the associated attributes of the query feature label, and retrieving the query feature labels and their associated attributes in the feature database to obtain multiple groups of similar feature labels.
The feature filtering module 7 is used for performing feature filtering, according to the position information of each group of feature labels, on each group of similar feature labels obtained by retrieval, obtaining a candidate feature vector set containing multiple feature vectors.
The similarity matching module 8 is used for performing similarity matching, according to the query key frame identifiers and query video segment identifiers, on the feature vectors in each candidate feature vector set, obtaining the duplicate video detection result.
The database building module 1 further includes a detection submodule 1-1, a key frame extraction submodule 1-2, a feature extraction submodule 1-3, a hashing submodule 1-4 and an association submodule 1-5.
The detection submodule 1-1 is used for performing shot detection on the index video, obtaining multiple video segments; each video segment is provided with a video segment identifier, and the index video is provided with an index video identifier.
The key frame extraction submodule 1-2 is used for extracting a key frame from each video segment, obtaining multiple key frames, each provided with a key frame identifier.
The feature extraction submodule 1-3 is used for processing each key frame with the feature extraction algorithm, obtaining a group of high-dimensional feature vectors, each provided with a high-dimensional feature vector identifier.
The hashing submodule 1-4 is used for hashing each high-dimensional feature vector separately, obtaining a group of feature labels.
The association submodule 1-5 is used for associating each feature label with its corresponding high-dimensional feature vector identifier, key frame identifier, video segment identifier and index video identifier, and storing all the associated feature labels in the feature database.
The hashing module 5 further includes a high-dimensional vector submodule 5-1, a hash function submodule 5-2, a mapping submodule 5-3 and an extraction submodule 5-4.
The high-dimensional vector submodule 5-1 is used for representing each query high-dimensional feature vector with the following sign function:
sgn(p) = +1 if tr(H) ≥ 0, and −1 otherwise,
where p is a 64-dimensional feature vector and the Hessian matrix H is an intermediate result of the feature extraction algorithm.
The hash function submodule 5-2 expresses the hash function of each query high-dimensional feature vector as:
h_{a,b}(p) = ⌊(a·p + b)/W⌋,
where a is a 64-dimensional random vector whose entries are drawn independently from a 2-stable distribution (for Euclidean distance, the Gaussian distribution), b is a real number drawn uniformly from [0, W], and the optimal value of the parameter W is chosen as 4 or 8.
The mapping submodule 5-3 is used for mapping each 64-dimensional query high-dimensional feature vector p to L buckets of L hash tables:
g_j(p), j = 1, ..., L
The label of each bucket is a k-dimensional vector corresponding to k randomly chosen hash functions:
g_j(p) = (|h_{1,j}(p)|, ..., |h_{k,j}(p)|)
The extraction submodule 5-4 is used for randomly selecting m pairs from the query high-dimensional feature vectors extracted from the query video; the average collision probability between the m pairs of query high-dimensional feature vectors is:
Ep(c) = p(c_e),
where c_e is the average Euclidean distance of the m sampled pairs. The number of query high-dimensional feature vectors in each bucket is:
N_bucket = Σ_n p(c_e)^k ≈ n·Ep(c)^k,
subject to N_bucket·L = n·L·Ep(c)^k ≤ Ratio·n, where Ratio is 0.1%.
L is expressed as a function of k:
L(k) = ⌈ln δ / ln(1 − p(1)^k)⌉,
and solving yields the unique optimal values of k and L.
The feature filtering module 7 further includes an intermediate storage submodule 7-1, a distance calculation submodule 7-2, a classification statistics submodule 7-3 and a removal submodule 7-4.
The intermediate storage submodule 7-1 is used for storing, during query key frame extraction, the intermediate results as the position information of each feature point.
The distance calculation submodule 7-2 is used for treating each query feature label obtained by hashing as one feature point, and computing the relative distance between every two feature points in two-dimensional space from the position information of each feature point.
The classification statistics submodule 7-3 is used for classifying and counting the distances according to the query key frame identifiers, obtaining the mean and standard deviation of the relative distances of all feature points in two corresponding key frame images.
The removal submodule 7-4 is used for removing as noise points the feature points whose relative distance exceeds the mean by far more than the standard deviation.
The similarity matching module 8 further includes a traversal submodule 8-1, a similarity submodule 8-2 and a relevance submodule 8-3.
The traversal submodule 8-1 is used for hashing again, according to the query key frame identifiers, each matching feature vector in each candidate feature vector set, and finding the key frames matching the query key frames by linear scanning: a key frame whose number of matching feature vectors exceeds a preset threshold is a matching key frame.
The similarity submodule 8-2 is used for computing, for each key frame f_i^q identified by a query video segment identifier and a matching key frame f_j^c, the similarity:
sim(f_i^q, f_j^c) = (Σ_{N_d} Σ_L w_{i,j}·N_m) / N_d,
where N_m is the number of feature vectors corresponding to the matching key frame in one bucket, and w_{i,j} is the weight of the corresponding bucket, specifically w_{i,j} = 1/N_bucket.
The relevance submodule 8-3 is used for computing the relevance score between the query video v_q and an index video v_c:
score_c = (Σ_{N_frame} sim(f_i^q, f_j^c)) / N_frame,
where N_frame is the total number of query key frames extracted from the query video. If the relevance score score_c of an index video with respect to the query video exceeds a preset threshold S_t, the index video is regarded as a duplicate video.
In a specific implementation, a sign function is designed for each index feature vector p using a property of the SURF descriptor:
sgn(p) = +1 if tr(H) ≥ 0, and −1 otherwise.
Combining this sign function with the original LSH hash function yields the hash function of ADLSH:
h_{a,b}(p) = ⌊(a·p + b)/W⌋,
where p is a 64-dimensional SURF feature vector, a is a 64-dimensional random vector whose entries are drawn independently from a 2-stable distribution (for Euclidean distance, the Gaussian distribution), and b is a real number drawn uniformly from [0, W]. Each hash function h_{a,b}(p) maps the 64-dimensional vector p to a signed real number. To build the index structure, each point p is mapped to L buckets of L hash tables: g_j(p), j = 1, ..., L. The label of each bucket is a k-dimensional vector corresponding to k randomly chosen hash functions: g_j(p) = (|h_{1,j}(p)|, ..., |h_{k,j}(p)|). In the concrete implementation, we use the absolute value of the hash function, |h_{a,b}(p)|, to represent the bucket label, and split each bucket into two according to the sign function of p. In this way, the number of buckets generated by the ADLSH algorithm is about twice that of the original LSH algorithm, and correspondingly, the number of vectors hashed into each bucket is on average reduced by half.
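The bucket labeling just described can be sketched as follows: the label keeps |h(p)| per hash function and prepends the SURF Laplacian sign (the sign of the Hessian trace), so each plain-LSH bucket splits in two. The p-stable hash form and the concrete k, W values are assumptions of this sketch:

```python
import math
import random

random.seed(1)
W, DIM, k = 4, 64, 6

def make_hash():
    a = [random.gauss(0.0, 1.0) for _ in range(DIM)]
    b = random.uniform(0.0, W)
    return lambda p: math.floor((sum(x * y for x, y in zip(a, p)) + b) / W)

funcs = [make_hash() for _ in range(k)]

def adlsh_label(p, laplacian_sign, funcs):
    # |h(p)| supplies the bucket coordinates; the SURF Laplacian sign splits
    # every plain-LSH bucket in two, so ADLSH has roughly twice the buckets
    # and, on average, half as many vectors per bucket.
    return (laplacian_sign,) + tuple(abs(h(p)) for h in funcs)

p = [0.2] * DIM
```

Two descriptors with identical hash values but opposite Laplacian signs land in different buckets, which is exactly the bucket-splitting effect described above.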
A major issue in the practical application of LSH-based algorithms is the selection of the parameters W, k and L. Most existing algorithms set them experimentally and cannot adapt to the demands of practical applications. The goal of the present invention is, while guaranteeing the locality-sensitive property, to reduce the number of vectors hashed into each bucket according to the concrete situation of the real data, to the point where the in-bucket high-dimensional distance computations can be dispensed with. That is, for a query vector q, all vectors in the L buckets g_1(q), ..., g_L(q) into which q is hashed are taken as the candidate vector set. The size of the candidate vector set is reduced as far as possible while guaranteeing that every neighbor v of q within Euclidean distance R (||q − v||_2 ≤ R) is contained in the candidate set. The following analyzes the constraints relating the parameters of the ADLSH algorithm and proposes an adaptive parameter-learning method.
The ADLSH algorithm solves the R-neighbor search problem with probability 1 − δ, where δ is the failure probability (in the implementation of the invention we take 0.1%). For two vectors p1 and p2, let their distance be c = ||p1 − p2||_2; the probability that the two vectors collide under one hash function is:
p(c) = ∫_0^W (1/c)·f_2(t/c)·(1 − t/W) dt      (1)
where f_2(t) is the probability density function of the positive half of the Gaussian distribution:
f_2(t) = (2/√(2π))·e^(−t²/2)      (2)
(Without loss of generality, assume R = 1; in the R-neighbor search problem, the distances between any vectors of the data set can be scaled by an appropriate ratio into the region of this assumption without affecting the correspondence between data vectors.) To retrieve all vectors within Euclidean distance R, i.e., with c < 1, the following condition must hold:
p(c) ≥ p(1) for all c < 1      (3)
This is the probability condition for a single hash function; for a bucket label that is a k-dimensional vector, the collision probability is:
p(c)^k      (4)
For L hash tables, the probability that a query vector q finds a neighbor within distance 1 is:
Pr_NN[||q − p||_2 ≤ 1] = 1 − (1 − p(1)^k)^L ≥ 1 − δ      (5)
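The single-hash collision probability p(c) and the L-table success probability of formula (5) can be evaluated numerically. The integral form of p(c) used below is the standard p-stable LSH expression and is assumed here, since the original formula images are not reproduced; the k, L values are illustrative:

```python
import math

def f2(t):
    # density of the positive half of a standard Gaussian
    return 2.0 / math.sqrt(2.0 * math.pi) * math.exp(-t * t / 2.0)

def p_collision(c, W, steps=2000):
    # p(c) = integral_0^W (1/c) * f2(t/c) * (1 - t/W) dt  (trapezoidal rule)
    h = W / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        v = (1.0 / c) * f2(t / c) * (1.0 - t / W)
        total += v if 0 < i < steps else v / 2.0
    return total * h

W = 4
p1, p2 = p_collision(1.0, W), p_collision(2.0, W)
success = 1 - (1 - p1 ** 8) ** 10     # formula (5) with k = 8, L = 10
```

As expected, p(c) decreases monotonically with the distance c, which is the locality-sensitive property the parameter analysis relies on.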
For fixed p(c), the optimal value of the parameter W is a function of c, and decreasing W decreases the collision probability p(c) of any two vectors. Likewise, increasing k or decreasing L decreases the probability of finding a neighbor. By further analyzing the relationships among the parameters, the following three steps are designed to complete the adaptive parameter setting:
1) W
Experiments show that for the L values feasible in practical applications (generally less than 10^3), the optimal W value (generally a power of 2) cannot be too small. Provided that the locality-sensitive property holds, all collision probabilities p(c) decrease monotonically with the distance c; for Ep(c) (defined below) to vary significantly, the optimal W value cannot be too large either. Based on these observations and analysis, we choose 4 or 8 as the optimal value of W. Notably, the choice of W is independent of the real data set and requires no learning or correction from the real data.
2) Sample learning and estimation
When applying the algorithm to real data, m pairs of vectors are sampled at random from the n SURF feature vectors extracted from the index videos. The distribution of distances between vectors in the data set is estimated, and the average collision probability between vectors is represented by formula (6):
Ep(c) = p(c_e), where c_e is the average distance of the m sampled pairs      (6)
Note that the c in Ep(c) may vary with the data set and need not satisfy c < 1. The number of vectors in each bucket is estimated with formula (7):
N_bucket = Σ_n p(c_e)^k ≈ n·Ep(c)^k      (7)
3) k and L
Our goal is to reduce as far as possible the total number N_bucket·L of vectors in the L buckets into which a query vector is hashed. A ratio Ratio sets the admissible range of this number according to the requirements of the concrete application. For a data set with n vectors in total, the number of vectors retrieved on average per query vector should not exceed Ratio·n (the present invention uses a Ratio of 0.1%). This gives the constraint:
N_bucket·L = n·L·Ep(c)^k ≤ Ratio·n      (8)
According to formula (5), L can be expressed as a function of k:
L(k) = ⌈ln δ / ln(1 − p(1)^k)⌉
Note that the collision probability in L(k) uses the normalized p(1), not p(c). According to formula (3), p(1) is a definite value for a fixed W. In practical applications, a different Ep(c) is obtained via step 2) depending on the data set; Ratio is preset, an optimal value of k is determined from Ratio, and the optimal L(k) then follows.
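The three-step parameter setting reduces to a small search: given the estimates Ep(c) and p(1), take for each k the smallest L satisfying formula (5), and stop at the first k whose bucket constraint (formula (8)) holds. A minimal sketch; the input values Ep(c) = 0.3, p(1) = 0.8 and n are illustrative, not measured:

```python
import math

def choose_k_L(Ep_c, p1, n, ratio=0.001, delta=0.001, k_max=64):
    # For increasing k, L(k) = ceil(ln(delta) / ln(1 - p1**k)) is the
    # smallest table count meeting formula (5); return the first k where
    # the retrieved-vector constraint n*L*Ep(c)**k <= ratio*n (formula (8))
    # is satisfied.
    for k in range(1, k_max + 1):
        L = math.ceil(math.log(delta) / math.log(1.0 - p1 ** k))
        if n * L * Ep_c ** k <= ratio * n:
            return k, L
    raise ValueError("no feasible (k, L) under the given constraints")

k, L = choose_k_L(Ep_c=0.3, p1=0.8, n=1_000_000)
```

Larger k shrinks the buckets exponentially (via Ep(c)^k) but forces more tables L, so the first feasible k is the optimum under the stated constraint.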
● Feature filtering
Through the bucket splitting of the ADLSH hash algorithm and the three-step adaptive parameter setting, for each query feature vector, a candidate feature vector set of the expected average size is obtained at retrieval time without any in-bucket high-dimensional distance computation. SURF feature matching based on Euclidean distance alone cannot effectively cope with changes or noise in the video key frame images; similar studies typically resort to spatial verification and filtering algorithms such as RANSAC. In the present invention, since the candidate vector set obtained is already significantly smaller than in other approaches, we use a simple distance-based filtering method to remove obviously wrong matching feature vectors. For each pair of retrieved vectors, we use the intermediate result of the SURF feature extraction, namely the position information of the feature points, to compute their relative distance in two-dimensional space. The mean and standard deviation of all feature-point relative distances between two corresponding key frame images are then obtained, and the features whose distance exceeds the mean by far more than the standard deviation are removed as noise points.
The ADLSH and feature filtering methods of the invention yield fewer and more accurately matched feature pairs.
● Two-layer matching method
From the SURF feature vectors obtained through ADLSH and feature filtering, a two-layer matching method further produces the relevance scores of the corresponding videos. During indexing and retrieval, for each SURF feature vector of the query video, the number of corresponding feature points that fall into the same bucket is recorded across feature vectors and across the different hash tables. According to the key frame identifiers, we hash again each matching SURF feature vector obtained, and find the matching key frame for each query key frame by a linear scan: a key frame whose number of matching feature vectors exceeds a preset threshold is regarded as a matching key frame (each key frame generates about 100 SURF feature vectors; our chosen threshold is 60). This gives the detection result at the key frame level. To obtain the detection result at the video level, for each key frame f_i^q of the query video and a matching key frame f_j^c, the similarity is defined as:
sim(f_i^q, f_j^c) = (Σ_{N_d} Σ_L w_{i,j}·N_m) / N_d
where N_d is the total number of feature vectors extracted from the query key frame, and N_m is the number of feature vectors corresponding to the matching key frame in one bucket. w_{i,j} is the weight of the corresponding bucket, used to remove the influence of differing vector counts per bucket, i.e., to reduce the influence of buckets with too many vectors on the similarity; in practice it can simply be set as w_{i,j} = 1/N_bucket. From the above, the relevance score between the query video v_q and an index video v_c is defined as:
score_c = (Σ_{N_frame} sim(f_i^q, f_j^c)) / N_frame
where N_frame is the total number of key frames extracted from the query video. If the relevance score score_c of an index video with respect to the query video exceeds a threshold S_t, the index video is regarded as a duplicate video. In practical applications, the threshold S_t depends on the data set and is set as a trade-off between recall and precision.
The foregoing are only preferred embodiments of the present invention and are not intended to limit the invention; any modification, equivalent substitution, improvement and the like made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (8)

1. A duplicate video detection method based on multi-layer semantic content representation, characterized by comprising the following steps:
Step 1: Build a feature database from the information of the index videos;
Step 2: Perform shot detection on the query video to be checked to obtain multiple query video segments; the query video is provided with a query video identifier, and each query video segment is provided with a query video segment identifier;
Step 3: Extract key frames from each query video segment to obtain multiple query key frames, each provided with a query key frame identifier;
Step 4: Process each query key frame with a feature extraction algorithm to obtain a group of query high-dimensional feature vectors, each provided with a query high-dimensional feature vector identifier;
Step 5: Hash each query high-dimensional feature vector separately to obtain a group of query feature labels;
Step 6: Associate each query feature label with its corresponding query high-dimensional feature vector identifier, query key frame identifier, query video segment identifier and query video identifier, taking these identifiers as the associated attributes of the query feature label; retrieve the query feature labels and their associated attributes in the feature database to obtain multiple groups of similar feature labels;
Step 7: According to the position information of each group of feature labels, perform feature filtering on each group of similar feature labels obtained by retrieval, obtaining a candidate feature vector set containing multiple feature vectors;
Step 8: According to the query key frame identifiers and query video segment identifiers, perform similarity matching on the feature vectors in each candidate feature vector set to obtain the duplicate video detection result;
Wherein Step 1 specifically includes the following steps:
Step 1.1: Perform shot detection on the index video to obtain multiple video segments; each video segment is provided with a video segment identifier, and the index video is provided with an index video identifier;
Step 1.2: Extract a key frame from each video segment to obtain multiple key frames, each provided with a key frame identifier;
Step 1.3: Process each key frame with the feature extraction algorithm to obtain a group of high-dimensional feature vectors, each provided with a high-dimensional feature vector identifier;
Step 1.4: Hash each high-dimensional feature vector separately to obtain a group of feature labels;
Step 1.5: Associate each feature label with its corresponding high-dimensional feature vector identifier, key frame identifier, video segment identifier and index video identifier, and store all the associated feature labels in the feature database.
2. The duplicate video detection method based on multi-layer semantic content representation according to claim 1, characterized in that Step 5 specifically includes the following steps:
Step 5.1: Represent each query high-dimensional feature vector with the following sign function:
sgn(p) = +1 if tr(H) ≥ 0, and −1 otherwise,
where p is a 64-dimensional feature vector and the Hessian matrix H is an intermediate result of the feature extraction algorithm;
Step 5.2: The hash function of each query high-dimensional feature vector is expressed as:
h_{a,b}(p) = ⌊(a·p + b)/W⌋,
where a is a 64-dimensional random vector whose entries are drawn independently from a 2-stable distribution, b is a real number drawn uniformly from [0, W], and the optimal value of the parameter W is chosen as 4 or 8;
Step 5.3: Map each 64-dimensional query high-dimensional feature vector p to L buckets of L hash tables:
g_j(p), j = 1, ..., L
The label of each bucket is a k-dimensional vector corresponding to k randomly chosen hash functions:
g_j(p) = (|h_{1,j}(p)|, ..., |h_{k,j}(p)|)
Step 5.4: Randomly select m pairs from the query high-dimensional feature vectors extracted from the query video; the average collision probability between the m pairs of query high-dimensional feature vectors is:
Ep(c) = p(c_e),
where c_e is the average Euclidean distance of the m sampled pairs;
The number of query high-dimensional feature vectors in each bucket is:
N_bucket = Σ_n p(c_e)^k ≈ n·Ep(c)^k,
subject to N_bucket·L = n·L·Ep(c)^k ≤ Ratio·n, where Ratio is 0.1%;
L is expressed as a function of k:
L(k) = ⌈ln δ / ln(1 − p(1)^k)⌉,
and solving yields the unique optimal values of k and L.
3. The duplicate video detection method based on multi-layer semantic content representation according to claim 1, characterized in that Step 7 specifically includes the following steps:
Step 7.1: During query key frame extraction, store the intermediate results as the position information of each feature point;
Step 7.2: Treat each query feature label obtained by hashing as one feature point, and compute the relative distance between every two feature points in two-dimensional space from the position information of each feature point;
Step 7.3: Classify and count the distances according to the query key frame identifiers to obtain the mean and standard deviation of the relative distances of all feature points in two corresponding key frame images;
Step 7.4: Remove as noise points the feature points whose relative distance exceeds the mean by far more than the standard deviation.
4. The duplicate video detection method based on multi-layer semantic content representation according to claim 1, characterized in that Step 8 specifically includes the following steps:
Step 8.1: According to the query key frame identifiers, hash each matching feature vector in each candidate feature vector set again, and find the key frames matching the query key frames by linear scanning: a key frame whose number of matching feature vectors exceeds a preset threshold is a matching key frame;
Step 8.2: For each key frame f_i^q identified by a query video segment identifier and a matching key frame f_j^c, the similarity is:
sim(f_i^q, f_j^c) = (Σ_{N_d} Σ_L w_{i,j}·N_m) / N_d
where N_m is the number of feature vectors corresponding to the matching key frame in one bucket, w_{i,j} is the weight of the corresponding bucket, specifically w_{i,j} = 1/N_bucket, and N_bucket is the number of query high-dimensional feature vectors in each bucket;
Step 8.3: The relevance score between the query video v_q and an index video v_c is:
score_c = (Σ_{N_frame} sim(f_i^q, f_j^c)) / N_frame
where N_frame is the total number of query key frames extracted from the query video; if the relevance score score_c of an index video with respect to the query video exceeds a preset threshold S_t, the index video is regarded as a duplicate video.
5. A duplicate video detection device based on multi-layer semantic content representation, characterized by including a database building module (1), a shot detection module (2), a key frame extraction module (3), a feature extraction module (4), a hashing module (5), an association module (6), a feature filtering module (7) and a similarity matching module (8);
The database building module (1) is used for building the feature database from the information of the index videos;
The shot detection module (2) is used for performing shot detection on the query video to be checked, obtaining multiple query video segments; each query video segment is provided with a query video segment identifier, and the query video is provided with a query video identifier;
The key frame extraction module (3) is used for extracting key frames from each query video segment, obtaining multiple query key frames, each provided with a query key frame identifier;
The feature extraction module (4) is used for processing each query key frame with the feature extraction algorithm, obtaining a group of query high-dimensional feature vectors, each provided with a query high-dimensional feature vector identifier;
The hashing module (5) is used for hashing each query high-dimensional feature vector separately, obtaining a group of query feature labels;
The association module (6) is used for associating each query feature label with its corresponding query high-dimensional feature vector identifier, query key frame identifier, query video segment identifier and query video identifier, taking these identifiers as the associated attributes of the query feature label, and retrieving the query feature labels and their associated attributes in the feature database to obtain multiple groups of similar feature labels;
The feature filtering module (7) is used for performing feature filtering, according to the position information of each group of feature labels, on each group of similar feature labels obtained by retrieval, obtaining a candidate feature vector set containing multiple feature vectors;
The similarity matching module (8) is used for performing similarity matching, according to the query key frame identifiers and query video segment identifiers, on the feature vectors in each candidate feature vector set, obtaining the duplicate video detection result;
Wherein the database building module (1) further includes a detection submodule (1-1), a key frame extraction submodule (1-2), a feature extraction submodule (1-3), a hashing submodule (1-4) and an association submodule (1-5);
The detection submodule (1-1) is used for performing shot detection on the index video, obtaining multiple video segments; each video segment is provided with a video segment identifier, and the index video is provided with an index video identifier;
The key frame extraction submodule (1-2) is used for extracting a key frame from each video segment, obtaining multiple key frames, each provided with a key frame identifier;
The feature extraction submodule (1-3) is used for processing each key frame with the feature extraction algorithm, obtaining a group of high-dimensional feature vectors, each provided with a high-dimensional feature vector identifier;
The hashing submodule (1-4) is used for hashing each high-dimensional feature vector separately, obtaining a group of feature labels;
The association submodule (1-5) is used for associating each feature label with its corresponding high-dimensional feature vector identifier, key frame identifier, video segment identifier and index video identifier, and storing all the associated feature labels in the feature database.
6. The duplicate video detection device based on multilayer semantic content representation according to claim 5, characterized in that: the hash processing module (5) further comprises a high-dimensional vector submodule (5-1), a hash function submodule (5-2), a mapping submodule (5-3) and an extraction submodule (5-4);
The high-dimensional vector submodule (5-1) is configured to represent each query high-dimensional feature vector using the following sign function:
wherein p is a 64-dimensional high-dimensional feature vector, and the Hessian matrix is an intermediate result produced by the feature extraction algorithm;
The hash function submodule (5-2) is configured to express the hash function of each query high-dimensional feature vector as follows:
wherein a is a 64-dimensional random vector whose components are drawn independently from a 2-stable distribution, b is a real number chosen uniformly at random from [0, W], and the parameter W is set to 4 or 8 as the optimal value;
The mapping submodule (5-3) is configured to map each 64-dimensional query high-dimensional feature vector p into the L buckets of L hash tables:
g_j(p), j = 1, ..., L
The label of each bucket is a k-dimensional vector corresponding to k randomly selected hash functions:
g_j(p) = (|h_{1,j}(p)|, ..., |h_{k,j}(p)|)
The extraction submodule (5-4) is configured to randomly select m pairs of query high-dimensional feature vectors from those extracted from the query video, and to estimate from them the mean collision probability between query high-dimensional feature vectors:
E_p(c) = p(c_e),
The number of query high-dimensional feature vectors in each bucket is:
N_bucket = Σ_n p(c_e)^k ≈ n · E_p(c)^k
wherein N_bucket · L = n · L · E_p(c)^k ≤ Ratio, with Ratio being 0.1%;
L is expressed as a function of k:
Solving this yields the unique optimal values of k and L.
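The bucket mapping of claim 6 can be illustrated with the standard 2-stable LSH construction that the claim text describes. Because the patent's formula images are not reproduced in this text, the concrete hash form floor((a·p + b)/W), and the example values of k and L below, are assumptions for illustration rather than the patented formula.

```python
import numpy as np

rng = np.random.default_rng(0)

D, W = 64, 4     # feature dimension; bucket width W (the patent uses 4 or 8)
k, L = 8, 10     # hashes per table and number of tables (illustrative values)

def make_hash():
    """One 2-stable hash: h(p) = floor((a . p + b) / W)."""
    a = rng.normal(size=D)   # components i.i.d. from a 2-stable (Gaussian) distribution
    b = rng.uniform(0, W)    # offset drawn uniformly from [0, W]
    return lambda p: int(np.floor((a @ p + b) / W))

# L tables, each labelling its buckets with a k-dimensional vector g_j(p).
tables = [[make_hash() for _ in range(k)] for _ in range(L)]

def g(p, j):
    """Bucket label of vector p in table j: (|h_1j(p)|, ..., |h_kj(p)|)."""
    return tuple(abs(h(p)) for h in tables[j])

p = rng.normal(size=D)
labels = [g(p, j) for j in range(L)]
assert len(labels) == L and all(len(lbl) == k for lbl in labels)
```

Nearby vectors tend to receive the same k-dimensional label in at least one of the L tables, which is what lets the device trade k (collision selectivity) against L (recall) when solving for the optimal pair.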
7. The duplicate video detection device based on multilayer semantic content representation according to claim 5, characterized in that: the feature filtering module (7) further comprises an intermediate storage submodule (7-1), a distance calculation submodule (7-2), a classification statistics submodule (7-3) and a removal submodule (7-4);
The intermediate storage submodule (7-1) is configured to store, during query key frame extraction, intermediate results as the position information of each feature point;
The distance calculation submodule (7-2) is configured to treat each query feature tag obtained by hash processing as a feature point, and to calculate the relative distance between every two feature points in two-dimensional space from the position information of each feature point;
The classification statistics submodule (7-3) is configured to perform classification statistics according to the query key frame identifiers, obtaining the mean and standard deviation of all feature point relative distances within two corresponding key frame images;
The removal submodule (7-4) is configured to remove as noise points those feature points whose relative distance exceeds the mean by much more than the standard deviation.
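The relative-distance filtering of claim 7 can be sketched as follows. The claim only states that points whose relative distance exceeds the mean by much more than the standard deviation are removed, so the exact outlier rule (per-point mean distance) and the sigma_factor value here are illustrative assumptions.

```python
import numpy as np

def filter_noise(points, sigma_factor=1.0):
    """Remove feature points whose relative distances are outliers.

    points: (n, 2) positions of feature points in a key frame.
    A point is dropped when its mean distance to the other points
    exceeds mean + sigma_factor * std of all pairwise distances
    (sigma_factor is an illustrative choice, not from the patent).
    """
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))            # pairwise distance matrix
    tri = dist[np.triu_indices(len(pts), k=1)]     # each pair counted once
    mean, std = tri.mean(), tri.std()
    per_point = dist.sum(1) / (len(pts) - 1)       # mean distance of each point
    return pts[per_point <= mean + sigma_factor * std]

cluster = [[0, 0], [1, 0], [0, 1], [1, 1]]
with_outlier = cluster + [[50, 50]]
assert len(filter_noise(with_outlier)) == 4        # far-away point removed
assert len(filter_noise(cluster)) == 4             # tight cluster untouched
```

Correctly matched points between two key frames cluster spatially, so a point far from all others is far more likely to be a hash collision than a true match.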
8. The duplicate video detection device based on multilayer semantic content representation according to claim 5, characterized in that: the similarity matching module (8) further comprises a traversal submodule (8-1), a similarity submodule (8-2) and a correlation submodule (8-3);
The traversal submodule (8-1) is configured to re-hash each matching feature vector in the candidate feature vector set according to each query key frame identifier, and to search for key frames matching a query key frame using a linear scan: a key frame whose number of matching feature vectors exceeds a predetermined threshold is a matching key frame;
The similarity submodule (8-2) is configured to compute, for each key frame f_i^q identified in a query video segment, the similarity to a matching key frame f_j^c as:
sim(f_i^q, f_j^c) = Σ_{N_d} Σ_L w_{i,j} · N_m
wherein N_m is the number of feature vectors that fall into the same bucket as the matching key frame, and w_{i,j} is the weight of the corresponding bucket, specifically w_{i,j} = 1/N_bucket, where N_bucket is the number of query high-dimensional feature vectors in each bucket;
The correlation submodule (8-3) is configured to compute the correlation score between a query video v^q and an index video v^c as:
score_c = ( Σ_{N_frame} sim(f_i^q, f_j^c) ) / N_frame
wherein N_frame is the total number of query key frames extracted from the query video; if the correlation score score_c between an index video and the query video exceeds a predetermined threshold S_t, that index video is reported as a duplicate video.
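The claim-8 scoring can be sketched with toy numbers. The data layout (each shared bucket represented as an (N_m, N_bucket) pair) and the threshold value S_t below are illustrative assumptions, not from the patent.

```python
def frame_similarity(shared_buckets):
    """sim(f_i^q, f_j^c): sum over shared buckets of w * N_m with w = 1/N_bucket."""
    return sum(n_m / n_bucket for n_m, n_bucket in shared_buckets)

def video_score(per_frame_similarities, n_frames):
    """score_c: per-frame similarities averaged over the N_frame query key frames."""
    return sum(per_frame_similarities) / n_frames

# Toy example: 3 query key frames, two of which match a candidate key frame.
sims = [frame_similarity([(4, 8), (2, 4)]),   # 4/8 + 2/4 = 1.0
        frame_similarity([(1, 4)]),           # 1/4 = 0.25
        0.0]                                  # no matching key frame found
score = video_score(sims, n_frames=3)         # (1.0 + 0.25 + 0.0) / 3
S_t = 0.3                                     # illustrative threshold
assert score > S_t                            # candidate reported as a duplicate
```

Averaging over all N_frame query key frames (rather than only the matched ones) penalizes candidates that overlap the query in just a few frames, which is the behaviour the threshold S_t gates on.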
CN201310611187.4A 2013-11-26 2013-11-26 Method and device for detecting repeated video based on semantic content multilayer expression Active CN103617233B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310611187.4A CN103617233B (en) 2013-11-26 2013-11-26 Method and device for detecting repeated video based on semantic content multilayer expression

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310611187.4A CN103617233B (en) 2013-11-26 2013-11-26 Method and device for detecting repeated video based on semantic content multilayer expression

Publications (2)

Publication Number Publication Date
CN103617233A CN103617233A (en) 2014-03-05
CN103617233B true CN103617233B (en) 2017-05-17

Family

ID=50167936

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310611187.4A Active CN103617233B (en) 2013-11-26 2013-11-26 Method and device for detecting repeated video based on semantic content multilayer expression

Country Status (1)

Country Link
CN (1) CN103617233B (en)

Families Citing this family (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103870574B (en) * 2014-03-18 2017-03-08 江苏物联网研究发展中心 Forming label based on the storage of H.264 ciphertext cloud video and indexing means
CN104008395B (en) * 2014-05-20 2017-06-27 中国科学技术大学 A kind of bad video intelligent detection method based on face retrieval
CN106375850B (en) * 2015-07-23 2019-09-13 无锡天脉聚源传媒科技有限公司 A kind of judgment method and device matching video
CN106933861A (en) * 2015-12-30 2017-07-07 北京大唐高鸿数据网络技术有限公司 A kind of customized across camera lens target retrieval method of supported feature
US10318581B2 (en) * 2016-04-13 2019-06-11 Google Llc Video metadata association recommendation
CN107515937B (en) * 2017-08-29 2020-10-27 千寻位置网络有限公司 Differential account classification method and system, service terminal and memory
CN107908647A (en) * 2017-10-10 2018-04-13 天津大学 A kind of scalable video search method based on digital watermarking
CN108259932B (en) * 2018-03-15 2019-10-18 华南理工大学 Robust hashing based on time-space domain polar coordinates cosine transform repeats video detecting method
CN110324660B (en) * 2018-03-29 2021-01-19 北京字节跳动网络技术有限公司 Method and device for judging repeated video
CN108520047B (en) * 2018-04-04 2021-05-14 南京信安融慧网络技术有限公司 Video characteristic information retrieval method
CN108763295B (en) * 2018-04-18 2021-04-30 复旦大学 Video approximate copy retrieval algorithm based on deep learning
CN108566562B (en) * 2018-05-02 2020-09-08 中广热点云科技有限公司 Method for finishing sample sealing by copyright video information structured arrangement
CN108769731B (en) * 2018-05-25 2021-09-24 北京奇艺世纪科技有限公司 Method and device for detecting target video clip in video and electronic equipment
CN109086830B (en) * 2018-08-14 2021-09-10 江苏大学 Typical correlation analysis near-duplicate video detection method based on sample punishment
CN109189991B (en) * 2018-08-17 2021-06-08 百度在线网络技术(北京)有限公司 Duplicate video identification method, device, terminal and computer readable storage medium
CN111382620B (en) * 2018-12-28 2023-06-09 阿里巴巴集团控股有限公司 Video tag adding method, computer storage medium and electronic device
CN110175267B (en) * 2019-06-04 2020-07-07 黑龙江省七星农场 Agricultural Internet of things control processing method based on unmanned aerial vehicle remote sensing technology
CN110377794B (en) * 2019-06-12 2022-04-01 杭州当虹科技股份有限公司 Video feature description and duplicate removal retrieval processing method
CN110443007B (en) * 2019-07-02 2021-07-30 北京瑞卓喜投科技发展有限公司 Multimedia data tracing detection method, device and equipment
CN110490250A (en) * 2019-08-19 2019-11-22 广州虎牙科技有限公司 A kind of acquisition methods and device of artificial intelligence training set
CN110796088B (en) * 2019-10-30 2023-07-04 行吟信息科技(上海)有限公司 Video similarity judging method and device
CN110866563B (en) * 2019-11-20 2022-04-29 咪咕文化科技有限公司 Similar video detection and recommendation method, electronic device and storage medium
CN111294613A (en) * 2020-02-20 2020-06-16 北京奇艺世纪科技有限公司 Video processing method, client and server
CN111368552B (en) * 2020-02-26 2023-09-26 北京市公安局 Specific-field-oriented network user group division method and device
CN111723692B (en) * 2020-06-03 2022-08-09 西安交通大学 Near-repetitive video detection method based on label features of convolutional neural network semantic classification
CN111696105B (en) * 2020-06-24 2023-05-23 北京金山云网络技术有限公司 Video processing method and device and electronic equipment
CN112235599B (en) * 2020-10-14 2022-05-27 广州欢网科技有限责任公司 Video processing method and system
CN112839257B (en) * 2020-12-31 2023-05-09 四川金熊猫新媒体有限公司 Video content detection method, device, server and storage medium
CN112989114B (en) * 2021-02-04 2023-08-29 有米科技股份有限公司 Video information generation method and device applied to video screening
CN113361313A (en) * 2021-02-20 2021-09-07 温州大学 Video retrieval method based on multi-label relation of correlation analysis
CN113065025A (en) * 2021-03-31 2021-07-02 厦门美图之家科技有限公司 Video duplicate checking method, device, equipment and storage medium
CN113779303B (en) * 2021-11-12 2022-02-25 腾讯科技(深圳)有限公司 Video set indexing method and device, storage medium and electronic equipment
WO2024065692A1 (en) * 2022-09-30 2024-04-04 华为技术有限公司 Vector retrieval method and device
CN116188815A (en) * 2022-12-12 2023-05-30 北京数美时代科技有限公司 Video similarity detection method, system, storage medium and electronic equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101159834B (en) * 2007-10-25 2012-01-11 中国科学院计算技术研究所 Method and system for detecting repeatable video and audio program fragment
CN103077203A (en) * 2012-12-28 2013-05-01 青岛爱维互动信息技术有限公司 Method for detecting repetitive audio/video clips

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A fast detection algorithm for duplicate videos; Liu Dawei et al.; Journal of Chinese Computer Systems; 2013-06-15; Vol. 34, No. 06; pp. 1400-1404 *
Fast retrieval and feedback learning for massive videos supporting multilayer representation; Liu Dawei; China Doctoral Dissertations Full-text Database, Information Science and Technology (Monthly); 2013-01-15; Vol. 2013, No. 01; I138-89 *

Also Published As

Publication number Publication date
CN103617233A (en) 2014-03-05

Similar Documents

Publication Publication Date Title
CN103617233B (en) Method and device for detecting repeated video based on semantic content multilayer expression
US11803591B2 (en) Method and apparatus for multi-dimensional content search and video identification
Guo et al. From general to specific: Informative scene graph generation via balance adjustment
CN107133569B (en) Monitoring video multi-granularity labeling method based on generalized multi-label learning
CN109815364A (en) A kind of massive video feature extraction, storage and search method and system
KR20170058263A (en) Methods and systems for inspecting goods
WO2015022020A1 (en) Recognition process of an object in a query image
Thyagharajan et al. Pulse coupled neural network based near-duplicate detection of images (PCNN–NDD)
CN102385592A (en) Image concept detection method and device
CN111126122A (en) Face recognition algorithm evaluation method and device
JP2010231254A (en) Image analyzing device, method of analyzing image, and program
CN115861738A (en) Category semantic information guided remote sensing target detection active sampling method
CN109086830A (en) Typical association analysis based on sample punishment closely repeats video detecting method
JP2012022419A (en) Learning data creation device, learning data creation method, and program
CN115100497A (en) Robot-based method, device, equipment and medium for routing inspection of abnormal objects in channel
CN115082781A (en) Ship image detection method and device and storage medium
CN110287369A (en) A kind of semantic-based video retrieval method and system
CN111241987B (en) Multi-target model visual tracking method based on cost-sensitive three-branch decision
CN107578069B (en) Image multi-scale automatic labeling method
CN111444816A (en) Multi-scale dense pedestrian detection method based on fast RCNN
CN112651996B (en) Target detection tracking method, device, electronic equipment and storage medium
Ovhal et al. Plagiarized image detection system based on CBIR
Zhuang et al. Cross-resolution person re-identification with deep antithetical learning
CN114168780A (en) Multimodal data processing method, electronic device, and storage medium
CN113408356A (en) Pedestrian re-identification method, device and equipment based on deep learning and storage medium

Legal Events

Date Code Title Description
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant