CN101216886B - A shot clustering method based on spectral segmentation theory

Info

Publication number: CN101216886B
Application number: CN2008100560968A
Authority: CN
Inventors: 薛玲; 李超; 钟林; 李欢; 熊璋
Original assignee: Beihang University
Current assignee: Beihang University
Priority date: 2008-01-11
Filing date: 2008-01-11
Publication date: 2010-06-09
Anticipated expiration: 2028-01-11
Also published as: CN101216886A

Abstract

The invention relates to a shot clustering method based on spectral segmentation theory, which applies spectral segmentation to shot clustering and comprises the following steps: extracting a feature vector from each shot to be classified; calculating the similarity between every two classes from the extracted feature vectors; constructing the shot set as a weighted undirected graph; using spectral segmentation to split each shot class into two shot classes according to the similarity between every two classes; using the Bayesian information criterion to judge whether each split is valid, where validly split shot subclasses are split again iteratively and shot classes whose split is invalid become terminal nodes; and finally merging the classes obtained after segmentation to obtain the optimal number of shot classes and the classification result. The invention solves the problem that the optimal number of classes is difficult to estimate in clustering algorithms, and improves the recall and precision of the clustering result by using accurate two-class spectral segmentation; the proposed global merging operation corrects classification errors, thereby effectively avoiding the local optimum problem.

Description

A shot clustering method based on spectral segmentation theory

Technical field

The invention belongs to the field of video content analysis and retrieval, and specifically relates to a method for clustering shots.

Background art

A video shot is a semantically continuous segment of video content. It is the basic structural and semantic unit of video information retrieval, and clustering these units that represent video semantics is the basis of video semantic analysis. Current clustering algorithms can be roughly divided into supervised and unsupervised ones. Supervised clustering trains a classifier on a given sample set; its classification is accurate, but the sample set must be labeled manually. Unsupervised clustering algorithms learn by themselves and need no training samples, but they face difficulties such as determining the optimal number of classes and classification results that are sensitive to the initial partition.

In recent years there has been much research on shot clustering methods. The methods currently used to estimate the optimal number of classes in video shot clustering algorithms are as follows: (1) evaluation criteria based on exhaustive search: using a suitable information criterion as the evaluation standard, this method traverses every possible number of classes to obtain the optimal one; (2) estimation based on merging: a number of classes much larger than the optimal number is first chosen for clustering, and the clusters are then merged with one another according to the principle of minimum loss of information entropy to obtain the optimal number of classes; (3) iterative clustering based on k-means: the k-means algorithm is run iteratively on each clustering result, a suitable information criterion decides whether to stop, and the optimal number of classes is obtained when all iterations have stopped.

The first method is the simplest and its result is the most objective, but its computational complexity is high, and without prior knowledge the search range must be sufficiently large. The second method has to recompute the information entropy loss of merging after every merge; it converges slowly, has high computational complexity, and cannot correct errors in the initial classification. The third method, represented by X-means, has advantages such as fast convergence and low computational complexity, but the k-means algorithm considers only between-class relations, so a single class may be wrongly split apart, and the method provides no correction for this kind of error.

Summary of the invention

The technical problem to be solved by the present invention is to overcome the deficiencies of the prior art by providing a shot clustering method based on spectral segmentation theory. At low complexity, the method solves the problem that the optimal number of classes is difficult to estimate in clustering algorithms; it uses accurate two-class spectral segmentation to improve the recall and precision of the clustering result; and the proposed global merging operation corrects classification errors and avoids the locally optimal solution problem.

The object of the present invention is achieved as follows: a shot clustering method based on spectral segmentation theory comprises the following steps:

(1) extracting a feature vector from each shot to be classified;

(2) calculating the similarity between every two classes from the extracted feature vectors;

(3) constructing the shot set as a weighted undirected graph and, according to the similarity between every two shot classes, using spectral segmentation to split each shot class into two shot classes;

(4) judging with the Bayesian information criterion whether the split is valid; validly split shot subclasses are split again iteratively, and shot classes whose split is invalid become terminal nodes;

(5) for the final output of the spectral segmentation, using the Bayesian information criterion to judge whether two classes are connected, and merging classes according to connectivity to obtain the optimal number of clusters and the clustering result.

In step (1), the feature vector is extracted using an HSV color histogram: the average color histogram over all frames of a shot is computed and used as the feature vector of that shot.

In step (2), the similarity between every two shot classes is calculated as:

e_{ij} = \exp\left( -\frac{\| H_i - H_j \|^2}{2\sigma^2} \right)

where e_{ij} denotes the similarity between shots i and j, H_i and H_j are the color histograms of shots s_i and s_j respectively, and σ is a constant.

The method in step (3) is as follows: the result {A, B} of the spectral segmentation minimizes the following expression globally:

Ncut(A, B) = \frac{cut(A, B)}{assoc(A, V)} + \frac{cut(A, B)}{assoc(B, V)}

where:

cut(A, B) = \sum_{i \in A, j \in B} e_{ij}, \quad assoc(A, V) = \sum_{i \in A, j \in V} e_{ij}

The shot set S is represented as a weighted undirected graph G = (V, E), in which vertex v_i represents shot s_i and e_{ij} denotes the similarity between shots i and j.
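The normalized cut criterion can be illustrated with a small Python sketch. It assumes the usual definitions, in which cut(A, B) sums e_{ij} over i in A and j in B and assoc(A, V) sums e_{ij} over i in A and all vertices j. The function names and the toy matrix are hypothetical and only serve the illustration.

```python
def cut(E, A, B):
    """Total edge weight between vertex sets A and B of similarity matrix E."""
    return sum(E[i][j] for i in A for j in B)


def ncut(E, A, B):
    """Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V)."""
    V = list(A) + list(B)      # A and B together cover all vertices
    c = cut(E, A, B)
    # assoc(X, V) is the same sum as cut(), taken from X to every vertex.
    return c / cut(E, A, V) + c / cut(E, B, V)


# Toy similarity matrix: shots 0 and 1 are alike, shots 2 and 3 are alike.
E = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
]
natural = ncut(E, [0, 1], [2, 3])   # separates the two similar pairs
mixed = ncut(E, [0, 2], [1, 3])     # cuts through both similar pairs
```

In this toy example the partition that keeps similar shots together has the smaller Ncut value, which is the behavior the criterion is meant to reward.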

In step (3), the spectral segmentation is carried out by computing an eigenvector, as follows: first compute the N × N symmetric matrix E whose elements are e_{ij}, where e_{ij} denotes the similarity between shots i and j; from E obtain the diagonal matrix D(E) with d_{ii} = \sum_j e_{ij}; construct the matrix L(E) = D(E)^{-1/2} E D(E)^{-1/2}; select the largest eigenvector of L(E); and determine the segmentation result from the sign of each component of that eigenvector.

The Bayesian information criterion (BIC) in step (4) adopts a spherical Gaussian model, which can optimally fit a given sample data set. BIC computes the posterior probability of the candidate models and uses it as the measure of model fitness, which is more effective than general information criteria that judge by the distance between two model distributions.

In step (5), the Bayesian information criterion is used to merge the classes produced: for any two classes, the BIC model judges whether they belong to two classes; if treating them as one class is better, the two classes are defined as connected. Finally, all connected classes are merged to obtain the final optimal number of clusters and the clustering result.

Compared with the prior art, the present invention has the following advantages:

(1) Existing clustering methods mostly use distance as the classification criterion, and their effect depends on the empirical parameters supplied, so it is difficult for them to provide a general solution. Shot clustering, however, is clustering with an unknown number of classes: the number of classes and the center of each class are difficult to estimate accurately before clustering. The present invention applies spectral graph theory and the Bayesian information criterion to shot clustering. A feature vector is extracted from each shot to be classified; the similarity between every two classes is calculated from the extracted feature vectors; the shot set is constructed as a weighted undirected graph; according to the similarity between every two shot classes, spectral segmentation splits each shot class into two shot classes; and the Bayesian information criterion judges whether the split is valid. Validly split shot subclasses are split again iteratively, and shot classes whose split is invalid become terminal nodes. The invention uses accurate two-class spectral segmentation and, as the stopping criterion, the Bayesian information criterion based on posterior probability, which is more accurate; together these improve the recall and precision of the clustering result.

(2) Unlike traditional shot clustering methods, when the segmentation finishes the present invention takes the final output of the spectral segmentation, uses the Bayesian information criterion to judge whether two classes are connected, and merges classes according to connectivity to obtain the optimal number of clusters and the clustering result. This solves the problem that the optimal number of classes is difficult to estimate in clustering algorithms at low complexity. The proposed global merging operation corrects classification errors and avoids the locally optimal solution problem.

Description of drawings

Fig. 1 is a flowchart of the clustering method based on spectral segmentation theory according to the present invention.

Embodiment

As shown in Fig. 1, the present invention comprises the following steps:

1. Shot class feature vector extraction

Extract the feature vectors of all shots, and take the mean of the feature vectors as the feature vector of a shot class. The present invention uses an HSV color histogram to describe image features, that is, the average color histogram over all frames of a shot is computed as the color feature of that shot.

Color is a low-level physical feature that best reflects the visual character of an image. Color features correlate strongly with the objects or scenes contained in the image and, compared with other visual features, depend less on the size, orientation, and viewing angle of the image itself, so they are more robust. The color histogram is a color feature widely used in image retrieval systems; it describes the probability distribution of the different colors over the whole image. Although it cannot describe the objects in the image or their spatial positions, it is still an efficient descriptor.

The HSV (Hue, Saturation, Value) color space best matches human subjective judgment of color similarity, so an HSV color histogram is used here to describe image features. For each shot, the average color histogram over all of its frames is computed as the color feature of that shot.
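A minimal Python sketch of this per-shot feature follows. It assumes the 12 × 4 × 4 quantization used in this embodiment, frames supplied as lists of (r, g, b) tuples with components in 0..255, and uniform bin edges; the patent does not state the bin boundaries, and the function names are hypothetical.

```python
import colorsys

H_BINS, S_BINS, V_BINS = 12, 4, 4   # 12 x 4 x 4 = 192 bins


def frame_histogram(pixels):
    """Normalized 192-bin HSV histogram of one frame.

    `pixels` is an iterable of (r, g, b) tuples with components in 0..255.
    Uniform bin edges are an assumption of this sketch.
    """
    hist = [0.0] * (H_BINS * S_BINS * V_BINS)
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hb = min(int(h * H_BINS), H_BINS - 1)   # clamp the value 1.0 into the last bin
        sb = min(int(s * S_BINS), S_BINS - 1)
        vb = min(int(v * V_BINS), V_BINS - 1)
        hist[(hb * S_BINS + sb) * V_BINS + vb] += 1.0
        count += 1
    return [x / count for x in hist] if count else hist


def shot_histogram(frames):
    """Average the per-frame histograms over all frames of a shot."""
    hists = [frame_histogram(frame) for frame in frames]
    n = len(hists)
    return [sum(column) / n for column in zip(*hists)]
```

Normalizing each frame histogram before averaging keeps the shot feature independent of frame resolution, which is a design choice of this sketch rather than something the text specifies.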

The present invention uses an HSV color histogram with 12 (H) × 4 (S) × 4 (V) = 192 bins. Let H_i and H_j be the color histograms of shots s_i and s_j respectively; the edge e_{ij} in graph G is then defined as follows:

e_{ij} = \exp\left( -\frac{\| H_i - H_j \|^2}{2\sigma^2} \right)

where σ is a function of the time interval t between shots s_i and s_j. Because the shots in the videos tested in this experiment have no relation to time, σ is set to the constant 0.15.
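The similarity above can be sketched in a few lines of Python, assuming histograms are plain lists of floats and σ is the constant 0.15 from this embodiment. The function names are hypothetical.

```python
import math

SIGMA = 0.15   # constant value of sigma used in this embodiment


def similarity(h_i, h_j, sigma=SIGMA):
    """e_ij = exp(-||H_i - H_j||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(h_i, h_j))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def similarity_matrix(histograms, sigma=SIGMA):
    """Symmetric N x N matrix E of pairwise shot similarities."""
    n = len(histograms)
    return [[similarity(histograms[i], histograms[j], sigma) for j in range(n)]
            for i in range(n)]
```

Identical histograms give a similarity of 1, and the value decays toward 0 as the squared distance grows relative to 2σ².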

2. Bipartition the shot class using spectral segmentation

The shot set S is represented as a weighted undirected graph G = (V, E), in which vertex v_i represents shot s_i and e_{ij} denotes the similarity between shots i and j. The result {A, B} of the spectral segmentation is the partition for which the following expression

Ncut(A, B) = \frac{cut(A, B)}{assoc(A, V)} + \frac{cut(A, B)}{assoc(B, V)}

attains its minimum globally,

where:

cut(A, B) = \sum_{i \in A, j \in B} e_{ij}, \quad assoc(A, V) = \sum_{i \in A, j \in V} e_{ij}

The spectral segmentation algorithm is briefly described as follows: first compute the N × N symmetric matrix E whose elements are e_{ij}; from E obtain the diagonal matrix D(E) with d_{ii} = \sum_j e_{ij}; construct the matrix L(E) = D(E)^{-1/2} E D(E)^{-1/2}; select the largest eigenvector of L(E); and determine the segmentation result from the sign of each component of that eigenvector.

The partition criterion defined by the spectral segmentation algorithm combines the between-class and within-class distance measures, which makes the segmentation result more accurate. The algorithm converts the minimization problem into finding the largest eigenvector of a matrix, for which approximate computation methods are available, which reduces the complexity.
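The procedure can be sketched in pure Python as below. One caveat should be stated plainly: because every e_{ij} is positive, the eigenvector of L(E) with the largest eigenvalue (eigenvalue 1, proportional to the square roots of the d_{ii}) has components of a single sign and cannot split the set. This sketch therefore assumes, as in the standard normalized-cut formulation, that the eigenvector of the next largest eigenvalue is the one whose signs define the partition. The power iteration, the fixed iteration count, and the function name are choices made for this illustration only.

```python
import math
import random


def spectral_bipartition(E, iters=500):
    """Split vertices 0..N-1 into two groups from similarity matrix E."""
    n = len(E)
    d = [sum(row) for row in E]                        # d_ii = sum_j e_ij
    # L = D^{-1/2} E D^{-1/2}
    L = [[E[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]
    # Leading eigenvector of L (eigenvalue 1) is proportional to sqrt(d).
    total = math.sqrt(sum(d))
    u1 = [math.sqrt(x) / total for x in d]

    rng = random.Random(0)                             # fixed seed for repeatability
    x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        # Deflation: remove the component along the leading eigenvector.
        dot = sum(a * b for a, b in zip(x, u1))
        x = [a - dot * b for a, b in zip(x, u1)]
        # Multiply by (L + I) so that all eigenvalues are non-negative.
        y = [x[i] + sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
        length = math.sqrt(sum(v * v for v in y))
        if length == 0.0:
            break
        x = [v / length for v in y]

    # The sign of each component decides the side of the partition.
    A = [i for i in range(n) if x[i] >= 0.0]
    B = [i for i in range(n) if x[i] < 0.0]
    return A, B
```

Which group is returned as A and which as B depends on the arbitrary sign of the eigenvector, so callers should treat the pair as unordered.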

3. Calculate the value of BIC(k=1):

Let the set of video shots to be clustered be S. Let Q denote the queue that temporarily stores shot classes, and RQ the queue that records the final classification result. Apply spectral segmentation to divide S into two shot classes S_1 and S_2 (assuming the optimal number of classes k ≥ 2).

Insert S_1 and S_2 at the tail of queue Q, and take the shot class S_i at the head of Q. Assume that the samples in a shot class follow a spherical Gaussian distribution around the class center. Then, for the sample set X = {x_i : i = 1, ..., R} in shot class S_i, the M-dimensional Gaussian distribution is:

f(\mu_i, V_i; x) = (2\pi)^{-p/2} |V_i|^{-1/2} \exp\left[ -\frac{1}{2} (x - \mu_i)' V_i^{-1} (x - \mu_i) \right]

Calculate the value of BIC(k=1):

BIC(k = 1) = \log L(\hat{\mu}_i, \hat{V}_i; x_i \in S_i) - M \log R

where \hat{\mu}_i and \hat{V}_i are the maximum likelihood estimates of the mean and variance of the sample data set X under the M-dimensional Gaussian distribution, M is the number of parameters in the model, L is the maximum likelihood function of the sample data set X, L(\cdot) = \prod f(\cdot), and R is the number of samples in shot class S_i.

4. Calculate the value of BIC(k=2):

Apply spectral segmentation again to divide S_i into two classes, denoted S_i^{(1)} and S_i^{(2)}. The corresponding samples X = (x_i^{(1)}, x_i^{(2)}) then follow the Gaussian distributions:

x_i^{(1)} \sim N(\mu_i^{(1)}, V_i^{(1)}), \quad x_i^{(2)} \sim N(\mu_i^{(2)}, V_i^{(2)})

Calculate the value of BIC(k=2):

BIC(k = 2) = \log\left[ L(\hat{\mu}_i^{(1)}, \hat{V}_i^{(1)}) \cdot L(\hat{\mu}_i^{(2)}, \hat{V}_i^{(2)}) \right] - 2M \log R

In this case the number of parameters of the model is 2M.
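The two BIC values can be sketched in Python as follows. The sketch assumes a spherical Gaussian with one shared variance per class, takes the penalty terms M log R and 2M log R exactly as written above, and leaves the parameter count M to the caller. Passing M as the feature dimension plus one (mean plus a single variance) is an assumption of this sketch, not something the text states, and the function names are hypothetical.

```python
import math


def spherical_gaussian_loglik(X):
    """Maximized log-likelihood of samples X under one spherical Gaussian.

    X is a list of equal-length feature vectors. The mean and the single
    shared variance are set to their maximum likelihood estimates.
    """
    r, dim = len(X), len(X[0])
    mu = [sum(x[k] for x in X) / r for k in range(dim)]
    sq = sum((x[k] - mu[k]) ** 2 for x in X for k in range(dim))
    var = max(sq / (r * dim), 1e-12)   # floor avoids log(0) for identical samples
    return -0.5 * r * dim * math.log(2.0 * math.pi * var) - sq / (2.0 * var)


def bic_one(X, m):
    """BIC(k=1) = log L - M log R, with R the number of samples in X."""
    return spherical_gaussian_loglik(X) - m * math.log(len(X))


def bic_two(X1, X2, m):
    """BIC(k=2) = log(L1 * L2) - 2M log R, with R = |X1| + |X2|."""
    r = len(X1) + len(X2)
    log_l = spherical_gaussian_loglik(X1) + spherical_gaussian_loglik(X2)
    return log_l - 2 * m * math.log(r)
```

For two well-separated groups of samples `bic_two` exceeds `bic_one` on their union, while for two groups drawn around the same center the extra penalty makes the single-class model score higher.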

5. Judge whether the split is valid:

If BIC(k=2) > BIC(k=1), the class is considered better divided into two classes and the split is valid; insert S_i^{(1)} and S_i^{(2)} into queue Q. If BIC(k=2) ≤ BIC(k=1), the class is considered to need no further division and the split is invalid; insert S_i into queue RQ.
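The queue handling of steps 3 to 5 can be sketched generically in Python. The bipartition routine and the validity test are passed in as functions, so the sketch does not depend on any particular implementation of spectral segmentation or BIC. The names `iterative_split`, `gap_split`, and `gap_valid` are hypothetical, and the two gap-based helpers are toy stand-ins on one-dimensional numbers, not the method of the patent.

```python
from collections import deque


def iterative_split(shots, bipartition, split_is_valid):
    """Return the terminal classes (queue RQ) for a list of shots.

    `bipartition(cls)` returns two sub-lists of `cls`.
    `split_is_valid(cls, a, b)` stands in for the test BIC(k=2) > BIC(k=1).
    """
    s1, s2 = bipartition(shots)        # the first split assumes k >= 2
    Q = deque([s1, s2])                # classes still to be examined
    RQ = []                            # final (terminal) classes
    while Q:
        cls = Q.popleft()              # take the class at the head of Q
        if len(cls) < 2:
            RQ.append(cls)             # a single shot cannot be split further
            continue
        a, b = bipartition(cls)
        if a and b and split_is_valid(cls, a, b):
            Q.extend([a, b])           # valid split: examine both halves later
        else:
            RQ.append(cls)             # invalid split: cls is a terminal node
    return RQ


def gap_split(values):
    """Toy bipartition: cut a sorted list of numbers at its largest gap."""
    v = sorted(values)
    k = max(range(1, len(v)), key=lambda i: v[i] - v[i - 1])
    return v[:k], v[k:]


def gap_valid(cls, a, b):
    """Toy validity test: accept the split only if the gap exceeds 1.0."""
    return min(b) - max(a) > 1.0
```

Using a first-in, first-out queue means classes are examined level by level, matching the description of inserting at the tail of Q and taking from its head.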

6. Merge and adjust the clustering result

Following step 5, let the shot classes in queue RQ be S' = (S'_1, S'_2, ..., S'_k). For any i, j (i, j = 1, ..., k, i ≠ j), calculate BIC(k=2) and BIC(k=1) for S'_i ∪ S'_j. If BIC(k=1) ≥ BIC(k=2), that is, if the two classes are better modeled as a single class, define S'_i and S'_j as connected; otherwise they are not connected. Merge all connected sets and output the optimal number of classes and the classification result.
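Merging all connected sets amounts to finding the connected components of a graph whose vertices are the terminal classes. A minimal Python sketch using union-find is given below; the pairwise test is passed in as a predicate that stands in for the BIC comparison described in this step, and the function name is hypothetical.

```python
def merge_connected(classes, connected):
    """Merge every group of classes linked by the `connected` predicate.

    `classes` is a list of lists; `connected(a, b)` returns True when
    classes a and b should be treated as connected.
    """
    parent = list(range(len(classes)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(classes)):
        for j in range(i + 1, len(classes)):
            if connected(classes[i], classes[j]):
                parent[find(j)] = find(i)      # union the two components

    merged = {}
    for i, cls in enumerate(classes):
        merged.setdefault(find(i), []).extend(cls)
    return list(merged.values())
```

The length of the returned list is the final number of classes. Connectivity is applied transitively here, so two classes that are each connected to a third end up in the same merged class.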

Claims

1. A shot clustering method based on spectral segmentation theory, characterized by comprising the following steps:

(1) extracting a feature vector from each shot to be classified;

2. The shot clustering method based on spectral segmentation theory according to claim 1, characterized in that: in step (1), the feature vector is extracted using an HSV color histogram, and the average color histogram over all frames of a shot is computed as the feature vector of that shot.

3. The shot clustering method based on spectral segmentation theory according to claim 1, characterized in that: in step (2), the similarity between every two shot classes is calculated as:

e_{ij} = \exp\left( -\frac{\| H_i - H_j \|^2}{2\sigma^2} \right)

4. The shot clustering method based on spectral segmentation theory according to claim 1, characterized in that: the method in step (3) is as follows: the result {A, B} of the spectral segmentation minimizes the following expression globally:

Ncut(A, B) = \frac{cut(A, B)}{assoc(A, V)} + \frac{cut(A, B)}{assoc(B, V)}

where:

cut(A, B) = \sum_{i \in A, j \in B} e_{ij}, \quad assoc(A, V) = \sum_{i \in A, j \in V} e_{ij}

5. The shot clustering method based on spectral segmentation theory according to claim 1, characterized in that: in step (3), the spectral segmentation is carried out by computing an eigenvector, as follows: first compute the N × N symmetric matrix E whose elements are e_{ij}, where e_{ij} denotes the similarity between shots i and j; from E obtain the diagonal matrix D(E) with d_{ii} = \sum_j e_{ij}; construct the matrix L(E) = D(E)^{-1/2} E D(E)^{-1/2}; select the largest eigenvector of L(E); and determine the segmentation result from the sign of each component of that eigenvector.

6. The shot clustering method based on spectral segmentation theory according to claim 1, characterized in that: the Bayesian information criterion (BIC) in step (4) adopts a spherical Gaussian model, which can optimally fit a given sample data set; BIC computes the posterior probability of the candidate models and uses it as the measure of model fitness, which is more effective than general information criteria that judge by the distance between two model distributions.

7. The shot clustering method based on spectral segmentation theory according to claim 1, characterized in that: in step (5), the Bayesian information criterion is used to merge the classes produced, that is, for any two classes the BIC model judges whether they belong to two classes; if treating them as one class is better, the two classes are defined as connected; finally, all connected classes are merged to obtain the final optimal number of clusters and the clustering result.