Method for shot retrieval integrating color and motion features
Technical field
The invention belongs to the technical field of video retrieval, and specifically relates to a method for shot retrieval that integrates color and motion features.
Background art
With the growing accumulation of television programs and online digital video, and with the many multimedia applications such as digital libraries, video on demand and distance learning, retrieving the required material quickly from massive video collections has become very important. Traditional keyword-based video retrieval cannot meet the needs of large-scale video retrieval because its descriptive power is limited, it is highly subjective, and it relies on manual annotation. Consequently, content-based video retrieval has been an active research topic since the 1990s, and the development of the MPEG-7 standard (Multimedia Content Description Interface) has also attracted wide attention.
Existing video retrieval methods first perform shot segmentation and take the shot as the basic structural unit and retrieval unit of a video sequence. Key frames are then extracted within each shot to represent its content, and low-level features such as color are extracted from the key frames for indexing and retrieving shots. In this way, content-based shot retrieval is reduced to content-based image retrieval.
The paper "Motion-based Video Representation for Scene Change Detection", published in the International Journal of Computer Vision in 2002 (C. W. Ngo, T. C. Pong and H. J. Zhang, Vol. 50, No. 2, pp. 127-143), proposes decomposing the content changes within a shot into several content-consistent sub-units called sub-shots (subshots). The method mainly comprises the following steps: (1) sub-shots are extracted on the basis of camera motion; (2) different key-frame representations are selected or constructed for sub-shots with different motion: a static sub-shot can be represented by one key frame, a pan sub-shot by a constructed panorama, and a zoom sub-shot by two key frames, one before and one after the zoom; (3) finally, the similarity of two shots is expressed as the mean of the largest and the second-largest of all their key-frame similarities:
$$Sim(s_i, s_j) = \frac{1}{2}\left\{ M(s_i, s_j) + \hat{M}(s_i, s_j) \right\}$$
where $M(s_i, s_j)$ and $\hat{M}(s_i, s_j)$ denote the largest and second-largest key-frame similarity values of the two shots $s_i$ and $s_j$. This sub-shot extraction method takes the redundancy of shot content into account well, but a measure based only on the largest and second-largest values cannot reflect the similarity of the two shots' content comprehensively and objectively.
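The limitation of this measure can be illustrated with a minimal Python sketch. The function name `top_two_mean_similarity` and the sample values are hypothetical and serve only as an illustration; the cited paper's own implementation is not reproduced here.

```python
def top_two_mean_similarity(keyframe_sims):
    """Mean of the largest and second-largest key-frame similarity values."""
    values = sorted(keyframe_sims, reverse=True)
    if len(values) < 2:
        raise ValueError("need at least two key-frame similarity values")
    return (values[0] + values[1]) / 2.0


# Hypothetical key-frame similarities for two candidate shots.
# The remaining values differ widely, yet only the top two are used.
mostly_dissimilar = [0.9, 0.8, 0.1, 0.1]
mostly_similar = [0.9, 0.8, 0.7, 0.7]

print(top_two_mean_similarity(mostly_dissimilar))
print(top_two_mean_similarity(mostly_similar))
```

Both candidates receive the same score because every value below the top two is ignored, which is the shortcoming the text describes.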
Another invention patent application filed by the applicant on July 18, 2003, "A content-based shot retrieval method" (application No. 03150127.3, published February 25, 2004), was the first to apply the Kuhn-Munkres method for optimal matching in graph theory to shot retrieval. That method emphasizes measuring the similarity of two shots comprehensively and objectively under a one-to-one constraint. The similarity measurement of two shots is modeled as a weighted bipartite graph: each frame of a shot is regarded as a node of the bipartite graph, and the similarity value between any pair of frames from the two shots is the weight of the corresponding edge. Under the one-to-one constraint, the Kuhn-Munkres method is used to find the maximum weight of this bipartite graph, which is taken as the similarity value of the two shots. To address retrieval speed, two improved algorithms were proposed. That method improves the accuracy and speed of shot retrieval to a certain extent.
However, all of the above methods share a common problem: they consider only the color features of the video and ignore its motion features. In fact, besides color, motion is also an important feature of video.
Summary of the invention
To address the defect that existing shot retrieval methods use only color features, the object of the invention is to provide a method for shot retrieval that integrates color and motion features. In addition to measuring the color similarity of two shots with an optimal matching method, the method measures their motion similarity with motion histograms. The invention can therefore greatly improve the accuracy of shot retrieval over the prior art, and so provide strong support for retrieving massive amounts of multimedia information.
The object of the invention is achieved as follows: a method for shot retrieval integrating color and motion features, comprising the following steps:
(1) first, shot segmentation is performed on the video database, with the shot taken as the basic structural unit and retrieval unit of the video;
(2) for the color feature, the optimal matching method of graph theory is used to measure the color similarity $Similarity_{Color}(X, Y_k)$ of two shots $X$ and $Y_k$;
(3) for the motion feature, the motion vectors of a shot are extracted directly in the compressed domain, and the motion histogram of the shot is then constructed as follows: (A) the motion angle $angle(i, j)$ serves as the abscissa of the motion histogram; (B) the motion intensity $intensity(i, j)$ serves as the ordinate of the motion histogram, where $i$ and $j$ denote two adjacent frames of the video;
The motion angle represents the direction of a motion vector, and the motion intensity represents the energy or strength of a motion vector. They are computed as follows:
$$angle(i, j) = \arctan\left(\frac{dy_{i,j}}{dx_{i,j}}\right), \qquad intensity(i, j) = \sqrt{dx_{i,j}^{2} + dy_{i,j}^{2}}$$
where $(dx_{i,j}, dy_{i,j})$ denote the horizontal and vertical components of the motion vector. The motion angle is quantized into $n$ angular ranges over $2\pi$, where $n$ is an integer. Then, within a shot, the motion intensities falling in each angular range are accumulated to form the motion histogram $H_X(angle)$ of the shot, whose abscissa is the index of the $n$ angular ranges and whose ordinate is the motion intensity; $X$ denotes the shot and $angle \in [1, n]$. Finally, the motion similarity $Similarity_{Motion}(X, Y_k)$ of two shots $X$ and $Y_k$ is computed from their motion histograms $H_X(angle)$ and $H_{Y_k}(angle)$;
(4) finally, the similarity $Similarity(X, Y_k)$ of the two shots is determined by the sum of the above color similarity $Similarity_{Color}(X, Y_k)$ and motion similarity $Similarity_{Motion}(X, Y_k)$; the larger the value of $Similarity(X, Y_k)$, the more similar the two shots are.
To better achieve the object of the invention, the following technical features may also be added when performing shot retrieval:
In step (2), 3 key frames are extracted at equal intervals within each shot as nodes, and a weighted bipartite graph $G = \{X, Y_k, E_k\}$ is constructed, where $X$ denotes the 3 equally spaced key frames $x_1, x_2, x_3$ of the query shot $X$, $Y_k$ denotes the 3 equally spaced key frames $y_1, y_2, y_3$ of shot $Y_k$ in the video library, and $E_k = \{e_{ij}\}$ is the edge set, in which edge $e_{ij} = (x_i, y_j)$ indicates that $x_i$ is similar to $y_j$ and the weight $\omega_{ij}$ of edge $e_{ij}$ is the similarity value of $x_i$ and $y_j$. Histogram intersection is used to compute $\omega_{ij}$:
$$\omega_{ij} = Intersect(x_i, y_j) = \frac{1}{A(x_i, y_j)} \sum_{h} \sum_{s} \sum_{v} \min\{H_i(h, s, v), H_j(h, s, v)\}$$
where $H_i(h, s, v)$ is the histogram in the HSV color space. The invention computes the histogram of the H, S and V components in an 18 × 3 × 3 three-dimensional space and uses the 162 normalized values as the color feature. $Intersect(x_i, y_j)$ denotes the intersection of the two histograms and is used to judge the similarity of the two key frames; $A(x_i, y_j)$ is the factor that normalizes $Intersect(x_i, y_j)$ to between 0 and 1;
The optimal matching algorithm of graph theory is then used to obtain the optimal matching $M$ of $G = \{X, Y_k, E_k\}$. The weights $\omega_{ij}$ of all edges $e_{ij}$ of $M$ are summed to give the maximum weight $\omega$ of $G = \{X, Y_k, E_k\}$. The invention defines the color similarity of two shots $X$ and $Y_k$ as
$$Similarity_{Color}(X, Y_k) = \frac{\omega}{3}$$
Dividing by 3 normalizes $Similarity_{Color}(X, Y_k)$ to between 0 and 1; the larger the value, the more similar shots $X$ and $Y_k$ are.
The optimal matching algorithm of graph theory described in step (2) is preferably the Kuhn-Munkres method.
In step (4), the similarity $Similarity(X, Y_k)$ of the two shots is determined by the weighted sum of the above color similarity $Similarity_{Color}(X, Y_k)$ and motion similarity $Similarity_{Motion}(X, Y_k)$:
$$Similarity(X, Y_k) = \omega_1 \, Similarity_{Color}(X, Y_k) + \omega_2 \, Similarity_{Motion}(X, Y_k)$$
where $\omega_1$ and $\omega_2$ denote the proportions of $Similarity_{Color}(X, Y_k)$ and $Similarity_{Motion}(X, Y_k)$ in the final similarity $Similarity(X, Y_k)$, with $\omega_1 + \omega_2 = 1$.
The preferred values are $\omega_1 = 0.7$ and $\omega_2 = 0.3$.
The effect of the invention is that, compared with existing shot retrieval methods, the method of the invention, which integrates color and motion features, achieves higher accuracy in shot retrieval.
The invention achieves this technical effect for the following reason. As described in the background art, existing methods consider only the color features of video, whereas motion is also an important feature of video. To address this defect, the invention proposes a shot retrieval method that integrates color and motion features: besides measuring the color similarity of two shots with an optimal matching method, it also measures their motion similarity with motion histograms, and can therefore greatly improve the accuracy of shot retrieval over the prior art. Comparative tests against existing methods that use only color features demonstrate the excellent performance of the invention in shot retrieval.
Description of the drawings
Fig. 1 is a schematic flow chart of the invention;
Fig. 2 shows examples of the 6 semantic classes of query shots used in the comparative experiments;
Fig. 3 shows the retrieval results of the invention for a football shot.
Detailed description of the embodiments
The invention is described in further detail below with reference to the drawings and a specific embodiment.
As shown in Fig. 1, a method for shot retrieval integrating color and motion features comprises the following steps:
1. Shot segmentation
First, the spatio-temporal slice algorithm is used to perform shot segmentation on the video database, with the shot taken as the basic structural unit and retrieval unit of the video. For a detailed description of the spatio-temporal slice algorithm, see "Video Partitioning by Temporal Slice Coherency" [C. W. Ngo, T. C. Pong and R. T. Chin, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 11, No. 8, pp. 941-953, August 2001].
2. Computing the color similarity of two shots
For the color feature, the optimal matching method of graph theory is used to measure the color similarity of two shots as follows:
3 key frames are extracted at equal intervals within each shot as nodes, and a weighted bipartite graph $G = \{X, Y_k, E_k\}$ is constructed, where $X$ denotes the 3 equally spaced key frames $x_1, x_2, x_3$ of the query shot $X$, $Y_k$ denotes the 3 equally spaced key frames $y_1, y_2, y_3$ of shot $Y_k$ in the video library, and $E_k = \{e_{ij}\}$ is the edge set, in which edge $e_{ij} = (x_i, y_j)$ indicates that $x_i$ is similar to $y_j$ and the weight $\omega_{ij}$ of edge $e_{ij}$ is the similarity value of $x_i$ and $y_j$. Histogram intersection is used to compute $\omega_{ij}$:
$$\omega_{ij} = Intersect(x_i, y_j) = \frac{1}{A(x_i, y_j)} \sum_{h} \sum_{s} \sum_{v} \min\{H_i(h, s, v), H_j(h, s, v)\}$$
where $H_i(h, s, v)$ is the histogram in the HSV color space. The invention computes the histogram of the H, S and V components in an 18 × 3 × 3 three-dimensional space and uses the 162 normalized values as the color feature. $Intersect(x_i, y_j)$ denotes the intersection of the two histograms and is used to judge the similarity of the two key frames; $A(x_i, y_j)$ is the factor that normalizes $Intersect(x_i, y_j)$ to between 0 and 1;
The optimal matching algorithm of graph theory is then used to obtain the optimal matching $M$ of $G = \{X, Y_k, E_k\}$. The weights $\omega_{ij}$ of all edges $e_{ij}$ of $M$ are summed to give the maximum weight $\omega$ of $G = \{X, Y_k, E_k\}$. The invention defines the color similarity of two shots $X$ and $Y_k$ as
$$Similarity_{Color}(X, Y_k) = \frac{\omega}{3}$$
Dividing by 3 normalizes $Similarity_{Color}(X, Y_k)$ to between 0 and 1; the larger the value, the more similar shots $X$ and $Y_k$ are.
The optimal matching algorithm adopted in this embodiment is the Kuhn-Munkres method; for its details, see the patent document "A content-based shot retrieval method" (application No. 03150127.3, published February 25, 2004).
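The matching step can be sketched in Python as follows. Note the substitution: because each shot contributes only 3 key frames, this sketch enumerates all 3! = 6 one-to-one assignments instead of implementing the Kuhn-Munkres method; both yield the same maximum weight $\omega$. The function names and the sample weight matrix are hypothetical.

```python
from itertools import permutations


def max_weight_matching(weights):
    """Exhaustively find the one-to-one matching of maximum total weight.

    weights[i][j] is the similarity of key frame x_i and key frame y_j.
    Returns (total_weight, matching) where matching[i] = j pairs x_i with y_j.
    ASSUMPTION: brute force stands in for Kuhn-Munkres; it is only practical
    for very small graphs such as the 3 x 3 case described in the text.
    """
    n = len(weights)
    best_total, best_perm = float("-inf"), None
    for perm in permutations(range(n)):
        total = sum(weights[i][perm[i]] for i in range(n))
        if total > best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)


def color_similarity(weights):
    """Maximum matching weight divided by the number of key frames (3)."""
    total, _ = max_weight_matching(weights)
    return total / len(weights)


# Hypothetical key-frame similarities between a query shot and a library shot.
w = [
    [0.90, 0.80, 0.10],
    [0.85, 0.10, 0.10],
    [0.10, 0.10, 0.50],
]
total, matching = max_weight_matching(w)
print(matching, total, color_similarity(w))
```

In this example, greedily pairing $x_1$ with its best match $y_1$ would force $x_2$ onto a weak edge; the optimal matching instead pairs $x_1$ with $y_2$ and $x_2$ with $y_1$, which is why a global matching is used rather than per-frame maxima.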
3. Computing the motion similarity of two shots
For the motion feature, the motion vectors of a shot are extracted directly in the compressed domain, and the motion histogram of the shot is then constructed as follows: (A) the motion angle $angle(i, j)$ serves as the abscissa of the motion histogram; (B) the motion intensity $intensity(i, j)$ serves as the ordinate of the motion histogram, where $i$ and $j$ denote two adjacent frames of the video. The motion angle represents the direction of a motion vector, and the motion intensity represents the energy or strength of a motion vector. They are computed as follows:
$$angle(i, j) = \arctan\left(\frac{dy_{i,j}}{dx_{i,j}}\right), \qquad intensity(i, j) = \sqrt{dx_{i,j}^{2} + dy_{i,j}^{2}}$$
where $(dx_{i,j}, dy_{i,j})$ denote the horizontal and vertical components of the motion vector. The motion angle is quantized into $n$ angular ranges over $2\pi$. Then, within a shot, the motion intensities falling in each angular range are accumulated to form the motion histogram $H_X(angle)$ of the shot, whose abscissa is the index of the $n$ angular ranges and whose ordinate is the motion intensity; $X$ denotes the shot and $angle \in [1, n]$. In this embodiment, $n = 8$; in addition, only the P-frames of the video are considered, in order to reduce computational complexity and increase speed. Finally, the motion similarity $Similarity_{Motion}(X, Y_k)$ of two shots $X$ and $Y_k$ is computed from their motion histograms $H_X(angle)$ and $H_{Y_k}(angle)$.
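The construction of the motion histogram can be sketched in Python as follows. Several points are assumptions rather than statements of the text: motion vectors are supplied as plain `(dx, dy)` pairs already decoded from the P-frames, `atan2` is used so that the angle covers the full $2\pi$ range, zero vectors are skipped, and the two histograms are compared by histogram intersection normalized by the larger histogram sum. The function names are hypothetical.

```python
import math


def motion_histogram(motion_vectors, n_bins=8):
    """Accumulate motion intensity per quantized motion angle.

    motion_vectors: iterable of (dx, dy) pairs for one shot.
    n_bins: number of angular ranges over 2*pi (n = 8 in the embodiment).
    """
    hist = [0.0] * n_bins
    bin_width = 2.0 * math.pi / n_bins
    for dx, dy in motion_vectors:
        if dx == 0 and dy == 0:
            continue  # ASSUMPTION: a zero vector has no direction, skip it
        angle = math.atan2(dy, dx) % (2.0 * math.pi)  # direction in [0, 2*pi]
        intensity = math.hypot(dx, dy)                # sqrt(dx^2 + dy^2)
        index = min(int(angle / bin_width), n_bins - 1)
        hist[index] += intensity
    return hist


def motion_similarity(hist_x, hist_y):
    """ASSUMPTION: histogram intersection normalized by the larger sum."""
    norm = max(sum(hist_x), sum(hist_y))
    if norm == 0:
        return 0.0  # neither shot contains any motion
    return sum(min(a, b) for a, b in zip(hist_x, hist_y)) / norm


# Hypothetical shots: one moving right and up, one moving only right.
shot_a = motion_histogram([(1, 0), (1, 0), (0, 2)])
shot_b = motion_histogram([(2, 0), (2, 0)])
print(shot_a, shot_b, motion_similarity(shot_a, shot_b))
```

Normalizing by the larger sum keeps the result between 0 and 1 and lowers the score when two shots contain very different amounts of motion; other normalizations are possible.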
4. Computing the overall similarity of two shots
Finally, the similarity of two shots is determined by the weighted sum of the above color similarity and motion similarity:
$$Similarity(X, Y_k) = \omega_1 \, Similarity_{Color}(X, Y_k) + \omega_2 \, Similarity_{Motion}(X, Y_k)$$
where $Similarity(X, Y_k)$ denotes the similarity of the two shots $X$ and $Y_k$, $Similarity_{Color}(X, Y_k)$ denotes their color similarity, $Similarity_{Motion}(X, Y_k)$ denotes their motion similarity, and $\omega_1$ and $\omega_2$ denote the proportions of $Similarity_{Color}(X, Y_k)$ and $Similarity_{Motion}(X, Y_k)$ in the final similarity $Similarity(X, Y_k)$. In this embodiment, $\omega_1 = 0.7$ and $\omega_2 = 0.3$.
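The weighted combination and the resulting ranking can be sketched in Python as follows. The function names, shot identifiers and similarity values are hypothetical; only the weights 0.7 and 0.3 and the condition $\omega_1 + \omega_2 = 1$ come from the text.

```python
def shot_similarity(color_sim, motion_sim, w_color=0.7, w_motion=0.3):
    """Weighted sum of color and motion similarity (weights sum to 1)."""
    if abs(w_color + w_motion - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return w_color * color_sim + w_motion * motion_sim


def rank_shots(candidates):
    """Sort shot ids by decreasing overall similarity to the query shot.

    candidates: dict mapping shot id -> (color_similarity, motion_similarity).
    """
    scores = {sid: shot_similarity(c, m) for sid, (c, m) in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical library shots with (color, motion) similarity to the query.
candidates = {
    "shot_a": (0.9, 0.1),  # close in color, very different motion
    "shot_b": (0.7, 0.9),  # fairly close in color, very similar motion
    "shot_c": (0.2, 0.2),
}
print(rank_shots(candidates))
```

With color alone `shot_a` would rank first; once motion is weighted in, `shot_b` overtakes it, which illustrates how the motion term can change the retrieval order.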
The following experimental results show that the invention achieves higher retrieval accuracy than existing methods, demonstrating its excellent performance in shot retrieval.
The shot retrieval database used in the experiments consists of 3 hours of video in total and contains 3,392 shots. It covers a variety of sports, such as various ball games, weightlifting and swimming, as well as interspersed commercials. In the comparative tests, 6 classes of sports were used as query shots: swimming, judo, volleyball, football, fencing and hockey, as shown in Fig. 2.
To verify the effectiveness of the invention, the following five methods were tested and compared:
(1) the method of the invention, which integrates color and motion features for shot retrieval;
(2) existing method 1: "Algorithm A: constructing the weighted bipartite graph from sub-shots", one of the two algorithms in the applicant's other invention patent application "A content-based shot retrieval method" (application No. 03150127.3, published February 25, 2004);
(3) existing method 2: "Algorithm B: constructing the weighted bipartite graph by equal-interval sampling", the other of the two algorithms in the applicant's other invention patent application "A content-based shot retrieval method" (application No. 03150127.3, published February 25, 2004);
(4) existing method 3: the method of the paper "Motion-based Video Representation for Scene Change Detection", published in the International Journal of Computer Vision in 2002 (C. W. Ngo, T. C. Pong and H. J. Zhang, Vol. 50, No. 2, pp. 127-143);
(5) existing method 4: a shot retrieval method that uses the first frame of each shot as the key frame.
For the color feature, all five methods use the 162 HSV components as the color feature and histogram intersection to measure the similarity of two images, so differences in the final results reflect the methods themselves and can demonstrate the superiority of the invention.
The experiments adopt two evaluation measures from the MPEG-7 standardization activity: the Average Normalized Modified Retrieval Rank (ANMRR) and the Average Recall (AR). AR is similar to traditional recall. Compared with traditional precision, ANMRR reflects not only the proportion of correct retrieval results but also the ranks at which the correct results appear. The smaller the ANMRR value, the nearer the top the correctly retrieved shots are ranked; the larger the AR value, the larger the share of all similar shots that appears among the top K query results (K is the cutoff for retrieval results). Thus a larger AR indicates better recall, and a smaller ANMRR indicates higher retrieval accuracy. Table 2 compares the AR and ANMRR of the above five methods on the 6 semantic shot classes.
Table 2: Comparative experimental results of the invention and the existing methods
Query shot | Present invention AR | Present invention ANMRR | Existing method 1 AR | Existing method 1 ANMRR | Existing method 2 AR | Existing method 2 ANMRR | Existing method 3 AR | Existing method 3 ANMRR | Existing method 4 AR | Existing method 4 ANMRR |
1. Swimming | 0.7551 | 0.3866 | 0.6191 | 0.4926 | 0.7202 | 0.3842 | 0.6247 | 0.4876 | 0.6663 | 0.4466 |
2. Judo | 0.5575 | 0.5271 | 0.5650 | 0.5113 | 0.5263 | 0.5264 | 0.5650 | 0.5073 | 0.5250 | 0.5476 |
3. Volleyball | 0.6112 | 0.4906 | 0.6473 | 0.4384 | 0.5895 | 0.4958 | 0.6502 | 0.4363 | 0.5725 | 0.5092 |
4. Football | 0.7167 | 0.4028 | 0.6334 | 0.4710 | 0.6725 | 0.4246 | 0.6206 | 0.4755 | 0.6767 | 0.4183 |
5. Fencing | 0.8918 | 0.2298 | 0.8020 | 0.2878 | 0.8633 | 0.2343 | 0.7898 | 0.2934 | 0.7408 | 0.3657 |
6. Hockey | 0.6985 | 0.4094 | 0.6761 | 0.4019 | 0.6940 | 0.3899 | 0.6746 | 0.4107 | 0.6313 | 0.4571 |
Mean | 0.7051 | 0.4077 | 0.6572 | 0.4338 | 0.6776 | 0.4092 | 0.6542 | 0.4351 | 0.6354 | 0.4574 |
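The two measures reported in Table 2 can be sketched in Python as follows. The text does not give their formulas, so this sketch assumes the commonly cited MPEG-7 definitions: a cutoff $K(q) = \min(4 \cdot NG(q), 2 \cdot GTM)$, where $NG(q)$ is the number of relevant shots for query $q$ and $GTM$ is the largest $NG(q)$ over all queries, and a penalty rank of $1.25 \cdot K(q)$ for relevant shots not found within the cutoff. The function names and sample data are hypothetical.

```python
def recall_and_nmrr(ranked_ids, relevant_ids, k):
    """Recall within the top k and NMRR for a single query (1-based ranks)."""
    relevant = set(relevant_ids)
    ng = len(relevant)
    position = {sid: r for r, sid in enumerate(ranked_ids, start=1)}
    found = 0
    rank_sum = 0.0
    for sid in relevant:
        r = position.get(sid)
        if r is not None and r <= k:
            found += 1
            rank_sum += r
        else:
            rank_sum += 1.25 * k  # ASSUMPTION: MPEG-7 penalty rank for a miss
    avr = rank_sum / ng                    # average rank
    mrr = avr - 0.5 - ng / 2.0             # modified retrieval rank
    nmrr = mrr / (1.25 * k - 0.5 * (1 + ng))
    return found / ng, nmrr


def average_metrics(queries):
    """Return (AR, ANMRR) over a list of (ranked_ids, relevant_ids) queries."""
    gtm = max(len(set(rel)) for _, rel in queries)
    recalls, nmrrs = [], []
    for ranked, rel in queries:
        k = min(4 * len(set(rel)), 2 * gtm)
        recall, nmrr = recall_and_nmrr(ranked, rel, k)
        recalls.append(recall)
        nmrrs.append(nmrr)
    return sum(recalls) / len(recalls), sum(nmrrs) / len(nmrrs)


# Hypothetical queries: one retrieved perfectly, one that misses everything.
queries = [
    (["a", "b", "c", "d"], ["a", "b"]),
    (["x", "y", "z"], ["p", "q"]),
]
ar, anmrr = average_metrics(queries)
print(ar, anmrr)
```

Under these definitions a perfect ranking gives NMRR 0 and a complete miss gives NMRR 1, which matches the reading of Table 2 that a smaller ANMRR and a larger AR are better.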
As Table 2 shows, averaged over the six classes the invention achieves both a higher AR and a lower ANMRR than every existing method. This is mainly because the existing methods consider only color features, whereas the invention considers motion features in addition to color features: the final similarity of two shots is determined by the weighted sum of their color similarity and motion similarity. The comparative results demonstrate the excellent performance of the invention in video shot retrieval.
Fig. 3 shows the retrieval results of the invention for a football match shot. The results are arranged in descending order of shot similarity, from left to right and from top to bottom. The first image is the query shot itself; since its similarity to itself is the highest, it is ranked first among the query results. As can be seen from Fig. 3, all retrieved results are shots of football matches.
The method of the invention is not limited to the embodiment described above; other embodiments derived by those skilled in the art from the technical solution of the invention likewise fall within the scope of the invention.