CN1477566A - Method for making video search of scenes based on contents - Google Patents

Method for making video search of scenes based on contents

Info

Publication number
CN1477566A
CN1477566A CNA031501265A CN03150126A
Authority
CN
China
Prior art date
Legal status: Granted
Application number
CNA031501265A
Other languages
Chinese (zh)
Other versions
CN1240014C (en)
Inventor
董庆杰
彭宇新
郭宗明
Current Assignee
BEIDA FANGZHENG TECHN INST Co Ltd BEIJING
Inst Of Computer Science & Technology Peking University
Original Assignee
BEIDA FANGZHENG TECHN INST Co Ltd BEIJING
Inst Of Computer Science & Technology Peking University
Priority date
Filing date
Publication date
Application filed by BEIDA FANGZHENG TECHN INST Co Ltd BEIJING and Inst Of Computer Science & Technology Peking University
Priority to CN 03150126 (granted as CN1240014C)
Publication of CN1477566A
Application granted
Publication of CN1240014C
Anticipated expiration
Expired - Fee Related

Landscapes

  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a method for content-based video retrieval of shots. It applies fuzzy clustering analysis to shot retrieval and, compared with existing methods, achieves higher accuracy while maintaining fast retrieval speed.

Description

Method for performing content-based video retrieval on shots
Technical Field
The invention belongs to the technical field of video retrieval, and particularly relates to a method for performing content-based video retrieval on shots.
Background
With significant technological advances in the production, storage, and dissemination of multimedia data, digital video has become an integral part of people's daily lives. The problem people now face is no longer a lack of multimedia content, but how to find the information they need in a multimedia world as vast as the sea. At present, traditional keyword-based video retrieval cannot meet the demands of large-scale video retrieval because of its limited descriptive power, strong subjectivity, and reliance on manual annotation. To make multimedia data easier to search, content-based video analysis and retrieval has been a hot research topic since the 1990s, and the gradual formulation and refinement of the multimedia content description interface MPEG-7 has further promoted the development of content-based video retrieval technology.
In the prior art, as described in the document "A New Approach to Retrieve Video by Example Video Clip" [X.M. Liu, Y.T. Zhuang, and Y.H. Pan, ACM Multimedia, pp. 41-44, 1999], the general approach to video retrieval is to first perform shot boundary detection, with the shot as the basic structural unit and retrieval unit of the video sequence; key frames are then extracted within each shot to represent its content, and low-level features such as color and texture are extracted from the key frames for indexing and retrieval of the shots. Content-based shot retrieval is thus reduced to content-based image retrieval. The problem with this class of methods is that a shot is a temporally continuous sequence of images, and they do not fully exploit the temporal and motion information present in the video. In addition, the document "An efficient algorithm for video sequence matching using the modified Hausdorff distance and the directed divergence" [S.H. Kim and R.-H. Park, IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, no. 7, pp. 592-596, 2002] extracts key frames using the directed divergence between frames and matches shots with the modified Hausdorff distance. Because two thresholds are set when extracting key frames (a threshold on the similarity to the previous frame, and a threshold on the similarity between the current frame and the previous key frame), and a key frame is generated only when both conditions are met simultaneously, the accuracy of key-frame extraction suffers, and with it the accuracy of querying. Moreover, that method uses the YUV color space commonly used in video as its visual feature, which agrees less well with human visual perception than the HSV color space.
Disclosure of Invention
In view of the shortcomings of existing shot retrieval methods, the invention aims to provide a method for performing content-based video retrieval on shots that, building on the prior art, greatly improves the accuracy of content-based shot retrieval while maintaining fast retrieval speed, so that shot retrieval technology can play its full role in today's networked information society.
The purpose of the invention is realized as follows: a method for content-based video retrieval of shots, comprising the steps of:
(1) first, segmenting the video database into shots, taking the shot as the basic structural unit and retrieval unit of the video;
(2) calculating the similarity between pairs of frame images and establishing a fuzzy similarity matrix R as follows: when i = j, let $r_{ij} = 1$; when i ≠ j, let $r_{ij}$ be the similarity between $x_i$ and $y_j$;
(3) calculating the equivalence matrix $\hat{R}$ of the fuzzy similarity matrix R by the transitive closure method;
(4) setting a threshold λ to determine the cut set and performing fuzzy clustering on the transitive closure matrix $\hat{R}$ of R, computing $[x] = \{\, y \mid \hat{R}(x, y) \ge \lambda \,\}$; each set $[x]$ is an equivalence class of the fuzzy clustering, and since all frames within an equivalence class are similar, any frame of each set can be taken as a key frame;
(5) representing shot $s_i$ by its key frames $\{r_{i1}, r_{i2}, \ldots, r_{ik}\}$ and measuring the similarity between two shots by the similarity between their key-frame sets.
Further, the method for segmenting the video database into shots in step (1) is preferably a spatio-temporal slice algorithm. The similarity between $x_i$ and $y_j$ in step (2) can be computed from the intersection of the two image histograms:
$$\mathrm{Intersect}(x_i, y_j) = \frac{1}{A(x_i, y_j)} \sum_h \sum_s \sum_v \min\{H_i(h,s,v),\, H_j(h,s,v)\}$$
$$A(x_i, y_j) = \min\Big\{\sum_h \sum_s \sum_v H_i(h,s,v),\ \sum_h \sum_s \sum_v H_j(h,s,v)\Big\}$$
where $H_i(h,s,v)$ is a histogram in HSV color space: the H, S, V components are quantized into an 18 × 3 × 3 three-dimensional space, and the 162 normalized bin values are used as the color feature. $\mathrm{Intersect}(x_i, y_j)$, the intersection of the two histograms, determines the similarity of two key frames and is normalized to between 0 and 1 by $A(x_i, y_j)$.
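As an illustration of this feature extraction, the following is a minimal Python sketch, assuming each frame is supplied as a NumPy array already converted to HSV with H in [0, 360) and S, V in [0, 1]; the array layout and value ranges are assumptions of the sketch, not part of the invention:

```python
import numpy as np

def hsv_histogram(hsv_frame: np.ndarray) -> np.ndarray:
    """Quantize H, S, V into 18 x 3 x 3 bins; return the 162 normalized values."""
    h = hsv_frame[..., 0].ravel()
    s = hsv_frame[..., 1].ravel()
    v = hsv_frame[..., 2].ravel()
    hist, _ = np.histogramdd(
        np.stack([h, s, v], axis=1),
        bins=(18, 3, 3),
        range=((0.0, 360.0), (0.0, 1.0), (0.0, 1.0)),
    )
    return (hist / hist.sum()).ravel()  # 162-dimensional color feature
```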
Still further, the transitive closure method for computing the equivalence matrix $\hat{R}$ of the fuzzy similarity matrix R in step (3) may employ the squaring method:
$$R \rightarrow R^2 \rightarrow (R^2)^2 \rightarrow \cdots \rightarrow R^{2^k} = \hat{R},$$
whose time complexity is $O(n^3 \log_2 n)$; when n is very large this dominates the total computation time, so the composition of matrices is computed with the optimal fuzzy clustering algorithm based on connected components of a graph, with the recursion:
$$r_{ij}^{(0)} = r_{ij}, \qquad 0 \le i, j \le n$$
$$r_{ij}^{(k)} = \max\big\{\, r_{ij}^{(k-1)},\ \min\big[\, r_{ik}^{(k-1)},\, r_{kj}^{(k-1)} \,\big] \big\}, \qquad 0 \le i, j \le n;\ 0 \le k \le n$$
The time complexity T(n) of this algorithm satisfies $O(n) \le T(n) \le O(n^2)$.
To better achieve the object of the invention, the fuzzy clustering performed on $\hat{R}$ during shot retrieval comprises the following steps:
(1) determining the n samples $X = (x_1, \ldots, x_n)$, the fuzzy similarity relation R, and a cut threshold α;
(2) transforming R into an equivalence matrix by the following computation:
$$R \circ R = R^2, \quad R^2 \circ R^2 = R^4, \quad \ldots, \quad R^{2^k} \circ R^{2^k} = R^{2^{k+1}}$$
until some k satisfies $R^{2^k} = R^{2^{k+1}}$;
in the above formulas, $R \circ R$ denotes the composition of fuzzy relations; given that R is a similarity matrix, it has been proved that such a k must exist, with $k \le \log n$;
(3) computing the sets $[x] = \{\, y \mid \hat{R}(x, y) \ge \alpha \,\}$; each $[x]$ is a fuzzy cluster, and the algorithm ends;
After fuzzy clustering analysis of the n samples, several equivalence classes are obtained, and one sample is selected from each equivalence class as a key frame. The similarity measure between two shots then becomes a similarity measure between their key-frame sets.
In step (5) of the method, the similarity of shots $s_i$ and $s_j$ may be defined as
$$\mathrm{Sim}(s_i, s_j) = \tfrac{1}{2}\big\{ M(s_i, s_j) + \hat{M}(s_i, s_j) \big\},$$
where M is the maximum similarity between key frames and $\hat{M}$ the second-largest similarity between key frames:
$$M(s_i, s_j) = \max_{p}\,\max_{q}\,\{\mathrm{Intersect}(r_{ip}, r_{jq})\}, \qquad \hat{M}(s_i, s_j) = \widehat{\max_{p}}\,\max_{q}\,\{\mathrm{Intersect}(r_{ip}, r_{jq})\},$$
with $\widehat{\max}$ denoting the second-largest value, and $\mathrm{Intersect}(r_i, r_j)$ and $A(r_i, r_j)$ defined over the HSV histograms of the key frames exactly as in step (2).
The invention has the following effect: the method for content-based video retrieval of shots achieves higher accuracy while maintaining fast retrieval speed.
The invention achieves such remarkable technical effects because it divides the shot content into several equivalence classes by fuzzy clustering analysis; the equivalence classes describe the changes of the shot content well, and the similarity between shots is expressed as the similarity between key-frame combinations. The inter-shot similarity measure also accounts for a weakness of representing key frames with HSV color histograms: two key frames with similar color distributions are considered similar even when their contents differ. The average of the maximum and second-largest similarity values is used to enhance the robustness of the algorithm, and the comparative experimental results confirm the effectiveness of the proposed method.
Drawings
FIG. 1 is a schematic flow diagram of the method for content-based video retrieval of shots;
FIG. 2 is a diagram of an example of 7 semantic classes for shot retrieval in experimental comparison;
FIG. 3 is a schematic diagram of the search result of the method of the present invention for swimming shots.
Detailed Description
FIG. 1 shows the general framework of the invention as a flow chart of its steps. As shown in FIG. 1, the shot retrieval method based on fuzzy clustering analysis comprises the following steps:
1. Shot segmentation
First, the video database is segmented into shots using a spatio-temporal slice algorithm, and the shot is taken as the basic structural unit and retrieval unit of the video. A detailed description of the spatio-temporal slice algorithm can be found in the document "Video Partitioning by Temporal Slice Coherency" [C.W. Ngo, T.C. Pong, and R.T. Chin, IEEE Transactions on Circuits and Systems for Video Technology, vol. 11, no. 8, pp. 941-953, August 2001].
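The full algorithm of Ngo et al. analyzes the coherency of slice patterns; the following Python sketch shows only the underlying slice construction and a crude discontinuity test, with the frame source and the threshold as illustrative assumptions:

```python
import numpy as np

def build_slices(frames):
    """Stack the middle row and middle column of every frame over time.

    frames: iterable of (H, W, 3) arrays; returns the horizontal slice
    (T, W, 3) and the vertical slice (T, H, 3). Each shot shows up as a
    band of coherent texture; cuts show up as discontinuities between rows.
    """
    rows, cols = [], []
    for f in frames:
        rows.append(f[f.shape[0] // 2, :, :])
        cols.append(f[:, f.shape[1] // 2, :])
    return np.stack(rows), np.stack(cols)

def rough_cut_candidates(slice_img, thresh=30.0):
    """Flag frame indices where the slice changes abruptly (candidate cuts)."""
    diff = np.abs(np.diff(slice_img.astype(float), axis=0)).mean(axis=(1, 2))
    return np.nonzero(diff > thresh)[0] + 1
```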
2. Establishing fuzzy similarity matrix R
The fuzzy similarity matrix R between the frames within a shot is established as follows: when i = j, let $r_{ij} = 1$; when i ≠ j, let $r_{ij}$ be the similarity between $x_i$ and $y_j$, computed as:
$$\mathrm{Intersect}(x_i, y_j) = \frac{1}{A(x_i, y_j)} \sum_h \sum_s \sum_v \min\{H_i(h,s,v),\, H_j(h,s,v)\}$$
$$A(x_i, y_j) = \min\Big\{\sum_h \sum_s \sum_v H_i(h,s,v),\ \sum_h \sum_s \sum_v H_j(h,s,v)\Big\}$$
$H_i(h,s,v)$ is a histogram in HSV color space: the H, S, V components are quantized into an 18 × 3 × 3 three-dimensional space, and the 162 normalized bin values are used as the color feature values. $\mathrm{Intersect}(x_i, y_j)$, the intersection of the two histograms, determines the similarity of two key frames and is normalized to between 0 and 1 by $A(x_i, y_j)$.
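A sketch of this similarity computation and of assembling the fuzzy similarity matrix R for the frames of one shot, reusing the hsv_histogram helper sketched earlier (the frame list is an assumed input):

```python
import numpy as np

def intersect(hi: np.ndarray, hj: np.ndarray) -> float:
    """Normalized histogram intersection Intersect(x_i, y_j) in [0, 1]."""
    a = min(hi.sum(), hj.sum())              # A(x_i, y_j)
    return float(np.minimum(hi, hj).sum() / a)

def fuzzy_similarity_matrix(frames) -> np.ndarray:
    """R with r_ii = 1 and r_ij the histogram intersection of frames i and j."""
    hists = [hsv_histogram(f) for f in frames]
    n = len(hists)
    r = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            r[i, j] = r[j, i] = intersect(hists[i], hists[j])
    return r
```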
3. Solving the transitive closure of the similarity matrix R to obtain the equivalence matrix $\hat{R}$
In this embodiment, the transitive closure of the similarity matrix is solved by the squaring method:
$$R \rightarrow R^2 \rightarrow (R^2)^2 \rightarrow \cdots \rightarrow R^{2^k} = \hat{R}$$
Its time complexity is $O(n^3 \log_2 n)$; when n is very large this inevitably affects the total computation time. Therefore the composition of matrices is computed with the optimal fuzzy clustering algorithm based on connected components of a graph, with the recursion:
$$r_{ij}^{(0)} = r_{ij}, \qquad 0 \le i, j \le n$$
$$r_{ij}^{(k)} = \max\big\{\, r_{ij}^{(k-1)},\ \min\big[\, r_{ik}^{(k-1)},\, r_{kj}^{(k-1)} \,\big] \big\}, \qquad 0 \le i, j \le n;\ 0 \le k \le n$$
The time complexity T(n) of this algorithm satisfies $O(n) \le T(n) \le O(n^2)$.
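A sketch of the squaring method with the max-min composition; it iterates at most log2(n) times because each squaring doubles the exponent. The graph connected-component algorithm preferred in the text would replace this loop, and this direct version is shown only to make the composition concrete:

```python
import numpy as np

def max_min_compose(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """(a o b)[i, j] = max_k min(a[i, k], b[k, j])."""
    return np.max(np.minimum(a[:, :, None], b[None, :, :]), axis=1)

def transitive_closure(r: np.ndarray) -> np.ndarray:
    """Square R until R^(2^k) = R^(2^(k+1)), giving the equivalence matrix."""
    while True:
        r2 = max_min_compose(r, r)
        if np.array_equal(r2, r):
            return r
        r = r2
```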
4. Setting a threshold λ to determine the cut set and performing fuzzy clustering on the transitive closure matrix $\hat{R}$ of R.
In this embodiment, the specific method is as follows:
(1) determining the n samples $X = (x_1, \ldots, x_n)$, the fuzzy similarity relation R, and a cut threshold α;
(2) transforming R into an equivalence matrix by the following computation:
$$R \circ R = R^2, \quad R^2 \circ R^2 = R^4, \quad \ldots, \quad R^{2^k} \circ R^{2^k} = R^{2^{k+1}}$$
until some k satisfies $R^{2^k} = R^{2^{k+1}}$;
in the above formulas, $R \circ R$ denotes the composition of fuzzy relations; given that R is a similarity matrix, it has been proved that such a k must exist, with $k \le \log n$;
(3) computing the sets $[x] = \{\, y \mid \hat{R}(x, y) \ge \alpha \,\}$; each $[x]$ is a fuzzy cluster, and the algorithm ends.
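A sketch of the alpha-cut step: because $\hat{R}$ is a fuzzy equivalence relation, the cut sets [x] partition the frames, and one representative per class serves as a key frame (picking the first member is an arbitrary choice of the sketch):

```python
import numpy as np

def equivalence_classes(r_hat: np.ndarray, alpha: float):
    """Group frame indices into the equivalence classes [x] of the alpha-cut."""
    n = r_hat.shape[0]
    classes, seen = [], set()
    for x in range(n):
        if x in seen:
            continue
        cls = [y for y in range(n) if r_hat[x, y] >= alpha]
        seen.update(cls)
        classes.append(cls)
    return classes

def key_frames(r_hat: np.ndarray, alpha: float):
    """Pick one frame index per equivalence class as a key frame."""
    return [cls[0] for cls in equivalence_classes(r_hat, alpha)]
```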
5. After the shot key frames have been obtained by the fuzzy clustering analysis, shot retrieval is performed on the basis of the key frames. Shot $s_i$ is represented by its key frames $\{r_{i1}, r_{i2}, \ldots, r_{ik}\}$, and the similarity of shots $s_i$ and $s_j$ is defined as
$$\mathrm{Sim}(s_i, s_j) = \tfrac{1}{2}\big\{ M(s_i, s_j) + \hat{M}(s_i, s_j) \big\},$$
where
$$M(s_i, s_j) = \max_{p}\,\max_{q}\,\{\mathrm{Intersect}(r_{ip}, r_{jq})\}, \qquad \hat{M}(s_i, s_j) = \widehat{\max_{p}}\,\max_{q}\,\{\mathrm{Intersect}(r_{ip}, r_{jq})\}$$
and $\widehat{\max}$ denotes the second-largest value. Because HSV color histograms are used here to represent key frames, two key frames with similar color distributions are considered similar even when their contents differ; to overcome this drawback, the average of M and $\hat{M}$ is taken to enhance the robustness of the algorithm. $H_i(h,s,v)$ is the histogram of the HSV color space, computed over the 18 × 3 × 3 quantization of the H, S, V components with the 162 normalized values as color feature values, and $\mathrm{Intersect}(r_i, r_j)$, the intersection of two histograms normalized by $A(r_i, r_j)$ as above, is used to determine the similarity of two key frames.
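A sketch of this similarity measure over two shots' key-frame histograms, reusing the intersect helper sketched above; falling back to the maximum when a shot has a single key frame (so no second-largest value exists) is an assumption of the sketch:

```python
def shot_similarity(hists_i, hists_j) -> float:
    """Sim(s_i, s_j) = (M + M-hat) / 2, where M is the largest of the
    per-key-frame maxima max_q Intersect(r_ip, r_jq) and M-hat is the
    second-largest of them."""
    per_p = sorted(
        (max(intersect(hi, hj) for hj in hists_j) for hi in hists_i),
        reverse=True,
    )
    m_hat = per_p[1] if len(per_p) > 1 else per_p[0]
    return 0.5 * (per_p[0] + m_hat)
```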
The following experimental results show that the method performs better than the existing methods while retrieving quickly, verifying the effectiveness of the fuzzy clustering analysis algorithm in shot retrieval.
The experimental data for shot retrieval was a sports program from 2002 recorded from television, totaling 41 minutes, 777 shots, and 62,132 frames. It includes a variety of sports, such as various ball games, weightlifting, and swimming, as well as interspersed advertising segments. We selected 7 semantic classes as query shots: weightlifting, volleyball, swimming, judo, rowing, gymnastics, and soccer, as shown in FIG. 2.
To verify the effectiveness of the present invention, we tested the following 3 methods for experimental comparison:
(1) a commonly used shot retrieval algorithm using the first frame of each shot as a key frame;
(2) the algorithm described in the document "An efficient algorithm for video sequence matching using the modified Hausdorff distance and the directed divergence" (S.H. Kim and R.-H. Park, IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, no. 7, pp. 592-596, 2002);
(3) shot retrieval using the key frames obtained by the fuzzy clustering analysis algorithm of the invention (using only color features).
the first 3 methods all use only color features, so the final experimental result can prove the superiority of the method disclosed by the invention from the measurement method of the lens similarity. Fig. 3 shows a user interface of the experimental program, the upper line on the right side is a browsing area of the query video, the 1 st key frame of each shot in the video is displayed to represent each shot, the user can select the shot to be queried to search, and the lower line on the right side is a query result area. Fig. 3 is a view of the 1 st shot selected in the upper row, which is a swimming shot, represented by the first frame image 022430.bmp of the shot, the query results are arranged from large to small (left to right, top to bottom) with the greatest weight of similarity calculated according to the method of the present invention. The lower left side is a simple playing period, and the double-click retrieval result image can play the video corresponding to the corresponding lens.
The experiment used two evaluation criteria from the MPEG-7 standardization activities: the Average Normalized Modified Retrieval Rank (ANMRR) and the Average Recall (AR). AR is similar to the conventional recall; compared with the conventional precision, ANMRR reflects not only the proportion of correct results but also the rank order of the correct results. The smaller the ANMRR, the higher the correct shots are ranked; the larger the AR, the greater the proportion of similar shots retrieved within the top K query results (K being the truncation value of the result list) relative to all similar shots.
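For reference, a sketch of the two measures for a single query follows; it uses one common formulation of NMRR, with ground-truth items ranked below the cutoff K penalized at rank 1.25K, and the exact constants should be checked against the MPEG-7 documents:

```python
def recall_at_k(result_ids, relevant_ids, k):
    """AR component: fraction of all relevant shots found in the top K."""
    return len(set(result_ids[:k]) & set(relevant_ids)) / len(relevant_ids)

def nmrr(result_ids, relevant_ids, k):
    """Normalized modified retrieval rank for one query; ANMRR averages this
    over all queries. Smaller is better."""
    ng = len(relevant_ids)
    ranks = [
        result_ids.index(rid) + 1 if rid in result_ids[:k] else 1.25 * k
        for rid in relevant_ids
    ]
    avr = sum(ranks) / ng
    return (avr - 0.5 - ng / 2) / (1.25 * k - 0.5 - ng / 2)
```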
TABLE 1. Comparative experimental results of the invention and the two existing methods

Classification   Method 1          Method 2          Method 3
                 AR      ANMRR     AR      ANMRR     AR      ANMRR
Weightlifting    0.8824  0.3098    0.8824  0.1539    0.9412  0.2186
Volleyball       0.6333  0.4974    0.7895  0.3264    0.8556  0.3279
Swimming         0.8400  0.2676    0.8250  0.3164    0.9200  0.2175
Judo             0.7000  0.4310    0.8214  0.2393    0.8000  0.3093
Rowing           0.8750  0.3407    0.6875  0.3570    0.8125  0.2223
Gymnastics       0.7857  0.3445    0.9600  0.1759    0.7857  0.2056
Soccer           0.5789  0.4883    0.6889  0.2815    0.8421  0.2614
Mean             0.7565  0.3827    0.8078  0.2642    0.8510  0.2518
As can be seen from Table 1, the method of the invention achieves better results than the two existing algorithms on both AR and ANMRR, confirming the effectiveness of applying fuzzy clustering analysis to shot retrieval. The method divides the shot content into several equivalence classes by fuzzy clustering analysis; the equivalence classes describe the changes of the shot content well, and the similarity between shots is expressed as the similarity between key-frame combinations. The inter-shot similarity measure also accounts for the weakness of representing key frames with HSV color histograms, namely that two key frames with similar color distributions are considered similar even when their contents differ, and the average of the maximum and second-largest similarity values enhances the robustness of the algorithm. The comparative experimental results confirm the effectiveness of the proposed method. In addition, on a PC with a 500 MHz Pentium III CPU and 256 MB of memory, the average retrieval time of the algorithm is 22.557 seconds, which is a fast retrieval speed for a video library of 777 shots.

Claims (6)

1. A method for performing content-based video retrieval on shots, the method comprising the steps of:
(1) first, segmenting the video database into shots, taking the shot as the basic structural unit and retrieval unit of the video;
(2) calculating the similarity between pairs of frame images and establishing a fuzzy similarity matrix R as follows: when i = j, let $r_{ij} = 1$; when i ≠ j, let $r_{ij}$ be the similarity between $x_i$ and $y_j$;
(3) calculating the equivalence matrix $\hat{R}$ of the fuzzy similarity matrix R by the transitive closure method;
(4) setting a threshold λ to determine the cut set and performing fuzzy clustering on the transitive closure matrix $\hat{R}$ of R, computing $[x] = \{\, y \mid \hat{R}(x, y) \ge \lambda \,\}$; each set $[x]$ is an equivalence class, and any frame of each set can be taken as a key frame;
(5) representing shot $s_i$ by its key frames $\{r_{i1}, r_{i2}, \ldots, r_{ik}\}$, the similarity between two shots being measured by the similarity between their key-frame sets.
2. The method for content-based video retrieval of shots as claimed in claim 1, wherein in step (1) the method for segmenting the video database into shots is a spatio-temporal slice algorithm.
3. The method for content-based video retrieval of shots as claimed in claim 1, wherein in step (2) the similarity between $x_i$ and $y_j$ is computed from the intersection of the two image histograms:
$$\mathrm{Intersect}(x_i, y_j) = \frac{1}{A(x_i, y_j)} \sum_h \sum_s \sum_v \min\{H_i(h,s,v),\, H_j(h,s,v)\}$$
$$A(x_i, y_j) = \min\Big\{\sum_h \sum_s \sum_v H_i(h,s,v),\ \sum_h \sum_s \sum_v H_j(h,s,v)\Big\}$$
where $H_i(h,s,v)$ is a histogram in HSV color space: the H, S, V components are quantized into an 18 × 3 × 3 three-dimensional space, and the 162 normalized values are used as the color feature values; $\mathrm{Intersect}(x_i, y_j)$, the intersection of the two histograms, determines the similarity of two key frames and is normalized to between 0 and 1 by $A(x_i, y_j)$.
4. The method for content-based video retrieval of shots as claimed in claim 1, wherein the transitive closure method for computing the equivalence matrix $\hat{R}$ of the fuzzy similarity matrix R in step (3) employs the squaring method:
$$R \rightarrow R^2 \rightarrow (R^2)^2 \rightarrow \cdots \rightarrow R^{2^k} = \hat{R},$$
whose time complexity is $O(n^3 \log_2 n)$; when n is very large this affects the total computation time, so the composition of matrices is computed with the optimal fuzzy clustering algorithm based on connected components of a graph, with the recursion:
$$r_{ij}^{(0)} = r_{ij}, \qquad 0 \le i, j \le n$$
$$r_{ij}^{(k)} = \max\big\{\, r_{ij}^{(k-1)},\ \min\big[\, r_{ik}^{(k-1)},\, r_{kj}^{(k-1)} \,\big] \big\}, \qquad 0 \le i, j \le n;\ 0 \le k \le n$$
the time complexity T(n) of this algorithm satisfying $O(n) \le T(n) \le O(n^2)$.
5. The method for content-based video retrieval of shots as claimed in claim 1, wherein the fuzzy clustering performed on $\hat{R}$ comprises the following steps:
(1) determining the n samples $x = (x_1, \ldots, x_n)$, the fuzzy similarity relation R, and a cut threshold α;
(2) transforming R into an equivalence matrix by the following computation:
$$R \circ R = R^2, \quad R^2 \circ R^2 = R^4, \quad \ldots, \quad R^{2^k} \circ R^{2^k} = R^{2^{k+1}}$$
until some k satisfies $R^{2^k} = R^{2^{k+1}}$;
in the above formulas, $R \circ R$ denotes the composition of fuzzy relations; given that R is a similarity matrix, it has been proved that such a k must exist, with $k \le \log n$;
(3) computing the sets $[x] = \{\, y \mid \hat{R}(x, y) \ge \alpha \,\}$; each $[x]$ is a fuzzy cluster, and the algorithm ends;
after fuzzy clustering analysis of the n samples, several equivalence classes are obtained, and one sample is selected from each equivalence class as a key frame, so that the similarity measure between two shots becomes a similarity measure between their key-frame sets.
6. The method for content-based video retrieval of shots as claimed in claim 1 or 5, wherein the similarity of shots $s_i$ and $s_j$ is defined as
$$\mathrm{Sim}(s_i, s_j) = \tfrac{1}{2}\big\{ M(s_i, s_j) + \hat{M}(s_i, s_j) \big\},$$
where M is the maximum similarity between key frames and $\hat{M}$ the second-largest similarity between key frames, with
$$M(s_i, s_j) = \max_{p}\,\max_{q}\,\{\mathrm{Intersect}(r_{ip}, r_{jq})\}, \qquad \hat{M}(s_i, s_j) = \widehat{\max_{p}}\,\max_{q}\,\{\mathrm{Intersect}(r_{ip}, r_{jq})\}$$
$$\mathrm{Intersect}(r_i, r_j) = \frac{1}{A(r_i, r_j)} \sum_h \sum_s \sum_v \min\{H_i(h,s,v),\, H_j(h,s,v)\}$$
$$A(r_i, r_j) = \min\Big\{\sum_h \sum_s \sum_v H_i(h,s,v),\ \sum_h \sum_s \sum_v H_j(h,s,v)\Big\}.$$
CN 03150126 2003-07-18 2003-07-18 Method for making video search of scenes based on contents Expired - Fee Related CN1240014C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 03150126 CN1240014C (en) 2003-07-18 2003-07-18 Method for making video search of scenes based on contents


Publications (2)

Publication Number Publication Date
CN1477566A true CN1477566A (en) 2004-02-25
CN1240014C CN1240014C (en) 2006-02-01

Family

Family ID: 34156438

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 03150126 Expired - Fee Related CN1240014C (en) 2003-07-18 2003-07-18 Method for making video search of scenes based on contents

Country Status (1)

Country Link
CN (1) CN1240014C (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101201822B (en) * 2006-12-11 2010-06-23 南京理工大学 Method for searching visual lens based on contents

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100399804C (en) * 2005-02-15 2008-07-02 乐金电子(中国)研究开发中心有限公司 Mobile communication terminal capable of briefly offering activity video and its abstract offering method
CN100573523C (en) * 2006-12-30 2009-12-23 中国科学院计算技术研究所 A kind of image inquiry method based on marking area
CN101211355B (en) * 2006-12-30 2010-05-19 中国科学院计算技术研究所 Image inquiry method based on clustering
CN101339615B (en) * 2008-08-11 2011-05-04 北京交通大学 Method of image segmentation based on similar matrix approximation
WO2011140783A1 (en) * 2010-05-14 2011-11-17 中兴通讯股份有限公司 Method and mobile terminal for realizing video preview and retrieval
US8737808B2 (en) 2010-05-14 2014-05-27 Zte Corporation Method and mobile terminal for previewing and retrieving video
CN101968797A (en) * 2010-09-10 2011-02-09 北京大学 Inter-lens context-based video concept labeling method
CN105100894A (en) * 2014-08-26 2015-11-25 Tcl集团股份有限公司 Automatic face annotation method and system
CN105100894B (en) * 2014-08-26 2020-05-05 Tcl科技集团股份有限公司 Face automatic labeling method and system
CN104217000A (en) * 2014-09-12 2014-12-17 黑龙江斯迪克信息科技有限公司 Content-based video retrieval system
US11265317B2 (en) 2015-08-05 2022-03-01 Kyndryl, Inc. Security control for an enterprise network
US11757879B2 (en) 2015-08-05 2023-09-12 Kyndryl, Inc. Security control for an enterprise network
CN106960211A (en) * 2016-01-11 2017-07-18 北京陌上花科技有限公司 Key frame acquisition methods and device
CN106960211B (en) * 2016-01-11 2020-04-14 北京陌上花科技有限公司 Key frame acquisition method and device
CN110175267A (en) * 2019-06-04 2019-08-27 黑龙江省七星农场 A kind of agriculture Internet of Things control processing method based on unmanned aerial vehicle remote sensing technology
CN110852289A (en) * 2019-11-16 2020-02-28 公安部交通管理科学研究所 Method for extracting information of vehicle and driver based on mobile video

Also Published As

Publication number Publication date
CN1240014C (en) 2006-02-01

Similar Documents

Publication Publication Date Title
Lu et al. Color image retrieval technique based on color features and image bitmap
CN102890700B (en) Method for retrieving similar video clips based on sports competition videos
CN112418012B (en) Video abstract generation method based on space-time attention model
CN105049875B (en) A kind of accurate extraction method of key frame based on composite character and abrupt climatic change
CN1477566A (en) Method for making video search of scenes based on contents
JP2010225172A (en) Method of representing image group, descriptor of image group, searching method of image group, computer-readable storage medium, and computer system
Zhi et al. Two-stage pooling of deep convolutional features for image retrieval
CN1851710A (en) Embedded multimedia key frame based video search realizing method
CN107451200B (en) Retrieval method using random quantization vocabulary tree and image retrieval method based on same
Zheng et al. A feature-adaptive semi-supervised framework for co-saliency detection
CN106777159B (en) Video clip retrieval and positioning method based on content
Rathod et al. An algorithm for shot boundary detection and key frame extraction using histogram difference
Duan et al. Mean shift based video segment representation and applications to replay detection
CN1514644A (en) Method of proceeding video frequency searching through video frequency segment
CN100507910C (en) Method of searching lens integrating color and sport characteristics
CN102306275A (en) Method for extracting video texture characteristics based on fuzzy concept lattice
Tong et al. A unified framework for semantic shot representation of sports video
Zhou et al. An SVM-based soccer video shot classification
CN106844573B (en) Video abstract acquisition method based on manifold sorting
Priya et al. Optimized content based image retrieval system based on multiple feature fusion algorithm
CN1252647C (en) Scene-searching method based on contents
Vimina et al. Image retrieval using colour and texture features of regions of interest
Zhang et al. Unsupervised sports video scene clustering and its applications to story units detection
Mohanty et al. A frame-based decision pooling method for video classification
Jiang et al. A new video similarity measure model based on video time density function and dynamic programming

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20060201

Termination date: 20170718