CN1477566A - Method for making video search of scenes based on contents
- Publication number: CN1477566A (application number CN03150126; granted as CN1240014C)
- Authority: CN (China)
- Legal status: Granted
Abstract
The present invention relates to a method for content-based video retrieval of shots. It applies fuzzy clustering analysis to shot retrieval. Compared with existing methods, it achieves higher accuracy while retaining a fast retrieval speed.
Description
Technical Field
The invention belongs to the technical field of video retrieval, and particularly relates to a method for performing content-based video retrieval of shots.
Background
With significant technological advances in the production, storage, and dissemination of multimedia data, digital video has become an integral part of daily life. The problem people face is no longer a lack of multimedia content, but how to find the information they need in a multimedia world as vast as the sea. Traditional keyword-based video retrieval cannot meet the demands of massive video collections, owing to its limited descriptive power, strong subjectivity, and reliance on manual labeling. To make multimedia data easier to search, content-based video analysis and retrieval has been a hot research topic since the 1990s, and the gradual formulation and refinement of the multimedia content description interface MPEG-7 has further promoted the development of content-based video retrieval technology.
In the prior art, as described in the document "A New Approach to Retrieve Video by Example Video Clip" [X.M. Liu, Y.T. Zhuang, and Y.H. Pan, ACM Multimedia, pp. 41-44, 1999], the general approach to video retrieval is to first perform shot boundary detection, with the shot as the basic structural unit and retrieval unit of a video sequence; key frames are then extracted inside each shot to represent its content, and low-level features such as color and texture are extracted from the key frames for indexing and retrieval. Content-based shot retrieval is thereby reduced to content-based image retrieval. The problem with this class of methods is that a shot is a temporally continuous image sequence, and they fail to fully exploit the temporal and motion information present in the video. In addition, the 2002 document "An Efficient Algorithm for Video Sequence Matching Using the Modified Hausdorff Distance and the Directed Divergence" [S.H. Kim and R.-H. Park, IEEE Transactions on Circuits and Systems for Video Technology, vol. CSVT-12, no. 7, pp. 592-596] extracts key frames using the modified Hausdorff distance and the directed divergence. Because two thresholds are set when extracting key frames (a threshold on the similarity to the previous frame, and a threshold on the similarity between the current frame and the previous key frame), a key frame is generated only when both conditions are met simultaneously, which hurts the accuracy of key-frame extraction and, in turn, the accuracy of the query. Moreover, that method uses the YUV color space common in video as its visual feature; compared with the HSV color space, YUV agrees less well with human visual perception.
Disclosure of Invention
Aiming at the defects of existing shot retrieval methods, the invention provides a method for performing content-based video retrieval of shots that greatly improves the accuracy of content-based shot retrieval over the prior art while retaining a fast retrieval speed, so that shot retrieval technology can play its full role in today's networked information society.
The purpose of the invention is realized as follows. A method for content-based video retrieval of shots comprises the steps of:
(1) first performing shot segmentation on the video database, taking the shot as the basic structural unit and retrieval unit of the video;
(2) calculating the similarity between pairs of frame images and establishing a fuzzy similarity matrix R as follows: when i = j, let $r_{ij} = 1$; when i ≠ j, let $r_{ij}$ be the similarity between $x_i$ and $y_j$;
(3) calculating the equivalence matrix $\hat{R}$ of the fuzzy similarity matrix R by the transitive closure method;
(4) setting a threshold λ to determine the cut set, performing fuzzy clustering on the closure matrix $\hat{R}$, and computing $[x] = \{y \mid \hat{R}(x, y) \ge \lambda\}$; the sets [x] are the equivalence classes of the fuzzy clustering, and since all frames within an equivalence class are similar, any frame in each set can be taken as a key frame;
(5) representing shot $s_i$ by its key frames $\{r_{i1}, r_{i2}, \ldots, r_{ik}\}$ and measuring the similarity between two shots by the similarity between their key-frame sets.
Further, the shot segmentation of the video database in step (1) is preferably performed by a spatio-temporal slice algorithm. In step (2), the similarity between $x_i$ and $y_j$ can be calculated as the intersection of the two image histograms:

$$\mathrm{Intersect}(x_i, y_j) = \frac{1}{A(x_i, y_j)} \sum_h \sum_s \sum_v \min\{H_i(h, s, v), H_j(h, s, v)\}$$

$$A(x_i, y_j) = \min\Big\{\sum_h \sum_s \sum_v H_i(h, s, v),\ \sum_h \sum_s \sum_v H_j(h, s, v)\Big\}$$

Here $H_i(h, s, v)$ is a histogram over the HSV color space: the H, S, V components are binned in an 18 × 3 × 3 three-dimensional space, and the 162 normalized values are used as the color feature. $\mathrm{Intersect}(x_i, y_j)$ is the intersection of the two histograms and determines the similarity of the two frames; dividing by $A(x_i, y_j)$ normalizes it to between 0 and 1.

Still further, in step (3), the transitive closure used to calculate the equivalence matrix $\hat{R}$ of the fuzzy similarity matrix R may employ the successive squaring method

$$R \to R^2 \to (R^2)^2 \to \cdots \to R^{2^k} = \hat{R},$$

whose time complexity is $O(n^3 \log_2 n)$; when n is very large this dominates the total computation time, so the composite operation of the matrix is instead computed with an optimal fuzzy clustering algorithm based on connected components of a graph, with the recursion

$$r_{ij}^{(0)} = r_{ij}, \qquad 0 \le i, j \le n$$
$$r_{ij}^{(k)} = \max\{r_{ij}^{(k-1)}, \min[r_{ik}^{(k-1)}, r_{kj}^{(k-1)}]\}, \qquad 0 \le i, j \le n;\ 0 \le k \le n.$$

The time complexity T(n) of this algorithm satisfies $O(n) \le T(n) \le O(n^2)$.
To better achieve the object of the invention, the fuzzy clustering performed on $\hat{R}$ during shot retrieval comprises the following steps:
(1) determine the n samples $X = (x_1, \ldots, x_n)$, the fuzzy similarity relation R, and a cut threshold α;
(2) transform R into an equivalence matrix by the following computation:
$$R \circ R = R^2$$
$$R^2 \circ R^2 = R^4$$
$$\cdots$$
until there is a k satisfying $R^{2^k} = R^{2^{k+1}} = \hat{R}$. Here $R \circ R$ is the composition operation of fuzzy relations; under the assumption that R is a similarity matrix it has been proved that such a k must exist, with $k \le \log n$;
(3) compute the sets $[x] = \{y \mid \hat{R}(x, y) \ge \alpha\}$; each [x] is a fuzzy cluster, and the algorithm ends.
After fuzzy clustering analysis of the n-sample space, several equivalence classes are obtained, and one sample is selected from each equivalence class as a key frame. The similarity measure between two shots then becomes a similarity measure between key-frame sets.
In step (5) of the method, the similarity between shots $s_i$ and $s_j$ may be defined as
$$\mathrm{Sim}(s_i, s_j) = \frac{M + \tilde{M}}{2},$$
where M denotes the maximum key-frame similarity and $\tilde{M}$ the second-largest key-frame similarity, with
$$\mathrm{Intersect}(r_i, r_j) = \frac{1}{A(r_i, r_j)} \sum_h \sum_s \sum_v \min\{H_i(h, s, v), H_j(h, s, v)\}$$
$$A(r_i, r_j) = \min\Big\{\sum_h \sum_s \sum_v H_i(h, s, v),\ \sum_h \sum_s \sum_v H_j(h, s, v)\Big\}.$$
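For concreteness, the following sketch (in Python) shows how the five steps could compose end to end. It is a minimal illustration, not the patented implementation: the helper functions are sketched step by step in the Detailed Description below, read_frames is an assumed helper that decodes the frames of a given range, and the threshold value 0.85 is an assumed default rather than a value fixed by the invention.

```python
# Minimal end-to-end sketch of steps (1)-(5); helpers are defined in the
# per-step sketches below. read_frames is an assumed helper yielding the
# decoded (BGR) frames of the given range; lam=0.85 is an assumed threshold.
def index_video(video_path, lam=0.85):
    """Map each shot of a video to the histograms of its key frames."""
    shot_index = []
    for start, end in segment_shots(video_path):               # step (1)
        hists = [hsv_histogram(f)
                 for f in read_frames(video_path, start, end)]
        R = fuzzy_similarity_matrix(hists)                     # step (2)
        R_hat = transitive_closure(R)                          # step (3)
        reps = key_frames(R_hat, lam)                          # step (4)
        shot_index.append([hists[i] for i in reps])
    return shot_index

def rank_shots(shot_index, query_keyframe_hists):
    """Rank database shots by Sim(s_i, s_j) against a query shot (step 5)."""
    return sorted(range(len(shot_index)), reverse=True,
                  key=lambda i: shot_similarity(shot_index[i],
                                                query_keyframe_hists))
```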
the invention has the following effects: the method for searching the video of the lens based on the content can obtain higher accuracy rate and simultaneously keep fast searching speed.
The invention achieves these technical effects because the shot content is partitioned into several equivalence classes by fuzzy clustering analysis; the equivalence classes describe the variation of the shot content well, and inter-shot similarity is expressed as similarity between key-frame combinations. The inter-shot similarity measure also accounts for a drawback of representing key frames with HSV color histograms: two key frames with similar color distributions are judged similar even when their contents differ. Averaging the maximum and second-largest similarity values enhances the robustness of the algorithm. Comparative experimental results confirm the effectiveness of the proposed method.
Drawings
FIG. 1 is a schematic flow diagram of the method for content-based video retrieval of shots;
FIG. 2 shows examples of the 7 semantic classes used as query shots in the experimental comparison;
FIG. 3 is a schematic diagram of the retrieval results of the method of the invention for a swimming shot.
Detailed Description
FIG. 1 shows the general framework of the invention as a flow chart of the steps of the method. As shown in FIG. 1, the shot retrieval method based on fuzzy clustering analysis comprises the following steps:
1. Shot segmentation
First, shot segmentation is performed on the video database using a spatio-temporal slice algorithm, and the shot is taken as the basic structural unit and retrieval unit of the video. A detailed description of the spatio-temporal slice algorithm can be found in the document "Video Partitioning by Temporal Slice Coherency" [C.W. Ngo, T.C. Pong, and R.T. Chin, IEEE Transactions on Circuits and Systems for Video Technology, vol. 11, no. 8, pp. 941-953, August 2001].
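As a minimal stand-in for this step, the sketch below segments shots with a simple HSV histogram intersection between consecutive frames. This is not the spatio-temporal slice algorithm of Ngo et al. cited above, only a simplified substitute that illustrates the interface (video in, list of shot boundaries out); the cut threshold of 0.5 is an assumed value.

```python
import cv2
import numpy as np

def hsv_histogram(frame_bgr, bins=(18, 3, 3)):
    """18x3x3 HSV histogram, L1-normalized, as used throughout the patent."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1, 2], None, list(bins),
                        [0, 180, 0, 256, 0, 256]).flatten()
    return hist / max(hist.sum(), 1e-9)

def segment_shots(video_path, cut_threshold=0.5):
    """Return a list of (first_frame, last_frame) index pairs, one per shot."""
    cap = cv2.VideoCapture(video_path)
    shots, start, prev_hist, idx = [], 0, None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        h = hsv_histogram(frame)
        # intersection of two L1-normalized histograms lies in [0, 1];
        # a low value between consecutive frames signals a cut
        if prev_hist is not None and np.minimum(prev_hist, h).sum() < cut_threshold:
            shots.append((start, idx - 1))
            start = idx
        prev_hist = h
        idx += 1
    cap.release()
    if idx:
        shots.append((start, idx - 1))
    return shots
```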
2. Establishing fuzzy similarity matrix R
The fuzzy similarity matrix R between the images within a shot is established as follows: when i = j, let $r_{ij} = 1$; when i ≠ j, let $r_{ij}$ be the similarity between $x_i$ and $y_j$, calculated as

$$\mathrm{Intersect}(x_i, y_j) = \frac{1}{A(x_i, y_j)} \sum_h \sum_s \sum_v \min\{H_i(h, s, v), H_j(h, s, v)\}$$

$$A(x_i, y_j) = \min\Big\{\sum_h \sum_s \sum_v H_i(h, s, v),\ \sum_h \sum_s \sum_v H_j(h, s, v)\Big\}$$

Here $H_i(h, s, v)$ is a histogram over the HSV color space: the H, S, V components are binned in an 18 × 3 × 3 three-dimensional space, and the 162 normalized values are used as the color feature. $\mathrm{Intersect}(x_i, y_j)$ is the intersection of the two histograms and determines the similarity of the two frames; dividing by $A(x_i, y_j)$ normalizes it to between 0 and 1.
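A direct transcription of the two formulas above might look as follows; it assumes the hsv_histogram helper from the segmentation sketch, so with L1-normalized histograms $A(x_i, y_j)$ is simply 1, but the normalizer is kept to match the formulas.

```python
import numpy as np

def intersect(hist_i, hist_j):
    """Normalized histogram intersection Intersect(x_i, y_j) in [0, 1]."""
    a = min(hist_i.sum(), hist_j.sum())          # A(x_i, y_j)
    return np.minimum(hist_i, hist_j).sum() / a

def fuzzy_similarity_matrix(histograms):
    """Step (2): r_ii = 1; r_ij = Intersect(x_i, y_j) for i != j."""
    n = len(histograms)
    R = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            R[i, j] = R[j, i] = intersect(histograms[i], histograms[j])
    return R
```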
3. Solving the transitive closure of the similarity matrix R to obtain the equivalence matrix $\hat{R}$
In this embodiment, the transitive closure of the similarity matrix is solved by the successive squaring method
$$R \to R^2 \to (R^2)^2 \to \cdots \to R^{2^k} = \hat{R},$$
whose time complexity is $O(n^3 \log_2 n)$; when n is very large this inevitably dominates the total computation time. The composite operation of the matrix is therefore computed with an optimal fuzzy clustering algorithm based on connected components of a graph, with the recursion
$$r_{ij}^{(0)} = r_{ij}, \qquad 0 \le i, j \le n$$
$$r_{ij}^{(k)} = \max\{r_{ij}^{(k-1)}, \min[r_{ik}^{(k-1)}, r_{kj}^{(k-1)}]\}, \qquad 0 \le i, j \le n;\ 0 \le k \le n.$$
The time complexity T(n) of this algorithm satisfies $O(n) \le T(n) \le O(n^2)$.
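The closure computation can be sketched compactly. The version below implements the successive squaring method with the max-min composition in place of ordinary matrix multiplication; it is a straightforward rendering of the formulas above, not an optimized implementation (the broadcasting uses O(n³) memory per squaring, so the graph-based variant is preferable for large n).

```python
import numpy as np

def maxmin_compose(A, B):
    """(A o B)[i, j] = max_k min(A[i, k], B[k, j])."""
    # broadcast to an n x n x n cube, take min over each pairing,
    # then max over the middle (k) axis
    return np.max(np.minimum(A[:, :, None], B[None, :, :]), axis=1)

def transitive_closure(R):
    """Step (3): R -> R^2 -> (R^2)^2 -> ... until the matrix stops changing,
    which for a reflexive symmetric R happens within ceil(log2 n) squarings."""
    while True:
        R2 = maxmin_compose(R, R)
        if np.allclose(R2, R):
            return R2
        R = R2
```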
4. Setting the threshold λ to determine the cut set, and performing fuzzy clustering on the closure matrix $\hat{R}$.
In this embodiment, the specific method is as follows:
(1) determine the n samples $X = (x_1, \ldots, x_n)$, the fuzzy similarity relation R, and a cut threshold α;
(2) transform R into an equivalence matrix by the following computation:
$$R \circ R = R^2$$
$$R^2 \circ R^2 = R^4$$
$$\cdots$$
until there is a k satisfying $R^{2^k} = R^{2^{k+1}} = \hat{R}$. Here $R \circ R$ is the composition operation of fuzzy relations; under the assumption that R is a similarity matrix it has been proved that such a k must exist, with $k \le \log n$;
(3) compute the sets $[x] = \{y \mid \hat{R}(x, y) \ge \alpha\}$; each [x] is a fuzzy cluster, and the algorithm ends.
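The α-cut can be read off the closure matrix directly: because $\hat{R}$ is a fuzzy equivalence relation, thresholding it partitions the frames into equivalence classes. A minimal sketch:

```python
def lambda_cut_classes(R_hat, lam):
    """Step (4): equivalence classes [x] = {y | R_hat(x, y) >= lam}.
    Correct because R_hat is a fuzzy equivalence relation, so the
    cut sets form a partition of the frame indices."""
    n = R_hat.shape[0]
    unassigned, classes = set(range(n)), []
    while unassigned:
        x = min(unassigned)
        members = {y for y in range(n) if R_hat[x, y] >= lam}
        classes.append(sorted(members))
        unassigned -= members
    return classes

def key_frames(R_hat, lam):
    """One representative frame index per equivalence class; any member
    would do, since all frames in a class are mutually similar."""
    return [cls[0] for cls in lambda_cut_classes(R_hat, lam)]
```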
5. After the shot key frames are obtained by fuzzy clustering analysis, shot retrieval is performed based on the key frames. Shot $s_i$ is represented by its key frames $\{r_{i1}, r_{i2}, \ldots, r_{ik}\}$, and the similarity between shots $s_i$ and $s_j$ is defined as
$$\mathrm{Sim}(s_i, s_j) = \frac{M + \tilde{M}}{2},$$
where M denotes the maximum key-frame similarity and $\tilde{M}$ the second-largest value, with
$$\mathrm{Intersect}(r_i, r_j) = \frac{1}{A(r_i, r_j)} \sum_h \sum_s \sum_v \min\{H_i(h, s, v), H_j(h, s, v)\}$$
$$A(r_i, r_j) = \min\Big\{\sum_h \sum_s \sum_v H_i(h, s, v),\ \sum_h \sum_s \sum_v H_j(h, s, v)\Big\}.$$
$\tilde{M}$ is used because key frames are represented here by HSV color histograms, which have the drawback that two key frames with similar color distributions are judged similar even when their contents differ; to overcome this, M and $\tilde{M}$ are averaged to enhance the robustness of the algorithm. $H_i(h, s, v)$ is a histogram over the HSV color space, computed by binning the H, S, V components in an 18 × 3 × 3 three-dimensional space, with the 162 normalized values as the color feature. $\mathrm{Intersect}(r_i, r_j)$ is the intersection of the two histograms and is used here to determine the similarity of two key frames.
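The shot-level measure then reduces to a few lines; this sketch reuses intersect() from the step-2 sketch and averages the largest and second-largest pairwise key-frame similarities as defined above (falling back to the single value when only one key-frame pair exists, a boundary case the patent does not spell out).

```python
def shot_similarity(keyframe_hists_i, keyframe_hists_j):
    """Step (5): Sim(s_i, s_j) = (M + M_tilde) / 2 over all key-frame pairs."""
    scores = sorted(
        (intersect(hi, hj)
         for hi in keyframe_hists_i for hj in keyframe_hists_j),
        reverse=True)
    if len(scores) == 1:   # single key-frame pair: use that value directly
        return scores[0]
    return (scores[0] + scores[1]) / 2.0
```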
The following experimental results show that the method performs better than existing methods while retrieving quickly, verifying the effectiveness of the fuzzy clustering analysis algorithm in shot retrieval.
The experimental data for shot retrieval was a sports program from 2002 recorded from television, totaling 41 minutes, 777 shots, and 62132 frames. It includes a variety of sports, such as various ball games, weightlifting, and swimming, along with interspersed advertising segments. We selected 7 semantic classes as query shots: weightlifting, volleyball, swimming, judo, rowing, gymnastics, and soccer, as shown in FIG. 2.
To verify the effectiveness of the invention, we compared the following 3 methods experimentally:
(1) a commonly used shot retrieval algorithm that takes the first frame of each shot as the key frame;
(2) the algorithm described in "An Efficient Algorithm for Video Sequence Matching Using the Modified Hausdorff Distance and the Directed Divergence" [S.H. Kim and R.-H. Park, IEEE Transactions on Circuits and Systems for Video Technology, vol. CSVT-12, no. 7, pp. 592-596, 2002];
(3) shot retrieval using key frames obtained by the fuzzy clustering analysis algorithm (using color features only).
the first 3 methods all use only color features, so the final experimental result can prove the superiority of the method disclosed by the invention from the measurement method of the lens similarity. Fig. 3 shows a user interface of the experimental program, the upper line on the right side is a browsing area of the query video, the 1 st key frame of each shot in the video is displayed to represent each shot, the user can select the shot to be queried to search, and the lower line on the right side is a query result area. Fig. 3 is a view of the 1 st shot selected in the upper row, which is a swimming shot, represented by the first frame image 022430.bmp of the shot, the query results are arranged from large to small (left to right, top to bottom) with the greatest weight of similarity calculated according to the method of the present invention. The lower left side is a simple playing period, and the double-click retrieval result image can play the video corresponding to the corresponding lens.
The experiment used two evaluation criteria from the MPEG-7 standardization activities: the average normalized modified retrieval rank ANMRR (Average Normalized Modified Retrieval Rank) and the average recall AR (Average Recall). AR is similar to conventional recall, while ANMRR, compared with conventional precision, reflects not only the proportion of correct results but also their rank positions. The smaller the ANMRR value, the nearer the front the correctly retrieved shots are ranked; the larger the AR value, the larger the proportion of all similar shots that appear among the top K query results (K is the truncation value of the result list). Table 1 compares AR and ANMRR over the 7 semantic shot classes for the 3 methods.
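As a reference for how these two measures are computed, the sketch below follows one common MPEG-7 formulation of ANMRR (ranks beyond the cutoff K(q) are penalized as 1.25·K(q), with K(q) = min(4·NG(q), 2·GTM)); the patent does not spell out its exact parameterization, so treat these details as assumptions.

```python
def ar_and_anmrr(queries):
    """queries: list of (ranked_result_ids, ground_truth_id_set) pairs,
    one per query shot; ranked_result_ids is the full retrieval ranking."""
    ars, nmrrs = [], []
    gtm = max(len(gt) for _, gt in queries)   # largest ground-truth set size
    for results, gt in queries:
        ng = len(gt)
        k = min(4 * ng, 2 * gtm)              # per-query cutoff K(q)
        # AR: fraction of the ground truth retrieved within the top K results
        ars.append(sum(1 for r in results[:k] if r in gt) / ng)
        # ANMRR: penalized average rank of the ground truth, normalized
        # so that 0 is a perfect ranking and 1 is a total miss
        rank_of = {r: i + 1 for i, r in enumerate(results)}
        ranks = []
        for g in gt:
            r = rank_of.get(g, float("inf"))
            ranks.append(r if r <= k else 1.25 * k)
        avr = sum(ranks) / ng                 # AVR(q)
        nmrrs.append((avr - 0.5 - ng / 2) / (1.25 * k - 0.5 - ng / 2))
    return sum(ars) / len(ars), sum(nmrrs) / len(nmrrs)
```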
TABLE 1. Comparative experimental results of the invention and the two existing methods

| Classification | Method 1 AR | Method 1 ANMRR | Method 2 AR | Method 2 ANMRR | Method 3 AR | Method 3 ANMRR |
|---|---|---|---|---|---|---|
| Weightlifting | 0.8824 | 0.3098 | 0.8824 | 0.1539 | 0.9412 | 0.2186 |
| Volleyball | 0.6333 | 0.4974 | 0.7895 | 0.3264 | 0.8556 | 0.3279 |
| Swimming | 0.8400 | 0.2676 | 0.8250 | 0.3164 | 0.9200 | 0.2175 |
| Judo | 0.7000 | 0.4310 | 0.8214 | 0.2393 | 0.8000 | 0.3093 |
| Rowing | 0.8750 | 0.3407 | 0.6875 | 0.3570 | 0.8125 | 0.2223 |
| Gymnastics | 0.7857 | 0.3445 | 0.9600 | 0.1759 | 0.7857 | 0.2056 |
| Soccer | 0.5789 | 0.4883 | 0.6889 | 0.2815 | 0.8421 | 0.2614 |
| Mean | 0.7565 | 0.3827 | 0.8078 | 0.2642 | 0.8510 | 0.2518 |
As can be seen from Table 1, the method of the invention achieves better results than the two existing algorithms on both AR and ANMRR, confirming the effectiveness of applying fuzzy clustering analysis to shot retrieval. The method partitions the shot content into several equivalence classes that describe the variation of the shot content well, and expresses inter-shot similarity as similarity between key-frame combinations. The inter-shot similarity measure accounts for the drawback of representing key frames with HSV color histograms (two key frames with similar color distributions are judged similar even when their contents differ) by averaging the maximum and second-largest similarity values to enhance robustness. The comparative results confirm the effectiveness of the proposed method. In addition, on a PC with a 500 MHz PIII CPU and 256 MB of memory, the average retrieval time of the algorithm is 22.557 seconds, a fast retrieval speed for a video library of 777 shots.
Claims (6)
1. A method for content-based video retrieval of shots, the method comprising the steps of:
(1) first performing shot segmentation on a video database, taking the shot as the basic structural unit and retrieval unit of the video;
(2) calculating the similarity between pairs of frame images and establishing a fuzzy similarity matrix R as follows: when i = j, let $r_{ij} = 1$; when i ≠ j, let $r_{ij}$ be the similarity between $x_i$ and $y_j$;
(3) calculating the equivalence matrix $\hat{R}$ of the fuzzy similarity matrix R by the transitive closure method;
(4) setting a threshold λ to determine the cut set, performing fuzzy clustering on the closure matrix $\hat{R}$, and computing $[x] = \{y \mid \hat{R}(x, y) \ge \lambda\}$; the sets [x] are the equivalence classes of the fuzzy clustering, and any frame in each set can be taken as a key frame;
(5) representing shot $s_i$ by its key frames $\{r_{i1}, r_{i2}, \ldots, r_{ik}\}$ and measuring the similarity between two shots by the similarity between their key-frame sets.
2. A method for content-based video retrieval of shots as claimed in claim 1, wherein: in step (1), the shot segmentation of the video database is performed by a spatio-temporal slice algorithm.
3. A method for content-based video retrieval of shots as claimed in claim 1, wherein: in step (2), the similarity between $x_i$ and $y_j$ is calculated as the intersection of the two image histograms:

$$\mathrm{Intersect}(x_i, y_j) = \frac{1}{A(x_i, y_j)} \sum_h \sum_s \sum_v \min\{H_i(h, s, v), H_j(h, s, v)\}$$

$$A(x_i, y_j) = \min\Big\{\sum_h \sum_s \sum_v H_i(h, s, v),\ \sum_h \sum_s \sum_v H_j(h, s, v)\Big\}$$

where $H_i(h, s, v)$ is a histogram over the HSV color space: the H, S, V components are binned in an 18 × 3 × 3 three-dimensional space, and the 162 normalized values are used as the color feature values; $\mathrm{Intersect}(x_i, y_j)$ is the intersection of the two histograms and determines the similarity of the two key frames, and dividing by $A(x_i, y_j)$ normalizes it to between 0 and 1.
4. A method for content-based video retrieval of shots as claimed in claim 1, wherein: in step (3), the transitive closure used to calculate the equivalence matrix $\hat{R}$ of the fuzzy similarity matrix R employs the successive squaring method
$$R \to R^2 \to (R^2)^2 \to \cdots \to R^{2^k} = \hat{R},$$
whose time complexity is $O(n^3 \log_2 n)$; when n is very large this dominates the total computation time, so the composite operation of the matrix is computed with the optimal fuzzy clustering algorithm based on connected components of a graph, with the recursion
$$r_{ij}^{(0)} = r_{ij}, \qquad 0 \le i, j \le n$$
$$r_{ij}^{(k)} = \max\{r_{ij}^{(k-1)}, \min[r_{ik}^{(k-1)}, r_{kj}^{(k-1)}]\}, \qquad 0 \le i, j \le n;\ 0 \le k \le n,$$
and the time complexity T(n) of this algorithm satisfies $O(n) \le T(n) \le O(n^2)$.
5. A method for content-based video retrieval of shots as claimed in claim 1, wherein the fuzzy clustering performed on the closure matrix $\hat{R}$ comprises the following steps:
(1) determining the n samples $X = (x_1, \ldots, x_n)$, the fuzzy similarity relation R, and a cut threshold α;
(2) transforming R into an equivalence matrix by the following computation:
$$R \circ R = R^2$$
$$R^2 \circ R^2 = R^4$$
$$\cdots$$
until there is a k satisfying $R^{2^k} = R^{2^{k+1}} = \hat{R}$, where $R \circ R$ is the composition operation of fuzzy relations, and under the assumption that R is a similarity matrix it has been proved that such a k must exist, with $k \le \log n$;
(3) computing the sets $[x] = \{y \mid \hat{R}(x, y) \ge \alpha\}$, each [x] being a fuzzy cluster, whereupon the algorithm ends;
after fuzzy clustering analysis of the n-sample space, several equivalence classes are obtained, and one sample is selected from each equivalence class as a key frame, so that the similarity measure between two shots becomes a similarity measure between key-frame sets.
6. A method for content-based video retrieval of shots as claimed in claim 1 or 5, wherein: the similarity between shots $s_i$ and $s_j$ is defined as
$$\mathrm{Sim}(s_i, s_j) = \frac{M + \tilde{M}}{2},$$
where M denotes the maximum key-frame similarity and $\tilde{M}$ the second-largest key-frame similarity, with
$$\mathrm{Intersect}(r_i, r_j) = \frac{1}{A(r_i, r_j)} \sum_h \sum_s \sum_v \min\{H_i(h, s, v), H_j(h, s, v)\}$$
$$A(r_i, r_j) = \min\Big\{\sum_h \sum_s \sum_v H_i(h, s, v),\ \sum_h \sum_s \sum_v H_j(h, s, v)\Big\}.$$
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN 03150126 (granted as CN1240014C) | 2003-07-18 | 2003-07-18 | Method for making video search of scenes based on contents |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN1477566A | 2004-02-25 |
| CN1240014C | 2006-02-01 |
Legal Events

- C06 / PB01: Publication
- C10 / SE01: Entry into force of request for substantive examination
- C14 / GR01: Patent grant
- CF01: Termination of patent right due to non-payment of annual fee (granted publication date: 2006-02-01; termination date: 2017-07-18)