CN103440292A - Method and system for retrieving multimedia information based on bit vector - Google Patents


Info

Publication number: CN103440292A
Authority: CN (China)
Prior art keywords: vector; high dimensional; dimensional feature; sign; feature vector
Legal status: Granted; Active (the legal status is an assumption and is not a legal conclusion)
Application number: CN2013103597166A (CN201310359716.6A)
Other languages: Chinese (zh)
Other versions: CN103440292B (en)
Inventor: 刘洁
Current Assignee: Sina Technology China Co Ltd
Original Assignee: Sina Technology China Co Ltd
Application filed by: Sina Technology China Co Ltd

Abstract

The invention discloses a method and system for retrieving multimedia information based on bit vectors. The method comprises: extracting the feature data of the current multimedia information to obtain its n-dimensional high-dimensional feature vector; transforming that vector by a projection matrix to obtain an m-dimensional intermediate vector; comparing each element of an m-dimensional threshold vector with the corresponding element of the intermediate vector, and binarizing the intermediate vector according to the comparison results to obtain the m-dimensional bit vector of the current multimedia information, where m is smaller than n; and searching a multimedia feature database for bit vectors similar to the obtained bit vector, and outputting the multimedia information corresponding to the bit vectors found as the retrieval result. The method preserves the discriminative power of the original vectors: once the high-dimensional feature vector of the multimedia information has been mapped to a low-dimensional bit vector, retrieval based on the bit vector is more efficient and consumes fewer resources.

Description

Multimedia information retrieval method and system based on bit vectors
Technical field
The present invention relates to the computer field, and in particular to a multimedia information retrieval method and system based on bit vectors.
Background
In recent years, with the rapid development of multimedia and computer technology, large-scale multimedia information has appeared in more and more research and applications. Traditional text-based retrieval can no longer meet users' growing need to access and use the information contained in this large, heterogeneous data effectively, and content-based retrieval emerged in response.
A content-based retrieval method first extracts features from the multimedia information and builds a feature database, then converts the retrieval of multimedia information into nearest-neighbor retrieval over the feature data. For large-scale multimedia information, the feature data is also large-scale, so a suitable indexing method is needed to organize the feature data and speed up retrieval.
However, the feature data of multimedia information is usually high-dimensional vector data (a high-dimensional feature vector, for short). Traditional indexing mechanisms designed for low-dimensional data adapt poorly to the requirements of content-based retrieval; this is the well-known curse of dimensionality in indexing high-dimensional data. In other words, retrieving multimedia information directly from high-dimensional feature vectors consumes large amounts of retrieval resources and is inefficient.
To address this problem, prior-art methods such as Similarity Sensitive Hash (SSH) and Locality Sensitive Hash (LSH) map high-dimensional feature vectors to low-dimensional bit vectors, so that bit-vector similarity measures and efficient indexing methods can speed up retrieval of the high-dimensional feature vectors and thereby improve the retrieval efficiency of multimedia information. However, these methods tend to map similar (same-class) high-dimensional feature vectors to dissimilar bit vectors, and dissimilar (different-class) high-dimensional feature vectors to similar bit vectors. As a result, once the high-dimensional feature vectors of multimedia information have been mapped to bit vectors, retrieval suffers a high false-match rate, and the discriminative power of the original vectors is reduced.
It is therefore necessary to provide a multimedia information retrieval method based on bit vectors that maps the high-dimensional feature vectors of multimedia information to low-dimensional bit vectors while preserving the discriminative power of the original vectors, so that retrieval based on bit vectors is more efficient than retrieval based on high-dimensional feature vectors, consumes fewer resources, and has a lower false-match rate.
Summary of the invention
In view of the defects of the prior art described above, the present invention provides a multimedia information retrieval method and system based on bit vectors, so that, while the discriminative power of the original vectors is preserved, mapping the high-dimensional feature vectors of multimedia information to low-dimensional bit vectors makes retrieval based on bit vectors more efficient and less resource-intensive.
According to one aspect of the present invention, a multimedia information retrieval method based on bit vectors is provided, comprising:
Extracting the feature data of the current multimedia information to obtain its n-dimensional high-dimensional feature vector, denoted $X(x_1, x_2, \ldots, x_n)$;
Transforming the high-dimensional feature vector $X(x_1, x_2, \ldots, x_n)$ by a projection matrix $P$ to obtain an m-dimensional intermediate vector $W(w_1, w_2, \ldots, w_m)$;
Comparing each element of an m-dimensional threshold vector with the corresponding element of the intermediate vector, and binarizing the intermediate vector according to the comparison results to obtain the m-dimensional bit vector of the current multimedia information, where m is less than n;
Searching the multimedia feature database for bit vectors similar to the obtained bit vector, and outputting the multimedia information corresponding to the bit vectors found as the retrieval result;
Wherein the projection matrix $P$ is an $m \times n$ matrix that satisfies the following condition: over the high-dimensional feature vectors of the classified multimedia information stored in the database, the difference between the expected distance between same-class vectors after transformation by $P$ and the expected distance between different-class vectors after transformation by $P$ is minimized;
The threshold vector satisfies the following condition: over the high-dimensional feature vectors of the multimedia information stored in the database, the difference between the expected distance between same-class vectors after transformation by $P$, comparison with the threshold vector, and binarization, and the expected distance between different-class vectors after the same transformation, comparison, and binarization, is minimized.
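The final retrieval step can be sketched as follows. This is a minimal illustration, not the patented implementation: the text only says that "similar" bit vectors are looked up, so the use of Hamming distance, the linear scan, and the names `pack_bits`, `hamming`, `search`, and `max_distance` are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def pack_bits(bits: List[int]) -> int:
    """Pack a list of 0/1 bits into one integer (first bit most significant)."""
    value = 0
    for b in bits:
        value = (value << 1) | (b & 1)
    return value


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two packed bit vectors differ."""
    return bin(a ^ b).count("1")


def search(query: int, database: Dict[str, int], max_distance: int) -> List[Tuple[str, int]]:
    """Return (item_id, distance) for stored codes within max_distance, nearest first.

    Hamming distance and a linear scan are assumptions; the source does not
    name the similarity measure or the index structure.
    """
    hits = [(item_id, hamming(query, code)) for item_id, code in database.items()]
    hits = [hit for hit in hits if hit[1] <= max_distance]
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


# Hypothetical 4-bit feature database.
db = {
    "img_a": pack_bits([1, 0, 1, 1]),
    "img_b": pack_bits([1, 0, 1, 0]),
    "img_c": pack_bits([0, 1, 0, 0]),
}
results = search(pack_bits([1, 0, 1, 1]), db, max_distance=1)
```

Packing the bits into an integer lets the comparison run as a single XOR plus a population count, which is where the efficiency gain over comparing n real-valued components comes from.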
Preferably, before extracting the feature data of the current multimedia information, the method further comprises:
Training the projection matrix $P$ from the multimedia information stored in the database:
For the multimedia information stored in the database, storing the high-dimensional feature vectors of any pair of same-class multimedia items as one set element in a similar sample set; and
Storing the high-dimensional feature vectors of any pair of different-class multimedia items as one set element in a dissimilar sample set;
Constructing the projection matrix $P$ that minimizes $\hat{L}$ in formula 1:
$\hat{L} = \alpha E\{\|PX - PX'\|^2 \mid Q\} - E\{\|PX - PX'\|^2 \mid R\}$ (formula 1)
Wherein $Q$ is the similar sample set; $R$ is the dissimilar sample set; $E\{\|PX - PX'\|^2 \mid Q\}$ denotes the expected distance between same-class high-dimensional feature vectors in $Q$ after transformation by $P$; $E\{\|PX - PX'\|^2 \mid R\}$ denotes the expected distance between different-class high-dimensional feature vectors in $R$ after transformation by $P$; and $\alpha$ is a preset weight.
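To make formula 1 concrete, the sketch below evaluates its empirical counterpart for a given candidate matrix, replacing each expectation by a sample mean over the pairs in Q and R. The function names (`project`, `mean_sq_distance`, `objective`) and the toy data are hypothetical; the code only scores a candidate P and does not perform the minimization itself.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Pair = Tuple[Vector, Vector]


def project(P: Sequence[Vector], x: Vector) -> List[float]:
    """Compute PX for an m x n matrix P (list of rows) and an n-dimensional x."""
    return [sum(p_ij * x_j for p_ij, x_j in zip(row, x)) for row in P]


def mean_sq_distance(P: Sequence[Vector], pairs: Sequence[Pair]) -> float:
    """Sample mean of ||PX - PX'||^2 over the given pairs."""
    total = 0.0
    for x, x_prime in pairs:
        px, px_prime = project(P, x), project(P, x_prime)
        total += sum((a - b) ** 2 for a, b in zip(px, px_prime))
    return total / len(pairs)


def objective(P: Sequence[Vector], Q: Sequence[Pair], R: Sequence[Pair], alpha: float) -> float:
    """Empirical formula 1: alpha * E{.|Q} - E{.|R}; smaller is better."""
    return alpha * mean_sq_distance(P, Q) - mean_sq_distance(P, R)


# Toy data (n = 3, m = 1): the same-class pair differs only in coordinate 2,
# the different-class pair differs only in coordinate 1.
Q = [((1.0, 0.0, 0.0), (1.0, 5.0, 0.0))]
R = [((0.0, 0.0, 0.0), (2.0, 0.0, 0.0))]
good = objective([[1.0, 0.0, 0.0]], Q, R, alpha=1.0)  # keeps coordinate 1
bad = objective([[0.0, 1.0, 0.0]], Q, R, alpha=1.0)   # keeps coordinate 2
```

A projection that keeps the coordinate separating the classes scores lower than one that keeps the coordinate varying within a class, which is the behaviour the objective is designed to reward.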
Preferably, constructing the projection matrix $P$ that minimizes $\hat{L}$ in formula 1 specifically comprises:
Computing the m smallest n-dimensional eigenvectors of the matrix $\Sigma_g$ (that is, the eigenvectors associated with its m smallest eigenvalues); wherein $\Sigma_Q$ is given by formula 2 and $\Sigma_R$ by formula 3:
$\Sigma_Q = E\{(X - X')(X - X')^T \mid Q\}$ (formula 2)
In formula 2, $E\{(X - X')(X - X')^T \mid Q\}$ denotes the mean of the covariance matrices between same-class high-dimensional feature vectors in $Q$;
$\Sigma_R = E\{(X - X')(X - X')^T \mid R\}$ (formula 3)
In formula 3, $E\{(X - X')(X - X')^T \mid R\}$ denotes the mean of the covariance matrices between different-class high-dimensional feature vectors in $R$;
Forming the $m \times n$ projection matrix $P$ from the m n-dimensional eigenvectors so obtained.
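The step from formula 1 to an eigenvector problem can be made explicit. The definition of $\Sigma_g$ is given in a formula image that is not reproduced in this text, so the derivation below rests on two assumptions that are not confirmed by the source: that $\Sigma_g = \alpha\Sigma_Q - \Sigma_R$, and that the rows of $P$ are constrained to be orthonormal ($PP^T = I_m$).

```latex
\begin{aligned}
E\{\|PX - PX'\|^2 \mid Q\}
  &= E\{(X - X')^T P^T P\,(X - X') \mid Q\}
   = \operatorname{tr}\!\left(P\,\Sigma_Q\,P^T\right), \\
E\{\|PX - PX'\|^2 \mid R\}
  &= \operatorname{tr}\!\left(P\,\Sigma_R\,P^T\right), \\
\hat{L}
  &= \alpha\,\operatorname{tr}\!\left(P\,\Sigma_Q\,P^T\right)
     - \operatorname{tr}\!\left(P\,\Sigma_R\,P^T\right)
   = \operatorname{tr}\!\left(P\,(\alpha\Sigma_Q - \Sigma_R)\,P^T\right)
   = \operatorname{tr}\!\left(P\,\Sigma_g\,P^T\right).
\end{aligned}
```

Under these assumptions $\Sigma_g$ is symmetric, and $\operatorname{tr}(P\,\Sigma_g\,P^T)$ subject to $PP^T = I_m$ is minimized by taking the rows of $P$ to be the eigenvectors of $\Sigma_g$ with the m smallest eigenvalues, which matches the construction described above.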
Preferably, after training the projection matrix $P$ from the multimedia information stored in the database, the method further comprises:
Computing the m-dimensional vector that minimizes $L$ in formula 4, denoted $U(u_1, u_2, \ldots, u_m)$, and using it as the threshold vector:
$L = E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid R\} - \alpha E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid Q\}$ (formula 4)
Wherein $E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid Q\}$ denotes the mean distance between the sign vectors obtained when same-class high-dimensional feature vectors in $Q$ are transformed by $P$ and their signs are determined by comparison with the threshold vector; $E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid R\}$ denotes the mean distance between the sign vectors obtained when different-class high-dimensional feature vectors in $R$ are transformed by $P$ and their signs are determined by comparison with the threshold vector.
Alternatively, after training the projection matrix $P$ from the multimedia information stored in the database, the method further comprises:
Computing the m-dimensional vector that minimizes $L$ in formula 4, denoted $U(u_1, u_2, \ldots, u_m)$:
$L = E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid R\} - \alpha E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid Q\}$ (formula 4)
Wherein $E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid Q\}$ denotes the mean distance between the sign vectors obtained when same-class high-dimensional feature vectors in $Q$ are transformed by $P$ and their signs are determined by comparison with the threshold vector; $E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid R\}$ denotes the mean distance between the sign vectors obtained when different-class high-dimensional feature vectors in $R$ are transformed by $P$ and their signs are determined by comparison with the threshold vector;
Then optimizing $U(u_1, u_2, \ldots, u_m)$ to obtain the threshold vector:
For each element $u_i$ of the threshold vector $U$, using formulas 5 and 6 to find the value of $u_i$ that minimizes $FN(u_i) + \alpha \times FP(u_i)$, and taking it as the optimized value of $u_i$;
$FN(u_i) = \Pr(\min\{z, z'\} \ge u_i \ \text{or}\ \max\{z, z'\} < u_i \mid Q)$ (formula 5)
$FP(u_i) = \Pr(\min\{z, z'\} < u_i \le \max\{z, z'\} \mid R)$ (formula 6)
In formula 5, $z$ and $z'$ denote the i-th elements of the vectors obtained by transforming, by the projection matrix $P$, the pair of same-class high-dimensional feature vectors $X$ and $X'$ in any set element of $Q$; $\Pr(\min\{z, z'\} \ge u_i \ \text{or}\ \max\{z, z'\} < u_i \mid Q)$ denotes the probability, over the set elements of $Q$, that $u_i$ satisfies the condition $\min\{z, z'\} \ge u_i$ or $\max\{z, z'\} < u_i$;
In formula 6, $z$ and $z'$ denote the i-th elements of the vectors obtained by transforming, by the projection matrix $P$, the pair of different-class high-dimensional feature vectors $X$ and $X'$ in any set element of $R$; $\Pr(\min\{z, z'\} < u_i \le \max\{z, z'\} \mid R)$ denotes the probability, over the set elements of $R$, that $u_i$ satisfies the condition $\min\{z, z'\} < u_i \le \max\{z, z'\}$.
Preferably, computing the m-dimensional vector that minimizes $L$ specifically comprises:
Finding the value of $u_i$ that minimizes expression 7, where $i$ is a natural number from 1 to m;
$E\{\operatorname{sign}((P_i^T X + u_i)(P_i^T X' + u_i)) \mid R\} - \alpha E\{\operatorname{sign}((P_i^T X + u_i)^T (P_i^T X' + u_i)) \mid Q\}$ (expression 7)
Wherein $P_i^T$ is the i-th row vector of the projection matrix $P$, and $u_i$ is the i-th element of $U(u_1, u_2, \ldots, u_m)$;
And forming the m-dimensional vector from the resulting $u_1$ to $u_m$.
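Because expression 7 treats each bit independently, the threshold for one bit can be found by a one-dimensional search. The sketch below is one possible way to do that and is not taken from the source: it assumes the projected values $z = P_i^T X$ and $z' = P_i^T X'$ have already been computed for every pair, replaces the expectations by sample means, and tries one candidate $u_i$ in each interval between consecutive sign-change points $-z$. The names `expression7` and `best_threshold`, and the convention that zero counts as positive, are assumptions.

```python
from typing import List, Sequence, Tuple

ScalarPair = Tuple[float, float]  # (z, z') = (P_i^T X, P_i^T X') for one pair


def sign(v: float) -> int:
    """+1 for non-negative values, -1 otherwise (zero as positive is an assumption)."""
    return 1 if v >= 0 else -1


def expression7(u: float, q_pairs: Sequence[ScalarPair],
                r_pairs: Sequence[ScalarPair], alpha: float) -> float:
    """Sample-mean version of expression 7 for a single bit and threshold u."""
    agree_r = sum(sign((z + u) * (z2 + u)) for z, z2 in r_pairs) / len(r_pairs)
    agree_q = sum(sign((z + u) * (z2 + u)) for z, z2 in q_pairs) / len(q_pairs)
    return agree_r - alpha * agree_q


def best_threshold(q_pairs: Sequence[ScalarPair],
                   r_pairs: Sequence[ScalarPair], alpha: float) -> float:
    """Exhaustive search: the objective only changes when u crosses some -z."""
    points: List[float] = sorted({-z for pair in list(q_pairs) + list(r_pairs) for z in pair})
    candidates = [points[0] - 1.0]
    candidates += [(a + b) / 2 for a, b in zip(points, points[1:])]
    candidates.append(points[-1] + 1.0)
    return min(candidates, key=lambda u: expression7(u, q_pairs, r_pairs, alpha))


# Toy projected values for one bit.
q_pairs = [(1.0, 2.0), (-3.0, -2.0)]   # same-class pairs
r_pairs = [(1.0, -2.0), (2.0, -3.0)]   # different-class pairs
u_best = best_threshold(q_pairs, r_pairs, alpha=1.0)
```

With this toy data the chosen threshold keeps both same-class pairs on one side and splits both different-class pairs, so the same-class bits agree and the different-class bits differ.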
According to another aspect of the present invention, a multimedia information retrieval system based on bit vectors is also provided, comprising:
A bit vector conversion module, configured to extract the feature data of the current multimedia information to obtain its n-dimensional high-dimensional feature vector, denoted $X(x_1, x_2, \ldots, x_n)$; to transform the high-dimensional feature vector $X(x_1, x_2, \ldots, x_n)$ by a projection matrix $P$ to obtain an m-dimensional intermediate vector $W(w_1, w_2, \ldots, w_m)$; and then to compare each element of an m-dimensional threshold vector with the corresponding element of the intermediate vector and binarize the intermediate vector according to the comparison results, obtaining the m-dimensional bit vector of the current multimedia information, where m is less than n;
A retrieval module, configured to search the multimedia feature database, according to the bit vector of the current multimedia information obtained by the bit vector conversion module, for bit vectors similar to that bit vector, and to output the multimedia information corresponding to the bit vectors found as the retrieval result;
Wherein the projection matrix $P$ is an $m \times n$ matrix that satisfies the following condition: over the high-dimensional feature vectors of the multimedia information stored in the database, the difference between the expected distance between same-class vectors after transformation by $P$ and the expected distance between different-class vectors after transformation by $P$ is minimized;
The threshold vector satisfies the following condition: over the high-dimensional feature vectors of the classified multimedia information stored in the database, the difference between the expected distance between same-class vectors after transformation by $P$, comparison with the threshold vector, and binarization, and the expected distance between different-class vectors after the same transformation, comparison, and binarization, is minimized.
Preferably, the bit vector conversion module specifically comprises:
A high-dimensional feature vector determination unit, configured to extract the feature data of the current multimedia information to obtain its n-dimensional high-dimensional feature vector, denoted $X(x_1, x_2, \ldots, x_n)$;
An intermediate vector computation unit, configured to transform the high-dimensional feature vector $X(x_1, x_2, \ldots, x_n)$ obtained by the high-dimensional feature vector determination unit by the projection matrix $P$ to obtain the m-dimensional intermediate vector $W(w_1, w_2, \ldots, w_m)$;
A threshold comparison unit, configured to compare each element of the m-dimensional threshold vector with the corresponding element of the intermediate vector obtained by the intermediate vector computation unit, and to binarize the intermediate vector according to the comparison results, obtaining the m-dimensional bit vector of the current multimedia information, where m is less than n.
Further, the multimedia information retrieval system based on bit vectors also comprises:
A projection matrix construction module, configured to train the projection matrix $P$ from the multimedia information stored in the database: for the multimedia information stored in the database, storing the high-dimensional feature vectors of any pair of same-class multimedia items as one set element in a similar sample set, and storing the high-dimensional feature vectors of any pair of different-class multimedia items as one set element in a dissimilar sample set; and constructing the projection matrix $P$ that minimizes $\hat{L}$ in formula 1:
$\hat{L} = \alpha E\{\|PX - PX'\|^2 \mid Q\} - E\{\|PX - PX'\|^2 \mid R\}$ (formula 1)
Wherein $Q$ is the similar sample set; $R$ is the dissimilar sample set; $E\{\|PX - PX'\|^2 \mid Q\}$ denotes the expected distance between same-class high-dimensional feature vectors in $Q$ after transformation by $P$; $E\{\|PX - PX'\|^2 \mid R\}$ denotes the expected distance between different-class high-dimensional feature vectors in $R$ after transformation by $P$; and $\alpha$ is a preset weight;
A first threshold vector determination module, configured to compute the m-dimensional vector that minimizes $L$ in formula 4, denoted $U(u_1, u_2, \ldots, u_m)$, and to use it as the threshold vector:
$L = E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid R\} - \alpha E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid Q\}$ (formula 4)
Wherein $E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid Q\}$ denotes the mean distance between the sign vectors obtained when same-class high-dimensional feature vectors in $Q$ are transformed by $P$ and their signs are determined by comparison with the threshold vector; $E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid R\}$ denotes the mean distance between the sign vectors obtained when different-class high-dimensional feature vectors in $R$ are transformed by $P$ and their signs are determined by comparison with the threshold vector.
Preferably, the first threshold vector determination module specifically comprises:
A minimum computation unit, configured to find the value of $u_i$ that minimizes expression 7, where $i$ is a natural number from 1 to m;
$E\{\operatorname{sign}((P_i^T X + u_i)(P_i^T X' + u_i)) \mid R\} - \alpha E\{\operatorname{sign}((P_i^T X + u_i)^T (P_i^T X' + u_i)) \mid Q\}$ (expression 7)
Wherein $P_i^T$ is the i-th row vector of the projection matrix $P$, and $u_i$ is the i-th element of $U(u_1, u_2, \ldots, u_m)$;
A vector composition unit, configured to form the m-dimensional vector $U(u_1, u_2, \ldots, u_m)$ from the $u_1$ to $u_m$ obtained by the minimum computation unit, and to use it as the threshold vector.
Further, the multimedia information retrieval system based on bit vectors also comprises:
A projection matrix construction module, configured to train the projection matrix $P$ from the multimedia information stored in the database: for the multimedia information stored in the database, storing the high-dimensional feature vectors of any pair of same-class multimedia items as one set element in a similar sample set, and storing the high-dimensional feature vectors of any pair of different-class multimedia items as one set element in a dissimilar sample set; and constructing the projection matrix $P$ that minimizes $\hat{L}$ in formula 1:
$\hat{L} = \alpha E\{\|PX - PX'\|^2 \mid Q\} - E\{\|PX - PX'\|^2 \mid R\}$ (formula 1)
Wherein $Q$ is the similar sample set; $R$ is the dissimilar sample set; $E\{\|PX - PX'\|^2 \mid Q\}$ denotes the expected distance between same-class high-dimensional feature vectors in $Q$ after transformation by $P$; $E\{\|PX - PX'\|^2 \mid R\}$ denotes the expected distance between different-class high-dimensional feature vectors in $R$ after transformation by $P$; and $\alpha$ is a preset weight;
A second threshold vector determination module, configured to compute the m-dimensional vector that minimizes $L$ in formula 4, denoted $U(u_1, u_2, \ldots, u_m)$:
$L = E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid R\} - \alpha E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid Q\}$ (formula 4)
Wherein $E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid Q\}$ denotes the mean distance between the sign vectors obtained when same-class high-dimensional feature vectors in $Q$ are transformed by $P$ and their signs are determined by comparison with the threshold vector; $E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid R\}$ denotes the mean distance between the sign vectors obtained when different-class high-dimensional feature vectors in $R$ are transformed by $P$ and their signs are determined by comparison with the threshold vector;
The second threshold vector determination module then optimizes $U(u_1, u_2, \ldots, u_m)$ to obtain the threshold vector.
Preferably, the second threshold vector determination module specifically comprises:
A minimum computation unit, configured to find the value of $u_i$ that minimizes expression 7, where $i$ is a natural number from 1 to m;
$E\{\operatorname{sign}((P_i^T X + u_i)(P_i^T X' + u_i)) \mid R\} - \alpha E\{\operatorname{sign}((P_i^T X + u_i)^T (P_i^T X' + u_i)) \mid Q\}$ (expression 7)
Wherein $P_i^T$ is the i-th row vector of the projection matrix $P$, and $u_i$ is the i-th element of $U(u_1, u_2, \ldots, u_m)$;
A vector optimization unit, configured to optimize each element $u_i$ of $U(u_1, u_2, \ldots, u_m)$: for each element $u_i$ of the threshold vector $U$, using formulas 5 and 6 to find the value of $u_i$ that minimizes $FN(u_i) + \alpha \times FP(u_i)$, and taking it as the optimized value of $u_i$;
$FN(u_i) = \Pr(\min\{z, z'\} \ge u_i \ \text{or}\ \max\{z, z'\} < u_i \mid Q)$ (formula 5)
$FP(u_i) = \Pr(\min\{z, z'\} < u_i \le \max\{z, z'\} \mid R)$ (formula 6)
In formula 5, $z$ and $z'$ denote the i-th elements of the vectors obtained by transforming, by the projection matrix $P$, the pair of same-class high-dimensional feature vectors $X$ and $X'$ in any set element of $Q$; $\Pr(\min\{z, z'\} \ge u_i \ \text{or}\ \max\{z, z'\} < u_i \mid Q)$ denotes the probability, over the set elements of $Q$, that $u_i$ satisfies the condition $\min\{z, z'\} \ge u_i$ or $\max\{z, z'\} < u_i$;
In formula 6, $z$ and $z'$ denote the i-th elements of the vectors obtained by transforming, by the projection matrix $P$, the pair of different-class high-dimensional feature vectors $X$ and $X'$ in any set element of $R$; $\Pr(\min\{z, z'\} < u_i \le \max\{z, z'\} \mid R)$ denotes the probability, over the set elements of $R$, that $u_i$ satisfies the condition $\min\{z, z'\} < u_i \le \max\{z, z'\}$;
A vector composition unit, configured to form the threshold vector from the $u_1$ to $u_m$ optimized by the vector optimization unit.
Preferably, the projection matrix construction module specifically comprises:
A minimum eigenvector computation unit, configured to compute the m smallest n-dimensional eigenvectors of the matrix $\Sigma_g$ (that is, the eigenvectors associated with its m smallest eigenvalues); wherein
[Figure BDA0000368042200000071: formula image not reproduced]
$\Sigma_Q$ is given by formula 2 and $\Sigma_R$ by formula 3:
$\Sigma_Q = E\{(X - X')(X - X')^T \mid Q\}$ (formula 2)
In formula 2, $E\{(X - X')(X - X')^T \mid Q\}$ denotes the mean of the covariance matrices between same-class high-dimensional feature vectors in $Q$;
$\Sigma_R = E\{(X - X')(X - X')^T \mid R\}$ (formula 3)
In formula 3, $E\{(X - X')(X - X')^T \mid R\}$ denotes the mean of the covariance matrices between different-class high-dimensional feature vectors in $R$;
A projection matrix determination unit, configured to form the $m \times n$ projection matrix $P$ from the m n-dimensional eigenvectors so obtained.
In the technical solution of the present invention, the bit vectors obtained by converting the high-dimensional feature vectors of multimedia information cluster within each class and spread apart between classes, which preserves the discriminative power of the original vectors. Mature retrieval techniques for low-dimensional bit vectors can therefore be applied to achieve higher retrieval efficiency and lower retrieval cost than techniques based on high-dimensional feature vectors, while making the results of bit-vector-based retrieval more accurate and reducing the false-match rate.
Brief description of the drawings
Fig. 1a is a flowchart of the method of training the projection matrix from the multimedia information stored in the database, according to an embodiment of the present invention;
Fig. 1b is a flowchart of the specific method of constructing the projection matrix from $\Sigma_g$, according to an embodiment of the present invention;
Fig. 2 is a flowchart of the multimedia information retrieval method based on bit vectors, according to an embodiment of the present invention;
Fig. 3a is a block diagram of one internal structure of the multimedia information retrieval system based on bit vectors, according to an embodiment of the present invention;
Fig. 3b is a block diagram of another internal structure of the multimedia information retrieval system based on bit vectors, according to an embodiment of the present invention;
Fig. 4 is a flowchart of the method of retrieving multimedia information according to bit vectors, according to an embodiment of the present invention.
Detailed description
The technical solution of the present invention is described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Terms such as "module" and "system" used in this application are intended to cover computer-related entities, such as but not limited to hardware, firmware, a combination of hardware and software, software, or software in execution. For example, a module may be, but is not limited to, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computing device and the computing device itself can be modules. One or more modules may reside within a process and/or thread of execution, and a module may be located on one computer and/or distributed between two or more computers.
The technical solution of the present invention constructs a mapping function that maps high-dimensional feature vectors to low-dimensional bit vectors and also guarantees the following: high-dimensional feature vectors that were originally similar yield bit vectors that are more similar after mapping, and high-dimensional feature vectors that were originally dissimilar yield bit vectors that are more dissimilar after mapping. In other words, the bit vectors produced by this mapping function cluster within each class and spread apart between classes, which preserves the discriminative power of the original vectors. Mature retrieval techniques for low-dimensional bit vectors can then be applied to achieve higher retrieval efficiency and lower retrieval cost than techniques based on high-dimensional feature vectors.
The technical solution of the present invention is described in detail below with reference to the accompanying drawings. In the embodiments of the present invention, before the feature data of the current multimedia information is extracted and retrieved, a mapping function must first be constructed that maps the n-dimensional high-dimensional feature vector of the current multimedia information to a low-dimensional binarized vector, denoted:
$Y = \operatorname{sign}(PX + U)$,
Wherein $P$ is the $m \times n$ projection matrix; $U$ is the m-dimensional threshold vector, denoted $U(u_1, u_2, \ldots, u_m)$; $X$ is the n-dimensional high-dimensional feature vector, denoted $X(x_1, x_2, \ldots, x_n)$, each element of which is a real number; $\operatorname{sign}(PX + U)$ takes the sign of each element of the vector $PX + U$ and yields a binarized sign vector whose elements are $-1$ or $+1$: if an element of $PX + U$ is negative, the corresponding element of the sign vector is $-1$, and if it is positive, the corresponding element is $+1$; $Y$ is the resulting m-dimensional binarized sign vector, denoted $Y(y_1, y_2, \ldots, y_m)$. In practice, each element of the sign vector can be represented by one bit: for example, a negative element by bit 0 and a positive element by bit 1, which gives the corresponding bit vector.
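The mapping function can be sketched in a few lines. This is an illustrative sketch rather than the patented implementation: `to_bit_vector` is a hypothetical name, P is passed as a list of rows, and an element of $PX + U$ that is exactly zero is mapped to bit 1, a case the text does not specify.

```python
from typing import List, Sequence


def to_bit_vector(P: Sequence[Sequence[float]], U: Sequence[float],
                  x: Sequence[float]) -> List[int]:
    """Map an n-dimensional feature vector x to m bits via sign(PX + U).

    Negative elements of PX + U become bit 0 and positive elements bit 1,
    as described in the text; treating an exact zero as positive is an
    assumption made for this sketch.
    """
    bits = []
    for row, u in zip(P, U):
        w = sum(p * x_j for p, x_j in zip(row, x))  # one element of PX
        bits.append(1 if w + u >= 0 else 0)
    return bits


# Hypothetical example with n = 3 and m = 2.
P = [[1.0, 0.0, -1.0],
     [0.0, 2.0, 0.0]]
U = [-0.5, -3.0]
bits = to_bit_vector(P, U, [2.0, 1.0, 1.0])
```

Here $PX = (1, 2)$ and $PX + U = (0.5, -1)$, so the first bit is set and the second is cleared.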
In the remainder of this document, the mapping function is constructed with the n-dimensional high-dimensional feature vector $X(x_1, x_2, \ldots, x_n)$ as a column vector, and mapping with the constructed function yields an m-dimensional column vector, namely the m-dimensional bit vector. Following the technical solution disclosed in the embodiments of the present invention, those skilled in the art can readily construct the mapping function with $X(x_1, x_2, \ldots, x_n)$ as a row vector instead and obtain the m-dimensional bit vector as a row vector. Therefore, any method or concept that constructs the mapping function from the high-dimensional feature vector $X(x_1, x_2, \ldots, x_n)$, whether as a row vector or a column vector, and then maps it to an m-dimensional bit vector falls within the scope of protection of the present invention.
Particularly, can train projection matrix P according to the classified multimedia messages of storing in data bank, and the projection matrix P of the m * n trained meets the following conditions: for the high dimensional feature vector of each classified multimedia messages of storing in data bank, wherein similar high dimensional feature vector is through the vectorial spacing expectation value after P conversion, with the difference minimum of the vectorial spacing expectation value of inhomogeneous high dimensional feature vector after the P conversion.As shown in Figure 1a, train the method for projection matrix P according to the multimedia messages of storing in data bank, comprise the steps:
S101: for the multimedia messages of storing in data bank, using wherein arbitrarily the high dimensional feature vector of a pair of similar multimedia messages as a set element, store in similar sample set; And will be wherein arbitrarily the high dimensional feature vector of a pair of inhomogeneous multimedia messages as a set element, store in non-similar sample set.
Particularly, for the multimedia messages of storing in data bank, according to the similarity between the high dimensional feature vector of multimedia messages, set up in advance the similar sample set that comprises similar high dimensional feature vector, be designated as Q, and the non-similar sample set that comprises inhomogeneous high dimensional feature vector, be designated as R.
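Step S101 can be illustrated with the following sketch, which assumes that every stored item carries a class label and that two items are "similar" exactly when their labels match. The function name `build_pair_sets`, the `(label, vector)` tuple layout and the sample data are hypothetical.

```python
from itertools import combinations


def build_pair_sets(samples):
    """Split all unordered pairs of samples into Q (same class) and R (different class).

    samples: list of (class_label, feature_vector) tuples.
    Returns (Q, R), each a list of (X, X_prime) pairs.
    """
    Q, R = [], []
    for (label_a, x_a), (label_b, x_b) in combinations(samples, 2):
        if label_a == label_b:
            Q.append((x_a, x_b))   # pair of similar feature vectors
        else:
            R.append((x_a, x_b))   # pair of different-class feature vectors
    return Q, R


samples = [
    ("cat", [1.0, 2.0]),
    ("cat", [1.1, 2.1]),
    ("dog", [5.0, 0.0]),
]
Q, R = build_pair_sets(samples)
print(len(Q), len(R))
```

With three items in two classes, one pair lands in Q and two pairs land in R. For a large database the number of pairs grows quadratically, so a practical implementation would probably sample pairs rather than enumerate all of them.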
S102: Construct the projection matrix P that minimizes $\hat{L}$ in the following formula 1:
$$\hat{L} = \alpha E\{\|PX - PX'\|^2 \mid Q\} - E\{\|PX - PX'\|^2 \mid R\}$$ (formula 1)
Formula 1 is a predefined objective function, where Q is the similar sample set and R is the non-similar sample set. In the term conditioned on Q, X and X' denote the pair of similar high-dimensional feature vectors in any set element of Q; in the term conditioned on R, X and X' denote the pair of different-class high-dimensional feature vectors in any set element of R. PX - PX' is the difference between the vectors obtained from X and X' after transformation by P, and $\|PX - PX'\|^2$ is its squared norm, i.e., the squared distance between the two transformed vectors (referred to in this document as the covariance of the distance).
$E\{\|PX - PX'\|^2 \mid Q\}$ denotes the expected distance between similar high-dimensional feature vectors in Q after transformation by P, i.e., the mean of the squared distances between the transformed similar vectors; $E\{\|PX - PX'\|^2 \mid R\}$ denotes the expected distance between different-class high-dimensional feature vectors in R after transformation by P, i.e., the mean of the squared distances between the transformed different-class vectors. α is a preset weight with a value between 0.5 and 1. It weights the distance measure of similar high-dimensional feature vectors relative to that of non-similar ones: the larger α is, the more weight the distance between similar vectors receives and the more tightly similar vectors cluster within their class after transformation by P; conversely, the less weight the distance between different-class vectors receives, the more dispersed the different-class vectors are between classes after transformation by P.
Specifically, from linear algebra it follows that:
$$E\{\|PX - PX'\|^2 \mid Q\} = \operatorname{tr}\{P \Sigma_Q P^T\}$$ (formula 8)
$$E\{\|PX - PX'\|^2 \mid R\} = \operatorname{tr}\{P \Sigma_R P^T\}$$ (formula 9)
where $P^T$ denotes the transpose of P; $\operatorname{tr}\{P \Sigma_Q P^T\}$ denotes the trace of the matrix $P \Sigma_Q P^T$, and $\operatorname{tr}\{P \Sigma_R P^T\}$ denotes the trace of the matrix $P \Sigma_R P^T$; $\Sigma_Q$ is given by formula 2 and $\Sigma_R$ by formula 3:
$$\Sigma_Q = E\{(X - X')(X - X')^T \mid Q\}$$ (formula 2)
In formula 2, X and X' denote the pair of similar high-dimensional feature vectors in any set element of Q, and $(X - X')^T$ denotes the transpose of (X - X'); $E\{(X - X')(X - X')^T \mid Q\}$ denotes the mean of the covariance matrices between similar high-dimensional feature vectors in Q, i.e., the element-wise average of these covariance matrices;
$$\Sigma_R = E\{(X - X')(X - X')^T \mid R\}$$ (formula 3)
In formula 3, X and X' denote the pair of different-class high-dimensional feature vectors in any set element of R; $E\{(X - X')(X - X')^T \mid R\}$ denotes the mean of the covariance matrices between different-class high-dimensional feature vectors in R, i.e., the element-wise average of these covariance matrices.
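As an illustration of formulas 2, 3, 8 and 9, the sketch below estimates the matrix E{(X-X')(X-X')^T} from a list of vector pairs and evaluates tr{PΣP^T}. It is a plain-Python sketch under the assumption that the expectation is replaced by a simple average over the stored pairs; the names `difference_covariance` and `trace_form` and the toy data are hypothetical.

```python
def difference_covariance(pairs):
    """Average of (X - X')(X - X')^T over all pairs, as an n x n list of lists."""
    n = len(pairs[0][0])
    sigma = [[0.0] * n for _ in range(n)]
    for x, x_prime in pairs:
        d = [a - b for a, b in zip(x, x_prime)]
        for r in range(n):
            for c in range(n):
                sigma[r][c] += d[r] * d[c]   # outer product, accumulated
    return [[value / len(pairs) for value in row] for row in sigma]


def trace_form(P, sigma):
    """tr{P Sigma P^T}, computed as the sum over rows p of p Sigma p^T."""
    total = 0.0
    for row in P:
        for r, p_r in enumerate(row):
            for c, p_c in enumerate(row):
                total += p_r * sigma[r][c] * p_c
    return total


# Two pairs of 2-dimensional vectors standing in for the set Q.
Q = [([1.0, 2.0], [0.0, 2.0]),
     ([3.0, 1.0], [3.0, 3.0])]
sigma_q = difference_covariance(Q)

# A 1 x 2 projection matrix; the mean of ||PX - PX'||^2 over Q is (1 + 4) / 2.
P = [[1.0, 1.0]]
print(sigma_q, trace_form(P, sigma_q))
```

For this data the trace equals 2.5, which matches the directly computed mean squared distance between the projected pairs, as formula 8 states.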
In this way, according to formulas 8 and 9, formula 1 can be converted into formula 10:
$$\hat{L} = \alpha \operatorname{tr}\{P \Sigma_Q P^T\} - \operatorname{tr}\{P \Sigma_R P^T\}$$ (formula 10)
Further, the expression on the right-hand side of formula 10 is multiplied by $\Sigma_R^{-1/2}$ (the square root of the inverse matrix $\Sigma_R^{-1}$ of $\Sigma_R$) and then by $\Sigma_R^{-T/2}$ (the transpose of $\Sigma_R^{-1/2}$), so that $\operatorname{tr}\{P \Sigma_R P^T\}$ becomes a constant and $\operatorname{tr}\{P \Sigma_Q P^T\}$ becomes the expression on the right-hand side of formula 11:
$$\hat{L} \propto \operatorname{tr}\{P \Sigma_R^{-1/2} \Sigma_Q \Sigma_R^{-T/2} P^T\}$$ (formula 11)
Formula 11 states that $\hat{L}$ is proportional to $\operatorname{tr}\{P \Sigma_R^{-1/2} \Sigma_Q \Sigma_R^{-T/2} P^T\}$.
Moreover,
$$\operatorname{tr}\{P \Sigma_R^{-1/2} \Sigma_Q \Sigma_R^{-T/2} P^T\} = \operatorname{tr}\{P \Sigma_Q \Sigma_R^{-1} P^T\} = \operatorname{tr}\{P \Sigma_G P^T\}$$
where $\Sigma_G = \Sigma_Q \Sigma_R^{-1}$.
In this way, the projection matrix P that minimizes $\hat{L}$ in formula 1 can be constructed from $\Sigma_G$. The flowchart of the specific method is shown in Fig. 1b and comprises the following steps:
S111: Compute the m smallest n-dimensional eigenvectors of $\Sigma_G$.
Specifically, $\Sigma_G$ is a positive semidefinite symmetric matrix; using linear algebra, the m eigenvectors of $\Sigma_G$ corresponding to its m smallest eigenvalues are computed, giving the m smallest n-dimensional eigenvectors.
S112: Form the m × n projection matrix P from the m computed n-dimensional eigenvectors.
Specifically, the m computed n-dimensional eigenvectors form an m × n orthogonal matrix, i.e., the projection matrix P; this projection matrix P minimizes $\hat{L}$ in formula 1.
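Why the eigenvectors with the smallest eigenvalues are the right choice can be sketched as follows. The sketch assumes, as stated above, that $\Sigma_G$ is symmetric positive semidefinite, and additionally assumes that the rows of P are orthonormal ($P P^T = I_m$); the symbols $\lambda_k$, $v_k$ and $p_i$ are introduced here for illustration only.

```latex
% Eigen-decomposition of \Sigma_G with eigenvalues in ascending order
\Sigma_G = \sum_{k=1}^{n} \lambda_k \, v_k v_k^{T},
\qquad 0 \le \lambda_1 \le \lambda_2 \le \dots \le \lambda_n .

% With p_i denoting the i-th row of P (as a column vector) and P P^{T} = I_m:
\operatorname{tr}\{P \Sigma_G P^{T}\}
  = \sum_{i=1}^{m} p_i^{T} \Sigma_G \, p_i
  \;\ge\; \sum_{i=1}^{m} \lambda_i ,

% with equality when the rows of P are the eigenvectors of the m smallest eigenvalues:
p_i = v_i , \quad i = 1, \dots, m .
```

Under these assumptions the trace cannot fall below the sum of the m smallest eigenvalues, and stacking the corresponding eigenvectors as the rows of P attains that bound.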
After the projection matrix P has been trained from the multimedia information stored in the database, the threshold vector U can be calculated. The threshold vector U satisfies the following condition: over the high-dimensional feature vectors of the multimedia information stored in the database, the difference between the expected distance between similar high-dimensional feature vectors after transformation by P, comparison with the threshold vector and binarization, and the expected distance between different-class high-dimensional feature vectors after transformation by P, comparison with the threshold vector and binarization, is minimized.
Calculating the threshold vector U specifically means calculating the m-dimensional vector that minimizes L in the following formula 4 and taking it as the threshold vector U:
$$L = E\{\operatorname{sign}(PX+U)^T \operatorname{sign}(PX'+U) \mid R\} - \alpha E\{\operatorname{sign}(PX+U)^T \operatorname{sign}(PX'+U) \mid Q\}$$ (formula 4)
where $E\{\operatorname{sign}(PX+U)^T \operatorname{sign}(PX'+U) \mid Q\}$ denotes the mean of the distance between the sign vectors obtained when similar high-dimensional feature vectors in Q are transformed by P and compared with the threshold vector U to determine their signs; $E\{\operatorname{sign}(PX+U)^T \operatorname{sign}(PX'+U) \mid R\}$ denotes the mean of the distance between the sign vectors obtained when different-class high-dimensional feature vectors in R are transformed by P and compared with the threshold vector U to determine their signs. The distance between sign vectors reflects the distance between the bit vectors obtained by binarizing those sign vectors.
Further, formula 4 is transformed:
$$\begin{aligned} L &= E\{\operatorname{sign}(PX+U)^T \operatorname{sign}(PX'+U) \mid R\} - \alpha E\{\operatorname{sign}(PX+U)^T \operatorname{sign}(PX'+U) \mid Q\} \\ &= \sum_{i=1}^{m} \left\{ E\{\operatorname{sign}(P_i^T X + u_i)\operatorname{sign}(P_i^T X' + u_i) \mid R\} - \alpha E\{\operatorname{sign}(P_i^T X + u_i)\operatorname{sign}(P_i^T X' + u_i) \mid Q\} \right\} \\ &= \sum_{i=1}^{m} \left\{ E\{\operatorname{sign}((P_i^T X + u_i)(P_i^T X' + u_i)) \mid R\} - \alpha E\{\operatorname{sign}((P_i^T X + u_i)(P_i^T X' + u_i)) \mid Q\} \right\} \end{aligned}$$
where $P_i^T$ denotes the i-th row vector of the projection matrix P; $u_i$ is the i-th element of $U(u_1, u_2, \ldots, u_m)$; and i is a natural number from 1 to m.
In this way, finding the m-dimensional threshold vector that minimizes L can be converted into m independent problems, each finding the value of $u_i$ that minimizes the following expression 7:
$$E\{\operatorname{sign}((P_i^T X + u_i)(P_i^T X' + u_i)) \mid R\} - \alpha E\{\operatorname{sign}((P_i^T X + u_i)(P_i^T X' + u_i)) \mid Q\}$$ (expression 7)
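One way to carry out this per-dimension search is sketched below. The sketch replaces each expectation in expression 7 by an average over the stored pairs and scans a finite set of candidate thresholds. The candidate set (midpoints between the negated projected values, plus one point beyond each end), the treatment of zero as positive, and all function names and data are assumptions made for the example rather than the procedure prescribed here.

```python
def sign(value):
    # Assumption: zero is treated as positive.
    return 1 if value >= 0 else -1


def expression7(u, z_pairs_R, z_pairs_Q, alpha):
    """Empirical value of expression 7 for one dimension.

    z_pairs_R / z_pairs_Q: lists of (z, z_prime), where z = P_i^T X and z_prime = P_i^T X'.
    """
    mean_R = sum(sign((z + u) * (zp + u)) for z, zp in z_pairs_R) / len(z_pairs_R)
    mean_Q = sum(sign((z + u) * (zp + u)) for z, zp in z_pairs_Q) / len(z_pairs_Q)
    return mean_R - alpha * mean_Q


def best_threshold(z_pairs_R, z_pairs_Q, alpha):
    """Scan candidate values of u_i and return the one minimizing expression 7."""
    # The objective only changes where some z + u crosses zero, i.e. at u = -z,
    # so one candidate inside each interval between those points is enough.
    points = sorted({-z for pair in z_pairs_R + z_pairs_Q for z in pair})
    candidates = [points[0] - 1.0]
    candidates += [(a + b) / 2 for a, b in zip(points, points[1:])]
    candidates.append(points[-1] + 1.0)
    return min(candidates, key=lambda u: expression7(u, z_pairs_R, z_pairs_Q, alpha))


# Projected values for one dimension i (hypothetical data).
z_pairs_Q = [(1.0, 1.2), (3.0, 3.4)]   # similar pairs: close together
z_pairs_R = [(1.0, 3.0), (1.2, 3.4)]   # different-class pairs: far apart
u_i = best_threshold(z_pairs_R, z_pairs_Q, alpha=1.0)
print(u_i, expression7(u_i, z_pairs_R, z_pairs_Q, 1.0))
```

In this toy data the chosen threshold places both similar pairs on the same side and splits both different-class pairs, so expression 7 reaches its smallest possible value of -(1 + α).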
After the values of $u_i$ that minimize expression 7 have been calculated, the resulting $u_1$ to $u_m$ form an m-dimensional vector, which can be taken as the threshold vector U. As a more preferred embodiment, the m-dimensional vector formed by $u_1$ to $u_m$ can be further optimized, and the optimized m-dimensional vector taken as the final threshold vector U:
Specifically, for each calculated element $u_i$, formulas 5 and 6 below are used to find the value of $u_i$ that minimizes $FN(u_i) + \alpha \times FP(u_i)$, which is taken as the optimized value of $u_i$:
$$FN(u_i) = \Pr(\min\{z, z'\} \ge u_i \ \text{or} \ \max\{z, z'\} < u_i \mid Q)$$ (formula 5)
$$FP(u_i) = \Pr(\min\{z, z'\} < u_i \le \max\{z, z'\} \mid R)$$ (formula 6)
where $z = P_i^T X$ and $z' = P_i^T X'$; $\min\{z, z'\}$ denotes the smaller of the two elements z and z', and $\max\{z, z'\}$ denotes the larger of the two.
In formula 5, z and z' denote the i-th elements of the vectors obtained when the pair of similar high-dimensional feature vectors X and X' in any set element of Q are each transformed by the projection matrix P; $\Pr(\min\{z, z'\} \ge u_i \ \text{or} \ \max\{z, z'\} < u_i \mid Q)$ denotes the probability that, for the set elements in Q, $u_i$ satisfies the condition $\min\{z, z'\} \ge u_i$ or $\max\{z, z'\} < u_i$;
In formula 6, z and z' denote the i-th elements of the vectors obtained when the pair of different-class high-dimensional feature vectors X and X' in any set element of R are each transformed by the projection matrix P; $\Pr(\min\{z, z'\} < u_i \le \max\{z, z'\} \mid R)$ denotes the probability that, for the set elements in R, $u_i$ satisfies the condition $\min\{z, z'\} < u_i \le \max\{z, z'\}$;
The optimized $u_1$ to $u_m$ form the m-dimensional vector that is taken as the final threshold vector U.
The value of the first term of expression 7, $E\{\operatorname{sign}((P_i^T X + u_i)(P_i^T X' + u_i)) \mid R\}$, is proportional to $FP(u_i)$, and the value of the second term, $E\{\operatorname{sign}((P_i^T X + u_i)(P_i^T X' + u_i)) \mid Q\}$, is proportional to $FN(u_i)$; both can be estimated easily from the classified multimedia information using mathematical statistics. Therefore, the final optimized threshold vector U can be determined quickly and accurately by finding the value of $u_i$ that minimizes $FN(u_i) + \alpha \times FP(u_i)$.
According to the above method, once the projection matrix P and the threshold vector U have been constructed, the mapping function Y = sign(PX+U), which maps the n-dimensional high-dimensional feature vector of the current multimedia information to a low-dimensional binarized vector, can be constructed. Moreover, the bit vectors obtained by converting the original high-dimensional feature vectors through this mapping function are clustered within each class and dispersed between classes, which preserves the discriminative power of the original vectors.
Using the mapping function constructed above, a high-dimensional feature vector can be mapped to a low-dimensional binarized vector, and multimedia information retrieval based on bit vectors can then be performed. The flowchart of this method is shown in Fig. 2 and comprises the following steps:
S201: After the feature data of the current multimedia information is extracted, obtain the n-dimensional high-dimensional feature vector $X(x_1, x_2, \ldots, x_n)$ of the current multimedia information.
S202: Transform $X(x_1, x_2, \ldots, x_n)$ by the projection matrix P to obtain the m-dimensional intermediate vector $W(w_1, w_2, \ldots, w_m)$.
Specifically, the projection matrix P constructed when building the mapping function is used to transform the n-dimensional high-dimensional feature vector $X(x_1, x_2, \ldots, x_n)$, yielding the m-dimensional intermediate vector PX, denoted $W(w_1, w_2, \ldots, w_m)$.
S203: Compare each element of the m-dimensional threshold vector with the corresponding element of the intermediate vector, and binarize the intermediate vector according to the comparison result to obtain the m-dimensional bit vector of the current multimedia information; here, m is less than n.
Specifically, using the m-dimensional threshold vector U calculated when building the mapping function, each element of $U(u_1, u_2, \ldots, u_m)$ is compared with the corresponding element of the intermediate vector $W(w_1, w_2, \ldots, w_m)$, and the intermediate vector is binarized according to the comparison result to obtain the m-dimensional bit vector of the current multimedia information.
The intermediate vector can be binarized according to the mapping function: W+U, i.e., PX+U, is computed; sign(PX+U) is then computed to obtain the sign vector, and each element of the sign vector is represented by a bit (0 or 1) to obtain the corresponding bit vector. Because m is less than n, binarizing the intermediate vector maps the n-dimensional high-dimensional feature vector of the current multimedia information to a low-dimensional (m-dimensional) bit vector.
S204: According to the obtained bit vector, search the multimedia feature database for bit vectors similar to it, and output the multimedia information corresponding to the bit vectors found as the retrieval result.
Specifically, the retrieval of multimedia information based on bit vectors can be performed according to an existing bit-vector-based multimedia information retrieval method (such as the method shown in Fig. 4 below) to obtain the retrieval result.
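As a baseline for step S204, similar bit vectors can be found by a linear scan that compares Hamming distances. The sketch below stores each bit vector as a Python integer; the function names, the identifiers in `database` and the distance bound are hypothetical, and a linear scan is used here only for illustration, not as the indexed method of Fig. 4.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two bit vectors (stored as ints) differ."""
    return bin(a ^ b).count("1")


def linear_search(query: int, database: dict, max_distance: int) -> list:
    """Return the identifiers whose bit vector is within max_distance of the query."""
    return sorted(item for item, bits in database.items()
                  if hamming(query, bits) <= max_distance)


# Hypothetical 8-bit vectors keyed by multimedia item identifier.
database = {
    "img_a": 0b10110010,
    "img_b": 0b10110011,
    "img_c": 0b01001101,
}
result = linear_search(0b10110010, database, max_distance=1)
print(result)
```

The scan touches every stored vector, so its cost grows linearly with the database size; a segmented index reduces the number of full comparisons.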
An embodiment of the present invention also provides a multimedia information retrieval system based on bit vectors. Its internal structure, shown in Fig. 3a or Fig. 3b, comprises a bit vector conversion module 301 and a retrieval module 302.
The bit vector conversion module 301 is configured to: after the feature data of the current multimedia information is extracted, obtain the n-dimensional high-dimensional feature vector of the current multimedia information, denoted $X(x_1, x_2, \ldots, x_n)$; transform $X(x_1, x_2, \ldots, x_n)$ by the projection matrix P to obtain the m-dimensional intermediate vector $W(w_1, w_2, \ldots, w_m)$; then compare each element of the m-dimensional threshold vector with the corresponding element of the intermediate vector, and binarize the intermediate vector according to the comparison result to obtain the m-dimensional bit vector of the current multimedia information; here, m is less than n.
The retrieval module 302 is configured to search the multimedia feature database, according to the bit vector of the current multimedia information obtained by the bit vector conversion module 301, for bit vectors similar to that bit vector, and to output the multimedia information corresponding to the bit vectors found as the retrieval result.
The projection matrix P is an m × n matrix and satisfies the following condition: over the high-dimensional feature vectors of the classified multimedia information stored in the database, the difference between the expected distance between similar high-dimensional feature vectors after transformation by P and the expected distance between different-class high-dimensional feature vectors after transformation by P is minimized.
The threshold vector satisfies the following condition: over the high-dimensional feature vectors of the multimedia information stored in the database, the difference between the expected distance between similar high-dimensional feature vectors after transformation by P, comparison with the threshold vector and binarization, and the expected distance between different-class high-dimensional feature vectors after transformation by P, comparison with the threshold vector and binarization, is minimized.
The bit vector conversion module 301 specifically comprises a high-dimensional feature vector determining unit 311, an intermediate vector computing unit 312 and a threshold comparing unit 313.
The high-dimensional feature vector determining unit 311 is configured to obtain, after the feature data of the current multimedia information is extracted, the n-dimensional high-dimensional feature vector of the current multimedia information, denoted $X(x_1, x_2, \ldots, x_n)$.
The intermediate vector computing unit 312 is configured to transform the high-dimensional feature vector $X(x_1, x_2, \ldots, x_n)$ obtained by the high-dimensional feature vector determining unit 311 by the projection matrix P to obtain the m-dimensional intermediate vector $W(w_1, w_2, \ldots, w_m)$.
The threshold comparing unit 313 is configured to compare each element of the m-dimensional threshold vector with the corresponding element of the intermediate vector obtained by the intermediate vector computing unit 312, and to binarize the intermediate vector according to the comparison result to obtain the m-dimensional bit vector of the current multimedia information; here, m is less than n.
Further, the multimedia information retrieval system based on bit vectors also comprises a projection matrix building module 303.
The projection matrix building module 303 is configured to train the projection matrix P from the multimedia information stored in the database: for the multimedia information stored in the database, take the high-dimensional feature vectors of each pair of similar multimedia information as one set element and store it in the similar sample set; take the high-dimensional feature vectors of each pair of different-class multimedia information as one set element and store it in the non-similar sample set; and construct the projection matrix P that minimizes $\hat{L}$ in the following formula 1:
$$\hat{L} = \alpha E\{\|PX - PX'\|^2 \mid Q\} - E\{\|PX - PX'\|^2 \mid R\}$$ (formula 1)
where Q is the similar sample set; R is the non-similar sample set; $E\{\|PX - PX'\|^2 \mid Q\}$ denotes the expected distance between similar high-dimensional feature vectors in Q after transformation by P; $E\{\|PX - PX'\|^2 \mid R\}$ denotes the expected distance between different-class high-dimensional feature vectors in R after transformation by P; and α is a preset weight.
The projection matrix building module 303 specifically comprises a minimum eigenvector computing unit 331 and a projection matrix determining unit 332.
The minimum eigenvector computing unit 331 is configured to compute the m smallest n-dimensional eigenvectors of the matrix $\Sigma_G$, where $\Sigma_G = \Sigma_Q \Sigma_R^{-1}$, $\Sigma_Q$ is given by formula 2 and $\Sigma_R$ by formula 3:
$$\Sigma_Q = E\{(X - X')(X - X')^T \mid Q\}$$ (formula 2)
In formula 2, $E\{(X - X')(X - X')^T \mid Q\}$ denotes the mean of the covariance matrices between similar high-dimensional feature vectors in Q;
$$\Sigma_R = E\{(X - X')(X - X')^T \mid R\}$$ (formula 3)
In formula 3, $E\{(X - X')(X - X')^T \mid R\}$ denotes the mean of the covariance matrices between different-class high-dimensional feature vectors in R.
The projection matrix determining unit 332 is configured to form the m × n projection matrix P from the m n-dimensional eigenvectors computed by the minimum eigenvector computing unit 331.
Further, the multimedia information retrieval system based on bit vectors also comprises a first threshold vector determination module 304 (as shown in Fig. 3a) or a second threshold vector determination module 305 (as shown in Fig. 3b).
The first threshold vector determination module 304 is configured to calculate the m-dimensional vector that minimizes L in the following formula 4, denoted $U(u_1, u_2, \ldots, u_m)$, and to take it as the threshold vector:
$$L = E\{\operatorname{sign}(PX+U)^T \operatorname{sign}(PX'+U) \mid R\} - \alpha E\{\operatorname{sign}(PX+U)^T \operatorname{sign}(PX'+U) \mid Q\}$$ (formula 4)
where $E\{\operatorname{sign}(PX+U)^T \operatorname{sign}(PX'+U) \mid Q\}$ denotes the mean of the distance between the sign vectors obtained when similar high-dimensional feature vectors in Q are transformed by P and compared with the threshold vector to determine their signs; $E\{\operatorname{sign}(PX+U)^T \operatorname{sign}(PX'+U) \mid R\}$ denotes the mean of the distance between the sign vectors obtained when different-class high-dimensional feature vectors in R are transformed by P and compared with the threshold vector to determine their signs.
The first threshold vector determination module 304 specifically comprises a minimum value computing unit 341 and a vector composing unit 342.
The minimum value computing unit 341 is configured to find the value of $u_i$ that minimizes the following expression 7, where i is a natural number from 1 to m:
$$E\{\operatorname{sign}((P_i^T X + u_i)(P_i^T X' + u_i)) \mid R\} - \alpha E\{\operatorname{sign}((P_i^T X + u_i)(P_i^T X' + u_i)) \mid Q\}$$ (expression 7)
where $P_i^T$ is the i-th row vector of the projection matrix P, and $u_i$ is the i-th element of $U(u_1, u_2, \ldots, u_m)$.
The vector composing unit 342 is configured to form the m-dimensional vector $U(u_1, u_2, \ldots, u_m)$ from the values $u_1$ to $u_m$ obtained by the minimum value computing unit 341 and to take it as the threshold vector.
The second threshold vector determination module 305 is configured to calculate the m-dimensional vector that minimizes L in the following formula 4, denoted $U(u_1, u_2, \ldots, u_m)$:
$$L = E\{\operatorname{sign}(PX+U)^T \operatorname{sign}(PX'+U) \mid R\} - \alpha E\{\operatorname{sign}(PX+U)^T \operatorname{sign}(PX'+U) \mid Q\}$$ (formula 4)
where $E\{\operatorname{sign}(PX+U)^T \operatorname{sign}(PX'+U) \mid Q\}$ denotes the mean of the distance between the sign vectors obtained when similar high-dimensional feature vectors in Q are transformed by P and compared with the threshold vector to determine their signs; $E\{\operatorname{sign}(PX+U)^T \operatorname{sign}(PX'+U) \mid R\}$ denotes the mean of the distance between the sign vectors obtained when different-class high-dimensional feature vectors in R are transformed by P and compared with the threshold vector to determine their signs;
The second threshold vector determination module 305 then optimizes $U(u_1, u_2, \ldots, u_m)$ to obtain the threshold vector.
The second threshold vector determination module 305 specifically comprises a minimum value computing unit 351, a vector optimization unit 352 and a vector composing unit 353.
The minimum value computing unit 351 has the same function as the minimum value computing unit 341 described above and is not described again here.
The vector optimization unit 352 is configured to optimize the values of the elements $u_i$ of $U(u_1, u_2, \ldots, u_m)$ found by the minimum value computing unit 351: for each element $u_i$ of the threshold vector U, formulas 5 and 6 below are used to find the value of $u_i$ that minimizes $FN(u_i) + \alpha \times FP(u_i)$, which is taken as the optimized value of $u_i$;
$$FN(u_i) = \Pr(\min\{z, z'\} \ge u_i \ \text{or} \ \max\{z, z'\} < u_i \mid Q)$$ (formula 5)
$$FP(u_i) = \Pr(\min\{z, z'\} < u_i \le \max\{z, z'\} \mid R)$$ (formula 6)
In formula 5, z and z' denote the i-th elements of the vectors obtained when the pair of similar high-dimensional feature vectors X and X' in any set element of Q are each transformed by the projection matrix P; $\Pr(\min\{z, z'\} \ge u_i \ \text{or} \ \max\{z, z'\} < u_i \mid Q)$ denotes the probability that, for the set elements in Q, $u_i$ satisfies the condition $\min\{z, z'\} \ge u_i$ or $\max\{z, z'\} < u_i$;
In formula 6, z and z' denote the i-th elements of the vectors obtained when the pair of different-class high-dimensional feature vectors X and X' in any set element of R are each transformed by the projection matrix P; $\Pr(\min\{z, z'\} < u_i \le \max\{z, z'\} \mid R)$ denotes the probability that, for the set elements in R, $u_i$ satisfies the condition $\min\{z, z'\} < u_i \le \max\{z, z'\}$.
The vector composing unit 353 is configured to form the threshold vector from the values $u_1$ to $u_m$ optimized by the vector optimization unit 352.
As shown in Fig. 4, the retrieval of multimedia information based on bit vectors can be performed according to an existing multimedia information retrieval method designed around segmented indexing, to obtain the retrieval result. The method comprises the following steps:
S401: Extract the feature data of the current multimedia information and map the n-dimensional high-dimensional feature vector of the current multimedia information to an m-dimensional bit vector, obtaining the bit vector of the current multimedia information.
Specifically, after the feature data of the current multimedia information is extracted, the method of the present invention described above is used to map the n-dimensional high-dimensional feature vector of the current multimedia information to an m-dimensional bit vector, obtaining the bit vector of the current multimedia information.
S402: Evenly partition the bit vector of the current multimedia information to obtain k subvectors of the current multimedia information.
Specifically, the j-th subvector of the current multimedia information consists of the j-th group of elements obtained by evenly partitioning the bit vector of the current multimedia information, where the j-th group comprises the ((j-1) × v + 1)-th through (j × v)-th elements of that bit vector; j is a natural number from 1 to k, and v is the number of vector elements in each subvector (i.e., in each group).
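The even partition of step S402 amounts to simple slicing. In the sketch below, the function name `split_bit_vector` is hypothetical, and the requirement that m be divisible by k is an assumption made to keep the example simple.

```python
def split_bit_vector(bits, k):
    """Evenly split a bit vector into k subvectors of v = len(bits) / k elements each."""
    if len(bits) % k != 0:
        # Assumption for this sketch: m must be divisible by k.
        raise ValueError("bit vector length must be a multiple of k")
    v = len(bits) // k
    # The j-th subvector (1-based) holds elements (j-1)*v + 1 through j*v.
    return [bits[(j - 1) * v: j * v] for j in range(1, k + 1)]


bits = [1, 0, 1, 1, 0, 0, 1, 0]   # m = 8
subvectors = split_bit_vector(bits, k=4)
print(subvectors)
```

With m = 8 and k = 4, each subvector holds v = 2 elements.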
S403: For each subvector of the current multimedia information, determine the candidate set corresponding to that subvector.
Specifically, a candidate set is determined for each subvector of the current multimedia information, giving k candidate sets. The candidate set corresponding to the j-th subvector of the current multimedia information is determined as follows: in the index set of the j-th index structure, find the index that is identical to the j-th subvector of the current multimedia information, and take the set of vector identifiers associated with that index as the candidate set corresponding to the j-th subvector of the current multimedia information.
The bit vector of each multimedia information item to be retrieved and its vector identifier are stored in the multimedia feature database in advance; for each multimedia information item to be retrieved, its bit vector is evenly partitioned in advance and a segmented index is built, giving k index structures.
S404: For each vector identifier in the candidate sets obtained, find the corresponding bit vector in the multimedia feature database.
Specifically, for the candidate sets corresponding to the subvectors of the current multimedia information obtained in step S403, i.e., the k candidate sets, the bit vector of each vector identifier in each candidate set is looked up in the multimedia feature database.
S405: Calculate the Hamming distance between the bit vector of the current multimedia information and each bit vector found.
S406: Output the multimedia information corresponding to the bit vectors whose Hamming distance satisfies the set condition as the retrieval result.
Specifically, a bit vector satisfying the set condition can be a bit vector whose Hamming distance to the bit vector of the current multimedia information is less than or equal to d. More preferably, k is greater than d (so that d is at most k - 1); this guarantees that nothing is missed: two bit vectors that differ in at most d positions must agree on at least one of the k subvectors, so the vector identifier of every bit vector satisfying the set condition is included in a candidate set. Usually, to meet retrieval requirements, those skilled in the art set the Hamming distance d to a small number, for example less than 3 or 4; v is therefore usually at least a two-digit number, or even larger.
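Steps S402 to S406 can be put together in a compact sketch. It assumes bit vectors are stored as equal-length strings of "0" and "1", that m is divisible by k, and that each of the k index structures is a dictionary from subvector to a set of vector identifiers; all names and data are hypothetical.

```python
from collections import defaultdict


def split(bits: str, k: int):
    """Evenly split a bit string into k substrings."""
    v = len(bits) // k
    return [bits[j * v:(j + 1) * v] for j in range(k)]


def build_index(database: dict, k: int):
    """Build k index structures: index[j] maps a j-th subvector to vector identifiers."""
    index = [defaultdict(set) for _ in range(k)]
    for vector_id, bits in database.items():
        for j, sub in enumerate(split(bits, k)):
            index[j][sub].add(vector_id)
    return index


def search(query: str, database: dict, index, k: int, d: int):
    """Return identifiers whose bit vector is within Hamming distance d of the query."""
    # S403: union of the k candidate sets (exact match on at least one subvector).
    candidates = set()
    for j, sub in enumerate(split(query, k)):
        candidates |= index[j].get(sub, set())
    # S404 to S406: look up full bit vectors and filter by Hamming distance.
    hits = []
    for vector_id in candidates:
        distance = sum(a != b for a, b in zip(query, database[vector_id]))
        if distance <= d:
            hits.append(vector_id)
    return sorted(hits)


database = {"a": "10110010", "b": "10110111", "c": "01001101", "d": "10100010"}
k, d = 4, 2                      # k > d, so no qualifying vector can be missed
index = build_index(database, k)
hits = search("10110010", database, index, k, d)
print(hits)
```

Item "c" shares no subvector with the query and is never compared in full, while "a", "b" and "d" are candidates and all lie within distance 2. Because k exceeds d, any vector within distance d must match the query exactly on at least one subvector, which is why restricting the full comparison to the candidate sets does not lose results.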
In summary, in the technical solution of the present invention, the bit vectors obtained by converting the high-dimensional feature vectors of multimedia information are clustered within each class and dispersed between classes, which preserves the discriminative power of the original vectors. Consequently, by applying mature retrieval techniques based on low-dimensional bit vectors, higher retrieval efficiency and lower retrieval cost can be achieved than with retrieval techniques based on high-dimensional feature vectors, and the results of bit-vector-based multimedia information retrieval are more accurate, with a lower false-match rate.
The above are only preferred embodiments of the present invention. It should be pointed out that those of ordinary skill in the art can make improvements and modifications without departing from the principles of the present invention, and such improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims (13)

1. A multimedia information retrieval method based on bit vectors, characterized by comprising:
after extracting the feature data of current multimedia information, obtaining the n-dimensional high-dimensional feature vector of the current multimedia information, denoted $X(x_1, x_2, \ldots, x_n)$;
transforming the high-dimensional feature vector $X(x_1, x_2, \ldots, x_n)$ by a projection matrix P to obtain an m-dimensional intermediate vector $W(w_1, w_2, \ldots, w_m)$;
comparing each element of an m-dimensional threshold vector with the corresponding element of the intermediate vector, and binarizing the intermediate vector according to the comparison result to obtain the m-dimensional bit vector of the current multimedia information, wherein m is less than n;
according to the obtained bit vector, searching a multimedia feature database for bit vectors similar to the bit vector, and outputting the multimedia information corresponding to the bit vectors found as the retrieval result;
wherein the projection matrix P is an m × n matrix and satisfies the following condition: over the high-dimensional feature vectors of the classified multimedia information stored in a database, the difference between the expected distance between similar high-dimensional feature vectors after transformation by P and the expected distance between different-class high-dimensional feature vectors after transformation by P is minimized;
the threshold vector satisfies the following condition: over the high-dimensional feature vectors of the multimedia information stored in the database, the difference between the expected distance between similar high-dimensional feature vectors after transformation by P, comparison with the threshold vector and binarization, and the expected distance between different-class high-dimensional feature vectors after transformation by P, comparison with the threshold vector and binarization, is minimized.
2. the method for claim 1, is characterized in that, before the characteristic of the current multimedia messages of described extraction, also comprises:
Train described projection matrix P by the multimedia messages of storing in described data bank:
For the multimedia messages of storing in described data bank, using wherein arbitrarily the high dimensional feature vector of a pair of similar multimedia messages as a set element, store in similar sample set; And
Using wherein arbitrarily the high dimensional feature vector of a pair of inhomogeneous multimedia messages as a set element, store in non-similar sample set;
Construct and make in following formula 1
Figure FDA0000368042190000011
minimum projection matrix P:
L ^ = &alpha;E { | | PX - PX ' | | 2 | Q } - E { | | PX - PX ' | | 2 | R } (formula 1)
Wherein, Q is described similar sample set; R is described non-similar sample set; E{PX-PX' 2q} means the vectorial spacing expectation value of high dimensional feature vector similar in described Q after the P conversion; E{PX-PX' 2r} means the vectorial spacing expectation value of inhomogeneous high dimensional feature vector after the P conversion in described R; The weights of α for setting.
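The construction of the two sample sets and the evaluation of Formula 1 can be illustrated with a short Python sketch. The helper names `build_pair_sets`, `mean_sq_dist` and `formula1_loss`, the class labels, and the toy data are assumptions made for illustration; expectations are estimated as plain averages over all pairs.

```python
import itertools
import numpy as np

def build_pair_sets(features, labels):
    """Split all unordered pairs into Q (same class) and R (different class)."""
    Q, R = [], []
    for i, j in itertools.combinations(range(len(features)), 2):
        pair = (features[i], features[j])
        (Q if labels[i] == labels[j] else R).append(pair)
    return Q, R

def mean_sq_dist(P, pairs):
    """Sample estimate of E{||PX - PX'||^2} over a set of pairs."""
    return float(np.mean([np.sum((P @ x - P @ y) ** 2) for x, y in pairs]))

def formula1_loss(P, Q, R, alpha=1.0):
    """Formula 1: alpha * E{.|Q} - E{.|R}; lower is better."""
    return alpha * mean_sq_dist(P, Q) - mean_sq_dist(P, R)

# Hypothetical toy data: two classes separated along the first axis.
features = np.array([[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 1.0]])
labels = [0, 0, 1, 1]
Q, R = build_pair_sets(features, labels)

P_first_axis = np.array([[1.0, 0.0]])   # keeps the discriminative axis
P_second_axis = np.array([[0.0, 1.0]])  # keeps the non-discriminative axis
loss_first = formula1_loss(P_first_axis, Q, R)
loss_second = formula1_loss(P_second_axis, Q, R)
```

On this toy data the projection onto the first axis gives the lower value of $\hat{L}$, which is the behaviour the claim asks the trained $P$ to have.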
3. The method of claim 2, characterized in that said constructing of the projection matrix $P$ that minimizes $\hat{L}$ in Formula 1 specifically comprises:
Computing the m smallest n-dimensional eigenvectors of the matrix $\Sigma_g$; wherein $\Sigma_q$ is as shown in Formula 2 and $\Sigma_r$ is as shown in Formula 3:
$$\Sigma_q = E\{(X - X')(X - X')^T \mid Q\}$$ (Formula 2)
In said Formula 2, $E\{(X - X')(X - X')^T \mid Q\}$ denotes the mean of the covariance matrices between the same-class high-dimensional feature vectors in $Q$;
$$\Sigma_r = E\{(X - X')(X - X')^T \mid R\}$$ (Formula 3)
In said Formula 3, $E\{(X - X')(X - X')^T \mid R\}$ denotes the mean of the covariance matrices between the different-class high-dimensional feature vectors in $R$;
Forming the $m \times n$ projection matrix $P$ from the m n-dimensional eigenvectors obtained.
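One way to carry out this eigenvector step is sketched below. Expanding Formula 1 gives $\hat{L} = \operatorname{tr}\big(P(\alpha\Sigma_q - \Sigma_r)P^T\big)$, so the sketch assumes that $\Sigma_g$ stands for $\alpha\Sigma_q - \Sigma_r$; the claim text here does not define $\Sigma_g$ explicitly, so that reading, like the helper names `pair_scatter` and `projection_from_pairs`, is an assumption. "Smallest eigenvectors" is taken to mean the eigenvectors with the smallest eigenvalues.

```python
import numpy as np

def pair_scatter(pairs):
    """Sample estimate of E{(X - X')(X - X')^T} over a set of pairs."""
    diffs = np.array([x - y for x, y in pairs])  # shape (num_pairs, n)
    return diffs.T @ diffs / len(diffs)

def projection_from_pairs(Q, R, m, alpha=1.0):
    """Build an m x n matrix P from the m smallest eigenvectors."""
    # ASSUMPTION: Sigma_g = alpha * Sigma_q - Sigma_r (see lead-in).
    sigma_g = alpha * pair_scatter(Q) - pair_scatter(R)
    # eigh handles symmetric matrices and returns eigenvalues in ascending
    # order, with the matching eigenvectors as columns.
    _, eigvecs = np.linalg.eigh(sigma_g)
    return eigvecs[:, :m].T

# Hypothetical toy pairs: two classes separated along the first axis.
a, b, c, d = (np.array(v, dtype=float) for v in [(0, 0), (0, 1), (4, 0), (4, 1)])
Q = [(a, b), (c, d)]                   # same-class pairs
R = [(a, c), (a, d), (b, c), (b, d)]   # different-class pairs
P = projection_from_pairs(Q, R, m=1)
```

For this toy data the single selected eigenvector lies along the first axis (up to sign), the direction that separates the two classes.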
4. The method of claim 2, characterized in that, after said training of the projection matrix $P$ using the multimedia information stored in said database, the method further comprises:
Computing the m-dimensional vector that minimizes $L$ in the following Formula 4, denoted $U = (u_1, u_2, \ldots, u_m)$, and taking it as said threshold vector:
$$L = E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid R\} - \alpha E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid Q\}$$ (Formula 4)
Wherein $E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid Q\}$ denotes the mean of the distances between the sign vectors obtained after the same-class high-dimensional feature vectors in $Q$ are transformed by $P$ and their signs are determined by comparison with said threshold vector; $E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid R\}$ denotes the mean of the distances between the sign vectors obtained after the different-class high-dimensional feature vectors in $R$ are transformed by $P$ and their signs are determined by comparison with said threshold vector.
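Formula 4 can be evaluated on sample pairs as in the following sketch. The names `sign_code`, `mean_sign_product` and `formula4_loss` are hypothetical, and mapping a value of exactly zero to +1 is an assumption the claim does not address. For two ±1 vectors of length m, the inner product equals m minus twice their Hamming distance, so a larger product means the two codes agree in more positions.

```python
import numpy as np

def sign_code(x, P, U):
    """Sign vector sign(PX + U) with entries in {-1, +1}."""
    # ASSUMPTION: an exact zero is mapped to +1.
    return np.where(P @ x + U >= 0, 1, -1)

def mean_sign_product(P, U, pairs):
    """Sample estimate of E{sign(PX+U)^T sign(PX'+U)} over a set of pairs."""
    return float(np.mean([sign_code(x, P, U) @ sign_code(y, P, U)
                          for x, y in pairs]))

def formula4_loss(P, U, Q, R, alpha=1.0):
    """Formula 4: E{.|R} - alpha * E{.|Q}; lower is better."""
    return mean_sign_product(P, U, R) - alpha * mean_sign_product(P, U, Q)

# Hypothetical toy pairs: two classes separated along the first axis.
a, b, c, d = (np.array(v, dtype=float) for v in [(0, 0), (0, 1), (4, 0), (4, 1)])
Q = [(a, b), (c, d)]                   # same-class pairs
R = [(a, c), (a, d), (b, c), (b, d)]   # different-class pairs
P = np.array([[1.0, 0.0]])

loss_split = formula4_loss(P, np.array([-2.0]), Q, R)  # offset between classes
loss_flat = formula4_loss(P, np.array([0.0]), Q, R)    # offset outside both
```

An offset that falls between the two classes gives the lower loss, because same-class codes agree while different-class codes disagree.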
5. The method of claim 2, characterized in that, after said training of the projection matrix $P$ using the multimedia information stored in said database, the method further comprises:
Computing the m-dimensional vector that minimizes $L$ in the following Formula 4, denoted $U = (u_1, u_2, \ldots, u_m)$:
$$L = E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid R\} - \alpha E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid Q\}$$ (Formula 4)
Wherein $E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid Q\}$ denotes the mean of the distances between the sign vectors obtained after the same-class high-dimensional feature vectors in $Q$ are transformed by $P$ and their signs are determined by comparison with said threshold vector; $E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid R\}$ denotes the mean of the distances between the sign vectors obtained after the different-class high-dimensional feature vectors in $R$ are transformed by $P$ and their signs are determined by comparison with said threshold vector;
Thereafter, optimizing $U = (u_1, u_2, \ldots, u_m)$ to obtain said threshold vector:
For each element $u_i$ of said threshold vector $U$, using the following Formula 5 and Formula 6, computing the value of $u_i$ that minimizes $FN(u_i) + \alpha \times FP(u_i)$, as the optimized value of $u_i$;
$$FN(u_i) = \Pr(\min\{z, z'\} \ge u_i \ \text{or}\ \max\{z, z'\} < u_i \mid Q)$$ (Formula 5)
$$FP(u_i) = \Pr(\min\{z, z'\} < u_i \le \max\{z, z'\} \mid R)$$ (Formula 6)
In said Formula 5, $z$ and $z'$ denote the i-th elements of the vectors obtained by transforming, with said projection matrix $P$, the pair of same-class high-dimensional feature vectors $X$ and $X'$ in any set element of $Q$; $\Pr(\min\{z, z'\} \ge u_i \ \text{or}\ \max\{z, z'\} < u_i \mid Q)$ denotes the probability, over the set elements in $Q$, that $u_i$ satisfies the condition $\min\{z, z'\} \ge u_i$ or $\max\{z, z'\} < u_i$;
In said Formula 6, $z$ and $z'$ denote the i-th elements of the vectors obtained by transforming, with said projection matrix $P$, the pair of different-class high-dimensional feature vectors $X$ and $X'$ in any set element of $R$; $\Pr(\min\{z, z'\} < u_i \le \max\{z, z'\} \mid R)$ denotes the probability, over the set elements in $R$, that $u_i$ satisfies the condition $\min\{z, z'\} < u_i \le \max\{z, z'\}$.
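The per-element refinement of claim 5 can be sketched as a one-dimensional search. The sketch below estimates the two probabilities as empirical fractions and evaluates the conditions exactly as Formulas 5 and 6 are written in this section; the names `fn_rate`, `fp_rate` and `refine_threshold`, the choice of the observed projected values as candidate thresholds, and the toy numbers are assumptions. If an implementation uses a different binarization convention, the two predicates would need to be adapted accordingly.

```python
import numpy as np

def fn_rate(u, zq):
    """Formula 5 as written: fraction of Q pairs with min >= u or max < u."""
    lo, hi = zq.min(axis=1), zq.max(axis=1)
    return float(np.mean((lo >= u) | (hi < u)))

def fp_rate(u, zr):
    """Formula 6 as written: fraction of R pairs with min < u <= max."""
    lo, hi = zr.min(axis=1), zr.max(axis=1)
    return float(np.mean((lo < u) & (u <= hi)))

def refine_threshold(zq, zr, alpha=1.0):
    """Pick the candidate u_i minimizing FN(u_i) + alpha * FP(u_i)."""
    # ASSUMPTION: candidates are the observed projected values themselves.
    candidates = np.unique(np.concatenate([zq.ravel(), zr.ravel()]))
    costs = [fn_rate(u, zq) + alpha * fp_rate(u, zr) for u in candidates]
    best = int(np.argmin(costs))
    return float(candidates[best]), float(costs[best])

# Hypothetical i-th projected values (z, z') for pairs in Q and in R.
zq = np.array([[0.0, 2.0], [1.0, 3.0]])
zr = np.array([[4.0, 5.0], [6.0, 7.0]])
u_best, cost = refine_threshold(zq, zr)
```

The search is independent for each of the m elements, so it can be repeated per row of $P$ to obtain the full optimized threshold vector.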
6. The method of claim 4 or 5, characterized in that said computing of the m-dimensional vector that minimizes $L$ specifically comprises:
Computing the value of $u_i$ that minimizes the following Expression 7; wherein i is a natural number from 1 to m;
$$E\{\operatorname{sign}((P_i^T X + u_i)(P_i^T X' + u_i)) \mid R\} - \alpha E\{\operatorname{sign}((P_i^T X + u_i)^T (P_i^T X' + u_i)) \mid Q\}$$ (Expression 7)
Wherein $P_i^T$ is the i-th row vector of said projection matrix $P$; $u_i$ is the i-th element of $U = (u_1, u_2, \ldots, u_m)$;
And forming said m-dimensional vector from the values $u_1$ to $u_m$ obtained.
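Because Expression 7 involves only the i-th row of $P$, each $u_i$ can be found separately. The following sketch uses a simple grid search; the candidate grid, the helper names `expression7` and `fit_thresholds`, and the toy data are assumptions, and the claim does not prescribe a particular search procedure.

```python
import numpy as np

def expression7(u, zq, zr, alpha=1.0):
    """Expression 7 for one dimension; zq and zr hold (z, z') per pair."""
    term_r = np.mean(np.sign((zr[:, 0] + u) * (zr[:, 1] + u)))
    term_q = np.mean(np.sign((zq[:, 0] + u) * (zq[:, 1] + u)))
    return float(term_r - alpha * term_q)

def fit_thresholds(P, Q, R, candidates, alpha=1.0):
    """Grid-search each u_i independently and assemble U = (u_1, ..., u_m)."""
    U = np.empty(P.shape[0])
    for i, p_i in enumerate(P):
        # Project every pair onto the i-th row of P.
        zq = np.array([[p_i @ x, p_i @ y] for x, y in Q])
        zr = np.array([[p_i @ x, p_i @ y] for x, y in R])
        costs = [expression7(u, zq, zr, alpha) for u in candidates]
        U[i] = candidates[int(np.argmin(costs))]
    return U

# Hypothetical toy pairs: two classes separated along the first axis.
a, b, c, d = (np.array(v, dtype=float) for v in [(0, 0), (0, 1), (4, 0), (4, 1)])
Q = [(a, b), (c, d)]                   # same-class pairs
R = [(a, c), (a, d), (b, c), (b, d)]   # different-class pairs
P = np.array([[1.0, 0.0]])

# ASSUMPTION: a small hand-picked grid that avoids exact zeros in z + u.
candidates = np.array([-6.0, -2.0, 1.0])
U = fit_thresholds(P, Q, R, candidates)
```

The selected offset places the sign change between the two classes, which is the value that makes Expression 7 smallest on this grid.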
7. A multimedia information retrieval system based on bit vectors, characterized by comprising:
A bit vector conversion module, configured to extract feature data of current multimedia information to obtain an n-dimensional high-dimensional feature vector of said current multimedia information, denoted $X = (x_1, x_2, \ldots, x_n)$; to transform the high-dimensional feature vector $X = (x_1, x_2, \ldots, x_n)$ by a projection matrix $P$ to obtain an m-dimensional intermediate vector $W = (w_1, w_2, \ldots, w_m)$; and then to compare each element of an m-dimensional threshold vector with the corresponding element of said intermediate vector and binarize said intermediate vector according to the comparison results, obtaining an m-dimensional bit vector of said current multimedia information; wherein m is less than n;
A retrieval module, configured to search a multimedia feature database, according to the bit vector of the current multimedia information obtained by said bit vector conversion module, for bit vectors similar to that bit vector, and to output the multimedia information corresponding to the bit vectors found as the retrieval result;
Wherein said projection matrix $P$ is an $m \times n$ matrix satisfying the following condition: for the high-dimensional feature vectors of the classified multimedia information stored in the database, the difference between the expected inter-vector distance of same-class high-dimensional feature vectors after transformation by $P$ and the expected inter-vector distance of different-class high-dimensional feature vectors after transformation by $P$ is minimized;
Said threshold vector satisfies the following condition: for the high-dimensional feature vectors of the multimedia information stored in said database, the difference between the expected inter-vector distance of same-class high-dimensional feature vectors after transformation by $P$, comparison with said threshold vector and binarization, and the expected inter-vector distance of different-class high-dimensional feature vectors after transformation by $P$, comparison with said threshold vector and binarization, is minimized.
8. The system of claim 7, characterized in that said bit vector conversion module specifically comprises:
A high-dimensional feature vector determining unit, configured to extract feature data of the current multimedia information to obtain the n-dimensional high-dimensional feature vector of said current multimedia information, denoted $X = (x_1, x_2, \ldots, x_n)$;
An intermediate vector computing unit, configured to transform the high-dimensional feature vector $X = (x_1, x_2, \ldots, x_n)$ obtained by said high-dimensional feature vector determining unit by the projection matrix $P$ to obtain the m-dimensional intermediate vector $W = (w_1, w_2, \ldots, w_m)$;
A threshold comparing unit, configured to compare each element of the m-dimensional threshold vector with the corresponding element of the intermediate vector obtained by said intermediate vector computing unit, and to binarize said intermediate vector according to the comparison results, obtaining the m-dimensional bit vector of said current multimedia information; wherein m is less than n.
9. The system of claim 8, characterized by further comprising:
A projection matrix building module, configured to train said projection matrix $P$ using the multimedia information stored in said database: for the multimedia information stored in said database, storing the high-dimensional feature vectors of any pair of same-class multimedia information as one set element in a similar sample set; storing the high-dimensional feature vectors of any pair of different-class multimedia information as one set element in a non-similar sample set; and constructing the projection matrix $P$ that minimizes $\hat{L}$ in the following Formula 1:
$$\hat{L} = \alpha E\{\|PX - PX'\|^2 \mid Q\} - E\{\|PX - PX'\|^2 \mid R\}$$ (Formula 1)
Wherein $Q$ is said similar sample set; $R$ is said non-similar sample set; $E\{\|PX - PX'\|^2 \mid Q\}$ denotes the expected inter-vector distance, after transformation by $P$, of the same-class high-dimensional feature vectors in $Q$; $E\{\|PX - PX'\|^2 \mid R\}$ denotes the expected inter-vector distance, after transformation by $P$, of the different-class high-dimensional feature vectors in $R$; and $\alpha$ is a preset weight;
A first threshold vector determination module, configured to compute the m-dimensional vector that minimizes $L$ in the following Formula 4, denoted $U = (u_1, u_2, \ldots, u_m)$, and to take it as said threshold vector:
$$L = E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid R\} - \alpha E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid Q\}$$ (Formula 4)
Wherein $E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid Q\}$ denotes the mean of the distances between the sign vectors obtained after the same-class high-dimensional feature vectors in $Q$ are transformed by $P$ and their signs are determined by comparison with said threshold vector; $E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid R\}$ denotes the mean of the distances between the sign vectors obtained after the different-class high-dimensional feature vectors in $R$ are transformed by $P$ and their signs are determined by comparison with said threshold vector.
10. The system of claim 9, characterized in that said first threshold vector determination module specifically comprises:
A minimum value calculation unit, configured to compute the value of $u_i$ that minimizes the following Expression 7; wherein i is a natural number from 1 to m;
$$E\{\operatorname{sign}((P_i^T X + u_i)(P_i^T X' + u_i)) \mid R\} - \alpha E\{\operatorname{sign}((P_i^T X + u_i)^T (P_i^T X' + u_i)) \mid Q\}$$ (Expression 7)
Wherein $P_i^T$ is the i-th row vector of said projection matrix $P$; $u_i$ is the i-th element of $U = (u_1, u_2, \ldots, u_m)$;
A vector composition unit, configured to form said m-dimensional vector $U = (u_1, u_2, \ldots, u_m)$ from the values $u_1$ to $u_m$ obtained by said minimum value calculation unit, as said threshold vector.
11. The system of claim 8, characterized by further comprising:
A projection matrix building module, configured to train said projection matrix $P$ using the multimedia information stored in said database: for the multimedia information stored in said database, storing the high-dimensional feature vectors of any pair of same-class multimedia information as one set element in a similar sample set; storing the high-dimensional feature vectors of any pair of different-class multimedia information as one set element in a non-similar sample set; and constructing the projection matrix $P$ that minimizes $\hat{L}$ in the following Formula 1:
$$\hat{L} = \alpha E\{\|PX - PX'\|^2 \mid Q\} - E\{\|PX - PX'\|^2 \mid R\}$$ (Formula 1)
Wherein $Q$ is said similar sample set; $R$ is said non-similar sample set; $E\{\|PX - PX'\|^2 \mid Q\}$ denotes the expected inter-vector distance, after transformation by $P$, of the same-class high-dimensional feature vectors in $Q$; $E\{\|PX - PX'\|^2 \mid R\}$ denotes the expected inter-vector distance, after transformation by $P$, of the different-class high-dimensional feature vectors in $R$; and $\alpha$ is a preset weight;
A second threshold vector determination module, configured to compute the m-dimensional vector that minimizes $L$ in the following Formula 4, denoted $U = (u_1, u_2, \ldots, u_m)$:
$$L = E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid R\} - \alpha E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid Q\}$$ (Formula 4)
Wherein $E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid Q\}$ denotes the mean of the distances between the sign vectors obtained after the same-class high-dimensional feature vectors in $Q$ are transformed by $P$ and their signs are determined by comparison with said threshold vector; $E\{\operatorname{sign}(PX + U)^T \operatorname{sign}(PX' + U) \mid R\}$ denotes the mean of the distances between the sign vectors obtained after the different-class high-dimensional feature vectors in $R$ are transformed by $P$ and their signs are determined by comparison with said threshold vector;
Said second threshold vector determination module optimizes $U = (u_1, u_2, \ldots, u_m)$ to obtain said threshold vector.
12. The system of claim 11, characterized in that said second threshold vector determination module specifically comprises:
A minimum value calculation unit, configured to compute the value of $u_i$ that minimizes the following Expression 7; wherein i is a natural number from 1 to m;
$$E\{\operatorname{sign}((P_i^T X + u_i)(P_i^T X' + u_i)) \mid R\} - \alpha E\{\operatorname{sign}((P_i^T X + u_i)^T (P_i^T X' + u_i)) \mid Q\}$$ (Expression 7)
Wherein $P_i^T$ is the i-th row vector of said projection matrix $P$; $u_i$ is the i-th element of $U = (u_1, u_2, \ldots, u_m)$;
A vector optimization unit, configured to optimize the elements $u_i$ of $U = (u_1, u_2, \ldots, u_m)$: for each element $u_i$ of said threshold vector $U$, using the following Formula 5 and Formula 6, computing the value of $u_i$ that minimizes $FN(u_i) + \alpha \times FP(u_i)$, as the optimized value of $u_i$;
$$FN(u_i) = \Pr(\min\{z, z'\} \ge u_i \ \text{or}\ \max\{z, z'\} < u_i \mid Q)$$ (Formula 5)
$$FP(u_i) = \Pr(\min\{z, z'\} < u_i \le \max\{z, z'\} \mid R)$$ (Formula 6)
In said Formula 5, $z$ and $z'$ denote the i-th elements of the vectors obtained by transforming, with said projection matrix $P$, the pair of same-class high-dimensional feature vectors $X$ and $X'$ in any set element of $Q$; $\Pr(\min\{z, z'\} \ge u_i \ \text{or}\ \max\{z, z'\} < u_i \mid Q)$ denotes the probability, over the set elements in $Q$, that $u_i$ satisfies the condition $\min\{z, z'\} \ge u_i$ or $\max\{z, z'\} < u_i$;
In said Formula 6, $z$ and $z'$ denote the i-th elements of the vectors obtained by transforming, with said projection matrix $P$, the pair of different-class high-dimensional feature vectors $X$ and $X'$ in any set element of $R$; $\Pr(\min\{z, z'\} < u_i \le \max\{z, z'\} \mid R)$ denotes the probability, over the set elements in $R$, that $u_i$ satisfies the condition $\min\{z, z'\} < u_i \le \max\{z, z'\}$;
A vector composition unit, configured to form said threshold vector from the values $u_1$ to $u_m$ optimized by said vector optimization unit.
13. The system of any one of claims 7 to 12, characterized in that said projection matrix building module specifically comprises:
A minimum eigenvector computing unit, configured to compute the m smallest n-dimensional eigenvectors of the matrix $\Sigma_g$; wherein $\Sigma_q$ is as shown in Formula 2 and $\Sigma_r$ is as shown in Formula 3:
$$\Sigma_q = E\{(X - X')(X - X')^T \mid Q\}$$ (Formula 2)
In said Formula 2, $E\{(X - X')(X - X')^T \mid Q\}$ denotes the mean of the covariance matrices between the same-class high-dimensional feature vectors in $Q$;
$$\Sigma_r = E\{(X - X')(X - X')^T \mid R\}$$ (Formula 3)
In said Formula 3, $E\{(X - X')(X - X')^T \mid R\}$ denotes the mean of the covariance matrices between the different-class high-dimensional feature vectors in $R$;
A projection matrix determining unit, configured to form the $m \times n$ projection matrix $P$ from the m n-dimensional eigenvectors obtained.
CN201310359716.6A 2013-08-16 2013-08-16 Multimedia information retrieval method and system based on bit vectors Active CN103440292B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310359716.6A CN103440292B (en) 2013-08-16 2013-08-16 Multimedia information retrieval method and system based on bit vectors

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310359716.6A CN103440292B (en) 2013-08-16 2013-08-16 Multimedia information retrieval method and system based on bit vectors

Publications (2)

Publication Number Publication Date
CN103440292A true CN103440292A (en) 2013-12-11
CN103440292B CN103440292B (en) 2016-12-28

Family

ID=49693984

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310359716.6A Active CN103440292B (en) 2013-08-16 2013-08-16 Multimedia information retrieval method and system based on bit vectors

Country Status (1)

Country Link
CN (1) CN103440292B (en)

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6944319B1 (en) * 1999-09-13 2005-09-13 Microsoft Corporation Pose-invariant face recognition system and process
CN1444753A (en) * 2000-05-26 2003-09-24 萨里大学 Personal identity authentication process and system
JP2006268799A (en) * 2005-03-25 2006-10-05 Kitakyushu Foundation For The Advancement Of Industry Science & Technology Image retrieval method, image retrieval device and program
CN101329724A (en) * 2008-07-29 2008-12-24 上海天冠卫视技术研究所 Optimized human face recognition method and apparatus
KR20100073136A (en) * 2008-12-22 2010-07-01 한국전자통신연구원 Signature clustering method based grouping attack signature by the hashing
CN101546332A (en) * 2009-05-07 2009-09-30 哈尔滨工程大学 Manifold dimension-reducing medical image search method based on quantum genetic optimization
CN101706871A (en) * 2009-11-05 2010-05-12 上海交通大学 Isometric mapping based facial image recognition method
GB2481894A (en) * 2010-07-08 2012-01-11 Honeywell Int Inc Landmark localisation for facial images using classifiers
CN102479320A (en) * 2010-11-25 2012-05-30 康佳集团股份有限公司 Face recognition method and device as well as mobile terminal
CN102831389A (en) * 2012-06-28 2012-12-19 北京工业大学 Facial expression recognition algorithm based on discriminative component analysis
CN102982349A (en) * 2012-11-09 2013-03-20 深圳市捷顺科技实业股份有限公司 Image recognition method and device
CN103198331A (en) * 2013-03-25 2013-07-10 江苏易谱恒科技有限公司 Multiple spectrogram characteristic amalgamation and recognition method based on analysis of PCA-LDA

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
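Zhang Kaige (张凯歌): "Research and Implementation of a Face Recognition System Based on Linear Discriminant Analysis" (基于线性判别分析的人脸识别系统研究与实现), China Excellent Master's Theses Full-text Database (《中国优秀硕士学位论文全文数据库》) *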

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2018527656A (en) * 2015-07-23 2018-09-20 ベイジン ジンドン シャンケ インフォメーション テクノロジー カンパニー リミテッド Method and device for comparing similarity of high-dimensional features of images
CN106815589A (en) * 2015-12-01 2017-06-09 财团法人工业技术研究院 Feature description method and feature descriptor using same
CN105959224A (en) * 2016-06-24 2016-09-21 西安电子科技大学 Bit vector-based high-speed routing lookup apparatus and method
CN105959224B (en) * 2016-06-24 2019-01-15 西安电子科技大学 High speed route lookup device and method based on bit vectors
CN106407311A (en) * 2016-08-30 2017-02-15 北京百度网讯科技有限公司 Method and device for obtaining search result
WO2018040503A1 (en) * 2016-08-30 2018-03-08 北京百度网讯科技有限公司 Method and system for obtaining search results
CN106407311B (en) * 2016-08-30 2020-07-24 北京百度网讯科技有限公司 Method and device for obtaining search result
CN111291204A (en) * 2019-12-10 2020-06-16 河北金融学院 Multimedia data fusion method and device
CN111291204B (en) * 2019-12-10 2023-08-29 河北金融学院 Multimedia data fusion method and device

Also Published As

Publication number Publication date
CN103440292B (en) 2016-12-28

Similar Documents

Publication Publication Date Title
CN109885692B (en) Knowledge data storage method, apparatus, computer device and storage medium
Xie et al. Comparison among dimensionality reduction techniques based on Random Projection for cancer classification
Popat et al. Hierarchical document clustering based on cosine similarity measure
CN107423278B (en) Evaluation element identification method, device and system
CN113656581B (en) Text classification and model training method, device, equipment and storage medium
CN103440292A (en) Method and system for retrieving multimedia information based on bit vector
CN103345496A (en) Multimedia information searching method and system
CN103617157A (en) Text similarity calculation method based on semantics
Yuan et al. Trajectory outlier detection algorithm based on structural features
CN102770857A (en) Relational information expansion device, relational information expansion method and program
CN109871454B (en) Robust discrete supervision cross-media hash retrieval method
CN104616029A (en) Data classification method and device
CN105320764A (en) 3D model retrieval method and 3D model retrieval apparatus based on slow increment features
CN111159332A (en) Text multi-intention identification method based on bert
CN105183792A (en) Distributed fast text classification method based on locality sensitive hashing
CN113901214A (en) Extraction method and device of table information, electronic equipment and storage medium
Baena-García et al. TF-SIDF: Term frequency, sketched inverse document frequency
CN103605653B (en) Big data retrieval method based on sparse hash
CN107133218A (en) Trade name intelligent Matching method, system and computer-readable recording medium
CN113239149B (en) Entity processing method, device, electronic equipment and storage medium
CN112183088B (en) Word level determining method, model building method, device and equipment
CN104978729A (en) Image hashing method based on data sensing
CN104951559A (en) Binary code rearrangement method based on bit weight
CN110265151A (en) A kind of learning method based on isomery temporal data in EHR
CN111724221A (en) Method, system, electronic device and storage medium for determining commodity matching information

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230427

Address after: Room 501-502, 5/F, Sina Headquarters Scientific Research Building, Block N-1 and N-2, Zhongguancun Software Park, Dongbei Wangxi Road, Haidian District, Beijing, 100193

Patentee after: Sina Technology (China) Co.,Ltd.

Address before: 100080, International Building, No. 58 West Fourth Ring Road, Haidian District, Beijing, 20 floor

Patentee before: Sina.com Technology (China) Co.,Ltd.