CN107085607B - Image feature point matching method - Google Patents


Info

Publication number
CN107085607B
CN107085607B (application CN201710258205.3A)
Authority
CN
China
Prior art keywords
vector
clustering
matched
feature vector
feature
Prior art date
Legal status
Active
Application number
CN201710258205.3A
Other languages
Chinese (zh)
Other versions
CN107085607A (en)
Inventor
段翰聪
赵子天
谭春强
文慧
闵革勇
陈超
李博洋
Current Assignee
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201710258205.3A
Publication of CN107085607A
Application granted
Publication of CN107085607B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 Information retrieval of still image data
    • G06F 16/51 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 Information retrieval of still image data
    • G06F 16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/583 Retrieval characterised by using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G06F 18/232 Non-hierarchical techniques
    • G06F 18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F 18/23213 Non-hierarchical techniques with fixed number of clusters, e.g. K-means clustering

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Library & Information Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an image feature point matching method comprising the following steps. Extracting feature points of the warehousing picture: extract features from the picture to be warehoused, form a warehousing feature vector, and reduce its dimensionality. Vector storage: divide the dimension-reduced warehousing feature vector into parts, perform product quantization on each part and then vector quantization to form a product quantizer and a vector quantizer, and build a retrieval tree and hash tables. Extracting feature points of the picture to be matched: extract features from the picture to be matched, form a feature vector to be matched, and reduce its dimensionality. Vector matching: divide the dimension-reduced feature vector to be matched; using the product quantizer and the vector quantizer, find the several cluster centers closest to the feature vector to be matched; find the pictures corresponding to those cluster centers through the retrieval tree and hash tables to form a candidate set; and compute, using floating-point vectors, the picture in the candidate set closest to the feature vector to be matched. The method is fast and highly precise.

Description

Image feature point matching method
Technical Field
The invention relates to the technical field of picture searching, in particular to an image feature point matching method.
Background
In the field of image search, feature matching is a critical link: the matching efficiency and accuracy of the features determine the final search speed and precision. Existing picture search proceeds in the following steps. First, a conversion matrix is trained from a large amount of sample data; vectors are converted to binary codes through a hash function, the binary codes are segmented to generate several hash tables, and each segmented binary code is used directly as a hash table entry. Second, when a query vector arrives, it is converted to a binary code in the same way and mapped to the corresponding hash table entry and the other entries within distance r; all pictures in those entries form the candidate set. Third, a full Hamming distance computation is performed between the query vector and all picture feature vectors in the candidate set, and the results are re-ranked by distance. Converting a floating-point feature vector into a binary code loses precision because of the hash function, and the final re-ranking still uses Hamming distance on binary codes: although this is fast, the recall rate drops to some extent because binary codes represent vectors less precisely than floating-point vectors.
With the rapid development of the internet, the number of pictures online has reached the billion level or even higher. Existing feature point matching methods can no longer keep up with such fast-growing picture libraries. How to search a massive picture library with both high efficiency and high precision has become a hot topic.
Disclosure of Invention
To solve the above technical problems, the present invention provides an image feature point matching method with high matching precision and high speed.
The invention is realized by the following technical scheme:
an image feature point matching method comprises the following steps,
extracting characteristic points of the warehousing picture: extracting the features of the warehousing image, forming a warehousing feature vector, and reducing the dimension of the warehousing feature vector;
vector storage: dividing the dimension-reduced warehouse-in characteristic vector, performing product quantization on each divided part, and then performing vector quantization to form a product quantizer and a vector quantizer, and establishing a retrieval tree and a hash table;
extracting characteristic points of the picture to be matched: extracting the features of the image to be matched, forming a feature vector to be matched, and reducing the dimension of the feature vector;
vector matching: dividing the feature vector to be matched after dimension reduction; using the product quantizer and the vector quantizer, finding the several cluster centers closest to the feature vector to be matched; finding the pictures corresponding to those cluster centers through the retrieval tree and hash tables to form a candidate set; and computing, using floating-point vectors, the picture in the candidate set closest to the feature vector to be matched.
The method of this scheme does not use an iterative quantization algorithm to compute binary codes. It builds the retrieval tree and hash tables by dimension-reduced clustering, and in the first-layer clustering the data are partitioned rather than clustered as a whole, so the work can be accelerated with multi-threaded parallel processing, greatly reducing quantizer training time. Matching and retrieval proceed in two stages: the first stage selects a candidate set, and the second stage computes distances over complete floating-point vectors. Provided the candidate set is sufficiently large, floating-point distance computation keeps the recall rate within 1 percentage point of brute-force matching, and the ordering is more accurate than Hamming distance.
If there are N records in the database, brute-force matching requires N distance computations, whereas with this scheme the candidate set contains N/100 to N/10 records depending on the chosen parameters, greatly reducing computation and improving matching speed. When the retrieval tree is built, the data for first-layer clustering are divided into several parts whose clustering processes are completely independent, so multi-threading can be used to speed up clustering.
Preferably, the vector storage method comprises the following steps:
partitioning the warehousing feature vector into P disjoint sections;
performing k-means clustering inside each part with k1 cluster centers;
for each cluster center, performing vector quantization on all data assigned to that cluster center, with k2 cluster centers;
And respectively recording the IDs of all the characteristics mapped to the corresponding clustering centers or the names of the corresponding pictures by using the P hash tables.
Further, the specific method of vector matching is as follows:
dividing the feature vector to be matched into P disjoint parts;
within each part, computing the distance between the feature vector to be matched and each of the k1 cluster centers, and selecting the W cluster centers with the smallest distance;
for each of the selected W cluster centers, computing distances one by one between the feature vector to be matched and the k2 second-layer cluster centers corresponding to that cluster center, obtaining k2 distances;
sorting the W × k2 distances and taking the m smallest distances, where m is a natural number greater than 1;
taking out the cluster centers corresponding to the m distances, finding the corresponding hash table entries, and forming a candidate set from the picture names or IDs in those entries;
and (4) calculating the distances between the picture characteristic vectors corresponding to the picture IDs in the candidate set and the characteristic vectors to be matched one by adopting floating point vectors, and finally obtaining the target with the minimum distance.
Further, the k-means clustering adopts a parallel processing mode.
Preferably, the dimensionality of the features is reduced using a principal component analysis method.
Compared with the prior art, the invention has the following advantages and beneficial effects:
the invention adopts dimension-reducing clustering to construct a retrieval tree and a hash table, and when the first-level clustering is performed, the retrieval tree and the hash table are segmented, data of each part are completely independent, and the method can be accelerated by adopting a multi-thread parallel processing mode, so that the training time of a quantizer is greatly reduced; in the process of matching retrieval, a candidate set is selected, then the whole floating point vector is used for distance calculation, under the premise that the range of the candidate set is large, the floating point distance calculation is carried out, the difference between the recall rate of a retrieval result and the violence matching is small and does not exceed 1 percentage point, and the retrieval precision is high and efficient.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to examples, and the exemplary embodiments and descriptions thereof are only used for explaining the present invention and are not used as limitations of the present invention.
Example 1
An image feature point matching method comprises the following steps,
extracting characteristic points of the warehousing picture: extracting the features of the warehousing image, forming a warehousing feature vector, and reducing the dimension of the warehousing feature vector;
vector storage: dividing the dimension-reduced warehouse-in characteristic vector, performing product quantization on each divided part, and then performing vector quantization to form a product quantizer and a vector quantizer, and establishing a retrieval tree and a hash table;
extracting characteristic points of the picture to be matched: extracting the features of the image to be matched, forming a feature vector to be matched, and reducing the dimension of the feature vector;
vector matching: dividing the feature vector to be matched after dimension reduction; using the product quantizer and the vector quantizer, finding the several cluster centers closest to the feature vector to be matched; finding the pictures corresponding to those cluster centers through the retrieval tree and hash tables to form a candidate set; and computing, using floating-point vectors, the picture in the candidate set closest to the feature vector to be matched.
With this method, the retrieval process considers not only the exactly-hit hash entry but also several nearby entries, which increases the probability that the picture most similar to the query is in the candidate set and thus improves the recall rate. During re-ranking, distances are computed with the original floating-point feature vectors, which retain all of the original information; quantizing them into binary feature codes would lose some precision.
Example 2
Based on the idea of the above embodiment, the present embodiment refines each step.
For both the pictures stored into the retrieval tree and the pictures to be matched, feature points must be extracted. There are many extraction methods, such as convolutional neural networks. The dimension n of the output feature vector is a relatively large value; it may be 128, 256, 512, and so on. A large feature dimension increases the computation of the matching process, so the feature vector must be reduced in dimension. Principal component analysis can be used to reduce the output feature vector to d dimensions, where d ≤ n; d may be 128 or 64. Dimension reduction not only removes the influence of noise but also reduces the amount and time of computation.
The specific process of reducing the dimension is as follows:
assuming S pieces of data, in the original space, S pieces of n-dimensional eigenvectors can be represented by a matrix M ═ D1,D2,…DnDenotes wherein DnIs the column vector of S x 1. First, the covariance of the matrix M is determined to obtain a matrix var (M), where var (M) is MTM is 1/n, then the prior method is utilized to solve n eigenvalues and corresponding eigenvectors of the covariance matrix Var (M), the largest d eigenvalues and the corresponding vectors are selected as the matrix R for dimension reduction, the MR is calculated to obtain an L matrix for d, and the dimension reduction process is completed.
Extracting the features of the pictures to be warehoused and reducing their dimension prepares for building the retrieval tree during vector storage. The specific method of vector storage is as follows:
the method comprises the steps of dividing a warehousing feature vector into P disjoint parts, for example, dividing data with 128 dimensions of d into 4 parts, taking the 1 st floating point number to the 32 th floating point number of the feature vector as a first part, the 33 th floating point number to the 64 th floating point number as a second part, the 65 th floating point number to the 96 th floating point number as a third part, and the 97 th floating point number to the 128 th floating point number as a fourth part.
K-means clustering with k1 cluster centers is performed inside each part; this is the first-layer quantization. The specific process is illustrated using the first part as an example:
1-1. For the S features of the first part (each 32 floating-point numbers), first select k1 of the S features as cluster centers;
1-2. Compute the Euclidean distance between each of the S feature data and each of the k1 cluster centers, and assign each feature to the cluster center it is closest to;
1-3. For each cluster center, average the corresponding floating-point values of the feature data assigned to it in the previous step, and take the resulting mean vectors as the new k1 cluster centers;
1-4. If the number of clustering iterations is reached or the clustering error falls within a certain range, terminate clustering; the k1 cluster centers obtained in step 1-3 are the result. Otherwise, return to step 1-2;
1-5. Record the values of the k1 cluster centers; they are needed for retrieval.
On the basis of the first-layer quantization, vector quantization is performed for each cluster center on all data assigned to it, with k2 cluster centers; this is the second-layer quantization, which proceeds as follows:
2-1. For each cluster center of the first-layer quantization, let Si be the number of features assigned to it (the Si sum to S). From these Si features, select k2 as cluster centers;
2-2. Compute the Euclidean distance between each of the Si feature data and each of the k2 cluster centers, and assign each feature to the cluster center it is closest to;
2-3. For each cluster center, average the corresponding floating-point values of the feature data assigned to it in the previous step, and take the resulting mean vectors as the new k2 cluster centers;
2-4. If the number of clustering iterations is reached or the clustering error falls within a certain range, terminate clustering; the k2 cluster centers obtained in step 2-3 are the result. Otherwise, return to step 2-2;
2-5. Record the values of the k2 cluster centers; they are needed for retrieval.
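The two-layer quantization above (first-layer k-means over one part, then k-means again inside each resulting cluster) can be sketched as follows. The `kmeans` helper is a plain Lloyd's-iteration sketch, and the sizes (S = 500 features of 32 floats, k1 = k2 = 4) are illustrative, not the patent's parameters:

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means, as in steps 1-1 to 1-5 / 2-1 to 2-5."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]  # pick k features as centers
    for _ in range(iters):
        # assign each feature to its nearest center (Euclidean distance)
        d = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        # recompute each center as the mean of its assigned features
        for i in range(k):
            if (labels == i).any():
                centers[i] = X[labels == i].mean(axis=0)
    return centers, labels

rng = np.random.default_rng(1)
part = rng.standard_normal((500, 32))           # one 32-dim part of S = 500 features
k1, k2 = 4, 4
c1, lab1 = kmeans(part, k1)                     # first-layer quantization (PQ)
second_layer = [kmeans(part[lab1 == i], k2)[0]  # second-layer quantization (VQ)
                for i in range(k1)]
print(len(second_layer), second_layer[0].shape)
```

Because each of the P parts is clustered independently, in practice each call over a part could run in its own thread or process, as the text notes.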
Use P hash tables to respectively record the IDs of all features mapped to each cluster center, or the names of the corresponding pictures; the hash code of each hash table is log2(k1) + log2(k2) bits long. For example:
1. Suppose k1 = 16 and k2 = 16. The hash code is then 4 + 4 = 8 bits: the first 4 bits identify one of the 16 first-layer cluster centers, and the last 4 bits identify one of the 16 second-layer cluster centers under that first-layer center.
2. For each feature data mapped to a second-layer cluster center, add the corresponding picture name or ID to the hash entry of that cluster center's hash code. The hash here is merely an encoding of the cluster centers; the hash table is a data structure for storage.
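The 4 + 4-bit hash code in the example above can be sketched as follows; the helper name `hash_code`, the index values, and the picture name are hypothetical:

```python
import math
from collections import defaultdict

k1, k2 = 16, 16
BITS2 = int(math.log2(k2))           # 4 bits for the second-layer index

def hash_code(i1, i2):
    """First-layer center index i1 in the high bits, second-layer index i2 in the low bits."""
    return (i1 << BITS2) | i2        # log2(k1) + log2(k2) = 8-bit code

table = defaultdict(list)            # one of the P hash tables: code -> picture IDs
table[hash_code(3, 5)].append("img_0042.jpg")

print(hash_code(3, 5))               # 3*16 + 5 = 53
print(table[53])                     # ['img_0042.jpg']
```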
The essence of vector storage is to lay a foundation for vector matching, and after the retrieval tree and the hash table are built, the vector matching can be carried out. If the picture to be matched is input, the vector matching step is started after the characteristic points of the picture to be matched are extracted according to the method, and the specific method for vector matching comprises the following steps:
dividing the feature vector to be matched into P disjoint parts; the dividing method is the same as the dividing method in warehousing.
Within each part, compute the distances between the feature vector to be matched and the k1 cluster centers, sort them, and select the W cluster centers with the smallest distance. Taking P = 4, k1 = 16, k2 = 16, and W = 4, the first part of the query vector is used as an example:
and (3) calculating Euclidean distances between the 16 cluster centers of the first part in storage by using the first part of the vector to be inquired, namely 1-32 floating point vectors and the 16 cluster centers. And sorting the 16 distances, and selecting the smallest W-4 clustering centers.
For each of the selected W cluster centers, compute distances one by one between the feature vector to be matched and the k2 second-layer cluster centers corresponding to that cluster center, obtaining k2 distances. Specifically, compute Euclidean distances between the 32 floating-point values of the query vector part and the 16 second-layer cluster centers within each selected cluster.
Place the W × k2 distances one by one into a max-heap ("large top heap") that retains only the m smallest distances, where m is a natural number greater than 1. Taking the first m distances rather than just one safeguards the recall rate.
Take out the m cluster centers corresponding to the distances in the max-heap and find the corresponding hash table entries; the picture names or IDs in those entries form the candidate set. Each of these m cluster centers was encoded with a hash code at storage time, and each hash code corresponds to a unique hash table entry, so the union of the IDs in the m entries yields the candidate set.
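The max-heap that keeps only the m smallest of the W × k2 distances can be sketched with Python's `heapq`, which provides a min-heap, by negating the distances; the values below are illustrative:

```python
import heapq

def m_smallest(distances, m):
    """Stream distances through a max-heap of size m; the heap root is always
    the largest of the m smallest seen so far, so bigger values are rejected in O(log m)."""
    heap = []  # stores negated distances, so heap[0] is -(current largest)
    for d in distances:
        if len(heap) < m:
            heapq.heappush(heap, -d)
        elif -heap[0] > d:            # d beats the worst of the current m
            heapq.heapreplace(heap, -d)
    return sorted(-x for x in heap)

print(m_smallest([9.0, 1.5, 7.2, 0.3, 4.4, 8.8], 3))  # [0.3, 1.5, 4.4]
```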
Carrying out distance calculation on the picture characteristic vectors corresponding to the picture IDs in the candidate set and the characteristic vectors to be matched one by one, and finally obtaining the target with the minimum distance, wherein the method specifically comprises the following steps:
for each ID in the candidate set, acquiring a complete 128-dimensional floating point feature vector of each ID, and performing distance calculation by using the vector to be queried and the floating point feature vectors one by one;
and (3) selecting the minimum K results in the step (2), wherein the corresponding ID is the most similar picture. When K is 1, the search is accurate, and when K >1, the search is K neighbor.
Example 3
With respect to example 2, a detailed implementation is now disclosed.
The steps of extracting the feature points of the image to be stored and the steps of extracting the feature points of the image to be matched are not described in detail in this embodiment.
Vector storage: training of the product quantizer and the vector quantizer is performed using a large number of feature vectors in the database. The method comprises the following steps:
dividing the D-dimensional warehousing feature vector subjected to dimension reduction into P disjoint parts, taking D as 128 and P as 4 as examples, taking the 1 st to 32 th bits of the feature vector as a first section, the 33 th to 64 th bits as a second section, the 65 th to 96 th bits as a third section, and the 97 th to 128 th bits as a fourth section.
Perform k-means clustering with k1 cluster centers inside each part, giving P × k1 cluster centers in total, all cluster centers being [C1]p = {[c1_i]p, i = 1, 2, …, k1; p = 1, 2, …, P}. Store the cluster centers; since the features were segmented in the previous step, the storage consumption is only (D/P) × k1 per part. This step is called PQ, i.e. product quantization, also referred to as first-layer quantization, and yields the corresponding PQ quantizer. Because the parts are independent, this process can use multi-threading, multi-processing, or even multiple nodes for parallel processing, increasing clustering speed.
On the basis of the first-layer quantization, cluster again all data assigned to each cluster center [c1_j]p, generating k2 sub-centers, for P × k1 × k2 cluster centers in total, all second-layer centers being [C2]p = {[c2_ij]p, i = 1, 2, …, k2; j = 1, 2, …, k1; p = 1, 2, …, P}. Store these cluster centers. This step is called VQ, i.e. vector quantization, also referred to as second-layer quantization, and yields the corresponding VQ quantizer.
Establish P hash tables corresponding to the P vector sets separated in the first step. Each hash table has k1 × k2 entries, and the corresponding hash code has length log2(k1) + log2(k2) bits.
For the k1 × k2 cluster centers in each part, encode a hash code, and store the IDs of the feature vectors in the sample data mapped to each cluster center (or the names of the corresponding pictures) in the hash table entry corresponding to that cluster center, obtaining the inverted index based on multiple hash tables.
After the four steps above are completed, the PQ quantizer, the VQ quantizer, the retrieval tree, and the multi-hash-table inverted index built from a large amount of sample data are obtained.
After the search tree and the hash table are established, if the picture to be matched is input, vector matching search is carried out.
When a feature vector y to be matched is given, a vector which is most adjacent to the feature vector y is retrieved from the retrieval tree and the inverted index, and the steps are as follows:
dividing the vector y into intersecting P portions, y ═ y1,y2,y3,…,yp]。
For yp, compute the distances to the k1 cluster centers of the first-layer quantization of the p-th part. Define dist(yp, [c1_i]p) = ||yp − [c1_i]p||2 as the distance between the p-th part of the query vector y and the i-th cluster center of the p-th part of the sample data.
Because of the uncertainty of the clustering process, the feature in sample space closest to y may well belong to another nearby cluster center, so cluster centers around the one nearest to yp are also considered. Sort the distances between yp and the cluster centers determined in the previous step, and select the w cluster centers with the smallest distances as the range of the next query.
For each of the w first-layer cluster centers with the smallest distances, there are k2 second-layer cluster centers beneath it. Define dist(yp, [c2_ij]p) = ||yp − [c2_ij]p||2 as the distance between the p-th part of the query vector y and the j-th second-layer cluster center under the i-th first-layer cluster center in sample space.
Sort the w × k2 distances obtained in the previous step. Because these are only distances for the p-th part of sample space, taking just the nearest one is not enough: take the cluster centers corresponding to the m smallest distances, find the corresponding hash table entries by those cluster centers, and take the union of the picture IDs or names stored in the m entries, finally obtaining the candidate set of most-similar picture IDs.
Take out the feature vectors corresponding to the picture IDs and compute their distances to the complete query vector y one by one; a min-heap ("small top heap") data structure works well regardless of the final data set size and occupies little memory. The final result is the topK, the K items ranked first by distance; when K = 1, the top result is the most similar picture.
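The final re-ranking, taking the topK candidates by full floating-point distance with a min-heap, can be sketched like this; the picture IDs, vectors, and candidate-set size are placeholders:

```python
import heapq
import numpy as np

rng = np.random.default_rng(3)
query = rng.standard_normal(128)                       # the complete query vector y
candidates = {f"img_{i}": rng.standard_normal(128)     # candidate-set ID -> stored
              for i in range(50)}                      # 128-dim float feature vector

# distance of each candidate's full floating-point vector to y
scored = ((float(np.linalg.norm(vec - query)), pid)
          for pid, vec in candidates.items())

K = 5
topK = heapq.nsmallest(K, scored)                      # K closest (distance, ID) pairs
print(len(topK), topK[0][0] <= topK[-1][0])            # 5 True
```

`heapq.nsmallest` keeps only K items in memory at a time, matching the text's point that the heap's footprint is independent of the final data set size.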
This embodiment uses product quantization, vector quantization, and multi-hash indexing to solve the nearest-neighbor search problem, parallelizes the clustering process, and improves the retrieval recall rate through the two-stage division of the retrieval process.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (4)

1. An image feature point matching method is characterized by comprising the following steps,
extracting characteristic points of the warehousing picture: extracting the features of the warehousing image, forming a warehousing feature vector, and reducing the dimension of the warehousing feature vector;
vector storage: dividing the dimension-reduced warehouse-in characteristic vector, performing product quantization on each divided part, and then performing vector quantization to form a product quantizer and a vector quantizer, and establishing a retrieval tree and a hash table;
extracting characteristic points of the picture to be matched: extracting the features of the image to be matched, forming a feature vector to be matched, and reducing the dimension of the feature vector;
vector matching: dividing the feature vector to be matched after dimension reduction; using the product quantizer and the vector quantizer, finding the several cluster centers closest to the feature vector to be matched; finding the pictures corresponding to those cluster centers through the retrieval tree and hash tables to form a candidate set; and computing, using floating-point vectors, the picture in the candidate set closest to the feature vector to be matched;
the specific method for vector storage comprises the following steps:
a step of dividing the warehousing feature vector: dividing the dimension-reduced warehousing feature vector into disjoint P parts;
a product quantization step: within each part, performing k-means clustering with k1 clustering centers, obtaining P × k1 clustering centers in total, and storing all P × k1 clustering centers to form a product quantizer;
a vector quantization step: for each clustering center obtained in the product quantization step, performing k-means clustering again on all data assigned to that clustering center, with k2 clustering centers, to obtain P × k1 × k2 second-layer clustering centers; storing the P × k1 × k2 second-layer clustering centers to form a vector quantizer;
establishing a retrieval tree and hash tables: using P hash tables to record, for each clustering center, the IDs of all features mapped to it or the names of the corresponding pictures;
the specific method for vector matching is as follows:
dividing the feature vector to be matched into P disjoint parts;
within each part, computing the distances between the feature vector to be matched and the k1 clustering centers obtained in the product quantization step, and selecting the W clustering centers with the minimum distance;
for each of the selected W clustering centers, computing one by one the distances between the feature vector to be matched and the k2 second-layer clustering centers corresponding to that clustering center, obtaining k2 distances;
sorting the W × k2 distances and taking the m shortest distances, where m is a natural number greater than 1;
taking out the clustering centers corresponding to the m distances, finding the corresponding hash table entries, and forming a candidate set from the picture names or IDs in those entries;
computing one by one, with floating-point vectors, the distances between the picture feature vectors corresponding to the picture IDs in the candidate set and the feature vector to be matched, and finally obtaining the target with the minimum distance.
2. The image feature point matching method according to claim 1, characterized in that: the k-means clustering adopts a parallel processing mode.
3. The image feature point matching method according to claim 1, characterized in that: the dimension reduction of the features adopts principal component analysis.
4. The image feature point matching method according to claim 1, characterized in that: the dimension reduction comprises the following specific steps:
forming a matrix M from L pieces of n-dimensional feature vector data, where M = {D1, D2, …, Dn}, n is the dimension of the feature vector, and L is a natural number greater than 1; computing the covariance of the matrix M to obtain the matrix Var(M);
solving the n eigenvalues and corresponding eigenvectors of the covariance matrix Var(M), and selecting the d largest eigenvalues and their corresponding eigenvectors to form a dimension-reduction matrix R;
computing M × R to obtain an L × d matrix, thereby achieving the dimension reduction.
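The dimension-reduction steps of claim 4 can be sketched as follows; `pca_reduce` is a hypothetical name, and the rows of M are centered before taking the covariance (a standard PCA detail the claim leaves implicit).

```python
import numpy as np

def pca_reduce(M, d):
    """Claim-4 style dimension reduction: top-d covariance eigenvectors form R.
    M is L x n; returns the L x d projection M_c @ R."""
    M = np.asarray(M, float)
    Mc = M - M.mean(axis=0)            # center rows (implicit in Var(M))
    cov = np.cov(Mc, rowvar=False)     # n x n covariance matrix Var(M)
    w, V = np.linalg.eigh(cov)         # eigh returns ascending eigenvalues
    R = V[:, np.argsort(w)[::-1][:d]]  # n x d: eigenvectors of the d largest eigenvalues
    return Mc @ R                      # L x d reduced matrix
```

The columns of the output are ordered by decreasing variance, so the first retained dimension captures the most spread in the data.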
CN201710258205.3A 2017-04-19 2017-04-19 Image feature point matching method Active CN107085607B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710258205.3A CN107085607B (en) 2017-04-19 2017-04-19 Image feature point matching method


Publications (2)

Publication Number Publication Date
CN107085607A CN107085607A (en) 2017-08-22
CN107085607B true CN107085607B (en) 2020-06-30

Family

ID=59611717

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710258205.3A Active CN107085607B (en) 2017-04-19 2017-04-19 Image feature point matching method

Country Status (1)

Country Link
CN (1) CN107085607B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019905B (en) * 2017-10-13 2022-02-01 北京京东尚科信息技术有限公司 Information output method and device
CN110019907B (en) * 2017-12-01 2021-07-16 北京搜狗科技发展有限公司 Image retrieval method and device
CN107944046B (en) * 2017-12-15 2019-02-05 清华大学 Extensive high dimensional data method for quickly retrieving and system
CN110019096A (en) * 2017-12-29 2019-07-16 上海全土豆文化传播有限公司 The generation method and device of index file
CN110019875A (en) * 2017-12-29 2019-07-16 上海全土豆文化传播有限公司 The generation method and device of index file
CN108763481B (en) * 2018-05-29 2020-09-01 清华大学深圳研究生院 Picture geographical positioning method and system based on large-scale street view data
CN110889424B (en) * 2018-09-11 2023-06-30 阿里巴巴集团控股有限公司 Vector index establishing method and device and vector retrieving method and device
CN110175546B (en) * 2019-05-15 2022-02-25 深圳市商汤科技有限公司 Image processing method and device, electronic equipment and storage medium
CN110795582A (en) * 2019-10-31 2020-02-14 京东方科技集团股份有限公司 Image recommendation method, system, terminal device and server
CN111324760B (en) * 2020-02-19 2023-09-26 创优数字科技(广东)有限公司 Image retrieval method and device
CN112418298B (en) * 2020-11-19 2021-12-03 北京云从科技有限公司 Data retrieval method, device and computer readable storage medium
CN112988747A (en) * 2021-03-12 2021-06-18 山东英信计算机技术有限公司 Data retrieval method and system
CN117392415A (en) * 2023-10-12 2024-01-12 南京邮电大学 Image quick matching method based on deep learning

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102842032A (en) * 2012-07-18 2012-12-26 郑州金惠计算机系统工程有限公司 Method for recognizing pornography images on mobile Internet based on multi-mode combinational strategy
US9436758B1 (en) * 2011-12-27 2016-09-06 Google Inc. Methods and systems for partitioning documents having customer feedback and support content


Also Published As

Publication number Publication date
CN107085607A (en) 2017-08-22

Similar Documents

Publication Publication Date Title
CN107085607B (en) Image feature point matching method
Yu et al. Spatial pyramid-enhanced NetVLAD with weighted triplet loss for place recognition
CN111198959B (en) Two-stage image retrieval method based on convolutional neural network
CN108920720B (en) Large-scale image retrieval method based on depth hash and GPU acceleration
Jegou et al. Product quantization for nearest neighbor search
CN105912611B (en) A kind of fast image retrieval method based on CNN
Norouzi et al. Fast exact search in hamming space with multi-index hashing
CN103336795B (en) Video index method based on multiple features
CN113918753B (en) Image retrieval method based on artificial intelligence and related equipment
WO2013129580A1 (en) Approximate nearest neighbor search device, approximate nearest neighbor search method, and program
Wei et al. Projected residual vector quantization for ANN search
CN112948601B (en) Cross-modal hash retrieval method based on controlled semantic embedding
JP7006966B2 (en) Coding method based on mixed vector quantization and nearest neighbor search (NNS) method using this
CN108491430A (en) It is a kind of based on the unsupervised Hash search method clustered to characteristic direction
CN111177435A (en) CBIR method based on improved PQ algorithm
CN107527058B (en) Image retrieval method based on weighted local feature aggregation descriptor
CN115129949A (en) Vector range retrieval method, device, equipment, medium and program product
Cao et al. Scalable distributed hashing for approximate nearest neighbor search
Heo et al. Shortlist selection with residual-aware distance estimator for k-nearest neighbor search
CN108536772B (en) Image retrieval method based on multi-feature fusion and diffusion process reordering
Romberg et al. Robust feature bundling
CN112418298B (en) Data retrieval method, device and computer readable storage medium
Jégou et al. Searching with quantization: approximate nearest neighbor search using short codes and distance estimators
Chiu et al. Effective product quantization-based indexing for nearest neighbor search
Yuan et al. A novel index structure for large scale image descriptor search

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant