CN111309955B

CN111309955B - Fusion method for image retrieval

Info

Publication number: CN111309955B
Application number: CN202010149889.5A
Authority: CN
Inventors: 孙晓明; 张宁; 车畅; 刘野; 吴海滨
Original assignee: Harbin University of Science and Technology
Current assignee: Harbin University of Science and Technology
Priority date: 2017-02-13
Filing date: 2017-02-13
Publication date: 2022-06-24
Anticipated expiration: 2037-02-13
Also published as: CN111368126A; CN111368126B; CN111309956A; CN106844733B; CN111309956B; CN111368125A; CN111309955A; CN111368125B; CN106844733A

Abstract

The invention relates to an image retrieval-oriented fusion method, which comprises the steps of fusing SIFT descriptor kernel density and SIFT descriptor histogram and comprises the following steps: firstly, obtaining a basic probability distribution function of an SIFT descriptor histogram and SIFT descriptor kernel density, and then, applying a Dempster combination rule to combine the basic probability distribution function to obtain a fusion result; the fusion method is applied to the image retrieval method based on combination of vocabulary tree information fusion and Hausdorff distance, can improve the image retrieval accuracy and provide a theoretical basis for image retrieval with complex background.

Description

Fusion method for image retrieval

The application is a divisional application of the invention patent application, namely 'image retrieval method based on combination of vocabulary tree information fusion and Hausdorff distance'.

Application date of the original case: 2017-02-13.

Original application No.: 2017100760427.

the name of the original invention is: an image retrieval method based on combination of vocabulary tree information fusion and Hausdorff distance.

Technical Field

The invention discloses an image retrieval-oriented fusion method, belongs to the technical field of image retrieval, and particularly relates to a key step in an image retrieval method based on combination of vocabulary tree information fusion and Hausdorff distance.

Background

Image retrieval methods have so far formed three important branches: text-based image retrieval, content-based image retrieval, and semantic-based image retrieval.

The image retrieval based on the text is to describe the requirements of users by using texts such as image names, image characteristics and the like, but because the text expression capability is limited and text labels have ambiguity, the retrieval result is often inconsistent with the requirements of the users;

the semantic-based image retrieval further refines the high-level semantic expression capability of the image on the basis of the visual characteristics of the image, but the retrieval process of the retrieval method is complex, and the problem that the method system is not developed completely exists;

the image retrieval based on the content is carried out by taking color, texture, shape and the like as the characteristic expression of the image and taking the characteristic expression as the basis for judging the similarity.

If the image features can be extracted accurately, content-based image retrieval would have accuracy advantages not available with the other two types of retrieval. The technical advantages are also aimed at by broad scholars, and researches on how to improve the accuracy of image feature extraction are carried out, so that the accuracy of content-based image retrieval is expected to be further improved.

Disclosure of Invention

In order to meet the technical requirements, the invention discloses an image retrieval method based on combination of vocabulary tree information fusion and Hausdorff distance, which can effectively improve the accuracy of content-based image retrieval.

The purpose of the invention is realized as follows:

the image retrieval method based on combination of vocabulary tree information fusion and Hausdorff distance comprises the following steps:

step a, extracting an image to be retrieved and SIFT characteristics of an image library;

b, generating an SIFT descriptor histogram and SIFT descriptor kernel density;

step c, fusing SIFT descriptor kernel density and SIFT descriptor histogram;

step d, improving the traditional Hausdorff distance measurement;

and e, using the improved Hausdorff distance for image matching.

The image retrieval method based on combination of vocabulary tree information fusion and Hausdorff distance comprises the following specific steps of the step a:

step a 1: constructing Gaussian difference scale function of image to be retrieved and image library

Carrying out convolution operation on Gaussian functions with different scales and the image to construct a Gaussian difference scale function D (x, y, sigma) of the two-dimensional image, wherein the method comprises the following steps:

D(x,y,σ)＝(G(x,y,kσ)-G(x,y,σ))*I(x,y)

where k is the scale scaling factor, G (x, y, σ) is a gaussian function of variable scale, I (x, y) is the image, and there are:

wherein, (x, y) is a scale coordinate, and the size of σ determines the degree of image smoothness;

step a 2: detecting extreme point in Gaussian difference scale space

Comparing each sampling point in the image with adjacent points of the sampling point, and when a certain sampling point is the maximum value or the minimum value in all points of the Gaussian difference scale space, considering the certain sampling point as a characteristic point of the image under the scale;

step a 3: removing the feature points with unstable edges to generate SIFT descriptors

And removing unstable characteristic points of the edge by using a Harris Corner detector, and keeping stable characteristic points to generate the SIFT descriptor.

The image retrieval method based on the combination of the vocabulary tree information fusion and the Hausdorff distance comprises the following specific steps of the step b:

step b 1: construction of an extensible lexical tree by hierarchical clustering of SIFT descriptors

Extracting SIFT descriptors of each picture to obtain a set F ═ F_iThen, the set F is subjected to hierarchical clustering by adopting a K-Means clustering method, and initially, the set F is subjected to K-Means clustering at the 1 st layer, and is divided into K parts of { F }_iI is more than or equal to 1 and less than or equal to k; repeating the operation until the depth reaches the preset L value to construct an expandable vocabulary tree, wherein c is B in total^LA node, wherein B is a branching factor, L is a depth, c is a total number of nodes, f_iRepresenting a certain SIFT descriptor in a picture, F being a set of descriptors, F_iIs a certain cluster set obtained by carrying out K-Means clustering on the set F;

step b 2: accumulating the times of the occurrence of the descriptors on each node in the extensible vocabulary tree to obtain an SIFT descriptor histogram

In constructing an expandable lexical tree, c is shared as B^LEach node accumulates the occurrence frequency of SIFT descriptors on the first node to obtain SIFT descriptor histogram based on the expandable vocabulary tree, and H is used as [ H ]₁,...,h_i,...,h_c]Is represented by the formula (I) in which h_iRepresenting the times of SIFT descriptors appearing on the ith node;

step b 3: quantizing the SIFT descriptor to obtain SIFT descriptor kernel density

Quantizing all SIFT descriptors, then each SIFT descriptor f_iAll correspond to a slave in the extensible vocabulary treeQuantization path from root node to leaf node, i.e. corresponding to a set of visual words

Each group of visual words corresponds to the kernel density f (c) thereof, and the SIFT descriptor kernel density based on the extensible vocabulary tree is obtained; wherein

Is a visual word, i.e. each node in the expandable vocabulary tree represents a visual word, l represents the number of layers of the node in the expandable vocabulary tree, h_lIndicating the index of the node in the level tree node, L being the depth.

The image retrieval method based on the combination of the vocabulary tree information fusion and the Hausdorff distance comprises the following specific steps of step c:

step c 1: obtaining a basic probability distribution function of SIFT descriptor histogram and SIFT descriptor kernel density

For computational convenience, if the SIFT descriptor histogram is set as a, and the SIFT descriptor kernel density is set as B, then the box Ω: { A, B }, the decision box is a set describing all elements constituting the whole hypothesis space, and is represented by m () with all possible results considered by the basic probability distribution function; at this time, the process of the present invention,

the basic probability distribution function of subset A is

The basic probability distribution function of the subset B is

Wherein, M is a normalization constant,

m₁(A_i) Denotes that the focal length is A_iBasic probability assignment of (c), m₂(B_j) Denotes that the focal length is B_jAssigning a basic probability;

step c 2: the fusion result is obtained by applying Dempster combination rule and combining the step c1

The Dempster combination rule is:

substituting the results m (A) and m (B) obtained in the step c1 into m (AB);

where M is a normalization constant, and M ═ Sigma_A∩B＝φ(m(A)m(B))＝1-∑_A∩B≠φ(m(A)m(B))

m (a) represents the basic probability distribution function of subset a, m (B) represents the basic probability distribution function of subset B, and m (ab) represents the fused basic probability distribution function of subset a and subset B.

The image retrieval method based on the combination of the vocabulary tree information fusion and the Hausdorff distance comprises the following specific steps of the step d:

step d 1: form of differential equation for writing cost function

The differential equation form of the cost function is as follows:

step d 2: obtaining a general solution to a cost function

Solving the differential equation to obtain the cost function with the following expression:

wherein gamma is₀The cost function is an initial value of the cost function, the range of the cost function is 0-1, k is a proportionality coefficient, and tau is a matching parameter;

step d 3: improved Hausdorff distance using traditional Hausdorff distance as variable of cost function

Given two finite sets X ═ X₁,x₂,...,x_MY ═ Y₁,y₂,...,y_NThe conventional Hausdorff distance between X and Y is defined as

Where d (X, Y) is the conventional Hausdorff distance, min represents the minimum, max represents the maximum, X and Y are the points in the point sets X and Y, respectively, and d (X, Y) represents the geometric distance between point X and point Y;

the improved Hausdorff distance is:

where | X | is the number of the finite set X, d_H(X, Y) is the modified Hausdorff distance, d (X, Y) is the conventional Hausdorff distance, and γ (d (X, Y)) is a cost function with a variable d (X, Y).

The image retrieval method based on combination of vocabulary tree information fusion and Hausdorff distance comprises the following specific steps of step e:

and c, according to the fusion characteristics obtained in the step c, carrying out image similarity measurement by using an improved Hausdorff distance, and arranging the obtained similarities according to a descending order to obtain a retrieval result.

Has the advantages that:

the method adopts the following technical means that firstly, SIFT features of an image to be retrieved and an image library are extracted, then an SIFT descriptor histogram and SIFT descriptor kernel density are generated, then the SIFT descriptor kernel density and the SIFT descriptor histogram are fused, the traditional Hausdorff distance measurement is improved, and finally the improved Hausdorff distance is used for image matching; the technical means are interdependent and are all absent, and as a whole, the technical purpose that the content-based image retrieval accuracy rate cannot be effectively improved when any one means exists is achieved.

Drawings

FIG. 1 is a flow chart of the image retrieval method based on the combination of lexical tree information fusion and Housdov distance.

FIG. 2 is a graph comparing precision ratios of three methods.

Fig. 3 is a "banyan" image to be retrieved.

Fig. 4 is a "banyan" search result based on the method of the present invention.

Fig. 5 is a "banyan" search result based on the SIFT descriptor histogram method.

Fig. 6 is a "banyan" search result based on the SIFT descriptor kernel density method.

FIG. 7 is a "tiger" image to be retrieved.

FIG. 8 is the "tiger" search result based on the method of the present invention.

Fig. 9 is a "tiger" search result based on the SIFT descriptor histogram method.

Fig. 10 is a "tiger" search result based on the SIFT descriptor kernel density method.

Detailed Description

The following describes embodiments of the present invention in further detail with reference to the accompanying drawings.

Detailed description of the preferred embodiment

The present embodiment is a theoretical embodiment of an image retrieval method based on the combination of lexical tree information fusion and hausdorff distance.

The image retrieval method based on the combination of vocabulary tree information fusion and Hausdorff distance in the embodiment has a flow chart as shown in FIG. 1, and comprises the following steps:

b, generating an SIFT descriptor histogram and SIFT descriptor kernel density;

step c, fusing SIFT descriptor kernel density and SIFT descriptor histogram;

step d, improving the traditional Hausdorff distance measurement;

and e, using the improved Hausdorff distance for image matching.

The method comprises the following steps of performing convolution operation on Gaussian functions with different scales and an image to construct a Gaussian difference scale function D (x, y, sigma) of a two-dimensional image, wherein the method comprises the following steps:

D(x,y,σ)＝(G(x,y,kσ)-G(x,y,σ))*I(x,y)

step a 2: detecting extreme point in Gaussian difference scale space

Extracting SIFT descriptor of each picture to obtain a set F ═ F_iThen, the set F is subjected to hierarchical clustering by adopting a K-Means clustering method, and initially, the set F is subjected to K-Means clustering at the 1 st layer, and is divided into K parts of { F }_iI is more than or equal to 1 and less than or equal to k; by analogy, the newly generated cluster is subdivided into K clusters by using K-Means, the operations are continuously repeated until the depth reaches the preset L value,constructing an expandable vocabulary tree with c ═ B^LA node, wherein B is a branching factor, L is a depth, c is a total number of nodes, f_iRepresenting a certain SIFT descriptor in a picture, F being a set of descriptors, F_iIs a certain cluster set obtained by carrying out K-Means clustering on the set F;

Quantizing all SIFT descriptors, then each SIFT descriptor f_iAll corresponding to a quantization path from a root node to a leaf node in the expandable vocabulary tree, i.e. corresponding to a group of visual words

step c 1: obtaining a SIFT descriptor histogram and a basic probability distribution function of SIFT descriptor kernel density

the basic probability distribution function of subset A is

The basic probability distribution function of the subset B is

Wherein, M is a normalization constant,

m₁(A_i) Denotes that the focal length is A_iBasic probability assignment of (1), m₂(B_j) Denotes that the focal length is B_jAssigning a basic probability;

The Dempster combination rule is:

substituting the results m (A) and m (B) obtained in the step c1 into m (AB);

step d 1: form of differential equation for writing cost function

The differential equation form of the cost function is as follows:

step d 2: obtaining a general solution to a cost function

the improved Hausdorff distance is:

Detailed description of the invention

In view of the fact that most of the technicians in the field are scholars, the writing of technical documents is more customary to the writing of articles, and therefore, on the basis of no essential difference from the specific embodiment, the second specific embodiment is supplemented according to the scholars.

The image retrieval method based on combination of vocabulary tree information fusion and Hausdorff distance in the embodiment comprises the following steps:

step a: SIFT feature extraction (SIFT: scale invariant feature transform) for image to be retrieved and image library

Step a 1: constructing to-be-retrieved image and image library Gaussian difference scale function

During the extraction of SIFT descriptors, firstly constructing a Gaussian difference scale space, wherein the scale space of a two-dimensional image is

Where G (x, y, σ) is a gaussian function with variable scale, (x, y) is the scale coordinate, I (x, y) is the image, L (x, y, σ) is the scale space of the two-dimensional image, and the size of σ determines the degree of smoothing of the image.

For more accurate detection of image feature points, it is necessary to construct a gaussian difference scale function of a two-dimensional image, which is generated by convolving the image with gaussian functions of different scales, i.e., D (x, y, σ) — (G (x, y, k σ) -G (x, y, σ)) × I (x, y) — L (x, y, k σ) -L (x, y, σ), where D (x, y, σ) is the gaussian difference scale function of the two-dimensional image and k is a scale scaling factor

Step a 2: detecting extreme point in Gaussian difference scale space

To find the extreme point in the scale space requires that each sample point in the image is compared with its neighboring points, and when a sample point is the maximum or minimum among all points in the DoG (difference of gaussians) space, the point is considered as a feature point of the image under the scale.

In order to enhance the matching stable point and improve the noise capability, a Harris Commer detector is used for removing unstable characteristic points of the edge. And keeping stable feature points to generate SIFT descriptors.

Step b: generating SIFT descriptor histogram and SIFT descriptor kernel density

Step b 1: construction of an extensible lexical tree SVT by hierarchical clustering of SIFT descriptors

Extracting SIFT descriptors of each picture to obtain a set F ═ F_iAnd then, performing hierarchical clustering on the set F by adopting a K-Means clustering method. Initially, K-Means clustering is performed on the set F at layer 1, and the set F is divided into K parts of { F_iI is more than or equal to 1 and less than or equal to k. Similarly, the newly generated cluster is subdivided into K clusters by using K-Means, and the above operations are repeated until the depth reaches the preset L value, so that the cluster is not split, and an expandable vocabulary tree is constructed, wherein c is equal to B in total^LAnd each node is formed. Where B is the branching factor, L is the depth, c is the total number of nodes, f_iRepresenting a certain SIFT descriptor in a picture, F being a set of descriptors, F_iIs a certain cluster set obtained by performing K-Means clustering on the set F.

In constructing an expandable lexical tree, c is shared as B^LEach node accumulates the occurrence frequency of SIFT descriptors on the first node to obtain H [ H ] for SIFT descriptor histogram based on the expandable vocabulary tree₁,...,h_i,...,h_c]Is represented by the formula (I) in which h_iThe number of times of SIFT descriptors appearing at the ith node is represented, B is a branching factor, L is depth, and c is the total number of nodes.

All the SIFT descriptors are quantized,each SIFT descriptor f_iAll corresponding to a quantization path from a root node to a leaf node in the expandable vocabulary tree, i.e. corresponding to a group of visual words

Each group of visual words corresponds to the kernel density f (c), and the SIFT descriptor kernel density based on the extensible vocabulary tree is obtained. Wherein

Step c: fusing SIFT descriptor kernel density and SIFT descriptor histogram

For the following computational convenience, the frame Ω is identified by setting the SIFT descriptor histogram to a and the SIFT descriptor kernel density to B: { A, B }, the discrimination box is a set of all elements describing the overall hypothetical space. All possible outcomes are considered with the basic probability distribution function (BPA), often denoted m ().

The basic probability distribution function of subset A is

The basic probability distribution function of the subset B is

Wherein, M is a normalization constant,

The Dempster combination rule is:

substituting the results m (A) and m (B) obtained in step c1 into m (AB).

Step d: improving the conventional Hausdorff distance metric

In order to improve the reliability and stability of the matching process, the invention improves the traditional Hausdorff distance measurement, namely, the traditional Hausdorff distance is used as a variable of a cost function as the improved Hausdorff distance.

Step d 1: differential equation form with cost function written out

The differential equation form of the cost function is as follows:

step d 2: obtaining a general solution to a cost function

wherein gamma is₀The cost function is an initial value of the cost function, the range of the cost function is 0-1, k is a proportionality coefficient, and tau is a matching parameter.

Where d (X, Y) is the conventional Hausdorff distance, min represents the minimum, max represents the maximum, X and Y are the points in the point sets X and Y, respectively, and d (X, Y) represents the geometric distance between point X and point Y

The improved Hausdorff distance is:

where | X | is the number of the finite set X, d_H(X, Y) is the modified Hausdorff distance, d (X, Y) is the conventional Hausdorff distance, and γ (d (X, Y)) is a cost function with a variable d (X, Y)

Step e: using improved Hausdorff distance for image matching

And c, according to the fusion characteristics obtained in the step c, carrying out image similarity measurement on the characteristics by using an improved Hausdorff distance, and arranging the obtained similarities according to a descending order to obtain a retrieval result.

Detailed description of the preferred embodiment

The present embodiment is an experimental embodiment of an image retrieval method based on combination of lexical tree information fusion and hausdov distance.

Fig. 2 shows precision rates of image retrieval based on the SIFT descriptor histogram, image retrieval based on the SIFT descriptor kernel density, and image retrieval based on the present invention.

As can be seen from fig. 2, the first four cloud, star, bird and tree in the image category are simple pictures with backgrounds, and the precision ratios of the three retrieved images are not very different; the last four items in the image category are pictures with tiger, fish, mountain and flower as complex backgrounds, the precision ratio of the three retrieval methods is greatly different, and the retrieval of the method is far larger than that of the first two.

The experimental results for the two image types are given below

In the experiment, a small self-built image database is used, wherein the database contains 8 types of images, namely flowers, birds, fish, tigers, mountains, trees, stars and clouds, the total number of the images is 800 in total, and each type of image is 100.

Experiment one: background clearness experiment of image to be retrieved

The method comprises the steps of taking a banyan image with a simple background as an image to be retrieved, randomly extracting 5 images from all banyans as query images, and finally selecting an average value of the precision ratios of the 5 images as a final result. Precision is defined as follows: the precision ratio (number of images related to the key map in the query result/number of images returned by the query) is 100%.

A banyan image with a simple background is given as an image to be retrieved, as shown in fig. 3; the retrieval result of the method of the invention is shown in fig. 4, the retrieval result based on the SIFT descriptor histogram method is shown in fig. 5, and the retrieval result based on the SIFT descriptor kernel density method is shown in fig. 6.

As can be seen from the search results of fig. 4, 5, and 6: the background of the image to be retrieved is clear, the color information of the banyan is clear, the crown of the banyan is large, most of the image is covered, and rich textural feature information is formed; the shape information between the crown and the background of the image to be retrieved and at the trunk is clearer.

Each image to be retrieved returns 30 images, wherein the images accurately retrieved by the method of the invention are respectively 23, 25 and 25, the precision ratio is respectively 76.7%, 83.3% and 83.3%, and the average precision ratio (76.7+76.7+83.3+83.3+83.3)/5 is 100% and 80.66%; the images accurately searched by the SIFT descriptor histogram method are 23, 24, 25 and 25 respectively, the precision ratio is 76.7%, 80%, 83.3% and 83.3%, and the average precision ratio is (76.7+76.7+80+83.3+83.3)/5 x 100% is 80%; the images accurately searched by the SIFT descriptor density checking method are 23, 24, 25 and 25 respectively, the precision ratio is 76.7%, 80%, 83.3% and 83.3%, and the average precision ratio is (80+76.7+76.7+83.3+83.3)/5 × 100% and 80%;

for pictures with simple backgrounds, the difference between the retrieval method disclosed by the invention and the images retrieved by the histogram retrieval based on the SIFT descriptor and the kernel density based on the SIFT descriptor is not large, and the precision rate difference is not large and reaches about 80%.

Experiment two: background complex experiment of image to be retrieved

Taking the 'tiger' image with complex background as the image to be retrieved, randomly extracting 5 images from all the 'tiger' images as query images, and finally taking the average value of the precision ratios of the selected 5 images as the final result. The precision ratio is defined as follows: the precision ratio (number of images related to the key map in the query result/number of images returned by the query) is 100%.

A tiger image with a complex background is given as an image to be retrieved, as shown in fig. 7; the retrieval result of the method according to the invention is shown in fig. 8, the retrieval result based on the SIFT descriptor histogram method is shown in fig. 9, and the retrieval result based on the SIFT descriptor kernel density method is shown in fig. 10.

As can be seen from fig. 8, a total of 30 images were returned, of which 26 were retrieved with an accuracy of 86.7%. The first image of the retrieval result is the image to be retrieved, 25 retrieved images in the remaining 29 images are all the images of the tiger class, and the shape of the tiger head, the pattern of the tiger skin, the characteristics of the background area and the like in the 25 images are very similar to the image to be retrieved.

As can be seen from fig. 9, a total of 30 images were returned, of which 12 images were retrieved with an accuracy of 40%. As can be seen from fig. 10, a total of 30 images were returned, of which 13 images were accurately retrieved with an accuracy of 43.3%. The two search results show that although 12 and 13 searched images are also the images of the tiger type, the shape of the tiger head, the patterns of the tiger skin and the background area are greatly different from the image to be searched, but the characteristic background of the searched image is single.

Retrieving the other four images to be retrieved as the images of the tiger, wherein each image to be retrieved returns 30 images, the images accurately retrieved by the method are respectively 25, 26 and 27, the precision ratio is respectively 83.3%, 86.7% and 90.0%, and the average precision ratio is (86.7+83.3+83.3+86.7+90.0)/5 x 100% is 86.0%; the images accurately searched by the SIFT descriptor histogram method are respectively 12, 13 and 13, the precision ratio is respectively 40.0%, 43.3% and 43.3%, and the average precision ratio is (40.0+40.0+40.0+43.3+43.3)/5 is 100% and is 41.32%; the images accurately searched by the SIFT descriptor check density method are respectively 12, 13 and 13, the precision ratio is respectively 40.0%, 40.0% and 43.3%, and the average precision ratio is (43.3+40.0+40.0+43.3+43.3)/5 x 100% is 41.98%;

from the search results of experiment two, it can be found that the average precision ratio of the two search results which are not fused in the search of the pictures with complicated backgrounds only reaches 41.32% and 41.98%, which is equivalent to that the pictures with complicated backgrounds cannot be searched at all. The average precision ratio of the method reaches 86%, and the precision ratio is not reduced due to the complex background, so that the retrieval result fully proves that the image retrieval method combining the expandable vocabulary tree information fusion and the Hausdorff distance can make up the defect that the original retrieval method cannot retrieve the picture with the complex background.

Claims

1. The fusion method for image retrieval is characterized by comprising fusion of SIFT descriptor kernel density and SIFT descriptor histogram and comprises the following steps:

the basic probability distribution function of subset A is

The basic probability distribution function of the subset B is

Wherein, M is a normalization constant,

The Dempster combination rule is:

substituting the results m (A) and m (B) obtained in the step c1 into m (AB);

m (a) represents the basic probability distribution function of subset a, m (B) represents the basic probability distribution function of subset B, and m (ab) represents the fused basic probability distribution function of subset a and subset B;

the fusion method facing the image retrieval is used for an image retrieval method based on combination of vocabulary tree information fusion and Hausdorff distance, and the image retrieval method based on combination of the vocabulary tree information fusion and the Hausdorff distance comprises the following steps:

step a, extracting an image to be retrieved and SIFT characteristics of an image library; the method comprises the following specific steps:

step a 1: constructing a Gaussian difference scale function of an image to be retrieved and an image library;

step a 2: detecting extreme points in a Gaussian difference scale space;

step a 3: removing feature points with unstable edges and generating SIFT descriptors;

b, generating an SIFT descriptor histogram and SIFT descriptor kernel density; the method comprises the following specific steps:

step b 1: constructing an extensible vocabulary tree through hierarchical clustering of SIFT descriptors;

step b 2: accumulating the occurrence times of the descriptors on each node in the extensible vocabulary tree to obtain an SIFT descriptor histogram;

step b 3: quantizing the SIFT descriptors to obtain SIFT descriptor kernel density;

step c, fusing SIFT descriptor kernel density and SIFT descriptor histogram; the method comprises the following specific steps:

step c 1: obtaining a SIFT descriptor histogram and a basic probability distribution function of SIFT descriptor kernel density;

step c 2: a fusion result is obtained by applying Dempster combination rule and combining the step c 1;

step d, improving the traditional Hausdorff distance measurement; the method comprises the following specific steps:

step d 1: writing a differential equation form of the cost function;

the differential equation form of the cost function is as follows:

step d 2: obtaining a general solution of the cost function;

wherein, γ₀The cost function is an initial value of the cost function, the range of the cost function is 0-1, k is a proportionality coefficient, and tau is a matching parameter;

step d 3: the traditional Hausdorff distance is used as a variable of the cost function, and the Hausdorff distance is improved;

given two haveLimited set X ═ X₁,x₂,...,x_MY ═ Y₁,y₂,...,y_NThe conventional Hausdorff distance between X and Y is defined as

the improved Hausdorff distance is:

where | X | is the number of the finite set X, d_H(X, Y) is the modified Hausdorff distance, d (X, Y) is the conventional Hausdorff distance, and γ (d (X, Y)) is a cost function with a variable d (X, Y);

step e, using the improved Hausdorff distance for image matching; the method comprises the following specific steps: