CN105354578B - Multi-target object image matching method

Info

Publication number
CN105354578B (application CN201510712952.0A)
Authority
CN
China
Prior art keywords
local
affine
matching
density
feature
Legal status
Expired - Fee Related
Application number
CN201510712952.0A
Other languages
Chinese (zh)
Other versions
CN105354578A (en)
Inventor
汪粼波
方贤勇
仲红
张少杰
Current Assignee
Anhui University
Original Assignee
Anhui University
Filing and publication
Application CN201510712952.0A filed by Anhui University; published as CN105354578A; application granted; published as CN105354578B. Current legal status: Expired - Fee Related.

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06V 10/443 Local feature extraction by analysis of parts of the pattern, by matching or filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G06F 18/232 Non-hierarchical techniques
    • G06F 18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F 18/23213 Non-hierarchical techniques using statistics or function optimisation, with a fixed number of clusters, e.g. K-means clustering


Abstract

The invention discloses a multi-target image matching method comprising the following steps: step 1, image pre-matching; step 2, estimating the local affine transformation of each initially matched feature region; step 3, defining the local affine transformation distance between any two local feature matching pairs; step 4, defining an affine transformation space density function based on the affine transformation distance; step 5, density-based clustering of the affine transformation space, locating all clusters of comparatively high density; step 6, presenting the results. Beneficial technical effect: the invention overcomes the slow convergence, the difficulty of reaching a globally optimal solution, and the restriction to single-target object matching that afflict existing optimization-based image matching algorithms, and effectively improves the accuracy and efficiency of multi-target image matching.

Description

Multi-target object image matching method
Technical Field
The invention belongs to the technical field of image processing, computer vision and multimedia information, and particularly relates to a multi-target object image matching method.
Background
With the popularization of smart phones and the rapid development of Internet information technology, digital images have become an indispensable medium of daily communication in modern society, and image resources have become a vital information resource. Research on images is accordingly receiving wide attention. Multi-target object image matching is a long-studied subject whose related techniques find strong application in image processing tasks such as object recognition, image retrieval, image classification and near-duplicate image detection, as well as in computer vision research; it is fundamental work of both theoretical and practical value in these fields. Image matching research has developed greatly in recent years, and a large number of excellent local feature detection and description operators have emerged. Representative local region detectors include MSER, Harris-Affine, Hessian-Affine and DoG; local region feature descriptors include SIFT, SURF and DAISY. A local feature detector detects representative local regions with distinguishable characteristics in the image, such as corner regions, and a feature descriptor quantizes the information of each detected region into a local feature vector. Image matching is then carried out by matching the local feature description vectors, and the resulting set of feature matching pairs forms the initial local feature matching set. Thanks to the good distinctiveness and invariance of local feature descriptions, the initial matching set usually contains a large number of correct matching pairs; at the same time, owing to the complex diversity of image content and the quantization errors of the local region information, it also contains a large number of wrong local feature matches. Subsequent algorithms therefore select the correct local feature matches by introducing local geometric mapping constraints among the matched features. A common strategy is to refine the initial matching set by Graph Matching: each node of the graph represents the matching degree of a single matching pair, the association between nodes corresponds to the consistency of the local geometric mapping relation between two pairs of matched features, and the final feature matches are determined by locating the closely associated nodes that maximize the energy of the whole matching graph. This is essentially a quadratic optimization problem whose solution requires large time overhead, so it cannot be applied to large-scale matching tasks such as near-duplicate image detection. In addition, such methods can only handle the matching of a single target object and cannot perform multi-target object matching.
Aiming at these problems, the inventors carried out this work with the support of the Anhui University information security specialty transformation and new specialty construction project No. J05201380, the National Science Foundation youth fund No. 61502005, the Nanjing University novel software technology key laboratory open fund KFKT2015B03, and the Anhui University academic and technical leader recruitment project.
Disclosure of Invention
The purpose of the invention is as follows: aiming at the defects of the prior art, the invention provides a multi-target object image matching method.
The technical scheme is as follows: the invention discloses a multi-target object image matching method. Local features of an image pair are extracted and pre-matched to obtain an initial local feature matching set; the local affine mapping matrix of each pair of matched feature regions is estimated; an affine matrix distance is defined and an affine space is constructed. Based on the observation that correct local feature matching pairs have mutually consistent affine transformations and therefore gather into high-density clusters in this space, a density function is defined and the density value of each affine matrix is estimated; the modal points and clustering paths of all clusters in the space are determined to realize density-based clustering, and the clustering is refined through the boundaries of the clusters. Finally, the correct local feature matches and their grouping are determined from the cluster membership of the affine matrices. The method processes a pair of images to be matched by computer in the following order:
Step 1, local feature extraction and pre-matching. Step 2, estimating the local affine transformation of each initially matched region. Step 3, calculating the affine transformation distances. Step 4, defining the affine space density function. Step 5, density-based clustering of the affine transformation space. Step 6, determining the correct matching pairs and presenting the result.
Further, the detailed steps of the invention are as follows: step 1, local feature extraction and pre-matching: extract the local feature regions of the image pair to be matched and their feature descriptions, and establish an initial matching set of local features between the images based on the similarity of the descriptions.
Step 2, estimating the local affine transformation of each initially matched feature region: on the basis of the local feature extraction and pre-matching of step 1, estimate the local affine transformation of each initially matched feature region of the image pair.
Step 3, calculating the affine transformation distance: on the basis of step 1, quantify the local affine transformation distance between any two local feature matching pairs of the image pair.
Step 4, defining the affine space density function: on the basis of step 1, define an affine transformation space density function of the image pair based on the affine transformation distance.
Step 5, performing density-based clustering of the affine transformation space of the image pair, using the local affine transformations obtained in step 2, the local affine transformation distances obtained in step 3 and the affine transformation space density function obtained in step 4, to obtain the clusters and their member affine matrices.
Step 6, presenting the result: according to the correspondence between the affine matrices of the clusters obtained in step 5 and the local feature matching pairs of step 1, determine the final local feature matching pairs and group them by cluster; feature matching pairs of the same group correspond to the same target object, and the local feature matching pairs located on the different target objects are finally presented.
Further, step 1 is specifically as follows: step 1, image pre-matching: extract the local feature regions of the image pair to be matched and their feature descriptions, and match the local features between the images based on the similarity of the descriptions. Feature region detection uses the MSER and DoG detectors, which yield a number of local elliptical or circular regions with geometric/scale invariance, each described by its geometric parameters. Feature vectors are then generated with the SIFT descriptor, and preliminary feature matching is performed based on the similarity of Euclidean distances between the feature vectors. The sub-steps are:
Step 1.1, extracting feature regions: extract the local feature regions of the image pair to be matched with an image local feature region detection operator. The local feature regions are scale invariant, and their geometric shape as produced by the detection operator is a circle or an ellipse. The preferred scheme is as follows: extract the feature regions of the image pair with the MSER or DoG operator. The DoG detection result is a set of circular regions; the MSER result is elliptical.
Step 1.2, mapping an elliptical region: the input of the SIFT feature description is a circular region, so if the MSER operator is used for feature detection, each ellipse must first be mapped to a circular region. Assume ellipse $E_i$ has center $p_i$, major and minor axis radii $(l_1^i, l_2^i)$ and rotation angle $\omega_i$; it is mapped onto a circular region $O_i$ of fixed radius $r_c$ with the same center $p_i$. In homogeneous coordinates the mapping is

$(x', y', 1)^T = \Gamma_i (x, y, 1)^T, \qquad \Gamma_i = \begin{bmatrix} A_i & p_i - A_i p_i \\ 0^T & 1 \end{bmatrix}, \qquad A_i = R(\omega_i)\,\mathrm{diag}(r_c/l_1^i,\; r_c/l_2^i)\,R(\omega_i)^{-1},$

where $R(\cdot)$ is a 2-D rotation matrix, $(x, y)^T$ is a point of $E_i$ and $(x', y')^T$ is the corresponding point of $O_i$. $r_c$ is set to 13. In the actual mapping, to guarantee that every integer pixel of $O_i$ receives a value, each point of $O_i$ is inversely mapped onto $E_i$:

$(x, y, 1)^T = \Gamma_i^{-1} (x', y', 1)^T.$

The coordinates $(x, y)^T$ obtained this way may lie at sub-pixel positions of the image; their pixel values are computed by bilinear interpolation, which determines the pixel value at coordinate $(x', y')^T$ of $O_i$.
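A minimal Python sketch of this inverse-mapping scheme follows (numpy assumed; the function names are illustrative, and the matrix built by `ellipse_to_circle_matrix` follows the reconstruction of $\Gamma_i$ above rather than the patent's original formula figures):

```python
import numpy as np

def ellipse_to_circle_matrix(p, l1, l2, omega, rc=13.0):
    """Gamma_i: maps the ellipse (center p, axis radii l1/l2, rotation
    omega) onto the circle of radius rc with the same center p."""
    c, s = np.cos(omega), np.sin(omega)
    R = np.array([[c, -s], [s, c]])
    A = R @ np.diag([rc / l1, rc / l2]) @ R.T   # align axes, rescale, rotate back
    G = np.eye(3)
    G[:2, :2] = A
    G[:2, 2] = p - A @ p                        # keeps the center p fixed
    return G

def bilinear(img, x, y):
    """Bilinear interpolation of a gray image at sub-pixel (x, y)."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, img.shape[1] - 1), min(y0 + 1, img.shape[0] - 1)
    dx, dy = x - x0, y - y0
    return ((1 - dx) * (1 - dy) * img[y0, x0] + dx * (1 - dy) * img[y0, x1] +
            (1 - dx) * dy * img[y1, x0] + dx * dy * img[y1, x1])

def sample_circle_patch(img, G, p, rc=13):
    """Fills every integer pixel of O_i by mapping it back onto the
    ellipse with G^-1 and interpolating bilinearly (step 1.2)."""
    Ginv = np.linalg.inv(G)
    size = 2 * rc + 1
    patch = np.zeros((size, size))
    for yy in range(size):
        for xx in range(size):
            u, v = xx - rc + p[0], yy - rc + p[1]    # image coords inside O_i
            if (u - p[0]) ** 2 + (v - p[1]) ** 2 > rc * rc:
                continue                             # outside the circle
            x, y, _ = Ginv @ np.array([u, v, 1.0])   # back onto the ellipse
            if 0 <= x < img.shape[1] - 1 and 0 <= y < img.shape[0] - 1:
                patch[yy, xx] = bilinear(img, x, y)
    return patch
```

Mapping inversely guarantees that every pixel of $O_i$ is filled; a forward mapping of ellipse pixels would leave holes.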
Step 1.3, calculating the main gradient direction angle of a feature region: given a local feature circular region $O_i$ with center coordinate $p_i$ and radius $r_i$, the gradient magnitude $m(x, y)$ and direction $\phi(x, y)$ of each pixel in the region are computed as

$m(x, y) = \sqrt{(L(x+1, y) - L(x-1, y))^2 + (L(x, y+1) - L(x, y-1))^2},$
$\phi(x, y) = \arctan\dfrac{L(x, y+1) - L(x, y-1)}{L(x+1, y) - L(x-1, y)},$

where $L(x, y)$ is the pixel value at coordinate $(x, y)$ within the region. The angular interval $[0, 360)$ is divided into 36 equal bins; each gradient magnitude $m(x, y)$ is accumulated into the bin indicated by its angle $\phi(x, y)$ to form a gradient histogram, and the median gradient angle of the most populated bin is selected as the main gradient direction $\theta_i$.
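A compact sketch of this histogram procedure (numpy assumed; the bin center is used here as a stand-in for the bin's median gradient angle):

```python
import numpy as np

def main_orientation(patch):
    """Main gradient direction of a circular patch via a 36-bin
    histogram over [0, 360), as in step 1.3."""
    dx = np.zeros_like(patch); dy = np.zeros_like(patch)
    dx[:, 1:-1] = patch[:, 2:] - patch[:, :-2]      # L(x+1,y) - L(x-1,y)
    dy[1:-1, :] = patch[2:, :] - patch[:-2, :]      # L(x,y+1) - L(x,y-1)
    m = np.sqrt(dx ** 2 + dy ** 2)                  # gradient magnitude
    phi = np.degrees(np.arctan2(dy, dx)) % 360.0    # gradient direction
    hist = np.zeros(36)
    bins = (phi // 10.0).astype(int) % 36           # 36 ten-degree bins
    np.add.at(hist, bins.ravel(), m.ravel())        # accumulate magnitudes
    best = int(np.argmax(hist))
    return best * 10.0 + 5.0                        # bin-center angle in degrees
```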
Step 1.4, extracting feature vectors: generate the feature vector of each circular region with the SIFT feature descriptor, which outputs a 128-dimensional gradient histogram vector for each local region (for the detailed algorithm see D. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints", International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, 2004).
Step 1.5, preliminary matching of local features: feature description vectors $D_i$ and $D_j$ match if and only if their distance $d(D_i, D_j)$, multiplied by a threshold (set to 1.1), is not greater than the distance from $D_i$ to every other feature vector. The distance is defined as

$d(D_i, D_j) = \sqrt{\sum_{k=1}^{128} (D_{ik} - D_{jk})^2},$

where $D_{ik}$ is the $k$-th component of vector $D_i$. This initial matching criterion was determined from large-scale statistics: because the feature vectors are high dimensional, the nearest neighbor can only be trusted as the match of the current feature vector when the gap between the nearest-neighbor and second-nearest-neighbor distances is large enough. "Large enough" is decided with a ratio threshold; experiments show that 1.1 works well, i.e. the second-nearest neighbor must be at least 1.1 times farther than the nearest neighbor. The local feature matching pairs obtained this way usually contain most of the correct matches, but also many false ones. The false matches must be removed based on the local geometric transformation property: each pair of matched local feature regions defines a local geometric mapping; the mappings of several correctly matched feature regions belonging to the same or similar objects are mutually consistent, while the mappings of wrongly matched regions are random and show no obvious association with one another. Hence, if a geometric parameter space of the local mappings of all initial matching pairs is constructed, the correct local feature matching pairs gather into high-density clusters, with different matched target objects corresponding to different clusters, while the wrong matching pairs scatter over the whole space as randomly distributed noise. Density-based clustering in this local mapping parameter space can therefore refine the local feature matching pairs and realize multi-target object matching in the image pair.
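A minimal sketch of this ratio-test pre-matching (numpy assumed; a brute-force search is shown, whereas a practical implementation would use a k-d tree or similar):

```python
import numpy as np

def ratio_match(desc1, desc2, ratio=1.1):
    """Preliminary matching by the ratio rule of step 1.5: D_i is matched
    to its nearest neighbour only if every other vector of desc2 is at
    least `ratio` times farther away. desc1, desc2: (n, 128) arrays."""
    matches = []
    for i, d in enumerate(desc1):
        dist = np.linalg.norm(desc2 - d, axis=1)    # Euclidean distances
        order = np.argsort(dist)
        nearest, second = order[0], order[1]        # needs len(desc2) >= 2
        if ratio * dist[nearest] <= dist[second]:   # ratio test, threshold 1.1
            matches.append((i, int(nearest)))
    return matches
```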
Step 2, to realize the refinement of the initial matching based on local mappings, first estimate the local affine transformation of each initially matched feature region, specifically:
When a feature region detection result is a circular region, denote any pair of matched circular regions by $O_i$ and $O'_i$, with geometric parameters center coordinates $p_i$ and $p'_i$, radii $r_i$ and $r'_i$, and gradient angles $\alpha_i$ and $\alpha'_i$. The mapping from a point $(x, y)^T$ of $O_i$ to the point $(x', y')^T$ of $O'_i$ can be expressed as

$(x', y', 1)^T = T_i (x, y, 1)^T, \qquad T_i = \begin{bmatrix} R_i & t_i \\ 0^T & 1 \end{bmatrix},$

where

$R_i = s_i \begin{bmatrix} \cos\theta_i & -\sin\theta_i \\ \sin\theta_i & \cos\theta_i \end{bmatrix}, \qquad \theta_i = \alpha'_i - \alpha_i, \qquad s_i = r'_i / r_i, \qquad t_i = p'_i - R_i p_i.$

Here $\theta_i$ is the rotation angle difference of the two circles, $s_i$ the ratio of their radii, and $t_i$ the mapped center offset.
When a feature region detection result is an elliptical region, denote any pair of matched elliptical regions by $E_i$ and $E'_i$, with center coordinates $p_i$ and $p'_i$, major and minor axis radii $(l_1^i, l_2^i)$ and $(l_1'^i, l_2'^i)$, and rotation angles $\omega_i$ and $\omega'_i$. Following step 1.2, before SIFT feature extraction $E_i$ is mapped onto the circular region $O$ of radius $r_c$ centered at $p_i$ by the mapping matrix $\Gamma_i$, and similarly the ellipse $E'_i$ is mapped onto the circular region $O'$ of radius $r_c$ centered at $p'_i$ by the mapping matrix $\Gamma'_i$.
Once $O$ and $O'$ are obtained, their gradient direction angles $\alpha_i$ and $\alpha'_i$ are estimated as in step 1.3. On this basis the mapping of $O$ to $O'$ can be expressed as

$(x', y', 1)^T = T_i (x, y, 1)^T, \qquad \theta_i = \alpha'_i - \alpha_i, \qquad t_i = p'_i - R_i p_i,$

where $\theta_i$ is the gradient direction difference of $O$ and $O'$ (both radii equal $r_c$, so $s_i = 1$) and $t_i$ is the mapped center offset. The affine mapping from $E_i$ to $E'_i$ can therefore be expressed as

$X_i = \Gamma_i'^{-1} T_i \Gamma_i.$
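The two estimation cases can be written compactly as in the following sketch (numpy assumed; `G1` and `G2` stand for the ellipse-to-circle matrices $\Gamma_i$ and $\Gamma'_i$ of step 1.2):

```python
import numpy as np

def rot(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def circle_affine(p, r, alpha, p2, r2, alpha2):
    """T_i between matched circles: rotation theta = alpha' - alpha,
    scale s = r'/r, translation t = p' - R p (step 2, circular case)."""
    R = (r2 / r) * rot(alpha2 - alpha)
    T = np.eye(3)
    T[:2, :2] = R
    T[:2, 2] = p2 - R @ p
    return T

def ellipse_affine(G1, G2, T):
    """X_i = Gamma'^-1 T Gamma (step 2, elliptical case): into the first
    circle, across the circles, back out of the second circle."""
    return np.linalg.inv(G2) @ T @ G1
```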
Step 3, defining local affine transformation difference distances between any two pairs of local feature matching pairs:
Step 3.1, given two affine matrices $X_i$ and $X_j$, their difference $X_i - X_j$ has no practical meaning, so the Euclidean distance cannot simply be used to measure the distance between $X_i$ and $X_j$. A more reliable strategy is to apply $X_i$ and $X_j$ simultaneously to a series of planar points $p_k$, $k = 1, \dots, n$, compute the distances between the mapped points, and measure the difference between $X_i$ and $X_j$ on that basis; the distance of $X_i$ and $X_j$ can thus be preliminarily defined as

$D(X_i, X_j) = \frac{1}{n} \sum_{k=1}^{n} \left\| k(X_i \tilde p_k) - k(X_j \tilde p_k) \right\|,$

where $X_i \tilde p_k$ maps the point $p_k$ (homogeneous form $\tilde p_k = (p_k, 1)^T$) by $X_i$; the result is a three-dimensional coordinate, which the function $k(\cdot)$ converts back into the corresponding two-dimensional planar coordinate.
Regarding the choice of the coordinate points $p_k$: in theory, every point of the support regions used to estimate $X_i$ and $X_j$ is a candidate, i.e. if $X_i$ maps ellipse $E_i$ to $E'_i$ and $X_j$ maps ellipse $E_j$ to $E'_j$, all points of $E_i$ and $E_j$ are usable coordinate points. In practice, for simplicity of computation, only the centers $p_i$ and $p_j$ of $E_i$ and $E_j$ are selected, i.e.

$D_{j|i} = \left\| k(X_i \tilde p_j) - p'_j \right\|, \qquad D_{i|j} = \left\| k(X_j \tilde p_i) - p'_i \right\|,$

where $p'_j$ is the center coordinate of $E'_j$ and $D_{j|i}$ is the distance between $p'_j$ and the point obtained by mapping $p_j$ with $X_i$. Using both terms ensures the symmetry of the distance, i.e. $D(X_i, X_j) = D(X_j, X_i)$. In the same way, for the inverse transformations,

$D'_{j|i} = \left\| k(X_i^{-1} \tilde p'_j) - p_j \right\|, \qquad D'_{i|j} = \left\| k(X_j^{-1} \tilde p'_i) - p_i \right\|.$

Further, considering that the distance between any two affine transformations and the distance between their inverse transformations should be equal, $D(X_i, X_j)$ is finally defined as

$D(X_i, X_j) = \frac{1}{4}\left( D_{j|i} + D'_{j|i} + D_{i|j} + D'_{i|j} \right).$
Step 3.2, defining the conflicting-transformation penalty distance: if $X_i$ maps ellipse $E_i$ to $E'_i$ and $X_j$ maps $E_j$ to $E'_j$, with $E_i$ and $E_j$ the same ellipse but $E'_i$ and $E'_j$ different, then usually at most one of $E'_i$ and $E'_j$ is the correct correspondence of $E_i$, i.e. at most one of $X_i$ and $X_j$ is the correct transformation; $X_i$ and $X_j$ are therefore defined as conflicting transformations. Conflicting transformations are inherently different (usually at most one is correct and the others are wrong) and should be far apart, but when the distance is computed as in step 3.1, the identical $E_i$ and $E_j$ lead to $D'_{j|i} = D'_{i|j} = 0$, so the value of $D(X_i, X_j)$ comes out small. To emphasize the difference between $X_i$ and $X_j$, their distance is defined to be no less than the constant $C$:

$D(X_i, X_j) = \max\!\left( D(X_i, X_j),\; C \right), \quad \text{if } X_i \text{ and } X_j \text{ conflict},$

where $\max(x, y)$ selects the larger of $x$ and $y$, and $C$ is a constant set to a comparatively large value, 250 in the invention. Similarly, when $E'_i$ and $E'_j$ are the same but $E_i$ and $E_j$ differ, the mapping matrices $X_i$ and $X_j$ are also conflicting transformations.
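A sketch of the distance of steps 3.1-3.2 (numpy assumed; whether two transforms conflict, i.e. share a source or a target region, is left to the caller to determine):

```python
import numpy as np

def _map(X, p):
    """Applies 3x3 affine X to 2-D point p; k(.) drops the third coordinate."""
    q = X @ np.array([p[0], p[1], 1.0])
    return q[:2]

def affine_distance(Xi, Xj, pi, pi2, pj, pj2, conflict=False, C=250.0):
    """D(X_i, X_j): mean of the four mapped-center distances over the
    forward transforms and their inverses; conflicting transforms are
    pushed at least C apart."""
    Xi_inv, Xj_inv = np.linalg.inv(Xi), np.linalg.inv(Xj)
    d = (np.linalg.norm(_map(Xi, pj) - pj2) +          # D_{j|i}
         np.linalg.norm(_map(Xi_inv, pj2) - pj) +      # D'_{j|i}
         np.linalg.norm(_map(Xj, pi) - pi2) +          # D_{i|j}
         np.linalg.norm(_map(Xj_inv, pi2) - pi)) / 4.0 # D'_{i|j}
    return max(d, C) if conflict else d
```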
Step 4, defining the affine transformation space density function: the density of a data point in the space is positively correlated with the number of points near it; the more such points, the higher the density. On the other hand, the density function is local, i.e. the density of each data point is only related to the distribution of data points within a limited distance. Combining these two factors, for an affine matrix $X_i$ in a given affine space, its density function is defined as

$\rho_i = \sum_{j \ne i}^{N} \exp\!\left( - \frac{D(X_i, X_j)^2}{\sigma^2} \right),$

i.e. the density $\rho_i$ sums the Gaussian distances from $X_i$ to the remaining transformations $X_j$: a transformation matrix $X_j$ at close range contributes much to the density, and the contribution decays until negligible once the distance exceeds $\sigma$. $N$ is the total number of affine matrices, and $\sigma$ is set dynamically, typically to the value at the 2% position of all $D(X_i, X_j)$ sorted from small to large.
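A sketch of this density estimate (numpy assumed; the patent's formula image is not reproduced, so the Gaussian form exp(−D²/σ²) below is an assumption consistent with the surrounding description):

```python
import numpy as np

def densities(D, quantile=0.02):
    """rho_i of step 4: Gaussian-weighted count of neighbours. sigma is
    set dynamically to the value at the 2% position of all pairwise
    distances sorted ascending. D: (N, N) symmetric distance matrix."""
    vals = np.sort(D[np.triu_indices_from(D, k=1)])
    sigma = vals[int(quantile * len(vals))]
    W = np.exp(-(D / sigma) ** 2)       # assumed Gaussian kernel form
    np.fill_diagonal(W, 0.0)            # exclude the self-contribution
    return W.sum(axis=1), sigma
```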
Step 5, clustering: to find the affine matrix data points that are densely distributed and close together in the affine space, a density-based clustering method is adopted for cluster localization. Since the shape of a cluster in this space may be arbitrary, common clustering techniques such as the K-means algorithm, which require the data distribution to be Gaussian, are unsuitable; popular density-based methods such as Mean Shift clustering require a kernel density estimation function consistent with the spatial data distribution. In view of this, on the basis of the customized density function of each data point, a density clustering method suitable for arbitrary distributions is adopted for the affine space. The method starts from each point of the space and moves repeatedly to a nearby data point of higher density until the density extreme point of the local region, i.e. the modal point of the current cluster, is reached; all data points whose paths end at the same modal point are classified into the same cluster. Then the boundaries between different clusters and the boundary density of each cluster are determined, and each cluster is refined based on its boundary density. The sub-steps are as follows: Step 5.1, defining the clustering path: for a given affine matrix $X_i$, starting from $X_i$, locate the affine matrix $X_j$ whose density $\rho_j$ is greater than $\rho_i$ and which is nearest to $X_i$; $X_j$ is defined as the "cluster parent node" of $X_i$. Starting from $X_j$, locate its own cluster parent node, and repeat until the "modal point" of maximum density in the current cluster is reached.
Step 5.2, definition of modal points: the modal points of the space are the points of highest density within each cluster. To locate them, a following distance function $\delta_i$ is first defined for each point of the space:

$\delta_i = \min_{j:\, \rho_j > \rho_i} D(X_i, X_j),$

i.e. for an affine matrix $X_i$ in the space, the following distance $\delta_i$ is its distance to the nearest affine matrix $X_j$ of greater density. Thus, when $X_i$ and $X_j$ lie in the same cluster, $\delta_i$ usually decreases as the density $\rho_i$ increases; only when $X_i$ is the density maximum of the current cluster, i.e. its modal point, does $X_j$ lie in another cluster, and $\delta_i$ then shows a jump increase. Finally, when the density value $\rho_i$ of $X_i$ is the maximum over the whole space, $\delta_i$ is defined as the maximum distance from $X_i$ to the remaining affine matrices:

$\delta_i = \max_j D(X_i, X_j).$
On the basis of the following distance $\delta_i$, define the function

$\eta_i = \rho_i \delta_i.$

For an affine matrix $X_i$ in the space, $\eta_i$ is proportional to both its density value $\rho_i$ and $\delta_i$: $\eta_i$ is large only when $\rho_i$ and $\delta_i$ are both large. Within one cluster, the density $\rho_i$ increases gradually from the edge to the center and reaches its maximum at the modal point, while $\delta_i$ decreases gradually from edge to center but jumps upward at the modal point. Hence $\rho_i$ and $\delta_i$ are both large, i.e. $\eta_i$ is large, if and only if $X_i$ is a modal point of the space. Modal points can therefore be determined by first computing $\eta_i$ for all points of the space and sorting in descending order; the top $K$ data points are the modal points. Since each cluster has exactly one modal point, $K$ is the number of clusters in the space; its value can be specified in advance, or determined dynamically by filtering out the clusters containing too few affine matrix points once the number of affine matrices per cluster is known.
Step 5.3, preliminary clustering: starting from each affine matrix $X_i$ of the space, move repeatedly toward a modal point along the clustering path of step 5.1, and finally classify all affine matrices converging to the same modal point into the same cluster.
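Steps 5.1-5.3 amount to the following sketch (numpy assumed; this mirrors density-peaks-style clustering as described, with the K modal points chosen by the largest η):

```python
import numpy as np

def density_peak_clusters(D, rho, K):
    """Following distance delta, eta = rho * delta, top-K modal points,
    and labels obtained by following each point's 'cluster parent node'
    (nearest higher-density point) up to a modal point."""
    N = len(rho)
    order = np.argsort(-rho)                    # indices by decreasing density
    parent = np.full(N, -1)
    delta = np.empty(N)
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = D[i].max()               # global density maximum
            continue
        higher = order[:rank]                   # all denser points
        j = higher[np.argmin(D[i, higher])]     # nearest denser point
        parent[i], delta[i] = j, D[i, j]
    eta = rho * delta
    modes = list(np.argsort(-eta)[:K])          # modal points: largest eta
    if order[0] not in modes:                   # the global maximum is always
        modes[-1] = order[0]                    # a modal point
    labels = np.full(N, -1)
    labels[modes] = np.arange(K)
    for i in order:                             # parents are denser, hence
        if labels[i] < 0:                       # already labeled
            labels[i] = labels[parent[i]]
    return labels, delta, eta, np.array(modes)
```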
Step 5.4, determining the cluster boundaries and the maximum boundary density: assume that affine matrix $X_i$ belongs to cluster $C_k$ after the clustering of step 5.3, and that its distance to an affine matrix $X_j$ of cluster $C_l$ ($l \ne k$) is smaller than the Gaussian variance $\sigma$ used when defining the density function $\rho_i$, i.e.

$D(X_i, X_j) < \sigma.$

Then $X_i$ and $X_j$ are considered to form a boundary of clusters $C_k$ and $C_l$, with boundary density defined as

$\rho_{ij} = (\rho_i + \rho_j) / 2.$

The maximum boundary density of cluster $C_k$ is then determined as

$\bar\rho_k = \max_{ij} \rho_{ij},$

i.e. $\bar\rho_k$ is the maximum of the boundary densities between $C_k$ and all other clusters.
Step 5.5, refining the clusters: retain in cluster $C_k$ all affine matrices $X_i$ whose density is greater than $\bar\rho_k$, i.e. $\rho_i > \bar\rho_k$, and eliminate the remaining data points as noise.
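Steps 5.4-5.5 then reduce to the following sketch (numpy assumed; `sigma` is the same value used in the density estimate):

```python
import numpy as np

def refine_clusters(D, rho, labels, sigma):
    """Pairs in different clusters closer than sigma form cluster
    boundaries of density (rho_i + rho_j) / 2; points not denser than
    their cluster's maximum boundary density become noise (-1).
    Expects labels >= 0 for every point (output of step 5.3)."""
    K = int(labels.max()) + 1
    max_border = np.zeros(K)
    N = len(rho)
    for i in range(N):
        for j in range(i + 1, N):
            if labels[i] != labels[j] and D[i, j] < sigma:
                b = 0.5 * (rho[i] + rho[j])              # boundary density
                max_border[labels[i]] = max(max_border[labels[i]], b)
                max_border[labels[j]] = max(max_border[labels[j]], b)
    refined = labels.copy()
    refined[rho <= max_border[labels]] = -1              # -1 marks noise
    return refined
```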
Step 6, presenting the result: determine the correct feature matching pairs from the correspondence between the affine matrices of each cluster and the local feature matching pairs, group and present the matching pairs according to the clustering result of the affine matrices, and identify one pair of matched objects per group. Specifically:
Separate the affine matrices of the clusters, determine the local feature matching pair corresponding to each affine matrix, group the feature matching pairs by the cluster their affine matrices belong to, and present the local feature matching pairs on the different target objects; the local feature matching pairs of the same class correspond to the same matched target object or to matched objects of similar content.
The multi-target object image matching method has the advantage of being an image content matching method based on local image features: it exploits the consistency of the local affine mappings of the feature regions on matched objects and provides a fast affine-space clustering method to locate the groups of matching pairs with similar affine transformations, effectively overcoming the limitation of existing image matching algorithms that only one pair of target objects can be matched. The technical scheme solves the problems of existing optimization-based image matching algorithms, namely slow convergence of the optimization, difficulty of obtaining the globally optimal solution, and the general restriction to single-target object matching, and effectively improves the accuracy and efficiency of multi-target image matching.
Drawings
FIG. 1 is a basic flow diagram of the process of the present invention.
Fig. 2 is a set of image pairs containing multiple objects to be matched and a preliminary matching result thereof.
FIG. 3 is a schematic diagram of an affine transformation solution of an elliptical local feature region.
FIG. 4 is a schematic of an affine transformation distance solution.
FIG. 5 is a mapping of the initial matching set of FIG. 2 to an affine transformation on a two-dimensional plane.
FIG. 6 is a plot of the density values ρ and following distances δ for each affine transformation in the initial matching set of FIG. 2.
FIG. 7 is a representation of affine spatial clustering results of the initial matching set of FIG. 2 on a two-dimensional plane.
Fig. 8 is the final multi-target object matching result of fig. 2.
FIG. 9 is a comparison of the performance of the method of the present invention with a prior art image matching method.
Fig. 10 is a diagram showing an example of matching results of 2 sets of images.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Referring to fig. 1, the multi-target object image matching method processes a pair of images to be matched by computer in the following order: step 1, local feature extraction and pre-matching; step 2, estimating the local affine transformation of each initially matched region; step 3, calculating the affine transformation distances; step 4, defining the affine space density function; step 5, density-based clustering of the affine transformation space; step 6, determining the correct matching pairs and presenting the result.
Further, the concrete steps are as follows. Step 1, local feature extraction and pre-matching: extract the local feature regions of the image pair to be matched and their feature descriptions, and establish an initial matching set of local features between the images based on the similarity of the descriptions. Step 2, estimating the local affine transformation of each initially matched feature region: on the basis of step 1, estimate the local affine transformation of each initially matched feature region of the image pair. Step 3, calculating the affine transformation distance: on the basis of step 1, quantify the local affine transformation distance between any two local feature matching pairs. Step 4, defining the affine space density function: on the basis of step 1, define the affine transformation space density function of the image pair based on the affine transformation distance. Step 5, perform density-based clustering of the affine transformation space using the local affine transformations of step 2, the distances of step 3 and the density function of step 4, obtaining the clusters and their member affine matrices. Step 6, presenting the result: from the correspondence between the affine matrices of the clusters obtained in step 5 and the local feature matching pairs of step 1, determine the final local feature matching pairs and group them by cluster; matching pairs of the same group correspond to the same target object, and the local feature matching pairs located on the different target objects are finally presented. As the flowchart of fig. 1 shows, the method is a multi-step serial matching process: first the local features of the images are extracted and pre-matched by the Euclidean distance between features, yielding the initial local feature matching set; the local affine mapping matrix of each pair of matched feature regions is estimated, the affine matrix distance is defined and the affine space is constructed. Based on the observation that correct local feature matching pairs have mutually consistent affine transformations and gather into high-density clusters in this space, a density function is defined, the density value of each affine matrix is estimated, and the modal point and clustering path of every point of each cluster are determined, realizing density-based clustering; the clustering result is refined by determining the boundaries of the clusters. Finally the correct local feature matches and their grouping are determined from the cluster membership of the affine matrices, and the matching results of the multiple target objects are presented separately. Specifically, as shown in fig. 1, the disclosed multi-target object image matching method mainly comprises the following steps:
step 1, local feature extraction and image pre-matching: and extracting a local characteristic region of the image pair to be matched and the characteristic description thereof, and matching the local characteristics among the images based on the described similarity. The characteristic region detection mainly adopts MSER and DoG detectors, and the MSER and the DoG detectors are selected to carry out region detection to obtain a plurality of local elliptical or circular regions with geometric/scale invariance, and each region is described by corresponding geometric parameters and estimates the gradient direction angle of the region. And generating feature vectors by adopting an SIFT descriptor, and realizing primary feature matching based on a ratio threshold value of nearest neighbor and next nearest neighbor of Euclidean distance between the feature vectors.
Step 1.1, extracting the local feature regions of the image: extract the local feature regions of the image pair to be matched with an image local feature region detection operator, using the MSER or DoG operator. The DoG detection result is a set of circular regions, each represented by its center coordinate $p$ and radius $r$. The MSER detection result is a set of elliptical regions, each represented by its center coordinate $p$, major and minor axis radii $(l_1, l_2)$ and rotation angle $\omega$.
Step 1.2, mapping an elliptical feature region to a circular region: the input of the SIFT feature description is a circular region, so if the MSER operator is used for feature detection, each ellipse must first be mapped to a circular region. Assume ellipse $E_i$ has center $p_i$, major and minor axis radii $(l_1^i, l_2^i)$ and rotation angle $\omega_i$; it is mapped onto a circular region $O_i$ of fixed radius $r_c$ with the same center $p_i$:

$(x', y', 1)^T = \Gamma_i (x, y, 1)^T,$

with $\Gamma_i$ as defined in step 1.2 above, where $(x, y)^T$ is a point of $E_i$ and $(x', y')^T$ the corresponding point of $O_i$; $r_c$ is set to 13. In the actual mapping, to guarantee that every integer pixel of $O_i$ receives a value, each point of $O_i$ is inversely mapped onto $E_i$, i.e. $(x, y, 1)^T = \Gamma_i^{-1} (x', y', 1)^T$; the resulting coordinates $(x, y)^T$ may lie at sub-pixel positions of the image, and their pixel values are computed by bilinear interpolation, which determines the pixel value at coordinate $(x', y')^T$ of $O_i$.
Step 1.3, calculating the main gradient direction angle of a feature region: given a local feature circular region $O_i$ with center coordinate $p_i$ and radius $r_i$, the gradient magnitude $m(x, y)$ and direction $\phi(x, y)$ of each pixel in the region are computed as

$m(x, y) = \sqrt{(L(x+1, y) - L(x-1, y))^2 + (L(x, y+1) - L(x, y-1))^2},$
$\phi(x, y) = \arctan\dfrac{L(x, y+1) - L(x, y-1)}{L(x+1, y) - L(x-1, y)},$

where $L(x, y)$ is the pixel value at coordinate $(x, y)$ within the region. The angular interval $[0, 360)$ is divided into 36 equal bins; each gradient magnitude $m(x, y)$ is accumulated into the bin indicated by its angle $\phi(x, y)$ to form a gradient histogram, and the median gradient angle of the most populated bin is selected as the main gradient direction $\theta_i$.
Step 1.4, extracting feature vectors: generate the feature vector of each circular region with the SIFT feature descriptor, which outputs a 128-dimensional gradient histogram vector for each local region (for the detailed algorithm see D. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints", International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, 2004).
Step 1.5, preliminary matching of local features: feature description vectors $D_i$ and $D_j$ match if and only if their distance $d(D_i, D_j)$, multiplied by a threshold (set to 1.1), is not greater than the distance from $D_i$ to every other feature vector. The distance is defined as

$d(D_i, D_j) = \sqrt{\sum_{k=1}^{128} (D_{ik} - D_{jk})^2},$

where $D_{ik}$ is the $k$-th component of vector $D_i$. This initial matching criterion was determined from large-scale statistics: because the feature vectors are high dimensional, the nearest neighbor can only be trusted as the match of the current feature vector when the gap between its nearest-neighbor and second-nearest-neighbor distances is large enough, which is decided with a ratio threshold; experiments show that 1.1 works well, i.e. the distance to the second-nearest neighbor must be at least 1.1 times the distance to the nearest neighbor. Fig. 2 gives an example of the initial matching result of 2 images: the top half of fig. 2 shows the 2 images to be matched and the bottom half shows the initial matching result; there are 2960 initial matching pairs in total, of which 300 are drawn. It can be seen that the local feature matching pairs obtained by the initial matching contain most of the correct matching pairs, but also many false matches. These false matches must be further rejected based on the common local geometric transformation of the matched regions.
Step 2, estimating the local affine transformation of each initially matched feature region and constructing the affine transformation space: if a matched feature region pair consists of circular regions $O_i$ and $O'_i$, their affine transformation matrix is obtained by combining the center coordinate displacement, the radius scaling and the gradient direction angle difference. If it consists of elliptical regions $E_i$ and $E'_i$, then before SIFT matching the two ellipses are transformed into circular regions $O_i$ and $O'_i$ of fixed radius, and the affine transformation $T_i$ between these circles is determined by the center displacement and the gradient direction angle difference. Denoting the mappings from $E_i$ to $O_i$ and from $E'_i$ to $O'_i$ by $\Gamma_i$ and $\Gamma'_i$ respectively, the affine transformation $X_i$ between $E_i$ and $E'_i$ can be defined as $X_i = \Gamma_i'^{-1} T_i \Gamma_i$. After the affine transformations of all initial matching pairs are obtained, they are aggregated to construct a geometric transformation space, i.e. the affine space.
Solving the local affine transformation matrix of an initial feature matching pair: given a pair of matched SIFT feature vectors whose corresponding feature regions are circles, denote them $O_i$ and $O'_i$, with geometric parameters center coordinates $p_i$ and $p'_i$, radii $r_i$ and $r'_i$, and gradient angles $\alpha_i$ and $\alpha'_i$. The mapping from a point $(x, y)^T$ of $O_i$ to the point $(x', y')^T$ of $O'_i$ can be expressed as

$(x', y', 1)^T = T_i (x, y, 1)^T, \qquad T_i = \begin{bmatrix} R_i & t_i \\ 0^T & 1 \end{bmatrix},$

where

$R_i = s_i \begin{bmatrix} \cos\theta_i & -\sin\theta_i \\ \sin\theta_i & \cos\theta_i \end{bmatrix}, \qquad \theta_i = \alpha'_i - \alpha_i, \qquad s_i = r'_i / r_i, \qquad t_i = p'_i - R_i p_i.$

$\theta_i$ is the rotation angle difference of the two circles and $s_i$ the ratio of their radii; $R_i$ is the matrix form of the combined scaling and rotation between the two circles, and $t_i$ is the mapped center offset.
If the corresponding feature regions are ellipses, denote any such pair by $E_i$ and $E'_i$, with center coordinates $p_i$ and $p'_i$, major and minor axis radii $(l_1^i, l_2^i)$ and $(l_1'^i, l_2'^i)$, and rotation angles $\omega_i$ and $\omega'_i$. Following step 1.2, before SIFT feature extraction $E_i$ is mapped onto the circular region $O$ of radius $r_c$ centered at $p_i$ by the mapping matrix $\Gamma_i$, and similarly the ellipse $E'_i$ is mapped onto the circular region $O'$ of radius $r_c$ centered at $p'_i$ by the mapping matrix $\Gamma'_i$.
Once $O$ and $O'$ are obtained, their gradient direction angles $\alpha_i$ and $\alpha'_i$ are estimated as in step 1.3. On this basis the mapping of $O$ to $O'$ can be expressed as

$(x', y', 1)^T = T_i (x, y, 1)^T, \qquad \theta_i = \alpha'_i - \alpha_i, \qquad t_i = p'_i - R_i p_i,$

where $\theta_i$ is the gradient direction difference of $O$ and $O'$ and $t_i$ the mapped center offset. Thus the affine mapping from $E_i$ to $E'_i$ can be expressed as

$X_i = \Gamma_i'^{-1} T_i \Gamma_i.$
As shown in FIG. 3, the mapping matrix $X = \Gamma'^{-1} T \Gamma$ of ellipse $E$ to $E'$ is represented by a dotted arrow. The actual solving process first uses the affine mappings $\Gamma$ and $\Gamma'$ to map $E$ and $E'$ into the corresponding circular regions $O$ and $O'$; the local image contents of $O$ and $O'$ are quantized into SIFT features and matched, and this matching relation establishes a region correspondence between the two that can be represented by an affine matrix $T$. Since the mappings from $E$ to $O$, from $O$ to $O'$ and from $E'$ to $O'$ are accomplished by the affine transformations $\Gamma$, $T$ and $\Gamma'$ respectively (solid arrows in fig. 3), and affine transformations are invertible, the mapping matrix of ellipse $E$ to $E'$ can be expressed as $X = \Gamma'^{-1} T \Gamma$.
For each matched feature region pair of the initial local feature matching set, the affine transformation $X_i$ between the local feature regions is computed as above, yielding the affine transformation matrix set of the initial matching set.
Step 3, defining the local affine transformation difference distance between any two local feature matching pairs: the Euclidean distance is not meaningful between two affine transformations, i.e. the distance of affine transformations cannot be computed by vector subtraction and taking the norm. Considering that similar affine transformations send the same two-dimensional coordinate point to nearby points while different affine transformations send it to distant points, the distance between affine transformations is measured through mapped point coordinates: given a pair of affine transformations $X_i$ and $X_j$, both are applied to several selected two-dimensional coordinate points, and the average distance of the mapped point pairs is taken as the distance between $X_i$ and $X_j$; the coordinate points are chosen as the center coordinates of the local feature regions associated with $X_i$ and $X_j$. The distance between the inverse transformations $X_i^{-1}$ and $X_j^{-1}$ is computed in the same way. Finally, considering the distance consistency of $X_i, X_j$ and $X_i^{-1}, X_j^{-1}$, the distance of $X_i$ and $X_j$ is defined by the mean of the two distance values. Furthermore, if $X_i$ maps ellipse $E_i$ to $E'_i$ and $X_j$ maps $E_j$ to $E'_j$, and $E_i$ and $E_j$ are the same ellipse while $E'_i$ and $E'_j$ differ, or $E'_i$ and $E'_j$ are the same while $E_i$ and $E_j$ differ, $X_i$ and $X_j$ are defined as conflicting transformations, and in view of the inherent difference of conflicting transformations, their distance is defined to be no less than a fixed constant. The specific method of this step is as follows:
Step 3.1, defining the distance between any two local affine transformations: given affine transformation matrices $X_i$ and $X_j$, their distance is defined as

$D(X_i, X_j) = \frac{1}{4}\left( D_{j|i} + D'_{j|i} + D_{i|j} + D'_{i|j} \right),$

with

$D_{j|i} = \left\| k(X_i \tilde p_j) - p'_j \right\|, \quad D'_{j|i} = \left\| k(X_i^{-1} \tilde p'_j) - p_j \right\|, \quad D_{i|j} = \left\| k(X_j \tilde p_i) - p'_i \right\|, \quad D'_{i|j} = \left\| k(X_j^{-1} \tilde p'_i) - p_i \right\|.$

As stated above, $p_i, p'_i$ and $p_j, p'_j$ are the center coordinates of the matched local regions corresponding to the affine matrices $X_i$ and $X_j$, $\tilde p$ denotes the homogeneous form of $p$, and $k(\cdot)$ converts the three-dimensional coordinate produced by an affine transformation into the corresponding two-dimensional coordinate. $D_{j|i}$ is the distance between $p'_j$ and the point obtained by mapping $p_j$ with $X_i$; since $p'_j$ is the coordinate of $p_j$ mapped by $X_j$, $D_{j|i}$ reflects the difference of the affine transformations $X_i$ and $X_j$, and the larger the difference, the larger its value. $D'_{j|i}$ is the distance between $p_j$ and the point obtained by mapping $p'_j$ with $X_i^{-1}$, reflecting the difference of the inverse transformations $X_i^{-1}$ and $X_j^{-1}$; since $X_i, X_i^{-1}$ and $X_j, X_j^{-1}$ are pairs of invertible transformations, the difference between $X_i$ and $X_j$ and that between $X_i^{-1}$ and $X_j^{-1}$ should be consistent. In the same way, $D_{i|j}$ is the distance between $p'_i$ and the point obtained by mapping $p_i$ with $X_j$, and $D'_{i|j}$ the distance between $p_i$ and the point obtained by mapping $p'_i$ with $X_j^{-1}$.
$D(X_i, X_j)$ is the average of the four, which guarantees the symmetry of the distance and the equality of distance between the forward transformation pair and the inverse transformation pair, i.e. $D(X_i, X_j) = D(X_j, X_i)$.
step 3.2, defining a conflict transformation penalty distance: assume affine matrix XiMapping ellipse EiTo E'i,XjMapping EjTo E'jIf E isiAnd EjAre of the same ellipse and E'iAnd E'jDifferent or E'iAnd E'jSame and EiAnd EjDifferent, at this time E'iAnd E'jUsually only one is an ellipse EiCorrect correspondence of (A), i.e. XiAnd XjUsually only one at most is the correct transformation, called X in this caseiAnd XjIs a conflicting transformation. Conflicting transforms are multiple transforms that are inherently different (usually at most one is the correct transform and the others are the wrong transforms) and therefore should be relatively distant from each other, but when the distance is calculated as in step 3.1, due to EiAnd EjSame, resulting in D'j|i=D′i|j0, and then D (X)i,Xj) The value of (c) is small. To highlight XiAnd XjDefining the distance between the two is not less than the constant C, i.e.:
if XiAnd XjIn conflict with each other, the system can be used,
where max (x, y) denotes selecting a larger value from x and y. C is a constant set to a larger value, set to 250 according to the present invention. Similarly, when E'iAnd E'jSame and EiAnd EjWhen different, its mapping matrix XiAnd XjAlso a conflicting transformation.
The distance between every two transformations of the affine transformation set corresponding to the initial local feature matching set is computed as above, yielding the distance values between all affine transformations.
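Putting step 3 together over the whole initial matching set, a driver like the following (building on the `affine_distance` sketch after step 3.2; all names are illustrative) yields the full distance matrix, with conflicts detected from shared feature regions:

```python
import numpy as np
# builds on affine_distance() from the sketch after step 3.2

def distance_matrix(X, pairs, centers1, centers2, C=250.0):
    """Pairwise D for all transforms of the initial matching set.
    X: list of 3x3 affine matrices; pairs[i] = (a, b) gives the indices
    of the source/target feature regions matched by X[i]; centers1 and
    centers2 hold the region center coordinates. Matches sharing one
    side but not the other are conflicting and pushed at least C apart."""
    N = len(X)
    D = np.zeros((N, N))
    for i in range(N):
        ai, bi = pairs[i]
        for j in range(i + 1, N):
            aj, bj = pairs[j]
            conflict = (ai == aj) != (bi == bj)   # one side shared, other not
            d = affine_distance(X[i], X[j],
                                centers1[ai], centers2[bi],
                                centers1[aj], centers2[bj],
                                conflict=conflict, C=C)
            D[i, j] = D[j, i] = d
    return D
```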
Fig. 5 shows the result of performing initial feature matching on the image pair of fig. 2, solving the corresponding affine transformations, computing the distances between the different transformations, and then mapping all transformations onto a two-dimensional plane according to those distances; each data point represents one transformation. Besides the noise points randomly scattered over the space, the figure clearly shows 5 clusters (elliptical marks) of high distribution density. These 5 clusters essentially correspond to 5 groups of correct local feature matching pairs in the two images, each group corresponding to a pair of matched objects, while the scattered data points are wrong transformations corresponding to wrong initial matches. The goal of the subsequent processing is therefore density-based clustering: find the affine transformation points that gather in the space and, from them, the correct matching pairs.
Step 4, defining an affine transformation space density function: the density of data points in the space is positively correlated with the number of points near the data points, and the density is higher as the number is larger. On the other hand, the density function is local, i.e. the density of each data point is only related to the distribution of data points within a limited distance. Combining the two factors, the density of the current point is defined by the sum of the Gaussian distances between the current point and the nearby points.
The affine transformation space density function based on the affine transformation distance is defined as

$\rho_i = \sum_{j \ne i}^{N} \exp\!\left( - \frac{D(X_i, X_j)^2}{\sigma^2} \right),$

i.e. the density $\rho_i$ of the current affine transformation $X_i$ sums the Gaussian distances from $X_i$ to the remaining transformations $X_j$: a transformation matrix $X_j$ at close range contributes much to the density, and the contribution decays until negligible once the distance exceeds $\sigma$; $\rho_i$ is thus the local density value of $X_i$. $N$ is the total number of affine matrices, and $\sigma$ is set dynamically to the value at the 2% position of all $D(X_i, X_j)$ sorted from small to large. For the image pair matching of fig. 2, $\sigma$ is about 43.
Step 5, to find the affine matrix data points of high density and close mutual distance in the affine space, a density-based clustering method is adopted for cluster localization. The method starts from each point of the space and moves repeatedly to a nearby data point of higher density until the density extreme point of the local region, i.e. the modal point of the current cluster, is reached; all data points whose paths end at the same modal point are classified into the same cluster. Then the boundaries between different clusters and the boundary density of each cluster are determined, and each cluster is refined based on its boundary density.
Step 5.1, defining the clustering path: for a given affine matrix $X_i$, starting from $X_i$, locate the affine matrix $X_j$ whose density $\rho_j$ is greater than $\rho_i$ and which is nearest to $X_i$; $X_j$ is defined as the "cluster parent node" of $X_i$. Starting from $X_j$, locate its own cluster parent node, and repeat until the "modal point" of maximum density in the current cluster is reached.
Step 5.2, definition of modal points: the modal points of the space are the points of highest density within each cluster. To locate them, a following distance function $\delta_i$ is first defined for each point of the space:

$\delta_i = \min_{j:\, \rho_j > \rho_i} D(X_i, X_j),$

i.e. for a general affine matrix $X_i$ in the space, the following distance $\delta_i$ is its distance to the nearest affine matrix $X_j$ of greater density. Thus, when $X_i$ and $X_j$ lie in the same cluster, $\delta_i$ usually decreases as the density $\rho_i$ increases; only when $X_i$ is the density maximum of the current cluster, i.e. its modal point, does $X_j$ lie in another cluster, and $\delta_i$ then shows a jump increase. Finally, when the density value $\rho_i$ of $X_i$ is the maximum over the whole space, $\delta_i$ is defined as the maximum distance from $X_i$ to the remaining affine matrices:

$\delta_i = \max_j D(X_i, X_j).$
Fig. 6 shows the values of the density function ρ and the corresponding following distance δ of each affine transformation associated with the initial matching pairs of fig. 2. The general trend of the following distance δ is to decrease as the density ρ increases. Moreover, the following distance δ of most data points is small, and only a few points have a large δ; this agrees with the earlier observation that within each cluster the following distance of every data point except the modal point is the distance to some denser point inside the current cluster, while only the modal point's following distance is the distance to a point of another cluster and is therefore usually large. In view of this, the density ρ and the following distance δ can jointly determine the modal point of each cluster in the space. In fig. 6, which plots the density and following distance of each point of fig. 5, the modal points of the 5 potential clusters are represented by the upward triangle, pentagram, diamond, downward triangle and square icons; they are clearly distant from the data points of the non-modal points, marked by circle icons.
On the basis of the definition of $\delta_i$, define the function

$\eta_i = \rho_i \delta_i.$

For an affine matrix $X_i$ in the space, $\eta_i$ is proportional to both its density value $\rho_i$ and $\delta_i$: $\eta_i$ is large only when $\rho_i$ and $\delta_i$ are both large. Within one cluster, the density $\rho_i$ increases gradually from the edge to the center and reaches its maximum at the modal point, while the following distance $\delta_i$ decreases gradually from edge to center but jumps upward at the modal point. Hence $\rho_i$ and $\delta_i$ are both large, i.e. $\eta_i$ is large, if and only if $X_i$ is a modal point of the space. The modal points can therefore be determined as follows: compute $\eta_i$ for all points of the space and sort in descending order; the top $K$ points are the modal points. Since each cluster has exactly one modal point, $K$ is the number of clusters in the space; its value can be specified in advance, or determined dynamically by filtering out the clusters containing too few affine matrix points once the number of affine matrices per cluster is known. The 5 data points marked in fig. 6 by the non-circular icons (triangles, pentagram, etc.) are the 5 points of largest η in the data space of the initial matching pairs of fig. 2, and they are exactly the modal points of the 5 clusters of fig. 5.
Step 5.3, preliminary clustering: starting from each affine matrix $X_i$ of the space, move repeatedly toward a modal point along the clustering path of step 5.1, and finally classify all affine matrices converging to the same modal point into the same cluster.
Step 5.4, determining the cluster boundaries and the maximum boundary density: assume that affine matrix $X_i$ belongs to cluster $C_k$ after the clustering of step 5.3, and that its distance to an affine matrix $X_j$ of cluster $C_l$ ($l \ne k$) is smaller than the Gaussian variance $\sigma$ used when defining the density function $\rho_i$, i.e.

$D(X_i, X_j) < \sigma.$

Then $X_i$ and $X_j$ are considered to form a boundary of clusters $C_k$ and $C_l$, with boundary density defined as

$\rho_{ij} = (\rho_i + \rho_j) / 2.$

The maximum boundary density of cluster $C_k$ is then determined as

$\bar\rho_k = \max_{ij} \rho_{ij},$

i.e. $\bar\rho_k$ is the maximum of the boundary densities between $C_k$ and all other clusters.
Step 5.5, refining the clusters: retain in cluster $C_k$ all affine matrices $X_i$ whose density is greater than $\bar\rho_k$, i.e. $\rho_i > \bar\rho_k$, and eliminate the remaining data points as noise. Fig. 7 shows the clustering result in the transformation space of the initial matching set of fig. 2, mapped onto a 2-dimensional plane: data points of the same cluster are marked by icons of the same shape and shown enlarged, while the scattered noise points are marked by dots. It can be seen that, starting from the 5 modal points selected in fig. 6, the clustering method of the invention successfully locates the points of each kind around its modal point and finds the 5 clusters of higher density in the space. Note the correspondence between the cluster icons of fig. 7 and the selected modal points of fig. 6.
And 6, presenting the result: determine the correct feature matching pairs from the correspondence between the affine matrices in each cluster and the local feature matching pairs, group the matching pairs according to the clustering result of the affine matrices, and identify one pair of matched objects per group. Concretely, based on the affine matrices in each cluster obtained in step 5.5, determine the local feature matching pair corresponding to each affine matrix and group the matching pairs by their associated cluster; matching pairs belonging to the same class correspond to the same matched target object, and the matching pairs located on different target objects are presented separately. Fig. 8 shows the final matching result for Fig. 2; the matching pairs identified by lines correspond to the data points represented by icons on the two-dimensional plane of Fig. 7. As can be seen, affine transformation data points that belong to the same cluster and carry the same icon in Fig. 7 correspond to the local feature matches of the same matched object in Fig. 8, so different matched target objects can be distinguished automatically and effectively from the clustering result, which verifies the reliability of the multi-target matching algorithm based on geometric transformation clustering.
Examples
The experimental hardware environment of this example is: Intel(R) Core(TM) i5 2.67 GHz, 4 GB memory, Microsoft Windows 7 Ultimate Edition; the programming environment is Visual Studio 2012 and MATLAB 8.1 (R2013a), 32-bit. The test images come from the multi-target object matching standard image set published on the Seoul National University (SNU) website.
All 6 groups of images to be matched in the SNU image set are selected as the experimental test cases. The image resolution is 800 × 600, each group contains 3 to 6 target objects to be matched, and each pair of objects to be matched exhibits varying degrees of change in illumination, viewing angle and the objects themselves, which pose further challenges to image matching.
The experiment compares the method against several existing image matching methods: the hierarchical clustering image matching method ACC (Agglomerative Correspondence Clustering), the spatial coherence image matching method SCC (Spatial Coherence Clustering), spectral matching SM (Spectral Matching), and the reweighted random walk image matching method RRWM (Reweighted Random Walks for Graph Matching). In the experiments, the DoG operator is used for feature extraction, the SIFT operator for local feature description, and the screening distance threshold of the initial matching pairs is set to 1.1. All methods start from the same initial matching set; the objective is to filter out the mismatched pairs, retain the correct matching pairs, and group them by the paired target object to which they belong. In the present method, the minimum distance between conflicting transformations is set to 250, and the cluster number K is specified according to the number of target objects to be matched in the image pair. For all fixed parameters of the compared methods, the best-performing parameter combinations are selected on the basis of matching performance tests.
We present the performance of each method using performance curves of matching Recall versus 1−Precision, where

Recall = (number of correct matching pairs retained) / (total number of correct matching pairs in the initial matching set),

1 − Precision = (number of mismatched pairs retained) / (total number of matching pairs retained).
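A small sketch of how one point of such a curve could be computed, assuming matches are represented as (i, j) index pairs (illustrative code, not the evaluation tool used in the experiments):

```python
def curve_point(retained, ground_truth):
    """Sketch: one (1-Precision, Recall) point for a set of retained
    matching pairs, measured against the ground-truth correct pairs."""
    retained, ground_truth = set(retained), set(ground_truth)
    correct = len(retained & ground_truth)
    recall = correct / len(ground_truth) if ground_truth else 0.0
    one_minus_precision = (len(retained) - correct) / len(retained) if retained else 0.0
    return one_minus_precision, recall
```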
the comparative results are shown in FIG. 9. Each subtitle in the figure corresponds to the name of the image group, the number of the denominator in the parentheses of the subtitle indicates the total number of the initial candidate matching pairs, and the number of the numerator indicates the correct matching logarithm therein. All comparison methods require a control parameter to adjust the matching performance, so that a corresponding performance curve is drawn by setting different parameter values. The method does not need such adjustment parameters, and draws a corresponding performance curve by displaying the recall rate and the precision of the matching pairs in the first k clustering clusters. As can be seen from the figure, the SM and the RRWM are relatively poor in effect because both of them are dedicated to adopt a global distinguishing strategy to separate correct matching pairs from incorrect matching pairs, which is relatively effective for image pairs containing only one target object to be matched, but when processing multi-target object matching, the local feature matching of a single target object is equivalent to matching noise for other target objects to be matched, so the global feature matching method cannot effectively process the situation of multi-target object matching. In contrast, better performance was achieved for ACC, SCC and the method of the invention. The method has relatively best performance, which shows that the method based on the local affine transformation space density clustering of the feature matching can more effectively filter wrong feature matching. In addition, the associated matching feature pairs of different matching target objects are well-defined in the clustering result, so that the local features belonging to different matching targets are simple and convenient to distinguish. Fig. 10 shows a display of the matching results of two images.
The method is also efficient: its time complexity is linear in the number of initial matching pairs. The average running time over the 6 tested image groups is within 1 second, of which the core clustering algorithm takes less than 0.1 second on average, so the method delivers more robust multi-target object matching performance in less time.

Claims (5)

1. A multi-target object image matching method is characterized in that: processing a pair of images to be matched by a computer as follows in sequence:
step 1, local feature extraction and pre-matching: extracting a local characteristic region of an image pair to be matched and characteristic description thereof, and establishing an initial matching set of local characteristics among the images based on the described similarity;
step 2, estimating local affine transformation of the initial matching area: estimating local affine transformation of an initial matching feature region of the image pair to be matched on the basis of local feature extraction and pre-matching in the step 1;
step 3, calculating affine transformation distance: on the basis of local feature extraction and pre-matching in the step 1, quantizing the local affine transformation distance between any two pairs of local feature matching pairs in the image pair to be matched;
step 4, defining an affine space density function: on the basis of local feature extraction and pre-matching in the step 1, defining an affine transformation space density function of an image pair to be matched based on a local affine transformation distance;
and 5, affine transformation spatial clustering based on density: performing affine transformation space clustering based on density on the image pair to be matched according to the local affine transformation obtained in the step 2, the local affine transformation distance obtained in the step 3 and the affine transformation space density function obtained in the step 4 to obtain a clustering cluster and a corresponding affine matrix;
and 6, determining the correct matching pairs and presenting the result: according to the correspondence between the affine matrices in the clusters obtained in step 5 and the local feature matching pairs of step 1, determine the final local feature matching pairs and group them by cluster, where the feature matching pairs within the same group correspond to the same target object; finally present the local feature matching pairs located on the different target objects;
wherein, the step 1 is as follows:
step 1.1, extracting local characteristic regions of the image: extracting a local characteristic region of an image pair to be matched by adopting an image local characteristic region detection operator; the local feature region has scale invariance; obtaining the geometric shape of the local characteristic region as a circle or an ellipse through an image local characteristic region detection operator;
step 1.2, extracting feature vectors:

when the local feature region is circular, a SIFT (Scale-Invariant Feature Transform) based feature description operator is used to extract the local feature vector of the region, as follows: assume the single local feature region extracted in step 1.1 is a circular region $O_i$ with centre coordinate $p_i$ and radius $r_i$; first compute the gradient magnitude $m(x, y)$ and gradient direction $\theta(x, y)$ of each pixel in the local feature region, and from them estimate the dominant gradient direction $\alpha_i$ of the region; the specific method is:

$m(x, y) = \sqrt{(L(x+1, y) - L(x-1, y))^2 + (L(x, y+1) - L(x, y-1))^2}$

$\theta(x, y) = \arctan\dfrac{L(x, y+1) - L(x, y-1)}{L(x+1, y) - L(x-1, y)}$

where $L(x, y)$ denotes the pixel value at coordinate $(x, y)$ in the local feature region; after obtaining the gradient magnitude and direction of every pixel in the region, divide the angle interval [0, 360) into 36 equal parts, add each gradient magnitude $m(x, y)$ to the angle interval indicated by its gradient angle $\theta(x, y)$ to build a gradient histogram, and select the median gradient direction of the most populated interval of the histogram as the dominant gradient direction $\alpha_i$; then rotate the gradient directions of all pixels in the region counter-clockwise by $\alpha_i$, so that the dominant gradient direction $\alpha_i$ of the local feature region becomes 0; finally, accumulate the gradient histograms of the pixels over sub-regions of the local feature region and perform histogram normalization, obtaining a 128-dimensional gradient histogram vector, i.e. the feature vector $D_i$ of the local feature region;

when the local feature region is elliptical: if the output of step 1.1 is an elliptical region, denote the elliptical local feature region by $E_i$, with centre coordinate $p_i$, major and minor axis radii $r_i^a$ and $r_i^b$, and rotation angle $w_i$; then $E_i$ is mapped by a two-dimensional spatial transformation to a circular region $O$ centred at $p_i$ with radius $r_c$:

$(x', y')^T = \hat{\Gamma}_i (x, y)^T$

where $(x, y)^T$ is a point on $E_i$, $(x', y')^T$ is the corresponding point on the circular region $O$, and $\hat{\Gamma}_i$ is the two-dimensional matrix combining the major-axis, minor-axis and rotation-angle transformations that turn the ellipse into a circle; $r_c$ takes the value 13; after the circular region is obtained, the SIFT feature description operator is applied to generate the feature vector, with the same procedure as described above;

step 1.3, initial matching of local features: let the distance between the feature description vector $D_i$ of a single local region in one image and the feature description vector $D_j$ of a single local region in the other image be $D(D_i, D_j)$; $D_i$ and $D_j$ match if and only if $D(D_i, D_j)$ multiplied by a given threshold is not greater than the distance from $D_i$ to every other feature vector; the threshold takes the value 1.1; the distance formula is defined as:

$D(D_i, D_j) = \sqrt{\sum_{k=1}^{128} (D_{ik} - D_{jk})^2}$

where $D_{ik}$ and $D_{jk}$ denote the k-th dimensional components of the feature vectors $D_i$ and $D_j$ respectively;
given an image pair to be matched, for the feature vector of each local feature region extracted from one image, select its initial match among the feature vectors of the local feature regions of the other image one by one; all initial matching pairs, each corresponding to a pair of local feature regions in the two images, are then collected to form the initial matching set of local features of the image pair to be matched;
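For illustration, a minimal NumPy sketch of the ratio-test matching of step 1.3, assuming descriptor arrays of shape (n, 128); the helper name is illustrative:

```python
import numpy as np

def initial_matches(desc_a, desc_b, ratio=1.1):
    """Sketch of step 1.3: D_i matches D_j iff ratio * d(D_i, D_j) is no
    greater than the distance from D_i to every other descriptor."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)      # Euclidean distances
        j = int(np.argmin(dists))                       # nearest neighbour
        second = np.partition(dists, 1)[1] if dists.size > 1 else np.inf
        if ratio * dists[j] <= second:                  # threshold 1.1
            matches.append((i, j))
    return matches
```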
the specific steps of the step 3 are as follows:
step 3.1, defining the local affine transformation distance between any two pairs of local feature matching pairs: given affine matrices $X_i$ and $X_j$, the distance between them is defined as:

$D(X_i, X_j) = \dfrac{1}{4}\left(D_{j|i} + D'_{j|i} + D_{i|j} + D'_{i|j}\right)$

where

$D_{j|i} = \| f(X_i \tilde{p}_j) - p'_j \|$, $D'_{j|i} = \| f(X_i^{-1} \tilde{p}'_j) - p_j \|$,

$D_{i|j} = \| f(X_j \tilde{p}_i) - p'_i \|$, $D'_{i|j} = \| f(X_j^{-1} \tilde{p}'_i) - p_i \|$

and, as described in step 2.1, $p_i, p'_i$ and $p_j, p'_j$ are the centre coordinates of the matched local region pairs corresponding to the affine matrices $X_i$ and $X_j$, $\tilde{p}$ denotes the homogeneous coordinate of $p$, and $f$ converts the three-dimensional coordinate $(x, y, w)$ obtained after the affine transformation into the two-dimensional coordinate $(x/w, y/w)$, $w$ being the third-dimension coordinate value of the point; $D_{j|i}$ is the distance between $p'_j$ and the coordinate point obtained by mapping $p_j$ with $X_i$; since $p'_j$ is the coordinate of $p_j$ mapped by $X_j$, $D_{j|i}$ reflects the difference between the affine matrices $X_i$ and $X_j$, and the larger the difference, the larger its value; $D'_{j|i}$ is the distance between $p_j$ and the coordinate point obtained by mapping $p'_j$ with $X_i^{-1}$, and reflects the difference between $X_i^{-1}$ and $X_j^{-1}$; because $X_i, X_i^{-1}$ and $X_j, X_j^{-1}$ are pairs of invertible transformations, the difference between $X_i$ and $X_j$ and the difference between $X_i^{-1}$ and $X_j^{-1}$ should be consistent; likewise, $D_{i|j}$ is the distance between $p'_i$ and the point obtained by mapping $p_i$ with $X_j$, and $D'_{i|j}$ is the distance between $p_i$ and the point obtained by mapping $p'_i$ with $X_j^{-1}$;

$D(X_i, X_j)$ is the average of the four terms, which guarantees the symmetry of the distance and the equality of the distances between the forward and inverse transformation pairs, namely:

$D(X_i, X_j) = D(X_j, X_i) = D(X_i^{-1}, X_j^{-1})$
step 3.2, defining the conflict transformation penalty distance: assume the affine matrix $X_i$ maps ellipse $E_i$ to ellipse $E'_i$ and the affine matrix $X_j$ maps ellipse $E_j$ to ellipse $E'_j$; if $E_i$ and $E_j$ are the same ellipse while $E'_i$ and $E'_j$ differ, or $E'_i$ and $E'_j$ are the same ellipse while $E_i$ and $E_j$ differ, then $X_i$ and $X_j$ are called a conflicting transformation pair, and the distance between them is defined to be no less than the constant C, namely:

if $X_i$ and $X_j$ conflict, $D(X_i, X_j) = \max(D(X_i, X_j), C)$

where $\max(x, y)$ denotes selecting the larger of x and y; C is a constant and takes the value 250;
based on the above steps, the distance between every two transformations in the affine transformation set corresponding to the initial matching set of local features is calculated, yielding the distance values between all affine transformations;
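As a sketch of steps 3.1 and 3.2 under the reconstruction above (3x3 affine matrices, 2-D centre coordinates; whether two transformations conflict is assumed to be decided by the caller):

```python
import numpy as np

def affine_distance(Xi, Xj, pi, pi_, pj, pj_, conflict=False, C=250.0):
    """Sketch: symmetric four-term transfer distance D(X_i, X_j), with
    the conflict penalty max(D, C) of step 3.2 applied when requested."""
    f = lambda q: q[:2] / q[2]                    # (x, y, w) -> (x/w, y/w)
    h = lambda p: np.array([p[0], p[1], 1.0])     # homogeneous coordinate
    d1 = np.linalg.norm(f(Xi @ h(pj)) - pj_)                  # X_i p_j    vs p'_j
    d2 = np.linalg.norm(f(np.linalg.inv(Xi) @ h(pj_)) - pj)   # X_i^-1 p'_j vs p_j
    d3 = np.linalg.norm(f(Xj @ h(pi)) - pi_)                  # X_j p_i    vs p'_i
    d4 = np.linalg.norm(f(np.linalg.inv(Xj) @ h(pi_)) - pi)   # X_j^-1 p'_i vs p_i
    d = (d1 + d2 + d3 + d4) / 4.0
    return max(d, C) if conflict else d
```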
the specific steps of the step 5 are as follows:
step 5.1, defining the clustering path: to determine the clustering path of an affine matrix $X_i$, start from $X_i$ and locate in the space the affine matrix $X_j$ whose density $\rho_j$ is larger than the density $\rho_i$ of $X_i$ and which is nearest to $X_i$; define $X_j$ as the "cluster parent node" of $X_i$; then start from $X_j$ and locate its own "cluster parent node"; repeat this step until the "modal point" with the maximum density in the current cluster is located;
step 5.2, definition of modal points: the modal points of the space are the points of highest density in their respective clusters; to identify them, first define a following distance function $\delta_i$ for each point in the space as follows:

$\delta_i = \begin{cases} \min_{j:\, \rho_j > \rho_i} D(X_i, X_j), & \text{if some } X_j \text{ has } \rho_j > \rho_i \\ \max_j D(X_i, X_j), & \text{otherwise} \end{cases}$

i.e. for an affine matrix $X_i$ in the space, the following distance $\delta_i$ is its distance to the nearest affine matrix $X_j$ of higher density; thus, when $X_i$ and $X_j$ lie in the same cluster, $\delta_i$ usually decreases as the density function $\rho_i$ increases; only when $X_i$ is the density maximum of the current cluster, i.e. the modal point, does $X_j$ lie in another cluster, and $\delta_i$ then exhibits a jump increase; finally, when the density value $\rho_i$ of $X_i$ is the maximum in the whole space, $\delta_i$ is defined as the maximum distance from $X_i$ to the remaining affine matrices;
definition of δiOn the basis of (2), defining a function
ηi=ρiδi
An affine matrix X in a given spacei,ηiAnd its density value rhoiAnd deltaiAre all in direct proportion, i.e. piAnd deltaiAll are larger, ηiThe larger; and within the same cluster, the density ρiThe value of (a) gradually increases from the edge to the center and reaches a maximum at the mode point; to follow the distance deltaiIt gradually decreases from edge to center but jumps up at the mode point, so it can be seen that if and only if XiAt a modal point in space, ρiAnd deltaiAre all larger, i.e. ηiLarger and therefore modal points may be determined based on first calculating η all points in spaceiArranging in descending order, wherein the data points of K before the sorting are modal points in the space; considering that one cluster only has one modal point, K is the number of clusters in the space, and the value can be specified in advance, or after the number of affine matrixes contained in each cluster is determined, the cluster with less affine matrixes is automatically filtered, and the value of K is further dynamically determined;
step 5.3, primary clustering: starting from each affine matrix $X_i$ in the space, move continuously toward a modal point along the clustering path of step 5.1; all affine matrices that converge to the same modal point are finally assigned to the same cluster;
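Continuing the earlier sketch (reusing the rho, parent and modal-point indices computed there), step 5.3 reduces to following the parent chains, which can be done in one pass over the points in decreasing-density order; this sketch assumes the global density maximum is among the selected modal points, as in Fig. 6:

```python
import numpy as np

def assign_clusters(rho, parent, modal_idx):
    """Sketch of step 5.3: each point inherits the cluster label of its
    'cluster parent node'; chains of parents converge at modal points."""
    labels = np.full(len(rho), -1)
    labels[np.asarray(modal_idx)] = np.arange(len(modal_idx))  # one label each
    for i in np.argsort(-rho):              # denser points are labelled first
        if labels[i] == -1 and parent[i] != -1:
            labels[i] = labels[parent[i]]   # parent is denser, already labelled
    return labels
```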
step 5.4, determining cluster boundaries and the maximum boundary density: assume that the affine matrix $X_i$ belongs to cluster $C_k$ after the clustering of step 5.3, and that its distance to an affine matrix $X_j$ in a different cluster $C_l$ ($l \neq k$) is smaller than the Gaussian variance $\sigma$ used when defining the density function $\rho_i$, i.e.:

$D(X_i, X_j) < \sigma$

in this case $X_i$ and $X_j$ are recognized as forming part of the boundary between clusters $C_k$ and $C_l$; define the boundary density

$\rho_{ij} = (\rho_i + \rho_j)/2$

and thereby the maximum boundary density of cluster $C_k$:

$\bar{\rho}_k = \max_{l \neq k}\ \max_{X_i \in C_k,\ X_j \in C_l,\ D(X_i, X_j) < \sigma} \rho_{ij}$

that is, $\bar{\rho}_k$ is the maximum of the boundary densities between cluster $C_k$ and all other clusters;
step 5.5, optimizing the clustering: retain in cluster $C_k$ every affine matrix $X_i$ whose density $\rho_i$ is greater than $\bar{\rho}_k$, and eliminate the remaining affine matrices as noise points.
2. The multi-target object image matching method as claimed in claim 1, characterized in that, in step 1: when the detection operator is the difference-of-Gaussians operator DoG, the detected local feature region is a circular region; when the detection operator is the maximally stable extremal region operator MSER or the corner operator Harris-Laplace, the detected local feature region is an elliptical region.
3. The multi-target object image matching method as claimed in claim 1, characterized in that: the specific steps of step 2 are as follows:
step 2.1, solving the local affine matrix of an initial feature matching pair: given a pair of matched local feature region description vectors whose corresponding feature regions are circles, denote them $O_i$ and $O'_i$, with geometric parameters: centre coordinates $p_i$ and $p'_i$, radii $r_i$ and $r'_i$, and dominant gradient directions $\alpha_i$ and $\alpha'_i$; then the mapping from a point $(x, y)^T$ on $O_i$ to the point $(x', y')^T$ on $O'_i$ can be expressed as:

$(x', y', 1)^T = X_i (x, y, 1)^T$, with $X_i = \begin{pmatrix} R_i & t_i \\ 0^T & 1 \end{pmatrix}$

where:

$R_i = s_i \begin{pmatrix} \cos\theta_i & -\sin\theta_i \\ \sin\theta_i & \cos\theta_i \end{pmatrix}$, $s_i = r'_i / r_i$, $\theta_i = \alpha'_i - \alpha_i$, and $t_i = p'_i - R_i p_i$;

$\theta_i$ is the rotation angle difference of the two circles, $s_i$ denotes the ratio of the radii of the circles, $R_i$ is the matrix form of the combined scaling and rotation transformation between the two circles, and $t_i$ is the centre offset after mapping;
if the corresponding feature regions are ellipses, denote them $E_i$ and $E'_i$, with geometric parameters: centre coordinates $p_i$ and $p'_i$, major and minor axis radii $(r_i^a, r_i^b)$ and $(r_i^{a\prime}, r_i^{b\prime})$, and rotation angles $w_i$ and $w'_i$; according to step 1.2, before SIFT feature extraction $E_i$ is mapped to a circular region $O$ of radius $r_c$ centred at $p_i$, and the mapping matrix can be expressed as

$\Gamma_i = \begin{pmatrix} \hat{\Gamma}_i & p_i - \hat{\Gamma}_i p_i \\ 0^T & 1 \end{pmatrix}$

where $\hat{\Gamma}_i$ is the two-dimensional matrix form of the combined major-axis, minor-axis and rotation-angle transformation in the ellipse-to-circle conversion, and $\Gamma_i$ is its three-dimensional extended transformation matrix form, which realizes the transformation of the ellipse $E_i$ into the circle $O$;

in the same way, the ellipse $E'_i$ is mapped to a circular region $O'$ of radius $r_c$ centred at $p'_i$, with mapping matrix $\Gamma'_i$, where $\hat{\Gamma}'_i$ and $\Gamma'_i$ have meanings corresponding to those of $\hat{\Gamma}_i$ and $\Gamma_i$; after O and O' are obtained, their dominant gradient directions $\alpha_i$ and $\alpha'_i$ can be estimated according to step 1.2;
on this basis, the mapping from O to O' can be expressed as:

$(x', y', 1)^T = T_i (x, y, 1)^T$, with $T_i = \begin{pmatrix} R_i & t_i \\ 0^T & 1 \end{pmatrix}$, $R_i = \begin{pmatrix} \cos\theta_i & -\sin\theta_i \\ \sin\theta_i & \cos\theta_i \end{pmatrix}$, $\theta_i = \alpha'_i - \alpha_i$, $t_i = p'_i - R_i p_i$

where $\theta_i$ is the difference between the dominant gradient directions of O and O', and $t_i$ is the centre offset after mapping (both circles have the same radius $r_c$, so no scaling is involved); thus the affine mapping from $E_i$ to $E'_i$ can be expressed as:

$X_i = (\Gamma'_i)^{-1} T_i \Gamma_i$

where $\Gamma_i$ defines the transformation of the ellipse $E_i$ into the circle O, $T_i$ defines the transition from the circle O to the circle O', and $(\Gamma'_i)^{-1}$, the inverse of $\Gamma'_i$, defines the transformation from the circle O' to the ellipse $E'_i$; the affine matrix $X_i$ therefore defines the overall affine transformation from $E_i$ to $E'_i$;
for each pair of matched feature regions in the initial matching set of local features, the affine matrix $X_i$ between the pair of local feature regions is calculated according to the above method, yielding the affine matrix set of the initial matching set of local features.
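For the circular case of step 2.1, a short sketch of assembling $X_i$ from the region parameters (angles in radians; names illustrative); the elliptical case then composes $X_i = (\Gamma'_i)^{-1} T_i \Gamma_i$ with matrices of the same 3x3 form:

```python
import numpy as np

def local_affine_circle(p, r, alpha, p_, r_, alpha_):
    """Sketch of step 2.1: 3x3 affine matrix X_i mapping circle O_i
    (centre p, radius r, dominant direction alpha) to circle O'_i."""
    s = r_ / r                               # scale ratio s_i
    th = alpha_ - alpha                      # rotation difference theta_i
    R = s * np.array([[np.cos(th), -np.sin(th)],
                      [np.sin(th),  np.cos(th)]])
    t = np.asarray(p_) - R @ np.asarray(p)   # t_i = p'_i - R_i p_i
    X = np.eye(3)
    X[:2, :2], X[:2, 2] = R, t
    return X
```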
4. The multi-target object image matching method as claimed in claim 1, characterized in that: the step 4 comprises the following steps:
defining the affine transformation space density function, based on the affine transformation distance, as:

$\rho_i = \sum_{j=1,\, j \neq i}^{N} \exp\!\left(-\left(\dfrac{D(X_i, X_j)}{\sigma}\right)^2\right)$

that is, the density $\rho_i$ of the current affine matrix $X_i$ accumulates the Gaussian distances from $X_i$ to the remaining affine matrices $X_j$; a closer affine matrix $X_j$ contributes more to the density, and once the distance exceeds $\sigma$ the contribution decays until it is negligible, so $\rho_i$ is defined as the local density value of $X_i$; N is the total number of affine matrices, and $\sigma$ is set dynamically as the value at the 2% position of all distances $D(X_i, X_j)$ sorted in ascending order;
an affine space is formed by the initial affine transformation set, and the density value of each affine matrix in the affine space is calculated.
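A sketch of this density computation with the dynamic σ (the 2% quantile of the sorted pairwise distances), under the reconstructed Gaussian form above; names are illustrative:

```python
import numpy as np

def affine_density(D, quantile=0.02):
    """Sketch of step 4: local density of every affine matrix; sigma is
    the value at the 2% position of all pairwise distances, ascending."""
    n = D.shape[0]
    off_diag = np.sort(D[~np.eye(n, dtype=bool)])
    sigma = max(off_diag[max(0, int(quantile * off_diag.size) - 1)], 1e-12)
    rho = np.exp(-(D / sigma) ** 2).sum(axis=1) - 1.0   # drop the self-term
    return rho, sigma
```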
5. The multi-target object image matching method as claimed in claim 1, characterized in that: step 6 comprises the following steps:
obtaining the affine matrices in each cluster on the basis of step 5.5, determining the local feature matching pair corresponding to each affine matrix, and grouping the local feature matching pairs according to their associated clusters, where the local feature matching pairs belonging to the same class correspond to the same matched target object; the local feature matching pairs located on different target objects are then presented.
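Finally, a sketch of the grouping of step 6, assuming matches[k] is the feature pair behind the k-th affine matrix and labels comes from the clustering sketches above (-1 marking noise):

```python
from collections import defaultdict

def group_matches(matches, labels):
    """Sketch of step 6: group retained matching pairs by cluster; each
    group corresponds to one matched target object."""
    groups = defaultdict(list)
    for k, pair in enumerate(matches):
        if labels[k] != -1:                      # skip eliminated noise
            groups[int(labels[k])].append(pair)  # same cluster = same object
    return dict(groups)
```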
CN201510712952.0A 2015-10-27 2015-10-27 A kind of multiple target object image matching method Expired - Fee Related CN105354578B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510712952.0A CN105354578B (en) 2015-10-27 2015-10-27 A kind of multiple target object image matching method


Publications (2)

Publication Number Publication Date
CN105354578A CN105354578A (en) 2016-02-24
CN105354578B true CN105354578B (en) 2019-05-21

Family

ID=55330545

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510712952.0A Expired - Fee Related CN105354578B (en) 2015-10-27 2015-10-27 A kind of multiple target object image matching method

Country Status (1)

Country Link
CN (1) CN105354578B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106529591A (en) * 2016-11-07 2017-03-22 湖南源信光电科技有限公司 Improved MSER image matching algorithm
CN106897719B (en) * 2017-01-06 2019-09-06 沈阳工业大学 Typical Components identification and localization method based on Kinect
CN107122746B (en) * 2017-04-28 2020-01-21 清华大学 Video analysis apparatus, method, and computer-readable storage medium
CN108960268A (en) * 2017-12-01 2018-12-07 炬大科技有限公司 image matching method and device
CN109120849B (en) * 2018-09-19 2020-09-18 安徽文香信息技术有限公司 Intelligent shooting method for micro-class tracking
CN109697692B (en) * 2018-12-29 2022-11-22 安徽大学 Feature matching method based on local structure similarity
CN109840529B (en) * 2019-02-01 2022-12-09 安徽大学 Image matching method based on local sensitivity confidence evaluation
CN110941989A (en) 2019-10-18 2020-03-31 北京达佳互联信息技术有限公司 Image verification method, image verification device, video verification method, video verification device, equipment and storage medium
CN111861876B (en) * 2020-08-17 2024-01-26 浙江港创智能机器人有限公司 Automatic hardware fitting identification method based on shape characteristics
CN112150508B (en) * 2020-09-29 2023-03-03 济南博观智能科技有限公司 Target tracking method, device and related equipment
CN114627125B (en) * 2022-05-17 2022-07-15 南通剑烽机械有限公司 Stainless steel tablet press surface quality evaluation method based on optical means


Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
MY188908A (en) * 2012-12-10 2022-01-13 Mimos Berhad Method for camera motion estimation with presence of moving object

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
CN104156952A (en) * 2014-07-31 2014-11-19 中国科学院自动化研究所 Deformation resisting image matching method
CN104766084A (en) * 2015-04-10 2015-07-08 南京大学 Nearly copied image detection method based on multi-target matching

Non-Patent Citations (1)

Title
Image Retrieval Based on Normalized Local Feature Combination; He Xingxing; China Master's Theses Full-text Database; 2010-07-15 (No. 7); main text pp. 26-31

Also Published As

Publication number Publication date
CN105354578A (en) 2016-02-24

Similar Documents

Publication Publication Date Title
CN105354578B (en) A kind of multiple target object image matching method
Li et al. SHREC’14 track: Extended large scale sketch-based 3D shape retrieval
CN109598241B (en) Satellite image marine ship identification method based on Faster R-CNN
EP2385483B1 (en) Recognition and pose determination of 3D objects in 3D scenes using geometric point pair descriptors and the generalized Hough Transform
CN111723721A (en) Three-dimensional target detection method, system and device based on RGB-D
CN111652292B (en) Similar object real-time detection method and system based on NCS and MS
Mousavi et al. A two-step descriptor-based keypoint filtering algorithm for robust image matching
CN110110694B (en) Visual SLAM closed-loop detection method based on target detection
CN112633382A (en) Mutual-neighbor-based few-sample image classification method and system
CN105740915B (en) A kind of collaboration dividing method merging perception information
WO2017181892A1 (en) Foreground segmentation method and device
Wang et al. Mode-seeking on hypergraphs for robust geometric model fitting
Pan et al. SMILE: Cost-sensitive multi-task learning for nuclear segmentation and classification with imbalanced annotations
Wu et al. Typical target detection in satellite images based on convolutional neural networks
CN105205135A (en) 3D (three-dimensional) model retrieving method based on topic model and retrieving device thereof
CN112364881A (en) Advanced sampling consistency image matching algorithm
CN115937552A (en) Image matching method based on fusion of manual features and depth features
CN110942473A (en) Moving target tracking detection method based on characteristic point gridding matching
Guo et al. Multi-view feature learning for VHR remote sensing image classification
CN111738319B (en) Clustering result evaluation method and device based on large-scale samples
CN107067037A (en) A kind of method that use LLC criterions position display foreground
Cui et al. Global propagation of affine invariant features for robust matching
CN110929801B (en) Improved Euclid distance KNN classification method and system
CN114332172A (en) Improved laser point cloud registration method based on covariance matrix
Wu et al. A vision-based indoor positioning method with high accuracy and efficiency based on self-optimized-ordered visual vocabulary

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190521

Termination date: 20191027