CN111461196A - Method and device for identifying and tracking fast robust image based on structural features - Google Patents

Method and device for identifying and tracking fast robust image based on structural features

Info

Publication number
CN111461196A
CN111461196A (application CN202010229998.8A; granted as CN111461196B)
Authority
CN
China
Prior art keywords
matching
image
graph
feature
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010229998.8A
Other languages
Chinese (zh)
Other versions
CN111461196B (en)
Inventor
安平
孙源航
尤志翔
高伟
王嶺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI MEDIA & ENTERTAINMENT TECHNOLOGY GROUP
University of Shanghai for Science and Technology
Original Assignee
SHANGHAI MEDIA & ENTERTAINMENT TECHNOLOGY GROUP
University of Shanghai for Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHANGHAI MEDIA & ENTERTAINMENT TECHNOLOGY GROUP and University of Shanghai for Science and Technology
Priority to CN202010229998.8A
Publication of CN111461196A
Application granted
Publication of CN111461196B
Active legal status
Anticipated expiration

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 — Arrangements for image or video recognition or understanding
    • G06V10/70 — Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 — Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 — Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751 — Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • Y — GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 — TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D — CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a fast, robust image recognition and tracking method and device based on structural features. The method uses the GMS feature matching algorithm to screen feature point matching pairs between a query image and a training image. If correct feature point matching pairs exist, the regions of interest in the query image and the training image are equally divided into small grids, and a key point is determined for each grid. The key points are modeled as graph nodes to construct a graph model, and the weight parameters of feature matching and graph matching are fused. Finally, approximate matching of the graphs is completed with a random-walk algorithm, accomplishing image recognition and tracking. The invention effectively accelerates matching and recognition and provides more accurate recognition and tracking performance when few feature point matching pairs are available.

Description

Method and device for identifying and tracking fast robust image based on structural features
Technical Field
The invention relates to the field of image matching in computer vision, and in particular to a fast, robust image recognition and tracking method and device based on structural features.
Background
The task of image matching is to find the correspondence between pixel points in two or more images of the same scene. It is an important and active problem in computer vision, information processing and related research fields, and the basis of many computer vision theories and applications. Image matching techniques fall mainly into grayscale-based methods and feature-based methods. Grayscale-based matching obtains good results in general but performs poorly in regions where grayscale information is scarce. Feature-based matching obtains feature points from the original image with certain feature extraction operators; these points represent the image better, and matching is completed through the mapping relations among them.
Mainstream image matching methods still have shortcomings. Grayscale-based methods place very high demands on the image source — the grayscale values of the same object shot by different cameras differ considerably — and they must compare a template cyclically across the test image, so their computational complexity is high and practical application is difficult. Feature-based methods are computationally light and also suit images with deformation, outliers and occlusion, but they need high-quality feature points, and mainstream feature matching considers only the relations among the feature points while ignoring the structural features of the image, which can produce wrong matches. In summary, how to balance the efficiency and accuracy of image matching has become an urgent problem.
Disclosure of Invention
To address the shortcomings of the prior art, the invention aims to provide a fast, robust image recognition and tracking method and device based on structural features, which uses the structural features among key points to assist matching and can provide more accurate recognition and tracking performance when few feature point matching pairs are available.
According to a first aspect of the present invention, there is provided a fast robust image recognition and tracking method based on structural features, comprising:
screening feature point matching pairs between the query image and the training image with the GMS feature matching algorithm;
if a correct feature point matching pair exists, equally dividing the regions of interest in the query image and the training image into small grids and determining a key point for each grid;
modeling the key points as graph nodes, generating edges, constructing the graph model, and fusing the weight parameters of feature matching and graph matching;
completing approximate matching of the graphs with a random-walk algorithm, accomplishing image recognition and tracking.
Optionally, the screening of feature point matching pairs between the query image and the training image with the GMS feature matching algorithm includes:
matching the feature points with the BF (brute-force) matching algorithm;
dividing the query image and the training image into G grids each, and computing, for each BF-matched feature point pair (F_p, F_q), the number of correct matches S_pq in its neighborhood, where F_p and F_q are the feature points in the query image and the training image retained as a matching pair by the BF matcher;
$$S_{pq}=\sum_{k=1}^{K}\left|\mathcal{X}_{p_k q_k}\right|,\qquad K=3\times 3=9$$
taking the grids containing F_p and F_q as centers, the 9 surrounding grids are selected as the region over which matches are counted, where |X_{p_k q_k}| is the number of matching pairs between the grid pair {p_k, q_k}, k = 1, …, 9;
the threshold separating true from false matches is
$$t_p=\eta\sqrt{n_i}$$
where η takes the empirical value 6 and n_i is the total number of features in the grid; the correct-match count S_pq is compared with the threshold t_p to decide whether the pair is a correct match:
$$\{p,q\}=\begin{cases}\text{true}, & S_{pq}>t_p\\ \text{false}, & \text{otherwise}\end{cases}$$
where p and q denote the feature point F_p in the query image and F_q in the training image, respectively.
Optionally, the matching of the feature points with the BF (brute-force) matching algorithm includes:
first, selecting a feature point in the query image;
then performing the BRIEF-descriptor Hamming distance test against the feature points in the training image in turn;
finally, returning the feature point with the smallest distance, forming the feature point matching set from the query image to the training image.
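The three steps above can be sketched in Python. This is an illustrative sketch, not the patented implementation: descriptors are modeled as BRIEF bit strings packed into plain integers, and `query_desc`/`train_desc` are hypothetical inputs.

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance between two binary descriptors packed as ints."""
    return bin(a ^ b).count("1")

def bf_match(query_desc, train_desc):
    """For every query descriptor return (query_idx, train_idx, distance)
    of the nearest training descriptor -- the brute-force nearest-neighbor test."""
    matches = []
    for qi, qd in enumerate(query_desc):
        ti, dist = min(
            ((i, hamming(qd, td)) for i, td in enumerate(train_desc)),
            key=lambda t: t[1],
        )
        matches.append((qi, ti, dist))
    return matches
```

Ties in distance resolve to the first training descriptor encountered, which is also what a naive linear scan does.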
Optionally, in the BRIEF-descriptor Hamming distance test, the BRIEF descriptor is obtained by:
extracting FAST feature points from each picture with the ORB algorithm;
taking an S×S neighborhood window centered on each FAST feature point, randomly selecting point pairs inside the window, performing binary tests, and computing the BRIEF descriptor.
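A toy version of this sampling-and-comparison step, assuming a grayscale image stored as a nested list and hypothetical window size `s` and pair count `n_pairs` (the real operator uses S×S windows and 256 pairs):

```python
import random

def brief_descriptor(img, cx, cy, s=7, n_pairs=32, seed=0):
    """Toy BRIEF sketch: compare n_pairs random pixel pairs inside the
    s x s window centered on (cx, cy); bit = 1 when the first pixel is
    darker than the second. Returns the descriptor packed into an int."""
    rng = random.Random(seed)   # fixed seed -> identical sampling pattern everywhere
    half = s // 2
    desc = 0
    for _ in range(n_pairs):
        x1, y1 = cx + rng.randint(-half, half), cy + rng.randint(-half, half)
        x2, y2 = cx + rng.randint(-half, half), cy + rng.randint(-half, half)
        bit = 1 if img[y1][x1] < img[y2][x2] else 0
        desc = (desc << 1) | bit
    return desc
```

Because the sampling pattern is fixed by the seed, descriptors from different key points are directly comparable with the Hamming distance above.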
Optionally, the equal division of the regions of interest in the query image and the training image into small grids, with a key point determined for each grid, includes:
(1) marking out a region of interest R in the query image and equally dividing R into N grids; for each grid, querying whether a correct matching pair retained by the screening exists: if so, the point whose BRIEF descriptor has the smallest Hamming distance within the matching pair is taken as the key point of the grid; if no correct feature point matching pair exists in the grid, the FAST feature point with the largest Harris response in the grid is taken as its key point;
(2) selecting 4 correct matching pairs retained by the screening from the query image and solving the perspective transformation τ between the query image and the training image:
$$\tau=\begin{bmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{bmatrix},\qquad a_{33}=1$$
$$u=\frac{a_{11}x+a_{12}y+a_{13}}{a_{31}x+a_{32}y+1},\qquad v=\frac{a_{21}x+a_{22}y+a_{23}}{a_{31}x+a_{32}y+1}$$
each of the 4 point pairs (x, y) → (u, v) yields 2 equations, so 8 equations determine the 8 unknowns of the perspective transformation matrix; through this matrix, the region in the training image corresponding to the region of interest R in the query image is obtained, and that region is divided into grids in the same way to select representative key points.
Optionally, the modeling of the key points as graph nodes, generation of edges, and construction of the graph model includes:
modeling the selected key points as the nodes of the graph and computing their descriptors as node attributes; using Delaunay triangulation, a triangular mesh is built from the key point coordinates to form the edges of the graph, which are invariant to translation, scaling and rotation. The feature point and structure information of the region of interest in the query image and of the corresponding region in the training image are thus converted into graph models G^α = (V^α, E^α) and G^β = (V^β, E^β), where G denotes a graph model, α the query image, and β the training image.
Optionally, the approximate matching of the graphs with a random-walk algorithm is performed by searching for the nodes with the largest weights using a PageRank-based graph matching algorithm to complete the matching.
According to a second aspect of the present invention, there is provided a fast robust image recognition and tracking apparatus based on structural features, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor is operable to execute the fast robust image recognition and tracking method based on structural features.
Compared with the prior art, the embodiment of the invention has at least one of the following beneficial effects:
(1) The invention uses the GMS algorithm for fast and highly robust preliminary image feature matching; the screened correct matching pairs are reused simultaneously for key point selection, for solving the mapping transformation matrix between the two images, and for optimizing the similarity matrix K, which streamlines the whole matching structure and speeds up matching.
(2) The invention combines a graph model and a graph matching mechanism with image matching, applies the relations and structural features between feature points to the matching, and folds the optimized selection of feature points into a single optimal-solution problem. Object recognition can be completed with fewer feature points, and accuracy is improved.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a flow chart of an image recognition and tracking method according to a preferred embodiment of the present invention;
fig. 2 is a schematic diagram of detecting a FAST corner according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an operator for extracting BRIEF features according to an embodiment of the present invention;
FIG. 4 is a diagram of a GMS matching model according to an embodiment of the present invention;
fig. 5 is a diagram illustrating the effect of the GMS matching algorithm according to an embodiment of the present invention;
FIG. 6 is a schematic view of a given planar object of interest according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of meshing according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a perspective transformation solution according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of Delaunay triangulation according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of a fusion of two views according to an embodiment of the present invention;
FIG. 11 is a diagram of the PageRank algorithm according to an embodiment of the invention.
Detailed Description
The following examples illustrate the invention in detail. The embodiments are implemented on the premise of the technical scheme of the invention, and detailed implementations and specific operation processes are given. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, and these fall within the scope of the present invention.
Fig. 1 is a flowchart of an image recognition and tracking method according to an embodiment of the present invention. Referring to fig. 1, a fast robust image recognition and tracking method based on structural features includes:
S11, screening feature point matching pairs between the query image and the training image with the GMS feature matching algorithm;
S12, if a correct feature point matching pair exists, equally dividing the regions of interest in the query image and the training image into small grids and determining a key point for each grid;
S13, modeling the key points as graph nodes, generating edges, constructing the graph model, and fusing the weight parameters of feature matching and graph matching;
S14, completing approximate matching of the graphs with a random-walk algorithm, accomplishing image recognition and tracking.
Preferably, the GMS (Grid-based Motion Statistics) feature matching algorithm screens correct feature point matching pairs through grid division and motion statistics. A given planar region of interest R is equally divided into N grids; if a correctly screened feature point matching pair falls in a grid, the point with the smallest matching distance is taken as the key point of that grid; if no correct feature point matching pair exists in the grid, the corner with the largest Harris response in the grid is taken as its key point. The selected key points are modeled as graph nodes, and the edges of the graph are built with Delaunay triangulation. The key point descriptors are then computed as node attributes, and feature matching is fused with the weight parameters of graph matching. Finally, a PageRank random-walk graph matching algorithm searches for the nodes with the largest weights to complete the matching.
The embodiment of the invention uses the structural characteristics among the key points to assist the matching, and can provide more accurate identification and tracking performance under the condition of less characteristic point matching pairs.
Fig. 2 is a schematic diagram of FAST corner detection according to an embodiment of the present invention; fig. 3 is a schematic diagram of the BRIEF feature extraction operator according to an embodiment of the present invention. Traditional methods mostly use the SIFT (Scale-Invariant Feature Transform) algorithm to extract features, searching for key points (feature points) across different scale spaces — salient points that do not change under illumination, affine transformation, noise and similar factors. Its drawbacks are that key point search is slow and that the feature point descriptor is a 128-dimensional vector, so feature matching is also time-consuming. The ORB (Oriented FAST and Rotated BRIEF) algorithm combines the FAST feature point detector with the BRIEF feature descriptor and improves and optimizes both: it has invariance to noise and to perspective transformation, and extracts and describes features at different image scales, alleviating the scale problem to some extent. ORB is about 100 times faster than SIFT, which guarantees real-time computation, makes it far more convenient in engineering practice, and allows it to replace SIFT well. Referring to fig. 2, in another preferred embodiment, the ORB algorithm is first used to extract FAST corners from each picture, and BRIEF (Binary Robust Independent Elementary Features) descriptors are computed for feature point matching.
In one embodiment, reference may be made to the following specific steps:
(1) extracting FAST corners from each frame image:
$$S_{p\to x}=\begin{cases}d, & I_{p\to x}\le I_p-t\\ s, & I_p-t<I_{p\to x}<I_p+t\\ b, & I_{p\to x}\ge I_p+t\end{cases}$$
In the formula above, t is a threshold (default value 10; different scenes use different values), I_p is the pixel value of the central pixel, and I_{p→x} is a pixel value on the surrounding circular template. Referring to fig. 2, when the circle pixel value I_{p→x} is smaller than the central value I_p by more than t, the pixel is labeled d (dark); the other two cases are b (bright) and s (similar). The circular area is thus divided into the three classes d, s and b. Only the numbers of d and b labels in the circular area are counted: if d or b occurs more than n times (n is generally set to 12), meaning the point is notably darker or brighter than its surroundings, the point is taken as a candidate corner. Assuming X feature points are to be extracted from the image, the threshold t is lowered so that the FAST detector finds more than X points; the Harris response value R is then computed at each point, and the X points with the largest responses are kept as the FAST feature points;
(2) referring to fig. 3, for each FAST feature point, an S×S neighborhood window is taken around it, point pairs (generally 256) are randomly selected inside the window, binary tests are performed, and the BRIEF descriptor is computed.
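Steps (1) and (2) hinge on the three-way d/s/b labeling; below is a minimal sketch of that labeling and of the candidate-corner count test described in step (1) (counting labels over the 16 circle samples, with `t` and `n` as in the text):

```python
def classify(center: int, pixel: int, t: int = 10) -> str:
    """Label a circle pixel relative to the center: d(ark), s(imilar), b(right)."""
    if pixel <= center - t:
        return "d"
    if pixel >= center + t:
        return "b"
    return "s"

def is_candidate_corner(center: int, circle: list, t: int = 10, n: int = 12) -> bool:
    """FAST candidate test as described above: the point qualifies when more
    than n of the 16 circle pixels are darker, or more than n are brighter."""
    labels = [classify(center, p, t) for p in circle]
    return labels.count("d") > n or labels.count("b") > n
```

A production detector would additionally require the n pixels to be contiguous and would apply the Harris-response ranking described in step (1); this sketch only shows the counting rule stated in the text.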
Fig. 4 is a diagram of a GMS matching model according to an embodiment of the present invention. Referring to fig. 4, in another preferred embodiment, the GMS feature matching algorithm is used to filter correct feature point matching pairs, which may be as follows:
(1) matching the feature points with the BF (Brute-Force) matching algorithm:
first, a feature point is selected from the query image; then the BRIEF-descriptor Hamming distance test is performed in turn against the feature points in the training image; finally, the feature point with the smallest distance is returned, forming the feature point matching set from the query image to the training image.
(2) Referring to fig. 4, each image is divided into G grids (generally G = 20×20), and for a BF-matched feature point pair (F_p, F_q) the number of correct matches S_pq in the neighborhood is computed:
$$S_{pq}=\sum_{k=1}^{9}\left|\mathcal{X}_{p_k q_k}\right|$$
where |X_{p_k q_k}| is the number of matches between the grid pair {p_k, q_k}.
The threshold separating true from false matches is
$$t_p=\eta\sqrt{n_p}$$
where η takes the empirical value 6 and n_p is the total number of features in the 3×3 grid neighborhood. The correct-match count S_pq is compared with the threshold t_p to decide whether the point is correctly matched:
$$\{p,q\}=\begin{cases}\text{true}, & S_{pq}>t_p\\ \text{false}, & \text{otherwise}\end{cases}$$
The effect of the GMS feature matching algorithm in screening correct matching pairs is shown in fig. 5: the connecting lines are the correct matching pairs after pruning; the matches are almost all correct, with only a few erroneous lines.
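The grid statistics of step (2) can be sketched as follows, under simplifying assumptions: each BF match is already reduced to its pair of grid cells, and `features_per_cell` (the per-cell feature count n_p) is supplied by the caller:

```python
import math
from collections import Counter

def gms_keep(matches, features_per_cell, eta=6.0):
    """GMS-style screening sketch: `matches` lists each BF match as a pair of
    grid cells ((px, py), (qx, qy)); a match is kept when its support S_pq,
    counted over the 3x3 cell neighborhood, exceeds t_p = eta * sqrt(n),
    with n the feature count of the query cell."""
    pair_count = Counter(matches)          # matches per cell pair {p_k, q_k}
    kept = []
    for (p, q) in matches:
        s_pq = sum(
            pair_count[((p[0] + dx, p[1] + dy), (q[0] + dx, q[1] + dy))]
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
        )
        t_p = eta * math.sqrt(features_per_cell.get(p, 1))
        if s_pq > t_p:
            kept.append((p, q))
    return kept
```

Matches that share a consistent cell-to-cell motion reinforce one another, while an isolated match has support 1 and falls below the threshold — the core GMS idea.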
FIG. 6 is a schematic view of a given planar object of interest according to an embodiment of the present invention; FIG. 7 is a schematic diagram of meshing according to an embodiment of the present invention; FIG. 8 is a schematic diagram of solving for perspective transformation according to an embodiment of the present invention.
In another preferred embodiment, the method described above — equally dividing a given planar region of interest R into N grids (N is often taken as 6×10 from experience) and selecting a representative key point for each grid — may proceed according to the following steps:
(1) referring to fig. 6 and fig. 7, a region of interest R is marked out in the query image and equally divided into N grids; for each grid, whether a correct matching pair retained by the screening exists is queried; if so, the point whose BRIEF descriptor has the smallest Hamming distance within the matching pair is taken as the key point of the grid. If no correct feature point matching pair exists in the grid, the FAST feature point with the largest Harris response in the grid is taken as its key point.
(2) Selecting 4 correct matching pairs retained by the screening in the query image (grid key points are preferred) and, as shown in fig. 8, solving the perspective transformation τ between the two images:
$$\tau=\begin{bmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{bmatrix},\qquad a_{33}=1$$
$$u=\frac{a_{11}x+a_{12}y+a_{13}}{a_{31}x+a_{32}y+1},\qquad v=\frac{a_{21}x+a_{22}y+a_{23}}{a_{31}x+a_{32}y+1}$$
The 4 point pairs yield 8 equations, from which the 8 unknowns — and hence the perspective transformation matrix — are solved. Through this matrix, the region in the training image corresponding to the region of interest R in the query image is obtained, and that region is divided into grids in the same way to select representative key points.
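The 8-equation solve of step (2) can be sketched with plain Gaussian elimination; `src` and `dst` are the four matched point pairs, and the equation layout follows the formulas above:

```python
def solve_perspective(src, dst):
    """Solve the 3x3 perspective matrix (with a33 = 1) from 4 point pairs
    by building the 8 linear equations described above and eliminating.
    src, dst: lists of four (x, y) tuples."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u*(a31*x + a32*y + 1) = a11*x + a12*y + a13, and similarly for v
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    n = 8
    M = [row + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):                   # Gauss-Jordan with partial pivoting
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    h = [M[i][8] / M[i][i] for i in range(n)] + [1.0]
    return [h[0:3], h[3:6], h[6:9]]
```

In practice one would use a least-squares homography over many pairs with outlier rejection; the minimal 4-point solve shown here is exactly the system the text describes.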
Fig. 9 is a schematic diagram of Delaunay triangulation according to an embodiment of the present invention. Referring to fig. 9, in another preferred embodiment, the key point feature descriptors and structural feature data determined in the grids are converted into the data structure of a graph, where each graph G = (V, E) is represented as a set of nodes V and edges E. The key points are modeled as graph nodes to construct the graph model: each selected key point becomes a node, and its descriptor is computed as the attribute of that node. Delaunay triangulation builds a triangular mesh from the key point coordinates to form the edges of the graph, which are invariant to translation, scaling and rotation.
FIG. 10 is a schematic diagram of a fusion of two images according to an embodiment of the present invention. Referring to fig. 10, in another preferred embodiment, the feature matching is fused with the weight parameter of the graph matching, which can be implemented as follows:
(1) Referring to fig. 10, the graph matching problem is reformulated, with its result represented by an assignment matrix X; since both graphs have N grid nodes, X is a {0,1} matrix of size N × N in which each row and each column contains exactly one 1.
$$J(X)=\sum_{i,a}c_{i,a}X_{i,a}+\sum_{i,j,a,b}d_{i,j,a,b}X_{i,a}X_{j,b}$$
$$x=\mathrm{vec}(X),\qquad x^{*}=\arg\max_{x}\;x^{T}Kx$$
$$\text{s.t.}\quad X\mathbf{1}_{n}\le\mathbf{1}_{m},\qquad X^{T}\mathbf{1}_{m}\le\mathbf{1}_{n}$$
where c_{i,a} represents the consistency of node i in graph G^α with node a in graph G^β; G^α and G^β are the graph models generated from the query image and the training image, composed of node sets V^α, V^β and edge sets E^α, E^β respectively; d_{i,j,a,b} represents the consistency of edge ij in G^α with edge ab in G^β; i, j denote nodes in G^α and a, b denote nodes in G^β; X_{i,a} = 1 if and only if node i ∈ V^α corresponds to node a ∈ V^β, and likewise for X_{j,b}; J(X) is the evaluation function giving the score under the assignment X — the higher the score, the more similar the two graphs; x is the vectorization of X, x^T its transpose, and x^* the vectorized optimal assignment expressing the best matching relation of the two graphs; K is the similarity matrix, containing both first-order node similarity information and second-order edge similarity information; 1_n is a column vector of n ones, and the constraints ensure that each node matches at most once, with m = n = N since both graphs consist of N nodes;
(2) Assignment of the similarity matrix K:
$$K_{ia;jb}=\begin{cases}c_{i,a}, & i=j\ \text{and}\ a=b\\ d_{i,j,a,b}, & A^{\alpha}_{ij}A^{\beta}_{ab}=1\\ 0, & \text{otherwise}\end{cases}$$
where A^α and A^β are the adjacency matrices of G^α and G^β. Feature points are merged with the geometric constraints: the node similarity c_{i,a} is computed from the BRIEF descriptors f_i^α and f_a^β of key points i and a; (x_i, y_i) and (x_j, y_j) denote the coordinate positions of nodes v_i and v_j in G^α, and (x_a, y_a) and (x_b, y_b) those of v_a and v_b in G^β; τ is the perspective transformation between the query image and the training image; d_{i,j,a,b}(τ) denotes the agreement of edge (i, j) with edge (a, b) under the perspective transformation τ, with ω chosen large enough to keep the edge similarity greater than 0. The diagonal elements of the similarity matrix K thus hold node-to-node similarity information, and the off-diagonal elements hold edge-to-edge similarity information.
(3) Candidate matches are filtered: key points retained by the GMS screening are preferred and sorted by matching distance, and at most n_c key points — the set C_t with the highest similarity — are kept; the rows and columns of key points not belonging to C_t are deleted, condensing the similarity matrix K to a size of n_c²N². An ideal value is n_c = 5; of course, other embodiments may use other values of n_c.
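The evaluation function x^T K x that scores an assignment can be computed directly; K below is a small hand-built similarity matrix over four candidate matches, purely for illustration:

```python
def match_score(K, x):
    """Evaluate x^T K x for a vectorized assignment x and similarity matrix K
    (plain nested lists). Diagonal entries contribute node similarity,
    off-diagonal entries edge similarity, as in the formulation above."""
    n = len(x)
    return sum(x[i] * K[i][j] * x[j] for i in range(n) for j in range(n))
```

With candidates ordered as (1,a), (1,b), (2,a), (2,b), a K whose diagonal favors (1,a) and (2,b) and whose off-diagonal rewards that edge pairing makes the correct assignment score higher than the swapped one: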
FIG. 11 is a diagram of the PageRank algorithm according to an embodiment of the invention; another preferred embodiment is described with reference to fig. 11. Existing graph matching methods divide mainly into exact and approximate graph matching. The exact methods are more complex and time-consuming than the approximate ones, and on real graphs they are not robust to outliers and to the various deformations that occur. Approximate graph matching, by contrast, has low complexity, high efficiency and wider applicability. Therefore, although the present problem is close to exact N-to-N node matching, converting it into approximate graph matching is more appropriate for efficiency and robustness. The PageRank algorithm is adopted to extend the graph matching problem, searching for the nodes with the largest weights to complete the matching. The principle is that the whole graph reaches a steady state through the random walk of a walker: each node of the association graph ends up with a similarity probability value — the probability that the candidate matching pair is a reliable match — and the larger this value, the more likely the corresponding matching pair is correct. Specifically, the following steps may be adopted:
(1) For the assigned similarity matrix K, the visiting probability of each point is initialized so that the walker starts jumping and random-walking under the matching constraints and finally converges to its quasi-stationary distribution. For each candidate match (i, a) ∈ C_t, the corresponding probability is initialized uniformly:
$$x_{ia}^{(0)}=\frac{1}{|C_t|}$$
(2) To solve the problem of termination points and traps, the transition probability is set to η and the random-jump probability to (1 − η); the random-walk iteration becomes x′ = ηKx + (1 − η)e, where e is the uniform random-jump probability vector. The iteration runs until convergence, and the nodes with the largest weights are selected to complete the matching.
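A minimal sketch of the iteration x′ = ηKx + (1 − η)e follows. The column normalization of K into a stochastic transition matrix is an assumption added here so that the sketch converges to a probability distribution:

```python
def random_walk_match(K, eta=0.8, tol=1e-10, max_iter=1000):
    """PageRank-style iteration x' = eta*P*x + (1-eta)*e over the similarity
    matrix K. K's columns are normalized into a stochastic matrix P (an added
    assumption), e is the uniform jump vector. Returns the converged
    probability of each candidate match."""
    n = len(K)
    col_sum = [sum(K[r][c] for r in range(n)) or 1.0 for c in range(n)]
    P = [[K[r][c] / col_sum[c] for c in range(n)] for r in range(n)]
    x = [1.0 / n] * n                      # uniform initialization
    e = [1.0 / n] * n
    for _ in range(max_iter):
        nx = [eta * sum(P[r][c] * x[c] for c in range(n)) + (1 - eta) * e[r]
              for r in range(n)]
        if max(abs(a - b) for a, b in zip(nx, x)) < tol:
            return nx
        x = nx
    return x
```

The (1 − η) uniform jump term is exactly what prevents the walk from being absorbed at termination points or trapped in closed subsets, as the step above explains.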
The above preferred features of the embodiments can be used alone in any embodiment, or in any combination thereof without conflict. In addition, portions which are not described in detail in the embodiments may be implemented by using the prior art.
Based on the foregoing embodiments, in another embodiment, the present invention further provides a device for fast and robust image recognition and tracking based on structural features, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor is operable to execute the method for fast and robust image recognition and tracking based on structural features according to any one of the foregoing embodiments when executing the program.
Optionally, a memory is provided for storing a program. The memory may comprise volatile memory, such as random-access memory (RAM), static random-access memory (SRAM), or double data rate synchronous dynamic random-access memory (DDR SDRAM); it may also comprise non-volatile memory, such as flash memory. The memory stores computer programs (e.g., applications or functional modules implementing the above methods), computer instructions, and data, which may be stored in one or more memories in a partitioned manner and invoked by the processor.
A processor executes the computer program stored in the memory to implement the steps of the method of the above embodiments; reference may be made to the description of the preceding method embodiments. The processor and the memory may be separate structures or integrated into one structure; when they are separate, they may be coupled by a bus.
Starting from the GMS (Grid-based Motion Statistics) fast search technique and adding a graph matching algorithm based on geometric constraints, the embodiments of the invention provide a fast and robust image recognition and tracking method and device that effectively accelerate the matching and recognition method and deliver more accurate recognition and tracking performance even when few feature point matching pairs are available.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention.

Claims (10)

1. A fast robust image identification tracking method based on structural features is characterized by comprising the following steps:
adopting GMS feature matching algorithm to carry out feature point matching pair screening on the query image and the training matching image;
if a correct feature point matching pair exists, equally dividing interest areas in the query image and the training matching image into small grids, and determining key points for each grid;
modeling key points as graph nodes, generating edges, constructing a graph model, and fusing feature matching and graph matching weight parameters;
completing approximate matching of the graph by a random walk algorithm, thereby completing image identification and tracking.
2. The method for fast and robustly recognizing and tracking the image based on the structural feature of claim 1, wherein the step of performing feature point matching pair screening on the query image and the training matching image by using a GMS feature matching algorithm comprises the steps of:
matching the feature points by a BF brute-force matching algorithm;
dividing the query image and the training matching image into G grids respectively, and for each feature point pair (F_p, F_q) matched by the BF algorithm, computing the number of correct matches S_pq in its neighborhood, where F_p and F_q are the feature points in the query image and the training matching image, respectively, retained as a matching pair by the BF brute-force matching algorithm;
S_pq = Σ_{k=1}^{K} |X_{p_k q_k}| − 1

taking the grids containing the feature points F_p and F_q as centers, K = 3 × 3 = 9 grids are selected on each side as the region over which the matching pairs are counted, where |X_{p_k q_k}| is the number of matching pairs between the grid pair {p_k, q_k}, k = 1, …, 9;
the threshold distinguishing true from false matches is

t_p = η · sqrt(n_p)

where η takes the empirical value 6 and n_p is the total number of features in the grid; the correct-match count S_pq is compared with the threshold t_p to decide whether the point is correctly matched:

the match (p, q) is judged correct if S_pq > t_p, and false otherwise;
wherein p and q denote the feature point F_p in the query image and the feature point F_q in the training matching image, respectively.
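The grid statistic of claim 2 can be sketched as follows; `gms_is_correct` and its inputs (a 3 × 3 table of per-cell BF match counts around the candidate match, with the center cell containing the candidate itself) are hypothetical names for illustration:

```python
import math

def gms_is_correct(neighborhood_counts, n_p, eta=6):
    """GMS decision for one candidate match (F_p, F_q).

    neighborhood_counts: 3x3 table of BF match counts for the 9 grid pairs
    centred on the cells containing F_p and F_q (the centre cell includes the
    candidate match itself, hence the "- 1").
    n_p: total number of features in the grid; threshold t_p = eta*sqrt(n_p).
    """
    S_pq = sum(sum(row) for row in neighborhood_counts) - 1
    t_p = eta * math.sqrt(n_p)
    return S_pq > t_p
```

A match surrounded by many consistent neighborhood matches passes; an isolated match does not.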
3. The method for fast and robust image recognition and tracking based on structural features as claimed in claim 1, wherein the matching of feature points by BF brute force matching algorithm comprises:
firstly, selecting a feature point in the query image;
then performing the BRIEF descriptor Hamming distance test against the feature points in the training matching image in sequence;
and finally returning the feature point with the smallest distance, forming the feature point matching set from the query image to the training matching image.
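A minimal sketch of the BF brute-force Hamming matching described above, under the assumption that each binary descriptor is packed into a Python integer (the function name is illustrative):

```python
def bf_match(query_descs, train_descs):
    """Brute-force matching: for every query descriptor (a binary descriptor
    packed into an int), return the index of the training descriptor with the
    smallest Hamming distance (popcount of the XOR)."""
    return [
        min(range(len(train_descs)),
            key=lambda j: bin(q ^ train_descs[j]).count("1"))
        for q in query_descs
    ]
```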
4. The method for fast and robust image recognition and tracking based on structural features of claim 3, wherein in the BRIEF descriptor Hamming distance test, the BRIEF descriptor is obtained by:
extracting FAST feature points from each picture by using the ORB algorithm;
taking each FAST feature point as a center, selecting an S × S neighborhood window, randomly selecting point pairs inside the window, performing binary assignment, and computing the BRIEF descriptor.
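A toy sketch of the BRIEF binary test above (random point pairs inside the S × S window, binary assignment by intensity comparison); the fixed seed stands in for the precomputed test pattern that real implementations share across key points, an assumption made for illustration:

```python
import random

def brief_descriptor(patch, n_bits=16, window=5, seed=0):
    """Toy BRIEF: each bit is the binary test I(p1) < I(p2) for a random
    point pair inside the window x window neighbourhood of the key point.
    patch: 2D list of intensities, at least window x window in size."""
    rng = random.Random(seed)  # fixed seed = same test pattern for all key points
    desc = 0
    for _ in range(n_bits):
        y1, x1 = rng.randrange(window), rng.randrange(window)
        y2, x2 = rng.randrange(window), rng.randrange(window)
        desc = (desc << 1) | (1 if patch[y1][x1] < patch[y2][x2] else 0)
    return desc
```

Two descriptors computed this way can then be compared with the Hamming distance test of claim 3.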
5. The method for fast and robust image recognition and tracking based on structural features as claimed in claim 1, wherein the step of equally dividing the region of interest in the query image and the training matching image into small grids and determining the key points for each grid comprises:
(1) delineating a region of interest R in the query image, equally dividing R into N grids, and querying whether each grid contains a correct matching pair retained by the screening; if so, taking the point of the matching pair with the smallest BRIEF descriptor Hamming distance as the key point of the grid; if no correct feature point matching pair exists in the grid, taking the FAST feature point with the largest Harris response in the grid as the key point of the grid;
(2) selecting 4 correct matching pairs retained by the screening from the query image, and solving the perspective transformation τ between the query image and the training matching image:

[u, v, w]^T = τ · [x, y, 1]^T, with

τ = [[a_11, a_12, a_13], [a_21, a_22, a_23], [a_31, a_32, 1]]

each point pair yields two equations, so the 4 points give 8 equations in the 8 unknowns, from which the perspective transformation matrix is solved; through this matrix, the region in the training matching image corresponding to the region of interest R of the query image is obtained, that region is divided into grids in the same way, and representative key points are selected.
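The 8-equation solve of step (2) can be illustrated as follows, assuming the usual normalization a_33 = 1; `solve_homography` is a hypothetical helper using plain Gaussian elimination with partial pivoting:

```python
def solve_homography(src, dst):
    """Solve the perspective transform from 4 point pairs (a33 fixed to 1).

    Each pair (x, y) -> (u, v) gives two linear equations in the 8 unknowns;
    the resulting 8x8 system is solved by Gaussian elimination.
    src, dst: lists of four (x, y) tuples. Returns a 3x3 matrix.
    """
    M = []  # augmented 8x9 system
    for (x, y), (u, v) in zip(src, dst):
        M.append([x, y, 1, 0, 0, 0, -u * x, -u * y, u])
        M.append([0, 0, 0, x, y, 1, -v * x, -v * y, v])
    n = 8
    for col in range(n):                      # forward elimination, partial pivoting
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    h = [0.0] * n
    for r in range(n - 1, -1, -1):            # back substitution
        h[r] = (M[r][n] - sum(M[r][c] * h[c] for c in range(r + 1, n))) / M[r][r]
    h.append(1.0)                             # a33 = 1
    return [h[0:3], h[3:6], h[6:9]]
```

The four point pairs must be in general position (no three collinear) for the system to be solvable.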
6. The method for fast and robust image recognition and tracking based on structural features as claimed in claim 1, wherein the key points are modeled as nodes of a graph, edges of the graph are generated, and the constructing of the graph model comprises:
modeling the selected key points as nodes of the graph, computing descriptors of the key points as node attributes, and constructing a triangular mesh from the key point coordinates by Delaunay triangulation to form the edges of the graph, which gives the edges invariance to translation, scaling, and rotation; the feature point information and structural information of the region of interest in the query image and of the corresponding region in the training matching image are thus converted into graph models G_α = (V_α, E_α) and G_β = (V_β, E_β), where G denotes a graph model, α denotes the query image, and β denotes the training matching image.
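Once a Delaunay triangulation has produced triangles over the key points (e.g. from any triangulation routine), turning their sides into graph edges is mechanical; a sketch with an illustrative function name:

```python
def adjacency_from_triangles(n_nodes, triangles):
    """Edges of the graph model: every side of every Delaunay triangle.
    triangles: iterable of (i, j, k) node-index triples."""
    A = [[0] * n_nodes for _ in range(n_nodes)]
    for i, j, k in triangles:
        for a, b in ((i, j), (j, k), (i, k)):
            A[a][b] = A[b][a] = 1   # undirected edge
    return A
```

The resulting adjacency matrices play the role of A^α and A^β when the similarity matrix K is assigned in claim 7.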
7. The method for fast and robust image recognition and tracking based on structural features as claimed in claim 6, wherein the fusing the feature matching with the weight parameters of the graph matching comprises:
(1) the query image and the training matching image are both divided into N grids, and the assignment matrix is an N × N {0,1} matrix in which each row and each column contains one and only one element equal to 1;
in the graph matching problem, the first-order similarity between nodes and the second-order similarity between edges of the graph structure are considered simultaneously, and the similarity matrix K contains both the first-order node similarity information and the second-order edge similarity information; the matching problem is converted into the following mathematical form:

f(X) = Σ_{i,a} c_{i,a} X_{i,a} + Σ_{i,j,a,b} d_{i,j,a,b} X_{i,a} X_{j,b}

x* = argmax_x x^T K x

s.t. X 1_n ≤ 1_m, X^T 1_m ≤ 1_n

wherein c_{i,a} represents the consistency of node i in graph G_α with node a in graph G_β; G_α and G_β denote the graph models generated from the query image and the training matching image, respectively; d_{i,j,a,b} represents the consistency of line segment ij in G_α with line segment ab in G_β; X is the correspondence matrix representing the matching result; i and j denote nodes in G_α, and a and b denote nodes in G_β; G_α and G_β consist of the point set V_α with edge set E_α and the point set V_β with edge set E_β, respectively; X_{i,a} indicates the correspondence between node i of G_α and node a of G_β, i.e., X_{i,a} = 1 if and only if node i ∈ V_α corresponds to node a ∈ V_β, and likewise for X_{j,b}; f(X) is the evaluation function giving, under the correspondence matrix X, the matching quality of graphs G_α and G_β, and the higher the evaluation, the more similar the two images;

x = vec(X)

i.e., x is the vectorization of the matrix X; x^T is the transpose of x; x* is the vectorization of the resulting optimal correspondence matrix and represents the optimal matching relation between the two graphs; K is the similarity matrix containing both the first-order node similarity information and the second-order edge similarity information; 1_n denotes an all-ones column vector of length n, and the constraints ensure that each node matches at most once; since both graphs consist of N nodes, n = m = N;
(2) assigning the similarity matrix K:

K_{ia,jb} = c_{i,a}, if i = j and a = b;
K_{ia,jb} = d_{i,j,a,b}(τ) · A^α_{ij} · A^β_{ab}, if i ≠ j and a ≠ b;
K_{ia,jb} = 0, otherwise;

wherein A^α and A^β are the adjacency matrices of G_α and G_β; A^α_{ij} expresses the adjacency of points v_i and v_j in graph G_α: A^α_{ij} = 1 when v_i and v_j are connected by an edge, and A^α_{ij} = 0 otherwise, and likewise for A^β_{ab}; c_{i,a} denotes the consistency of point v_i in G_α with point v_a in G_β; d_{i,j,a,b}(τ) denotes the consistency of edge (i, j) in G_α with edge (a, b) in G_β;
merging the feature points with the geometric constraints, c_{i,a} is defined from the key point descriptors and d_{i,j,a,b}(τ) from the key point coordinates, where f_i and f_a are the BRIEF descriptors of the key points; P^α_i and P^α_j denote the coordinate positions of points v_i and v_j in graph G_α; P^β_a and P^β_b denote the coordinate positions of points v_a and v_b in graph G_β; τ is the perspective transformation between the query image and the training matching image; d_{i,j,a,b}(τ) represents the consistency of edge (i, j) and edge (a, b) under the perspective transformation τ, and ω is selected large enough to ensure that the similarity of every edge pair is greater than 0; the diagonal elements of the similarity matrix K contain the node-to-node similarity information, and the off-diagonal elements contain the edge-to-edge similarity information;
(3) filtering candidate matches: the key points retained by the GMS screening are sorted by matching distance, and at most the n_c key points with the highest similarity are kept as the candidate set C_t; the rows and columns of key points not belonging to C_t are deleted, condensing the similarity matrix K to size n_c²N².
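Putting the pieces of claim 7 together, the assignment of K from node consistencies c, an edge-consistency function d, and the two adjacency matrices can be sketched as follows (names hypothetical; the full method additionally applies the candidate filtering of step (3) before this matrix is used):

```python
def build_similarity_matrix(n, c, d, A_alpha, A_beta):
    """Assemble K for two graphs of n nodes each; K is (n*n) x (n*n) and
    candidate match (i, a) is mapped to index i*n + a.

    Diagonal entries: node consistency c[i][a].
    Off-diagonal entries: edge consistency d(i, j, a, b), kept only when
    (i, j) is an edge of G_alpha AND (a, b) is an edge of G_beta.
    """
    N = n * n
    K = [[0.0] * N for _ in range(N)]
    for i in range(n):
        for a in range(n):
            K[i * n + a][i * n + a] = c[i][a]
            for j in range(n):
                for b in range(n):
                    if i != j and a != b and A_alpha[i][j] and A_beta[a][b]:
                        K[i * n + a][j * n + b] = d(i, j, a, b)
    return K
```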
8. The method for fast and robust image recognition and tracking based on structural features of claim 7, wherein completing the approximate matching of the graph by the random walk algorithm comprises: finding the nodes with the largest weights by a PageRank graph matching algorithm to complete the matching.
9. The method for fast and robust image recognition and tracking based on structural features according to claim 8, wherein the finding of the nodes with the maximum weight by using the PageRank graph matching algorithm comprises:
(1) for the assigned similarity matrix K, initializing the visiting probability of each point, letting it jump and walk randomly under the matching constraints and finally converge to its quasi-stationary distribution; for each candidate match (i, a) ∈ C_t, the corresponding probability is initialized uniformly:

x^(0)_(i,a) = 1/|C_t|
(2) in order to solve the problem of termination points and traps, the transition probability is set to η and the random-jump probability to (1 − η); the iterative formula of the random walk becomes X' = ηKX + (1 − η)e, where e is the uniform random-jump probability matrix; the iteration is carried out until convergence, and the nodes with the largest weights are selected to complete the matching.
10. A fast robust image recognition tracking apparatus based on structural features, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program is operable to perform the method of any one of claims 1 to 9.
CN202010229998.8A 2020-03-27 2020-03-27 Rapid robust image identification tracking method and device based on structural features Active CN111461196B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010229998.8A CN111461196B (en) 2020-03-27 2020-03-27 Rapid robust image identification tracking method and device based on structural features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010229998.8A CN111461196B (en) 2020-03-27 2020-03-27 Rapid robust image identification tracking method and device based on structural features

Publications (2)

Publication Number Publication Date
CN111461196A true CN111461196A (en) 2020-07-28
CN111461196B CN111461196B (en) 2023-07-21

Family

ID=71683310

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010229998.8A Active CN111461196B (en) 2020-03-27 2020-03-27 Rapid robust image identification tracking method and device based on structural features

Country Status (1)

Country Link
CN (1) CN111461196B (en)


Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102708370A (en) * 2012-05-17 2012-10-03 北京交通大学 Method and device for extracting multi-view angle image foreground target
CN105513094A (en) * 2015-12-17 2016-04-20 上海交通大学 Stereo vision tracking method and stereo vision tracking system based on 3D Delaunay triangulation
CN106682700A (en) * 2017-01-05 2017-05-17 北京细推科技有限公司 Block quick matching algorithm based on key point description operator
CN106780579A (en) * 2017-01-17 2017-05-31 华中科技大学 A kind of ultra-large image characteristic point matching method and system
CN108537287A (en) * 2018-04-18 2018-09-14 北京航空航天大学 Image closed loop detection method and device based on graph model
CN109598750A (en) * 2018-12-07 2019-04-09 中国地质大学(武汉) One kind being based on the pyramidal large scale differential image characteristic point matching method of deformation space
CN109697692A (en) * 2018-12-29 2019-04-30 安徽大学 One kind being based on the similar feature matching method of partial structurtes
CN109784353A (en) * 2017-11-14 2019-05-21 北京三星通信技术研究有限公司 A kind of matched method of non-directed graph, equipment and storage medium
CN109816706A (en) * 2019-02-01 2019-05-28 辽宁工程技术大学 A kind of smoothness constraint and triangulation network equal proportion subdivision picture are to dense matching method
CN110032983A (en) * 2019-04-22 2019-07-19 扬州哈工科创机器人研究院有限公司 A kind of track recognizing method based on ORB feature extraction and FLANN Rapid matching


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JIAWANG BIAN et al.: "GMS: Grid-based Motion Statistics for Fast, Ultra-robust Feature Correspondence", CVPR, 9 November 2017 (2017-11-09) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114048240A (en) * 2021-11-18 2022-02-15 长春理工大学 Data integration method and system based on approximate graph matching algorithm
CN115601576A (en) * 2022-12-12 2023-01-13 云南览易网络科技有限责任公司(Cn) Image feature matching method, device, equipment and storage medium
CN116778141A (en) * 2023-08-28 2023-09-19 深圳联友科技有限公司 ORB algorithm-based method for rapidly identifying and positioning picture
CN116778141B (en) * 2023-08-28 2023-12-22 深圳联友科技有限公司 ORB algorithm-based method for rapidly identifying and positioning picture

Also Published As

Publication number Publication date
CN111461196B (en) 2023-07-21

Similar Documents

Publication Publication Date Title
CN111795704B (en) Method and device for constructing visual point cloud map
US11341738B2 (en) Using a probabtilistic model for detecting an object in visual data
Zhao et al. Deep hough transform for semantic line detection
CN111780764B (en) Visual positioning method and device based on visual map
CN109815843B (en) Image processing method and related product
CN111461196B (en) Rapid robust image identification tracking method and device based on structural features
CN107133913B (en) Automatic-straightening image splicing method
CN108986152B (en) Foreign matter detection method and device based on difference image
CN112967341B (en) Indoor visual positioning method, system, equipment and storage medium based on live-action image
CN111179419A (en) Three-dimensional key point prediction and deep learning model training method, device and equipment
WO2020020047A1 (en) Key point matching method and device, terminal device and storage medium
WO2019041660A1 (en) Face deblurring method and device
CN116721301B (en) Training method, classifying method, device and storage medium for target scene classifying model
CN111783753A (en) Pedestrian re-identification method based on semantic consistency horizontal bar and foreground correction
CN111709317B (en) Pedestrian re-identification method based on multi-scale features under saliency model
CN111125397A (en) Cloth image retrieval method based on convolutional neural network
Bhattacharjee et al. Query adaptive multiview object instance search and localization using sketches
CN113642571A (en) Fine-grained image identification method based on saliency attention mechanism
CN111414958A (en) Multi-feature image classification method and system for visual word bag pyramid
Kristollari et al. Change detection in VHR imagery with severe co-registration errors using deep learning: A comparative study
CN108492256B (en) Unmanned aerial vehicle video fast splicing method
CN108304838B (en) Picture information identification method and terminal
JP6778625B2 (en) Image search system, image search method and image search program
CN112632315B (en) Method and device for retrieving remote sensing image
Luanyuan et al. MGNet: Learning Correspondences via Multiple Graphs

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant