CN109146972B - Visual navigation method based on rapid feature point extraction and gridding triangle constraint - Google Patents

Visual navigation method based on rapid feature point extraction and gridding triangle constraint

Info

Publication number
CN109146972B
Authority
CN
China
Prior art keywords
points
point
feature
triangle
matching
Prior art date
Legal status
Active
Application number
CN201810954414.6A
Other languages
Chinese (zh)
Other versions
CN109146972A (en)
Inventor
谢非
黄天胤
钱伟行
刘文慧
霍丽颖
沈世斌
张雷
刘益剑
张亮
夏邵君
Current Assignee
Zhenjiang Institute For Innovation And Development Nnu
Original Assignee
Zhenjiang Institute For Innovation And Development Nnu
Priority date
Filing date
Publication date
Application filed by Zhenjiang Institute For Innovation And Development Nnu filed Critical Zhenjiang Institute For Innovation And Development Nnu
Priority to CN201810954414.6A priority Critical patent/CN109146972B/en
Publication of CN109146972A publication Critical patent/CN109146972A/en
Application granted granted Critical
Publication of CN109146972B publication Critical patent/CN109146972B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G06F 18/232 Non-hierarchical techniques
    • G06F 18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F 18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V 10/443 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V 10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence

Abstract

The invention discloses a visual navigation method based on rapid feature point extraction and gridding triangle constraint, which comprises the following steps: acquire a continuous sequence of video frames and extract the feature points in two adjacent frames; divide the two adjacent frames into grids; construct a 6-dimensional shape descriptor and a 32-dimensional region descriptor for each triangle, yielding a feature vector composed of a 38-dimensional mixed descriptor per triangle; match the triangle feature vectors between corresponding grid cells, select feature points with salient characteristics by a center-point clustering method, reject mismatched feature points, and reject moving feature points with the epipolar geometric constraint to obtain valid, usable feature point pairs; finally, complete the motion model solution to obtain the position result of the visual navigation solution. The invention provides a solution for robot visual navigation and positioning in an indoor environment, featuring fast feature point extraction and a high matching rate.

Description

Visual navigation method based on rapid feature point extraction and gridding triangle constraint
Technical Field
The invention relates to the technical field of visual navigation and image processing, in particular to a visual navigation method based on rapid feature point extraction and gridding triangle constraint.
Background
In the field of autonomous robot navigation and positioning, the most widely used approach relies on GPS and inertial systems; with the rapid development of science and technology, many emerging navigation modes, such as visual navigation, have gradually appeared. With the rapid progress of machine vision in recent years, more and more robots use machine vision for autonomous obstacle avoidance and path planning. Conventional navigation usually positions and navigates the robot with GPS. In indoor navigation, however, GPS signals have difficulty penetrating thick walls to reach the robot, which seriously degrades the real-time performance and accuracy of navigation. In addition, in an indoor environment the drift of inertial navigation easily causes serious deviation in the navigation result. Visual navigation is therefore a more suitable navigation and positioning mode for indoor environments and is gradually being popularized and applied.
The extraction and matching of image feature points are critical links in visual navigation. Image feature points are usually detected over multiple scales, which produces many points whose positions and scales are very close to one another; the feature points then cluster together, forming redundant feature points and increasing the probability of mismatching. To solve this problem, effective feature point screening is essential. The invention provides a visual navigation method based on rapid feature point extraction and gridding triangle constraint, offering a solution for robot visual navigation and positioning in an indoor environment. The method features fast feature point extraction and a high matching rate, which reduces the computational load of visual navigation data processing and improves the efficiency of the algorithm.
Disclosure of Invention
Aiming at the defects of the prior art, the invention discloses a visual navigation method based on rapid feature point extraction and gridding triangle constraint, which comprises the following steps:
step 1, acquiring continuous video frame sequence images, reading two adjacent frames of color images, carrying out gray processing, extracting feature points in the two adjacent frames of images through a rapid feature point extraction algorithm, and carrying out pre-matching on the extracted feature points;
step 2, performing gridding division on the two extracted adjacent frames of images, triangulating the extracted feature points in each grid by using a Delaunay algorithm, and removing triangles with overlarge or undersize side lengths by detecting the distance between the feature points to obtain a triangular network;
step 3, constructing a 6-dimensional shape descriptor and a 32-dimensional area descriptor for each triangle in the triangular network to obtain a feature vector consisting of 38-dimensional mixed descriptors of each triangle;
step 4, matching the feature vectors of the triangles corresponding to the gridding division of the two adjacent frames of images constructed in the step 3, calculating the Euclidean distance between every two triangle feature vectors in the corresponding grids of the two frames of images, and using the ratio of the minimum value of the Euclidean distance to the second minimum value as the standard for measuring the matching degree;
step 5, extracting corresponding matching feature point pairs according to the matching result of the step 4, selecting feature points with significant characteristics through a central point clustering method, clustering the feature points in spatial distribution, and filtering out partial invalid feature points;
step 6, rejecting mismatching feature points in the significant feature points selected in the step 5 according to an improved random sampling consistency algorithm, and rejecting motion feature points by using epipolar geometric constraint to obtain effective available feature point pairs;
Step 7, substitute the obtained valid feature point pairs into the subsequent motion estimation parameter calculation to complete the final motion model solution, and apply the feature point processing of steps 1 to 6 to the acquired consecutive frame images to obtain the position result of the visual navigation solution. [Li Yubo, Research on Visual Odometry Technology for Mobile Robots in Outdoor Environments, National University of Defense Technology, Master's thesis, 2012, pp. 32-39]
The step 1 comprises the following steps:
step 1-1, reading two adjacent frames of color images from the collected indoor image and carrying out gray processing, wherein the two processed frames of images are marked as g1 and g2 respectively;
step 1-2, respectively carrying out characteristic point detection on g1 and g2 through a rapid characteristic point extraction algorithm to obtain a characteristic point array ps1 of g1 and a characteristic point array ps2 of g 2;
Step 1-3, pre-match the extracted feature points: for each feature point i in array ps1, find the feature point j in array ps2 that matches it; then find the feature point q in array ps1 that matches feature point j. If feature point q is the same pixel point as feature point i, the pre-match succeeds; otherwise it fails. When all points in arrays ps1 and ps2 have been traversed in this way, pre-matching is complete and the pre-matched feature points are obtained.
The step 1-2 comprises the following steps:
Whether each pixel point in the image is a feature point is judged in turn, from top to bottom and from left to right, according to the following criterion:
Take the pixel point p to be judged as the center and construct a discrete circle with a radius of 3 pixels. As shown in fig. 5, there are 16 pixel points on the discrete circle, numbered 1-16 in clockwise order; the pixel three positions to the right of p is recorded as point 2, the pixel three positions below p as point 6, the pixel three positions to the left of p as point 10, and the pixel three positions above p as point 14.
First, determine whether the gray values of points 2 and 10 satisfy the following conditions:
g1(2) < g1(p) - h or g1(2) > g1(p) + h
g1(10) < g1(p) - h or g1(10) > g1(p) + h
If neither point 2 nor point 10 meets the above condition, point p is directly judged not to be a feature point; otherwise, determine whether the gray values of points 6 and 14 satisfy the following conditions:
g1(6) < g1(p) - h or g1(6) > g1(p) + h
g1(14) < g1(p) - h or g1(14) > g1(p) + h
Here g1(2), g1(10), g1(6), g1(14) and g1(p) are the gray values of the g1 gray image at points 2, 10, 6, 14 and p respectively, and h is a detection threshold set in the range 10-30.
If at least 3 of the points 2, 10, 6 and 14 satisfy the above judgment condition, proceed to the following judgment; otherwise point p is directly judged not to be a feature point:
Going around points 1 to 16 in order, determine whether there exist 9 consecutive pixel points whose gray values are all less than g1(p) - h or all greater than g1(p) + h. If so, point p is judged to be a feature point; otherwise it is not.
The step 2 comprises the following steps:
step 2-1, carrying out 6 x 3 gridding division on the extracted two adjacent frames of images;
step 2-2, setting the maximum number of the feature points in each grid interval; the maximum number of feature points may be set to 50-80 to control the amount of detection of feature points and thus the algorithm computational complexity.
Step 2-3, in each grid cell, triangulate the pre-matched feature points extracted in step 1 using the Delaunay algorithm (a classical and commonly used triangulation algorithm; see Dian Jian, Wufang, Wangzuo Boehmeria, Jinyonggang, "An adaptive blocked arbitrary polygon triangulation algorithm" [J], Journal of Surveying and Mapping Science and Technology, 2010, 27(1): 70-74) to obtain a triangular network, and remove triangles whose side lengths are too large or too small by checking the distances between the pre-matched feature points. The removal method is as follows:
Respectively calculate the side lengths of the three sides of each triangle in the triangular network (expressed as numbers of pixels). Any triangle with a side shorter than 8 pixels is judged to have an overly small side length and is removed. A triangle with an overly large side length is judged by the following condition:
[formula reproduced only as an image in the source: the longest side l_max is compared against a threshold formed from the image width w and height h via the max and min functions together with the thresholds R_TH and L_TH]
In the formula, l_max is the longest side of the triangle, w and h are respectively the width and height of the image, max is the maximum-value function, min is the minimum-value function, R_TH is a proportional threshold with a value range of 23-25, and L_TH is a length threshold with a value range of 25-30. If the length of the longest side of the triangle is greater than the right-hand value of the formula, the triangle is judged to have an overly large side length and is rejected.
The step 3 comprises the following steps:
constructing a 6-dimensional shape descriptor and a 32-dimensional area descriptor for each triangle in a triangle network to obtain a feature vector consisting of 38-dimensional mixed descriptors of each triangle, wherein the specific method is as follows:
Let feature points A, B, C form a triangle ABC with centroid e. Let s1, s2, s3 be the three sides of the triangle, where s1 is the longest side and the other two sides are taken in counterclockwise order, and let s be the sum of the lengths of the three sides. Let φ1, φ2, φ3 be the three interior angles of the triangle, where φ1 is the largest interior angle and the other two are taken in the same counterclockwise order. Let γ1, γ2, …, γ32 be a 32-dimensional SIFT descriptor, i.e. the region descriptor, constructed around the centroid e with CA as the principal direction and 0.3 times the longest side of the triangle as the radius [see Tombstone, Guangdong University of Technology, Master's thesis, 2012]. The 6-dimensional shape descriptor of the triangle is then
(α·s1/s, α·s2/s, α·s3/s, β·φ1, β·φ2, β·φ3)
and the feature vector W_ABC composed of the 38-dimensional mixed descriptor (the 6-dimensional shape descriptor together with the 32-dimensional region descriptor) is
W_ABC = (α·s1/s, α·s2/s, α·s3/s, β·φ1, β·φ2, β·φ3, γ1, γ2, …, γ32)
α and β are weight coefficients that balance the shape components and the region descriptor; both take values in the range 0-1.
Step 4 comprises the following steps:
Match the feature vectors of the triangles in the corresponding grid cells of the two adjacent frames constructed in step 3. The matching method is as follows: calculate the Euclidean distance between every pair of triangle feature vectors in the corresponding grid cells of the two frames; use the ratio of the minimum Euclidean distance to the second-smallest distance as the measure of matching quality. If the ratio lies in the range 0.8-1, the match is judged successful; otherwise the match fails. Finally, the matched feature point pairs of the two frames are obtained from the successfully matched triangles.
The step 5 comprises the following steps:
step 5-1, extracting matching feature point pairs corresponding to the two frames of images according to the matching result of the step 4;
and 5-2, selecting the characteristic points with remarkable characteristics for the two frames of images by a central point clustering method in sequence, clustering the characteristic points in spatial distribution, and filtering out partial invalid characteristic points.
Step 5-2 comprises:
step 5-2-1, setting the clustering number of the matched feature points as k;
step 5-2-2, randomly selecting k clustering center points;
step 5-2-3, calculating Euclidean distances between each matched feature point and each cluster center point, and selecting the category of the center point with the minimum distance as the category of the feature point;
step 5-2-4, calculating new central points of the newly selected categories to serve as clustering central points of the newly selected categories;
step 5-2-5, repeating step 5-2-3 until the clustering center point is not changed any more, and finishing the k center point clustering;
step 5-2-6, after the clustering of the central points is completed, selecting the characteristic points closest to the central points of the various clusters as the characteristic points with remarkable characteristics;
and 5-2-7, storing the finally selected salient characteristic feature point information of the two frames of images, and executing the step 6.
The step 6 comprises the following steps:
Remove the mismatched feature points from the finally selected salient feature points of the two frames of images using the improved random sample consensus algorithm, and remove the moving feature points using the epipolar geometric constraint to obtain valid, usable feature point pairs. [Li Yubo, Research on Visual Odometry Technology for Mobile Robots in Outdoor Environments, National University of Defense Technology, Master's thesis, 2012, pp. 21-31]
Through the implementation of the technical scheme, the invention has the following beneficial effects: (1) the feature points are processed with an improved random sample consensus (RANSAC) algorithm, which greatly improves the speed and precision of feature point matching; (2) gridding and K-means clustering of the image greatly reduce the redundancy of feature points and markedly increase the computation speed of the algorithm; (3) the rapid triangle constraint algorithm is improved, raising the operating efficiency of the algorithm; (4) three-dimensional map information can be established quickly, improving the autonomy and efficiency of navigation; (5) the method offers high computation speed, a high recognition rate and strong resistance to environmental interference.
Drawings
The foregoing and other advantages of the invention will become more apparent from the following detailed description of the invention when taken in conjunction with the accompanying drawings.
FIG. 1 is a block flow diagram of the present invention.
FIG. 2 is a result diagram of feature point extraction by the fast feature point extraction algorithm in the present invention.
FIG. 3 is a graph of the gridding and feature clustering results of the image according to the present invention.
FIG. 4 is a diagram of the visual navigation positioning result of the method of the present invention.
Fig. 5 is a schematic diagram of discrete circles.
Fig. 6 is a schematic diagram of a triangle formed by feature points.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
As shown in FIG. 1, the invention discloses a visual navigation method based on rapid feature point extraction and gridding triangle constraint, comprising the following steps:
step 1: acquiring continuous video frame sequence images, reading two adjacent frames of color images, carrying out gray processing, extracting feature points in the two adjacent frames of images through a rapid feature point extraction algorithm, and carrying out pre-matching on the extracted feature points; as shown in fig. 2, it is a result diagram of feature point extraction by the fast feature point extraction algorithm in the present invention.
Step 2: performing gridding division on the two extracted adjacent frames of images, triangulating the extracted feature points in each grid by using a Delaunay algorithm, and removing triangles with overlarge or undersize side lengths by detecting the distance between the feature points to obtain a triangular network;
gridding is the pre-processing of an image, i.e. simply dividing the image into a certain number of image blocks before feature detection of the image. And the total number of feature points is specified, so that all or a large number of feature points can be prevented from being concentrated on the same object. The redundancy of feature point extraction can be greatly reduced, so that the operation speed and the real-time performance of the algorithm can be greatly improved; FIG. 3 is a graph showing the gridding and feature clustering results of the image according to the present invention.
The improved K-means clustering algorithm is a vector quantization method that originated in signal processing. It merges points with similar characteristics in the same region into one cluster feature point, which greatly reduces the redundancy of feature point matching.
and step 3: constructing a 6-dimensional shape descriptor and a 32-dimensional area descriptor for each triangle in the triangular network to obtain a feature vector consisting of 38-dimensional mixed descriptors of each triangle;
and 4, step 4: matching the feature vectors of the triangles corresponding to the gridding division of the two adjacent frames of images constructed in the step (3), calculating the Euclidean distance of every two triangle feature vectors in the grids corresponding to the two frames of images, and using the ratio of the minimum value of the Euclidean distance to the second minimum value as the standard for measuring the matching degree;
step 5, extracting corresponding matching feature point pairs according to the matching result of the step 4, selecting feature points with significant characteristics through a central point clustering method, clustering the feature points in spatial distribution, and filtering out partial invalid feature points;
step 6, rejecting mismatching feature points in the significant feature points selected in the step 5 according to an improved random sampling consistency algorithm, and rejecting motion feature points by using epipolar geometric constraint to obtain effective available feature point pairs;
Step 7: substitute the obtained valid feature point pairs into the subsequent motion estimation parameter calculation to complete the final motion model solution, and apply the feature point processing of steps 1 to 6 to the acquired consecutive frame images to obtain the position result of the visual navigation solution. [Li Yubo, Research on Visual Odometry Technology for Mobile Robots in Outdoor Environments, National University of Defense Technology, Master's thesis, 2012, pp. 32-39]
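The patent relies on the cited literature for the motion-model solution of step 7 and does not fix a particular formulation. As a hedged illustration only, the sketch below assumes a calibrated monocular camera and recovers the relative rotation R and translation t from the valid feature point pairs via the essential matrix, chaining the results over consecutive frames (translation scale is arbitrary in the monocular case).

```python
import cv2
import numpy as np


def update_pose(pts_prev, pts_curr, K, R_wc, t_wc):
    """One visual-odometry step from valid matched point pairs (Nx2 float arrays).

    K is the 3x3 camera intrinsic matrix; (R_wc, t_wc) is the accumulated pose.
    Returns the updated pose; translation scale is arbitrary for a monocular setup.
    """
    E, mask = cv2.findEssentialMat(pts_prev, pts_curr, K, method=cv2.RANSAC, threshold=1.0)
    _, R, t, mask = cv2.recoverPose(E, pts_prev, pts_curr, K, mask=mask)
    # Chain the relative motion onto the accumulated pose.
    t_wc = t_wc + R_wc @ t
    R_wc = R_wc @ R
    return R_wc, t_wc


# Usage: start from the identity pose and feed the valid feature point pairs of steps 1-6.
# R, t = np.eye(3), np.zeros((3, 1))
# R, t = update_pose(prev_pts, curr_pts, K, R, t)
```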
In step 1, the so-called rapid feature point extraction method is an algorithm for fast feature point extraction and description. For the improved random sample consensus algorithm, some matching point pairs are randomly drawn from the pre-matches produced by the rapid feature point extraction algorithm. The basic assumption of the improved algorithm is that the sample contains both correct data and abnormal data, i.e. the data set contains noise; the abnormal data may come from erroneous measurements, erroneous assumptions, erroneous calculations and so on. It is further assumed that, given a set of correct data, there is a way to compute model parameters that fit these data. A transformation matrix is then computed, all matching pairs obtained in pre-matching are traversed with it, and the percentage of all pre-matched pairs that satisfy the transformation matrix model under a given threshold is calculated. The above steps are repeated n times (n = 10000 in the experiments), the percentages of matching pairs satisfying the model under each transformation matrix are compared, and the matrix with the largest percentage is taken as the final best transformation matrix. Matched pairs whose error under the best transformation exceeds a given threshold are removed. The procedure is then repeated m times on the matching points that remain after removal (m = 3 in the experiments).
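A minimal sketch of this iterative scheme follows, written in Python with OpenCV. The choice of a homography estimated from 4 randomly drawn pre-matched pairs as the transformation model is an assumption for the example (the text above does not name the model); n and m follow the experimental values quoted above.

```python
import numpy as np
import cv2


def improved_ransac(src, dst, n_iter=10000, m_rounds=3, err_thresh=3.0):
    """Iteratively filter pre-matched point pairs (Nx2 arrays) as described above."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    keep = np.arange(len(src))
    for _ in range(m_rounds):                        # repeat the removal m times
        if len(keep) < 4:
            break
        best_H, best_ratio = None, -1.0
        for _ in range(n_iter):                      # n random transformation hypotheses
            idx = np.random.choice(len(keep), 4, replace=False)
            H, _ = cv2.findHomography(src[keep][idx], dst[keep][idx], 0)
            if H is None:
                continue
            proj = cv2.perspectiveTransform(src[keep].reshape(-1, 1, 2), H).reshape(-1, 2)
            err = np.linalg.norm(proj - dst[keep], axis=1)
            ratio = np.mean(err < err_thresh)        # percentage of pairs satisfying the model
            if ratio > best_ratio:
                best_ratio, best_H = ratio, H
        if best_H is None:
            break
        # Remove pairs whose error under the best transformation exceeds the threshold.
        proj = cv2.perspectiveTransform(src[keep].reshape(-1, 1, 2), best_H).reshape(-1, 2)
        err = np.linalg.norm(proj - dst[keep], axis=1)
        keep = keep[err <= err_thresh]
    return keep                                      # indices of the retained matching pairs
```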
The step 1 comprises the following steps:
step 1-1, reading two adjacent frames of color images from the collected indoor image and carrying out gray processing, wherein the two processed frames of images are marked as g1 and g2 respectively;
step 1-2, respectively carrying out characteristic point detection on g1 and g2 through a rapid characteristic point extraction algorithm to obtain a characteristic point array ps1 of g1 and a characteristic point array ps2 of g 2;
Step 1-3, pre-match the extracted feature points; the pre-matching method is as follows: for each feature point i in array ps1, find the feature point j in array ps2 that matches it; then find the feature point q in array ps1 that matches feature point j. If feature point q is the same pixel point as feature point i, the pre-match succeeds; otherwise it fails. When all points in arrays ps1 and ps2 have been traversed in this way, pre-matching is complete and the pre-matched feature points are obtained.
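The pre-matching of step 1-3 is a mutual (cross-check) consistency test. The sketch below assumes ORB keypoints and binary descriptors as the concrete rapid extraction-and-description method, which is only one possible choice; the cross-check itself follows the procedure described above.

```python
import cv2
import numpy as np


def prematch(g1, g2):
    """Detect, describe and cross-check match feature points of two grayscale frames."""
    orb = cv2.ORB_create(nfeatures=2000)          # fast extraction + description (assumed choice)
    ps1, d1 = orb.detectAndCompute(g1, None)
    ps2, d2 = orb.detectAndCompute(g2, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    fwd = matcher.match(d1, d2)                   # best j in ps2 for each i in ps1
    back = {m.queryIdx: m.trainIdx for m in matcher.match(d2, d1)}  # best q in ps1 for each j

    pairs = []
    for m in fwd:
        # Pre-match succeeds only if j maps back to the same point i (same pixel).
        if back.get(m.trainIdx) == m.queryIdx:
            pairs.append((ps1[m.queryIdx].pt, ps2[m.trainIdx].pt))
    return np.float32(pairs)
```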
The step 1-2 comprises the following steps:
whether each pixel point in the image is a feature point is sequentially judged from top to bottom and from left to right according to the following judgment criteria:
(1) constructing a discrete circle by taking the pixel point p to be judged as the center of a circle and 3 pixels as the radius, wherein 16 pixel points are totally arranged on the discrete circle as shown in figure 5;
(2) first, it is determined whether the gray values of the pixel points at positions 2 and 10 satisfy the following condition,
g1(2) < g1(p) -h or g1(2) > g1(p) + h
g1(10) < g1(p) -h or g1(10) > g1(p) + h
If the points 2 and 10 do not meet the above conditions, the point p can be directly judged not to be a characteristic point; otherwise, whether the gray values of the pixel points at the positions 6 and 14 meet the following conditions is judged,
g1(6) < g1(p) -h or g1(6) > g1(p) + h
g1(14) < g1(p) -h or g1(14) > g1(p) + h
g1(2), g1(10), g1(6), g1(14), g1(p) are the gray scale values of the g1 gray scale map at positions 2, 10, 6, 14, p, respectively, h is a detection threshold, and the set range is 10-30.
(3) If at least 3 of the pixel points at the four positions 2, 10, 6 and 14 satisfy the above judgment condition, proceed to the following judgment; otherwise point p is directly judged not to be a feature point:
Going around the points at positions 1-16 in fig. 5 in order, determine whether there exist 9 consecutive pixel points whose gray values are all less than g1(p) - h or all greater than g1(p) + h. If so, point p is judged to be a feature point; otherwise it is not.
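The judgment criterion of steps (1)-(3) can be transcribed directly into code. The sketch below assumes the 16 circle pixels lie on the standard Bresenham circle of radius 3, numbered clockwise so that positions 2, 6, 10 and 14 are exactly 3 pixels to the right of, below, to the left of and above p, consistent with the description; the caller is expected to keep p at least 3 pixels away from the image border.

```python
# Offsets (dx, dy) of the 16 discrete-circle pixels, numbered 1-16 clockwise so that
# position 2 is 3 pixels right of p, 6 is 3 below, 10 is 3 left, and 14 is 3 above.
CIRCLE = [(3, -1), (3, 0), (3, 1), (2, 2), (1, 3), (0, 3), (-1, 3), (-2, 2),
          (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3), (0, -3), (1, -3), (2, -2)]


def is_feature_point(g1, x, y, h=20):
    """Apply the discrete-circle judgment criterion to pixel p = (x, y) of gray image g1."""
    p = int(g1[y, x])

    def differs(idx):                      # idx is the 1-based circle position
        dx, dy = CIRCLE[idx - 1]
        v = int(g1[y + dy, x + dx])
        return v < p - h or v > p + h

    # Quick rejection using the four compass positions 2, 10, 6, 14.
    if not (differs(2) or differs(10)):
        return False
    if sum(differs(i) for i in (2, 6, 10, 14)) < 3:
        return False

    # Full test: 9 consecutive circle pixels must all be darker than p-h or all brighter than p+h.
    vals = [int(g1[y + dy, x + dx]) for dx, dy in CIRCLE]
    darker = [v < p - h for v in vals]
    brighter = [v > p + h for v in vals]
    for flags in (darker, brighter):
        run, best = 0, 0
        for f in flags + flags[:8]:        # wrap around the circle
            run = run + 1 if f else 0
            best = max(best, run)
        if best >= 9:
            return True
    return False
```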
The step 2 comprises the following steps:
step 2-1, carrying out 6 x 3 gridding division on the extracted two adjacent frames of images;
step 2-2, setting the maximum number of the feature points in each grid interval; the maximum number of feature points may be set to 50-80 to control the amount of detection of feature points and thus the algorithm computational complexity.
Step 2-3, in each grid cell, triangulate the pre-matched feature points extracted in step 1 using the Delaunay algorithm (a classical and commonly used triangulation algorithm; see Dian Jian, Wufang, Wangzuo Boehmeria, Jinyonggang, "An adaptive blocked arbitrary polygon triangulation algorithm" [J], Journal of Surveying and Mapping Science and Technology, 2010, 27(1): 70-74) to obtain a triangular network, and remove triangles whose side lengths are too large or too small by checking the distances between the pre-matched feature points. The removal method is as follows:
Respectively calculate the side lengths of the three sides of each triangle in the triangular network (expressed as numbers of pixels). Any triangle with a side shorter than 8 pixels is judged to have an overly small side length and is removed. A triangle with an overly large side length is judged by the following condition:
[formula reproduced only as an image in the source: the longest side l_max is compared against a threshold formed from the image width w and height h via the max and min functions together with the thresholds R_TH and L_TH]
In the formula, l_max is the longest side of the triangle, w and h are respectively the width and height of the image, max is the maximum-value function, min is the minimum-value function, R_TH is a proportional threshold with a value range of 23-25, and L_TH is a length threshold with a value range of 25-30. If the length of the longest side of the triangle is greater than the right-hand value of the formula, the triangle is judged to have an overly large side length and is rejected.
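A sketch of step 2-3 using SciPy's Delaunay triangulation is given below. Because the large-side threshold formula is reproduced only as an image in the source, the upper bound max_side is a stand-in parameter rather than the patented expression; the 8-pixel lower bound follows the text.

```python
import numpy as np
from scipy.spatial import Delaunay


def build_triangle_network(points, min_side=8.0, max_side=60.0):
    """Triangulate pre-matched feature points (Nx2 array) inside one grid cell and
    drop triangles whose sides are too short or too long.

    max_side stands in for the patent's threshold derived from the image size,
    R_TH and L_TH (its exact formula is given only as an image in the source).
    """
    pts = np.asarray(points, dtype=float)
    if len(pts) < 3:
        return pts, np.empty((0, 3), dtype=int)
    tri = Delaunay(pts)
    kept = []
    for simplex in tri.simplices:                 # each simplex is 3 point indices
        a, b, c = pts[simplex]
        sides = [np.linalg.norm(a - b), np.linalg.norm(b - c), np.linalg.norm(c - a)]
        if min(sides) < min_side or max(sides) > max_side:
            continue                              # side too small or too large: reject
        kept.append(simplex)
    return pts, np.array(kept, dtype=int)
```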
The step 3 comprises the following steps:
constructing a 6-dimensional shape descriptor and a 32-dimensional area descriptor for each triangle in a triangle network to obtain a feature vector consisting of 38-dimensional mixed descriptors of each triangle, wherein the specific method is as follows:
As shown in FIG. 6, feature points A, B, C form a triangle ABC with centroid e. Let s1, s2, s3 be the three sides of the triangle, where s1 is the longest side and the other two sides are taken in counterclockwise order, and let s be the sum of the lengths of the three sides. Let φ1, φ2, φ3 be the three interior angles of the triangle, where φ1 is the largest interior angle and the other two are taken in the same counterclockwise order. Let γ1, γ2, …, γ32 be a 32-dimensional SIFT descriptor, i.e. the region descriptor, constructed around the centroid e with CA as the principal direction and 0.3 times the longest side of the triangle as the radius [see Tombstone, Guangdong University of Technology, Master's thesis, 2012]. The 6-dimensional shape descriptor of the triangle is then
(α·s1/s, α·s2/s, α·s3/s, β·φ1, β·φ2, β·φ3)
and the feature vector W_ABC composed of the 38-dimensional mixed descriptor (the 6-dimensional shape descriptor together with the 32-dimensional region descriptor) is
W_ABC = (α·s1/s, α·s2/s, α·s3/s, β·φ1, β·φ2, β·φ3, γ1, γ2, …, γ32)
α and β are weight coefficients that balance the shape components and the region descriptor; both take values in the range 0-1. In the present invention, α is taken as 0.6 and β as 0.4.
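The following sketch assembles the 38-dimensional mixed descriptor for one triangle with α = 0.6 and β = 0.4 as in this embodiment. The 32-dimensional SIFT-like region descriptor is abstracted behind a caller-supplied function region_descriptor (an assumption; the patent refers to the cited thesis for its construction), and the shape part uses the side lengths normalized by their sum together with the interior angles; the patent's ordering rule (longest side and largest angle first, the rest counterclockwise) is noted but omitted here for brevity.

```python
import numpy as np


def triangle_descriptor(A, B, C, region_descriptor, alpha=0.6, beta=0.4):
    """Build the 38-dimensional mixed descriptor W_ABC of triangle ABC.

    A, B, C: 2D feature point coordinates (arrays of shape (2,)).
    region_descriptor: callable(center, direction, radius) -> 32-dim vector (assumed
    to implement the SIFT-like region descriptor around the centroid).
    Note: the patent additionally orders the sides/angles (longest side and largest
    angle first, remaining ones counterclockwise); that reordering is omitted here.
    """
    A, B, C = (np.asarray(p, float) for p in (A, B, C))
    e = (A + B + C) / 3.0                              # centroid
    sides = np.array([np.linalg.norm(B - C),           # side opposite A
                      np.linalg.norm(C - A),           # side opposite B
                      np.linalg.norm(A - B)])          # side opposite C
    s = sides.sum()
    # Interior angles via the law of cosines, ordered as the vertices A, B, C.
    a, b, c = sides
    phi = np.array([np.arccos(np.clip((b*b + c*c - a*a) / (2*b*c), -1, 1)),
                    np.arccos(np.clip((a*a + c*c - b*b) / (2*a*c), -1, 1)),
                    np.arccos(np.clip((a*a + b*b - c*c) / (2*a*b), -1, 1))])
    shape = np.concatenate([alpha * sides / s, beta * phi])        # 6-dim shape descriptor
    gamma = np.asarray(region_descriptor(e, C - A, 0.3 * sides.max()), float)  # 32-dim part
    return np.concatenate([shape, gamma])              # 38-dim mixed descriptor W_ABC
```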
Step 4 comprises the following steps:
Match the feature vectors of the triangles in the corresponding grid cells of the two adjacent frames constructed in step 3. The matching method is as follows: calculate the Euclidean distance between every pair of triangle feature vectors in the corresponding grid cells of the two frames; use the ratio of the minimum Euclidean distance to the second-smallest distance as the measure of matching quality. If the ratio lies in the range 0.8-1, the match is judged successful; otherwise the match fails. Finally, the matched feature point pairs of the two frames are obtained from the successfully matched triangles.
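A short sketch of the triangle matching in step 4: for each triangle feature vector in a grid cell of the first frame, the Euclidean distances to all triangle vectors in the corresponding cell of the second frame are computed, and the match is accepted when the ratio of the smallest to the second-smallest distance falls in the stated 0.8-1 range.

```python
import numpy as np


def match_triangles(desc1, desc2, ratio_low=0.8, ratio_high=1.0):
    """Match 38-dim triangle descriptors of one grid cell between two frames.

    desc1, desc2: arrays of shape (n1, 38) and (n2, 38).
    Returns a list of (i, j) index pairs of successfully matched triangles.
    """
    matches = []
    for i, v in enumerate(desc1):
        d = np.linalg.norm(desc2 - v, axis=1)          # Euclidean distance to every candidate
        if len(d) < 2:
            continue
        order = np.argsort(d)
        ratio = d[order[0]] / d[order[1]] if d[order[1]] > 0 else 1.0
        # Matching succeeds when the min/second-min ratio lies in [ratio_low, ratio_high].
        if ratio_low <= ratio <= ratio_high:
            matches.append((i, int(order[0])))
    return matches
```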
The step 5 comprises the following steps:
step 5-1, extracting matching feature point pairs corresponding to the two frames of images according to the matching result of the step 4;
and 5-2, selecting the characteristic points with remarkable characteristics for the two frames of images by a central point clustering method in sequence, clustering the characteristic points in spatial distribution, and filtering out partial invalid characteristic points.
Step 5-2 comprises:
step 5-2-1, setting the clustering number of the matched feature points as k;
step 5-2-2, randomly selecting k clustering center points;
step 5-2-3, calculating Euclidean distances between each matched feature point and each cluster center point, and selecting the category of the center point with the minimum distance as the category of the feature point;
step 5-2-4, calculating new central points of the newly selected categories to serve as clustering central points of the newly selected categories;
step 5-2-5, repeating step 5-2-3 until the clustering center point is not changed any more, and finishing the k center point clustering;
step 5-2-6, after the clustering of the central points is completed, selecting the characteristic points closest to the central points of the various clusters as the characteristic points with remarkable characteristics;
and 5-2-7, storing the finally selected salient characteristic feature point information of the two frames of images, and executing the step 6.
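Steps 5-2-1 to 5-2-6 form a k-means-style loop. The sketch below follows them directly and then keeps, for each cluster, the matched feature point closest to the cluster center as the salient feature point; the cluster count k is an assumed parameter, and at least k matched points are expected.

```python
import numpy as np


def select_salient_points(points, k=8, max_iter=100, seed=0):
    """Cluster matched feature points (Nx2 array) and keep one salient point per cluster."""
    pts = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    centers = pts[rng.choice(len(pts), size=k, replace=False)]   # step 5-2-2: random centers
    for _ in range(max_iter):
        # Step 5-2-3: assign each point to the nearest cluster center.
        d = np.linalg.norm(pts[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Step 5-2-4: recompute each cluster center from its members.
        new_centers = np.array([pts[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):       # step 5-2-5: centers no longer change
            break
        centers = new_centers
    # Step 5-2-6: the point nearest to each cluster center is kept as the salient feature point.
    salient_idx = [int(np.argmin(np.linalg.norm(pts - c, axis=1))) for c in centers]
    return sorted(set(salient_idx))
```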
The step 6 comprises the following steps:
Remove the mismatched feature points from the finally selected salient feature points of the two frames of images using the improved random sample consensus algorithm, and then remove the moving feature points using the epipolar geometric constraint to obtain valid, usable feature point pairs (see Li Yubo, Research on Visual Odometry Technology for Mobile Robots in Outdoor Environments, National University of Defense Technology, Master's thesis, 2012, pp. 21-31). FIG. 4 shows the visual navigation positioning result of the method of the present invention.
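One common way to realize the epipolar geometric constraint of step 6 (an assumption here; the patent does not spell out the implementation) is to estimate the fundamental matrix from the salient point pairs with RANSAC and reject pairs whose distance to the corresponding epipolar line is too large; feature points on independently moving objects typically violate this constraint.

```python
import cv2
import numpy as np


def reject_moving_points(pts1, pts2, dist_thresh=2.0):
    """Keep only point pairs consistent with the epipolar geometry of the static scene.

    pts1, pts2: Nx2 float arrays of matched feature points in two adjacent frames.
    Returns the indices of the retained (static, correctly matched) pairs.
    """
    pts1 = np.float32(pts1); pts2 = np.float32(pts2)
    F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, dist_thresh, 0.99)
    if F is None:
        return np.arange(len(pts1))                    # geometry could not be estimated

    # Distance of each second-frame point to the epipolar line l' = F x of its partner.
    ones = np.ones((len(pts1), 1), dtype=np.float32)
    x1 = np.hstack([pts1, ones]); x2 = np.hstack([pts2, ones])
    lines = (F @ x1.T).T                               # epipolar lines in image 2
    num = np.abs(np.sum(lines * x2, axis=1))
    den = np.sqrt(lines[:, 0] ** 2 + lines[:, 1] ** 2)
    dist = num / np.maximum(den, 1e-9)
    keep = (mask.ravel() == 1) & (dist < dist_thresh)  # RANSAC inliers within the line distance
    return np.nonzero(keep)[0]
```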
The present invention provides a visual navigation method based on rapid feature point extraction and gridding triangle constraint. There are many specific ways to implement this technical scheme, and the above is only a preferred embodiment of the invention. It should be noted that those skilled in the art can make several improvements and refinements without departing from the principle of the invention, and such improvements and refinements should also be regarded as falling within the protection scope of the invention. Any components not specified in this embodiment can be implemented with the prior art.

Claims (1)

1. The visual navigation method based on rapid feature point extraction and gridding triangle constraint is characterized by comprising the following steps of:
step 1, acquiring continuous video frame sequence images, reading two adjacent frames of color images, carrying out gray processing, extracting feature points in the two adjacent frames of images through a rapid feature point extraction algorithm, and carrying out pre-matching on the extracted feature points;
step 2, performing gridding division on the two extracted adjacent frames of images, triangulating the extracted feature points in each grid by using a Delaunay algorithm, and removing triangles with overlarge or undersize side lengths by detecting the distance between the feature points to obtain a triangular network;
step 3, constructing a 6-dimensional shape descriptor and a 32-dimensional area descriptor for each triangle in the triangular network to obtain a feature vector consisting of 38-dimensional mixed descriptors of each triangle;
step 4, matching the feature vectors of the triangles corresponding to the gridding division of the two adjacent frames of images constructed in the step 3, calculating the Euclidean distance between every two triangle feature vectors in the corresponding grids of the two frames of images, and using the ratio of the minimum value of the Euclidean distance to the second minimum value as the standard for measuring the matching degree;
step 5, extracting corresponding matching feature point pairs according to the matching result of the step 4, selecting feature points with significant characteristics through a central point clustering method, clustering the feature points in spatial distribution, and filtering out partial invalid feature points;
step 6, rejecting mismatching feature points in the significant feature points selected in the step 5 according to an improved random sampling consistency algorithm, and rejecting motion feature points by using epipolar geometric constraint to obtain effective available feature point pairs;
step 7, substituting the obtained effective available feature points into subsequent motion estimation parameter calculation to complete final motion model solution, and performing the feature point processing work of the steps 1-6 on the collected continuous frame images to obtain a position result of visual navigation solution;
the step 1 comprises the following steps:
step 1-1, reading two adjacent frames of color images from the collected indoor image and carrying out gray processing, wherein the two processed frames of images are marked as g1 and g2 respectively;
step 1-2, respectively carrying out characteristic point detection on g1 and g2 through a rapid characteristic point extraction algorithm to obtain a characteristic point array ps1 of g1 and a characteristic point array ps2 of g 2;
step 1-3, pre-matching the extracted feature points: for each feature point i in array ps1, finding the feature point j in array ps2 that matches it, and then finding the feature point q in array ps1 that matches feature point j; if feature point q is the same pixel point as feature point i, the pre-match succeeds, otherwise it fails; when all points in arrays ps1 and ps2 have been traversed in this way, pre-matching is complete and the pre-matched feature points are obtained;
the step 1-2 comprises the following steps:
whether each pixel point in the image is a feature point is sequentially judged from top to bottom and from left to right according to the following judgment criteria:
taking a pixel point p to be judged as a circle center, taking 3 pixels as a radius to construct a discrete circle, wherein the discrete circle comprises 16 pixel points which are sequentially marked as points 1-16 clockwise, wherein the three pixel positions on the right side of the point p are marked as a point 2, the three pixel positions below the point p are marked as a point 6, the three pixel positions on the left side of the point p are marked as a point 10, and the three pixel positions above the point p are marked as a point 14;
first it is discriminated whether the gray values of the points 2 and 10 satisfy the following condition,
g1(2) < g1(p) -h or g1(2) > g1(p) + h
g1(10) < g1(p) -h or g1(10) > g1(p) + h
If the points 2 and 10 do not meet the above conditions, directly judging that the point p is not a characteristic point; otherwise, whether the gray values of the pixel points of the points 6 and 14 meet the following conditions is judged,
g1(6) < g1(p) -h or g1(6) > g1(p) + h
g1(14) < g1(p) -h or g1(14) > g1(p) + h
g1(2), g1(10), g1(6), g1(14), g1(p) are the gray values at g1 gray map points 2, 10, 6, 14, p, respectively, and h is the detection threshold;
if at least 3 of the points 2, 10, 6 and 14 satisfy the above judgment condition, the following judgment is carried out; otherwise, the point p is directly judged not to be a feature point:
judging whether the gray value of 9 continuous pixel points is less than g1(p) -h or more than g1(p) + h according to the sequence of the points 1-16, if so, judging that the p point is a characteristic point, otherwise, judging that the p point is not the characteristic point;
the step 2 comprises the following steps:
step 2-1, carrying out 6 x 3 gridding division on the extracted two adjacent frames of images;
step 2-2, setting the maximum number of the feature points in each grid interval;
step 2-3, triangulating the pre-matching feature points extracted in the step 1 by using a Delaunay algorithm in each grid to obtain a triangular network, and removing triangles with overlarge or undersize side lengths by detecting the distance between the pre-matching feature points, wherein the removing method comprises the following steps:
respectively calculating the side lengths of three sides of each triangle in the triangle network, expressing by using the number of pixels, judging that any triangle with the side length lower than 8 pixels is a triangle with the over-small side length, and eliminating, and judging the triangle with the over-large side length by the following formula:
[formula reproduced only as an image in the source; it gives the upper bound on the longest side of the triangle]
in the formula, l_max is the longest side of the triangle; if the length of the longest side of the triangle is greater than the right-hand value of the formula, the triangle is judged to have an overly large side length and is eliminated;
the step 3 comprises the following steps:
constructing a 6-dimensional shape descriptor and a 32-dimensional area descriptor for each triangle in a triangle network to obtain a feature vector consisting of 38-dimensional mixed descriptors of each triangle, wherein the specific method is as follows:
setting feature points A, B, C to form a triangle ABC with e as its centroid; s1, s2, s3 are the three sides of the triangle, s1 being the longest side and the other two sides taken in counterclockwise order, and s is the sum of the lengths of the three sides; φ1, φ2, φ3 are the three interior angles of the triangle, φ1 being the largest interior angle and the other two angles taken in the same counterclockwise order; γ1, γ2, …, γ32 is a 32-dimensional SIFT descriptor, namely the region descriptor, constructed around the centroid e with CA as the principal direction and 0.3 times the longest side of the triangle as the radius; the 6-dimensional shape descriptor of the triangle is finally obtained as
(α·s1/s, α·s2/s, α·s3/s, β·φ1, β·φ2, β·φ3)
and the feature vector W_ABC composed of the 38-dimensional mixed descriptor formed by the 6-dimensional shape descriptor and the 32-dimensional region descriptor is
W_ABC = (α·s1/s, α·s2/s, α·s3/s, β·φ1, β·φ2, β·φ3, γ1, γ2, …, γ32)
alpha and beta are weight coefficients for keeping balance in the shape descriptor and the region descriptor;
step 4 comprises the following steps:
matching the feature vectors of the triangles in the corresponding gridded cells of the two adjacent frames of images constructed in the step 3, wherein the matching method comprises: calculating the Euclidean distance between every two triangle feature vectors in the corresponding grids of the two frames of images; the ratio of the minimum Euclidean distance value to the second minimum value is used as the standard for measuring the matching degree, the matching is judged to be successful when the ratio ranges from 0.8 to 1, otherwise the matching fails, and finally the matching feature point pairs of the two frames of images are obtained according to the successfully matched triangles;
the step 5 comprises the following steps:
step 5-1, extracting matching feature point pairs corresponding to the two frames of images according to the matching result of the step 4;
step 5-2, selecting characteristic points with remarkable characteristics for the two frames of images sequentially through a central point clustering method, clustering the characteristic points in spatial distribution, and filtering out partial invalid characteristic points;
step 5-2 comprises:
step 5-2-1, setting the clustering number of the matched feature points as k;
step 5-2-2, randomly selecting k clustering center points;
step 5-2-3, calculating Euclidean distances between each matched feature point and each cluster center point, and selecting the category of the center point with the minimum distance as the category of the feature point;
step 5-2-4, calculating new central points of the newly selected categories to serve as clustering central points of the newly selected categories;
step 5-2-5, repeating step 5-2-3 until the clustering center point is not changed any more, and finishing the k center point clustering;
step 5-2-6, after the clustering of the central points is completed, selecting the characteristic points closest to the central points of the various clusters as the characteristic points with remarkable characteristics;
5-2-7, storing the finally selected salient characteristic feature point information of the two frames of images, and executing the step 6;
the step 6 comprises the following steps:
and removing the wrongly matched characteristic points from the finally selected salient characteristic points of the two frames of images by using an improved random sampling consistency algorithm, and removing the motion characteristic points by using epipolar geometric constraint to obtain effective available characteristic point pairs.
CN201810954414.6A 2018-08-21 2018-08-21 Visual navigation method based on rapid feature point extraction and gridding triangle constraint Active CN109146972B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810954414.6A CN109146972B (en) 2018-08-21 2018-08-21 Visual navigation method based on rapid feature point extraction and gridding triangle constraint

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810954414.6A CN109146972B (en) 2018-08-21 2018-08-21 Visual navigation method based on rapid feature point extraction and gridding triangle constraint

Publications (2)

Publication Number Publication Date
CN109146972A CN109146972A (en) 2019-01-04
CN109146972B true CN109146972B (en) 2022-04-12

Family

ID=64790566

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810954414.6A Active CN109146972B (en) 2018-08-21 2018-08-21 Visual navigation method based on rapid feature point extraction and gridding triangle constraint

Country Status (1)

Country Link
CN (1) CN109146972B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109738526B (en) * 2019-01-23 2021-03-30 电子科技大学 Method for positioning and size determination of weak stress area of lower layer of metal shell
CN110365902A (en) * 2019-07-23 2019-10-22 湖南省湘电试研技术有限公司 The video anti-fluttering method and system of intelligent safety helmet based on Harris Corner Detection
CN110473229B (en) * 2019-08-21 2022-03-29 上海无线电设备研究所 Moving object detection method based on independent motion characteristic clustering
CN110866939B (en) * 2019-10-17 2022-03-22 南京师范大学 Robot motion state identification method based on camera pose estimation and deep learning
CN110909778B (en) * 2019-11-12 2023-07-21 北京航空航天大学 Image semantic feature matching method based on geometric consistency
CN111144489B (en) * 2019-12-25 2021-01-19 视辰信息科技(上海)有限公司 Matching pair filtering method and device, electronic equipment and storage medium
CN111862206A (en) * 2019-12-31 2020-10-30 滴图(北京)科技有限公司 Visual positioning method and device, electronic equipment and readable storage medium
CN111521195B (en) * 2020-04-10 2022-03-04 广州铁路职业技术学院(广州铁路机械学校) Intelligent robot
CN111780764B (en) * 2020-06-30 2022-09-02 杭州海康机器人技术有限公司 Visual positioning method and device based on visual map
CN112734290B (en) * 2021-01-25 2022-02-11 腾讯科技(深圳)有限公司 Vehicle motion state evaluation method, device, equipment and medium
CN113741438B (en) * 2021-08-20 2024-03-26 上海高仙自动化科技发展有限公司 Path planning method, path planning device, storage medium, chip and robot
CN116797463B (en) * 2023-08-22 2023-11-21 佗道医疗科技有限公司 Feature point pair extraction method and image stitching method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104680559A (en) * 2015-03-20 2015-06-03 青岛科技大学 Multi-view indoor pedestrian tracking method based on movement behavior mode
CN105184786A (en) * 2015-08-28 2015-12-23 大连理工大学 Floating-point-based triangle characteristic description method
CN106940186A (en) * 2017-02-16 2017-07-11 华中科技大学 A kind of robot autonomous localization and air navigation aid and system
CN107229935A (en) * 2017-05-16 2017-10-03 大连理工大学 A kind of binary system of triangle character describes method
CN107403451A (en) * 2017-06-16 2017-11-28 西安电子科技大学 Adaptive binary feature monocular vision odometer method and computer, robot
CN108168548A (en) * 2018-02-13 2018-06-15 南京师范大学 A kind of pedestrian's inertial navigation system and method by machine learning algorithm and model-aided

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8174568B2 (en) * 2006-12-01 2012-05-08 Sri International Unified framework for precise vision-aided navigation
US9965689B2 (en) * 2016-06-09 2018-05-08 Qualcomm Incorporated Geometric matching in visual navigation systems

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104680559A (en) * 2015-03-20 2015-06-03 青岛科技大学 Multi-view indoor pedestrian tracking method based on movement behavior mode
CN105184786A (en) * 2015-08-28 2015-12-23 大连理工大学 Floating-point-based triangle characteristic description method
CN106940186A (en) * 2017-02-16 2017-07-11 华中科技大学 A kind of robot autonomous localization and air navigation aid and system
CN107229935A (en) * 2017-05-16 2017-10-03 大连理工大学 A kind of binary system of triangle character describes method
CN107403451A (en) * 2017-06-16 2017-11-28 西安电子科技大学 Adaptive binary feature monocular vision odometer method and computer, robot
CN108168548A (en) * 2018-02-13 2018-06-15 南京师范大学 A kind of pedestrian's inertial navigation system and method by machine learning algorithm and model-aided

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"VISUAL NAVIGATION IN UNMANNED AIR VEHICLES WITH SIMULTANEOUS LOCATION AND MAPPING (SLAM)";Xiaodong Li;《http://dspace.lib.cranfield.ac.uk/handle/1826/8644》;20140815;第1-206页 *
"户外环境下移动机器人视觉里程计技术研究";李宇波;《中国优秀硕士学位论文全文数据库 信息科技辑》;20141215(第12期);第2页第1节,第9,15-16页第2节,第23-39页第3节 *
"自适应零速修正辅助的微惯性定位系统研究";王云涛等;《南京师范大学学报(工程技术版)》;20171231;第17卷(第4期);第14-19页 *

Also Published As

Publication number Publication date
CN109146972A (en) 2019-01-04

Similar Documents

Publication Publication Date Title
CN109146972B (en) Visual navigation method based on rapid feature point extraction and gridding triangle constraint
CN107844750B (en) Water surface panoramic image target detection and identification method
US20210390329A1 (en) Image processing method, device, movable platform, unmanned aerial vehicle, and storage medium
CN106780631B (en) Robot closed-loop detection method based on deep learning
CN112270249A (en) Target pose estimation method fusing RGB-D visual features
US7376262B2 (en) Method of three dimensional positioning using feature matching
CN107590827A (en) A kind of indoor mobile robot vision SLAM methods based on Kinect
CN106780484A (en) Robot interframe position and orientation estimation method based on convolutional neural networks Feature Descriptor
CN102236794A (en) Recognition and pose determination of 3D objects in 3D scenes
CN107818598B (en) Three-dimensional point cloud map fusion method based on visual correction
CN110097498B (en) Multi-flight-zone image splicing and positioning method based on unmanned aerial vehicle flight path constraint
CN113269094B (en) Laser SLAM system and method based on feature extraction algorithm and key frame
CN111145228A (en) Heterogeneous image registration method based on local contour point and shape feature fusion
Andreasson et al. Mini-SLAM: Minimalistic visual SLAM in large-scale environments based on a new interpretation of image similarity
Meng et al. Efficient and reliable LiDAR-based global localization of mobile robots using multiscale/resolution maps
CN112883850A (en) Multi-view aerospace remote sensing image matching method based on convolutional neural network
CN111024082A (en) Method and device for planning local path of robot and robot
Nalpantidis et al. Stereovision-based fuzzy obstacle avoidance method
CN113378694B (en) Method and device for generating target detection and positioning system and target detection and positioning
CN110610130A (en) Multi-sensor information fusion power transmission line robot navigation method and system
Nalpantidis et al. Stereovision-based algorithm for obstacle avoidance
CN116246119A (en) 3D target detection method, electronic device and storage medium
CN112348853B (en) Particle filter tracking method based on infrared saliency feature fusion
CN113916223B (en) Positioning method and device, equipment and storage medium
CN113240721A (en) Navigation vanishing point detection method applied to complex scene of roadway

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant