CN110689578A - Unmanned aerial vehicle obstacle identification method based on monocular vision

Unmanned aerial vehicle obstacle identification method based on monocular vision

Info

Publication number
CN110689578A
CN110689578A
Authority
CN
China
Prior art keywords
image
point
unmanned aerial vehicle
matching
Prior art date
Legal status
Pending
Application number
CN201910963529.6A
Other languages
Chinese (zh)
Inventor
倪晓军
马浩森
Current Assignee
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN201910963529.6A
Publication of CN110689578A
Legal status: Pending


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/70: Determining position or orientation of objects or cameras
    • G06T7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/74: Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/30: Subject of image; Context of image processing
    • G06T2207/30248: Vehicle exterior or interior
    • G06T2207/30252: Vehicle exterior; Vicinity of vehicle
    • G06T2207/30261: Obstacle

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an unmanned aerial vehicle obstacle identification method based on monocular vision, which comprises the following processes: acquiring an image in front of the unmanned aerial vehicle with a monocular camera and converting it into a grayscale image; extracting feature points from the grayscale image; acquiring another image in front of the unmanned aerial vehicle with the monocular camera, converting it into a grayscale image, and extracting feature points from it; matching the feature points of the two images, each pair of matched feature points forming a matching point pair; and traversing the matching point pairs to judge whether the obstacle corresponding to each feature point is within the safe distance of the unmanned aerial vehicle. The method has a low computational load, a greatly increased computation speed and strong real-time performance; it remains effective when the unmanned aerial vehicle flies at high speed and can shorten the reaction time needed when the unmanned aerial vehicle encounters a sudden obstacle.

Description

Unmanned aerial vehicle obstacle identification method based on monocular vision
Technical Field
The invention belongs to the technical field of unmanned aerial vehicle image processing, and particularly relates to an unmanned aerial vehicle obstacle identification method based on monocular vision.
Background
With the development of unmanned aerial vehicles in recent years, the automatic cruise technology of unmanned aerial vehicles has found an increasingly wide range of applications. An unmanned aerial vehicle must be able to automatically identify and avoid obstacles during automatic cruising. Traditional obstacle avoidance methods are mostly based on various sensors, so the unmanned aerial vehicle must carry dedicated modules, which increases cost and power consumption and shortens endurance. With the development of image processing technology in recent years, obstacle avoidance methods based on machine vision have begun to rise; since existing unmanned aerial vehicles are usually already equipped with cameras, such methods need no extra equipment. Common visual obstacle avoidance methods include monocular vision and binocular vision. Binocular vision has high hardware cost, a large amount of data to process at a time, and low efficiency; the various monocular vision methods also have their own drawbacks and still need further improvement and refinement.
At present, the common monocular vision obstacle identification method mainly extracts and matches feature points of two consecutive images, and computes the distance according to the triangle similarity theorem by comparing the position change of the matching point pairs with the distance the unmanned aerial vehicle moves during the interval between the two images. Common image feature algorithms include the Scale Invariant Feature Transform (SIFT) algorithm and the Speeded Up Robust Features (SURF) algorithm, but both take a long time to compute, and since unmanned aerial vehicle obstacle avoidance has strict real-time requirements, they are unsuitable. The ORB (Oriented FAST and Rotated BRIEF) algorithm, proposed by Ethan Rublee et al. in 2011, is roughly 100 times faster than the SIFT algorithm and 10 times faster than the SURF algorithm. However, the ORB algorithm also has a disadvantage: it has no scale invariance, and as the unmanned aerial vehicle approaches an obstacle the imaged size of the obstacle necessarily changes, so the ORB algorithm must be improved before it can be applied to unmanned aerial vehicle obstacle avoidance.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an unmanned aerial vehicle obstacle identification method based on monocular vision that overcomes the lack of scale invariance of the original ORB algorithm and can be applied to unmanned aerial vehicle obstacle avoidance.
In order to solve the technical problem, the invention provides an unmanned aerial vehicle obstacle identification method based on monocular vision, characterized by comprising the following processes:
acquiring an image in front of the unmanned aerial vehicle with a monocular camera and converting it into a grayscale image; extracting feature points from the grayscale image;
acquiring another image in front of the unmanned aerial vehicle with the monocular camera, converting it into a grayscale image, and extracting feature points from it;
matching the feature points of the two images, each pair of matched feature points forming a matching point pair;
traversing the matching point pairs, and judging whether the obstacle corresponding to each feature point is within the safe distance of the unmanned aerial vehicle.
Further, after the image is converted into a grayscale image, Gaussian filtering is applied to the grayscale image.
Further, the Gaussian filtering of the grayscale image comprises:
performing Gaussian filtering on the grayscale image with a box filter.
Further, extracting feature points from the grayscale image comprises the following processes:
obtaining response maps of the grayscale image at different scales to construct a scale space pyramid,
and extracting feature points by comparing the response values of points across the response maps in the scale space pyramid.
Further, obtaining response maps of the grayscale image at different scales to construct a scale space pyramid comprises:
filtering the grayscale image with several box filters of different sizes, the response maps generated by filtering forming a group, and several groups of response maps forming the scale space pyramid.
Further, extracting feature points by comparing response values of points in the response maps of the scale space pyramid comprises:
the scale space pyramid uses 3 groups of 4 layers each; in each group two triples of adjacent response-map layers are selected, namely layers 1-2-3 and layers 2-3-4 of the first, second and third groups; for each point of a middle layer, the 26 points surrounding it in scale space are selected and its response value is compared with theirs; if its response value is greater than that of all 26 surrounding points, the point is a feature point.
Further, matching the feature points of the two images and forming matching point pairs from the matched feature points comprises:
obtaining the BRIEF descriptor of each feature point;
calculating the Hamming similarity between the BRIEF descriptors of the feature points;
taking the feature point with the highest Hamming similarity as the matching point, the two feature points forming a matching point pair.
Further, obtaining the BRIEF descriptor of each feature point comprises:
randomly selecting N point pairs (X_i, Y_i), 1 ≤ i ≤ N, in the neighborhood of a feature point, and defining the 2 × N matrix
S = [X_1 X_2 ... X_N; Y_1 Y_2 ... Y_N],
using the feature point direction θ to generate the rotation matrix
R_θ = [cos θ, −sin θ; sin θ, cos θ],
rotating the matrix S clockwise by the angle θ to obtain S_θ = R_θ S,
and performing the τ test on S_θ to generate an n-bit binary code as the BRIEF descriptor of the feature point.
Further, after matching is completed, the maximum and minimum Euclidean distances of the matching point pairs are counted, a threshold is set from the maximum and minimum values, and matching point pairs whose Euclidean distance exceeds the threshold are removed; the matching pairs are then refined with the RANSAC (random sample consensus) algorithm to eliminate mismatches.
Further, judging whether the obstacle corresponding to a feature point is within the safe distance of the unmanned aerial vehicle comprises:
traversing each pair of matched points KP(i, index(j)), KP(i+1, index(j)), which respectively denote the j-th matching point pair of the two images P_i and P_{i+1}, where index(j) denotes the serial number of the feature point in the image; taking the local image of the k × k neighborhood centered on KP(i, index(j)) in P_i as template tmp1, then enlarging tmp1 by the factor scale; taking the local image of the k × k neighborhood centered on KP(i+1, index(j)) in P_{i+1} as template tmp2; matching tmp1 and tmp2 with a template matching algorithm to obtain the similarity TMscale, where a smaller TMscale value means tmp1 and tmp2 are more similar; gradually increasing scale and repeating the matching step, and recording the scale value at which TMscale is minimal as scalemin, which can be regarded as the size ratio (i.e. multiple) of the obstacle represented by the matching point pair in the two images; if scalemin is greater than the threshold, the object corresponding to the feature point is less than d from the monocular camera.
Compared with the prior art, the invention has the following beneficial effects: the computational load of the invention is lower than that of the SIFT and SURF algorithms, so it can be integrated in a flight control system with a higher-performance flight control chip, without transmitting images to a ground station for processing and sending control information back to the flight controller, and without installing additional computing equipment; at the same time, because the computational load is low, the computation speed is greatly increased and the real-time performance is strong, so the method remains effective when the unmanned aerial vehicle flies at high speed and can shorten the reaction time needed when the unmanned aerial vehicle encounters a sudden obstacle.
Drawings
Fig. 1 is a flow chart of an unmanned aerial vehicle obstacle identification method based on monocular vision according to the present invention;
FIG. 2 is a box filter template (9 × 9) for implementing approximate Gaussian filtering according to the present invention;
FIG. 3 is a box filter template (15 × 15) for implementing approximate Gaussian filtering according to the present invention;
fig. 4 is a schematic diagram of pinhole imaging of an obstacle during movement of the camera according to the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
The invention discloses an unmanned aerial vehicle obstacle identification method based on monocular vision which, as shown in FIG. 1, comprises the following processes:
Step 1): the unmanned aerial vehicle acquires an image directly ahead through a monocular camera fixed at the front of the fuselage (the optical axis of the monocular camera is aligned with the direction of travel of the unmanned aerial vehicle), selects the i-th acquired image, performs graying on it, and converts it into the grayscale image P_i, where P_i denotes the i-th grayscale image.
Step 2): apply Gaussian filtering to the grayscale image P_i obtained in step 1) to reduce noise; the processing of step 3) then gives the extracted feature points scale invariance.
The specific method of Gaussian filtering is as follows: calculate the determinant of the Hessian matrix at each pixel of the grayscale image P_i obtained in step 1). To simplify the calculation and increase speed, the box filters used by the SURF algorithm are adopted to approximate Gaussian filtering, and the filtered result is used as the response value of the pixel for the subsequent calculation.
The approximate formula for the determinant of the Hessian matrix is det(H) = D_xx · D_yy − (0.9 · D_xy)^2, where 0.9 is an empirical value obtained experimentally. FIG. 2 shows, in sequence, the box filters corresponding to D_xx, D_yy and D_xy (each a 9 × 9 matrix); the color of each cell represents its weight in the matrix represented by the box filter. For D_xx and D_yy the white cells are 1, the black cells are −2 and the gray cells are 0; for D_xy the white cells are 1, the black cells are −1 and the gray cells are 0. In the approximate formula, D_xx, D_yy and D_xy denote the result of convolving the corresponding box filter with the grayscale image P_i at the pixel. The image generated once all pixels of P_i have been processed is the response map.
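By way of illustration, this response-map computation can be sketched in Python with OpenCV and NumPy (the description itself calls out OpenCV for later steps). The kernel layout follows the FIG. 2 description (white = 1, black = -2 for D_xx/D_yy and -1 for D_xy, gray = 0); the function name, the area normalization and the use of filter2D instead of an integral image are illustrative assumptions, not the patented implementation.

    import cv2
    import numpy as np

    def hessian_response(gray, size=9):
        # Approximate det(Hessian) response map with SURF-style box filters;
        # a sketch of step 2), kernel layout per FIG. 2 (the 9 x 9 case).
        lobe = size // 3                       # 3 for the 9 x 9 filter
        border = (size - (2 * lobe - 1)) // 2  # lobes are 5 px wide at 9 x 9
        # D_yy: stacked lobes, white (+1), black (-2), white (+1)
        dyy = np.zeros((size, size), np.float32)
        dyy[0:lobe, border:size - border] = 1
        dyy[lobe:2 * lobe, border:size - border] = -2
        dyy[2 * lobe:, border:size - border] = 1
        dxx = np.ascontiguousarray(dyy.T)      # D_xx is the transpose
        # D_xy: four diagonal blocks (+1, -1, -1, +1)
        dxy = np.zeros((size, size), np.float32)
        dxy[1:1 + lobe, 1:1 + lobe] = 1
        dxy[1:1 + lobe, size - 1 - lobe:size - 1] = -1
        dxy[size - 1 - lobe:size - 1, 1:1 + lobe] = -1
        dxy[size - 1 - lobe:size - 1, size - 1 - lobe:size - 1] = 1
        g = gray.astype(np.float32)
        w = 1.0 / (size * size)                # normalize across filter sizes
        Dxx = cv2.filter2D(g, -1, dxx * w)     # kernels are 180-deg symmetric,
        Dyy = cv2.filter2D(g, -1, dyy * w)     # so correlation == convolution
        Dxy = cv2.filter2D(g, -1, dxy * w)
        return Dxx * Dyy - (0.9 * Dxy) ** 2    # det(H) with the 0.9 correction

A real implementation would compute the box sums through an integral image, which is exactly what makes the response O(1) per pixel regardless of filter size, as noted in step 3).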
Step 3): construct the scale space and extract the feature points.
When the unmanned aerial vehicle approaches an obstacle, the imaged size of the obstacle in the monocular camera changes, i.e. its scale changes. If feature points of the same obstacle are to be extracted and matched at different scales, a scale space pyramid of the image must be established. The method obtains response maps of the image at different scales indirectly, by continually increasing the size of the box filter, to construct the scale space pyramid, and feature points are extracted by comparing the response maps at different scales.
The specific method for constructing the scale space pyramid is as follows: filter P_i with several box filters of different sizes, the response maps generated after filtering forming a group; several groups of response maps form the scale space pyramid.
The scale space pyramid of the invention uses 3 groups of 4 layers each. The box filter sizes of the four layers of the first group are 9 × 9, 15 × 15, 21 × 21 and 27 × 27: the side length of the black and white lobes increases by 2 pixels from one size to the next so that a central pixel always exists. The 15 × 15 box filter can be obtained by referring to FIG. 3, and box filters of the other sizes follow the progression between FIG. 2 and FIG. 3. The other groups are handled in a similar way, with the filter size increment doubling from group to group (6, 12, 24, 36, ...). The filter sizes of the second group are 15 × 15, 27 × 27, 39 × 39, 51 × 51; those of the third group are 27 × 27, 51 × 51, 75 × 75, 99 × 99. (Note: through the integral image, the response value of each pixel can be computed with a fixed number of additions and subtractions in O(1) time, so increasing the size of the box filter does not increase the amount of computation.)
In each group, triples of adjacent response-map layers are selected, namely layers 1-2-3 and layers 2-3-4 of the first, second and third groups. For each point of a middle layer (layers 2 and 3; layers 1 and 4 have only one adjacent layer each and are therefore discarded), the 26 surrounding points in scale space are selected, namely the 8 surrounding points in the same layer plus the point at the same position and its surrounding 8 points in each of the two adjacent layers, 26 points in total. If the response value of the point is greater than that of all 26 others, it is selected as a feature point, i.e. a point with a salient feature in P_i; such features are often obvious structures such as contour edges and corner points of objects imaged in the monocular camera.
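A minimal sketch of this 26-neighbor comparison follows, assuming the three same-sized response maps of one pyramid triple are already available; the function name and the response threshold are illustrative.

    import numpy as np

    def local_maxima_3d(lower, mid, upper, threshold=0.0):
        # A point of the middle layer is kept as a feature point when its
        # response beats its 26 scale-space neighbors: 8 in the same layer
        # plus 9 in each adjacent layer (step 3); one triple, no pyramid loop.
        h, w = mid.shape
        points = []
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                v = mid[y, x]
                if v <= threshold:
                    continue
                cube = np.stack([lower[y - 1:y + 2, x - 1:x + 2],
                                 mid[y - 1:y + 2, x - 1:x + 2],
                                 upper[y - 1:y + 2, x - 1:x + 2]])
                # v sits at cube[1, 1, 1]; require a strict, unique maximum
                if v >= cube.max() and (cube == v).sum() == 1:
                    points.append((x, y))
        return points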
Step 4): the direction of the feature points is calculated.
First the moments of each feature point selected in step 3) are calculated: m_pq = Σ_{x,y∈B} x^p y^q I(x, y), with p, q ∈ {0, 1}, where B is the image region centered on the feature point within radius r (the specific value of r is set according to the image size and the required computation speed), x and y are the coordinates of points within the region B, and I(x, y) is the image gray level at that point. The image centroid is then found from the moments, C = (m_10/m_00, m_01/m_00), and the direction of the vector from the feature point to the centroid is the direction of the feature point, θ = arctan(m_01/m_10).
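The intensity-centroid direction can be sketched as below; r = 15 is an assumed value, the keypoint is assumed to lie at least r pixels from the image border, and arctan2 is used as the quadrant-aware form of arctan(m_01/m_10).

    import numpy as np

    def orientation(gray, x, y, r=15):
        # m_pq = sum over B of x^p y^q I(x, y), on a circular region B (step 4)
        ys, xs = np.mgrid[-r:r + 1, -r:r + 1]
        mask = xs ** 2 + ys ** 2 <= r * r
        patch = gray[y - r:y + r + 1, x - r:x + r + 1].astype(np.float32) * mask
        m10 = (xs * patch).sum()
        m01 = (ys * patch).sum()
        return np.arctan2(m01, m10)            # direction θ of the feature point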
Step 5): calculate the BRIEF descriptor of each feature point using the BRIEF method.
According to the BRIEF (Binary Robust Independent Elementary Features) method, 2 × n pixel points are randomly selected from the neighborhood of a feature point, their gray values are compared in pairs, and a binary string of length n is generated from the comparison results; this string is the BRIEF descriptor of the feature point.
In the s × s neighborhood P centered on the feature point (an image window of size s × s; the size of the neighborhood is set according to the image size and the length of the BRIEF descriptor), define the τ test (i.e. the gray-value comparison that produces one result bit):

τ(P; X, Y) = 1 if I(X) < I(Y), and 0 otherwise,

where I(X) is the gray value of the pixel at point X in the neighborhood P and I(Y) is the gray value at point Y. Randomly select n point pairs in the s × s neighborhood P centered on the feature point and apply the τ test to each pair in turn, generating a binary string of length n: the BRIEF descriptor of the feature point.
In order to obtain rotation invariance, the BRIEF method needs some improvement: randomly select N point pairs (X_i, Y_i), 1 ≤ i ≤ N, in the neighborhood of the feature point, and define the 2 × N matrix

S = [X_1 X_2 ... X_N; Y_1 Y_2 ... Y_N].

Using the feature point direction θ computed in step 4), generate the rotation matrix

R_θ = [cos θ, −sin θ; sin θ, cos θ],

and rotate the matrix S clockwise by the angle θ: S_θ = R_θ S. In this way the descriptor remains unchanged when the original image is rotated, because the sampling pattern is rotated along with the feature point direction (for example, if the direction of a feature point in the original image is 45°, S_θ1 is S rotated clockwise by 45°; if the original image is then rotated 30° counterclockwise, the direction of the feature point becomes 75° (45 + 30), and S_θ2 is S rotated clockwise by 75°, so the two matrices are rotated in the same relative way). The τ test is applied to S_θ, generating a rotation-invariant n-bit binary code as the BRIEF descriptor of the feature point, where n is usually set to 64, 128 or 256.
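A sketch of the steered τ test, assuming the N sampling pairs were already drawn in the s × s neighborhood and that all rotated points fall inside the image; pairs holds per-pair offsets (x1, y1, x2, y2) relative to the keypoint, and all names are illustrative.

    import numpy as np

    def steered_brief(gray, kx, ky, theta, pairs):
        # Rotate the sampling pattern by the keypoint direction (S_θ = R_θ S),
        # then run the τ test on each rotated pair (step 5).
        c, s = np.cos(theta), np.sin(theta)
        R = np.array([[c, -s], [s, c]])        # rotation matrix R_θ
        bits = []
        for x1, y1, x2, y2 in pairs:
            rx1, ry1 = R @ (x1, y1)
            rx2, ry2 = R @ (x2, y2)
            a = gray[int(ky + ry1), int(kx + rx1)]
            b = gray[int(ky + ry2), int(kx + rx2)]
            bits.append(1 if a < b else 0)     # τ test on one point pair
        return np.packbits(bits)               # n-bit binary descriptor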
Step 6): acquire the (i+1)-th image collected by the monocular camera, convert it into the grayscale image P_{i+1}, and repeat steps 2) to 5) on P_{i+1} to obtain the feature points of P_{i+1} and their BRIEF descriptors; then go to step 7).
Step 7): match the feature points, i.e. find the feature points of the same object in the two images.
Take all feature points of P_i obtained in step 3) as feature point queue 1 to be matched, and all feature points of P_{i+1} obtained in step 6) as feature point queue 2 to be matched. Take a feature point from queue 1 and compare each feature point in queue 2 with it. The comparison method is as follows:
compare the BRIEF descriptor of the feature point bit by bit with the BRIEF descriptor of every feature point in queue 2 and calculate the Hamming similarity between them (traverse the two BRIEF descriptors bit by bit, count the positions at which the two descriptors hold the same character, and after the traversal take the percentage of identical bits in the total number of bits; this percentage is the Hamming similarity). Take the feature point in queue 2 with the highest Hamming similarity as the matching point of the feature point; the two feature points form a matching point pair. If the Hamming similarity of the matching point pair is above a threshold (set according to conditions such as the image size and the number of feature points, with the aim of reducing mismatches and the amount of computation in the next step), the match succeeds; otherwise the match fails and the feature point has no matching point. In either case the two feature points are removed from queues 1 and 2 respectively; then another feature point is taken from queue 1 and the above steps are repeated until queue 1 or queue 2 is empty.
After matching is finished, count the maximum and minimum Euclidean distances of the matching point pairs, set a threshold from the maximum and minimum values, and remove matching point pairs whose Euclidean distance exceeds the threshold (a Euclidean distance above the threshold means the two points are too far apart, so the pair can be considered a mismatch); then refine the matching pairs with the RANSAC (random sample consensus) algorithm (this can be implemented by directly calling the relevant OpenCV method) to eliminate mismatches.
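The matching chain of step 7) maps naturally onto OpenCV calls, sketched below under stated assumptions: the cv2.NORM_HAMMING distance replaces the percentage similarity of the text, the 0.4 factor in the max/min distance cut-off is an assumed choice (the patent only says a threshold is set from the maximum and minimum), and findHomography stands in for the unspecified RANSAC model.

    import cv2
    import numpy as np

    def match_and_filter(kps1, desc1, kps2, desc2):
        # Brute-force Hamming matching, max/min distance cut-off, then RANSAC
        # refinement, mirroring step 7); a sketch only.
        bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = bf.match(desc1, desc2)
        dists = [m.distance for m in matches]
        cutoff = min(dists) + 0.4 * (max(dists) - min(dists))   # assumed rule
        good = [m for m in matches if m.distance <= cutoff]
        src = np.float32([kps1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
        dst = np.float32([kps2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
        # findHomography needs at least 4 pairs; mask flags the RANSAC inliers
        _, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
        return [m for m, ok in zip(good, mask.ravel()) if ok]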
Step 8): judge whether the obstacle corresponding to each feature point is within the safe distance of the monocular camera.
Under the pinhole imaging model of the camera, the imaging relationship between P_i and P_{i+1} is shown in FIG. 4, where f denotes the focal length of the camera, l_1 and l_2 denote the imaged size of the obstacle in the first and second image respectively, d_1 and d_2 denote the distance between the monocular camera and the obstacle at the moments P_i and P_{i+1} were taken, and L denotes the actual size of the obstacle.
From the triangle similarity theorem, l_1/L = f/d_1 and l_2/L = f/d_2, from which

d_1 = f·L/l_1, d_2 = f·L/l_2.

Let the size ratio of the obstacle's image between the two images be multiple; then

multiple = l_2/l_1 = d_1/d_2.

Let the time interval between the two shots be t and the flying speed of the unmanned aerial vehicle during t be v. Then d_1 − d_2 = v × t, and substituting into the formula above gives

d_2 = (v × t)/(multiple − 1).

Suppose obstacles within a distance d (in m) of the unmanned aerial vehicle must be detected, i.e. feature points within distance d of the unmanned aerial vehicle must be screened out (such feature points are often the contour edges, corner points, etc. of an object, so screening out the matching point pairs that satisfy the condition approximately yields the contour of the obstacle). Then the size ratio multiple of the obstacle between the two images must satisfy:

(v × t)/(multiple − 1) ≤ d, i.e. multiple ≥ 1 + (v × t)/d.
According to the inequality, assuming the minimum speed v of the unmanned aerial vehicle is 1 m/s, the minimum t is 1 s and the minimum obstacle avoidance distance d is 5 m, then multiple ≥ 1.2; that is, when the size ratio of a matching point pair between the two frames is greater than or equal to 1.2, the distance between the obstacle and the unmanned aerial vehicle is less than or equal to 5 m.
The specific judgment process is as follows: traverse each pair of matched points KP(i, index(j)), KP(i+1, index(j)), which respectively denote the j-th matching point pair of the two images P_i and P_{i+1}, where index(j) denotes the serial number of the feature point in the image. Create a template tmp1 centered on KP(i, index(j)): the local image of the k × k neighborhood of KP(i, index(j)) in P_i, where k is set according to the image size and the number of feature points; it is suggested to be a multiple of 10 (so k is even and the point is only approximately at the center). Then enlarge tmp1 by the factor scale (initially 1.1; this can be implemented directly with the relevant OpenCV method). Create a template tmp2 centered on KP(i+1, index(j)) in the same way, and match tmp1 and tmp2 with a template matching algorithm to obtain the similarity TMscale (this can be implemented by directly calling the relevant OpenCV method); the smaller the TMscale value, the more similar tmp1 and tmp2 are. Gradually increase scale (by 0.1 each time, up to 1.5), repeat the matching step, and record the scale value at which TMscale is minimal as scalemin; this value can be regarded as the size ratio (i.e. multiple) of the obstacle represented by the matching point pair in the two images. If scalemin is greater than the threshold (1.2 in this example), the object corresponding to the feature point is less than d (in m; 5 m in this example) from the monocular camera.
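A sketch of this per-pair scale search with OpenCV template matching, using the suggested parameter values (k a multiple of 10, scale from 1.1 to 1.5 in steps of 0.1, threshold 1.2); TM_SQDIFF_NORMED is an assumed choice of method whose score, like TMscale, decreases as the patches become more similar.

    import cv2
    import numpy as np

    def scalemin_of_pair(P_i, P_i1, kp1, kp2, k=20, threshold=1.2):
        # Grow tmp1 and keep the scale at which it best matches tmp2 (step 8);
        # scalemin >= threshold means the obstacle is within distance d.
        x1, y1 = int(kp1[0]), int(kp1[1])
        x2, y2 = int(kp2[0]), int(kp2[1])
        h = k // 2
        tmp1 = P_i[y1 - h:y1 + h, x1 - h:x1 + h]      # k x k neighborhood
        tmp2 = P_i1[y2 - h:y2 + h, x2 - h:x2 + h]
        best_scale, best_score = None, np.inf
        for scale in np.arange(1.1, 1.51, 0.1):       # 1.1, 1.2, ..., 1.5
            big = cv2.resize(tmp1, None, fx=scale, fy=scale)
            score = cv2.matchTemplate(big, tmp2, cv2.TM_SQDIFF_NORMED).min()
            if score < best_score:                    # smaller = more similar
                best_scale, best_score = scale, score
        return best_scale, best_scale >= threshold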
If the flight speed of the unmanned aerial vehicle is low, if the chip used for computation is powerful (so that the shooting interval between two frames is short), or if the set safety distance is large, the multiple value will be too low; the following improvement can then be adopted: acquire the image P_k and extract its feature points, then successively obtain the feature points of the images P_{k+1}, ..., P_{k+m}; match the feature points of P_k and P_{k+m}. To detect obstacles within d m of the unmanned aerial vehicle, the condition

multiple ≥ 1 + (v × m × t)/d

must be satisfied, where m is set according to the magnitudes of v, t and d. For example, if the speed v is 1 m/s, the interval t between two consecutive shots is 0.5 s and the safety distance d is 10 m, then m can be set to 4; finally P_k and P_{k+4} are taken as input and steps 7 and 8 are performed, and when scalemin is greater than 1.2 the obstacle corresponding to the feature point is within 10 m. Then the feature points of the image P_{k+m+1} are obtained and matched with those of P_{k+1}, steps 7 and 8 are performed, and when scalemin is greater than 1.2 the obstacle corresponding to the feature point is within 10 m, and so on.
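The frame gap m of this variant follows directly from the inequality above; a small helper, with assumed names, that reproduces the worked example (v = 1 m/s, t = 0.5 s, d = 10 m gives m = 4):

    import math

    def frame_gap(v, t, d, multiple_min=1.2):
        # multiple >= 1 + v*m*t/d  =>  m >= d*(multiple_min - 1)/(v*t)
        return math.ceil(d * (multiple_min - 1) / (v * t))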
The above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several modifications and variations without departing from the technical principle of the present invention, and such modifications and variations should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. An unmanned aerial vehicle obstacle identification method based on monocular vision is characterized by comprising the following processes:
acquiring an image in front of the unmanned aerial vehicle with a monocular camera and converting it into a grayscale image; extracting feature points from the grayscale image;
acquiring another image in front of the unmanned aerial vehicle with the monocular camera, converting it into a grayscale image, and extracting feature points from it;
matching the feature points of the two images, each pair of matched feature points forming a matching point pair;
traversing the matching point pairs, and judging whether the obstacle corresponding to each feature point is within the safe distance of the unmanned aerial vehicle.
2. The unmanned aerial vehicle obstacle identification method based on monocular vision according to claim 1, characterized in that after the image is converted into the grayscale image, Gaussian filtering is applied to the grayscale image.
3. The unmanned aerial vehicle obstacle identification method based on monocular vision according to claim 2, characterized in that the Gaussian filtering of the grayscale image comprises:
performing Gaussian filtering on the grayscale image with a box filter.
4. The unmanned aerial vehicle obstacle identification method based on monocular vision according to claim 1, characterized in that extracting feature points from the grayscale image comprises the following processes:
obtaining response maps of the grayscale image at different scales to construct a scale space pyramid,
and extracting feature points by comparing the response values of points across the response maps in the scale space pyramid.
5. The unmanned aerial vehicle obstacle identification method based on monocular vision according to claim 4, characterized in that obtaining response maps of the grayscale image at different scales to construct a scale space pyramid comprises:
filtering the grayscale image with several box filters of different sizes, the response maps generated by filtering forming a group, and several groups of response maps forming the scale space pyramid.
6. The unmanned aerial vehicle obstacle identification method based on monocular vision according to claim 4, characterized in that extracting feature points by comparing response values of points in the response maps of the scale space pyramid comprises:
the scale space pyramid uses 3 groups of 4 layers each; in each group two triples of adjacent response-map layers are selected, namely layers 1-2-3 and layers 2-3-4 of the first, second and third groups; for each point of a middle layer, the 26 points surrounding it in scale space are selected and its response value is compared with theirs; if its response value is greater than that of all 26 surrounding points, the point is a feature point.
7. The unmanned aerial vehicle obstacle identification method based on monocular vision according to claim 1, characterized in that matching the feature points of the two images and forming matching point pairs from the matched feature points comprises:
obtaining the BRIEF descriptor of each feature point;
calculating the Hamming similarity between the BRIEF descriptors of the feature points;
taking the feature point with the highest Hamming similarity as the matching point, the two feature points forming a matching point pair.
8. The unmanned aerial vehicle obstacle identification method based on monocular vision according to claim 7, characterized in that obtaining the BRIEF descriptor of each feature point comprises:
randomly selecting N point pairs (X_i, Y_i), 1 ≤ i ≤ N, in the neighborhood of a feature point, and defining the 2 × N matrix
S = [X_1 X_2 ... X_N; Y_1 Y_2 ... Y_N],
using the feature point direction θ to generate the rotation matrix
R_θ = [cos θ, −sin θ; sin θ, cos θ],
rotating the matrix S clockwise by the angle θ to obtain S_θ = R_θ S,
and performing the τ test on S_θ to generate an n-bit binary code as the BRIEF descriptor of the feature point.
9. The unmanned aerial vehicle obstacle identification method based on monocular vision according to claim 7, characterized in that after matching is completed, the maximum and minimum Euclidean distances of the matching point pairs are counted, a threshold is set from the maximum and minimum values, and matching point pairs whose Euclidean distance exceeds the threshold are removed; the matching pairs are then refined with the RANSAC (random sample consensus) algorithm to eliminate mismatches.
10. The unmanned aerial vehicle obstacle identification method based on monocular vision according to claim 1, characterized in that judging whether the obstacle corresponding to a feature point is within the safe distance of the unmanned aerial vehicle comprises:
traversing each pair of matched points KP(i, index(j)), KP(i+1, index(j)), which respectively denote the j-th matching point pair of the two images P_i and P_{i+1}, where index(j) denotes the serial number of the feature point in the image; taking the local image of the k × k neighborhood centered on KP(i, index(j)) in P_i as template tmp1, then enlarging tmp1 by the factor scale; taking the local image of the k × k neighborhood centered on KP(i+1, index(j)) in P_{i+1} as template tmp2; matching tmp1 and tmp2 with a template matching algorithm to obtain the similarity TMscale, where a smaller TMscale value means tmp1 and tmp2 are more similar; gradually increasing scale and repeating the matching step, and recording the scale value at which TMscale is minimal as scalemin, which can be regarded as the size ratio of the obstacle represented by the matching point pair in the two images; if scalemin is greater than the threshold, the object corresponding to the feature point is less than d from the monocular camera.
CN201910963529.6A (filed 2019-10-11, priority 2019-10-11): Unmanned aerial vehicle obstacle identification method based on monocular vision. Status: Pending. Published as CN110689578A.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910963529.6A CN110689578A (en) 2019-10-11 2019-10-11 Unmanned aerial vehicle obstacle identification method based on monocular vision


Publications (1)

Publication Number Publication Date
CN110689578A 2020-01-14

Family

ID=69112116

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910963529.6A Pending CN110689578A (en) 2019-10-11 2019-10-11 Unmanned aerial vehicle obstacle identification method based on monocular vision

Country Status (1)

Country Link
CN (1) CN110689578A


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104850851A (en) * 2015-04-22 2015-08-19 福州大学 ORB feature point matching method with scale invariance
US20180314903A1 (en) * 2017-05-01 2018-11-01 Intel Corporation Optimized image feature extraction
CN108805812A (en) * 2018-06-04 2018-11-13 东北林业大学 Multiple dimensioned constant ORB algorithms for image mosaic
CN109903338A (en) * 2019-03-14 2019-06-18 中国计量大学 A kind of method for positioning mobile robot based on improvement ORB algorithm

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Lu Jian et al., "Improvement of the ORB algorithm combined with scale-invariant features", Measurement & Control Technology *
Wang Xindong, "Research on UAV obstacle avoidance technology based on machine vision", China Master's Theses Full-text Database, Engineering Science and Technology II *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021217451A1 (en) * 2020-04-28 2021-11-04 深圳市大疆创新科技有限公司 Unmanned aerial vehicle control method, motion information determination method and device, and unmanned aerial vehicle
CN112171675A (en) * 2020-09-28 2021-01-05 深圳市丹芽科技有限公司 Obstacle avoidance method and device for mobile robot, robot and storage medium
CN112171675B (en) * 2020-09-28 2022-06-10 深圳市丹芽科技有限公司 Obstacle avoidance method and device for mobile robot, robot and storage medium
CN112363528A (en) * 2020-10-15 2021-02-12 北京理工大学 Unmanned aerial vehicle anti-interference cluster formation control method based on airborne vision
CN113391642A (en) * 2021-05-28 2021-09-14 西南交通大学 Unmanned aerial vehicle autonomous obstacle avoidance method and system based on monocular vision
CN114715390A (en) * 2022-06-07 2022-07-08 西华大学 Auxiliary unmanned aerial vehicle, emergency rescue system and emergency rescue method

Similar Documents

Publication Publication Date Title
CN110689578A (en) Unmanned aerial vehicle obstacle identification method based on monocular vision
CN108898610B (en) Object contour extraction method based on mask-RCNN
CN112949633B (en) Improved YOLOv 3-based infrared target detection method
CN108334881B (en) License plate recognition method based on deep learning
CN108681718B (en) Unmanned aerial vehicle low-altitude target accurate detection and identification method
CN112233181A (en) 6D pose recognition method and device and computer storage medium
CN109919026B (en) Surface unmanned ship local path planning method
US11887346B2 (en) Systems and methods for image feature extraction
CN109447117B (en) Double-layer license plate recognition method and device, computer equipment and storage medium
CN114693661A (en) Rapid sorting method based on deep learning
CN110852311A (en) Three-dimensional human hand key point positioning method and device
Shah et al. OCR-based chassis-number recognition using artificial neural networks
Xi et al. Multi-task cost-sensitive-convolutional neural network for car detection
CN113888461A (en) Method, system and equipment for detecting defects of hardware parts based on deep learning
CN112417931A (en) Method for detecting and classifying water surface objects based on visual saliency
CN115439743A (en) Method for accurately extracting visual SLAM static characteristics in parking scene
CN115280373A (en) Managing occlusions in twin network tracking using structured dropping
CN112069887A (en) Face recognition method, face recognition device, terminal equipment and storage medium
CN114581887A (en) Method, device and equipment for detecting lane line and computer readable storage medium
Asgarian Dehkordi et al. Vehicle type recognition based on dimension estimation and bag of word classification
CN116363535A (en) Ship detection method in unmanned aerial vehicle aerial image based on convolutional neural network
CN112907972B (en) Road vehicle flow detection method and system based on unmanned aerial vehicle and computer readable storage medium
CN111144361A (en) Road lane detection method based on binaryzation CGAN network
CN115601538A (en) Target detection method, target detector, target detection system, and medium
CN115346209A (en) Motor vehicle three-dimensional target detection method and device and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: No. 66 Xinfan Road, Gulou District, Nanjing City, Jiangsu Province

Applicant after: NANJING UNIVERSITY OF POSTS AND TELECOMMUNICATIONS

Address before: 210023 No. 1 Xichun Road, Yuhuatai District, Nanjing City, Jiangsu Province

Applicant before: NANJING UNIVERSITY OF POSTS AND TELECOMMUNICATIONS

RJ01 Rejection of invention patent application after publication

Application publication date: 20200114