CN116740334A - Unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO - Google Patents

Unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO

Info

Publication number
CN116740334A
CN116740334A
Authority
CN
China
Prior art keywords
unmanned aerial
aerial vehicle
target
camera
cameras
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310743710.2A
Other languages
Chinese (zh)
Other versions
CN116740334B (en)
Inventor
冉宁
张家明
郝晋渊
张照彦
张少康
郝真鸣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hebei University
Original Assignee
Hebei University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hebei University filed Critical Hebei University
Priority to CN202310743710.2A priority Critical patent/CN116740334B/en
Publication of CN116740334A publication Critical patent/CN116740334A/en
Application granted granted Critical
Publication of CN116740334B publication Critical patent/CN116740334B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/22 - Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V10/40 - Extraction of image or video features
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 - Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 - Proximity, similarity or dissimilarity measures
    • G06V10/82 - Arrangements for image or video recognition or understanding using neural networks
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G06V20/41 - Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/50 - Context or environment of the image
    • G06V20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/60 - Type of objects
    • G06V20/64 - Three-dimensional objects
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 - Road transport of goods or passengers
    • Y02T10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T10/40 - Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Computational Linguistics (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to the technical field of unmanned aerial vehicle positioning, and in particular to an unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO. The method comprises: arranging two cameras in parallel, left and right, at the region to be monitored, and calibrating them to obtain the internal and external parameters of the two cameras; reading the right-camera image, extracting the unmanned aerial vehicle target in it with the improved model, recording the pixel coordinates and size of the target in the right camera, and generating a mask image of the target; locating the unmanned aerial vehicle target in the left camera through the mask image, and recording its pixel coordinates in the left camera; and, from the pixel coordinates of the unmanned aerial vehicle in the left and right cameras, using the imaging principle of binocular cameras and the internal and external parameters of the two cameras to obtain the spatial three-dimensional coordinates of the unmanned aerial vehicle target. The improved YOLO model effectively reduces the parameter count and the model size, which facilitates running on edge devices, while alleviating the problem of low detection accuracy for small targets.

Description

Unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO
Technical Field
The invention relates to the technical field of unmanned aerial vehicle positioning, in particular to an unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO.
Background
The rapid development of unmanned aerial vehicle technology has led to the wide use of unmanned aerial vehicles in military, civil and other fields. However, as their numbers grow, unmanned aerial vehicle intrusion has become an increasingly serious problem that brings potential safety hazards to society. For example, a drone may be used for spying, monitoring sensitive areas or attacking targets, posing a threat to national security and personal privacy; in cities, a drone's flight altitude and path may bring it into collision with buildings or vehicles, causing accidents. It is therefore highly necessary to detect and locate unmanned aerial vehicle intrusion.
The common unmanned aerial vehicle intrusion detection and positioning methods currently rely on radar, infrared sensors and similar devices for monitoring. However, such devices are costly and susceptible to environmental influences, which limits their effectiveness in practical applications. Image-based unmanned aerial vehicle intrusion detection has therefore attracted growing attention. One-stage detection algorithms train a neural network on the data and labels of a data set, omit the candidate-region generation step, and perform feature extraction, target classification and target regression entirely inside the network, which greatly improves detection speed. However, existing detection models have large parameter counts and high hardware requirements, making them difficult to deploy on edge devices, and their accuracy on small targets remains unsatisfactory.
Therefore, there is a need to provide an unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO, which solves the above technical problems.
Disclosure of Invention
In order to solve the technical problems, the invention provides an unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO.
The invention provides an unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO, which comprises the following steps:
S1, training an improved YOLOv5 model on a public real-world training set and a self-built data set;
S2, arranging two cameras in parallel, left and right, at the region to be monitored, calibrating them to obtain the internal and external parameters of the two cameras, and setting a warning distance between the unmanned aerial vehicle and the binocular system;
S3, reading the right-camera image, extracting the unmanned aerial vehicle target in it with the model obtained in S1, recording the pixel coordinates and size of the target in the right camera, and generating a mask image of the target;
S4, locating the unmanned aerial vehicle target in the left camera using the mask image obtained in S3, and recording the pixel coordinates of the target in the left camera;
S5, from the pixel coordinates of the unmanned aerial vehicle in the left and right cameras obtained in S3 and S4, using the imaging principle of binocular cameras and the internal and external parameters from S2 to obtain the spatial three-dimensional coordinates of the unmanned aerial vehicle target.
Preferably, the improved YOLOv5 model in S1 is obtained as follows: in the Backbone, an SPD module replaces the strided convolution block used for downsampling, ShuffleBlock replaces the C3 module, and the SPPF module is removed; in the Head, the dynamic convolution block ODConv replaces the C3 module. The detection model is trained on the training set until it meets the test requirements, yielding the final detection model.
It should be noted that: the self-built data set is built from self-recorded unmanned aerial vehicle videos. The videos are frame-sampled, 300 frames are randomly extracted and annotated with an image labeling tool. The data set used for model training comprises 51,746 color images of 640×480, each containing zero to three unmanned aerial vehicle targets. Adding the self-built data set introduces small FPV racing drone targets that do not appear in the original data set, enriching the data set.
Preferably, in S2, the specific steps of arranging two cameras in parallel at the left and right of the region to be monitored, obtaining the internal and external parameters of the two cameras, and setting the warning distance are as follows:
S21, arranging the two cameras at the region to be monitored, either as two monocular cameras mounted side by side in parallel or as an integrated binocular camera.
S22, performing binocular calibration on the camera module to obtain the internal and external parameters of the two cameras.
S23, determining the warning distance between the unmanned aerial vehicle and the binocular system.
It should be noted that: the internal parameters should include the focal length, principal point, camera resolution, radial distortion coefficients, tangential distortion coefficients and reprojection error; the external parameters should include the rotation matrix and the translation matrix. The focal length must be the pixel focal length, and the translation between the two cameras must contain an offset along the X axis only.
Preferably, the specific steps in S3 of recording the pixel coordinates and size of the unmanned aerial vehicle target in the right camera and generating the mask image of the target are as follows:
S31, target detection is treated as a regression problem: the input picture is divided into an S×S grid, and if the center of a detected target falls in a cell, that cell is responsible for predicting the target; each cell generates B bounding boxes, each comprising the offset of the object center relative to the cell, the width and height of the box, and the confidence of the target. Non-maximum suppression is applied to the bounding boxes to obtain the upper-left and lower-right pixel coordinates of the unmanned aerial vehicle target.
S32, the region between these two coordinates is cropped from the original image to generate the mask image of the unmanned aerial vehicle target.
Preferably, the specific steps in S4 of locating the unmanned aerial vehicle target in the left camera using the mask image obtained in S3 and recording its pixel coordinates are as follows:
S41, feature extraction: extracting squared-difference features from the mask image obtained in S3.
S42, feature comparison: constructing a sliding window from the size of the mask image and computing its similarity score over the target image.
S43, locating the best match: after the scores are computed, finding the region with the highest score in the target image and recording the pixel coordinates of its upper-left and lower-right corners.
Preferably, the specific steps in S5 of obtaining the spatial three-dimensional coordinates of the unmanned aerial vehicle target are as follows:
S51, converting pixel coordinates in binocular vision into spatial coordinates via triangulation.
S52, denoting the pixel coordinates of the same object in the left- and right-camera images as $(u_l, v_l)$ and $(u_r, v_r)$ respectively, the spatial coordinates $(X, Y, Z)$ of the object are calculated, for the parallel binocular arrangement used here, as

$Z_c = \dfrac{f \cdot T_x}{u_l - u_r}, \quad X_c = \dfrac{(u_l - c_x) \cdot Z_c}{f}, \quad Y_c = \dfrac{(v_l - c_y) \cdot Z_c}{f}, \quad (X, Y, Z) = (X_c + X_0,\; Y_c + Y_0,\; Z_c + Z_0)$

where $T_x$ is the distance (baseline) between the two cameras, $(c_x, c_y)$ are the optical center coordinates of the camera, $f$ is the pixel focal length, and $(X_0, Y_0, Z_0)$ are the position coordinates of the camera. Substituting the internal and external parameter matrices obtained during calibration together with the camera placement coordinates yields the spatial coordinates of the unmanned aerial vehicle, and the distance between the unmanned aerial vehicle and the restricted area follows from the distance formula between three-dimensional coordinates.
Compared with the related art, the unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO has the following beneficial effects:
1. The invention is based on computer vision, imposes few requirements on the environment, and is only slightly affected by environmental interference.
2. The detection method provided by the invention has low equipment cost, is easy to maintain, and has low running cost.
3. The invention uses the improved YOLO model to effectively reduce the parameter count and model size, which facilitates running on edge devices, while alleviating the problem of low detection accuracy for small targets.
Drawings
Fig. 1 is a schematic flow chart of the unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO;
Fig. 2 is a structure diagram of the improved YOLO model;
Fig. 3 is a schematic diagram of the basic structure of ODConv;
Fig. 4 is a schematic diagram of the basic structure of the SPD module.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
A specific implementation of the unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO is described below in connection with specific embodiments.
The invention provides an unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO, which comprises the following steps:
S1, training an improved YOLOv5 model on a public real-world training set and a self-built data set;
S2, arranging two cameras in parallel, left and right, at the region to be monitored, calibrating them to obtain the internal and external parameters of the two cameras, and setting a warning distance between the unmanned aerial vehicle and the binocular system;
S3, reading the right-camera image, extracting the unmanned aerial vehicle target in it with the model obtained in S1, recording the pixel coordinates and size of the target in the right camera, and generating a mask image of the target;
S4, locating the unmanned aerial vehicle target in the left camera using the mask image obtained in S3, and recording the pixel coordinates of the target in the left camera;
S5, from the pixel coordinates of the unmanned aerial vehicle in the left and right cameras obtained in S3 and S4, using the imaging principle of binocular cameras and the internal and external parameters from S2 to obtain the spatial three-dimensional coordinates of the unmanned aerial vehicle target.
Preferably, the improved YOLOv5 model in S1 is obtained as follows: in the Backbone, an SPD module replaces the strided convolution block used for downsampling, ShuffleBlock replaces the C3 module, and the SPPF module is removed; in the Head, the dynamic convolution block ODConv replaces the C3 module. The detection model is trained on the training set until it meets the test requirements, yielding the final detection model.
It should be noted that: the self-built data set consists of self-recorded unmanned aerial vehicle videos covering various unmanned aerial vehicle models and shooting angles, which helps ensure recognition accuracy. The videos are frame-sampled, 300 frames are randomly extracted and annotated with an image labeling tool. The data set used for model training comprises 51,746 color images of 640×480, each containing zero to three unmanned aerial vehicle targets. Adding the self-built data set introduces small FPV racing drone targets that do not appear in the original data set, enriching the data set.
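By way of illustration, a minimal Python sketch of this frame-sampling step is given below; the file paths, frame count, target resolution handling and random seed are illustrative assumptions rather than part of the claimed method:

```python
import random
import cv2  # OpenCV for video decoding

def sample_frames(video_path, out_dir, n_frames=300, seed=0):
    """Randomly extract n_frames frames from a self-recorded UAV video
    for later annotation (paths and counts are illustrative)."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    random.seed(seed)
    picks = sorted(random.sample(range(total), min(n_frames, total)))
    for i, idx in enumerate(picks):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)      # seek to the chosen frame
        ok, frame = cap.read()
        if ok:
            frame = cv2.resize(frame, (640, 480))  # match the 640x480 training size
            cv2.imwrite(f"{out_dir}/frame_{i:04d}.jpg", frame)
    cap.release()
```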
ODConv is a plug-and-play attention module that uses a multidimensional attention mechanism to learn complementary attention along four dimensions of the kernel space through a parallel strategy, as shown in Fig. 3. As a plug-and-play operation, it can easily be embedded into existing CNNs, and experiments show that it improves the performance of both large and lightweight models.
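For illustration only, the following PyTorch sketch captures the idea of omni-dimensional dynamic convolution under simplifying assumptions; the sigmoid/softmax attention heads, reduction ratio of 4, bank of 4 candidate kernels and initialization are our choices, not the original implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimplifiedODConv(nn.Module):
    """Simplified sketch of omni-dimensional dynamic convolution: four
    attentions (spatial, input-channel, output-channel, kernel) are computed
    in parallel from a squeezed descriptor and applied to a bank of
    candidate kernels before a single grouped convolution."""
    def __init__(self, c_in, c_out, k=3, n_kernels=4, reduction=4):
        super().__init__()
        self.k, self.n_kernels, self.c_in, self.c_out = k, n_kernels, c_in, c_out
        hidden = max(c_in // reduction, 4)
        self.squeeze = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                     nn.Linear(c_in, hidden), nn.ReLU())
        # one attention head per attended dimension of the kernel space
        self.att_spatial = nn.Linear(hidden, k * k)
        self.att_cin = nn.Linear(hidden, c_in)
        self.att_cout = nn.Linear(hidden, c_out)
        self.att_kernel = nn.Linear(hidden, n_kernels)
        # candidate kernel bank: (n_kernels, c_out, c_in, k, k)
        self.weight = nn.Parameter(torch.randn(n_kernels, c_out, c_in, k, k) * 0.02)

    def forward(self, x):
        b = x.size(0)
        z = self.squeeze(x)                                    # (b, hidden)
        a_sp = torch.sigmoid(self.att_spatial(z)).view(b, 1, 1, 1, self.k, self.k)
        a_ci = torch.sigmoid(self.att_cin(z)).view(b, 1, 1, self.c_in, 1, 1)
        a_co = torch.sigmoid(self.att_cout(z)).view(b, 1, self.c_out, 1, 1, 1)
        a_kn = torch.softmax(self.att_kernel(z), dim=1).view(b, self.n_kernels, 1, 1, 1, 1)
        # weight the kernel bank per sample and sum over candidate kernels
        w = (a_kn * a_co * a_ci * a_sp * self.weight.unsqueeze(0)).sum(dim=1)
        # grouped-convolution trick: fold the batch into groups so each
        # sample is convolved with its own dynamically generated kernel
        out = F.conv2d(x.reshape(1, b * self.c_in, *x.shape[2:]),
                       w.reshape(b * self.c_out, self.c_in, self.k, self.k),
                       padding=self.k // 2, groups=b)
        return out.reshape(b, self.c_out, *out.shape[2:])
```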
The SPD module was developed by Raja Sunkara's team at the University of Missouri; its structure is shown in Fig. 4. It consists of a space-to-depth (SPD) layer and a non-strided (stride-1) convolution layer, so no learnable information is lost during downsampling, which improves accuracy on small targets.
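A minimal PyTorch sketch of such a space-to-depth downsampling block is given below; the channel sizes, normalization and activation are illustrative assumptions:

```python
import torch
import torch.nn as nn

class SPDConv(nn.Module):
    """Sketch of the space-to-depth downsampling block described above:
    a space-to-depth rearrangement (scale 2) followed by a non-strided
    convolution, so downsampling discards no learnable information."""
    def __init__(self, c_in, c_out, k=3):
        super().__init__()
        # after space-to-depth with scale 2, the channel count grows 4x
        self.conv = nn.Conv2d(4 * c_in, c_out, k, stride=1, padding=k // 2)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        # split each 2x2 spatial block into 4 channel groups (space-to-depth)
        x = torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2],
                       x[..., ::2, 1::2], x[..., 1::2, 1::2]], dim=1)
        return self.act(self.bn(self.conv(x)))
```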
In the embodiment of the present invention, in S2, the specific steps of arranging two cameras in parallel at the left and right of the region to be monitored, obtaining the internal and external parameters of the two cameras, and setting the warning distance are as follows:
S21, arranging the two cameras at the region to be monitored, either as two monocular cameras mounted side by side in parallel or as an integrated binocular camera.
S22, performing binocular calibration on the camera module to obtain the internal and external parameters of the two cameras.
S23, determining the warning distance between the unmanned aerial vehicle and the binocular system.
It should be noted that: the internal parameters should include the focal length, principal point, camera resolution, radial distortion coefficients, tangential distortion coefficients and reprojection error; the external parameters should include the rotation matrix and the translation matrix. The focal length must be the pixel focal length, and the translation between the two cameras must contain an offset along the X axis only.
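By way of illustration, the calibration of S21 and S22 can be sketched with OpenCV as below; the chessboard pattern size, square size and image paths are assumptions made for the example:

```python
import glob
import cv2
import numpy as np

# Minimal sketch of binocular calibration with a chessboard target.
PATTERN, SQUARE = (9, 6), 0.025  # inner corners and square edge in meters (assumed)
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE

obj_pts, left_pts, right_pts = [], [], []
for lf, rf in zip(sorted(glob.glob("calib/left/*.png")),
                  sorted(glob.glob("calib/right/*.png"))):
    gl = cv2.imread(lf, cv2.IMREAD_GRAYSCALE)
    gr = cv2.imread(rf, cv2.IMREAD_GRAYSCALE)
    okl, cl = cv2.findChessboardCorners(gl, PATTERN)
    okr, cr = cv2.findChessboardCorners(gr, PATTERN)
    if okl and okr:
        obj_pts.append(objp); left_pts.append(cl); right_pts.append(cr)

# Intrinsics (pixel focal length, principal point, distortion) per camera,
# then extrinsics (rotation R, translation T) between the two cameras.
_, K1, d1, _, _ = cv2.calibrateCamera(obj_pts, left_pts, gl.shape[::-1], None, None)
_, K2, d2, _, _ = cv2.calibrateCamera(obj_pts, right_pts, gr.shape[::-1], None, None)
err, K1, d1, K2, d2, R, T, E, F = cv2.stereoCalibrate(
    obj_pts, left_pts, right_pts, K1, d1, K2, d2, gl.shape[::-1],
    flags=cv2.CALIB_FIX_INTRINSIC)
# For the parallel arrangement required above, T should reduce to
# (approximately) a pure X-axis offset, which can be checked directly.
print("reprojection error:", err, "baseline Tx:", T[0, 0])
```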
In the embodiment of the present invention, the specific steps in S3 of recording the pixel coordinates and size of the unmanned aerial vehicle target in the right camera and generating the mask image of the target are as follows:
S31, target detection is treated as a regression problem: the input picture is divided into an S×S grid, and if the center of a detected target falls in a cell, that cell is responsible for predicting the target; each cell generates B bounding boxes, each comprising the offset of the object center relative to the cell, the width and height of the box, and the confidence of the target. Non-maximum suppression is applied to the bounding boxes to obtain the upper-left and lower-right pixel coordinates of the unmanned aerial vehicle target.
S32, the region between these two coordinates is cropped from the original image to generate the mask image of the unmanned aerial vehicle target.
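A short Python sketch of the mask-image step in S31 and S32 follows; the box format and helper name are illustrative, not part of the claimed method:

```python
import cv2

def make_target_mask(right_img, box):
    """Crop the detected UAV region from the right-camera image to use as
    the template ('mask image') for matching in the left image.
    `box` is (x1, y1, x2, y2), i.e. the two corner points after NMS."""
    x1, y1, x2, y2 = map(int, box)
    template = right_img[y1:y2, x1:x2].copy()   # region between the two corners
    size = (x2 - x1, y2 - y1)                   # recorded target size (w, h)
    center = ((x1 + x2) // 2, (y1 + y2) // 2)   # recorded pixel coordinates
    return template, size, center
```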
In the embodiment of the present invention, the specific steps in S4 of locating the unmanned aerial vehicle target in the left camera using the mask image obtained in S3 and recording its pixel coordinates are as follows:
S41, feature extraction: extracting squared-difference features from the mask image obtained in S3.
S42, feature comparison: constructing a sliding window from the size of the mask image and computing its similarity score over the target image.
S43, locating the best match: after the scores are computed, finding the region with the highest score in the target image and recording the pixel coordinates of its upper-left and lower-right corners.
It should be noted that: template matching operation is carried out on the mask image, and a matchTemplate function in OpenCV is used for template matching, wherein the matching mode is normalized correlation coefficient matching, and the correlation degree between two variables can be calculated by the following steps:
1. for two variables X and Y, their mean values are calculated asAnd->
2. For X and Y, their standard deviations s are calculated separately x Sum s y
3. Calculating covariance of X and Y
4. Calculating normalized correlation coefficients
Where n represents the sample size. The normalized correlation coefficient has a value ranging between [ -1,1] and can be used for describing the strength and direction of the linear relationship between two variables. When r=1, it means that the two variables exhibit a complete positive correlation; when r= -1, it means that the two variables exhibit a complete negative correlation; when r=0, it means that there is no linear correlation between the two variables. Thereby deriving drone target coordinates in another view.
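This matching mode corresponds to OpenCV's TM_CCOEFF_NORMED method; a minimal sketch of S42 and S43 under that assumption:

```python
import cv2

def locate_in_left(left_img, template):
    """Slide the right-camera template over the left image and take the
    best normalized-correlation-coefficient match (cv2.TM_CCOEFF_NORMED
    implements the r score derived above)."""
    scores = cv2.matchTemplate(left_img, template, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(scores)   # highest-scoring region
    h, w = template.shape[:2]
    top_left = max_loc
    bottom_right = (max_loc[0] + w, max_loc[1] + h)
    return top_left, bottom_right, max_val
```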
In the embodiment of the present invention, the specific steps in S5 of obtaining the spatial three-dimensional coordinates of the unmanned aerial vehicle target are as follows:
S51, converting pixel coordinates in binocular vision into spatial coordinates via triangulation.
S52, denoting the pixel coordinates of the same object in the left- and right-camera images as $(u_l, v_l)$ and $(u_r, v_r)$ respectively, the spatial coordinates $(X, Y, Z)$ of the object are calculated, for the parallel binocular arrangement used here, as

$Z_c = \dfrac{f \cdot T_x}{u_l - u_r}, \quad X_c = \dfrac{(u_l - c_x) \cdot Z_c}{f}, \quad Y_c = \dfrac{(v_l - c_y) \cdot Z_c}{f}, \quad (X, Y, Z) = (X_c + X_0,\; Y_c + Y_0,\; Z_c + Z_0)$

where $T_x$ is the distance (baseline) between the two cameras, $(c_x, c_y)$ are the optical center coordinates of the camera, $f$ is the pixel focal length, and $(X_0, Y_0, Z_0)$ are the position coordinates of the camera. Substituting the internal and external parameter matrices obtained during calibration together with the camera placement coordinates yields the spatial coordinates of the unmanned aerial vehicle, and the distance between the unmanned aerial vehicle and the restricted area follows from the distance formula between three-dimensional coordinates.
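A minimal Python sketch of S51 and S52 under the parallel-camera assumption above; the site-frame offset handling and function names are illustrative:

```python
import math

def triangulate(ul, vl, ur, vr, f, cx, cy, Tx, cam_pos=(0.0, 0.0, 0.0)):
    """Parallel-binocular triangulation following the formulas above.
    f is the pixel focal length, (cx, cy) the principal point, Tx the
    baseline; cam_pos = (X0, Y0, Z0) shifts into the site frame."""
    d = ul - ur                      # disparity along the X axis
    Z = f * Tx / d                   # depth from similar triangles
    X = (ul - cx) * Z / f
    Y = (vl - cy) * Z / f
    X0, Y0, Z0 = cam_pos
    return X + X0, Y + Y0, Z + Z0

def distance_to(point, zone_center):
    """Euclidean distance between the UAV and a restricted-zone point,
    to be compared against the preset warning distance."""
    return math.dist(point, zone_center)
```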
The circuits and control involved in the present invention are all prior art and are not described in detail herein.
The foregoing description is only illustrative of the present invention and is not intended to limit the scope of the invention, and all equivalent structures or equivalent processes or direct or indirect application in other related technical fields are included in the scope of the present invention.

Claims (7)

1. The unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO is characterized by comprising the following steps of:
S1, training an improved YOLOv5 model on a public real-world training set and a self-built data set;
S2, arranging two cameras in parallel, left and right, at the region to be monitored, calibrating them to obtain the internal and external parameters of the two cameras, and setting a warning distance between the unmanned aerial vehicle and the binocular system;
S3, reading the right-camera image, extracting the unmanned aerial vehicle target in it with the model obtained in S1, recording the pixel coordinates and size of the target in the right camera, and generating a mask image of the target;
S4, locating the unmanned aerial vehicle target in the left camera using the mask image obtained in S3, and recording the pixel coordinates of the target in the left camera;
S5, from the pixel coordinates of the unmanned aerial vehicle in the left and right cameras obtained in S3 and S4, using the imaging principle of binocular cameras and the internal and external parameters from S2 to obtain the spatial three-dimensional coordinates of the unmanned aerial vehicle target.
2. The unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO of claim 1, wherein the improved YOLOv5 model in S1 is obtained as follows: in the Backbone, an SPD module replaces the strided convolution block used for downsampling, ShuffleBlock replaces the C3 module, and the SPPF module is removed; in the Head, the dynamic convolution block ODConv replaces the C3 module; the detection model is trained on the training set until it meets the test requirements, yielding the final detection model.
3. The unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO according to claim 1, wherein in S2 the specific steps of arranging two cameras in parallel at the left and right of the region to be monitored, obtaining the internal and external parameters of the two cameras, and setting the warning distance are as follows:
S21, arranging the two cameras at the region to be monitored, either as two monocular cameras mounted side by side in parallel or as an integrated binocular camera;
S22, performing binocular calibration on the camera module to obtain the internal and external parameters of the two cameras;
S23, determining the warning distance between the unmanned aerial vehicle and the binocular system.
4. The unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO according to claim 1, wherein the specific steps in S3 of recording the pixel coordinates and size of the unmanned aerial vehicle target in the right camera and generating the mask image of the target are as follows:
S31, dividing the input picture into an S×S grid; if the center of a detected target falls in a cell, that cell predicts the target; each cell generates B bounding boxes, each comprising the offset of the object center relative to the cell, the width and height of the box, and the confidence of the target; non-maximum suppression is applied to the bounding boxes to obtain the upper-left and lower-right pixel coordinates of the unmanned aerial vehicle target;
S32, cropping the region between these two coordinates from the original image to generate the mask image of the unmanned aerial vehicle target.
5. The unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO according to claim 4, wherein the specific steps in S4 of locating the unmanned aerial vehicle target in the left camera using the mask image obtained in S3 and recording its pixel coordinates are as follows:
S41, feature extraction: extracting squared-difference features from the mask image obtained in S3;
S42, feature comparison: constructing a sliding window from the size of the mask image and computing its similarity score over the target image;
S43, locating the best match: after the scores are computed, finding the region with the highest score in the target image and recording the pixel coordinates of its upper-left and lower-right corners.
6. The unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO according to claim 5, wherein the specific steps in S5 of obtaining the spatial three-dimensional coordinates of the unmanned aerial vehicle target are as follows:
S51, converting pixel coordinates in binocular vision into spatial coordinates via triangulation;
S52, denoting the pixel coordinates of the same object in the left- and right-camera images as $(u_l, v_l)$ and $(u_r, v_r)$ respectively, the spatial coordinates $(X, Y, Z)$ of the object are calculated for the parallel binocular arrangement as
$Z_c = \dfrac{f \cdot T_x}{u_l - u_r}, \quad X_c = \dfrac{(u_l - c_x) \cdot Z_c}{f}, \quad Y_c = \dfrac{(v_l - c_y) \cdot Z_c}{f}, \quad (X, Y, Z) = (X_c + X_0,\; Y_c + Y_0,\; Z_c + Z_0)$,
where $T_x$ is the distance between the two cameras, $(c_x, c_y)$ are the optical center coordinates of the camera, $f$ is the pixel focal length, and $(X_0, Y_0, Z_0)$ are the position coordinates of the camera.
7. A terminal device, comprising two cameras placed in parallel and a computer that reads the cameras and executes image processing instructions, wherein the image processing instructions are loaded and executed by the computer to implement the detection and positioning method according to any one of claims 1 to 6.
CN202310743710.2A 2023-06-23 2023-06-23 Unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO Active CN116740334B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310743710.2A CN116740334B (en) 2023-06-23 2023-06-23 Unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310743710.2A CN116740334B (en) 2023-06-23 2023-06-23 Unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO

Publications (2)

Publication Number Publication Date
CN116740334A true CN116740334A (en) 2023-09-12
CN116740334B CN116740334B (en) 2024-02-06

Family

ID=87909526

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310743710.2A Active CN116740334B (en) 2023-06-23 2023-06-23 Unmanned aerial vehicle intrusion detection positioning method based on binocular vision and improved YOLO

Country Status (1)

Country Link
CN (1) CN116740334B (en)


Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013129060A1 (en) * 2012-03-02 2013-09-06 株式会社ブイ・テクノロジー Manufacturing device and manufacturing method for three-dimensional liquid crystal display device
CN108256504A (en) * 2018-02-11 2018-07-06 苏州笛卡测试技术有限公司 A kind of Three-Dimensional Dynamic gesture identification method based on deep learning
CN111260788A (en) * 2020-01-14 2020-06-09 华南理工大学 Power distribution cabinet switch state identification method based on binocular vision
CN111563415A (en) * 2020-04-08 2020-08-21 华南理工大学 Binocular vision-based three-dimensional target detection system and method
CN111721259A (en) * 2020-06-24 2020-09-29 江苏科技大学 Underwater robot recovery positioning method based on binocular vision
WO2022116478A1 (en) * 2020-12-04 2022-06-09 南京大学 Three-dimensional reconstruction apparatus and method for flame spectrum
CN113256778A (en) * 2021-07-05 2021-08-13 爱保科技有限公司 Method, device, medium and server for generating vehicle appearance part identification sample
CN115601437A (en) * 2021-07-27 2023-01-13 苏州星航综测科技有限公司(Cn) Dynamic convergence type binocular stereo vision system based on target identification
CN114565900A (en) * 2022-01-18 2022-05-31 广州软件应用技术研究院 Target detection method based on improved YOLOv5 and binocular stereo vision
CN115471542A (en) * 2022-05-05 2022-12-13 济南大学 Packaging object binocular recognition and positioning method based on YOLO v5
CN116051658A (en) * 2023-03-27 2023-05-02 北京科技大学 Camera hand-eye calibration method and device for target detection based on binocular vision

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
LIU H et al., "Research on a Binocular Fish Dimension Measurement Method Based on Instance Segmentation and Fish Tracking", 2022 34th Chinese Control and Decision Conference (CCDC), pages 2791-2796 *
WANG HONGLUN et al., "Three-dimensional path planning for unmanned aerial vehicle based on interfered fluid dynamical system", Chinese Journal of Aeronautics, no. 1, pages 229-239, XP055682351, DOI: 10.1016/j.cja.2014.12.031 *
ZHANG Hu et al., "Research on a binocular-vision-based method for extracting three-dimensional workpiece coordinates", Journal of Xi'an University of Arts and Science (Natural Science Edition), no. 4, pages 63-67 *
WANG Yan et al., "Research on real-time tracking of respiratory motion based on binocular vision", Journal of Biomedical Engineering, vol. 37, no. 1, pages 72-78 *
ZHAO Jie et al., "Three-dimensional spatial positioning algorithm for materials based on binocular vision", Science Technology and Engineering, vol. 23, no. 18, pages 7861-7867 *

Also Published As

Publication number Publication date
CN116740334B (en) 2024-02-06


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant