CN112069997B - Unmanned aerial vehicle autonomous landing target extraction method and device based on DenseHR-Net - Google Patents


Info

Publication number: CN112069997B (application CN202010925792.9A)
Authority: CN (China)
Prior art keywords: unmanned aerial vehicle; DenseHR-Net; key area
Other languages: Chinese (zh)
Other versions: CN112069997A
Inventors: 胡天江, 李铭慧, 郑勋臣, 潘亮, 王勇
Original and current assignee: Sun Yat Sen University
Application filed by Sun Yat Sen University; priority to CN202010925792.9A
Legal status: Active (application granted)

Classifications

    • G06V20/00 Scenes; Scene-specific elements
    • G06F18/24 Classification techniques
    • G06N3/045 Combinations of networks
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V2201/07 Target detection
    • Y02T10/40 Engine management systems

Abstract

The invention discloses an unmanned aerial vehicle autonomous landing target extraction method and device based on DenseHR-Net. The method comprises the following steps: capturing an RGB three-channel image of the unmanned aerial vehicle with a ground camera and performing image preprocessing to obtain an RGB image of a unified standard size; building a DenseHR-Net target detection network model to detect and locate the key areas of the unmanned aerial vehicle in the RGB image and identify the minimum circumscribed rectangular detection frames of a plurality of key areas in the image; and selecting one of the key areas in the image as the key point coordinate area according to a preset priority strategy, and extracting the center point coordinate of the minimum circumscribed rectangular detection frame of that key point coordinate area as the key point coordinate for unmanned aerial vehicle landing. By using the deep learning network DenseHR-Net to detect each part of the unmanned aerial vehicle, the invention can effectively extract the autonomous landing target of the unmanned aerial vehicle, improves the positioning accuracy of each part, avoids false or missed detection of the nose, and enhances the robustness of the detection algorithm.

Description

Unmanned aerial vehicle autonomous landing target extraction method and device based on DenseHR-Net
Technical Field
The invention relates to the technical field of unmanned aerial vehicles, in particular to an unmanned aerial vehicle autonomous landing target extraction method and device based on DenseHR-Net.
Background
With the development of unmanned aerial vehicle technology, autonomous landing and recovery has become one of the technical challenges that current unmanned aerial vehicle research and development must face. A complete binocular-vision autonomous landing guidance process for an unmanned aerial vehicle includes several parts such as system modeling, camera calibration, image preprocessing, target detection and position resolving. Target detection, which extracts accurate image coordinates of the unmanned aerial vehicle nose from the images captured by the camera, is a key link of the whole system.
However, in the course of research and practice on the prior art, the inventor of the present invention found that existing target detection techniques generally extract the nose coordinates by methods such as corner extraction, the optical flow method and the color probability density method, but these methods cannot meet the basic accuracy requirements of target detection or extraction. Although target detection can also be performed by an active contour extraction method in the prior art, that method performs poorly in terms of real-time operation. In addition, nose detection in a video stream cannot avoid missed detections, false detections and similar problems, and particularly under severe weather conditions the detection algorithms of the prior art can hardly meet the relevant robustness requirements. Therefore, there is a need for an unmanned aerial vehicle key point detection method that overcomes the above drawbacks.
Disclosure of Invention
The technical problem to be solved by the embodiments of the invention is to provide an unmanned aerial vehicle autonomous landing target extraction method and device based on DenseHR-Net which can effectively extract the autonomous landing target of the unmanned aerial vehicle and improve the positioning accuracy of each part of the unmanned aerial vehicle; by detecting and locating a plurality of key parts of the unmanned aerial vehicle, the unmanned aerial vehicle can still be positioned from other parts when the nose is falsely detected, which enhances the robustness of the detection algorithm.
In order to solve the above problems, an embodiment of the present invention provides a method for extracting an autonomous landing target of an unmanned aerial vehicle based on DenseHR-Net, which at least comprises the following steps:
in the landing process of the unmanned aerial vehicle, an RGB three-channel image of the unmanned aerial vehicle is shot by a ground camera and then subjected to image preprocessing, so that an RGB image with uniform standard size is obtained;
carrying out detection and positioning of key areas of an unmanned aerial vehicle on the RGB image by building a DenseHR-Net target detection network model, and identifying a minimum circumscribed rectangular detection frame of a plurality of key areas in the RGB image; the key area comprises an unmanned aerial vehicle head, an unmanned aerial vehicle wing and an unmanned aerial vehicle body;
and selecting one of the key areas in the RGB image as a key point coordinate area according to a preset priority strategy, and extracting the center point coordinate of the minimum circumscribed rectangular detection frame of the key point coordinate area as the key point coordinate of unmanned aerial vehicle landing.
Further, the image preprocessing specifically includes:
and uniformly adjusting the RGB three-channel image into an RGB image with the size of 416 x 416 by adopting a bilinear interpolation method.
Further, the detecting and positioning of the key areas of the unmanned aerial vehicle are performed on the RGB image by building a DenseHR-Net target detecting network model, and the minimum circumscribed rectangular detecting frame of a plurality of key areas in the RGB image is identified, which specifically comprises:
building a frame of a DenseHR-Net target detection network model, and inputting the RGB image into the DenseHR-Net target detection network model for feature extraction;
collecting RGB images of unmanned aerial vehicle landing, and expanding a sample data set after constructing the sample data set of the DenseHR-Net target detection network model;
training the DenseHR-Net target detection network model through the expanded sample data set, and obtaining a final DenseHR-Net target detection network model after training is completed;
inputting the RGB image into the final DenseHR-Net target detection network model to detect and position the key areas of the unmanned aerial vehicle, and identifying the minimum circumscribed rectangular detection frames of a plurality of key areas in the RGB image.
Further, after constructing the sample data set of the DenseHR-Net target detection network model, expanding the sample data set specifically includes:
flipping, cropping and translating the acquired RGB images of the unmanned aerial vehicle landing to obtain new unmanned aerial vehicle images and form an extended data set;
labeling each image in the extended data set to obtain corresponding label data; the label data comprises the center coordinate value, width value, height value and category information of the unmanned aerial vehicle target area.
Further, the training of the DenseHR-Net target detection network model through the expanded sample data set specifically comprises the following steps:
inputting the images in the expanded sample data set into the DenseHR-Net target detection network model in batches for convolution and pooling processing to obtain output prediction results at two preset scales;
calculating a loss value of the output prediction result at a detection layer of the DenseHR-Net target detection network, and constructing a loss function;
after the construction of the loss function is completed, iteratively updating the convolution kernel parameters of the DenseHR-Net target detection network by adopting a reverse gradient propagation algorithm;
And stopping training when the loss value is judged to be lower than a preset threshold value, and obtaining a final DenseHR-Net target detection network model.
Further, the loss values include a bounding-box coordinate loss value, a target confidence loss value and a category confidence loss value.
Further, inputting the RGB image to the final DenseHR-Net target detection network model for detecting and positioning a key area of the unmanned aerial vehicle, and identifying a minimum circumscribed rectangular detection frame of a plurality of key areas in the RGB image, wherein the minimum circumscribed rectangular detection frame specifically comprises:
inputting an RGB image to be detected into the trained DenseHR-Net target detection network model to obtain a corresponding prediction tensor, and determining a corresponding predicted rectangular frame according to the prediction tensor; the prediction tensor comprises the center coordinate value, width value, height value, confidence and category information of the unmanned aerial vehicle target area;
and calculating a plurality of predicted rectangular frames by adopting a non-maximum suppression algorithm to obtain a target rectangular frame with highest confidence coefficient of each key region, and converting the target rectangular frame to obtain the category and the position of the target on the original RGB image.
Further, according to a preset priority policy, one of the key areas in the RGB image is selected as a key point coordinate area, and the center point coordinate of the minimum circumscribed rectangle detection frame of the key point coordinate area is extracted as the key point coordinate of the unmanned aerial vehicle landing, specifically:
According to a priority strategy, firstly selecting an unmanned aerial vehicle head from a plurality of key areas in the RGB image as a key point coordinate area, and taking the central point coordinate of the minimum circumscribed rectangular detection frame of the unmanned aerial vehicle head as a key point coordinate;
when the unmanned aerial vehicle head fails to be selected as the key point coordinate area, taking the central point coordinate of the minimum circumscribed rectangular detection frame of the unmanned aerial vehicle wing as the key point coordinate;
when both the unmanned aerial vehicle head and the unmanned aerial vehicle wing fail to be selected as the key point coordinate area, taking the central point coordinate of the minimum circumscribed rectangular detection frame of the unmanned aerial vehicle body as the key point coordinate.
Further, the unmanned aerial vehicle autonomous landing target extraction method based on DenseHR-Net further comprises the following steps:
and extracting key points of images shot by the binocular camera at the same moment, and calculating the current world coordinates of the unmanned aerial vehicle by combining the internal and external parameters of the unmanned aerial vehicle sensor.
The embodiment of the invention also provides an unmanned aerial vehicle autonomous landing target extraction device based on DenseHR-Net, which comprises the following components:
the image preprocessing module is used for preprocessing the images after the RGB three-channel images of the unmanned aerial vehicle are shot through the ground camera in the landing process of the unmanned aerial vehicle, so as to obtain RGB images with uniform standard sizes;
The key region detection positioning module is used for carrying out unmanned aerial vehicle key region detection positioning on the RGB image by building a DenseHR-Net target detection network model, and identifying a minimum circumscribed rectangular detection frame of a plurality of key regions in the RGB image; the key area comprises an unmanned aerial vehicle head, an unmanned aerial vehicle wing and an unmanned aerial vehicle body;
and the key point extraction module is used for selecting one of the plurality of key areas in the RGB image as a key point coordinate area according to a preset priority strategy, and extracting the central point coordinate of the minimum circumscribed rectangular detection frame of the key point coordinate area as the key point coordinate of the unmanned aerial vehicle landing.
The embodiment of the invention has the following beneficial effects:
the embodiment of the invention provides a method and a device for extracting an unmanned aerial vehicle autonomous landing target based on DenseHR-Net, wherein the method comprises the following steps: in the landing process of the unmanned aerial vehicle, an RGB three-channel image of the unmanned aerial vehicle is shot by a ground camera and then subjected to image preprocessing, so that an RGB image with uniform standard size is obtained; carrying out detection and positioning of key areas of an unmanned aerial vehicle on the RGB image by building a DenseHR-Net target detection network model, and identifying a minimum circumscribed rectangular detection frame of a plurality of key areas in the RGB image; the key area comprises an unmanned aerial vehicle head, an unmanned aerial vehicle wing and an unmanned aerial vehicle body; and selecting one of the key areas in the RGB image as a key point coordinate area according to a preset priority strategy, and extracting the center point coordinate of the minimum circumscribed rectangular detection frame of the key point coordinate area as the key point coordinate of unmanned aerial vehicle landing.
Compared with the prior art, the embodiment of the invention adopts the deep learning network DenseHR-Net to perform target detection on the unmanned aerial vehicle head, wings and body, and trains the DenseHR-Net target detection model after expanding the constructed sample data set, so that the samples are reasonably constructed and sufficient in number, which in turn ensures effective detection of each part of the unmanned aerial vehicle and improves the detection accuracy. Meanwhile, by detecting and locating a plurality of key parts of the unmanned aerial vehicle, the unmanned aerial vehicle can still be positioned from other parts when the nose is missed or falsely detected, which enhances the robustness of the detection algorithm and is of important significance and practical value for positioning the unmanned aerial vehicle with binocular vision during its landing process.
Drawings
Fig. 1 is a schematic flow chart of an unmanned aerial vehicle autonomous landing target extraction method based on DenseHR-Net according to a first embodiment of the present invention;
fig. 2 is a schematic flow chart of a key area detection positioning of an unmanned aerial vehicle according to a first embodiment of the present invention;
fig. 3 is a schematic flow chart of another unmanned aerial vehicle autonomous landing target extraction method based on DenseHR-Net according to the first embodiment of the present invention;
fig. 4 is a schematic structural diagram of an unmanned aerial vehicle autonomous landing target extraction device based on DenseHR-Net according to a second embodiment of the present invention;
Fig. 5 is a schematic structural diagram of another unmanned aerial vehicle autonomous landing target extraction device based on DenseHR-Net according to a second embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
In the description of the present application, it should be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or an implicit indication of the number of technical features being indicated. Thus, a feature defining "a first", "a second", etc. may explicitly or implicitly include one or more such feature. In the description of the present application, unless otherwise indicated, the meaning of "a plurality" is two or more.
First, the application scenario provided by the invention is introduced: for example, an unmanned aerial vehicle autonomous landing target extraction method based on DenseHR-Net is provided which has strong real-time performance, a simple principle, strong robustness and high accuracy, and which effectively performs target detection on the head, wings and body of the unmanned aerial vehicle.
First embodiment of the present invention:
please refer to fig. 1-3.
As shown in fig. 1, the embodiment provides a method for extracting an autonomous landing target of an unmanned aerial vehicle based on DenseHR-Net, which at least comprises the following steps:
s1, in the landing process of the unmanned aerial vehicle, an RGB three-channel image of the unmanned aerial vehicle is shot through a ground camera, and then image preprocessing is carried out, so that an RGB image with a uniform standard size is obtained.
In a preferred embodiment, the image preprocessing is specifically:
and uniformly adjusting the RGB three-channel image into an RGB image with the size of 416 x 416 by adopting a bilinear interpolation method.
Specifically, in step S1 the RGB three-channel image captured by the ground camera during the landing process of the unmanned aerial vehicle is taken as input and scaled to a 416×416 RGB image; in this embodiment the scaling is performed with a bilinear interpolation algorithm.
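As an illustration only (the patent does not name an image-processing library), the preprocessing of step S1 can be sketched as follows, with OpenCV's bilinear resize standing in for the bilinear interpolation described above; the BGR-to-RGB conversion assumes an OpenCV-style camera frame and is not part of the patent text:

```python
import cv2


def preprocess(frame_bgr):
    """Resize a camera frame to the 416x416 network input size.

    Uses bilinear interpolation, as described in step S1. The colour
    conversion assumes the camera delivers BGR frames (OpenCV default);
    the network itself expects an RGB three-channel image.
    """
    resized = cv2.resize(frame_bgr, (416, 416), interpolation=cv2.INTER_LINEAR)
    return cv2.cvtColor(resized, cv2.COLOR_BGR2RGB)
```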
S2, carrying out detection and positioning on key areas of the RGB image by building a DenseHR-Net target detection network model, and identifying minimum circumscribed rectangular detection frames of a plurality of key areas in the RGB image; wherein, the key region includes unmanned aerial vehicle aircraft nose, unmanned aerial vehicle wing and unmanned aerial vehicle organism.
Specifically, step S2 mainly performs multi-region detection and extraction of the unmanned aerial vehicle: the preprocessed image is input, and the minimum circumscribed rectangular frames of the unmanned aerial vehicle head, left and right wings and body in the image are output. DenseHR-Net in this embodiment is a target detection model based on a convolutional neural network which treats the object detection task directly as a regression problem: the forward pass obtains a prediction tensor through convolution and pooling operations, a loss function is constructed by comparing the prediction tensor with the sample values, and after the loss function is constructed the convolution kernel parameters are iteratively updated by the backward gradient propagation algorithm with the objective of minimizing the loss value. Training is stopped when the loss value falls below a certain threshold, and the final network model is obtained.
In addition, DenseHR-Net is a network specifically designed for extracting the unmanned aerial vehicle target in an approach-and-landing scene. Since the unmanned aerial vehicle target is generally small within the whole scene, the invention provides a lightweight high-resolution densely connected feature extraction network in order to strengthen the extraction capability for small targets while keeping the network small.
In a preferred embodiment, as shown in fig. 2, the performing, by building a DenseHR-Net target detection network model, detection and positioning of a key area of the unmanned aerial vehicle on the RGB image, and identifying a minimum circumscribed rectangular detection frame of a plurality of key areas in the RGB image specifically includes:
s21, building a frame of a DenseHR-Net target detection network model, and inputting the RGB image into the DenseHR-Net target detection network model for feature extraction;
Specifically, in step S21 the DenseHR-Net framework is constructed as follows: the 416 x 416 input RGB image is fed into two paths. The serial path is a 16-layer convolutional neural network formed by stacking convolution and pooling layers, which extracts features by continuously enlarging the receptive field; the parallel path performs feature extraction without resolution reduction at each scale of the serial path, and the feature maps produced by the two paths are finally concatenated in the depth direction so that features over the full range of scales are extracted. The detection layer adopts a two-scale scheme with 13 x 13 and 26 x 26 feature maps: the 13 x 13 feature detection layer is obtained after 32-fold downsampling by convolution and pooling; the 13 x 13 feature map is upsampled by nearest neighbor interpolation to a 26 x 26 feature map, which is fused with a shallow feature map of the same size to obtain a 26 x 26 fused feature layer carrying shallow semantic information. The two detection layers of different sizes are used together to detect targets of different sizes, and the detection layer actually used is finally determined according to the actual trajectory calculation result.
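The full layer configuration of DenseHR-Net is not specified here, so the PyTorch sketch below only illustrates the two-path structure just described: a downsampling serial branch, a parallel branch without resolution reduction, depth-wise concatenation, and 13 x 13 / 26 x 26 detection heads with a 3 x (4+1+3) output layout. The channel widths, the reduced layer count and the module names are assumptions made to keep the sketch runnable, not the patented architecture itself:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_bn(in_ch, out_ch, stride=1):
    """3x3 convolution + batch norm + leaky ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.1, inplace=True),
    )


class DenseHRNetSketch(nn.Module):
    """Illustrative two-path feature extractor with two detection heads.

    Only the anchor count and the 3 x (4 + 1 + 3) output layout follow the
    description in the text; channel counts and depths are guesses made
    solely to keep the sketch self-contained.
    """

    def __init__(self, num_classes=3, num_anchors=3):
        super().__init__()
        out_ch = num_anchors * (4 + 1 + num_classes)   # 3 * (4+1+3) = 24
        # Serial path: repeated conv + downsampling, 416 -> 13 (factor 32).
        self.serial = nn.ModuleList([
            conv_bn(3, 32, stride=2),     # 208
            conv_bn(32, 64, stride=2),    # 104
            conv_bn(64, 128, stride=2),   # 52
            conv_bn(128, 256, stride=2),  # 26
            conv_bn(256, 512, stride=2),  # 13
        ])
        # Parallel path: further feature extraction without resolution
        # reduction, fed here from the 26x26 stage of the serial path.
        self.parallel = nn.Sequential(conv_bn(256, 128), conv_bn(128, 128))
        # Detection heads for the 13x13 and 26x26 scales.
        self.head13 = nn.Conv2d(512, out_ch, 1)
        self.head26 = nn.Conv2d(512 + 256 + 128, out_ch, 1)

    def forward(self, x):
        feats = []
        for layer in self.serial:
            x = layer(x)
            feats.append(x)
        f26, f13 = feats[-2], feats[-1]               # 26x26 and 13x13 maps
        p26 = self.parallel(f26)                      # no-downsampling branch
        up13 = F.interpolate(f13, scale_factor=2, mode="nearest")
        fused26 = torch.cat([up13, f26, p26], dim=1)  # depth-wise fusion
        return self.head13(f13), self.head26(fused26)
```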
S22, acquiring RGB images of unmanned aerial vehicle landing, and expanding a sample data set after constructing the sample data set of the DenseHR-Net target detection network model;
In a preferred embodiment, after constructing the sample dataset of the DenseHR-Net target detection network model, expanding the sample dataset specifically includes:
flipping, cropping and translating the acquired RGB images of the unmanned aerial vehicle landing to obtain new unmanned aerial vehicle images and form an extended data set;
labeling each image in the extended data set to obtain corresponding label data; the label data comprises the center coordinate value, width value, height value and category information of the unmanned aerial vehicle target area.
Specifically, in step S22, RGB images of the unmanned aerial vehicle landing are collected, and new unmanned aerial vehicle images are obtained by flipping, cropping and translation to form an extended data set. The label data of each picture is obtained by manual annotation and comprises the center point coordinates (b_x, b_y), the width and height values (b_w, b_h) and the category of the unmanned aerial vehicle target to be detected.
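A minimal sketch of the described data expansion, assuming the labels are stored as (b_x, b_y, b_w, b_h, class) in pixel coordinates; only the horizontal flip is written out, since cropping and translation adjust the same fields in the same way:

```python
import numpy as np


def flip_horizontal(image, boxes):
    """Mirror an image and its (b_x, b_y, b_w, b_h, class) labels.

    `image` is an HxWx3 array, `boxes` an Nx5 array with box centres in
    pixel coordinates. Cropping and translation augmentations would shift
    b_x/b_y (and clip b_w/b_h) in the same style.
    """
    h, w = image.shape[:2]
    flipped = image[:, ::-1].copy()
    out = boxes.copy()
    out[:, 0] = w - 1 - boxes[:, 0]   # mirror the centre x coordinate
    return flipped, out
```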
S23, training the DenseHR-Net target detection network model through the expanded sample data set, and obtaining a final DenseHR-Net target detection network model after training is completed;
in a preferred embodiment, the training of the DenseHR-Net target detection network model by the expanded sample data set specifically includes:
inputting the images in the expanded sample data set into the DenseHR-Net target detection network model in batches for convolution and pooling processing to obtain output prediction results at two preset scales;
calculating a loss value of the output prediction result at a detection layer of the DenseHR-Net target detection network, and constructing a loss function;
after the construction of the loss function is completed, iteratively updating the convolution kernel parameters of the DenseHR-Net target detection network by adopting a reverse gradient propagation algorithm;
and stopping training when the loss value is judged to be lower than a preset threshold value, and obtaining a final DenseHR-Net target detection network model.
In a preferred embodiment, the loss values include a bounding-box coordinate loss value, a target confidence loss value and a category confidence loss value.
Specifically, for step S23, training the DenseHR-Net network by using the extended data set; the characteristic extraction process is as follows: inputting photos in a data set into a DenseHR-Net network in batches for convolution and pooling operation to obtain two output predictions with different scales; and calculating loss values of the predicted result at a DenseHR-Net detection layer, wherein the loss values comprise frame coordinate loss, target confidence loss and category confidence loss. Wherein the loss value is calculated from a loss function.
After the construction of the loss function is completed, the convolution kernel parameters are iteratively updated by the backward gradient propagation algorithm with the objective of minimizing the loss value. Training is stopped when the loss value falls below a certain threshold, and the final network model is obtained.
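The training procedure of step S23 can be sketched as below. The optimizer choice, learning rate and the `loss_fn` placeholder are assumptions; the structure (batched forward pass over two scales, loss computation, backward gradient propagation, stopping once the loss falls below a preset threshold) follows the steps described above:

```python
import torch


def train(model, loss_fn, loader, threshold=0.05, lr=1e-3, max_epochs=300):
    """Iteratively update the convolution kernels by gradient back-propagation.

    `loss_fn` is assumed to combine the bounding-box coordinate loss, the
    target confidence loss and the category confidence loss over both the
    13x13 and 26x26 prediction scales.
    """
    optim = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(max_epochs):
        running = 0.0
        for images, targets in loader:
            pred13, pred26 = model(images)           # two-scale forward pass
            loss = loss_fn(pred13, pred26, targets)
            optim.zero_grad()
            loss.backward()                          # backward gradient propagation
            optim.step()                             # update convolution kernels
            running += loss.item()
        mean_loss = running / max(len(loader), 1)
        if mean_loss < threshold:                    # stop once below the preset threshold
            break
    return model
```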
S24, inputting the RGB image into the final DenseHR-Net target detection network model to detect and locate key areas of the unmanned aerial vehicle, and identifying minimum circumscribed rectangular detection frames of a plurality of key areas in the RGB image.
In a preferred embodiment, the inputting the RGB image into the final DenseHR-Net target detection network model performs the detection and positioning of the critical area of the unmanned aerial vehicle, and identifies the minimum circumscribed rectangular detection frame of the plurality of critical areas in the RGB image, which specifically includes:
inputting an RGB image to be detected into the trained DenseHR-Net target detection network model to obtain a corresponding prediction tensor, and determining a corresponding predicted rectangular frame according to the prediction tensor; the prediction tensor comprises the center coordinate value, width value, height value, confidence and category information of the unmanned aerial vehicle target area;
and calculating a plurality of predicted rectangular frames by adopting a non-maximum suppression algorithm to obtain a target rectangular frame with highest confidence coefficient of each key region, and converting the target rectangular frame to obtain the category and the position of the target on the original RGB image.
Specifically, in step S24 the target detection and positioning of the unmanned aerial vehicle key areas is performed: the image to be detected is input into the DenseHR-Net network model to obtain a prediction tensor, which comprises the center coordinate values (t_x, t_y), the width and height values (t_w, t_h), as well as the confidence and the category. The target frames of the key areas of the unmanned aerial vehicle are then selected: the plurality of predicted rectangular frames obtained in the previous step are processed by a non-maximum suppression (NMS) algorithm to obtain the target rectangular frame with the highest confidence, which is converted to obtain the category and position of the target on the original image.
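As a hedged illustration of this post-processing step, the sketch below uses `torchvision.ops.nms` as a stand-in for the non-maximum suppression algorithm named in the text; the class-id mapping (0 = nose, 1 = wing, 2 = body) and the assumption that boxes have already been decoded to corner form on the 416 x 416 image are hypothetical:

```python
import torch
from torchvision.ops import nms


def select_key_boxes(boxes_xyxy, scores, classes, iou_thr=0.45):
    """Keep the highest-confidence detection per key-region class.

    `boxes_xyxy`: Nx4 tensor of decoded boxes, `scores`: N confidences,
    `classes`: N class ids (0 = nose, 1 = wing, 2 = body, by assumption).
    Returns {class_id: (box, score)} for the top surviving box of each class.
    """
    keep = nms(boxes_xyxy, scores, iou_thr)
    best = {}
    for idx in keep.tolist():
        c = int(classes[idx])
        if c not in best or scores[idx] > best[c][1]:
            best[c] = (boxes_xyxy[idx], float(scores[idx]))
    return best
```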
In a specific embodiment, the DenseHR-Net target detection network model is first constructed. The DenseHR-Net framework is built as follows: a 416 x 416 RGB image is input into the DenseHR-Net network for feature extraction operations such as convolution and pooling; after 32-fold downsampling a 13 x 13-scale feature map is obtained, and at each pixel position of this feature map a 24-dimensional vector of 3 x (4+1+3) is produced. The 13 x 13 feature map is upsampled by nearest neighbor interpolation to a 26 x 26-scale feature map, which is fused with a shallow feature map of the same size to obtain a 26 x 26 fused feature map carrying shallow semantic information, likewise producing a 3 x (4+1+3) 24-dimensional vector at each pixel position. Here the number 3 denotes the number of anchor boxes on the feature map, the number 4 denotes the center coordinate values and width/height values of the prediction result, the number 1 denotes the confidence of the prediction frame, and the number of categories denotes the probability of each category predicted at that feature point.
Secondly, a sample data set is organized for the constructed network: RGB images of unmanned aerial vehicle landings are collected, and new unmanned aerial vehicle images are obtained by flipping, cropping and translation to form an extended data set; the label data of each picture is obtained by manual annotation. The DenseHR-Net network is then trained with the extended data set, and the feature extraction proceeds as follows: the photos in the data set are input into the DenseHR-Net network in batches for convolution and pooling operations to obtain two multi-scale output predictions with sizes of 13 x 13 and 26 x 26 respectively, and the loss value of the prediction result is calculated at the DenseHR-Net detection layer, including the frame coordinate loss, the confidence loss and the category confidence loss. The specific calculation formula is as follows:
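Assuming the loss takes the usual YOLO-style sum-of-squared-error form consistent with the symbols explained below (λ_coord weighting the coordinate term and λ_nocoord the category term, as stated there), one plausible form of the formula is:

```latex
\begin{aligned}
L ={}& \lambda_{\mathrm{coord}} \sum_{i}\sum_{j} \mathbb{1}_{ij}^{\mathrm{obj}}
      \Big[(x_i-\hat{x}_i)^2+(y_i-\hat{y}_i)^2+(w_i-\hat{w}_i)^2+(h_i-\hat{h}_i)^2\Big] \\
    &+ \sum_{i}\sum_{j} \mathbb{1}_{ij}^{\mathrm{obj}}\big(C_i-\hat{C}_i\big)^2
     + \sum_{i}\sum_{j} \mathbb{1}_{ij}^{\mathrm{noobj}}\big(C_i-\hat{C}_i\big)^2 \\
    &+ \lambda_{\mathrm{nocoord}} \sum_{i} \mathbb{1}_{i}^{\mathrm{obj}}
      \sum_{c \in \mathrm{classes}} \big(p_i(c)-\hat{p}_i(c)\big)^2
\end{aligned}
```

Here i runs over the pixel blocks of the detection layer and j over the anchor boxes of each block.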
where 1_{ij}^{obj} indicates whether the j-th anchor box of the i-th pixel block is responsible for predicting the object, λ_coord denotes the weight of the positioning error, λ_nocoord denotes the weight of the classification error, C_i denotes the confidence of whether the pixel block predicts an object, x_i and y_i denote the true values of the object's center coordinates, w_i and h_i denote the width and height values of the box, and p_i(c) denotes the confidence that the object belongs to a certain category. (Note: x_i denotes the true value and x̂_i the corresponding predicted value; the other parameters follow the same convention.)
By constructing the loss function, the convolution kernel parameter is iteratively updated by a back gradient propagation algorithm with the aim of minimizing the loss value. When the loss value is lower than a certain threshold value (here, the value of 0.05 can be directly selected), training is stopped to obtain a final network model.
Finally, the image to be detected is input into the trained network model to detect and locate the targets in the key areas of the unmanned aerial vehicle. After the image to be detected is resized to 416 x 416, it is input into the DenseHR-Net network model and a prediction result is obtained by forward computation; the resulting plurality of predicted rectangular frames are then processed by the non-maximum suppression algorithm to obtain the target rectangular frame with the highest confidence, which is converted to obtain the category and position of the target on the original image.
S3, selecting one of the key areas in the RGB image as a key point coordinate area according to a preset priority strategy, and extracting the center point coordinate of the minimum circumscribed rectangular detection frame of the key point coordinate area as the key point coordinate of unmanned aerial vehicle landing.
In a preferred embodiment, according to a preset priority policy, one of the plurality of key areas in the RGB image is selected as a key point coordinate area, and a center point coordinate of a minimum circumscribed rectangular detection frame of the key point coordinate area is extracted as a key point coordinate of landing of the unmanned aerial vehicle, specifically:
According to a priority strategy, firstly selecting an unmanned aerial vehicle head from a plurality of key areas in the RGB image as a key point coordinate area, and taking the central point coordinate of the minimum circumscribed rectangular detection frame of the unmanned aerial vehicle head as a key point coordinate;
when the unmanned aerial vehicle head fails to be selected as the key point coordinate area, taking the central point coordinate of the minimum circumscribed rectangular detection frame of the unmanned aerial vehicle wing as the key point coordinate;
when both the unmanned aerial vehicle head and the unmanned aerial vehicle wing fail to be selected as the key point coordinate area, taking the central point coordinate of the minimum circumscribed rectangular detection frame of the unmanned aerial vehicle body as the key point coordinate.
Specifically, for step S3, the coordinates of the center point of the unmanned aerial vehicle head detection frame are first selected as the coordinates of the key points, and if the condition of missed detection and false detection of the unmanned aerial vehicle head occurs, the coordinates of the center point of the unmanned aerial vehicle wing or body are selected as the coordinates of the key points;
for the extracted three parts of the airplane region, the embodiment of the invention adopts a method for setting priority to carry out robustness and accuracy optimization on airplane key point detection. The effect of selecting the center point of the nose area is better than that of selecting the center point of the wing area directly, so that the embodiment of the invention sets the extraction priority of the key points as follows:
Priority: Headbox >> Wingbox >> Planebox
From the above analysis, the specific steps of the third stage are as follows: according to a priority strategy, the central point coordinate of the unmanned aerial vehicle head detection frame is selected as the key point coordinate, and if the unmanned aerial vehicle head detection omission occurs, the central point of the unmanned aerial vehicle wing or the unmanned aerial vehicle body is used as the key point coordinate.
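A minimal sketch of this priority rule, reusing the dictionary returned by the selection sketch after step S24; the numeric class ids (0 = nose, 1 = wing, 2 = body) are an assumption:

```python
def extract_keypoint(best_boxes):
    """Return the landing key point as the centre of the highest-priority box.

    `best_boxes` maps class id -> (box_xyxy, score), as produced by
    select_key_boxes above. Falls back to the next class when the preferred
    region was missed; returns None if nothing was detected.
    """
    for cls in (0, 1, 2):                 # Headbox >> Wingbox >> Planebox
        if cls in best_boxes:
            (x1, y1, x2, y2), _score = best_boxes[cls]
            return (float(x1 + x2) / 2.0, float(y1 + y2) / 2.0)
    return None
```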
In a preferred embodiment, as shown in fig. 3, the unmanned aerial vehicle autonomous landing target extraction method based on DenseHR-Net further includes:
and extracting key points of images shot by the binocular camera at the same moment, and calculating the current world coordinates of the unmanned aerial vehicle by combining the internal and external parameters of the unmanned aerial vehicle sensor.
Specifically, after the coordinates of the key points are calculated, the key points are extracted from the images shot by the left and right eye cameras at the same moment, the world coordinates of the unmanned aerial vehicle are calculated by combining the internal and external parameters of the sensor, and the positioning accuracy of the unmanned aerial vehicle in the landing process of the unmanned aerial vehicle is improved by utilizing binocular vision, so that the unmanned aerial vehicle has important significance and use value.
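A hedged sketch of this binocular solve, using OpenCV's triangulation routine as a stand-in for the position calculation; the 3x4 projection matrices built from the calibrated intrinsic and extrinsic parameters, and the synchronised left/right key points, are assumed to be available from the preceding steps:

```python
import cv2
import numpy as np


def solve_world_position(P_left, P_right, kp_left, kp_right):
    """Triangulate the UAV key point from a synchronised stereo pair.

    `P_left`/`P_right` are 3x4 projection matrices from camera calibration,
    `kp_left`/`kp_right` are the (u, v) key-point coordinates extracted from
    the left and right images at the same instant. Returns the 3D point in
    the calibration (world) frame.
    """
    pts_l = np.asarray(kp_left, dtype=np.float64).reshape(2, 1)
    pts_r = np.asarray(kp_right, dtype=np.float64).reshape(2, 1)
    point_h = cv2.triangulatePoints(P_left, P_right, pts_l, pts_r)  # 4x1 homogeneous
    return (point_h[:3] / point_h[3]).ravel()
```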
The unmanned aerial vehicle autonomous landing target extraction method based on DenseHR-Net provided by the embodiment comprises the following steps: in the landing process of the unmanned aerial vehicle, an RGB three-channel image of the unmanned aerial vehicle is shot by a ground camera and then subjected to image preprocessing, so that an RGB image with uniform standard size is obtained; carrying out detection and positioning of key areas of an unmanned aerial vehicle on the RGB image by building a DenseHR-Net target detection network model, and identifying a minimum circumscribed rectangular detection frame of a plurality of key areas in the RGB image; and selecting one of the key areas in the RGB image as a key point coordinate area according to a preset priority strategy, and extracting the center point coordinate of the minimum circumscribed rectangular detection frame of the key point coordinate area as the key point coordinate of unmanned aerial vehicle landing.
Compared with the prior art, this embodiment adopts the deep learning network DenseHR-Net to perform target detection on the unmanned aerial vehicle head, wings and body, and performs training regression on the categories and positions recognized by the DenseHR-Net target detection model after the constructed sample data set has been expanded, so that the samples are reasonably constructed and sufficient in number, which in turn ensures effective detection of each part of the unmanned aerial vehicle and improves the accuracy and robustness of the detection algorithm. Meanwhile, the detection algorithm can be retrained at any time by adding hard samples at a later stage, leaving ample room for optimization. In addition, by detecting and locating a plurality of key parts of the unmanned aerial vehicle, the unmanned aerial vehicle can still be positioned from other parts when the nose is missed or falsely detected; a priority function is applied to the selection of the key point, ranking the key points of the different parts so as to balance accuracy and robustness, which is of important significance and practical value for positioning the unmanned aerial vehicle with binocular vision during its landing process.
Second embodiment of the present invention:
please refer to fig. 4-5.
As shown in fig. 4, this embodiment provides an unmanned aerial vehicle autonomous landing target extraction device based on DenseHR-Net, including:
The image preprocessing module 100 is used for preprocessing an image after an RGB three-channel image of the unmanned aerial vehicle is shot by a ground camera in the landing process of the unmanned aerial vehicle, so as to obtain an RGB image with uniform standard size;
specifically, for the image preprocessing module 100, an RGB three-channel image captured by a ground camera during the landing process of the unmanned aerial vehicle is input, so as to output an RGB image with a size of 416×416 after bilinear interpolation, and in this embodiment, scaling is performed by using a bilinear interpolation algorithm.
The key region detection positioning module 200 is configured to perform unmanned aerial vehicle key region detection positioning on the RGB image by constructing a DenseHR-Net target detection network model, and identify a minimum circumscribed rectangular detection frame of a plurality of key regions in the RGB image; the key area comprises an unmanned aerial vehicle head, an unmanned aerial vehicle wing and an unmanned aerial vehicle body;
specifically, for the key region detection positioning module 200, the multi-region detection extraction of the unmanned aerial vehicle is mainly performed, the preprocessed single-channel image is input, and the minimum external rectangular frame of the unmanned aerial vehicle head, the left wing, the right wing and the body part in the image is output. The DenseHR-Net in the embodiment is a target detection model based on a convolutional neural network, an object detection task is directly treated as a regression problem, a forward process obtains a prediction tensor through convolution and pooling operation, a loss function is constructed by comparing the prediction tensor with a sample value, and after the loss function is constructed, the convolution kernel parameter is iteratively updated by a backward gradient propagation algorithm with the aim of minimum loss value. And stopping training to obtain a final network model when the loss value is lower than a certain threshold value.
The key point extraction module 300 is configured to select one of a plurality of key areas in the RGB image as a key point coordinate area according to a preset priority policy, and extract a center point coordinate of a minimum circumscribed rectangular detection frame of the key point coordinate area as a key point coordinate of the unmanned aerial vehicle landing.
Specifically, for the key point extraction module 300, the central point coordinate of the unmanned aerial vehicle head detection frame is first selected as the key point coordinate, and if the condition of missed detection and false detection of the unmanned aerial vehicle head occurs, the central point of the unmanned aerial vehicle wing or the machine body is used as the key point coordinate;
for the extracted three parts of the airplane region, the embodiment of the invention adopts a method for setting priority to carry out robustness and accuracy optimization on airplane key point detection. The effect of selecting the center point of the nose area is better than that of selecting the center point of the wing area directly, so that the embodiment of the invention sets the extraction priority of the key points as follows:
Priority: Headbox >> Wingbox >> Planebox
according to the priority strategy, the central point coordinate of the unmanned aerial vehicle head detection frame is preferably selected as the key point coordinate, and if the unmanned aerial vehicle head detection omission occurs, the central point of the unmanned aerial vehicle wing or the unmanned aerial vehicle body is used as the key point coordinate.
In a preferred embodiment, as shown in fig. 5, the unmanned aerial vehicle autonomous landing target extraction device based on DenseHR-Net further includes:
the world coordinate calculation module 400 is configured to extract key points of images shot by the binocular camera at the same time, and calculate current world coordinates of the unmanned aerial vehicle by combining internal and external parameters of the unmanned aerial vehicle sensor.
Specifically, for the world coordinate calculation module 400, after calculating the coordinates of the key points, the key points are extracted from the images shot by the left and right eye cameras at the same moment, the world coordinates of the unmanned aerial vehicle are calculated by combining the internal and external parameters of the sensor, and the positioning accuracy of the unmanned aerial vehicle in the landing process of the unmanned aerial vehicle is improved by using binocular vision, so that the unmanned aerial vehicle has important significance and use value.
The unmanned aerial vehicle autonomous landing target extraction device based on DenseHR-Net provided by the embodiment comprises: the image preprocessing module is used for preprocessing the images after the RGB three-channel images of the unmanned aerial vehicle are shot through the ground camera in the landing process of the unmanned aerial vehicle, so as to obtain RGB images with uniform standard sizes; the key region detection positioning module is used for carrying out unmanned aerial vehicle key region detection positioning on the RGB image by building a DenseHR-Net target detection network model, and identifying a minimum circumscribed rectangular detection frame of a plurality of key regions in the RGB image; the key area comprises an unmanned aerial vehicle head, an unmanned aerial vehicle wing and an unmanned aerial vehicle body; and the key point extraction module is used for selecting one of the plurality of key areas in the RGB image as a key point coordinate area according to a preset priority strategy, and extracting the central point coordinate of the minimum circumscribed rectangular detection frame of the key point coordinate area as the key point coordinate of the unmanned aerial vehicle landing.
In this embodiment, the deep learning network DenseHR-Net is adopted to perform target detection on the unmanned aerial vehicle head, wing and body, and training regression is performed on the categories and positions recognized by the DenseHR-Net target detection model after the constructed sample data set has been expanded, so that the samples are reasonably constructed and sufficient in number, which in turn ensures effective detection of each part of the unmanned aerial vehicle and improves the accuracy and robustness of the detection algorithm. Meanwhile, the detection algorithm can be retrained at any time by adding hard samples at a later stage, leaving ample room for optimization. In addition, by detecting and locating a plurality of key parts of the unmanned aerial vehicle, the unmanned aerial vehicle can still be positioned from other parts when the nose is missed or falsely detected; a priority function is applied to the selection of the key point, ranking the key points of the different parts so as to balance accuracy and robustness, which is of important significance and practical value for positioning the unmanned aerial vehicle with binocular vision during its landing process.
In the foregoing embodiments of the present invention, the descriptions of the embodiments are emphasized, and for a portion of this disclosure that is not described in detail in this embodiment, reference is made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technical content may be implemented in other manners. The apparatus embodiments described above are merely exemplary; for example, the division of the modules is only a division by logical function, and other division manners are possible in actual implementation: multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the mutual coupling, direct coupling or communication connection shown or discussed may be realized through some interfaces, units or modules, and may be in electrical or other forms.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules, i.e., may be located in one place, or may be distributed over a plurality of modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in each embodiment of the present invention may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated modules may be implemented in hardware or in software functional modules.
While the foregoing is directed to the preferred embodiments of the present invention, it should be noted that modifications and variations could be made by those skilled in the art without departing from the principles of the present invention, and such modifications and variations are to be regarded as being within the scope of the invention.

Claims (8)

1. An unmanned aerial vehicle autonomous landing target extraction method based on DenseHR-Net is characterized by at least comprising the following steps:
in the landing process of the unmanned aerial vehicle, an RGB three-channel image of the unmanned aerial vehicle is shot by a ground camera and then subjected to image preprocessing, so that an RGB image with uniform standard size is obtained;
carrying out detection and positioning of key areas of an unmanned aerial vehicle on the RGB image by building a DenseHR-Net target detection network model, and identifying a minimum circumscribed rectangular detection frame of a plurality of key areas in the RGB image; the key area comprises an unmanned aerial vehicle head, an unmanned aerial vehicle wing and an unmanned aerial vehicle body;
the method for detecting and positioning the key areas of the unmanned aerial vehicle in the RGB image by building a DenseHR-Net target detection network model, and identifying the minimum circumscribed rectangular detection frames of a plurality of key areas in the RGB image specifically comprises the following steps:
Building a frame of a DenseHR-Net target detection network model, and inputting the RGB image into the DenseHR-Net target detection network model for feature extraction;
collecting RGB images of unmanned aerial vehicle landing, and expanding a sample data set after constructing the sample data set of the DenseHR-Net target detection network model;
training the DenseHR-Net target detection network model through the expanded sample data set, and obtaining a final DenseHR-Net target detection network model after training is completed;
inputting the RGB image into the final DenseHR-Net target detection network model to detect and position key areas of the unmanned aerial vehicle, and identifying minimum circumscribed rectangular detection frames of a plurality of key areas in the RGB image;
inputting the RGB image to the final DenseHR-Net target detection network model for detecting and positioning key areas of the unmanned aerial vehicle, and identifying a minimum circumscribed rectangular detection frame of a plurality of key areas in the RGB image, wherein the minimum circumscribed rectangular detection frame specifically comprises:
inputting an RGB image to be detected into the trained DenseHR-Net target detection network model to obtain a corresponding prediction tensor, and determining a corresponding predicted rectangular frame according to the prediction tensor; the prediction tensor comprises the center coordinate value, width value, height value, confidence and category information of the unmanned aerial vehicle target area;
Calculating a plurality of predicted rectangular frames by adopting a non-maximum suppression algorithm to obtain a target rectangular frame with highest confidence coefficient of each key region, and converting the target rectangular frame to obtain the category and the position of the target on the original RGB image;
and selecting one of the key areas in the RGB image as a key point coordinate area according to a preset priority strategy, and extracting the center point coordinate of the minimum circumscribed rectangular detection frame of the key point coordinate area as the key point coordinate of unmanned aerial vehicle landing.
2. The unmanned aerial vehicle autonomous landing target extraction method based on DenseHR-Net according to claim 1, wherein the image preprocessing is specifically:
and uniformly adjusting the RGB three-channel image into an RGB image with the size of 416 x 416 by adopting a bilinear interpolation method.
3. The unmanned aerial vehicle autonomous landing target extraction method based on DenseHR-Net according to claim 2, wherein the expanding the sample data set after constructing the sample data set of the DenseHR-Net target detection network model specifically comprises:
flipping, cropping and translating the acquired RGB images of the unmanned aerial vehicle landing to obtain new unmanned aerial vehicle images and form an extended data set;
labeling each image in the extended data set to obtain corresponding label data; the label data comprises the center coordinate value, width value, height value and category information of the unmanned aerial vehicle target area.
4. The unmanned aerial vehicle autonomous landing target extraction method based on DenseHR-Net according to claim 2, wherein the training of the DenseHR-Net target detection network model with the expanded sample data set specifically comprises:
inputting the images in the expanded sample data set into the DenseHR-Net target detection network model in batches for convolution and pooling processing to obtain output prediction results at two preset scales;
calculating loss values of the output prediction results at the detection layer of the DenseHR-Net target detection network, and constructing a loss function;
after the construction of the loss function is completed, iteratively updating the convolution kernel parameters of the DenseHR-Net target detection network by a backward gradient propagation algorithm;
and stopping training when the loss value is judged to be lower than a preset threshold, thereby obtaining the final DenseHR-Net target detection network model.
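A minimal training-loop sketch for claim 4, written with PyTorch as an assumed framework; `model`, `loader`, `loss_fn` and the threshold value stand in for the DenseHR-Net network, the expanded sample data set, the loss of claim 5 and the preset stopping threshold.

```python
import torch

def train(model, loader, loss_fn, loss_threshold=0.05, max_epochs=300):
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    for epoch in range(max_epochs):
        for images, targets in loader:          # batched images from the expanded data set
            preds = model(images)               # prediction results at two scales
            loss = loss_fn(preds, targets)
            optimizer.zero_grad()
            loss.backward()                     # backward gradient propagation
            optimizer.step()                    # update convolution kernel parameters
        if loss.item() < loss_threshold:        # stop once the loss falls below the threshold
            break
    return model
```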
5. The method for extracting unmanned aerial vehicle autonomous landing target based on DenseHR-Net according to claim 4, wherein the loss values comprise a frame coordinate loss value, a target confidence loss value, and a class confidence loss value.
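One possible composition of the three loss terms named in claim 5, assuming matched prediction/target pairs have already been flattened into tensors; the weighting coefficients are illustrative assumptions rather than values from the patent.

```python
import torch.nn.functional as F

def composite_loss(pred_box, true_box, pred_obj, true_obj, pred_cls, true_cls):
    box_loss = F.mse_loss(pred_box, true_box)                           # frame coordinate loss
    obj_loss = F.binary_cross_entropy_with_logits(pred_obj, true_obj)   # target confidence loss
    cls_loss = F.binary_cross_entropy_with_logits(pred_cls, true_cls)   # class confidence loss
    return 5.0 * box_loss + 1.0 * obj_loss + 1.0 * cls_loss
```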
6. The unmanned aerial vehicle autonomous landing target extraction method based on DenseHR-Net according to claim 1, wherein the selecting of one of the plurality of key regions in the RGB image as the key point coordinate region according to a preset priority strategy, and extracting the center point coordinate of the minimum circumscribed rectangular detection frame of the key point coordinate region as the key point coordinate for unmanned aerial vehicle landing specifically comprises:
according to the priority strategy, first selecting the unmanned aerial vehicle nose from the plurality of key regions in the RGB image as the key point coordinate region, and taking the center point coordinate of the minimum circumscribed rectangular detection frame of the unmanned aerial vehicle nose as the key point coordinate;
when selection of the unmanned aerial vehicle nose as the key point coordinate region fails, taking the center point coordinate of the minimum circumscribed rectangular detection frame of the unmanned aerial vehicle wing as the key point coordinate;
and when selection of both the unmanned aerial vehicle nose and the unmanned aerial vehicle wing as the key point coordinate region fails, taking the center point coordinate of the minimum circumscribed rectangular detection frame of the unmanned aerial vehicle body as the key point coordinate.
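A minimal sketch of the claim-6 priority strategy, assuming `detections` maps each key region name to its minimum circumscribed rectangle (cx, cy, w, h) or to None when that region was not detected; the region names and function name are illustrative.

```python
def select_landing_keypoint(detections):
    # nose first, then wing, then body, per the preset priority strategy
    for region in ("nose", "wing", "body"):
        box = detections.get(region)
        if box is not None:
            cx, cy, w, h = box
            return cx, cy          # center point of the minimum circumscribed rectangle
    return None                    # no key region detected in this frame
```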
7. The unmanned aerial vehicle autonomous landing target extraction method based on DenseHR-Net according to claim 1, further comprising:
extracting the key points from the images captured by the binocular camera at the same moment, and calculating the current world coordinates of the unmanned aerial vehicle by combining the intrinsic and extrinsic parameters of the unmanned aerial vehicle sensors.
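A minimal triangulation sketch for claim 7, assuming a rectified binocular pair with focal length f, baseline b and principal point (cx0, cy0); it returns coordinates in the left camera frame, and the transformation into world coordinates using the calibrated extrinsic parameters is omitted from the sketch.

```python
def stereo_to_camera_frame(left_pt, right_pt, f, b, cx0, cy0):
    disparity = left_pt[0] - right_pt[0]     # horizontal pixel offset between the two views
    if disparity <= 0:
        return None                          # degenerate match, depth not recoverable
    Z = f * b / disparity                    # depth along the optical axis
    X = (left_pt[0] - cx0) * Z / f
    Y = (left_pt[1] - cy0) * Z / f
    return X, Y, Z                           # key point position in the left camera frame
```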
8. An unmanned aerial vehicle autonomous landing target extraction device based on DenseHR-Net, characterized by comprising:
an image preprocessing module, used for performing image preprocessing on the RGB three-channel images of the unmanned aerial vehicle captured by the ground camera during the landing process of the unmanned aerial vehicle, so as to obtain RGB images of a uniform standard size;
a key region detection and positioning module, used for performing detection and positioning of key regions of the unmanned aerial vehicle on the RGB image by building a DenseHR-Net target detection network model, and identifying the minimum circumscribed rectangular detection frames of a plurality of key regions in the RGB image; the key regions comprise the unmanned aerial vehicle nose, the unmanned aerial vehicle wing and the unmanned aerial vehicle body; the performing detection and positioning of key regions of the unmanned aerial vehicle on the RGB image by building a DenseHR-Net target detection network model, and identifying the minimum circumscribed rectangular detection frames of a plurality of key regions in the RGB image specifically comprises the following steps:
building the framework of the DenseHR-Net target detection network model, and inputting the RGB image into the DenseHR-Net target detection network model for feature extraction;
collecting RGB images of unmanned aerial vehicle landing, constructing a sample data set for the DenseHR-Net target detection network model, and then expanding the sample data set;
training the DenseHR-Net target detection network model with the expanded sample data set, and obtaining the final DenseHR-Net target detection network model after training is completed;
inputting the RGB image into the final DenseHR-Net target detection network model for detection and positioning of the key regions of the unmanned aerial vehicle, and identifying the minimum circumscribed rectangular detection frames of the plurality of key regions in the RGB image;
the inputting of the RGB image into the final DenseHR-Net target detection network model for detection and positioning of the key regions of the unmanned aerial vehicle, and identifying the minimum circumscribed rectangular detection frames of the plurality of key regions in the RGB image specifically comprises:
inputting the RGB image to be detected into the trained DenseHR-Net target detection network model to obtain corresponding prediction tensors, and determining corresponding predicted rectangular frames according to the prediction tensors; each prediction tensor comprises a center coordinate value, a width value, a height value, a confidence and category information of the unmanned aerial vehicle target region;
processing the plurality of predicted rectangular frames with a non-maximum suppression algorithm to obtain the target rectangular frame with the highest confidence for each key region, and converting the target rectangular frame to obtain the category and position of the target on the original RGB image;
and a key point extraction module, used for selecting one of the plurality of key regions in the RGB image as the key point coordinate region according to a preset priority strategy, and extracting the center point coordinate of the minimum circumscribed rectangular detection frame of the key point coordinate region as the key point coordinate for unmanned aerial vehicle landing.
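As a composition sketch of the claim-8 device, the three modules can be chained as callables; the class and argument names below are illustrative assumptions, not identifiers from the patent.

```python
class LandingTargetExtractor:
    def __init__(self, preprocess_module, detection_module, keypoint_module):
        self.preprocess = preprocess_module   # image preprocessing module
        self.detect = detection_module        # key region detection and positioning module
        self.extract = keypoint_module        # key point extraction module

    def __call__(self, raw_frame):
        rgb = self.preprocess(raw_frame)      # uniform 416 x 416 RGB image
        regions = self.detect(rgb)            # minimum circumscribed rectangles per key region
        return self.extract(regions)          # landing key point coordinate
```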
CN202010925792.9A 2020-09-04 2020-09-04 Unmanned aerial vehicle autonomous landing target extraction method and device based on DenseHR-Net Active CN112069997B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010925792.9A CN112069997B (en) 2020-09-04 2020-09-04 Unmanned aerial vehicle autonomous landing target extraction method and device based on DenseHR-Net

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010925792.9A CN112069997B (en) 2020-09-04 2020-09-04 Unmanned aerial vehicle autonomous landing target extraction method and device based on DenseHR-Net

Publications (2)

Publication Number Publication Date
CN112069997A (en) 2020-12-11
CN112069997B (en) 2023-07-28

Family

ID=73662742

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010925792.9A Active CN112069997B (en) 2020-09-04 2020-09-04 Unmanned aerial vehicle autonomous landing target extraction method and device based on DenseHR-Net

Country Status (1)

Country Link
CN (1) CN112069997B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115019064A (en) * 2022-06-27 2022-09-06 华中科技大学 Double-stage key part identification method for prevention and control of rotor unmanned aerial vehicle
CN115524964B (en) * 2022-08-12 2023-04-11 中山大学 Rocket landing real-time robust guidance method and system based on reinforcement learning

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104463889A (en) * 2014-12-19 2015-03-25 中国人民解放军国防科学技术大学 Unmanned plane autonomous landing target extracting method based on CV model
CN109961006A (en) * 2019-01-30 2019-07-02 东华大学 A kind of low pixel multiple target Face datection and crucial independent positioning method and alignment schemes

Also Published As

Publication number Publication date
CN112069997A (en) 2020-12-11

Similar Documents

Publication Publication Date Title
CN109800689B (en) Target tracking method based on space-time feature fusion learning
CN110675418B (en) Target track optimization method based on DS evidence theory
CN113516664B (en) Visual SLAM method based on semantic segmentation dynamic points
Yang et al. Concrete defects inspection and 3D mapping using CityFlyer quadrotor robot
CN113076871B (en) Fish shoal automatic detection method based on target shielding compensation
CN110688905B (en) Three-dimensional object detection and tracking method based on key frame
CN109145836B (en) Ship target video detection method based on deep learning network and Kalman filtering
CN110633632A (en) Weak supervision combined target detection and semantic segmentation method based on loop guidance
CN111985376A (en) Remote sensing image ship contour extraction method based on deep learning
CN111507222B (en) Three-dimensional object detection frame based on multisource data knowledge migration
CN113963240B (en) Comprehensive detection method for multi-source remote sensing image fusion target
CN115797736B (en) Training method, device, equipment and medium for target detection model and target detection method, device, equipment and medium
Wen et al. Hybrid semi-dense 3D semantic-topological mapping from stereo visual-inertial odometry SLAM with loop closure detection
CN113284144B (en) Tunnel detection method and device based on unmanned aerial vehicle
CN112069997B (en) Unmanned aerial vehicle autonomous landing target extraction method and device based on DenseHR-Net
CN110969648A (en) 3D target tracking method and system based on point cloud sequence data
CN113706579A (en) Prawn multi-target tracking system and method based on industrial culture
Sun et al. IRDCLNet: Instance segmentation of ship images based on interference reduction and dynamic contour learning in foggy scenes
CN113920254B (en) Monocular RGB (Red Green blue) -based indoor three-dimensional reconstruction method and system thereof
Feng Mask RCNN-based single shot multibox detector for gesture recognition in physical education
Duan [Retracted] Deep Learning‐Based Multitarget Motion Shadow Rejection and Accurate Tracking for Sports Video
Li et al. Driver drowsiness behavior detection and analysis using vision-based multimodal features for driving safety
CN116935356A (en) Weak supervision-based automatic driving multi-mode picture and point cloud instance segmentation method
Castellano et al. Crowd flow detection from drones with fully convolutional networks and clustering
CN116824333A (en) Nasopharyngeal carcinoma detecting system based on deep learning model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant