CN112329551A - Unmanned aerial vehicle autonomous landing method and model training method - Google Patents

Unmanned aerial vehicle autonomous landing method and model training method

Info

Publication number
CN112329551A
Authority
CN
China
Prior art keywords
image
data
training
real environment
graph structure
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011109393.1A
Other languages
Chinese (zh)
Inventor
陈杰
李坚强
张一帆
杜威铭
刘桂彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Zhongke Baotai Aerospace Technology Co.,Ltd.
Original Assignee
Shenzhen Zhongke Baotai Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Zhongke Baotai Technology Co ltd filed Critical Shenzhen Zhongke Baotai Technology Co ltd
Priority to CN202011109393.1A priority Critical patent/CN112329551A/en
Publication of CN112329551A publication Critical patent/CN112329551A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
      • G05 CONTROLLING; REGULATING
        • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
          • G05D 1/00 Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
            • G05D 1/10 Simultaneous control of position or course in three dimensions
              • G05D 1/101 Simultaneous control of position or course in three dimensions specially adapted for aircraft
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06F ELECTRIC DIGITAL DATA PROCESSING
          • G06F 18/00 Pattern recognition
            • G06F 18/20 Analysing
              • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
                • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
              • G06F 18/22 Matching criteria, e.g. proximity measures
              • G06F 18/24 Classification techniques
              • G06F 18/29 Graphical models, e.g. Bayesian networks
        • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N 3/00 Computing arrangements based on biological models
            • G06N 3/02 Neural networks
              • G06N 3/04 Architecture, e.g. interconnection topology
                • G06N 3/045 Combinations of networks
                • G06N 3/048 Activation functions
              • G06N 3/08 Learning methods
        • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
          • G06V 20/00 Scenes; Scene-specific elements

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Automation & Control Theory (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the application provide an unmanned aerial vehicle autonomous landing method and a model training method. The autonomous landing method comprises the following steps: the unmanned aerial vehicle acquires a real environment image of the to-be-landed area through an airborne image acquisition device; the unmanned aerial vehicle converts the real environment image into graph structure data; the unmanned aerial vehicle inputs the graph structure data into a graph convolution decision network model trained in advance and obtains the action, output by the model, for the graph structure data; and the unmanned aerial vehicle executes the action to land. The embodiments of the application improve the safety and reliability of autonomous landing of unmanned aerial vehicles.

Description

Unmanned aerial vehicle autonomous landing method and model training method
Technical Field
The application belongs to the technical field of unmanned aerial vehicles and artificial intelligence, and particularly relates to an unmanned aerial vehicle autonomous landing method and a model training method.
Background
With the continuous development of unmanned aerial vehicle technology, unmanned aerial vehicles play an increasingly important role in many real-world scenes, such as surveillance, environmental mapping, package delivery, and search and rescue.
In particular applications, a drone may need to land autonomously. Existing autonomous landing techniques for unmanned aerial vehicles typically rely on GPS for positioning and navigation. However, the anti-jamming capability of GPS is weak: if the drone encounters electronic interference, the GPS signal may no longer support positioning and navigation, and an accident may occur during autonomous landing. Moreover, even in natural environments, GPS signals can be affected by many factors. Therefore, autonomous landing that depends on GPS signals offers low safety and reliability.
Disclosure of Invention
The embodiments of the application provide an unmanned aerial vehicle autonomous landing method and a model training method that can improve the safety and reliability of autonomous landing.
In a first aspect, an embodiment of the present application provides an autonomous landing method for an unmanned aerial vehicle, where the method may be applied to an unmanned aerial vehicle, and the method includes:
acquiring a real environment image of a to-be-landed area through an airborne image acquisition device;
converting the real environment image into graph structure data;
inputting the graph structure data into a graph convolution decision network model trained in advance, and obtaining the action, output by the graph convolution decision network model, for the graph structure data;
and executing the action to land.
In the embodiments of the application, after the unmanned aerial vehicle collects the real environment image, it uses the pre-trained graph convolution decision network model to predict, from the graph structure data of the image, the action corresponding to each graph structure, and finally executes the predicted action to land autonomously. Because this does not depend on a GPS signal, the reliability and safety of autonomous landing are improved.
Illustratively, if in the real environment image collected by the unmanned aerial vehicle the left side of the image is cement ground or textured ground, the action predicted by the graph convolution decision network model is "left", and the unmanned aerial vehicle correspondingly flies left to land on the cement or textured ground. Landform types suitable for landing include cement ground and textured ground; landforms such as rivers and lakes are not suitable for landing.
In some possible implementations of the first aspect, the converting the real environment image into graph structure data includes:
cutting the real environment image into image blocks;
inputting the image blocks into a landform classification model which is trained in advance, and obtaining the landform type of each image block output by the landform classification model;
and taking each image block as a node, taking the landform type of each image block as node information, calculating the similarity between each node and an adjacent node, and adding an edge between two nodes corresponding to the similarity when the similarity is greater than a similarity threshold value to obtain the graph structure data of the real environment image.
In a second aspect, an embodiment of the present application provides a model training method, where the method includes:
acquiring image data of a simulation environment;
converting the image data of the simulation environment into graph structure data for training and graph structure data for testing according to a preset similarity threshold;
training a graph convolution decision network model of the simulation environment by using graph structure data for training to obtain the graph convolution decision network model of the trained simulation environment;
testing the graph convolution decision network model of the trained simulation environment by using the test graph structure data to obtain the action prediction accuracy;
after updating the preset similarity threshold, returning to the step of converting the image data into graph structure data for training and graph structure data for testing according to the preset similarity threshold;
and after repeating for multiple times, selecting a preset similarity threshold corresponding to the highest action prediction accuracy as a target similarity threshold, and taking a graph convolution decision network model of the trained simulation environment corresponding to the target similarity threshold as a target graph convolution decision network model.
In some possible implementations of the second aspect, the converting the image data into the graph structure data for training and the graph structure data for testing according to the preset similarity threshold includes:
training a landform classification model by using image data of a simulation environment to obtain a trained landform classification model;
acquiring marked image data, wherein the marked image data is data obtained by marking a preset landform type in the image data of the simulation environment with an action label;
dividing the labeled image data into training data and testing data;
converting training data into graph structure data for training according to a preset similarity threshold;
and converting the test data into the graph structure data for testing according to a preset similarity threshold.
In some possible implementations of the second aspect, the converting training data into graph structure data for training according to a preset similarity threshold includes:
cutting the training data into image blocks;
inputting image blocks of training data into the trained landform classification model to obtain the landform type of each image block output by the trained landform classification model;
and taking each image block of the training data as a node, taking the landform type of each image block as node information, calculating the similarity between each node and an adjacent node, and adding an edge between two nodes corresponding to the similarity when the similarity is greater than a preset similarity threshold value to obtain graph structure data for training.
In some possible implementations of the second aspect, the converting the test data into graph structure data for testing according to a preset similarity threshold includes:
cutting the test data into image blocks;
inputting image blocks of the test data into the trained landform classification model to obtain the landform type of each image block output by the trained landform classification model;
and taking each image block of the test data as a node, taking the landform type of each image block as node information, calculating the similarity between each node and an adjacent node, and adding an edge between two nodes corresponding to the similarity when the similarity is greater than a preset similarity threshold value to obtain the graph structure data for testing.
In some possible implementations of the second aspect, the method further comprises:
acquiring image data of a real environment, wherein the image data of the real environment is an environment image of a to-be-landed area acquired by an unmanned aerial vehicle;
converting the image data of the real environment into graph structure data of the real environment according to the target similarity threshold;
and training a target graph convolution decision network model by using graph structure data of a real environment to obtain the optimized target graph convolution decision network model.
In the implementation mode, the target graph convolution decision network model is optimized by collecting the real environment image of the area to be landed, the optimized target graph convolution decision network model is loaded on the unmanned aerial vehicle, and the unmanned aerial vehicle uses the optimized target graph convolution decision network model to perform action prediction, so that the safety and the reliability of autonomous landing of the unmanned aerial vehicle are further improved.
It should be noted that the target graph convolution decision network model may be directly loaded onto the unmanned aerial vehicle, or the target graph convolution decision network model may be optimized through the real environment image data, and then the optimized target graph convolution decision network model is loaded onto the unmanned aerial vehicle.
In some possible implementations of the second aspect, converting the image data of the real environment into graph structure data of the real environment according to a target similarity threshold includes:
training a landform classification model by using image data of a real environment to obtain a trained target landform classification model;
acquiring image data of a marked real environment, wherein the image data of the marked real environment is obtained by marking a preset landform type in the image data of the real environment with an action label;
cutting image data of a real environment into image blocks;
inputting image blocks of a real environment into a target landform classification model, and obtaining the landform types of the image blocks output by the target landform classification model;
and taking each image block as a node, taking the landform type of each image block as node information, calculating the similarity between each node and its adjacent nodes, and when a similarity is greater than the target similarity threshold, adding an edge between the two corresponding nodes to obtain the graph structure data of the real environment.
In a third aspect, an embodiment of the present application provides a drone, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the autonomous landing method of the drone according to the first aspect.
In a fourth aspect, embodiments of the present application provide an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor, when executing the computer program, implements the model training method according to the second aspect.
In a fifth aspect, embodiments of the present application provide a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the method according to any one of the first aspect or the second aspect.
In a sixth aspect, embodiments of the present application provide a computer program product, which, when run on an electronic device, causes the electronic device to perform the method of any one of the first aspect or the second aspect.
It is understood that the beneficial effects of the second to sixth aspects can be seen from the description of the first aspect, and are not described herein again.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; other drawings can be obtained from them by those skilled in the art without creative effort.
FIG. 1 is a schematic block diagram of a process of a model training method according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a graph structure transformation provided in an embodiment of the present application;
fig. 3 is a schematic comparison diagram of image blocks and a graph structure provided in an embodiment of the present application;
fig. 4 is a schematic flowchart of step S102 according to an embodiment of the present disclosure;
FIG. 5 is a schematic block diagram of a process flow of a model optimization process provided by an embodiment of the present application;
fig. 6 is a schematic block diagram of a flow of an autonomous landing method of an unmanned aerial vehicle according to an embodiment of the present application;
FIG. 7 is a block diagram schematically illustrating a structure of a model training apparatus according to an embodiment of the present disclosure;
fig. 8 is a block diagram illustrating a structure of an autonomous landing gear of an unmanned aerial vehicle according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
An unmanned aerial vehicle and the equipment it carries are often expensive. If an accident occurs during flight or landing, it can not only severely damage the drone or its equipment but also cause unpredictable injury to facilities or people on the ground. Therefore, the safety and reliability of the landing process are particularly important.
The embodiments of the application provide an unmanned aerial vehicle autonomous landing scheme to improve the safety and reliability of the landing process. The scheme is introduced below in two parts, a model training phase and an application phase. In the following description, for purposes of explanation and not limitation, specific details such as particular system structures and techniques are set forth in order to provide a thorough understanding of the embodiments of the present application.
Model training phase
Referring to fig. 1, a schematic flow chart of a model training method provided in an embodiment of the present application, where the method is applied to a terminal device, and the method may include the following steps:
and step S101, acquiring image data of the simulation environment.
Specifically, a simulation environment is built, and images are then collected from it to obtain the image data of the simulation environment.
For example, using Gazebo simulation software, a map and an unmanned aerial vehicle model are loaded directly on the ROS operating system to build the simulation environment.
Different pixel values are selected by reference to a picture of the real environment, and the landform corresponding to each selected pixel value (e.g., grass, cement) is filled in to generate the map. For buildings, ROS provides building models that can be dragged directly onto the map. The unmanned aerial vehicle model is loaded from an algorithm package downloaded through ROS. The drone models used in the real environment and the simulation environment may be the same or different.
A variety of constructed landform types may be included in the simulation environment, including but not limited to cement ground, grassland, textured (brick) ground, rooftops, and bare land. Generally, textured ground and cement ground are landforms relatively suitable for the landing of the unmanned aerial vehicle.
After the simulation environment is built, the map and the unmanned aerial vehicle model are loaded on the operating system using the simulation software, a drone is selected in the simulation environment, and the drone is then used to acquire image data in the simulation environment, giving the image data of the simulation environment.
It will be appreciated that the image data of the simulated environment includes a plurality of pictures, each picture including at least one terrain type.
Step S102, converting the image data of the simulation environment into graph structure data for training and graph structure data for testing according to a preset similarity threshold.
It should be noted that the preset similarity threshold may be set manually according to actual needs, and is not limited herein.
The image data of the simulation environment comprises a plurality of pictures, each picture is converted into a graph structure, and the graph structures of the plurality of pictures form the graph structure data.
The graph structure may be represented as an undirected graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ is the set of nodes connected by the edges $\mathcal{E}$. Each node $v_i \in \mathcal{V}$ is associated with a $d$-dimensional feature vector describing the node, and each link between a pair of nodes $(v_i, v_j) \in \mathcal{E}$ has a real-valued weight $w_{ij}$ expressing their similarity (e.g., the number of similar image blocks).
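As a minimal illustration of this representation (a sketch assuming the NetworkX library; the node indices and attribute names are illustrative, not from the patent):

```python
import networkx as nx

# Undirected graph G = (V, E): each node carries a landform type as node
# information, and each edge carries a real-valued similarity weight w_ij.
G = nx.Graph()
G.add_node(0, terrain="texture")   # node with its landform type
G.add_node(1, terrain="cement")
G.add_edge(0, 1, weight=0.8)       # similarity weight between the two nodes
```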
In specific application, each picture is cut into image blocks; each image block serves as a node, and the landform type of the image block is the node information. When the similarity between an image block and an adjacent image block is greater than the preset similarity threshold, an edge is added between them as edge information in the graph structure.
For example, referring to the graph structure conversion diagram shown in fig. 2: the image data of the simulation environment includes a certain picture, which is cut into the image blocks 211 to 219 shown in fig. 2. First, the image blocks are input into the landform classification model, which predicts the landform type of each image block; then the similarity between each image block and its adjacent image blocks is calculated. Image blocks at different positions have different numbers of neighbors: the neighbors of image block 211 are image blocks 212, 214, and 215, while the neighbors of image block 215 are image blocks 211, 212, 213, 214, 216, 217, 218, and 219.
Then, each image block is used as a node. The image blocks 211-219 correspond to nodes 221-229 in sequence, wherein the image block 211 corresponds to the node 221, the image block 215 corresponds to the node 225, the image block 219 corresponds to the node 229, and so on.
The landform type of each image block is used as the node information. For example, the landform type of image block 211 is texture, so the node information of node 221 is texture; the landform type of image block 219 is grassland, so the node information of node 229 is grassland.
After the similarity between each image block and its adjacent image blocks is calculated, it is compared with the preset similarity threshold. When the similarity is greater than the threshold, an edge is added between the nodes corresponding to the two image blocks; otherwise, no edge is added, or a dotted edge is added.
For example, as shown in fig. 2, the similarity between image block 211 and image block 215 is greater than the preset similarity threshold, so a solid edge is added between node 221 and node 225; the similarity between image block 212 and image block 213 is less than the threshold, so a dashed edge is added between node 222 and node 223. Of course, in a specific application, when the similarity between two image blocks is smaller than the preset similarity threshold, no edge may be added at all.
For a concrete example, see the comparison of image blocks and the corresponding graph structure shown in fig. 3: the left side of fig. 3 shows a picture from the simulation environment cut into image blocks, and the right side shows the corresponding graph structure. In a specific application, nodes can be labeled with different colors to represent different landform types.
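The conversion described above can be sketched as follows, assuming a trained classifier `landform_model` with a `predict` method and a `similarity` function over neighboring blocks (both names are illustrative assumptions, not from the patent):

```python
import networkx as nx

def image_to_graph(image, block_size, landform_model, similarity, threshold):
    """Cut an image into blocks, classify each block, and build the graph."""
    h, w = image.shape[0] // block_size, image.shape[1] // block_size
    blocks = [image[i*block_size:(i+1)*block_size, j*block_size:(j+1)*block_size]
              for i in range(h) for j in range(w)]

    G = nx.Graph()
    for idx, block in enumerate(blocks):
        G.add_node(idx, terrain=landform_model.predict(block))  # block = node

    # 8-neighborhood adjacency, as in fig. 2 (corner blocks have 3 neighbors,
    # the center block has 8)
    for i in range(h):
        for j in range(w):
            u = i * w + j
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    if (di, dj) != (0, 0) and 0 <= ni < h and 0 <= nj < w:
                        v = ni * w + nj
                        s = similarity(blocks[u], blocks[v])
                        if s > threshold and not G.has_edge(u, v):
                            G.add_edge(u, v, weight=s)  # solid edge
    return G
```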
In specific applications, there are many ways to convert image data of a simulation environment into graph structure data according to a preset similarity threshold. For example, in some embodiments, referring to the specific flowchart of step S102 shown in fig. 4, the process of converting the image data into the graph structure data for training and the graph structure data for testing according to the preset similarity threshold may include the following steps:
step S401, training a landform classification model by using the image data of the simulation environment to obtain the trained landform classification model.
It should be noted that the landform classification model is constructed in advance, and may be a model composed of convolutional layers, ReLU layers, max pooling layers, a flatten layer, and dense (fully connected) layers.
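A minimal sketch of such a classifier in PyTorch (the layer counts, channel sizes, the 64x64 block size, and the five landform classes are assumptions; the patent only names the layer types):

```python
import torch.nn as nn

class LandformClassifier(nn.Module):
    """Conv -> ReLU -> MaxPool blocks followed by Flatten and Dense layers."""
    def __init__(self, num_classes=5):  # e.g., cement, grass, texture, roof, land
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, num_classes),  # assumes 64x64 input blocks
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```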
In specific application, all the image data of the simulation environment can be directly used for training the landform classification model, or the image data of the simulation environment can be divided into training data and test data, and then the training data is used for training the landform classification model. For example, the image data of the simulation environment has a total of 1600 pictures, of which 1000 are used for training, i.e., as training data, and 600 are used for testing, i.e., as testing data.
The training method of the landform classification model can be referred to in the publication "CN 110728295A" or "CN 110766038A", which is not described herein again. Of course, the geomorphic classification model may also be trained by other methods.
Step S402, obtaining marked image data, wherein the marked image data is obtained by marking a preset geomorphic type in the image data of the simulation environment with an action label.
Specifically, for each picture in the image data of the simulation environment, a corresponding action tag is manually labeled. The action tags may include, but are not limited to, left, right, forward, back, and land.
In the process of marking the picture, marking corresponding actions for the picture with the preset landform type according to the position of the preset landform type in the picture. For example, in a picture, the left side of the picture is textured, and the action of "left" is labeled.
For example, the preset landform types may include but are not limited to textured ground and cement ground. That is, in the image data of the simulation environment, pictures containing a preset landform type such as textured or cement ground are labeled, and pictures without such a landform type are not labeled.
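The labeling itself is manual, but the position rule can be illustrated with a hypothetical helper (the left/right/land split below is an illustrative assumption, not the patent's procedure):

```python
def action_label(block_terrains, grid_w, suitable=frozenset({"texture", "cement"})):
    """Label one picture by where suitable landform appears in its block grid."""
    cols = [i % grid_w for i, t in enumerate(block_terrains) if t in suitable]
    if not cols:
        return None                          # no suitable landform: not labeled
    center = (grid_w - 1) / 2
    mean_col = sum(cols) / len(cols)
    if abs(mean_col - center) < 0.5:
        return "land"                        # suitable landform directly below
    return "left" if mean_col < center else "right"
```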
Step S403, dividing the labeled image data into training data and test data.
In other embodiments, the image data of the simulation environment may be divided into training data and test data, and then the training data and the test data are labeled with action labels, respectively, to obtain labeled training data and labeled test data.
Step S404, converting the training data into graph structure data for training according to a preset similarity threshold.
Specifically, each picture in the training data is cut into image blocks, which are then input into the landform classification model obtained in step S401; the model outputs the landform type of each image block.
Then, each image block of the training data is used as a node, the landform type to which each image block belongs is used as node information, the similarity between each node and an adjacent node is calculated, and when the similarity is larger than a preset similarity threshold, an edge is added between two nodes corresponding to the similarity, so that graph structure data for training is obtained.
Step S405, converting the test data into graph structure data for testing according to a preset similarity threshold.
Specifically, each picture in the test data is cut into image blocks, which are then input into the landform classification model obtained in step S401; the model outputs the landform type of each image block.
And then, taking each image block of the test data as a node, taking the landform type of each image block as node information, calculating the similarity between each node and an adjacent node, and adding an edge between two nodes corresponding to the similarity when the similarity is greater than a preset similarity threshold value to obtain the graph structure data for testing.
Step S103, using the graph structure data for training to train the graph convolution decision network model of the simulation environment, and obtaining the graph convolution decision network model of the simulation environment after training.
It should be noted that a graph convolution decision network model (GCNNDN) of the simulation environment is constructed in advance. The graph convolution decision network model may consist of four graph convolution layers, a pooling layer, and a fully connected layer.
Each graph convolution layer can be written as a nonlinear function, defined as

$$H^{(l+1)} = \sigma\left(D^{-1/2} A D^{-1/2}\, H^{(l)} W^{(l)}\right)$$

where $\sigma$ denotes the activation function, and $H^{(l)}$ and $W^{(l)}$ are the activation matrix and weight matrix of the $l$-th layer, with $H^{(0)} = X$. The normalized graph Laplacian has the eigendecomposition $L = I_n - D^{-1/2} A D^{-1/2} = U \Lambda U^T$, where $A$ and $D$ are the adjacency matrix and degree matrix corresponding to the graph structure, respectively, and $\Lambda$ is a diagonal matrix whose diagonal elements are the eigenvalues of $L$.
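A sketch of this layer and the four-layer model in PyTorch (a dense-matrix implementation for readability; adding self-loops before normalization, the hidden width, and the five action classes are assumptions):

```python
import torch
import torch.nn as nn

class GraphConv(nn.Module):
    """One layer: H^(l+1) = sigma(D^-1/2 A D^-1/2 H^(l) W^(l))."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, a_norm, h):            # a_norm: normalized adjacency matrix
        return torch.relu(self.weight(a_norm @ h))

class GCNDecisionNet(nn.Module):
    """Four graph convolution layers, a pooling layer, and a fully connected layer."""
    def __init__(self, in_dim, hidden=64, num_actions=5):
        super().__init__()
        self.convs = nn.ModuleList(
            [GraphConv(in_dim, hidden)] + [GraphConv(hidden, hidden) for _ in range(3)]
        )
        self.fc = nn.Linear(hidden, num_actions)    # softmax is applied in the loss

    def forward(self, a, h):
        a = a + torch.eye(a.size(0))                # self-loops (assumption)
        d_inv_sqrt = a.sum(dim=1).pow(-0.5).diag()  # D^-1/2
        a_norm = d_inv_sqrt @ a @ d_inv_sqrt
        for conv in self.convs:
            h = conv(a_norm, h)
        return self.fc(h.mean(dim=0))               # mean pooling, then dense layer
```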
Specifically, the graph structure data for training is input into the graph convolution decision network model of the simulation environment, and the fully connected layer of the model predicts the probability of each action for the input graph structure using a softmax function. The action with the highest probability is selected as the predicted action. The loss between the predicted action and the labeled real action is calculated using a loss function (specifically, a cross-entropy loss function), and the network parameters are then updated backward using stochastic gradient descent. This is iterated many times until the value of the loss function no longer changes or the model converges, giving the trained graph convolution decision network model of the simulation environment.
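A training-loop sketch matching this description (cross-entropy over softmax outputs, stochastic gradient descent; `train_graphs`, the epoch count, and the learning rate are assumptions, and `GCNDecisionNet` is the sketch above):

```python
import torch

model = GCNDecisionNet(in_dim=5)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.CrossEntropyLoss()          # softmax + cross-entropy loss

for epoch in range(100):                       # iterate until the loss converges
    for a, h, action in train_graphs:          # adjacency, features, labeled action (tensors)
        logits = model(a, h)
        loss = loss_fn(logits.unsqueeze(0), action.unsqueeze(0))
        optimizer.zero_grad()
        loss.backward()                        # backpropagate the loss
        optimizer.step()                       # update the network parameters
```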
Step S104, testing the graph convolution decision network model of the trained simulation environment by using the graph structure data for testing to obtain the action prediction accuracy.
Specifically, each graph structure in the graph structure data for testing is input into the graph convolution decision network model obtained in step S103, and the model outputs (predicts) the action of each graph structure. If the predicted action of a graph structure is consistent with the manually labeled action, the prediction is correct; otherwise it is incorrect. The action prediction accuracy is calculated on this basis.
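Under the same assumptions as the training sketch, the action prediction accuracy might be computed as:

```python
import torch

correct = 0
with torch.no_grad():
    for a, h, action in test_graphs:           # graph structure data for testing
        predicted = model(a, h).argmax()       # action with the highest probability
        correct += int(predicted == action)    # consistent with the labeled action
accuracy = correct / len(test_graphs)
```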
Step S105, after updating the preset similarity threshold, returning to step S102 and executing steps S102 to S104.
It should be noted that, in the process of converting image data into a graph structure, different similarity thresholds affect the extraction of edge information between nodes, and thereby the graph structures used for training the graph convolution decision network model. Therefore, an appropriate similarity threshold needs to be found.
In specific application, a plurality of different similarity thresholds can be preset manually so as to extract multiple groups of graph structure data sets for training graph convolution decision network models. The preset similarity threshold is a value within the range 0 to 1; for example, it may be set to 0.5 the first time.
That is, a plurality of different similarity thresholds are preset; when the preset similarity threshold needs to be updated, a threshold that has not been selected before is chosen from them as the updated preset similarity threshold. Then the procedure returns to step S102 and executes steps S102 to S104 to obtain the action prediction accuracy corresponding to the updated threshold. This is carried out several times in sequence to obtain the action prediction accuracy corresponding to each preset similarity threshold.
Step S106, after repeating multiple times, selecting the preset similarity threshold corresponding to the highest action prediction accuracy as the target similarity threshold, and taking the graph convolution decision network model of the trained simulation environment corresponding to the target similarity threshold as the target graph convolution decision network model.
Specifically, after steps S102 to S104 have been performed with multiple different preset similarity thresholds, multiple action prediction accuracies are obtained. These accuracies are compared; the preset similarity threshold corresponding to the highest accuracy is selected as the target similarity threshold, and the graph convolution decision network model corresponding to the target similarity threshold is taken as the target graph convolution decision network model.
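The whole sweep of steps S102 to S106 then amounts to the following loop (a sketch; `build_datasets`, `train`, and `evaluate` stand in for steps S102 to S104, and the candidate thresholds are assumptions):

```python
best_acc, target_threshold, target_model = 0.0, None, None
for threshold in (0.3, 0.4, 0.5, 0.6, 0.7):    # preset thresholds in the range 0-1
    train_graphs, test_graphs = build_datasets(threshold)  # step S102
    model = train(train_graphs)                            # step S103
    acc = evaluate(model, test_graphs)                     # step S104
    if acc > best_acc:                                     # step S106: keep the best
        best_acc, target_threshold, target_model = acc, threshold, model
```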
The target graph convolution decision network model can be regarded as the fully trained graph convolution decision network model. In other words, in practical applications, the drone may use the target graph convolution decision network model to predict an action and land autonomously according to the predicted action.
At this point, the model training phase of the graph convolution decision network model ends. The application phase will be described below.
Application phase
After the graph convolution decision network model is trained, it is loaded onto the unmanned aerial vehicle. In actual flight operation, when the unmanned aerial vehicle needs to land autonomously, it collects a real environment image of the area to be landed, converts the image into graph structure data, inputs the graph structure data into the pre-loaded graph convolution decision network model to predict the action, and lands autonomously according to the predicted action.
In some embodiments, the target similarity threshold and the target graph convolution decision network model in step S106 may be loaded to the unmanned aerial vehicle, and the unmanned aerial vehicle converts the acquired real environment image into graph structure data using the target similarity threshold, and then predicts the action of the graph structure data using the target graph convolution decision network model.
In other embodiments, in order to further improve the accuracy of motion prediction and improve the safety and reliability of autonomous landing of the unmanned aerial vehicle, the target graph convolution decision network model obtained in the model training stage may be subjected to fine tuning, and then the fine-tuned target graph convolution decision network model is loaded onto the unmanned aerial vehicle.
That is, based on the above embodiment, referring to the flow schematic block diagram of the model optimization process shown in fig. 5, the model training method may further include the following steps:
step S501, image data of a real environment is obtained, and the image data of the real environment is an environment image of a to-be-landed area acquired by the unmanned aerial vehicle.
Specifically, before the unmanned aerial vehicle flies for operation, it is used to capture image data of a certain area. For example, when the drone is going to area C for flight operation, it is used to capture image data of area C; at this time, area C is the to-be-landed area, and the image data of area C is the image data of the real environment.
The acquired image data of the real environment may be only a small amount of data.
It is understood that the image data of the real environment includes a plurality of pictures, each containing landforms such as cement ground or grassland.
Step S502, converting the image data of the real environment into graph structure data of the real environment according to the target similarity threshold.
Wherein, the target similarity threshold is the similarity threshold in step S106.
Specifically, first, a landform classification model is trained using image data of a real environment, and a trained target landform classification model is obtained. And then, acquiring image data of the marked real environment, wherein the image data of the marked real environment is obtained by marking a preset landform type in the image data of the real environment with an action label.
The image data of the real environment is cut into image blocks, which are then input into the target landform classification model to obtain the landform type of each image block output by the model.
Finally, each image block of the real environment is taken as a node, the landform type of each image block is taken as node information, the similarity between each node and its adjacent nodes is calculated, and when a similarity is greater than the target similarity threshold, an edge is added between the two corresponding nodes, giving the graph structure data of the real environment.
It should be noted that, for the process of converting the image data of the real environment into graph structure data, reference may be made to the relevant content of step S102, which is not repeated here.
Step S503, training a target graph convolution decision network model by using graph structure data of a real environment to obtain the optimized target graph convolution decision network model.
Specifically, the graph structure data in step S502 is input to the target graph convolution decision network model in step S106, so as to perform fine tuning or optimization on the model, thereby obtaining an optimized target graph convolution decision network model.
Further, the graph structure data of the real environment is input into the target graph convolution decision network model, the model predicts the action of each graph structure, and the loss value between the predicted action and the real action is calculated, specifically using a cross-entropy loss function. The network parameters are then updated using stochastic gradient descent. This is iterated many times until the value of the loss function no longer changes or the model converges, giving the fine-tuned graph convolution decision network model and thus realizing model fine-tuning or model optimization.
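A fine-tuning sketch under the same assumptions as the earlier training loop (the smaller learning rate and epoch count are assumptions; `real_env_graphs` holds the graph structure data of the real environment):

```python
import torch

optimizer = torch.optim.SGD(target_model.parameters(), lr=0.001)  # smaller lr for fine-tuning
loss_fn = torch.nn.CrossEntropyLoss()

for epoch in range(20):
    for a, h, action in real_env_graphs:
        loss = loss_fn(target_model(a, h).unsqueeze(0), action.unsqueeze(0))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()               # update the target model's parameters
```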
That is, the image data of the real environment is used to update the network parameters of the target graph convolution decision network model of the simulation environment, optimize the loss function, and further fine tune the model.
After the optimized target graph convolution decision network model is obtained, the optimized target graph convolution decision network model, the target similarity threshold and the landform classification model obtained through training in the step S502 are loaded to the unmanned aerial vehicle. The unmanned aerial vehicle can perform autonomous landing by using the optimized target graph convolution decision network model, the target similarity threshold and the landform classification model trained in the step S502.
The target graph convolution decision network model is optimized by collecting the real environment image of the area to be landed, the optimized target graph convolution decision network model is loaded on the unmanned aerial vehicle, and the unmanned aerial vehicle uses the optimized target graph convolution decision network model to perform action prediction, so that the safety and the reliability of autonomous landing of the unmanned aerial vehicle are further improved.
The autonomous landing process of one side of the drone will be described below.
Referring to fig. 6, a schematic flow diagram of an autonomous landing method of a drone provided in an embodiment of the present application is shown, where the method may be applied to a drone, and the method may include the following steps:
step S601, the unmanned aerial vehicle acquires a real environment image of the area to be landed through the airborne image acquisition device.
For example, the unmanned aerial vehicle performs a flight operation over area C. During the operation, the drone needs to land autonomously for some reason; at this moment, area C is the above-mentioned to-be-landed area. The drone acquires a real environment image of area C, i.e., the real environment image of the to-be-landed area.
The image capturing device may be, but is not limited to, a camera.
Step S602, the unmanned aerial vehicle converts the real environment image into graph structure data.
Specifically, the real environment image is converted into graph structure data using the pre-loaded landform classification model and the target similarity threshold.
The landform classification model loaded on the unmanned aerial vehicle can be the one obtained in the model fine-tuning process, i.e., in step S502. Of course, if the model fine-tuning process is not performed, it may be the landform classification model from step S401.
First, the real environment image is cut into image blocks, which are input into the pre-trained landform classification model to obtain the landform type of each image block; here, the pre-trained landform classification model is the one loaded on the unmanned aerial vehicle.
Then, each image block is taken as a node, the landform type of each image block is taken as node information, and the similarity between each node and its adjacent nodes is calculated; when a similarity is greater than the target similarity threshold, an edge is added between the two corresponding nodes, giving the graph structure data of the real environment image. The specific process can refer to the corresponding content above and is not repeated here.
Step S603, the unmanned aerial vehicle inputs the graph structure data into the graph convolution decision network model trained in advance, and obtains the action, output by the model, for the graph structure data.
At this time, the graph convolution decision network model trained in advance may refer to a target graph convolution decision network model after model optimization, or may refer to a target graph convolution decision network model without a model optimization process.
Step S604, the unmanned aerial vehicle executes the action to land.
For example, when the predicted action is "left", the drone flies leftward; when the next predicted action is "land", the drone lands autonomously.
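The mapping from predicted action to flight command could be as simple as the following dispatch (a sketch; `fly` and `touch_down` are hypothetical flight-control calls, not part of the patent):

```python
def execute(action, drone):
    """Translate a predicted action into a flight command."""
    if action == "land":
        drone.touch_down()                # descend onto the terrain below
    else:
        drone.fly(direction=action)       # "left", "right", "forward", "back"
```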
According to the above unmanned aerial vehicle autonomous landing method, after the unmanned aerial vehicle acquires the real environment image, it uses the pre-trained graph convolution decision network model to predict, from the graph structure data of the image, the action corresponding to each graph structure, and finally executes the predicted action to land autonomously. Because this does not depend on a GPS signal, the reliability and safety of autonomous landing are improved.
Through the autonomous landing scheme provided by the embodiments of this application, landforms suitable for landing, such as textured ground or cement ground, can be found, and the unmanned aerial vehicle can be controlled to land at these suitable places.
It should be noted that the model training phase and the model fine-tuning phase are generally performed on a terminal device, which may be a desktop computer, a notebook, a palmtop computer, a cloud server, or another computing device, while the autonomous landing process is generally performed on the unmanned aerial vehicle itself.
However, in other embodiments, the execution bodies of the model training phase, the model fine tuning phase, and the autonomous landing process of the drone may be arbitrary. For example, in the stage of model training performed by the ground computing equipment, a trained target graph convolution decision network model is obtained, then the target graph convolution decision network model is loaded onto the unmanned aerial vehicle, and the unmanned aerial vehicle performs a model fine tuning process and an unmanned aerial vehicle autonomous landing process. For another example, the autonomous landing process of the unmanned aerial vehicle may be performed by the ground station, at this time, the ground station device obtains a real environment image returned by the unmanned aerial vehicle, predicts an action of the graph structure data of the real environment image by using the target graph convolution decision network model or the model-optimized target graph convolution decision network model, and transmits a corresponding control instruction to the unmanned aerial vehicle according to the predicted action, so that the ground station device controls the unmanned aerial vehicle to land.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Fig. 7 is a schematic block diagram of a structure of a model training apparatus provided in an embodiment of the present application, which corresponds to the model training method in the foregoing embodiment, and only shows portions related to the embodiment of the present application for convenience of description.
The apparatus may include:
a simulation environment image data acquisition module 71, configured to acquire image data of a simulation environment;
the conversion module 72 is configured to convert the image data of the simulation environment into graph structure data for training and graph structure data for testing according to a preset similarity threshold;
a model training module 73, configured to train a graph convolution decision network model of the simulation environment using graph structure data for training, to obtain a graph convolution decision network model of the trained simulation environment;
the test module 74 is configured to test the graph convolution decision network model of the trained simulation environment by using the graph structure data for testing, so as to obtain an action prediction accuracy;
a similarity threshold updating module 75, configured to return to the step of converting the image data into the graph structure data for training and the graph structure data for testing according to the preset similarity threshold after updating the preset similarity threshold;
and the selecting module 76 is configured to, after repeating for multiple times, select a preset similarity threshold corresponding to the highest motion prediction accuracy as a target similarity threshold, and use the graph convolution decision network model of the trained simulation environment corresponding to the target similarity threshold as the target graph convolution decision network model.
In some possible implementations, the conversion module is specifically configured to:
training a landform classification model by using image data of a simulation environment to obtain a trained landform classification model;
acquiring marked image data, wherein the marked image data is data obtained by marking a preset landform type in the image data of the simulation environment with an action label;
dividing the labeled image data into training data and testing data;
converting training data into graph structure data for training according to a preset similarity threshold;
and converting the test data into the graph structure data for testing according to a preset similarity threshold.
In some possible implementations, the conversion module is specifically configured to:
cutting the training data into image blocks;
inputting image blocks of training data into the trained landform classification model to obtain the landform type of each image block output by the trained landform classification model;
and taking each image block of the training data as a node, taking the landform type of each image block as node information, calculating the similarity between each node and an adjacent node, and adding an edge between two nodes corresponding to the similarity when the similarity is greater than a preset similarity threshold value to obtain graph structure data for training.
In some possible implementations, the conversion module is specifically configured to:
cutting the test data into image blocks;
inputting image blocks of the test data into the trained landform classification model to obtain the landform type of each image block output by the trained landform classification model;
and taking each image block of the test data as a node, taking the landform type of each image block as node information, calculating the similarity between each node and an adjacent node, and adding an edge between two nodes corresponding to the similarity when the similarity is greater than a preset similarity threshold value to obtain the graph structure data for testing.
In some possible implementations, the apparatus may further include:
the real environment image data acquisition module is used for acquiring image data of a real environment, wherein the image data of the real environment is an environment image of a to-be-landed area acquired by the unmanned aerial vehicle;
the image conversion module is used for converting the image data of the real environment into the graph structure data of the real environment according to the target similarity threshold;
and the model optimization module is used for training the target graph convolution decision network model by using the graph structure data of the real environment to obtain the optimized target graph convolution decision network model.
In some possible implementations, the image transformation module is specifically configured to:
training a landform classification model by using image data of a real environment to obtain a trained target landform classification model;
acquiring image data of a marked real environment, wherein the image data of the marked real environment is obtained by marking a preset landform type in the image data of the real environment with an action label;
cutting image data of a real environment into image blocks;
inputting image blocks of a real environment into a target landform classification model, and obtaining the landform types of the image blocks output by the target landform classification model;
and when the similarity is greater than a preset similarity threshold, adding an edge between two nodes corresponding to the similarity to obtain graph structure data of the real environment.
The model training apparatus has the function of implementing the above model training method. The function can be implemented by hardware, or by hardware executing corresponding software; the hardware or software comprises one or more modules corresponding to the function, and the modules can be software and/or hardware.
Corresponding to the unmanned aerial vehicle autonomous landing method in the foregoing embodiment, fig. 8 shows a block diagram illustrating a structure of the unmanned aerial vehicle autonomous landing apparatus provided in the embodiment of the present application, and for convenience of explanation, only the portions related to the embodiment of the present application are shown.
The apparatus may include:
the image acquisition module 81 is used for acquiring a real environment image of the area to be landed through the airborne image acquisition device;
a real environment image conversion module 82, configured to convert the real environment image into graph structure data;
the prediction module 83 is configured to input the graph structure data into a graph convolution decision network model that is trained in advance, and obtain an action of the graph structure data output by the graph convolution decision network model;
and an executing module 84 for executing the action to land.
In some possible implementations, the real environment image conversion module is specifically configured to:
cutting the real environment image into image blocks;
inputting the image blocks into a landform classification model which is trained in advance, and obtaining the landform type of each image block output by the landform classification model;
and taking each image block as a node and the landform type of each image block as node information, calculating the similarity between each node and its adjacent nodes, and adding an edge between the two corresponding nodes when the similarity is greater than the similarity threshold, thereby obtaining the graph structure data of the real environment image; one plausible form of the decision model that consumes such graphs is sketched below.
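The application does not spell out the internals of the graph convolution decision network model; as one plausible reading, a minimal sketch with two graph convolution layers, symmetric adjacency normalization, mean pooling, and a linear action head (all assumptions) might look like this:

    import torch
    import torch.nn as nn

    class GCNDecisionNet(nn.Module):
        """Hypothetical decision network: node features plus a dense adjacency
        matrix in, one score per candidate landing action out."""
        def __init__(self, in_dim, hidden_dim, num_actions):
            super().__init__()
            self.w1 = nn.Linear(in_dim, hidden_dim)
            self.w2 = nn.Linear(hidden_dim, hidden_dim)
            self.head = nn.Linear(hidden_dim, num_actions)

        @staticmethod
        def normalize(adj):
            # A_hat = D^-1/2 (A + I) D^-1/2, the usual GCN propagation matrix
            a = adj + torch.eye(adj.size(0), device=adj.device)
            d = a.sum(1).clamp(min=1.0).rsqrt()
            return d.unsqueeze(1) * a * d.unsqueeze(0)

        def forward(self, x, adj):
            a = self.normalize(adj)
            h = torch.relu(self.w1(a @ x))
            h = torch.relu(self.w2(a @ h))
            return self.head(h.mean(dim=0))  # graph-level action scores

A forward pass takes the node feature matrix and adjacency produced by the conversion module and returns one score per candidate action; the highest-scoring action would be the one to execute.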
The unmanned aerial vehicle autonomous landing device has the function of implementing the unmanned aerial vehicle autonomous landing method described above. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software comprises one or more modules corresponding to the function, and the modules may be software and/or hardware.
It should be noted that the information interaction and execution processes between the above devices/modules, and their specific functions and technical effects, are based on the same concept as the method embodiments of the present application; reference may be made to the method embodiments for details, which are not repeated here.
Fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 9, the electronic device 9 of this embodiment includes: at least one processor 90 (only one is shown in fig. 9), a memory 91, and a computer program 92 stored in the memory 91 and executable on the at least one processor 90; the processor 90 implements the steps in any of the above method embodiments when executing the computer program 92.
The electronic device 9 may be a desktop computer, a notebook computer, a palmtop computer, a cloud server, or another computing device. In other embodiments, the electronic device may also be a drone or a control device onboard a drone. The electronic device may include, but is not limited to, the processor 90 and the memory 91. Those skilled in the art will appreciate that fig. 9 is merely an example of the electronic device 9 and does not constitute a limitation of it; the device may include more or fewer components than shown, combine certain components, or use different components, such as an input/output device or a network access device.
The processor 90 may be a Central Processing Unit (CPU), or another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, discrete hardware components, and the like. A general-purpose processor may be a microprocessor, or any conventional processor.
The memory 91 may, in some embodiments, be an internal storage unit of the electronic device 9, such as a hard disk or memory of the electronic device 9. In other embodiments, the memory 91 may also be an external storage device of the electronic device 9, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (Flash Card) provided on the electronic device 9. Further, the memory 91 may include both an internal storage unit and an external storage device of the electronic device 9. The memory 91 is used for storing an operating system, application programs, a boot loader (BootLoader), data, and other programs, such as the program code of the computer program, and may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above division of the functional units and modules is illustrated. In practical applications, the above functions may be distributed among different functional units and modules as needed; that is, the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for convenience of distinguishing them from each other and are not intended to limit the protection scope of the present application. For the specific working processes of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
The embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements the steps in the above-mentioned method embodiments.
The embodiments of the present application further provide a computer program product which, when run on an electronic device, enables the electronic device to implement the steps in the above method embodiments.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments may be implemented by a computer program, which may be stored in a computer-readable storage medium and which, when executed by a processor, implements the steps of the above method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include at least: any entity or device capable of carrying the computer program code to a photographing apparatus/terminal device, a recording medium, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, such as a USB flash disk, a removable hard disk, a magnetic disk, or an optical disk. In certain jurisdictions, according to legislation and patent practice, a computer-readable medium may not include electrical carrier signals or telecommunications signals.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. An unmanned aerial vehicle autonomous landing method, applied to an unmanned aerial vehicle, the method comprising:
acquiring a real environment image of a to-be-landed area through an airborne image acquisition device;
converting the real environment image into graph structure data;
inputting the graph structure data into a pre-trained graph convolution decision network model to obtain an action output by the graph convolution decision network model for the graph structure data;
and executing the action to land.
2. The method of claim 1, wherein converting the real environment image into graph structure data comprises:
cutting the real environment image into image blocks;
inputting the image blocks into a landform classification model which is trained in advance, and obtaining the landform type of each image block output by the landform classification model;
and taking each image block as a node and the landform type of each image block as node information, calculating the similarity between each node and its adjacent nodes, and adding an edge between the two corresponding nodes when the similarity is greater than a similarity threshold, to obtain the graph structure data of the real environment image.
3. A method of model training, comprising:
acquiring image data of a simulation environment;
converting the image data of the simulation environment into graph structure data for training and graph structure data for testing according to a preset similarity threshold;
training a graph convolution decision network model of the simulation environment by using the graph structure data for training, to obtain a trained graph convolution decision network model of the simulation environment;
testing the trained graph convolution decision network model of the simulation environment by using the graph structure data for testing, to obtain an action prediction accuracy;
after updating the preset similarity threshold, returning to the step of converting the image data of the simulation environment into graph structure data for training and graph structure data for testing according to the preset similarity threshold;
and after multiple repetitions, selecting the preset similarity threshold corresponding to the highest action prediction accuracy as a target similarity threshold, and taking the trained graph convolution decision network model of the simulation environment corresponding to the target similarity threshold as a target graph convolution decision network model.
4. The method of claim 3, wherein converting the image data of the simulation environment into graph structure data for training and graph structure data for testing according to the preset similarity threshold comprises:
training a landform classification model by using the image data of the simulation environment to obtain a trained landform classification model;
acquiring marked image data, wherein the marked image data is obtained by marking a preset landform type in the image data of the simulation environment with an action label;
dividing the labeled image data into training data and testing data;
converting the training data into graph structure data for training according to the preset similarity threshold;
and converting the test data into the graph structure data for testing according to the preset similarity threshold.
5. The method of claim 4, wherein converting the training data into the graph structure data for training according to the preset similarity threshold comprises:
cutting the training data into image blocks;
inputting the image blocks of the training data into a trained landform classification model to obtain the landform type of each image block output by the trained landform classification model;
and taking each image block of the training data as a node and the landform type of each image block as node information, calculating the similarity between each node and its adjacent nodes, and adding an edge between the two corresponding nodes when the similarity is greater than the preset similarity threshold, to obtain the graph structure data for training.
6. The method of claim 4, wherein converting the test data into the graph structure data for testing according to the preset similarity threshold comprises:
cutting the test data into image blocks;
inputting the image blocks of the test data into a trained landform classification model to obtain the landform type of each image block output by the trained landform classification model;
and taking each image block of the test data as a node and the landform type of each image block as node information, calculating the similarity between each node and its adjacent nodes, and adding an edge between the two corresponding nodes when the similarity is greater than the preset similarity threshold, to obtain the graph structure data for testing.
7. The method of any of claims 3 to 6, further comprising:
acquiring image data of a real environment, wherein the image data of the real environment is an environment image of a to-be-landed area acquired by an unmanned aerial vehicle;
converting the image data of the real environment into graph structure data of the real environment according to the target similarity threshold;
and training the target graph convolution decision network model by using the graph structure data of the real environment to obtain the optimized target graph convolution decision network model.
8. The method of claim 7, wherein converting the image data of the real environment into graph structure data of the real environment according to the target similarity threshold comprises:
training a landform classification model by using the image data of the real environment to obtain a trained target landform classification model;
acquiring image data of a marked real environment, wherein the image data of the marked real environment is obtained by marking a preset landform type in the image data of the real environment with an action label;
cutting the image data of the real environment into image blocks;
inputting the image blocks of the real environment into the target landform classification model to obtain the landform type of each image block output by the target landform classification model;
and taking each image block of the real environment as a node and the landform type of each image block as node information, calculating the similarity between each node and its adjacent nodes, and adding an edge between the two corresponding nodes when the similarity is greater than the target similarity threshold, to obtain the graph structure data of the real environment.
9. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 2 or 3 to 8 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 2 or 3 to 8.
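Purely as an illustration, and not as part of the claims, the threshold search recited in claim 3 can be read as the following loop; convert, train_gcn, evaluate, and the candidate threshold grid are assumed helpers supplied by the caller, not names from this application.

    def select_threshold(image_data, thresholds, convert, train_gcn, evaluate):
        """Illustrative reading of the loop in claim 3.
        convert(image_data, t) -> (train_graphs, test_graphs)   (cf. claim 4)
        train_gcn(train_graphs) -> model
        evaluate(model, test_graphs) -> action prediction accuracy"""
        best_t, best_model, best_acc = None, None, -1.0
        for t in thresholds:
            train_graphs, test_graphs = convert(image_data, t)
            model = train_gcn(train_graphs)
            acc = evaluate(model, test_graphs)
            if acc > best_acc:
                best_t, best_model, best_acc = t, model, acc
        # the highest-accuracy threshold becomes the target similarity threshold;
        # its model becomes the target graph convolution decision network model
        return best_t, best_model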
CN202011109393.1A 2020-10-16 2020-10-16 Unmanned aerial vehicle autonomous landing method and model training method Pending CN112329551A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011109393.1A CN112329551A (en) 2020-10-16 2020-10-16 Unmanned aerial vehicle autonomous landing method and model training method

Publications (1)

Publication Number Publication Date
CN112329551A (en) 2021-02-05

Family

ID=74313081

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011109393.1A Pending CN112329551A (en) 2020-10-16 2020-10-16 Unmanned aerial vehicle autonomous landing method and model training method

Country Status (1)

Country Link
CN (1) CN112329551A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113359843A (en) * 2021-07-02 2021-09-07 成都睿沿芯创科技有限公司 Unmanned aerial vehicle autonomous landing method and device, electronic equipment and storage medium
CN117201565A (en) * 2023-10-11 2023-12-08 西安月之峰电子科技有限公司 Internet-connected unmanned aerial vehicle management cloud platform based on 5G transmission

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107323677A (en) * 2017-07-25 2017-11-07 深圳先进技术研究院 Unmanned plane auxiliary landing method, device, equipment and storage medium
CN107444665A (en) * 2017-07-24 2017-12-08 长春草莓科技有限公司 A kind of unmanned plane Autonomous landing method
KR20190078098A (en) * 2017-12-26 2019-07-04 주식회사 솔탑 Method and apparatus for controlling drone landing
CN110825101A (en) * 2019-12-26 2020-02-21 电子科技大学 Unmanned aerial vehicle autonomous landing method based on deep convolutional neural network
CN111209808A (en) * 2019-12-25 2020-05-29 北京航空航天大学杭州创新研究院 Unmanned aerial vehicle image semantic segmentation and identification method based on hierarchical processing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YIFAN ZHANG et al.: "A GCN-Based Decision-Making Network for Autonomous UAV Landing", 2020 5th International Conference on Advanced Robotics and Mechatronics (ICARM), 14 September 2020, pages 652-657 *

Similar Documents

Publication Publication Date Title
US11188799B2 (en) Semantic segmentation with soft cross-entropy loss
US20220036562A1 (en) Vision-based working area boundary detection system and method, and machine equipment
CN111476116A (en) Rotor unmanned aerial vehicle system for vehicle detection and tracking and detection and tracking method
CN107918776B (en) Land planning method and system based on machine vision and electronic equipment
CN111507271A (en) Airborne photoelectric video target intelligent detection and identification method
CN111213155A (en) Image processing method, device, movable platform, unmanned aerial vehicle and storage medium
CN110770791A (en) Image boundary acquisition method and device based on point cloud map and aircraft
CN110752028A (en) Image processing method, device, equipment and storage medium
CN112329551A (en) Unmanned aerial vehicle autonomous landing method and model training method
Delmerico et al. "On-the-spot training" for terrain classification in autonomous air-ground collaborative teams
CN111444923A (en) Image semantic segmentation method and device under natural scene
CN113378897A (en) Neural network-based remote sensing image classification method, computing device and storage medium
CN114463308A (en) Visual detection method, device and processing equipment for visual angle photovoltaic module of unmanned aerial vehicle
CN114648709A (en) Method and equipment for determining image difference information
CN114511500A (en) Image processing method, storage medium, and computer terminal
CN110135428A (en) Image segmentation processing method and device
Gan et al. Water level classification for flood monitoring system using convolutional neural network
CN111104965A (en) Vehicle target identification method and device
CN112532251A (en) Data processing method and device
CN113627292B (en) Remote sensing image recognition method and device based on fusion network
CN115272412A (en) Low, small and slow target detection method and tracking system based on edge calculation
CN110827243B (en) Method and device for detecting abnormity of coverage area of grid beam
CN113763447B (en) Method for completing depth map, electronic device and storage medium
Tejaswini et al. Land cover change detection using convolution neural network
WO2022150352A1 (en) Computer vision systems and methods for determining roof conditions from imagery using segmentation networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220216

Address after: 518000 2515, building 2, Huilong business center, North Station community, Minzhi street, Longhua District, Shenzhen, Guangdong Province

Applicant after: Shenzhen Zhongke Baotai Aerospace Technology Co.,Ltd.

Address before: Room 1101-1102, building 1, Changfu Jinmao building, No.5, Shihua Road, free trade zone, Fubao street, Futian District, Shenzhen, Guangdong 518000

Applicant before: Shenzhen Zhongke Baotai Technology Co.,Ltd.
