CN117930872B

CN117930872B - Large-scale unmanned aerial vehicle cluster flight system based on deep reinforcement learning

Info

Publication number: CN117930872B
Application number: CN202410338166.8A
Authority: CN
Inventors: 顾增辉; 王恒; 张忱
Original assignee: Beijing Feian Aviation Technology Co ltd
Current assignee: Beijing Feian Aviation Technology Co ltd
Priority date: 2024-03-25
Filing date: 2024-03-25
Publication date: 2024-05-28
Anticipated expiration: 2044-03-25
Also published as: CN117930872A

Abstract

The invention relates to the technical field of unmanned aerial vehicle flight control systems, and in particular to a large-scale unmanned aerial vehicle cluster flight system based on deep reinforcement learning. The system comprises an ideal track sequence and spatial data sequence acquisition module, an overall track stability acquisition module, a comprehensive weight coefficient acquisition module, and an unmanned aerial vehicle classification training module. The system acquires an ideal track sequence and spatial data sequences; obtains an influence data point range from each spatial data sequence; obtains the regional spatial offset from the influence data point range; obtains the overall track stability from the regional spatial offset; obtains a comprehensive weight coefficient from the differences in overall track stability between unmanned aerial vehicles; obtains an optimized classification distance from the comprehensive weight coefficient; and performs large-scale unmanned aerial vehicle cluster training according to the optimized classification distance. The invention improves the accuracy of the clustering result, so that unmanned aerial vehicle cluster flight control is more accurate after training.

Description

Large-scale unmanned aerial vehicle cluster flight system based on deep reinforcement learning

Technical Field

The invention relates to the technical field of unmanned aerial vehicle flight control systems, in particular to a large-scale unmanned aerial vehicle cluster flight system based on deep reinforcement learning.

Background

When a large-scale unmanned aerial vehicle cluster performs flight operations, a large amount of cluster flight position data must be collected for training so that the unmanned aerial vehicles can be controlled to avoid collisions automatically in a complex flight environment. However, the collected flight position data are affected by interference factors such as wind speed, wind direction, and electromagnetic waves, so that different unmanned aerial vehicle cluster flight position data exhibit different aggregation distribution characteristics in the sample space. To improve the efficiency of training intelligent cluster flight, the flight position data therefore need to be clustered and classified, and the classified data are then trained separately.

Conventionally, unmanned aerial vehicle cluster flight position data are clustered with a hierarchical clustering algorithm. Because the flight position data are affected by several interference factors such as wind speed, wind direction, and electromagnetic waves, data exhibiting different aggregation distribution characteristics are subject to correlated influences in multiple dimensions. The conventional hierarchical clustering algorithm, however, clusters only by the distance between flight position data and cannot effectively integrate the interference factors of multiple dimensions, so the clustering result is inaccurate.

Disclosure of Invention

The invention provides a large-scale unmanned aerial vehicle cluster flight system based on deep reinforcement learning, which aims to solve the following existing problem: the conventional hierarchical clustering algorithm clusters only by the distance between unmanned aerial vehicle cluster flight position data and cannot effectively integrate the multi-dimensional interference factors affecting those data, so the clustering result is inaccurate.

The large-scale unmanned aerial vehicle cluster flight system based on deep reinforcement learning adopts the following technical scheme:

The system comprises the following modules:

The ideal track sequence and spatial data sequence acquisition module is used for acquiring an ideal track sequence and the spatial data sequences of a plurality of unmanned aerial vehicles, wherein the ideal track sequence comprises a plurality of ideal track data points and each spatial data sequence comprises a plurality of unmanned aerial vehicle spatial data points;

The overall track stability acquisition module is used for performing range division on the unmanned aerial vehicle spatial data points in each spatial data sequence to obtain the influence data point range of each unmanned aerial vehicle spatial data point; obtaining the regional spatial offset of each unmanned aerial vehicle spatial data point according to the difference between the unmanned aerial vehicle spatial data points and the ideal track data points within the influence data point range; and obtaining the overall track stability of each unmanned aerial vehicle according to the regional spatial offsets of its spatial data points, wherein the overall track stability describes the difference between the actual flight track and the ideal flight track of the unmanned aerial vehicle;

The comprehensive weight coefficient acquisition module is used for clustering the unmanned aerial vehicles to obtain a plurality of clusters; distinguishing the unmanned aerial vehicles that differ between clusters to obtain a plurality of first final comparison unmanned aerial vehicles and a plurality of second final comparison unmanned aerial vehicles; and obtaining a comprehensive weight coefficient of each unmanned aerial vehicle according to the difference in overall track stability between the first final comparison unmanned aerial vehicles and the second final comparison unmanned aerial vehicles;

The unmanned aerial vehicle classification training module is used for obtaining a normalized space weight coefficient between any two unmanned aerial vehicles according to how the comprehensive weight coefficients differ between unmanned aerial vehicles; optimizing the distance between unmanned aerial vehicles according to the normalized space weight coefficient to obtain an optimized classification distance between any two unmanned aerial vehicles; and performing large-scale unmanned aerial vehicle cluster training according to the optimized classification distance.

Preferably, the method for performing range division on the unmanned aerial vehicle space data points in the space data sequence to obtain the influence data point range of each unmanned aerial vehicle space data point includes the following specific steps:

A recording time quantity T1 is preset. For the j-th unmanned aerial vehicle spatial data point in the spatial data sequence of the i-th unmanned aerial vehicle, the T1 unmanned aerial vehicle spatial data points preceding the j-th spatial data point and the T1 unmanned aerial vehicle spatial data points following it are recorded together as the influence data point range of the j-th unmanned aerial vehicle spatial data point.

Preferably, the method for obtaining the regional spatial offset of each unmanned aerial vehicle spatial data point according to the difference between the unmanned aerial vehicle spatial data point and the ideal track data point in the influence data point range comprises the following specific steps:

In the formula, the terms denote, in order: the regional spatial offset of the j-th unmanned aerial vehicle spatial data point in the spatial data sequence of the i-th unmanned aerial vehicle; the number of all unmanned aerial vehicle spatial data points in the influence data point range of the j-th spatial data point of the i-th unmanned aerial vehicle; the Euclidean distance between the j-th spatial data point of the i-th unmanned aerial vehicle and the j-th ideal track data point in the ideal track sequence; the Euclidean distance between the k-th unmanned aerial vehicle spatial data point in the influence data point range of the j-th spatial data point of the i-th unmanned aerial vehicle and its corresponding ideal track data point in the ideal track sequence; and the absolute-value operation.

Preferably, the overall track stability of each unmanned aerial vehicle is obtained according to the regional spatial offset of the unmanned aerial vehicle spatial data points, and the method comprises the following specific steps:

In the formula, the terms denote, in order: the overall track stability of the i-th unmanned aerial vehicle; the number of all unmanned aerial vehicle spatial data points in the spatial data sequence of the i-th unmanned aerial vehicle; the regional spatial offset of the j-th unmanned aerial vehicle spatial data point in the spatial data sequence of the i-th unmanned aerial vehicle; and an exponential function with the natural constant as its base.

Preferably, the distinguishing between different unmanned aerial vehicles in the cluster to obtain a plurality of first final comparison unmanned aerial vehicles and a plurality of second final comparison unmanned aerial vehicles comprises the following specific methods:

Any unmanned aerial vehicle is marked as a target unmanned aerial vehicle, and each cluster containing the target unmanned aerial vehicle is marked as a target cluster of the target unmanned aerial vehicle; all target clusters of the target unmanned aerial vehicle are arranged in ascending order of the number of unmanned aerial vehicles they contain, and the arranged sequence is marked as the target cluster sequence of the target unmanned aerial vehicle;

For any one target cluster in a target cluster sequence of the target unmanned aerial vehicle, marking each unmanned aerial vehicle except the target unmanned aerial vehicle in the target cluster as a comparison unmanned aerial vehicle of the target cluster, and acquiring all comparison unmanned aerial vehicles of all target clusters in the target cluster sequence of the target unmanned aerial vehicle;

marking the whole of any two adjacent target clusters in the target cluster sequence of the target unmanned aerial vehicle as a target cluster pair; for any one comparison unmanned aerial vehicle in a first target cluster in any one target cluster pair, if the comparison unmanned aerial vehicle does not appear in a second target cluster, marking the comparison unmanned aerial vehicle as a first final comparison unmanned aerial vehicle;

And for any one comparison unmanned aerial vehicle in the second target cluster in any one target cluster pair, if the comparison unmanned aerial vehicle does not appear in the first target cluster, marking the comparison unmanned aerial vehicle as a second final comparison unmanned aerial vehicle.
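The selection steps above amount to set differences between adjacent target clusters. The following is a minimal sketch under stated assumptions: clusters are represented as Python sets of UAV identifiers, and the function name `final_comparison_uavs` and the example memberships are hypothetical, not taken from the patent.

```python
def final_comparison_uavs(target, clusters):
    """For each adjacent target cluster pair, return a tuple
    (first_final, second_final) of comparison-UAV sets."""
    # Target clusters: every cluster that contains the target UAV,
    # arranged in ascending order of the number of UAVs they contain.
    sequence = sorted((set(c) for c in clusters if target in c), key=len)

    pairs = []
    for first, second in zip(sequence, sequence[1:]):
        # Comparison UAVs: all members except the target itself.
        first_cmp = first - {target}
        second_cmp = second - {target}
        # First final: in the first cluster but absent from the second.
        # Second final: in the second cluster but absent from the first.
        pairs.append((first_cmp - second_cmp, second_cmp - first_cmp))
    return pairs


# Hypothetical cluster memberships, for illustration only.
example = [{"u1", "u2", "u6"}, {"u1", "u2", "u3", "u4"}, {"u7", "u8"}]
print(final_comparison_uavs("u1", example))
```

Representing clusters as sets keeps each selection rule a single set-difference expression, which mirrors the wording "does not appear in" used in the steps.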

Preferably, the method for obtaining the comprehensive weight coefficient of each unmanned aerial vehicle according to the difference of the overall track stability between the first final comparison unmanned aerial vehicle and the second final comparison unmanned aerial vehicle comprises the following specific steps:

Marking the whole of any two adjacent target clusters in the target cluster sequence of any unmanned aerial vehicle as a target cluster pair;

Acquiring a first class track contrast and a second class track contrast of each target cluster pair according to the first final comparison unmanned aerial vehicles and the second final comparison unmanned aerial vehicles;

In the formula, the terms denote, in order: the comprehensive weight coefficient of the unmanned aerial vehicle; the number of all target cluster pairs of the unmanned aerial vehicle; the overall track stability of the unmanned aerial vehicle; the first class track contrast of the m-th target cluster pair of the unmanned aerial vehicle; the second class track contrast of the m-th target cluster pair of the unmanned aerial vehicle; a preset hyperparameter; and the absolute-value operation.

Preferably, the method for obtaining the first class track contrast and the second class track contrast of each target cluster pair according to the first final comparison unmanned aerial vehicles and the second final comparison unmanned aerial vehicles includes the following specific steps:

And marking the average value of the overall track stability of all the first final comparison unmanned aerial vehicles in the target cluster pair as the first class track contrast of the target cluster pair, and marking the average value of the overall track stability of all the second final comparison unmanned aerial vehicles in the target cluster pair as the second class track contrast of the target cluster pair.
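The two contrasts are plain averages, which can be sketched as follows. This is an illustrative sketch only: the function name `track_contrasts`, the dictionary layout, and the fallback value used when a set of final comparison UAVs is empty are assumptions, since the text does not say how an empty set is handled.

```python
from statistics import mean


def track_contrasts(stability, first_final, second_final, empty_value=0.0):
    """Return (first_contrast, second_contrast) for one target cluster pair.

    stability maps a UAV identifier to its overall track stability.
    empty_value is an assumed fallback for an empty set of UAVs.
    """
    first = mean(stability[u] for u in first_final) if first_final else empty_value
    second = mean(stability[u] for u in second_final) if second_final else empty_value
    return first, second


# Hypothetical stability values, for illustration only.
stability = {"u3": 0.25, "u4": 0.75, "u6": 0.5}
print(track_contrasts(stability, {"u6"}, {"u3", "u4"}))
```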

Preferably, the method for obtaining the normalized space weight coefficient between any two unmanned aerial vehicles according to the difference change condition of the comprehensive weight coefficient between the unmanned aerial vehicles comprises the following specific steps:

The absolute value of the difference between the comprehensive weight coefficients of the i-th unmanned aerial vehicle and the r-th unmanned aerial vehicle is recorded as the comprehensive weight coefficient difference of the i-th and r-th unmanned aerial vehicles; the mean of the comprehensive weight coefficient differences over all pairs of unmanned aerial vehicles is recorded as the comprehensive weight difference mean;

The absolute value of the difference between the comprehensive weight coefficient difference of the i-th and r-th unmanned aerial vehicles and the comprehensive weight difference mean is recorded as the initial comprehensive weight coefficient of the i-th and r-th unmanned aerial vehicles. All initial comprehensive weight coefficients are obtained and linearly normalized, and each normalized initial comprehensive weight coefficient is recorded as a normalized space weight coefficient.
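The pairwise computation described above can be sketched in a few lines. The sketch assumes that "linear normalization" means min-max scaling to the interval [0, 1], and that a degenerate case in which all initial coefficients are equal maps to 0.0; both are assumptions, and the function name `normalized_space_weights` is hypothetical.

```python
from itertools import combinations


def normalized_space_weights(weights):
    """weights maps a UAV identifier to its comprehensive weight coefficient.

    Returns a dict mapping each unordered UAV pair (as a tuple) to its
    normalized space weight coefficient.
    """
    pairs = list(combinations(sorted(weights), 2))
    if not pairs:
        raise ValueError("at least two UAVs are required")

    # Comprehensive weight coefficient difference for every pair.
    diff = {p: abs(weights[p[0]] - weights[p[1]]) for p in pairs}
    # Comprehensive weight difference mean over all pairs.
    mean_diff = sum(diff.values()) / len(diff)
    # Initial comprehensive weight coefficient: distance from the mean.
    initial = {p: abs(d - mean_diff) for p, d in diff.items()}

    # Assumed min-max normalization; constant input maps to 0.0.
    lo, hi = min(initial.values()), max(initial.values())
    span = hi - lo
    return {p: (v - lo) / span if span > 0 else 0.0 for p, v in initial.items()}


# Hypothetical coefficients, for illustration only.
print(normalized_space_weights({"a": 0.0, "b": 1.0, "c": 3.0}))
```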

Preferably, the optimizing the distance between the unmanned aerial vehicles according to the normalized space weight coefficient to obtain the optimized classification distance between any two unmanned aerial vehicles comprises the following specific methods:

In the formula, the terms denote, in order: the optimized classification distance between the i-th unmanned aerial vehicle and the r-th unmanned aerial vehicle; the minimum of the Euclidean distances between all unmanned aerial vehicle spatial data points of the i-th unmanned aerial vehicle and all unmanned aerial vehicle spatial data points of the r-th unmanned aerial vehicle; the normalized space weight coefficient of the i-th and r-th unmanned aerial vehicles; the comprehensive weight coefficient difference of the i-th and r-th unmanned aerial vehicles; and the comprehensive weight difference mean.

Preferably, the method for performing the large-scale unmanned aerial vehicle cluster training according to the optimized classification distance comprises the following specific steps:

The optimized classification distance between any two unmanned aerial vehicles is taken as the distance metric of the hierarchical clustering algorithm; all unmanned aerial vehicles are clustered by the hierarchical clustering algorithm according to this distance metric to obtain a plurality of clusters, and each cluster is input separately into the large-scale unmanned aerial vehicle cluster flight system to complete training.
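Clustering on a precomputed pairwise distance can be illustrated with a small bottom-up procedure. This is a sketch, not the patented implementation: the text does not state a linkage criterion or a stopping rule, so average linkage and a fixed target number of clusters are assumptions here, and the function name `agglomerative` and the example distances are hypothetical.

```python
def agglomerative(ids, dist, n_clusters):
    """Bottom-up clustering on a precomputed pairwise distance.

    dist maps frozenset({a, b}) to the distance between UAVs a and b
    (for example, an optimized classification distance).
    """
    # Start with every UAV in its own cluster.
    clusters = [[u] for u in ids]

    def linkage(c1, c2):
        # Assumed average linkage: mean distance over all cross pairs.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical distances between four UAVs, for illustration only.
ids = ["u1", "u2", "u3", "u4"]
dist = {frozenset(p): 5.0 for p in [("u1", "u3"), ("u1", "u4"), ("u2", "u3"), ("u2", "u4")]}
dist[frozenset(("u1", "u2"))] = 0.1
dist[frozenset(("u3", "u4"))] = 0.2
print(agglomerative(ids, dist, 2))
```

Each resulting cluster would then be passed to training as a separate group, as the paragraph above describes.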

The technical scheme of the invention has the following beneficial effects. The regional spatial offset is obtained from the difference between the unmanned aerial vehicle spatial data points and the ideal track data points; the overall track stability is obtained from the regional spatial offset; the comprehensive weight coefficient is obtained from the differences in overall track stability; the distance between unmanned aerial vehicles is optimized according to the comprehensive weight coefficient to obtain the optimized classification distance; and large-scale unmanned aerial vehicle cluster training is performed according to the optimized classification distance. The overall track stability reflects the difference between the actual flight track and the ideal flight track of an unmanned aerial vehicle, the comprehensive weight coefficient reflects the final distribution proportion of the combined environmental interference factors, and the optimized classification distance reflects the degree of similarity between the actual flight tracks of unmanned aerial vehicles. The accuracy of the clustering result is thereby improved, and unmanned aerial vehicle cluster flight control is more accurate after training.

Drawings

In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1 is a block diagram of a large-scale unmanned aerial vehicle cluster flight system based on deep reinforcement learning.

Detailed Description

In order to further describe the technical means and effects adopted by the invention to achieve the preset aim, the following detailed description is given below of the large-scale unmanned aerial vehicle cluster flight system based on deep reinforcement learning according to the invention, which is provided by combining the accompanying drawings and the preferred embodiment. In the following description, different "one embodiment" or "another embodiment" means that the embodiments are not necessarily the same. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

The following specifically describes a specific scheme of the large-scale unmanned aerial vehicle cluster flight system based on deep reinforcement learning provided by the invention with reference to the accompanying drawings.

Referring to fig. 1, a block diagram of a large-scale unmanned aerial vehicle cluster flight system based on deep reinforcement learning according to an embodiment of the present invention is shown, where the system includes the following modules:

the ideal track sequence and spatial data sequence acquisition module 101 is configured to acquire an ideal track sequence and spatial data sequences of a plurality of unmanned aerial vehicles.

Conventionally, unmanned aerial vehicle cluster flight position data are clustered with a hierarchical clustering algorithm. Because the flight position data are affected by several interference factors such as wind speed, wind direction, and electromagnetic waves, data exhibiting different aggregation distribution characteristics are subject to correlated influences in multiple dimensions. The conventional hierarchical clustering algorithm, however, clusters only by the distance between flight position data and cannot effectively integrate the interference factors of multiple dimensions, so the clustering result is inaccurate. For this reason, this embodiment proposes a large-scale unmanned aerial vehicle cluster flight system based on deep reinforcement learning.

Specifically, to implement the deep reinforcement learning-based large-scale unmanned aerial vehicle cluster flight system provided in this embodiment, an ideal track sequence and spatial data sequences must first be acquired. The specific process is as follows: on a straight road 200 meters long, either end is taken as the starting point and the other end as the end point, and a transmitter with a built-in GPS positioning module is installed every 1 meter. Starting from the starting point, three spatial data items (latitude, longitude, and altitude) are collected from each transmitter in turn. All transmitters are ordered from the starting point to the end point, and the ordered sequence is recorded as the ideal track sequence; the data point formed by the latitude, longitude, and altitude data collected from one transmitter is recorded as an ideal track data point. The ideal track sequence thus comprises a plurality of transmitters, each transmitter corresponds to one ideal track data point, and each ideal track data point corresponds to one latitude value, one longitude value, and one altitude value.

Further, 30 unmanned aerial vehicles equipped with GPS positioning modules fly in turn from the starting point to the end point in uniform straight-line motion at a constant speed. Three spatial data items (latitude, longitude, and altitude) are recorded for each unmanned aerial vehicle once every 1 second, and the data point formed by the recorded latitude, longitude, and altitude is recorded as one unmanned aerial vehicle spatial data point; the total collection time is 200 seconds. Taking any unmanned aerial vehicle as an example, all of its spatial data points are sorted in ascending order of recording time, and the sorted sequence is recorded as its spatial data sequence; the spatial data sequences of all unmanned aerial vehicles are acquired in this way. Each spatial data sequence comprises a plurality of unmanned aerial vehicle spatial data points; each spatial data point corresponds to one recording moment, to one ideal track data point, and to several spatial data items.

In addition, it should be noted that this embodiment does not specifically limit the length of the straight road, the installation interval of the transmitters, the number of unmanned aerial vehicles, the flight speed of the unmanned aerial vehicles, the recording interval, or the total recording time; these may be determined according to the specific implementation conditions.

So far, the ideal track sequence and the spatial data sequences of a plurality of unmanned aerial vehicles are obtained through the method.
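Because each recorded spatial data point is paired with one ideal track data point, the per-point deviation used in later steps can be sketched as an element-wise Euclidean distance. The sketch below assumes that both sequences have the same length and that coordinates have already been expressed in a common metric unit (the example uses hypothetical local coordinates in meters rather than raw latitude and longitude); the function name `deviations` is hypothetical.

```python
import math


def deviations(spatial_sequence, ideal_sequence):
    """Euclidean distance between each recorded point and the ideal
    track data point at the same position in the ideal track sequence."""
    if len(spatial_sequence) != len(ideal_sequence):
        raise ValueError("sequences must have the same length")
    return [math.dist(p, q) for p, q in zip(spatial_sequence, ideal_sequence)]


# Hypothetical three-component points in meters, for illustration only.
ideal = [(0.0, 0.0, 10.0), (1.0, 0.0, 10.0)]
actual = [(0.0, 0.0, 10.0), (1.0, 3.0, 14.0)]
print(deviations(actual, ideal))
```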

The overall track stability acquisition module 102 is configured to perform range division on the unmanned aerial vehicle spatial data points in each spatial data sequence to obtain the influence data point range of each unmanned aerial vehicle spatial data point; obtain the regional spatial offset of each unmanned aerial vehicle spatial data point according to the difference between the unmanned aerial vehicle spatial data points and the ideal track data points within the influence data point range; and obtain the overall track stability of each unmanned aerial vehicle according to the regional spatial offsets of its spatial data points.

It should be noted that an unmanned aerial vehicle is affected by interference factors such as wind speed, wind direction, and electromagnetic waves during actual flight, so that its actual flight track deviates to some extent from the ideal flight track. At the same time, because the unmanned aerial vehicle is in continuous flight, the flight states at adjacent recording moments are strongly correlated and the corresponding flight positions are strongly continuous. For any recording moment, therefore, the deviations of the actual flight track from the ideal flight track at the surrounding recording moments tend to be similar. This embodiment obtains the overall track stability of each unmanned aerial vehicle by analyzing the deviation between the actual flight track and the ideal flight track around each recording moment, to facilitate subsequent analysis and processing.

Specifically, a recording time quantity T1 is preset. This embodiment is described with T1 = 4 as an example; T1 is not specifically limited and may be determined according to the specific implementation conditions. Taking the j-th unmanned aerial vehicle spatial data point in the spatial data sequence of the i-th unmanned aerial vehicle as an example, the T1 unmanned aerial vehicle spatial data points preceding the j-th spatial data point and the T1 unmanned aerial vehicle spatial data points following it are recorded together as the influence data point range of the j-th unmanned aerial vehicle spatial data point. If fewer than T1 unmanned aerial vehicle spatial data points remain before or after the j-th spatial data point, all of the spatial data points remaining on that side are used to form the influence data point range of the j-th spatial data point.
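The range division, including the handling of points near the ends of a sequence, can be sketched as an index computation. The sketch assumes 0-based indices and assumes that the data point itself is not part of its own influence range, since the text names only the points before and after it; the function name `influence_range` is hypothetical.

```python
def influence_range(n_points, j, t1=4):
    """Indices of the points that influence the point at index j.

    Takes up to t1 points before and up to t1 points after index j;
    near either end of the sequence, only the points that exist are used.
    """
    if not 0 <= j < n_points:
        raise IndexError("j is outside the spatial data sequence")
    before = range(max(0, j - t1), j)
    after = range(j + 1, min(n_points, j + t1 + 1))
    return list(before) + list(after)


print(influence_range(200, 100))  # interior point: four neighbors per side
print(influence_range(200, 1))    # near the start: only one point before
```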

Further, the regional spatial offset of the j-th unmanned aerial vehicle spatial data point in the spatial data sequence of the i-th unmanned aerial vehicle is obtained from the influence data point range of that spatial data point. The regional spatial offset of the j-th unmanned aerial vehicle spatial data point in the spatial data sequence of the i-th unmanned aerial vehicle is calculated as follows:

In the formula, the terms denote, in order: the regional spatial offset of the j-th unmanned aerial vehicle spatial data point in the spatial data sequence of the i-th unmanned aerial vehicle; the number of all unmanned aerial vehicle spatial data points in the influence data point range of the j-th spatial data point of the i-th unmanned aerial vehicle; the Euclidean distance between the j-th spatial data point of the i-th unmanned aerial vehicle and the j-th ideal track data point in the ideal track sequence; the Euclidean distance between the k-th unmanned aerial vehicle spatial data point in the influence data point range of the j-th spatial data point of the i-th unmanned aerial vehicle and its corresponding ideal track data point in the ideal track sequence; and the absolute-value operation. The greater the regional spatial offset of the j-th unmanned aerial vehicle spatial data point in the spatial data sequence of the i-th unmanned aerial vehicle, the greater the degree of external interference acting on the unmanned aerial vehicle during the recording moments around the j-th spatial data point. The regional spatial offsets of all unmanned aerial vehicle spatial data points in the spatial data sequence of the i-th unmanned aerial vehicle are obtained. Calculating the Euclidean distance is a well-known technique and is not described in detail in this embodiment.
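The formula itself is not reproduced in this text, so its exact form cannot be confirmed. As a hedged illustration, the listed terms (a count of points in the influence range, the deviation of the point itself, the deviation of each point in the range, and an absolute value) are consistent with a mean absolute difference of deviations; the sketch below assumes that form. The function name `regional_spatial_offset`, the exclusion of the point itself from its range, and the value returned for an empty range are all assumptions.

```python
def regional_spatial_offset(dev, j, t1=4):
    """Assumed form: mean of |dev[j] - dev[k]| over the influence range.

    dev[k] is the Euclidean distance between the k-th recorded point of
    one UAV and its corresponding ideal track data point.
    """
    n = len(dev)
    # Influence range: up to t1 points on each side of j, excluding j.
    idx = [k for k in range(max(0, j - t1), min(n, j + t1 + 1)) if k != j]
    if not idx:
        return 0.0  # assumed value when no neighboring points exist
    return sum(abs(dev[j] - dev[k]) for k in idx) / len(idx)


# Hypothetical deviations with one disturbed recording moment.
dev = [0.0, 0.0, 0.0, 0.0, 4.0, 0.0, 0.0, 0.0, 0.0]
print([regional_spatial_offset(dev, j) for j in range(len(dev))])
```

Under this assumed form, a point whose deviation differs sharply from that of its neighbors receives a large offset, which matches the stated interpretation that a larger value indicates stronger external interference around that recording moment.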

Further, the overall track stability of the i-th unmanned aerial vehicle is obtained from the regional spatial offsets of all unmanned aerial vehicle spatial data points in the spatial data sequence of the i-th unmanned aerial vehicle. The overall track stability of the i-th unmanned aerial vehicle is calculated as follows:

In the formula, the terms denote, in order: the overall track stability of the i-th unmanned aerial vehicle; the number of all unmanned aerial vehicle spatial data points in the spatial data sequence of the i-th unmanned aerial vehicle; the regional spatial offset of the j-th unmanned aerial vehicle spatial data point in the spatial data sequence of the i-th unmanned aerial vehicle; and an exponential function with the natural constant as its base. This embodiment uses the exponential function to express the inverse-proportion relation and to perform normalization; an implementer may select another inverse-proportion function and normalization function according to the actual situation. The greater the overall track stability of the i-th unmanned aerial vehicle, the larger the difference between the actual flight track and the ideal flight track of the i-th unmanned aerial vehicle, reflecting that the regional variation of the spatial data sequence of the i-th unmanned aerial vehicle is more significant. The overall track stability of all unmanned aerial vehicles is obtained.

So far, the overall track stability of all unmanned aerial vehicles has been obtained by the above method.

The comprehensive weight coefficient acquisition module 103 is used for clustering the unmanned aerial vehicles to obtain a plurality of clusters; distinguishing the unmanned aerial vehicles that differ between clusters to obtain a plurality of first final comparison unmanned aerial vehicles and a plurality of second final comparison unmanned aerial vehicles; and obtaining the comprehensive weight coefficient of each unmanned aerial vehicle according to the difference in overall track stability between the first final comparison unmanned aerial vehicles and the second final comparison unmanned aerial vehicles.

It should be noted that the conventional hierarchical clustering algorithm decomposes the hierarchy from the bottom up: each data item is first regarded as its own cluster, the distance between pairs of clusters is then repeatedly calculated, judged, and used for merging, and a cluster tree is built from the bottom up in which each node corresponds to one cluster, so that more and more unmanned aerial vehicle spatial data come to lie in the same cluster. The unmanned aerial vehicle spatial data contained in different clusters are not identical, so the unmanned aerial vehicles contained in the clusters differ accordingly. At the same time, the overall track stability of an unmanned aerial vehicle characterizes the degree of regional variation of its spatial data sequence, so the degree of regional variation differs between clusters as a whole, and the variation characteristics of the unmanned aerial vehicle spatial data contained in different clusters differ as well. To divide unmanned aerial vehicle spatial data with different variation characteristics as accurately as possible, this embodiment obtains the comprehensive weight coefficient of each unmanned aerial vehicle by analyzing the variation characteristics of the same unmanned aerial vehicle in different clusters, and performs clustering according to the comprehensive weight coefficients.

Specifically, the overall track stability among all unmanned aerial vehicles is taken as the distance measure of the hierarchical clustering algorithm, and all unmanned aerial vehicles are clustered by the hierarchical clustering algorithm according to this distance measure to obtain a plurality of clusters. Taking any unmanned aerial vehicle as an example, each cluster containing the unmanned aerial vehicle is marked as a target cluster of the unmanned aerial vehicle; all target clusters of the unmanned aerial vehicle are arranged in ascending order of the number of unmanned aerial vehicles they contain, and the arranged sequence is marked as the target cluster sequence of the unmanned aerial vehicle. Taking any one target cluster in the target cluster sequence as an example, each unmanned aerial vehicle in the target cluster other than the unmanned aerial vehicle itself is marked as a comparison unmanned aerial vehicle of the target cluster, and all comparison unmanned aerial vehicles of all target clusters in the target cluster sequence are acquired. Any two adjacent target clusters in the target cluster sequence are marked together as a target cluster pair. For any comparison unmanned aerial vehicle in the first target cluster of a target cluster pair, if the comparison unmanned aerial vehicle does not appear in the second target cluster, it is marked as a first final comparison unmanned aerial vehicle; for any comparison unmanned aerial vehicle in the second target cluster of the target cluster pair, if the comparison unmanned aerial vehicle does not appear in the first target cluster, it is marked as a second final comparison unmanned aerial vehicle. All first final comparison unmanned aerial vehicles and all second final comparison unmanned aerial vehicles in the target cluster pair are acquired. The process of clustering data according to a distance measure is well-known content of the hierarchical clustering algorithm and is not described in detail in this embodiment.
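The marking of first and second final comparison unmanned aerial vehicles amounts to two set differences per target cluster pair. The sketch below assumes the clusters are already available as sets of UAV ids; the function names and the sample clusters are hypothetical and serve only to illustrate the steps.

```python
def target_cluster_sequence(uav, clusters):
    """Clusters that contain `uav`, ordered by ascending number of UAVs."""
    return sorted((c for c in clusters if uav in c), key=len)


def final_comparison_uavs(uav, first_cluster, second_cluster):
    """Return (first final, second final) comparison UAV sets for one cluster pair."""
    # Comparison UAVs are all members of a target cluster except the UAV itself.
    first_cmp = set(first_cluster) - {uav}
    second_cmp = set(second_cluster) - {uav}
    # Keep only those that do not appear in the other cluster of the pair.
    return first_cmp - second_cmp, second_cmp - first_cmp


# Hypothetical clusters, e.g. nodes of a cluster tree.
clusters = [{"u1"}, {"u1", "u2"}, {"u1", "u2", "u3", "u4"}, {"u3", "u4"}]
sequence = target_cluster_sequence("u1", clusters)
pairs = list(zip(sequence, sequence[1:]))  # adjacent target clusters
results = [final_comparison_uavs("u1", a, b) for a, b in pairs]
```

In this hypothetical input the target clusters of `u1` are nested, so the first set comes out empty for every pair; the function returns both sets as they are and leaves the handling of empty sets to the caller.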

Further, the average value of the overall track stability of all the first final comparison unmanned aerial vehicles in the target cluster pair is recorded as the first category track contrast of the target cluster pair, the average value of the overall track stability of all the second final comparison unmanned aerial vehicles in the target cluster pair is recorded as the second category track contrast of the target cluster pair, and the first category track contrast and the second category track contrast of all the target cluster pairs of the unmanned aerial vehicle are obtained. Each unmanned aerial vehicle corresponds to a plurality of target cluster pairs, and each target cluster pair of each unmanned aerial vehicle corresponds to one first category track contrast and one second category track contrast.
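As a small illustration of the two averages, the following sketch computes the first and second category track contrast for one target cluster pair. The function name, the UAV ids and the stability values are hypothetical, and returning `None` for an empty set of final comparison unmanned aerial vehicles is an assumption of this sketch; the text does not state how that case is treated.

```python
from statistics import fmean


def category_track_contrasts(first_final, second_final, stability):
    """Mean overall track stability of each set of final comparison UAVs.

    stability: dict mapping a UAV id to its overall track stability.
    An empty set yields None (assumption: the source does not cover this case).
    """
    def mean_stability(uavs):
        return fmean([stability[u] for u in uavs]) if uavs else None

    return mean_stability(first_final), mean_stability(second_final)


stability = {"u2": 0.5, "u3": 0.25, "u4": 0.5}
first_contrast, second_contrast = category_track_contrasts({"u2"}, {"u3", "u4"}, stability)
```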

Further, according to the first category track contrast and the second category track contrast of all the target cluster pairs of the unmanned aerial vehicle, the comprehensive weight coefficient of the unmanned aerial vehicle is obtained. The calculation method of the comprehensive weight coefficient of the unmanned aerial vehicle comprises the following steps:

In the formula, the quantities are: the comprehensive weight coefficient of the unmanned aerial vehicle; the number of all target cluster pairs of the unmanned aerial vehicle; the overall track stability of the unmanned aerial vehicle; the first category track contrast of each target cluster pair of the unmanned aerial vehicle; the second category track contrast of each target cluster pair of the unmanned aerial vehicle; a preset hyper-parameter, which in this embodiment is used to prevent the denominator from being 0; and the absolute-value operation. The larger the comprehensive weight coefficient of the unmanned aerial vehicle, the more obvious the deviation change characteristic of the unmanned aerial vehicle in the clustering process. The comprehensive weight coefficients of all unmanned aerial vehicles are obtained in this way.

So far, the comprehensive weight coefficient of all unmanned aerial vehicles is obtained through the method.

The unmanned aerial vehicle classification training module 104 is used for obtaining a normalized space weight coefficient between any two unmanned aerial vehicles according to the difference change condition of the comprehensive weight coefficient between the unmanned aerial vehicles; optimizing the distance between the unmanned aerial vehicles according to the normalized space weight coefficient to obtain an optimized classification distance between any two unmanned aerial vehicles; and carrying out large-scale unmanned aerial vehicle cluster training according to the optimized classification distance.

Specifically, for any two unmanned aerial vehicles, the absolute value of the difference between their comprehensive weight coefficients is recorded as the comprehensive weight coefficient difference of the two unmanned aerial vehicles, and the average of the comprehensive weight coefficient differences among all unmanned aerial vehicles is recorded as the comprehensive weight difference mean. The absolute value of the difference between the comprehensive weight coefficient difference of the two unmanned aerial vehicles and the comprehensive weight difference mean is recorded as the initial comprehensive weight coefficient of the two unmanned aerial vehicles. All initial comprehensive weight coefficients are obtained and linearly normalized, and each normalized initial comprehensive weight coefficient is recorded as a normalized space weight coefficient. The optimized classification distance of the two unmanned aerial vehicles is then obtained according to their normalized space weight coefficient. The calculation method of the optimized classification distance of two unmanned aerial vehicles is as follows:
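The steps leading to the normalized space weight coefficient can be sketched as below. This is a hedged illustration only: the function name and sample values are hypothetical, "linear normalization" is assumed to mean min-max scaling to [0, 1], the mean is assumed to be taken over all unordered pairs, and mapping every pair to 0.0 when all initial coefficients are equal is a further assumption.

```python
from itertools import combinations


def normalized_spatial_weights(weights):
    """Map each unordered UAV pair to its normalized space weight coefficient.

    weights: dict mapping a UAV id to its comprehensive weight coefficient
    (at least two UAVs are required).
    """
    pairs = list(combinations(sorted(weights), 2))
    # Comprehensive weight coefficient difference of each pair.
    diff = {p: abs(weights[p[0]] - weights[p[1]]) for p in pairs}
    # Comprehensive weight difference mean over all pairs.
    mean_diff = sum(diff.values()) / len(diff)
    # Initial comprehensive weight coefficient of each pair.
    initial = {p: abs(d - mean_diff) for p, d in diff.items()}
    # Min-max normalization (assumed meaning of "linear normalization").
    lo, hi = min(initial.values()), max(initial.values())
    span = hi - lo
    return {p: (v - lo) / span if span else 0.0 for p, v in initial.items()}


norm = normalized_spatial_weights({"a": 1.0, "b": 2.0, "c": 4.0})
```

With these sample values the pair differences are 1, 3 and 2, their mean is 2, and the initial coefficients 1, 1 and 0 normalize to 1.0, 1.0 and 0.0.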

In the formula, the quantities are: the optimized classification distance of the two unmanned aerial vehicles; the minimum Euclidean distance between all unmanned aerial vehicle space data points of one unmanned aerial vehicle and all unmanned aerial vehicle space data points of the other unmanned aerial vehicle; the normalized space weight coefficient of the two unmanned aerial vehicles; the comprehensive weight coefficient difference of the two unmanned aerial vehicles; and the comprehensive weight difference mean. The greater the optimized classification distance of two unmanned aerial vehicles, the more similar their actual flight tracks are, and the more likely the two unmanned aerial vehicles belong to one class. The optimized classification distances among all unmanned aerial vehicles are obtained in this way.
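Of the quantities listed above, the minimum Euclidean distance between the space data points of two unmanned aerial vehicles is fully specified by the text and can be sketched directly. The function name and the sample coordinates are hypothetical; space data points are assumed to be coordinate tuples of equal dimension.

```python
from math import dist


def min_point_distance(points_a, points_b):
    """Smallest Euclidean distance between any point of one UAV and any point of the other."""
    return min(dist(p, q) for p in points_a for q in points_b)


# Hypothetical space data sequences of two UAVs as (x, y, z) tuples.
uav_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
uav_b = [(4.0, 4.0, 0.0), (1.0, 3.0, 4.0)]
d_min = min_point_distance(uav_a, uav_b)
```

The exhaustive pairwise comparison is quadratic in the number of data points, which is acceptable for a sketch; a spatial index could replace it for long sequences.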

Further, the optimized classification distance between any two unmanned aerial vehicles is taken as the distance measure of the hierarchical clustering algorithm, all unmanned aerial vehicles are clustered by the hierarchical clustering algorithm according to this distance measure to obtain a plurality of clusters, and each cluster is input separately into the large-scale unmanned aerial vehicle cluster flight system to complete training.

This embodiment is completed.

The above description covers only the preferred embodiments of the present invention and is not intended to limit the invention; any modification, equivalent substitution or improvement made within the principles of the present invention shall be included in the scope of protection of the present invention.

Claims

1. A large-scale unmanned aerial vehicle cluster flight system based on deep reinforcement learning, characterized in that the system comprises the following modules:

The method for obtaining the overall track stability of each unmanned aerial vehicle according to the regional space deviation degree of the unmanned aerial vehicle space data points comprises the following specific steps:

In the formula, the quantities are: the overall track stability of a given unmanned aerial vehicle; the number of all unmanned aerial vehicle space data points in the space data sequence of that unmanned aerial vehicle; the regional spatial offset of each unmanned aerial vehicle space data point in the space data sequence of that unmanned aerial vehicle; and an exponential function with the natural constant as its base;

the method for distinguishing different unmanned aerial vehicles among the cluster clusters to obtain a plurality of first final comparison unmanned aerial vehicles and a plurality of second final comparison unmanned aerial vehicles comprises the following specific methods:

For any one comparison unmanned aerial vehicle in a second target cluster in any one target cluster pair, if the comparison unmanned aerial vehicle does not appear in the first target cluster, marking the comparison unmanned aerial vehicle as a second final comparison unmanned aerial vehicle;

According to the difference of the overall track stability between the first final comparison unmanned aerial vehicle and the second final comparison unmanned aerial vehicle, the comprehensive weight coefficient of each unmanned aerial vehicle is obtained, and the method comprises the following specific steps:

In the formula, the quantities are: the comprehensive weight coefficient of the unmanned aerial vehicle; the number of all target cluster pairs of the unmanned aerial vehicle; the overall track stability of the unmanned aerial vehicle; the first category track contrast of each target cluster pair of the unmanned aerial vehicle; the second category track contrast of each target cluster pair of the unmanned aerial vehicle; a preset hyper-parameter; and the absolute-value operation;

The method for acquiring the first category track contrast and the second category track contrast of each target cluster according to the first final comparison unmanned aerial vehicle and the second final comparison unmanned aerial vehicle comprises the following specific methods:

Marking the average value of the overall track stability of all the first final comparison unmanned aerial vehicles in the target cluster pair as the first class track contrast of the target cluster pair, and marking the average value of the overall track stability of all the second final comparison unmanned aerial vehicles in the target cluster pair as the second class track contrast of the target cluster pair;

2. The deep reinforcement learning-based large-scale unmanned aerial vehicle cluster flight system of claim 1, wherein the method for performing range division on unmanned aerial vehicle space data points in the space data sequence to obtain the influence data point range of each unmanned aerial vehicle space data point comprises the following specific steps:

3. The deep reinforcement learning-based large-scale unmanned aerial vehicle cluster flight system according to claim 1, wherein the obtaining the regional spatial offset of each unmanned aerial vehicle spatial data point according to the difference between the unmanned aerial vehicle spatial data point and the ideal trajectory data point in the influence data point range comprises the following specific methods:

4. The large-scale unmanned aerial vehicle cluster flight system based on deep reinforcement learning according to claim 1, wherein the method for obtaining the normalized space weight coefficient between any two unmanned aerial vehicles according to the difference change condition of the comprehensive weight coefficient between unmanned aerial vehicles comprises the following specific steps:

5. The deep reinforcement learning-based large-scale unmanned aerial vehicle cluster flight system according to claim 4, wherein the optimizing the distance between the unmanned aerial vehicles according to the normalized space weight coefficient to obtain the optimized classification distance between any two unmanned aerial vehicles comprises the following specific steps:

6. The deep reinforcement learning-based large-scale unmanned aerial vehicle cluster flight system according to claim 1, wherein the large-scale unmanned aerial vehicle cluster training according to the optimized classification distance comprises the following specific methods: