CN115290096B - Unmanned aerial vehicle dynamic track planning method based on reinforcement learning difference algorithm - Google Patents
- Publication number
- CN115290096B (application CN202211195962.8A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C21/00—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
- G01C21/20—Instruments for performing navigational calculations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2111/00—Details relating to CAD techniques
- G06F2111/04—Constraint-based CAD
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2111/00—Details relating to CAD techniques
- G06F2111/10—Numerical modelling
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention relates to the technical field of unmanned aerial vehicle dynamic track planning and discloses an unmanned aerial vehicle dynamic track planning method based on a reinforcement learning differential algorithm, comprising the following steps:
- S1: acquiring the terrain environment in which the unmanned aerial vehicle needs to fly;
- S2: establishing a flight path planning model from the acquired environmental data and the performance constraints of the unmanned aerial vehicle, representing the environment as an artificial potential field, building a gravitational potential field centered on the target point, and building repulsive potential fields centered on obstacles and threats;
- S3: while establishing the track planning model, adding a function structure for correcting positioning errors, calculating the current resultant force acting on the unmanned aerial vehicle from the artificial potential field, and making the unmanned aerial vehicle advance under that resultant force;
- S4: designing a reinforcement learning differential algorithm based on the flight path planning model;
- S5: optimizing the reinforcement learning differential algorithm, implanting the optimized algorithm into the intelligent system of the unmanned aerial vehicle, and solving with it to complete the flight path planning of the unmanned aerial vehicle.
Description
Technical Field
The invention relates to the technical field of unmanned aerial vehicle dynamic track planning, and in particular to an unmanned aerial vehicle dynamic track planning method based on a reinforcement learning differential algorithm.
Background
Citrus in the southern hilly regions is mainly planted in hills and mountainous areas, characterized by high planting density, small scale, high dispersion, variable terrain relief, and steep slopes with sharp curves. This makes the traditional manual plant protection operation mode very difficult, so autonomous operation with a plant protection unmanned aerial vehicle has obvious advantages.
However, the complex terrain makes the hilly climate unstable, with frequent environmental disturbances such as gusts, heavy fog and rainstorms. Flight under manual remote control, or autonomous flight along a fixed route, can hardly meet the track planning requirements of the plant protection unmanned aerial vehicle in the complex environment of hilly and mountainous areas. Researching a dynamic track planning algorithm suited to the planting characteristics of these areas, and realizing dynamic planning and autonomous operation of the plant protection unmanned aerial vehicle track in complex environments, is therefore a key link in improving the plant protection efficiency of citrus unmanned aerial vehicles in the southern hills.
As the core of a track planning system, searching for an optimal track with a planning algorithm has long been a popular research subject. The track planning problem of the plant protection unmanned aerial vehicle in the complex environment of hilly and mountainous areas is a high-dimensional, strongly coupled, dynamic multi-constraint optimization problem, and it is NP-hard. When solving such a problem, the most difficult task is maintaining the diversity of solutions, which requires the algorithm to have very fast convergence speed and high calculation accuracy. Traditional evolutionary algorithms are better suited to static track planning: they struggle to handle dynamic multi-constraint optimization under complex conditions efficiently, generally suffer from slow convergence and a tendency to fall into local optima, and their performance cannot meet the real-time planning speed and precision this problem demands.
Few scholars currently conduct research on dynamic trajectory planning. Hidalgo et al. combined an RRT algorithm with a GPU to realize autonomous real-time planning of unmanned aerial vehicle flight paths in several simulated scene environments; numerical simulation experiments verified the algorithm's efficiency in various scenes, but because it relies on GPU computation, its hardware configuration requirements are very high. Cai et al. adopted an optimization algorithm based on cognitive behavior to realize real-time flight path planning in a 3-dimensional environment; the approach first designs the track route with a three-level function model, grades the track objective function into high, medium and low levels, and optimizes with a cognitive behavior optimization algorithm. Experimental results show it outperforms the particle swarm and RRT algorithms, but the track route is difficult to grade in an actual flight environment. Wan et al. used the DeepLabV3+ deep learning model to segment fruit tree canopy images and extracted the route from the canopy barycenters of the segmented binary image, achieving 95% route extraction accuracy; however, that algorithm can only plan routes over crops with canopies, which is a clear limitation.
In summary, algorithms for dynamic track planning are currently few, and both conventional planning algorithms and intelligent optimization algorithms generally converge slowly and fall easily into local optima when solving complex dynamic track planning problems. An algorithm that can efficiently handle the dynamic multi-constraint flight path planning problem is therefore needed, and for this purpose a method for planning the dynamic flight path of the unmanned aerial vehicle based on a reinforcement learning differential algorithm is provided.
Disclosure of Invention
The invention aims to disclose an unmanned aerial vehicle dynamic track planning method based on a reinforcement learning differential algorithm, solving the problem of how to efficiently handle dynamic multi-constraint track planning.
In order to achieve the purpose, the invention adopts the following technical scheme:
an unmanned aerial vehicle dynamic track planning method based on a reinforcement learning difference algorithm comprises the steps of
S1: acquiring a terrain environment in which the unmanned aerial vehicle needs to fly;
s2: establishing a flight path planning model according to the acquired environmental data and the performance constraint of the unmanned aerial vehicle, representing the environment as an artificial potential field, establishing a gravitational potential field by taking a target point as a center, and establishing a repulsive potential field by taking an obstacle and a threat as centers;
s3: when a track planning model is established, a function structure body for correcting positioning errors is added, the current resultant force borne by the unmanned aerial vehicle is calculated according to the artificial potential field, and the unmanned aerial vehicle is enabled to advance under the action of the resultant force;
s4: designing a reinforcement learning difference algorithm based on a flight path planning model;
s5: optimizing the reinforcement learning differential algorithm, implanting the optimized algorithm into the unmanned aerial vehicle's intelligent system, and solving with the optimized reinforcement learning differential algorithm to complete the flight path planning of the unmanned aerial vehicle.
Preferably, adding the function structure for positioning error correction in S3 includes the following steps:
s21: setting an unmanned aerial vehicle flight path planning area consisting of 1 departure point, 1 destination, R horizontal correction points and L vertical correction points;
s22: constructing an unmanned aerial vehicle track planning area containing 2 + R + L points; the unmanned aerial vehicle needs to be positioned in real time during flight, and its positioning error comprises a vertical error and a horizontal error, each of which increases by δ units for every 1 m flown; both errors must be smaller than θ units when the unmanned aerial vehicle reaches the target point, so that it can fly according to the planned track;
s23: the unmanned aerial vehicle needs to correct its positioning error during flight, and correction points in the track planning area can be used for error correction. When the unmanned aerial vehicle reaches a correction point, it corrects its error according to the error correction type of that point. The positions for correcting vertical and horizontal errors can be determined from the terrain before track planning. When the vertical and horizontal errors are corrected in time, the unmanned aerial vehicle can fly along the preset route, finally reaching the destination after error correction at several correction points.
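The error model of steps S21–S23 can be sketched as follows. This is a minimal illustration under two assumptions not stated verbatim in the text: reaching a correction point resets the error type it handles to zero, and a track is feasible only if both errors stay below θ on arrival at the target. The function name and leg representation are hypothetical.

```python
# Sketch of the positioning-error model in steps S21-S23 (assumptions:
# a correction point zeroes its error type; feasibility requires both
# errors below theta at the target).

def simulate_errors(legs, delta, theta):
    """legs: list of (length_m, correction) tuples, where correction is
    'vertical', 'horizontal', or None for the final leg to the target.
    Returns (feasible, vertical_error, horizontal_error)."""
    v_err = h_err = 0.0
    for length, correction in legs:
        v_err += delta * length      # both errors grow by delta per metre
        h_err += delta * length
        if correction == 'vertical':
            v_err = 0.0              # vertical correction point
        elif correction == 'horizontal':
            h_err = 0.0              # horizontal correction point
    return (v_err < theta and h_err < theta), v_err, h_err
```

For example, flying 100 m to a vertical correction point and then 50 m to the target with δ = 0.001 leaves errors of 0.05 (vertical) and 0.15 (horizontal), both under a θ of 0.2.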
Preferably, the design of the reinforcement learning differential evolution algorithm in S4 comprises the following steps: s31: combining reinforcement learning with a differential evolution algorithm, and adopting a Q-learning algorithm or a deep Q-learning algorithm as the agent for intelligent decision-making;
s32: analyzing the optimization problem using dispersion measurement, autocorrelation roughness, terrain information roughness and fitness cloud, and taking the fitness landscape feature information of the optimization problem as the state space of the reinforcement learning agent;
s33: selecting the control parameters and mutation strategies of the differential evolution algorithm as the action space of the agent, and designing population evolution efficiency as the agent's reward;
s34: finally, the agent obtains local information about the optimization problem through the state space, executes the corresponding action-space operation according to the state information, calculates the reward obtained after executing the action, and returns the reward to the agent.
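Steps S31–S34 can be sketched with a minimal tabular Q-learning agent that picks a differential evolution mutation strategy each generation. The strategy list, ε-greedy policy and state discretization are illustrative assumptions, not the patent's exact design; the state would be a discretized summary of the fitness landscape features from S32 and the reward the population evolution efficiency from S33.

```python
import random

# Minimal sketch of S31-S34: a tabular Q-learning agent selecting a
# differential-evolution mutation strategy per generation (illustrative
# assumptions, not the patent's exact design).

STRATEGIES = ['DE/rand/1', 'DE/best/1', 'DE/current-to-best/1']

class StrategyAgent:
    def __init__(self, alpha=0.1, gamma=0.9, eps=0.2):
        self.q = {}                  # (state, action) -> estimated value
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def choose(self, state):
        """Pick a mutation strategy index for the given landscape state."""
        if random.random() < self.eps:          # explore
            return random.randrange(len(STRATEGIES))
        return max(range(len(STRATEGIES)),      # exploit best known
                   key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update with the evolution-efficiency reward."""
        best_next = max(self.q.get((next_state, a), 0.0)
                        for a in range(len(STRATEGIES)))
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (
            reward + self.gamma * best_next - old)
```

In use, the evolutionary loop would call `choose` before generating each new population and `update` after measuring how much the population improved.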
Preferably, the calculation of the resultant force in S2 determines the movement direction of the drone according to the following formula:

$F_{att} = k\,(X_g - X)$

where $F_{att}$ indicates the attraction of the target to the drone, $X_g$ is the coordinate vector of the target, and $X$ is the coordinate vector of the current position of the drone; $k$ is a coefficient with value between 0 and 1. $F_{rep}$ denotes the repulsion force of the no-fly zones on the unmanned aerial vehicle, and the existing repulsion field function is adopted in this scheme to complete the calculation of $F_{rep}$. The resultant force $F = F_{att} + F_{rep}$ of the attraction and repulsion gives the moving direction of the unmanned aerial vehicle.
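A minimal sketch of this resultant-force calculation follows. Since the text only refers to "the existing repulsion field function", the repulsive term below uses the standard bounded-influence repulsion field; all parameter values and names are assumptions.

```python
import math

# Sketch of the S2/S3 resultant force: attraction F_att = k*(Xg - X)
# plus a standard repulsion term from each circular no-fly zone within
# an influence range d0 (the repulsion function itself is an assumption).

def resultant_force(x, goal, obstacles, k=0.5, k_rep=1.0, d0=5.0):
    """x, goal: (x, y) position tuples; obstacles: list of ((cx, cy), r)."""
    fx = k * (goal[0] - x[0])           # attraction grows with distance
    fy = k * (goal[1] - x[1])
    for (cx, cy), r in obstacles:
        dx, dy = x[0] - cx, x[1] - cy
        dist = math.hypot(dx, dy)
        d = dist - r                    # distance to the zone boundary
        if 0 < d < d0:                  # repel only inside the influence range
            mag = k_rep * (1.0 / d - 1.0 / d0) / d**2
            fx += mag * dx / dist       # push away from the zone center
            fy += mag * dy / dist
    return fx, fy
```

With no no-fly zone in range, the force reduces to the pure attraction $k(X_g - X)$, pointing straight at the target.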
Preferably, in step S5, solving with the algorithm optimized by the reinforcement learning differential algorithm completes both the flight path planning of the unmanned aerial vehicle and the obstacle avoidance on the flight path under the constraint conditions.
Preferably, obstacle avoidance under the constraint conditions includes the following steps: s61: inputting the initial position of the unmanned aerial vehicle as the current position, the center positions of the m no-fly zones, and the target position G assigned to the drone;
s62: taking two variables G1 and G2 to represent, respectively, the target position during calculation and the final target position, and initializing G1 = G2 = G; opening two storage spaces A and B, and storing the current position of the unmanned aerial vehicle in A; initializing the iteration count num = 0;
s63: determining the motion direction of the unmanned aerial vehicle, setting its motion step length to L, moving the unmanned aerial vehicle from the current position along the determined direction by the step length L, updating the current position to the moved position, storing that position in A, and setting the iteration count num = num + 1;
s64: judging whether num > N holds; if yes, setting num = 0 and performing step S65, otherwise returning to step S63, where N is a preset total number of iterations;
s65: judging whether the distance d between the current position and G1 satisfies d < d0, where d0 is a preset distance threshold;
s66: judging whether the last M position points stored in A all lie within a preset circular area; if so, the unmanned aerial vehicle is currently at an equilibrium position or a local minimum point, and jump-out processing is performed; if not, continuing with step S63;
s67: solving the straight-line expression between the last two points stored in A;
s68: judging whether the straight line intersects any circular no-fly zone; if not, returning to step S63; otherwise, assigning the last stored position in A to G1, emptying A, and then performing step S63;
s69: storing all the positions in A into B, and judging whether G1 equals G2; if not, restoring the target by setting G1 = G2 and then proceeding to step S63;
s610: the position points stored in B form the obstacle avoidance track of the unmanned aerial vehicle.
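The intersection test in step S68 can be sketched with the standard clamped point-to-segment distance check; the function name and interface below are hypothetical.

```python
import math

# Sketch of step S68: does the straight segment between the last two
# stored positions intersect a circular no-fly zone?

def segment_hits_circle(p1, p2, center, radius):
    (x1, y1), (x2, y2), (cx, cy) = p1, p2, center
    dx, dy = x2 - x1, y2 - y1
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:                      # degenerate segment: a point
        return math.hypot(cx - x1, cy - y1) <= radius
    # Project the circle center onto the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((cx - x1) * dx + (cy - y1) * dy) / seg_len2))
    nearest = (x1 + t * dx, y1 + t * dy)
    return math.hypot(cx - nearest[0], cy - nearest[1]) <= radius
```

Running this against every no-fly zone implements the branch in S68: any hit reassigns G1 to the last stored position, otherwise the drone keeps stepping toward the current target.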
Preferably, establishing the track planning model in S2 further includes the following steps: s71: acquiring image data of the target area, including surface topography data and plant data;
s72: obtaining an initial route of the unmanned aerial vehicle based on the image data of the target area;
s73: extracting a first actual geographic coordinate of the initial air route based on the inflection point position on the initial air route, and adjusting the first actual geographic coordinate based on the elevation value of the surface topographic data to obtain a first elevation coordinate;
s74: adjusting the initial route based on the first elevation coordinate to obtain a terrain route;
s75: dividing the initial route into sections at a preset distance, and extracting a second actual geographic coordinate of an end point of each section point by point;
s76: adjusting a second actual geographic coordinate based on the crop planting data to obtain a second elevation coordinate, and adjusting the initial air route based on the second elevation coordinate to obtain a crop planting air route;
s77: and establishing a track planning model based on the terrain route and the plant route.
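Steps S73–S76 amount to lifting each route waypoint by the terrain elevation and crop canopy height; a minimal sketch follows, in which the elevation and crop-height lookups and the clearance value are illustrative assumptions.

```python
# Sketch of steps S73-S76: adjust waypoint elevations from a terrain
# model and crop canopy height (lookup functions and clearance value
# are illustrative assumptions).

def adjust_route(waypoints, elevation, crop_height, clearance=2.0):
    """waypoints: [(x, y)]; elevation/crop_height: callables of (x, y).
    Returns [(x, y, z)] flying `clearance` metres above the canopy."""
    route = []
    for x, y in waypoints:
        z = elevation(x, y) + crop_height(x, y) + clearance
        route.append((x, y, z))
    return route
```

In practice the `elevation` callable would sample the surface topography data of S73 and `crop_height` the crop planting data of S76, applied first at inflection points and then at the densified section endpoints.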
Compared with the prior art, the unmanned aerial vehicle dynamic track planning method based on the reinforcement learning differential algorithm has the following beneficial effects:
1. Aiming at the insufficient diversity of solutions under complex dynamic multi-constraint conditions, the method provides a constraint handling approach that combines an adaptive relaxation variable method with a feasibility criterion. This avoids the reduction of solution diversity, shortens the time the algorithm spends searching for the optimal solution, reduces the difficulty of optimization, and improves the efficiency of the algorithm.
2. The method obtains landscape information about the optimization problem with fitness landscape analysis methods such as dispersion measurement, autocorrelation roughness, terrain information roughness and fitness cloud, and uses it as the state space of the agent. By combining a deep reinforcement learning algorithm with differential evolution, the differential evolution algorithm can adaptively select the optimal mutation strategy through the agent's decisions while solving the boundary-limited continuous-domain optimization problem, find the optimal solution quickly and efficiently in real time, and realize dynamic planning of the flight path.
Drawings
The invention is further illustrated by means of the attached drawings, but the embodiments in the drawings do not constitute any limitation on the invention; a person skilled in the art may, without inventive effort, derive further drawings from the following figures.
Fig. 1 is a schematic flow diagram of an unmanned aerial vehicle dynamic flight path planning method based on a reinforcement learning difference algorithm according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
As shown in fig. 1, the present invention provides a method for planning a dynamic flight path of an unmanned aerial vehicle based on a reinforcement learning difference algorithm, including:
s1: acquiring a terrain environment in which the unmanned aerial vehicle needs to fly;
s2: establishing a track planning model according to the acquired environment data and the performance constraint of the unmanned aerial vehicle, representing the environment as an artificial potential field, establishing a gravitational potential field by taking a target point as a center, and establishing a repulsive potential field by taking an obstacle and a threat as centers;
s3: when a track planning model is established, a function structural body for correcting a positioning error is added, the current resultant force borne by the unmanned aerial vehicle is calculated according to the artificial potential field, and the unmanned aerial vehicle is enabled to move forward under the action of the resultant force;
s4: designing a reinforcement learning differential algorithm based on the flight path planning model. The differential calculation is an operation performed using differences; reinforcement learning, also known as evaluative learning, is one of the paradigms and methodologies of machine learning, used to describe and solve the problem of an agent maximizing its return or achieving a specific goal through a learned strategy while interacting with an environment.
A common model for reinforcement learning is the standard Markov decision process. Under given conditions, reinforcement learning can be classified into model-based and model-free reinforcement learning, and into active and passive reinforcement learning. Variants include inverse reinforcement learning, hierarchical reinforcement learning, and reinforcement learning for partially observable systems. Algorithms for solving the reinforcement learning problem fall into two types: policy search algorithms and value function algorithms. Deep learning models can be used within reinforcement learning, forming deep reinforcement learning.
Reinforcement learning does not require any data to be given in advance; instead, it obtains learning information and updates model parameters by receiving the environment's feedback on its actions.
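The value-function branch mentioned above can be made concrete with the tabular Q-learning update rule (a standard textbook formula, not quoted from the patent text):

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]
```

Here $\alpha$ is the learning rate, $\gamma$ the discount factor, and $r_{t+1}$ the reward fed back by the environment — matching the statement that model parameters are updated purely from environmental feedback.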
The reinforcement learning problem is also discussed in information theory, game theory, automatic control and related fields, where it is used to explain equilibrium states under bounded rationality and to design recommendation systems and robot interaction systems.
S5: and optimizing the reinforcement learning differential algorithm, implanting the optimized reinforcement learning differential algorithm into an intelligent system of the unmanned aerial vehicle, and solving the algorithm based on the optimization of the reinforcement learning differential algorithm to complete the flight path planning of the unmanned aerial vehicle.
Preferably, the function structure for increasing the positioning error correction in S3 includes the following steps;
s21: setting an unmanned aerial vehicle flight path planning area consisting of 1 departure point, 1 destination, R horizontal correction points and L vertical correction points;
s22: constructing an unmanned aerial vehicle track planning area containing 2 + R + L points; the unmanned aerial vehicle needs to be positioned in real time during flight, and its positioning error comprises a vertical error and a horizontal error, each of which increases by δ units for every 1 m flown; both errors must be smaller than θ units when the unmanned aerial vehicle reaches the target point, so that it can fly according to the planned track;
s23: the unmanned aerial vehicle needs to correct its positioning error during flight, and correction points in the track planning area can be used for error correction. When the unmanned aerial vehicle reaches a correction point, it corrects its error according to the error correction type of that point. The positions for correcting vertical and horizontal errors can be determined from the terrain before track planning. When the vertical and horizontal errors are corrected in time, the unmanned aerial vehicle can fly along the preset route, finally reaching the destination after error correction at several correction points.
Preferably, the design of the reinforcement learning differential evolution algorithm in S4 comprises the following steps: s31: combining reinforcement learning with a differential evolution algorithm, and adopting a Q-learning algorithm or a deep Q-learning algorithm as the agent for intelligent decision-making;
s32: analyzing the optimization problem using dispersion measurement, autocorrelation roughness, terrain information roughness and fitness cloud, and taking the fitness landscape feature information of the optimization problem as the state space of the reinforcement learning agent;
s33: selecting the control parameters and mutation strategies of the differential evolution algorithm as the action space of the agent, and designing population evolution efficiency as the agent's reward;
s34: finally, the intelligent agent obtains local information of the optimization problem through the state space, executes the corresponding operation of the action space according to the state space information, and the reward obtained after the corresponding action is executed is calculated and returned to the intelligent agent; the intelligent agent is continuously trained and tested on the IEEE Congress on Evolutionary Computation (CEC) dynamic optimization problem test sets, so that the reinforcement learning differential evolution algorithm can quickly and efficiently find the optimal solution in real time as the constraint conditions change, realizing dynamic planning of the flight path.
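The loop of s31-s34 can be sketched as a tabular Q-learning controller wrapped around a DE/rand/1/bin iteration. The state discretization, the (F, CR) action set and the reward shaping below are illustrative assumptions, not the patent's exact design:

```python
import random

def sphere(x):
    # stand-in objective; the patent trains on CEC dynamic optimization test sets
    return sum(v * v for v in x)

class QLearningDEAgent:
    """Tabular Q-learning agent that picks DE control parameters (F, CR)."""
    def __init__(self, n_states=4, actions=((0.5, 0.9), (0.8, 0.9), (0.5, 0.1), (0.8, 0.1)),
                 alpha=0.1, gamma=0.9, eps=0.2):
        self.q = [[0.0] * len(actions) for _ in range(n_states)]
        self.actions, self.alpha, self.gamma, self.eps = actions, alpha, gamma, eps

    def act(self, s):
        if random.random() < self.eps:          # epsilon-greedy exploration
            return random.randrange(len(self.actions))
        row = self.q[s]
        return row.index(max(row))

    def learn(self, s, a, r, s2):               # standard Q-learning update
        self.q[s][a] += self.alpha * (r + self.gamma * max(self.q[s2]) - self.q[s][a])

def rl_de(f, dim=5, pop=20, gens=60, seed=1):
    rng = random.Random(seed)
    X = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop)]
    fit = [f(x) for x in X]
    agent, state = QLearningDEAgent(), 0
    for _ in range(gens):
        a = agent.act(state)
        F, CR = agent.actions[a]
        best_before, improved = min(fit), 0
        for i in range(pop):
            r1, r2, r3 = rng.sample([j for j in range(pop) if j != i], 3)
            jr = rng.randrange(dim)
            # DE/rand/1/bin mutation + binomial crossover
            trial = [X[r1][j] + F * (X[r2][j] - X[r3][j])
                     if (rng.random() < CR or j == jr) else X[i][j]
                     for j in range(dim)]
            ft = f(trial)
            if ft <= fit[i]:                    # greedy selection
                X[i], fit[i] = trial, ft
                improved += 1
        # reward = population evolution efficiency (fraction improved + best-fitness gain)
        reward = improved / pop + (best_before - min(fit)) / (abs(best_before) + 1e-12)
        new_state = min(3, improved * 4 // pop)  # crude progress-based state
        agent.learn(state, a, reward, new_state)
        state = new_state
    return min(fit)
```

In this sketch the agent observes only a coarse progress signal, whereas the patent's state space is built from fitness terrain features (dispersion, autocorrelation roughness, fitness cloud).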
Preferably, in S2, the motion direction of the drone is determined according to the following formula: F = F_att + F_rep, wherein F_att = k(X_G − X) denotes the attraction of the target to the drone, X_G is the coordinate vector of the target, X is the coordinate vector of the current position of the drone, and k is a coefficient taking a value between 0 and 1; F_rep denotes the repulsion of the no-fly zone on the drone, and is calculated in this scheme with the existing repulsive field function; the resultant force F of the attraction and the repulsion gives the motion direction of the drone.
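A minimal numeric sketch of this resultant-force computation; the standard repulsive-field form and the values of k, eta and d0 below are illustrative assumptions:

```python
import math

def attraction(X, G, k=0.5):
    # F_att = k * (G - X): pulls the drone toward the target point G
    return [k * (g - x) for x, g in zip(X, G)]

def repulsion(X, O, eta=1.0, d0=5.0):
    # classic repulsive-field form eta*(1/d - 1/d0)/d^2, pointing away from obstacle O,
    # active only inside the influence radius d0
    diff = [x - o for x, o in zip(X, O)]
    d = math.sqrt(sum(v * v for v in diff))
    if d >= d0 or d == 0:
        return [0.0] * len(X)
    mag = eta * (1.0 / d - 1.0 / d0) / (d * d)
    return [mag * v / d for v in diff]

def resultant(X, G, obstacles, k=0.5, eta=1.0, d0=5.0):
    # resultant force F = F_att + sum of F_rep over all no-fly zones
    F = attraction(X, G, k)
    for O in obstacles:
        R = repulsion(X, O, eta, d0)
        F = [f + r for f, r in zip(F, R)]
    return F
```

With the drone at the origin, the target at (10, 0) and an obstacle behind at (−1, 0), the repulsion adds to the attraction along +x, so the drone is pushed toward the target.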
Preferably, in step S5, the algorithm optimized on the basis of the reinforcement learning difference algorithm is solved, so as to complete the track planning of the unmanned aerial vehicle and the obstacle avoidance under the constraint conditions on the track.
The constraint condition obstacle avoidance method comprises the following steps: s61: inputting the initial position of the unmanned aerial vehicle as the current position, the central positions of the m no-fly zones, and the target position G assigned to the unmanned aerial vehicle;
s62: taking two variables G1 and G2 to represent, respectively, the target position during the calculation and the final target position, and initializing G1 = G2 = G; opening up two storage spaces A and B, and storing the current position of the unmanned aerial vehicle in A; initializing the iteration count num = 0;
s63: determining the motion direction of the unmanned aerial vehicle, setting its motion step length to L, moving the unmanned aerial vehicle from the current position along the determined motion direction by the step length L, updating the current position with the moved position, and storing the position of the unmanned aerial vehicle in A, wherein the iteration count num = num + 1;
s64: judging whether num > N is true, if yes, setting num =0 and performing step S65, otherwise, returning to step S63; wherein N is a preset total number of iterations;
s65: judging whether the distance d between the current position and G1 satisfies d < d0, where d0 is a preset distance threshold;
s66: judging whether the last M position points stored in A are all in a preset circular area, if so, indicating that the position points are in a balance position or a local minimum point currently, and performing jump-out processing; if not, continuing to step S63;
s67: solving the straight-line expression between the two points most recently stored in A;
s68: judging whether the straight line intersects with each circular no-fly zone, if not, returning to the step S63, otherwise, assigning the last stored position of A to G1, emptying A, and then performing the step S63;
s69: storing all the positions in A into B, and judging whether G1 is equal to G2; if not, setting the current position to G1 and G1 = G2, and then proceeding to step S63;
s610: and the position points stored in the B are the obstacle avoidance tracks of the unmanned aerial vehicle.
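The stepping loop of s61-s610 can be sketched as follows. The force model is reduced to pure attraction, with a crude perpendicular sidestep standing in for the jump-out and line-intersection handling of s66-s69, so this illustrates the control flow rather than the full method:

```python
import math

def seg_circle_intersects(p1, p2, c, r):
    """Does segment p1-p2 cut the circular no-fly zone (center c, radius r)? (cf. s68)"""
    (x1, y1), (x2, y2), (cx, cy) = p1, p2, c
    dx, dy = x2 - x1, y2 - y1
    seg2 = dx * dx + dy * dy
    # parameter of the closest point on the segment, clamped to [0, 1]
    t = 0.0 if seg2 == 0 else max(0.0, min(1.0, ((cx - x1) * dx + (cy - y1) * dy) / seg2))
    px, py = x1 + t * dx, y1 + t * dy
    return math.hypot(px - cx, py - cy) <= r

def plan(start, goal, zones, step=0.5, d0=0.6, max_iter=500):
    """Greedy stepping loop: move by `step` toward the goal, storing positions in A,
    and stop once within d0 of the goal (s63-s65)."""
    A = [start]
    pos = start
    for _ in range(max_iter):
        gx, gy = goal[0] - pos[0], goal[1] - pos[1]
        d = math.hypot(gx, gy)
        if d < d0:                      # s65: (intermediate) target reached
            break
        ux, uy = gx / d, gy / d
        nxt = (pos[0] + step * ux, pos[1] + step * uy)
        for (c, r) in zones:
            if seg_circle_intersects(pos, nxt, c, r):
                # sidestep perpendicular to the goal direction (stand-in for s66/s68)
                nxt = (pos[0] - step * uy, pos[1] + step * ux)
                break
        pos = nxt
        A.append(pos)
    return A
```

With a single circular no-fly zone placed on the straight line to the goal, the sketch detours around it and still terminates within d0 of the goal.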
Preferably, in S2, the establishing of the track planning model further includes the following steps:
s71: acquiring image data of a target area including surface topography data and plant data;
s72: obtaining an initial route of the unmanned aerial vehicle based on the image data of the target area;
s73: extracting a first actual geographic coordinate of the initial air route based on the inflection point position on the initial air route, and adjusting the first actual geographic coordinate based on the elevation value of the surface topographic data to obtain a first elevation coordinate;
s74: adjusting an initial route based on the first elevation coordinate to obtain a terrain route;
s75: dividing the initial route into sections at a preset distance, and extracting a second actual geographic coordinate of an endpoint of each section point by point;
s76: adjusting a second actual geographic coordinate based on the crop planting data to obtain a second elevation coordinate, and adjusting an initial route based on the second elevation coordinate to obtain a crop planting route;
s77: and establishing a flight path planning model based on the terrain route and the plant crop route.
Preferably, in S71, the plant data includes a plant type and a plant area, and is obtained by:
s711, dividing the target area into a plurality of sub-areas;
s712, respectively obtaining aerial photos of each sub-area;
and S713, carrying out image recognition processing on the aerial photographs, and acquiring the type of the plant crops contained in each sub-area and the area of each type of plant crops.
Specifically, an aerial-photography unmanned aerial vehicle can be flown manually to acquire the aerial photographs of the sub-areas. Because the endurance time of the unmanned aerial vehicle is limited, it is difficult to acquire an aerial photograph of the whole target area directly, so the invention divides the target area and then acquires the aerial photographs of the sub-areas separately.
Preferably, in S713, the image recognition processing of the aerial photograph includes:
carrying out enhancement processing on the aerial photo to obtain an enhanced image;
and inputting the enhanced image into a pre-trained neural network model for image recognition processing to obtain the types of the plant crops contained in the enhanced image and calculate the occupied areas of the various types of plant crops.
Preferably, the enhancing the aerial photo to obtain an enhanced image includes:
performing illumination optimization processing on the aerial photo to obtain a first image;
carrying out noise reduction processing on the first image to obtain a second image;
and extracting the region of interest of the second image to obtain an enhanced image.
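The three enhancement stages above can be sketched on a small grayscale grid. The gamma lift, 3×3 median filter and rectangular crop below are simple stand-ins for the patent's Retinex-based illumination optimization, its noise reduction and its region-of-interest extraction:

```python
def illumination_optimize(img, gamma=0.8):
    # gamma lift standing in for the Retinex-based illumination step -> first image
    return [[int(255 * (p / 255) ** gamma) for p in row] for row in img]

def denoise(img):
    # 3x3 median filter (interior pixels only; border copied) -> second image
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            win = sorted(img[a][b] for a in (i - 1, i, i + 1) for b in (j - 1, j, j + 1))
            out[i][j] = win[4]
    return out

def extract_roi(img, top, left, height, width):
    # rectangular crop standing in for region-of-interest extraction -> enhanced image
    return [row[left:left + width] for row in img[top:top + height]]

def enhance(img, roi):
    first = illumination_optimize(img)
    second = denoise(first)
    return extract_roi(second, *roi)
```

A single salt-noise spike in a flat 5×5 image is removed by the median stage, and the crop returns only the requested region.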
When the camera is used for aerial photography, its imaging can be affected by cloud cover, which unbalances the illumination distribution, and by air quality. Therefore, by performing illumination optimization processing on the aerial photograph, the invention effectively reduces the influence of uneven illumination on the final identification of plant-crop types and occupied areas, thereby improving the safety of the flight path planning of the invention.
Preferably, the performing illumination optimization processing on the aerial photo to obtain a first image includes:
s81: decomposing the aerial photo by using an improved Retinex model, and decomposing the aerial photo into an illumination component image L and a reflection component image S;
s82: dividing the illumination component image L into a plurality of sub-images, and storing all the sub-images obtained by the division into a set cutLSet;
s83: respectively acquiring an illumination distribution value of each sub-image in the set cutLSet;
s84: dividing the reflection component image S into a plurality of sub-images, and storing all the sub-images obtained by the division into a set cutSSet;
s85: and respectively carrying out optimization processing on each sub-image in the set cutSSet through a preset model to obtain a first image.
The existing Retinex algorithm generally processes the obtained reflection component image directly to produce an illumination optimization result, but such processing ignores the information carried by the illumination component, so the final processing result is not accurate enough. Therefore, after obtaining the illumination component image and the reflection component image, the invention obtains the illumination distribution value by blocking the illumination component image and adds it to the illumination optimization processing of the reflection component image S, further improving the accuracy of the illumination optimization result.
When the illumination distribution value is obtained, the invention accelerates its acquisition by dividing the illumination component image L. Similarly, when the reflection component image S is optimized, the division processing avoids computing the parameters of the processing formula separately for every pixel point, reducing the amount of parameter calculation and accelerating the computation while its accuracy is ensured.
Preferably, S81 includes:
s811: the pixel value of each pixel point in the illumination component image L is obtained by the following equation:
wherein the equation to be solved is expressed in terms of: the pixel value of pixel point d in the illumination component image L; a constant coefficient; the set of pixel points within a window of preset size centered on pixel point d in L, together with the number of pixel points in that set; the pixel value of a pixel point g in L; two control parameters greater than 0; the pixel values of the pixel points corresponding to d and g in the aerial photograph I; and k, the number of operations (the formula is given as an image in the original patent);
s812: the reflection component image S is acquired using the following formula:
wherein (x, y) denotes the coordinates of a pixel point, and I(x, y), L(x, y) and S(x, y) respectively denote the pixel values at coordinates (x, y) in the aerial photograph I, the illumination component image L and the reflection component image S; per the Retinex relation I = L · S, the reflection component is S(x, y) = I(x, y) / L(x, y).
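Under the classic Retinex assumption that the photograph factorizes as I = L · S, the reflection component can be recovered element-wise; a minimal sketch (not the patent's exact improved-Retinex formula):

```python
def reflection_component(I, L, eps=1e-6):
    # Retinex relation I = L * S, hence S = I / L element-wise;
    # eps guards against division by zero in dark illumination regions
    return [[i / max(l, eps) for i, l in zip(ri, rl)] for ri, rl in zip(I, L)]
```

For a pixel observed at 100 under an illumination of 200, the recovered reflectance is 0.5.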
In the process of acquiring the illumination component image L and the reflection component image S, the conventional Retinex algorithm does not consider the influence of the surrounding pixel points on the result, so the acquired illumination component image and reflection component image are not accurate enough.
Preferably, the S82 includes:
s821: smoothing the illumination component image L to obtain a smoothed illumination component image smL;
s822: the smL is divided in the following mode:
a first round of division processing:
dividing smL into D sub-images with the same number of pixel points, and storing all sub-images obtained by the division into a candidate set;
respectively calculating the judgment coefficient of each sub-image in the candidate set; keeping the sub-images whose judgment coefficient is larger than the set judgment-coefficient threshold in the candidate set, and storing the sub-images whose judgment coefficient is less than or equal to the threshold into the set cutLSet;
the n-th round of division processing, where n ≥ 2:
respectively dividing each sub-image remaining in the candidate set after the (n−1)-th round into D sub-images with the same number of pixel points, and storing all sub-images obtained by the division into the candidate set;
respectively calculating the judgment coefficient of each of these sub-images; keeping the sub-images whose judgment coefficient is larger than the threshold in the candidate set, and storing the sub-images whose judgment coefficient is less than or equal to the threshold into the set cutLSet;
judging whether the number of elements in the candidate set is smaller than the set number threshold; if so, the division processing of smL is finished, and the sub-images contained in the current set cutLSet are taken as the division result.
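The round-by-round splitting above can be sketched as a recursive quadtree-style subdivision, with plain pixel variance standing in for the patent's judgment coefficient and a fixed threshold as the stopping rule (both illustrative):

```python
def variance(vals):
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)

def subdivide(block, thresh=100.0, min_size=2):
    """Recursively split a square pixel block while its variance (stand-in for the
    judgment coefficient) exceeds thresh; return the leaf blocks (cf. cutLSet)."""
    flat = [p for row in block for p in row]
    n = len(block)
    if variance(flat) <= thresh or n < 2 * min_size:
        return [block]               # homogeneous enough, or too small to split
    h = n // 2
    quads = [[row[:h] for row in block[:h]], [row[h:] for row in block[:h]],
             [row[:h] for row in block[h:]], [row[h:] for row in block[h:]]]
    out = []
    for q in quads:
        out.extend(subdivide(q, thresh, min_size))
    return out
```

A uniform block stays whole, while a half-black/half-white block splits once into four homogeneous quadrants, mirroring the goal that pixels within a sub-image differ as little as possible.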
In the embodiment of the invention, the illumination component image L is divided into sub-images by first smoothing it and then dividing the smoothing result, which prevents pixel points with abrupt value changes from degrading the division efficiency. Since the invention needs the overall illumination distribution value of each sub-image, a single abrupt pixel point has very little influence on the sub-image as a whole but a very large influence on the division efficiency: pixel points with abrupt pixel values greatly increase the number of division rounds. The division makes the differences between pixel points within the same sub-image as small as possible and the differences between different sub-images as large as possible, so that the illumination distribution value is more representative.
Preferably, the S821 includes:
the illumination component image L is smoothed using the following formula:
wherein the smoothed pixel value of pixel point h in smL is computed from: the set of pixel points in a neighborhood of preset size around the pixel point corresponding to h in L; the length of the connecting line between that pixel point and each neighborhood pixel point m; the pixel values of the pixel point corresponding to h and of m in L; the variance of the distances between the neighborhood pixel points and the pixel point corresponding to h in L; and the variance of the differences between their pixel values (the formula is given as an image in the original patent).
While smoothing each pixel point, the embodiment of the invention also considers its relation to the surrounding pixel points in terms of both pixel value and distance, so that the pixel-value transitions in the smoothed image are more natural and the corresponding detail information is retained. If a Gaussian filter or the like were applied directly, the detail information would easily be lost, affecting the accuracy of the sub-image division result.
Preferably, the judgment coefficient is calculated by the following formula:
wherein the judgment coefficient is computed from: the set of pixel points in the sub-image and the total number of pixel points it contains; the pixel value and the gradient magnitude of each pixel point u in smL; a variance reference value for the pixel values; a variance reference value for the gradient magnitudes; and a preset scaling factor (the formula is given as an image in the original patent).
In the embodiment of the invention, the judgment coefficient considers not only the pixel value but also the gradient magnitude; considering both aspects makes the differences between pixel points in the obtained sub-images smaller, which improves the accuracy of representing a whole sub-image by a single illumination distribution value.
Preferably, S83 includes:
s831: converting the sub-image to an HSV color space;
s832: acquiring an image V of a brightness component corresponding to the sub-image in an HSV color space;
s833: respectively counting the occurrence frequency of each pixel value in the image V;
s834: and taking the pixel value with the highest occurrence frequency as the illumination distribution value of the sub-image.
Since a single value is used to represent the whole sub-image, the invention takes the most frequently occurring pixel value as the illumination distribution value.
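Steps s831-s834 reduce to taking the mode of the HSV value channel; a sketch (using max(R, G, B) as the V component, the standard HSV definition):

```python
from collections import Counter

def value_channel(rgb_image):
    # V component in HSV is max(R, G, B) for each pixel (s831-s832)
    return [[max(px) for px in row] for row in rgb_image]

def illumination_distribution_value(rgb_image):
    # s833-s834: the most frequent brightness value over the sub-image
    v = value_channel(rgb_image)
    return Counter(p for row in v for p in row).most_common(1)[0][0]
```

For a sub-image whose pixels mostly peak at brightness 30, the illumination distribution value is 30.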
Preferably, the S84 includes:
s841: acquiring a characteristic image DS based on the reflection component image S;
s842: dividing the characteristic image DS into a plurality of sub-images;
s843: the division result of the division processing on the feature image DS is applied to the reflection component image S, and a set cutSSet of sub-images is obtained.
Specifically, the embodiment of the present invention does not directly perform the division processing on the reflection component image S, but obtains the feature image based on the reflection component image S and performs the division processing based on the feature image. The arrangement mode can improve the accuracy of the dividing processing result and ensure the dividing speed. In the feature image DS, since the pixel values are obtained by comprehensive calculation from a plurality of aspects, the information that can be expressed by the pixel values is richer than the information that can be expressed by the pixel values of the pixels of the original reflection component image S.
When the feature image is divided, the division method used for the smoothed illumination component image may be applied, or an existing image division method may be used.
In step S843, for example, a set corresponding to the pixel points in the sub-image Q obtained by the feature image DS is DSQ; and acquiring a set SDSQ of corresponding pixel points of the DSQ in the S, and forming the pixel points in the SDSQ into a sub-image.
Preferably, the S841 includes:
For the reflection component image S, the pixel value of a pixel point T in the feature image DS is obtained by the following formula:
wherein the pixel value of pixel point T in DS is a weighted combination, with preset weight coefficients, of the pixel values of T in the hue component image of the reflection component image S in the HSV color space, in the lightness component image of S in the HSV color space, and in the luminance component image of S in the Lab color space (the formula is given as an image in the original patent).
Specifically, the pixel values of the pixel points in the feature image are weighted and fused from the hue component, the lightness component and the luminance component, so that the information expressed by one pixel point in the feature image is richer.
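A sketch of this weighted fusion; the weights below are illustrative assumptions, since the patent's weight coefficients are preset but unspecified:

```python
def feature_image(H, V, Lab_L, w=(0.3, 0.4, 0.3)):
    """Fuse hue (HSV), lightness (HSV) and luminance (Lab) component images
    into one feature image by a per-pixel weighted sum; weights w are illustrative."""
    a, b, c = w
    return [[a * h + b * v + c * l for h, v, l in zip(rh, rv, rl)]
            for rh, rv, rl in zip(H, V, Lab_L)]
```

A pixel with component values (10, 20, 30) fuses to 0.3·10 + 0.4·20 + 0.3·30 = 20.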
Preferably, S85 includes:
the following formula is adopted to calculate and optimize the sub-image in the cutSSet:
wherein the optimized pixel value at coordinates (x, y) in a sub-image is computed from: the illumination distribution value of the sub-image in which the pixel point with coordinates (x, y) lies; the pixel value at coordinates (x, y) in the reflection component image S; and the permeability coefficient of that sub-image;
the permeability coefficient is computed, with preset weight parameters and a preset control coefficient, from the mean values of the red, green and blue components of the sub-image's pixel points in the RGB color space and from the variance of the dark channel values of its pixel points;
the set of pixel points corresponding to the sub-image's pixel points in the illumination component image L is acquired, and the mean of the illumination distribution values of the sub-images covering that set is taken as the sub-image's illumination distribution value (the formulas are given as images in the original patent).
During the optimization processing, the same parameters are used for all pixel points in the same sub-image; therefore, apart from the first pixel point processed in a sub-image, the other pixel points reuse the parameters obtained for that first pixel point, which effectively improves the efficiency of the optimization processing. Specifically, when the pixel points of the same sub-image are optimized, the illumination distribution value and the permeability coefficient need to be calculated only for the first pixel point processed; no calculation is needed for the others.
The dynamic track planning problem modeling studies a grid-based three-dimensional space division method: combined with environmental terrain information, a three-dimensional terrain flight environment model is established; the performance constraints of the unmanned aerial vehicle itself are analyzed, while external constraints such as terrain threats (obstacles), atmospheric threats (gusts and thick fog), sudden threats (birds) and no-fly zones (high-voltage towers) are considered to establish a mathematical model of the external environment constraint conditions; and a track evaluation function is constructed from the shortest track length, the smallest track threat and the lowest flight height, realizing the modeling of the dynamic track planning problem.
The precision, accuracy and optimizing speed of the algorithm are what the dynamic track planning problem demands of it, so designing an algorithm that can efficiently solve dynamic multi-constraint conditions is the key point of the research. The invention combines reinforcement learning with a differential evolution algorithm: the reinforcement learning differential evolution algorithm design studies the design of the action space, state space and reward function in the reinforcement learning algorithm, and establishes the relation between the reinforcement learning decision controller and the variation strategies and control parameters of the differential evolution algorithm, so that when solving different dynamic optimization problems the algorithm can adaptively select parameters and variation strategies in real time.
The dynamic track planning of the reinforcement learning differential evolution algorithm is to research a dynamic multi-constraint condition processing strategy, construct a proper track coding mode, find out the algorithm performance of the reinforcement learning-based differential evolution algorithm in the dynamic track planning problem, design a discrete track point smoothing processing algorithm and realize the dynamic planning of the plant protection unmanned aerial vehicle track under the dynamic multi-constraint condition.
The dynamic track planning problem modeling needs to acquire terrain information of an operation area, including the number of mountains, the heights of the mountains, the operation area, the area outline and the like, a grid three-dimensional space division method is adopted to establish a flight environment model, self performance constraints such as the maximum flight range, the minimum flight height, the maximum turning angle, the maximum diving angle, the minimum step length and the like of the plant protection unmanned aerial vehicle are considered, terrain threats, atmospheric threats, sudden threats, no-fly zones and other external environment constraints existing in a hilly mountain area terrain orange planting base are analyzed, and a multi-constraint condition equation is established; and constructing a track evaluation function according to the shortest track length, the lowest flight height and the smallest track threat, and realizing modeling of a track planning problem under the dynamic multi-constraint condition.
The method aims to solve problems such as the difficulty a differential evolution algorithm has in selecting a variation strategy when solving different optimization problems, and to further improve algorithm performance. Fitness terrain analysis methods such as information-entropy roughness and fitness distance correlation are used to analyze a series of continuous-domain single-objective optimization problems to obtain the fitness terrain features corresponding to each problem; a random forest establishes the relation between the fitness terrain features and the variation strategies of the differential evolution algorithm, realizing an improved differential evolution algorithm that adaptively selects the variation strategy according to the fitness terrain features of the problem when different optimization problems are solved.
And analyzing the single-target optimization problem limited by the boundary condition by adopting a fitness terrain analysis method, researching the relation between the fitness terrain characteristic and the optimization problem, and judging the complexity of the optimization problem through the fitness terrain analysis characteristic. And the differential evolution algorithm based on the local fitness terrain is realized by analyzing the local fitness terrain of the optimization problem.
The application of the reinforcement learning differential evolution algorithm to dynamic track planning comprises the following core steps: a constraint condition processing method combining adaptive relaxation variables with feasibility criteria is adopted to process the dynamic constraint equations, simplifying the constraint conditions, increasing the number of feasible solutions and speeding up the solving of the algorithm; adaptive weight factors convert the shortest track length, the lowest flight height and the smallest track threat into three mutually contradictory objective functions; the flight environment model is introduced, the population of the reinforcement learning differential evolution algorithm is encoded to solve the dynamic track planning problem, a new track smoothing algorithm based on splicing 5th-order PH curves is proposed to smooth the track points, the algorithm performance is verified through numerical simulation experiments, and finally the algorithm is embedded into the self-developed plant protection unmanned aerial vehicle flight control system and verified experimentally in a real environment.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (6)
1. An unmanned aerial vehicle dynamic flight path planning method based on reinforcement learning difference algorithm is characterized by comprising the following steps: s1: acquiring a terrain environment in which the unmanned aerial vehicle needs to fly;
s2: establishing a flight path planning model according to the acquired environmental data and the performance constraint of the unmanned aerial vehicle, representing the environment as an artificial potential field, establishing a gravitational potential field by taking a target point as a center, and establishing a repulsive potential field by taking an obstacle and a threat as centers;
s3: when a track planning model is established, a function structure body for correcting positioning errors is added, the current resultant force borne by the unmanned aerial vehicle is calculated according to the artificial potential field, and the unmanned aerial vehicle is enabled to advance under the action of the resultant force;
s4: designing a reinforcement learning difference algorithm based on a flight path planning model;
s5: optimizing the reinforcement learning differential algorithm, implanting the optimized reinforcement learning differential algorithm into an unmanned aerial vehicle intelligent system, and solving the algorithm optimized based on the reinforcement learning differential algorithm to complete the flight path planning of the unmanned aerial vehicle;
the design of the reinforcement learning differential evolution algorithm in S4 comprises the following steps:
s31: combining reinforcement learning and a differential evolution algorithm, and adopting a Q learning algorithm or a deep Q learning algorithm as an intelligent agent to carry out intelligent decision;
s32: analyzing the optimization problem by using the dispersion metric, the autocorrelation roughness, the terrain information roughness and the fitness cloud, and using the fitness terrain feature information of the optimization problem as the state space of the reinforcement learning intelligent agent;
s33: selecting a control parameter and a variation strategy of a differential evolution algorithm as an action space of the intelligent agent, and designing population evolution efficiency as reward of the intelligent agent;
s34: and finally, the intelligent agent obtains the local information of the optimization problem through the state space, executes the corresponding operation of the action space according to the state space information, calculates the reward obtained after the corresponding action operation is executed, and returns the reward to the intelligent agent.
2. The unmanned aerial vehicle dynamic track planning method based on the reinforcement learning difference algorithm according to claim 1, wherein the step of adding a function structure body for positioning error correction in the step S3 comprises the following steps:
s21: setting an unmanned aerial vehicle track planning area consisting of 1 departure point, 1 destination, R horizontal correction points and L vertical correction points of the unmanned aerial vehicle;
s22: constructing an unmanned aerial vehicle track planning area containing 2 + R + L points, wherein the unmanned aerial vehicle needs to be positioned in real time during its flight through the space; the positioning error comprises a vertical error and a horizontal error, each of which increases by δ units for every 1 m the unmanned aerial vehicle flies; both the vertical error and the horizontal error must be smaller than θ units when the unmanned aerial vehicle reaches the target point, so that the unmanned aerial vehicle can fly according to the planned track;
s23: the unmanned aerial vehicle needs to correct its positioning error during flight; correction points used for error correction exist in the track planning area, and when the unmanned aerial vehicle reaches a correction point, error correction is carried out according to the error correction type of that point; the positions for correcting vertical and horizontal errors are determined from the terrain before track planning; when the vertical error and the horizontal error are corrected in time, the unmanned aerial vehicle can fly according to the preset route, and finally reaches the destination after error correction at several correction points.
3. The unmanned aerial vehicle dynamic track planning method based on the reinforcement learning difference algorithm as claimed in claim 1, wherein the resultant force calculation in S3 determines the motion direction of the unmanned aerial vehicle according to the following formula: F = F_att + F_rep, wherein F_att = k(X_G − X) denotes the attraction of the target to the unmanned aerial vehicle, X_G is the coordinate vector of the target, X is the coordinate vector of the current position of the unmanned aerial vehicle, and k is a coefficient taking a value between 0 and 1; F_rep denotes the repulsion of the no-fly zone on the unmanned aerial vehicle and is calculated with the existing repulsive field function; the resultant force F of the attraction and the repulsion gives the motion direction of the unmanned aerial vehicle.
4. The unmanned aerial vehicle dynamic track planning method based on the reinforcement learning difference algorithm as claimed in claim 1, wherein in S5, the algorithm optimized by the reinforcement learning difference algorithm is solved to complete the track planning of the unmanned aerial vehicle and the constraint-condition obstacle avoidance on the track.
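Claim 4 relies on a differential (differential-evolution) algorithm whose parameters the method tunes with reinforcement learning. The reinforcement-learning tuning is not specified in this claim; what follows is only a sketch of the underlying differential-evolution core being optimized (DE/rand/1 mutation, binomial crossover, greedy selection; all parameter values are assumptions):

```python
import random

def differential_evolution(fitness, bounds, pop_size=20, F=0.5, CR=0.9, gens=100):
    """Minimal differential-evolution core. The claimed method additionally
    adapts F and CR via reinforcement learning, which is not shown here."""
    dim = len(bounds)
    pop = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(gens):
        for i in range(pop_size):
            # DE/rand/1: three distinct individuals other than pop[i]
            a, b, c = random.sample([p for j, p in enumerate(pop) if j != i], 3)
            j_rand = random.randrange(dim)        # at least one mutated dim
            trial = [a[d] + F * (b[d] - c[d])
                     if (random.random() < CR or d == j_rand) else pop[i][d]
                     for d in range(dim)]
            trial = [min(max(t, lo), hi) for t, (lo, hi) in zip(trial, bounds)]
            if fitness(trial) <= fitness(pop[i]): # greedy selection
                pop[i] = trial
    return min(pop, key=fitness)

# Minimize the 2-D sphere function as a toy stand-in for a track cost:
random.seed(0)
best = differential_evolution(lambda v: sum(x * x for x in v), [(-5, 5)] * 2)
```

In the claimed method the fitness would be a track-cost function built from the constraints of S2, rather than this toy objective.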
5. The unmanned aerial vehicle dynamic track planning method based on the reinforcement learning difference algorithm of claim 4, wherein the constraint condition obstacle avoidance comprises the following steps:
s61: inputting the initial position of the unmanned aerial vehicle as the current position, the central positions of the m no-fly zones, and the target position G assigned to the drone;
s62: taking two variables G1 and G2, respectively representing the target position used during calculation and the final target position, and initializing G1 = G2 = G; opening up two storage spaces A and B and storing the current position of the unmanned aerial vehicle in A; initializing the iteration number num = 0;
s63: determining the motion direction of the unmanned aerial vehicle and setting its motion step length to L; moving the unmanned aerial vehicle from the current position along the determined motion direction by the step length L, updating the current position with the moved position and storing it in A; the iteration number num = num + 1;
s64: judging whether num > N holds; if yes, setting num = 0 and performing step S65, otherwise returning to step S63, wherein N is a preset total number of iterations;
s65: judging whether the distance d between the current position and G1 is smaller than a preset distance threshold;
s66: judging whether the last M position points stored in A all lie within a preset circular area; if yes, the unmanned aerial vehicle is currently at a balance position or a local minimum point, and jump-out processing is performed; if not, continuing with step S63;
s67: solving the straight-line expression between the last two points stored in A;
s68: judging whether the straight line intersects any circular no-fly zone; if not, returning to step S63; otherwise, assigning the position last stored in A to G1, emptying A, and then performing step S63;
s69: storing all the positions in A into B and judging whether G1 equals G2; if not, setting the current position to G1 and G1 = G2, and then proceeding to step S63;
s610: the position points stored in B constitute the obstacle avoidance track of the unmanned aerial vehicle.
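The iterate-move-check loop of S61 to S610 can be condensed into a sketch. The callables `move_dir`, `reached`, and `stagnated` are assumptions standing in for the claim's sub-steps (the claim-3 resultant force, the S65 distance test, and the S66 stagnation test), and the intermediate-goal bookkeeping with G1/G2 is omitted:

```python
import math

def plan_path(start, step, move_dir, reached, stagnated, max_iters=10000):
    """Condensed sketch of steps S61-S610: repeatedly step the UAV along
    the direction returned by move_dir, record every visited point in
    storage A (S62/S63), stop when `reached` reports the goal is close
    enough (S65), and call `stagnated` on the stored points to detect a
    balance position or local minimum needing jump-out handling (S66)."""
    A = [start]                         # storage space A of S62
    x = start
    for _ in range(max_iters):
        dx, dy = move_dir(x)
        norm = math.hypot(dx, dy) or 1.0
        x = (x[0] + step * dx / norm, x[1] + step * dy / norm)   # S63
        A.append(x)
        if reached(x):                  # S65: distance below threshold
            return A
        if stagnated(A):                # S66: last M points in a small circle
            break                       # jump-out processing would go here
    return A

# Straight run toward the goal with no obstacles:
goal = (10.0, 0.0)
path = plan_path((0.0, 0.0), step=1.0,
                 move_dir=lambda x: (goal[0] - x[0], goal[1] - x[1]),
                 reached=lambda x: math.hypot(x[0] - goal[0], x[1] - goal[1]) < 0.5,
                 stagnated=lambda A: False)
```

With these inputs the path is eleven points from (0, 0) to (10, 0), one per unit step.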
6. The unmanned aerial vehicle dynamic track planning method based on the reinforcement learning difference algorithm as claimed in claim 1, wherein the establishing of the track planning model in S2 further comprises the following steps:
s71: acquiring image data of a target area including surface topography data and plant data;
s72: obtaining an initial route of the unmanned aerial vehicle based on the image data of the target area;
s73: extracting a first actual geographic coordinate of the initial route based on the inflection point position on the initial route, and adjusting the first actual geographic coordinate based on the elevation value of the terrain data of the earth surface to obtain a first elevation coordinate;
s74: adjusting the initial route based on the first elevation coordinate to obtain a terrain route;
s75: dividing the initial route into sections at a preset distance, and extracting a second actual geographic coordinate of an endpoint of each section point by point;
s76: adjusting a second actual geographic coordinate based on the crop planting data to obtain a second elevation coordinate, and adjusting the initial air route based on the second elevation coordinate to obtain a crop planting air route;
s77: and establishing a flight path planning model based on the terrain route and the plant crop route.
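The elevation adjustment of S73 and S76 can be sketched as lifting each route point by the local terrain and crop heights. The safety clearance and the two lookup functions below are assumptions; the claim only specifies adjusting coordinates by terrain and crop-planting elevation values:

```python
def adjust_route(waypoints, terrain_elev, crop_height, clearance=5.0):
    """Sketch of S73/S76: assign each 2-D route point an altitude equal to
    the terrain elevation plus the crop height plus a safety clearance.
    terrain_elev and crop_height are assumed lookup functions built from
    the image data of S71."""
    route = []
    for x, y in waypoints:
        z = terrain_elev(x, y) + crop_height(x, y) + clearance
        route.append((x, y, z))
    return route

# Two waypoints over a gentle slope with a uniform 2 m crop canopy:
route = adjust_route([(0, 0), (100, 0)],
                     terrain_elev=lambda x, y: 0.01 * x,
                     crop_height=lambda x, y: 2.0)
```

The terrain route (S74) and crop planting route (S76) would each come from one such pass, and the track planning model of S77 combines the two.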
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211195962.8A CN115290096B (en) | 2022-09-29 | 2022-09-29 | Unmanned aerial vehicle dynamic track planning method based on reinforcement learning difference algorithm |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115290096A CN115290096A (en) | 2022-11-04 |
CN115290096B true CN115290096B (en) | 2022-12-20 |
Family
ID=83834641
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211195962.8A Active CN115290096B (en) | 2022-09-29 | 2022-09-29 | Unmanned aerial vehicle dynamic track planning method based on reinforcement learning difference algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115290096B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116400722B (en) * | 2023-05-10 | 2024-07-09 | 江苏方天电力技术有限公司 | Unmanned aerial vehicle obstacle avoidance flight method and related device |
CN116540723B (en) * | 2023-05-30 | 2024-04-12 | 南通大学 | Underwater robot sliding mode track tracking control method based on artificial potential field |
CN116412831B (en) * | 2023-06-12 | 2023-09-19 | 中国电子科技集团公司信息科学研究院 | Multi-unmanned aerial vehicle dynamic obstacle avoidance route planning method for recall and anti-dive |
CN117668497B (en) * | 2024-01-31 | 2024-05-07 | 山西卓昇环保科技有限公司 | Carbon emission analysis method and system based on deep learning under environment protection |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8626565B2 (en) * | 2008-06-30 | 2014-01-07 | Autonomous Solutions, Inc. | Vehicle dispatching method and system |
CN109540163B (en) * | 2018-11-20 | 2022-06-07 | 太原科技大学 | Obstacle avoidance path planning algorithm based on combination of differential evolution and fuzzy control |
CN110673637B (en) * | 2019-10-08 | 2022-05-13 | 福建工程学院 | Unmanned aerial vehicle pseudo path planning method based on deep reinforcement learning |
US20210123741A1 (en) * | 2019-10-29 | 2021-04-29 | Loon Llc | Systems and Methods for Navigating Aerial Vehicles Using Deep Reinforcement Learning |
CN112286203B (en) * | 2020-11-11 | 2021-10-15 | 大连理工大学 | Multi-agent reinforcement learning path planning method based on ant colony algorithm |
CN112712193B (en) * | 2020-12-02 | 2024-08-02 | 南京航空航天大学 | Multi-unmanned aerial vehicle local route planning method and device based on improved Q-Learning |
US20220214692A1 (en) * | 2021-01-05 | 2022-07-07 | Ford Global Technologies, Llc | VIsion-Based Robot Navigation By Coupling Deep Reinforcement Learning And A Path Planning Algorithm |
CN113255890A (en) * | 2021-05-27 | 2021-08-13 | 中国人民解放军军事科学院评估论证研究中心 | Reinforced learning intelligent agent training method based on PPO algorithm |
CN113359744B (en) * | 2021-06-21 | 2022-03-01 | 暨南大学 | Robot obstacle avoidance system based on safety reinforcement learning and visual sensor |
CN113534838A (en) * | 2021-07-15 | 2021-10-22 | 西北工业大学 | Improved unmanned aerial vehicle track planning method based on artificial potential field method |
CN114048689B (en) * | 2022-01-13 | 2022-04-15 | 南京信息工程大学 | Multi-unmanned aerial vehicle aerial charging and task scheduling method based on deep reinforcement learning |
- 2022-09-29 CN CN202211195962.8A patent/CN115290096B/en active Active
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN115290096B (en) | Unmanned aerial vehicle dynamic track planning method based on reinforcement learning difference algorithm | |
CN109614985B (en) | Target detection method based on densely connected feature pyramid network | |
CN110544296B (en) | Intelligent planning method for three-dimensional global track of unmanned aerial vehicle in uncertain enemy threat environment | |
CN110806756B (en) | Unmanned aerial vehicle autonomous guidance control method based on DDPG | |
CN110181508B (en) | Three-dimensional route planning method and system for underwater robot | |
CN108319286A (en) | A kind of unmanned plane Air Combat Maneuvering Decision Method based on intensified learning | |
CN106970615A (en) | A kind of real-time online paths planning method of deeply study | |
CN110531786B (en) | Unmanned aerial vehicle maneuvering strategy autonomous generation method based on DQN | |
CN109492556A (en) | Synthetic aperture radar target identification method towards the study of small sample residual error | |
CN112462803B (en) | Unmanned aerial vehicle path planning method based on improved NSGA-II | |
CN116954233A (en) | Automatic matching method for inspection task and route | |
CN115357031B (en) | Ship path planning method and system based on improved ant colony algorithm | |
CN117556979B (en) | Unmanned plane platform and load integrated design method based on group intelligent search | |
CN115060263A (en) | Flight path planning method considering low-altitude wind and energy consumption of unmanned aerial vehicle | |
CN117214904A (en) | Intelligent fish identification monitoring method and system based on multi-sensor data | |
CN110147816A (en) | A kind of acquisition methods of color depth image, equipment, computer storage medium | |
CN115933693A (en) | Robot path planning method based on adaptive chaotic particle swarm algorithm | |
Short et al. | A bio-inspired algorithm in image-based path planning and localization using visual features and maps | |
CN117948976B (en) | Unmanned platform navigation method based on graph sampling and aggregation | |
CN117724524A (en) | Unmanned aerial vehicle route planning method based on improved spherical vector particle swarm algorithm | |
CN114972429B (en) | Target tracking method and system for cloud edge cooperative self-adaptive reasoning path planning | |
CN116795098A (en) | Spherical amphibious robot path planning method based on improved sparrow search algorithm | |
CN113589810B (en) | Dynamic autonomous obstacle avoidance movement method and device for intelligent body, server and storage medium | |
CN115454096A (en) | Robot strategy training system and training method based on curriculum reinforcement learning | |
CN113034598A (en) | Unmanned aerial vehicle power line patrol method based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||