CN115574826A - National park unmanned aerial vehicle patrol path optimization method based on reinforcement learning - Google Patents
- Publication number
- CN115574826A (publication) / CN202211572414.2A (application)
- Authority
- CN
- China
- Prior art keywords
- path
- unmanned aerial
- aerial vehicle
- energy consumption
- task
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C21/00—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
- G01C21/20—Instruments for performing navigational calculations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
- G06Q10/047—Optimisation of routes or paths, e.g. travelling salesman problem
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a reinforcement-learning-based patrol path optimization method for national park unmanned aerial vehicles (UAVs). The method takes the UAV flight path as the optimization target, adds constraints on UAV waypoint traversal, UAV battery capacity, and the energy consumed by the task at each waypoint, and establishes a UAV path-planning model with a self-service charging function. The UAV, waypoints, charging base station, energy, battery capacity, flight-path energy consumption, and waypoint task energy consumption of this model are then mapped onto the corresponding elements of a capacitated vehicle routing problem (CVRP) model. Using a feed-forward weighting method, the UAV patrol path-planning problem, which originally had to consider both edge (flight) and point (task) energy-consumption constraints, is reduced to a CVRP that takes path length as the optimization target and customer demand and vehicle load as constraints. Finally, the reduced CVRP is solved with a multi-decoder attention model.
Description
Technical Field
The invention belongs to the technical field of computer intelligent calculation and unmanned aerial vehicle flight control, and particularly relates to a national park unmanned aerial vehicle patrol route optimization method based on reinforcement learning.
Background
Field patrol monitoring is the most important means of ecological monitoring and daily supervision in national parks and nature reserves. Through patrol monitoring, rangers collect data on wild species populations, habitats, phenology and other aspects; they can discover ecological problems in time, deter illegal activities, and so on, achieving effective protection of national parks and nature reserves and providing a decision basis for natural resource supervision. However, national parks and nature reserves cover large areas with wide ranges and complex terrain; most regions are difficult for people and vehicles to reach, and the traditional manual patrol mode is inefficient, time-consuming and laborious. Therefore, in recent years, unmanned aerial vehicles have been increasingly used for the patrol monitoring of nature reserves of all kinds.
UAV technology is a remote-sensing technology realized by fusing aircraft technology, communication technology, GPS (global positioning system), differential positioning technology and imaging technology; by carrying sensing equipment such as high-definition cameras and intelligent sensors, combined with a wireless communication network, it achieves automatic acquisition and transmission of monitoring data. Existing UAVs used for patrol monitoring of national parks and nature reserves face challenges such as short endurance, high demands on flight-control personnel, difficult storage and transportation of the aircraft, and high application-integration difficulty, and can hardly meet the requirements of normalized monitoring.
The automatic UAV airport is a ground automation facility that assists the UAV through its full operating workflow and provides all-weather protection. Through structural designs for automatic opening and closing, lifting, and pick-up and unloading, the UAV's take-off, landing, storage and battery management are all completed automatically, without human intervention. The UAV is stored in the automatic airport; when a flight demand arises, it takes off from the airport autonomously, lands back in the automatic airport after the task is finished, and recharges there in preparation for the next task, realizing fully automatic operation.
To realize the normalized use of UAVs in the ecological monitoring of national parks and nature reserves and to meet the demands of field patrol monitoring and management, this patent performs path planning, battery state control, and command and dispatch for the UAV based on the automatic UAV airport, greatly improving the efficiency of UAV patrol monitoring.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a national park unmanned aerial vehicle patrol route optimization method based on reinforcement learning.
The invention is realized by the following technical scheme:
a national park unmanned aerial vehicle patrol path optimization method based on reinforcement learning comprises the following steps:
step 1: inputting three-dimensional terrain data to generate a bounded three-dimensional region; according to the performance of the UAV's onboard camera and the patrol requirements, setting a waypoint set in the air above the region; the UAV is required to complete the visual coverage task after traversing all waypoints;
step 2: taking the UAV flight path as the optimization target, adding constraints on UAV waypoint traversal, UAV battery capacity, and waypoint task energy consumption, and establishing a UAV path-planning model with a self-service charging function;
step 3: mapping the UAV, waypoints, charging base station, energy, battery capacity, flight path energy consumption, and waypoint task energy consumption in the established UAV path-planning model with self-service charging onto, respectively, the vehicle, customers, warehouse, goods, maximum vehicle cargo capacity, path length, and customer demand in the CVRP model; defining a new waypoint task energy consumption by a feed-forward weighting method, so that it comprises both the task energy of a waypoint and the average edge energy required to reach it; mapping the new waypoint task energy consumption onto the customer demand of the CVRP model, thereby reducing the UAV patrol path-planning problem to a CVRP with path length as the optimization target and customer demand and vehicle load as constraints;
step 4: solving the CVRP reduced in step 3 using a multi-decoder attention model.
In the above technical solution, in step 2, an unmanned aerial vehicle path planning model with a self-service charging function is established, and the specific steps are as follows:
step 2.1: defining flight path decision variables x_ij for the UAV;
x_ij = 1 means the UAV flies from waypoint i to waypoint j;
x_ij = 0 means the UAV does not fly from waypoint i to waypoint j;
Defining an objective function:

$\min f = \sum_{i}\sum_{j} c_{ij}\, x_{ij} \qquad (1)$

where $c_{ij}$ is the flight path energy consumption, representing the energy the UAV needs between waypoint $i$ and waypoint $j$;

the flight path decision variables must form a complete and feasible one-time traversal path, under the constraints:

$\sum_{i,\, i \neq j} x_{ij} = 1 \;\; \forall j \qquad (2)$

$\sum_{j,\, j \neq i} x_{ij} = 1 \;\; \forall i \qquad (3)$

$x_{ij} \in \{0, 1\} \qquad (4)$
step 2.2: aiming at the self-service charging function of the UAV, the route planning is adjusted to include the charging base station; the energy consumption of the UAV is measured along the flight path, the maximum endurance of the UAV is recorded as $Q$, the residual-energy variable is defined as $y_i$, and the charging base station, the departure point of the UAV, is recorded as $v_0$;
The remaining range of the UAV during performance of the mission must not exceed the maximum endurance $Q$, expressed as follows:

$0 \le y_j = \sum_{i \neq j} x_{ij}\,(y_i - c_{ij}) - d_j \le Q \qquad (5)$

where $d_j$ is the task energy consumption of waypoint $j$, representing the energy the UAV needs to complete the patrol task at waypoint $j$; $x_{ij}$ represents the decision variable of the edge from another waypoint $i$ to waypoint $j$; and $y_j$ represents the residual energy after the UAV flies from waypoint $i$ to waypoint $j$ and performs its mission;
When the UAV leaves the charging base station, its battery is full, formulated as follows:

$y_j = Q - c_{v_0 j} - d_j \quad \text{whenever } x_{v_0 j} = 1 \qquad (6)$

where $y_j$ indicates the residual energy when the UAV reaches waypoint $j$ after leaving the charging base station, $x_{v_0 j}$ indicates the decision variable of the UAV flying from the charging base station to waypoint $j$, and $d_j$ is the task energy consumption of waypoint $j$, i.e., the energy the UAV needs to complete the patrol task there.
In the above technical scheme, in step 3, first, without considering the edge energy-consumption constraint between waypoints, a deep reinforcement learning method is used to independently solve the CVRP corresponding to the UAV patrol path multiple times; the number of solves is recorded as $K$. For each solve, the neural network in the deep reinforcement learning model is retrained and then used to predict the CVRP corresponding to the original UAV patrol problem, so that the $K$ solves yield $K$ different solutions, which form a solution set $S$ containing $K$ patrol path schemes;

the new waypoint task energy consumption is then defined as

$d'_j = d_j + \frac{1}{K} \sum_{i} n_{ij}\, c_{ij}$

where $n_{ij}$ represents the number of occurrences of the edge from waypoint $i$ to waypoint $j$ in the solution set $S$; the added term is a weighted average, with weights $n_{ij}/K$, of the edge energy consumption required to reach the waypoint, taken over the solution set $S$ obtained by optimizing the total patrol-task path length.
In the above technical solution, the solving process of step 4 includes the following steps:
step 4.1: first, according to the scale of the input information, several datasets with the same number of waypoints are generated; each dataset comprises a randomly generated starting point, randomly generated waypoint positions, and randomly generated waypoint task energy consumption;
Step 4.2: using generatedTraining the multi-decoder attention model in a block data set, where the parameters of the encoder and decoder areThe model is trained by a strategy gradient algorithm with baseline, and parameters of the optimized model are continuously updated circularly to obtain a trained attention model of the multi-decoder;
step 4.3: after training of the model parameters is finished, the data of the original UAV mission-planning problem are input into the trained model as a reduced CVRP instance, and the model's output sequence is taken as the waypoint-visiting scheme of the UAV patrol problem.
In the above technical solution, in step 4.3, the data of the original UAV mission-planning problem comprise the starting point, the waypoints, and the task energy consumption of each waypoint, where the waypoint task energy consumption refers to the new waypoint task energy consumption defined in step 3.
The invention has the advantages and beneficial effects that:
the base station is introduced to provide real-time charging service for the working unmanned aerial vehicle, and the unmanned aerial vehicle can access the base station to perform charging for multiple times when executing tasks. Under the system, a constraint formula is constructed by taking the optimized unmanned aerial vehicle task path length as a target, a multi-unmanned aerial vehicle path planning model is established, and the problem is converted into a combined optimization problem. A known combined optimization solver is utilized, a feedforward weighting method is designed to calculate the path energy consumption constraint, and the problem is further converted into a vehicle path problem (CVRP) with capacity limitation. In addition, the deep reinforcement learning method based on the multi-decoder attention model can stably output a high-quality solution of a visual coverage problem for a specific scene, has generalization capability for solving the reduced unmanned aerial vehicle path planning problem, has strong adaptability to a training data set, and can guarantee an efficient training network for path planning under different scenes to obtain the high-quality solution. Based on a trained learning model, the result can be quickly obtained by only calling neural network prediction after the unmanned aerial vehicle path problem example is reduced, the solving speed is higher than the efficiency of the traditional search algorithm, and the decision requirement of the unmanned aerial vehicle quick scheduling planning can be met.
Drawings
FIG. 1 is a flow chart of the national park unmanned aerial vehicle patrol route optimization method based on reinforcement learning.
FIG. 2 is a flow chart of a solution of a multi-decoder attention model to an example problem.
For a person skilled in the art, other relevant figures can be obtained from the above figures without inventive effort.
Detailed Description
In order to make the technical solution of the present invention better understood, the technical solution of the present invention is further described below with reference to specific examples.
A national park unmanned aerial vehicle patrol path optimization method based on reinforcement learning is disclosed, referring to the attached figure 1, and comprises the following steps:
step 1: inputting three-dimensional terrain data to generate a bounded three-dimensional region; according to the performance of the UAV's onboard camera and the patrol requirements, a waypoint set is placed in the air above the region, giving the initial data, and the UAV is required to complete the visual coverage task after traversing all waypoints.
Step 2: and establishing a constraint formula, taking the flight path of the unmanned aerial vehicle as an optimization target, adding constraint conditions of traversal path points of the unmanned aerial vehicle, electric quantity limitation of the unmanned aerial vehicle and energy consumption of task execution of the path points, and establishing an unmanned aerial vehicle path planning model with a self-service charging function without considering uncontrollable factors such as wind power, visibility and unmanned aerial vehicle faults. The method comprises the following specific steps.
Step 2.1: defining flight path decision variables for an unmanned aerial vehiclex ij ;
x ij =1, representing unmanned aerial vehicle from a waypointiFly to the waypointj;
x ij =0, meaning that the drone is not following a waypointiFly to the waypointj;
Defining an objective function:

$\min f = \sum_{i}\sum_{j} c_{ij}\, x_{ij} \qquad (1)$

where $c_{ij}$ is the flight path energy consumption, representing the energy expended between waypoint $i$ and waypoint $j$, which is proportional to the distance between the two waypoints; the aim of the task is to optimize the UAV flight path so that it is minimized on the premise of completing the task objective. Meanwhile, the flight path decision variables must guarantee that a complete and feasible one-time traversal path can be formed, with the specific constraints:

$\sum_{i,\, i \neq j} x_{ij} = 1 \;\; \forall j \qquad (2)$

$\sum_{j,\, j \neq i} x_{ij} = 1 \;\; \forall i \qquad (3)$

$x_{ij} \in \{0, 1\} \qquad (4)$
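As a minimal numerical sketch (not part of the patent; taking the edge energy $c_{ij}$ as Euclidean distance is an assumption), the objective (1) and the traversal constraints can be checked for a candidate tour:

```python
import math

def tour_energy(points, tour):
    """Objective (1): total flight-path energy of a closed tour, assuming
    the edge energy c_ij is the Euclidean distance between waypoints."""
    total = 0.0
    for a, b in zip(tour, tour[1:] + tour[:1]):  # close the loop
        (x1, y1), (x2, y2) = points[a], points[b]
        total += math.hypot(x2 - x1, y2 - y1)
    return total

def is_valid_traversal(n, tour):
    """Traversal constraints on x_ij: every waypoint is entered and left
    exactly once, i.e. the tour is a single permutation of all n waypoints."""
    return len(tour) == n and sorted(tour) == list(range(n))

pts = {0: (0, 0), 1: (3, 0), 2: (3, 4)}
print(is_valid_traversal(3, [0, 1, 2]))  # True
print(tour_energy(pts, [0, 1, 2]))       # 3 + 4 + 5 = 12.0
```

Any tour that fails the traversal check violates the one-time-traversal constraints regardless of how little energy it uses.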
Step 2.2: aiming at the self-service charging function of the unmanned aerial vehicle, the route planning with the charging base station is adjusted, the energy consumption of the unmanned aerial vehicle is measured according to the flight path, and the maximum endurance of the unmanned aerial vehicle is recorded asQDefining the energy loss variableCharging base station is the departure point of the unmanned aerial vehicle and is recorded。
First, the UAV consumes energy as it moves between waypoints, and the remaining range of the UAV during the mission must not exceed the maximum endurance $Q$, as given by the following formula:

$0 \le y_j = \sum_{i \neq j} x_{ij}\,(y_i - c_{ij}) - d_j \le Q \qquad (5)$

where $d_j$ is the task energy consumption of waypoint $j$, representing the energy the UAV needs to complete the patrol task at waypoint $j$; $x_{ij}$ represents the decision variable of the edge from another waypoint $i$ to waypoint $j$; and $y_j$ represents the residual energy (i.e., remaining charge) after the UAV flies from waypoint $i$ to waypoint $j$ and performs its mission.
Second, when the UAV leaves the charging base station, its battery is full, formulated as follows:

$y_j = Q - c_{v_0 j} - d_j \quad \text{whenever } x_{v_0 j} = 1 \qquad (6)$

where $y_j$ indicates the residual energy when the UAV reaches waypoint $j$ after leaving the charging base station, $x_{v_0 j}$ indicates the decision variable of the UAV flying from the charging base station $v_0$ to waypoint $j$, and $d_j$ is the task energy consumption of waypoint $j$, i.e., the energy the UAV needs to complete the patrol task there.
In conclusion, a UAV path-planning model with a self-service charging function is established, consisting of the objective function (1) and the constraint formulas (2), (3), (4), (5) and (6). Solving this model is a combinatorial optimization problem; that is, the UAV patrol path-planning problem has been transformed into a combinatorial optimization problem.
Step 3: Referring to Table 1, the UAV, waypoints, charging base station, energy (i.e., electric quantity), battery capacity, flight path energy consumption, and waypoint task energy consumption in the UAV path-planning model with self-service charging established above are mapped, respectively, onto the vehicle, customers, warehouse, goods, maximum vehicle cargo capacity, path length, and customer demand of the CVRP (capacitated vehicle routing problem) model; the UAV path-planning model is thereby converted into a capacitated vehicle routing problem.
Table 1: correspondence between unmanned aerial vehicle path planning and CVRP problem model
The energy consumption of the UAV comprises the edge energy consumed flying from waypoint to waypoint and the point energy required at each waypoint to complete the patrol task. In the CVRP model, however, edge cost serves only as the optimization target of the vehicle path, and only point cost acts as a constraint on the route. The invention therefore uses a feed-forward weighting method to let point energy consumption stand in for "point plus edge energy consumption", thereby folding the edge energy into the constraint condition; the UAV patrol path-planning problem, which originally had to consider both edge and point energy-consumption constraints, is thus reduced to a CVRP with path length as the optimization target and customer demand and vehicle load as constraints. The specific treatment is as follows.
First, without considering the edge energy-consumption constraint, a deep reinforcement learning method is used to independently solve the CVRP corresponding to the UAV patrol path multiple times; the number of solves is recorded as $K$. For each solve, the neural network in the deep reinforcement learning model is retrained (independently), and each trained network is used to predict the CVRP corresponding to the original UAV patrol problem. Because the generation and sampling of the training set are random, the $K$ trained networks differ from one another, and so do their predictions; $K$ different solutions are thus obtained, forming a solution set $S$ that contains $K$ patrol path schemes.
Based on the known solution set $S$, a new waypoint task energy consumption $d'_j$ (i.e., the energy waypoint $j$ requires to complete the patrol task) is redefined:

$d'_j = d_j + \frac{1}{K} \sum_{i} n_{ij}\, c_{ij}$

where $n_{ij}$ represents the number of occurrences of the edge from waypoint $i$ to waypoint $j$ in the solution set $S$; the added term is a weighted average, with weights $n_{ij}/K$, of the edge energy consumption required to reach the waypoint, taken over the solution set $S$ obtained by optimizing the total patrol-task path length.
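A minimal sketch of the feed-forward weighting (illustrative code, not from the patent): count how often each edge (i, j) occurs across the K solutions of the solution set, then add the frequency-weighted average incoming-edge energy to each waypoint's task energy:

```python
from collections import Counter

def feedforward_demand(solutions, edge_cost, task_cost):
    """d'_j = d_j + (1/K) * sum_i n_ij * c_ij, where n_ij counts how often
    the directed edge (i, j) occurs in the K solutions of the set S."""
    K = len(solutions)
    n = Counter()
    for tour in solutions:                       # each tour is a closed route
        for i, j in zip(tour, tour[1:]):
            n[(i, j)] += 1
    new_demand = {}
    for j, d in task_cost.items():
        incoming = sum(cnt * edge_cost[(i, jj)]
                       for (i, jj), cnt in n.items() if jj == j)
        new_demand[j] = d + incoming / K
    return new_demand

edges = {(0, 1): 3.0, (1, 2): 4.0, (2, 0): 5.0,
         (0, 2): 5.0, (2, 1): 4.0, (1, 0): 3.0}
tasks = {1: 1.0, 2: 2.0}
sols = [[0, 1, 2, 0], [0, 2, 1, 0]]              # K = 2 sample solutions
print(feedforward_demand(sols, edges, tasks))    # {1: 4.5, 2: 6.5}
```

Since every solution enters each waypoint through exactly one edge, the incoming counts for a waypoint sum to K, so the added term is exactly the average energy of the edges used to reach it.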
The new waypoint task energy consumption $d'_j$ obtained in this way is mapped onto the customer demand of the CVRP model, so that the new waypoint task energy consumption comprises both the task energy of the waypoint and the average edge energy required to reach it; the patrol path problem, which originally had to consider both edge and point energy-consumption constraints, is thereby reduced to a CVRP with path length as the optimization target and customer demand and vehicle load as constraints.
And 4, step 4: the CVRP problem reduced in step 3 is solved using a multi-decoder attention model.
The data of the UAV path-planning problem comprise the starting point, the waypoints, and the task energy consumption of each waypoint (here, the new waypoint task energy consumption defined in step 3); following step 3, these are reduced to the warehouse, customer demand, and related information of the CVRP and used as the input of the model. The encoder of the model is based on the transformer architecture; the decoder part uses several decoders with identical structure but independent parameters. The degree to which the decoders construct different solutions is measured by the Kullback-Leibler divergence ("KL divergence") between the probability distributions computed by the different decoders; in addition, each decoder masks nodes when computing attention weights, which enforces the task path constraints of the CVRP. The model is trained by a policy-gradient algorithm with baseline on a number of randomly generated datasets of the same scale as the problem to be solved. Referring to FIG. 2, the specific solving process is as follows.
Step 4.1: First, according to the scale of the input information, several datasets with the same number of waypoints are generated. Each dataset comprises a randomly generated starting point, randomly generated waypoint positions, and randomly generated waypoint task energy consumption.
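Step 4.1 can be sketched as follows (a generic instance generator; the unit square, demand range and field names are assumptions, not the patent's specification):

```python
import numpy as np

def make_dataset(num_instances, n_waypoints, demand_range=(1, 10), seed=0):
    """Generate CVRP-style training instances: a random depot (starting
    point), random waypoint positions in the unit square, and random
    waypoint task energy consumptions."""
    rng = np.random.default_rng(seed)
    data = []
    for _ in range(num_instances):
        depot = rng.random(2)                          # starting point
        waypoints = rng.random((n_waypoints, 2))       # waypoint positions
        demands = rng.integers(demand_range[0], demand_range[1] + 1,
                               size=n_waypoints)       # task energy per point
        data.append({'depot': depot, 'waypoints': waypoints,
                     'demands': demands})
    return data

batch = make_dataset(num_instances=4, n_waypoints=20)
print(len(batch), batch[0]['waypoints'].shape)  # 4 (20, 2)
```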
Step 4.2: using generatedTraining the multi-decoder attention model in a block data set, where the parameters of the encoder and decoder areThe model is trained by a policy gradient algorithm with baseline, model parameters are continuously updated and optimized in a circulating mode, the training target is the model parameters for optimizing the shortest path length of a client access scheme and KL divergence of decoder parameters, and the model parameters are recordedThe total length of the task path is obtained for the solution under the model parameters, and is recordedAnd (4) carrying out parameter training for the KL divergence of the decoder parameters under the model parameters according to the following algorithm to obtain the trained attention model of the multi-decoder.
The reinforcement learning algorithm with baseline proceeds as follows: according to the current group of datasets and the parameters $\theta$, the task path length of the model's output and the KL divergence between the decoders are computed as the combined optimization objective, and the model parameters are updated along the resulting optimization direction;
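In outline, a policy-gradient (REINFORCE) update with a baseline looks like the following sketch (framework-agnostic, with the batch mean as a stand-in baseline and a toy three-tour softmax policy; none of this is the patent's code). Subtracting the baseline from the sampled tour cost leaves only the advantage to drive the gradient:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def reinforce_step(theta, sample, logprob_grad, rng, batch=64, lr=0.05):
    """One REINFORCE-with-baseline update. sample(theta, rng) draws an
    action and returns (cost, action); logprob_grad(theta, a) is the
    gradient of log pi(a | theta). The batch-mean cost is the baseline."""
    draws = [sample(theta, rng) for _ in range(batch)]
    costs = np.array([c for c, _ in draws])
    baseline = costs.mean()                     # baseline b
    grad = np.zeros_like(theta)
    for c, a in draws:
        grad += (c - baseline) * logprob_grad(theta, a)
    return theta - lr * grad / batch            # descend: lower cost is better

# Toy policy: a softmax over three candidate tours with fixed costs.
tour_costs = np.array([5.0, 2.0, 8.0])

def sample(theta, rng):
    p = softmax(theta)
    a = rng.choice(len(p), p=p)
    return tour_costs[a], a

def logprob_grad(theta, a):
    g = -softmax(theta)
    g[a] += 1.0                                 # gradient of log softmax_a
    return g

theta = np.zeros(3)
rng = np.random.default_rng(0)
for _ in range(200):
    theta = reinforce_step(theta, sample, logprob_grad, rng)
print(int(np.argmax(softmax(theta))))           # → 1, the cheapest tour
```

After a few hundred updates, the policy concentrates on the lowest-cost tour, which is the behavior the training loop above relies on at the scale of full CVRP instances.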
Step 4.3: After training of the model parameters is finished, the data of the original UAV mission-planning problem (the starting point, the waypoints, and the task energy consumption of each waypoint) are input into the trained model as a reduced CVRP instance, and the model's output sequence is taken as the waypoint-visiting scheme of the UAV patrol problem.
The invention having been thus described by way of example, it should be understood that any simple alterations, modifications or other equivalent substitutions that a person skilled in the art could make without inventive effort fall within the scope of protection of the invention.
Claims (5)
1. A national park Unmanned Aerial Vehicle (UAV) patrol path optimization method based on reinforcement learning is characterized by comprising the following steps of:
step 1: inputting three-dimensional terrain data to generate a bounded three-dimensional region; according to the performance of the UAV's onboard camera and the patrol requirements, setting a waypoint set in the air above the region; the UAV is required to complete the visual coverage task after traversing all waypoints;
step 2: taking the flight path of the unmanned aerial vehicle as an optimization target, adding constraint conditions of traversal path points of the unmanned aerial vehicle, electric quantity limitation of the unmanned aerial vehicle and energy consumption of task execution of the path points, and establishing an unmanned aerial vehicle path planning model with a self-service charging function;
step 3: mapping the UAV, waypoints, charging base station, energy, battery capacity, flight path energy consumption, and waypoint task energy consumption in the established UAV path-planning model with self-service charging onto, respectively, the vehicle, customers, warehouse, goods, maximum vehicle cargo capacity, path length, and customer demand in the CVRP model; defining a new waypoint task energy consumption by a feed-forward weighting method, so that it comprises both the task energy of a waypoint and the average edge energy required to reach it; mapping the new waypoint task energy consumption onto the customer demand of the CVRP model, thereby reducing the UAV patrol path-planning problem to a CVRP with path length as the optimization target and customer demand and vehicle load as constraints;
step 4: solving the CVRP reduced in step 3 using a multi-decoder attention model.
2. The reinforcement learning-based national park unmanned aerial vehicle patrol route optimization method according to claim 1, wherein: in step 2, an unmanned aerial vehicle path planning model with a self-service charging function is established, and the method specifically comprises the following steps:
Step 2.1: defining a flight path decision variable x_ij for the unmanned aerial vehicle:
x_ij = 1 indicates that the unmanned aerial vehicle flies from path point i to path point j;
x_ij = 0 indicates that the unmanned aerial vehicle does not fly from path point i to path point j.
Defining the objective function:

min Σ_i Σ_j e_ij · x_ij

where e_ij is the flight path energy consumption, representing the energy required for the unmanned aerial vehicle to fly from path point i to path point j;
The flight path decision variables must form a complete and feasible single-traversal path, with the constraint that every path point is entered exactly once and left exactly once:

Σ_i x_ij = 1 for each path point j;  Σ_j x_ij = 1 for each path point i.
Step 2.2: to model the self-service charging function of the unmanned aerial vehicle, the route planning is adjusted to include a charging base station. The energy consumption of the unmanned aerial vehicle is measured along the flight path; the maximum endurance (battery capacity) of the unmanned aerial vehicle is recorded as Q; a remaining-energy variable u_j is defined; the charging base station is the starting point of the unmanned aerial vehicle and is recorded as v_0.
The remaining energy of the unmanned aerial vehicle during execution of the mission never exceeds the maximum capacity Q, and is updated along the path as follows:

u_j = u_i − e_ij − d_j whenever x_ij = 1, with 0 ≤ u_j ≤ Q

where d_j is the path point task energy consumption, representing the energy required for the unmanned aerial vehicle to complete the patrol task at path point j; x_ij is the decision variable of the edge from another path point i to path point j; and u_j is the residual energy after the unmanned aerial vehicle flies from path point i to path point j and performs the task there;
When the unmanned aerial vehicle leaves the charging base station, its battery is full, which gives:

u_j = Q − e_0j − d_j whenever x_0j = 1

where u_j denotes the residual energy after the unmanned aerial vehicle leaves the charging base station and reaches path point j, x_0j denotes the decision variable of flying from the charging base station v_0 to path point j, and d_j is the path point task energy consumption, representing the energy required for the unmanned aerial vehicle to complete the patrol task at path point j.
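The remaining-energy recurrence of step 2.2 can be sketched as a feasibility check for a single patrol route. This is an illustrative sketch under assumed conventions (Euclidean distance scaled by an assumed `energy_per_dist` factor, a closing leg back to the base); `route_feasible` and its parameter names are not from the patent.

```python
import math

def route_feasible(route, base, coords, task_energy, Q, energy_per_dist=1.0):
    """Check the remaining-energy recurrence of step 2.2 along one patrol route.

    route: path point indices visited in order. The UAV starts at the charging
    base station with a full battery Q; each leg costs e_ij = energy_per_dist *
    distance, plus the task energy d_j at the arrival path point. The residual
    energy u_j must stay non-negative, and the final leg back to the base
    station must still be covered.
    """
    u, prev = Q, base
    for j in route:
        e_ij = energy_per_dist * math.dist(prev, coords[j])
        u -= e_ij + task_energy[j]      # u_j = u_i - e_ij - d_j
        if u < 0:
            return False
        prev = coords[j]
    return u >= energy_per_dist * math.dist(prev, base)

# one path point at (3, 4): outbound leg costs 5, task costs 1, return leg costs 5
feasible = route_feasible([0], (0.0, 0.0), [(3.0, 4.0)], [1.0], Q=12.0)
```

With Q = 12 the route above is feasible (6 units remain before the 5-unit return leg); lowering Q to 10 makes it infeasible, which is exactly the situation where a route must be split at the charging base station.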
3. The reinforcement learning-based national park unmanned aerial vehicle patrol route optimization method according to claim 2, wherein: in step 3, first, without considering the edge energy consumption constraint between path points, the CVRP corresponding to the unmanned aerial vehicle patrol path is solved independently multiple times using a deep reinforcement learning method; the number of solving runs is recorded as K; in each run the neural network in the deep reinforcement learning model is retrained, and the network trained in each run is used to predict the CVRP corresponding to the original unmanned aerial vehicle patrol problem; the K runs yield K different solutions, which form a solution set S containing K patrol path schemes;
where w_ij denotes the number of occurrences of the edge from path point i to path point j in the solution set S. The new path point task energy consumption d'_j is the original task energy consumption d_j plus the weighted average of the edge energy consumption required to reach path point j, with weights w_ij:

d'_j = d_j + (Σ_i w_ij · e_ij) / (Σ_i w_ij)

The solution set S is obtained by optimizing with the total patrol path length as the reference objective.
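The feedforward weighting of step 3 can be sketched as follows; this is an illustrative computation only, with assumed names (`new_task_energy`, `solutions` as lists of directed edges) and the path points indexed 0 to n−1.

```python
from collections import defaultdict

def new_task_energy(solutions, edge_energy, task_energy, n):
    """Feedforward weighting (step 3): fold average incoming-edge energy
    into each path point's task energy.

    solutions: the solution set S — K patrol schemes, each a list of
    directed edges (i, j). w[(i, j)] counts how often edge (i, j) occurs
    across S; d'_j = d_j + (sum_i w_ij * e_ij) / (sum_i w_ij).
    """
    w = defaultdict(int)
    for plan in solutions:
        for (i, j) in plan:
            w[(i, j)] += 1
    d_new = []
    for j in range(n):
        num = sum(w[(i, j)] * edge_energy[i][j] for i in range(n) if i != j)
        den = sum(w[(i, j)] for i in range(n) if i != j)
        # a path point never reached in S keeps its original task energy
        d_new.append(task_energy[j] + (num / den if den else 0.0))
    return d_new

# two schemes, both using edge (0, 1) with energy 4: d'_1 = 2 + 4 = 6
d_new = new_task_energy([[(0, 1)], [(0, 1)]],
                        [[0.0, 4.0], [4.0, 0.0]], [1.0, 2.0], 2)
```

Folding the average incoming-edge energy into the demand is what lets the edge energy constraint be dropped from the CVRP while still being reflected, on average, in the capacity constraint.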
4. The reinforcement learning-based national park unmanned aerial vehicle patrol route optimization method according to claim 1, wherein the solving process of step 4 comprises the following steps:
Step 4.1: first, according to the scale of the input information, several groups of data sets with the same number of path points are generated; for B groups of data sets, the information in the b-th group comprises a randomly generated starting point v_0, randomly generated positions of the path points v_1, …, v_n, and randomly generated path point task energy consumption d_i, where i ∈ {1, …, n};
Step 4.2: training the multi-decoder attention model on the generated B groups of data sets, where the parameters of the encoder and the decoders are denoted θ; the model is trained by a policy gradient algorithm with a baseline, and the model parameters are updated in a continuous loop until a trained multi-decoder attention model is obtained;
Step 4.3: after the model parameters are trained, the data of the original unmanned aerial vehicle mission planning problem are input into the trained model as a reduced CVRP instance, and the output sequence of the model is taken as the path point visiting scheme of the unmanned aerial vehicle patrol problem.
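The "policy gradient with baseline" training of step 4.2 can be illustrated on a deliberately simplified stand-in: static per-node logits replace the multi-decoder attention network, and a greedy rollout of the same policy serves as the baseline. Everything here (`reinforce_step`, `greedy_tour`, the static-logit policy itself) is an assumption for illustration, not the patent's model.

```python
import math
import random

def tour_length(tour, coords):
    """Closed-tour length: the vehicle returns to its first node."""
    return sum(math.dist(coords[tour[k]], coords[tour[(k + 1) % len(tour)]])
               for k in range(len(tour)))

def greedy_tour(logits):
    """Baseline rollout: visit nodes in descending-logit order."""
    return sorted(range(len(logits)), key=lambda i: -logits[i])

def reinforce_step(logits, coords, lr=0.1):
    """One policy-gradient update with a greedy-rollout baseline (step 4.2).

    The policy samples the next unvisited node with probability proportional
    to exp(logits[i]); grad accumulates the exact gradient of log pi(tour)
    for this static-logit toy policy.
    """
    n = len(coords)
    remaining = list(range(n))
    tour, grad = [], [0.0] * n
    while remaining:
        ws = [math.exp(logits[i]) for i in remaining]
        total = sum(ws)
        idx = random.choices(range(len(remaining)), weights=ws)[0]
        pick = remaining[idx]
        for k, i in enumerate(remaining):      # d log pi / d logits[i]
            grad[i] += (1.0 if i == pick else 0.0) - ws[k] / total
        tour.append(pick)
        remaining.pop(idx)
    advantage = tour_length(tour, coords) - tour_length(greedy_tour(logits), coords)
    for i in range(n):    # minimizing length: descend along advantage * grad
        logits[i] -= lr * advantage * grad[i]
    return tour
```

The attention model in the actual method plays the role of `logits` here, producing state-dependent scores at every decoding step; the update rule (sampled cost minus baseline cost, times the log-probability gradient) is the same shape.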
5. The reinforcement learning-based national park unmanned aerial vehicle patrol route optimization method according to claim 4, wherein: in step 4.3, the data of the original unmanned aerial vehicle mission planning problem comprises the starting point v_0, the path points v_1, …, v_n, and the task energy consumption of each path point, where the path point task energy consumption refers to the new path point task energy consumption defined in step 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211572414.2A CN115574826B (en) | 2022-12-08 | 2022-12-08 | National park unmanned aerial vehicle patrol path optimization method based on reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115574826A true CN115574826A (en) | 2023-01-06 |
CN115574826B CN115574826B (en) | 2023-04-07 |
Family
ID=84590469
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211572414.2A Active CN115574826B (en) | 2022-12-08 | 2022-12-08 | National park unmanned aerial vehicle patrol path optimization method based on reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115574826B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116894519A (en) * | 2023-07-21 | 2023-10-17 | 江苏舟行时空智能科技股份有限公司 | Position point optimization determination method meeting user service coverage requirement |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110263983A (en) * | 2019-05-31 | 2019-09-20 | 中国人民解放军国防科技大学 | Double-layer path planning method and system for logistics distribution of vehicles and unmanned aerial vehicles |
CN110428111A (en) * | 2019-08-08 | 2019-11-08 | 西安工业大学 | Multi-Tasking method for planning track when UAV/UGV collaboration is long |
CN110470301A (en) * | 2019-08-13 | 2019-11-19 | 上海交通大学 | Unmanned plane paths planning method under more dynamic task target points |
CN111429052A (en) * | 2020-03-16 | 2020-07-17 | 北京航空航天大学 | Initial solution structure for vehicle path problem distributed by cooperating unmanned aerial vehicle |
CN111536979A (en) * | 2020-07-08 | 2020-08-14 | 浙江浙能天然气运行有限公司 | Unmanned aerial vehicle routing inspection path planning method based on random optimization |
CN112132312A (en) * | 2020-08-14 | 2020-12-25 | 蓝海(福建)信息科技有限公司 | Path planning method based on evolution multi-objective multi-task optimization |
US20210020051A1 (en) * | 2017-07-27 | 2021-01-21 | Beihang University | Airplane flight path planning method and device based on the pigeon-inspired optimization |
US20210325195A1 (en) * | 2020-04-20 | 2021-10-21 | Insurance Services Office, Inc. | Systems and Methods for Automated Vehicle Routing Using Relaxed Dual Optimal Inequalities for Relaxed Columns |
CN114422363A (en) * | 2022-01-11 | 2022-04-29 | 北京科技大学 | Unmanned aerial vehicle loaded RIS auxiliary communication system capacity optimization method and device |
CN115065939A (en) * | 2022-06-08 | 2022-09-16 | 电子科技大学长三角研究院(衢州) | Auxiliary communication unmanned aerial vehicle trajectory planning and power control method capable of charging in flight |
CN115185303A (en) * | 2022-09-14 | 2022-10-14 | 南开大学 | Unmanned aerial vehicle patrol path planning method for national parks and natural protected areas |
CN115280103A (en) * | 2020-03-23 | 2022-11-01 | 支付宝实验室(新加坡)有限公司 | System and method for determining a path by learning selective optimization |
Non-Patent Citations (1)
Title |
---|
Min Guilong et al.: "Multi-objective unmanned aerial vehicle mission planning in military logistics", Computer Simulation * |
Also Published As
Publication number | Publication date |
---|---|
CN115574826B (en) | 2023-04-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113342046B (en) | Power transmission line unmanned aerial vehicle routing inspection path optimization method based on ant colony algorithm | |
CN105045274B (en) | A kind of intelligent shaft tower connected graph construction method for unmanned plane inspection trajectory planning | |
CN106959700B (en) | A kind of unmanned aerial vehicle group collaboration patrol tracing path planing method based on upper limit confidence interval algorithm | |
CN114169066B (en) | Space target characteristic measuring and reconnaissance method based on micro-nano constellation approaching reconnaissance | |
CN110597286B (en) | Method for realizing unmanned aerial vehicle autonomous inspection of power transmission line by using smart hangar | |
Liu et al. | Application of unmanned aerial vehicle hangar in transmission tower inspection considering the risk probabilities of steel towers | |
CN115185303B (en) | Unmanned aerial vehicle patrol path planning method for national parks and natural protected areas | |
CN113268081B (en) | Small unmanned aerial vehicle prevention and control command decision method and system based on reinforcement learning | |
Liang et al. | Drone fleet deployment strategy for large scale agriculture and forestry surveying | |
CN114638155A (en) | Unmanned aerial vehicle task allocation and path planning method based on intelligent airport | |
Zheng et al. | Robustness of the planning algorithm for ocean observation tasks | |
Zheng et al. | The collaborative power inspection task allocation method of “unmanned aerial vehicle and operating vehicle” | |
CN115574826B (en) | National park unmanned aerial vehicle patrol path optimization method based on reinforcement learning | |
CN116578120A (en) | Unmanned aerial vehicle scheduling method and device, unmanned aerial vehicle system and computer equipment | |
Qiu et al. | Improved F‐RRT∗ Algorithm for Flight‐Path Optimization in Hazardous Weather | |
Gaowei et al. | Using multi-layer coding genetic algorithm to solve time-critical task assignment of heterogeneous UAV teaming | |
CN113283827B (en) | Two-stage unmanned aerial vehicle logistics path planning method based on deep reinforcement learning | |
Hehtke et al. | An Autonomous Mission Management System to Assist Decision Making of a HALE Operator | |
Li et al. | Intelligent Early Warning Method Based on Drone Inspection | |
Sehrawat et al. | A power prediction approach for a solar-powered aerial vehicle enhanced by stacked machine learning technique | |
Zheng | Multimachine Collaborative Path Planning Method Based on A* Mechanism Connection Depth Neural Network Model | |
Dai et al. | A genetic algorithm-based research on drone trajectory planning strategy of cooperative inspection of transmission lines, substations and distribution lines | |
Yin et al. | Multi UAV cooperative task allocation method for intensive corridors of transmission lines inspection | |
CN117472083B (en) | Multi-unmanned aerial vehicle collaborative marine search path planning method | |
Liu et al. | Research on Unmanned Aerial Vehicle Trajectory Planning Based on Agent Reinforcement Learning in Alpine Forest Environment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||