CN114997617B - Multi-unmanned platform multi-target combined detection task allocation method and system - Google Patents
Multi-unmanned platform multi-target combined detection task allocation method and system Download PDFInfo
- Publication number
- CN114997617B CN114997617B CN202210566512.9A CN202210566512A CN114997617B CN 114997617 B CN114997617 B CN 114997617B CN 202210566512 A CN202210566512 A CN 202210566512A CN 114997617 B CN114997617 B CN 114997617B
- Authority
- CN
- China
- Prior art keywords
- area
- subtask
- unmanned platform
- task
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 28
- 238000001514 detection method Methods 0.000 title claims abstract description 24
- 230000006870 function Effects 0.000 claims abstract description 15
- 230000002787 reinforcement Effects 0.000 claims abstract description 10
- 239000000203 mixture Substances 0.000 claims abstract description 7
- 230000008901 benefit Effects 0.000 claims description 25
- 239000003795 chemical substances by application Substances 0.000 description 8
- 238000004364 calculation method Methods 0.000 description 3
- 238000005192 partition Methods 0.000 description 3
- 230000009471 action Effects 0.000 description 2
- 230000007123 defense Effects 0.000 description 2
- 230000006872 improvement Effects 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 238000005070 sampling Methods 0.000 description 2
- 238000012549 training Methods 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 238000013135 deep learning Methods 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 230000005484 gravity Effects 0.000 description 1
- 238000012423 maintenance Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0631—Resource planning, allocation, distributing or scheduling for enterprises or organisations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Human Resources & Organizations (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Economics (AREA)
- Entrepreneurship & Innovation (AREA)
- General Physics & Mathematics (AREA)
- Strategic Management (AREA)
- Physics & Mathematics (AREA)
- Tourism & Hospitality (AREA)
- Data Mining & Analysis (AREA)
- Operations Research (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Game Theory and Decision Science (AREA)
- Educational Administration (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Quality & Reliability (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Development Economics (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a multi-unmanned platform multi-target combined detection task distribution method and a system, wherein the method comprises the following steps: acquiring a target existence probability map of a task area, and carrying out data clustering by using a Gaussian mixture model to divide the task area into a plurality of key subregions; obtaining the area, position and other information of each subarea by utilizing the dividing result; designing a reward function according to the area of each area and the fight capability of the two sides of the friend or foe such as defensive power, attack power and the like; and designing a dynamic update strategy on the basis of the reinforcement learning DDQN network, and distributing task areas for each unmanned platform. The task allocation method provided by the invention converts the allocation of the multi-unmanned platform to the multi-target into the allocation of the multi-unmanned platform to the sub-task area, so that the task scale is reduced, and the complexity is reduced; and comprehensively considering factors such as search time, attack and damage proportion, task distribution uniformity degree and the like, distributing tasks for each unmanned platform, ensuring self safety and ensuring that the unmanned platform can rapidly detect enemy targets.
Description
Technical Field
The invention belongs to the technical field of multi-agent task planning, and particularly relates to a multi-unmanned platform multi-target combined detection task distribution method and system.
Background
The multi-unmanned platform collaboration system is superior to a single unmanned platform (e.g., unmanned ship, etc.) in terms of efficiency, performance, robustness, etc. By utilizing the cooperation of multiple unmanned platforms, the method can integrate the search and the enclosure of the enemy targets, improve the task execution efficiency and maximize the income, and has important significance for the maintenance of rights and interests such as the territory, the territory and the like in China. The distribution of enemy targets in performing a probing task tends to be diffuse and uneven, which can reduce the collaborative performance of the unmanned platform. Therefore, it is very important to perform a reasonable task planning before executing the task, so that the advantages and resources of each unmanned platform can be effectively utilized, the system efficiency is maximized, and the task execution efficiency is improved.
The task allocation of the multi-unmanned platform cooperative system is a multi-objective optimization problem, and not only the attack and damage ratio of the two parties of the friend and the foe but also factors of various aspects such as search efficiency, resource consumption and the like are considered. The complexity of the multi-objective optimization problem increases exponentially as the task size becomes larger, and the actual task scenario is complex and variable, and task allocation by the conventional method often needs to be solved from scratch, which can take a lot of time and resources.
Disclosure of Invention
Aiming at the defects and improvement demands of the prior art, the invention discloses a multi-unmanned platform multi-target combined detection task distribution method and a system, and aims to solve the technical problems of high solving complexity and large calculation amount of the existing task distribution method.
In order to achieve the above purpose, the invention provides a multi-unmanned platform multi-target combined detection task allocation method, which comprises the following steps:
s1, acquiring a target existence probability map of a task area, and fitting the target existence probability map by using m Gaussian models, so that the task area is divided into m subtask areas;
S2, obtaining an optimal task allocation result by maximizing accumulated rewards based on a reinforcement learning DDQN network; for any subtask region, the bonus function R t is expressed as:
Wherein L is a positive number, f r represents decision correctness, when the number of various unmanned platforms dispatched by the decision exceeds the actual remaining number, or the total number of unmanned platforms dispatched by the current subtask area is 0, the decision error f r =0, otherwise, the decision correctness f r=1;fa represents attack benefit, f l represents loss amount, f s represents search benefit, and w 1、w2、w3、w4 represents weight coefficient;
the attack benefit f a is calculated as follows:
Wherein, For the enemy target value of the jth subtask area, j E [1, m ], n is the class number of the unmanned platform,/>For the attack success rate of the ith unmanned platform to the enemy target in the jth subtask area, k ij is the number of the ith unmanned platforms distributed in the jth subtask area;
The loss f l is calculated as follows:
Wherein, For the value of class i unmanned platform,/>The attack success rate of an enemy target in the jth subtask area to the ith unmanned platform is achieved;
the search benefit f s is calculated as follows:
Wherein spow i is the searching capability of the i-th unmanned platform, S j is the area of the j-th subtask region, R j is the distance between the j-th subtask region and the starting point of the unmanned platform, and w 5 represents the weight coefficient.
Further, dynamically changing DDQN a parameter update interval of a target value network of the network according to rewards obtained by the agent;
the parameter update interval is calculated by the following steps:
Wherein, N t is the update interval of the time t, step t is the iteration times of the time t, r t is the reward obtained at the time t, and r th is the reward threshold; η is a weight coefficient greater than 0 and equal to or less than 1.
Further, the step S1 includes:
Acquiring a target existence probability map of a task area; roughly dividing a task area by adopting a K-Means method, and obtaining an initial value of a Gaussian mixture model parameter; and estimating each parameter by using a maximum expected algorithm, so as to divide the task area into m subtask areas.
Further, after dividing the m subtask areas, the area of each subtask area, the enemy target value and the distance from the enemy target value to the starting point of the unmanned platform are obtained according to the Gaussian function corresponding to each subtask area.
In another aspect of the present invention, a multi-unmanned platform multi-target joint detection task allocation system is provided, including:
The region dividing module is used for acquiring a target existence probability map of a task region, and fitting the target existence probability map by using m Gaussian models so as to divide the task region into m subtask regions;
the task allocation module is used for obtaining an optimal task allocation result by maximizing accumulated rewards based on the reinforcement learning DDQN network; for any subtask region, the bonus function R t is expressed as:
Wherein L is a positive number, f r represents decision correctness, when the number of various unmanned platforms dispatched by the decision exceeds the actual remaining number, or the total number of unmanned platforms dispatched by the current subtask area is 0, the decision error f r =0, otherwise, the decision correctness f r=1;fa represents attack benefit, f l represents loss amount, f s represents search benefit, and w 1、w2、w3、w4 represents weight coefficient;
the attack benefit f a is calculated as follows:
Wherein, For the enemy target value of the jth subtask area, j E [1, m ], n is the class number of the unmanned platform,/>For the attack success rate of the ith unmanned platform to the enemy target in the jth subtask area, k ij is the number of the ith unmanned platforms distributed in the jth subtask area;
The loss f l is calculated as follows:
Wherein, For the value of class i unmanned platform,/>The attack success rate of an enemy target in the jth subtask area to the ith unmanned platform is achieved;
the search benefit f s is calculated as follows:
Wherein spow i is the searching capability of the i-th unmanned platform, S j is the area of the j-th subtask region, R j is the distance between the j-th subtask region and the starting point of the unmanned platform, and w 5 represents the weight coefficient.
Further, dynamically changing DDQN a parameter update interval of a target value network of the network according to rewards obtained by the agent;
the parameter update interval is calculated by the following steps:
Wherein, N t is the update interval of the time t, step t is the iteration times of the time t, r t is the reward obtained at the time t, and r th is the reward threshold; η is a weight coefficient greater than 0 and equal to or less than 1.
Further, the area dividing module is specific to acquiring a target existence probability map of the task area; roughly dividing a task area by adopting a K-Means method, and obtaining an initial value of a Gaussian mixture model parameter; and estimating each parameter by using a maximum expected algorithm, so as to divide the task area into m subtask areas.
Further, after dividing the m subtask areas, the area of each subtask area, the enemy target value and the distance from the enemy target value to the starting point of the unmanned platform are obtained according to the Gaussian function corresponding to each subtask area.
In general, through the above technical solutions conceived by the present invention, the following beneficial effects can be obtained:
(1) According to the invention, the known task area target existence probability map is fitted and clustered, the task area is divided into a plurality of subtask areas containing a plurality of targets, the task distribution problem of the multi-unmanned platform to the multi-enemy targets is converted into the task distribution problem of the multi-unmanned platform to the multi-subtask areas, the task scale is reduced, the calculated amount is reduced, and the solving complexity is reduced. The rewarding function is designed by comprehensively considering various factors such as the attack and defense attributes, the searching capability and the like, so that the intelligent agent can minimize own loss and maximize attack benefits while efficiently completing tasks.
(2) According to the invention, an intelligent updating strategy is designed on the basis of the original DDQN network, the parameter updating interval of the target value network is dynamically changed according to the real-time rewards, the stability of the network is improved, and the task allocation strategy with higher rewards is learned by an intelligent agent.
Drawings
FIG. 1 is a schematic flow chart of a multi-unmanned platform multi-target joint detection task allocation method according to an embodiment of the present invention;
FIG. 2 is a second schematic flow chart of a multi-unmanned platform multi-target joint detection task allocation method according to an embodiment of the present invention;
FIG. 3 is a third flow chart of a multi-unmanned platform multi-target joint detection task allocation method according to an embodiment of the present invention;
FIG. 4 is a graph of initial target existence probabilities provided by an embodiment of the present invention;
FIG. 5 is a graph of probability of existence of a target obtained by Gaussian fitting according to an embodiment of the invention;
FIG. 6 is a plot of the resulting partitions of a Gaussian fit provided by an embodiment of the invention;
FIG. 7 is a graph of average instant prize provided by an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
Deep reinforcement learning is a combination of deep learning and reinforcement learning, and is a method for adjusting self strategies by continuously performing trial and error with environment through an agent. The dynamic planning method can cope with environmental changes and is more suitable for task allocation than the traditional method. Therefore, the invention provides a multi-unmanned platform multi-target joint detection task distribution method and system based on deep reinforcement learning, different profit functions are designed aiming at the movement characteristics of the unmanned platform, the searching efficiency and the attack and damage advantages are considered at the same time when the decision is made, and the high efficiency and the synergy of executing tasks are ensured.
Referring to fig. 1, in combination with fig. 2 and fig. 3, the present invention provides a multi-unmanned platform multi-target joint detection task allocation method, which includes the following steps:
s1, acquiring a target existence probability map of a task area, and fitting the target existence probability map by using m Gaussian models, so that the task area is divided into m subtask areas.
In this embodiment, an initial target existence probability map of the task area is obtained by means of reconnaissance, detection, and the like, as shown in fig. 4, and is rasterized. And setting a plurality of sampling points for the task area, and dividing the sampling points into grids according to the probability of each grid. M gaussian models G n (x, y) (n=1, 2..m) were used to fit the target presence probability map for the task area, as shown in fig. 5. Each Gaussian model has a specific gravity of w n andThe target existence probability of the grid (x, y) can be expressed as follows by a gaussian mixture model:
Wherein, the formula of the gaussian density function G n (x, y) is as follows:
Where p= [ x, y ] T represents the center position vector of the grid (x, y). Firstly, a K-Means method is adopted to roughly divide a task area, and an initial value of a Gaussian mixture model parameter w n、μn、Cn is obtained. The final partition is then obtained by estimating the various parameters using the maximum expectation algorithm, as shown in fig. 6. After the partitions are obtained, the area and the regional value of each subtask region and the distance from the starting point of the unmanned platform, namely the origin, can be obtained by the Gaussian function corresponding to each region, as shown in Table 1.
TABLE 1 enemy target information table in subareas
Project | Zone 1 | Zone 2 | Zone 3 |
Area of | 108.9372 | 130.0092 | 267.7715 |
Distance of | 34.9 | 56.9 | 40.2 |
Value of | 21.44 | 25.56 | 52.99 |
Attack success rate | [0.7,0.7,0.8] | [0.8,0.7,0.8] | [0.8,0.7,0.9] |
S2, obtaining the optimal task allocation result by maximizing the accumulated rewards based on the reinforcement learning DDQN network.
Specifically, comprehensively evaluating the detection range of the unmanned platform carrying the detector in unit time to obtain the searching capability of various unmanned platforms; and comprehensively evaluating the attack and defense capabilities of the enemy targets and various unmanned platforms on the my sides to obtain the attack success rate of the various unmanned platforms on the enemy targets in the area, as shown in table 2. And obtaining the attack success rate of the enemy targets in each area to various unmanned platforms on the my side, as shown in table 1.
Table 2 unmanned platform information table
Project | Type 1 | Type 2 | Type 3 |
Search capability | 2 | 1 | 1 |
Value of | 20 | 10 | 30 |
Quantity of | 2 | 2 | 2 |
Attack success rate | [0.8,0.7,0.7] | [0.7,0.8,0.6] | [0.8,0.7,0.8] |
For any subtask region, the bonus function R t is expressed as:
Wherein L is a positive number, in this embodiment, l=100; f r represents decision correctness, when the number of various unmanned platforms dispatched by the decision exceeds the actual remaining number, or the total number of unmanned platforms dispatched by the current subtask area is 0, the decision error f r =0, otherwise, the decision correctness f r=1;fa represents attack benefit, f l represents loss amount, f s represents search benefit, and w 1、w2、w3、w4 represents weight coefficient;
the attack benefit f a is calculated as follows:
Wherein, For the enemy target value of the jth subtask area, j E [1, m ], n is the class number of the unmanned platform,/>For the attack success rate of the ith unmanned platform to the enemy target in the jth subtask area, k ij is the number of the ith unmanned platforms distributed in the jth subtask area;
The loss f l is calculated as follows:
Wherein, For the value of class i unmanned platform,/>The attack success rate of an enemy target in the jth subtask area to the ith unmanned platform is achieved;
the search benefit f s is calculated as follows:
Wherein spow i is the searching capability of the i-th unmanned platform, S j is the area of the j-th subtask region, R j is the distance between the j-th subtask region and the starting point of the unmanned platform, and w 5 represents the weight coefficient. For subtask areas with large areas or long distances, unmanned platforms with higher searching capability or more unmanned platforms are considered to be allocated.
Further, the number of the subtask area to be allocated and the number of the unmanned platforms are combined into a vector, namely a1×4 vector, which is used as the input of the network. The network calculates and outputs a vector of 1×27 dimensions, namely the value of all possible decisions in the current state, and selects the maximum value and converts the corresponding subscript and action. The actions are the number of unmanned platforms of each type dispatched by the subtask area. The network prototype is DDQN network, and the update interval of the target value network is fixed. The exploration force of the early-stage intelligent agent is large, and the network can be converged towards the direction of higher rewards by the small parameter updating interval; however, in the later stage of training, the smaller updating interval makes the parameters of the target value network updated frequently, and network convergence is difficult. Therefore, the invention provides an intelligent updating strategy, namely a strategy that a smaller parameter is given at a more interval in the early stage of network training, so that an intelligent agent can learn a higher rewarding more easily; when the instant benefit increases to a certain value, the parameter updating interval is gradually increased, so that frequent updating of the parameters is avoided, and the stability of the network is improved. The calculation formula is as follows:
Wherein, N t is the update interval of the time t, step t is the iteration times of the time t, r t is the reward obtained at the time t, and r th is the reward threshold; η is a weight coefficient greater than 0 and equal to or less than 1. In this embodiment, an initial parameter update interval N 0 =50 is set, and the prize threshold r th = -2, η=1.
The final distribution results are shown in Table 3, and the final profit map is shown in FIG. 7.
Table 3 distribution results table
The invention also provides a multi-unmanned platform multi-target combined detection task distribution system, which comprises:
The region dividing module is used for acquiring a target existence probability map of a task region, and fitting the target existence probability map by using m Gaussian models so as to divide the task region into m subtask regions;
And the task allocation module is used for obtaining the optimal task allocation result by maximizing the accumulated rewards based on the reinforcement learning DDQN network. The calculation mode of the reward function for any subtask region is the same as the multi-unmanned platform multi-target combined detection task allocation method, and is not repeated here.
It will be readily appreciated by those skilled in the art that the foregoing description is merely a preferred embodiment of the invention and is not intended to limit the invention, but any modifications, equivalents, improvements or alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.
Claims (8)
1. The multi-unmanned platform multi-target combined detection task allocation method is characterized by comprising the following steps of:
s1, acquiring a target existence probability map of a task area, and fitting the target existence probability map by using m Gaussian models, so that the task area is divided into m subtask areas;
S2, obtaining an optimal task allocation result by maximizing accumulated rewards based on a reinforcement learning DDQN network; for any subtask region, the bonus function R t is expressed as:
Wherein L is a positive number, f r represents decision correctness, when the number of various unmanned platforms dispatched by the decision exceeds the actual remaining number, or the total number of unmanned platforms dispatched by the current subtask area is 0, the decision error f r =0, otherwise, the decision correctness f r=1;fa represents attack benefit, f l represents loss amount, f s represents search benefit, and w 1、w2、w3、w4 represents weight coefficient;
the attack benefit f a is calculated as follows:
Wherein, For the enemy target value of the jth subtask area, j epsilon [1, m ], n is the class number of the unmanned platform,For the attack success rate of the ith unmanned platform to the enemy target in the jth subtask area, k ij is the number of the ith unmanned platforms distributed in the jth subtask area;
The loss f l is calculated as follows:
Wherein, For the value of class i unmanned platform,/>The attack success rate of an enemy target in the jth subtask area to the ith unmanned platform is achieved;
the search benefit f s is calculated as follows:
Wherein spow i is the searching capability of the i-th unmanned platform, S j is the area of the j-th subtask region, R j is the distance between the j-th subtask region and the starting point of the unmanned platform, and w 5 represents the weight coefficient.
2. The multi-unmanned platform multi-target joint detection task allocation method according to claim 1, wherein the parameter update interval of the target value network of the DDQN network is dynamically changed according to rewards obtained by the agent;
the parameter update interval is calculated by the following steps:
Wherein, N t is the update interval of the time t, step t is the iteration times of the time t, r t is the reward obtained at the time t, and r th is the reward threshold; η is a weight coefficient greater than 0 and equal to or less than 1.
3. The multi-unmanned platform multi-target joint detection task allocation method according to claim 1 or 2, wherein the step S1 comprises:
Acquiring a target existence probability map of a task area; roughly dividing a task area by adopting a K-Means method, and obtaining an initial value of a Gaussian mixture model parameter; and estimating each parameter by using a maximum expected algorithm, so as to divide the task area into m subtask areas.
4. The multi-unmanned platform multi-target joint detection task allocation method according to claim 3, wherein after dividing m subtask areas, the area of each subtask area, the enemy target value and the distance from the enemy target value to the starting point of the unmanned platform are obtained according to a Gaussian function corresponding to each subtask area.
5. A multi-unmanned platform multi-target joint detection task allocation system, comprising:
The region dividing module is used for acquiring a target existence probability map of a task region, and fitting the target existence probability map by using m Gaussian models so as to divide the task region into m subtask regions;
the task allocation module is used for obtaining an optimal task allocation result by maximizing accumulated rewards based on the reinforcement learning DDQN network; for any subtask region, the bonus function R t is expressed as:
Wherein L is a positive number, f r represents decision correctness, when the number of various unmanned platforms dispatched by the decision exceeds the actual remaining number, or the total number of unmanned platforms dispatched by the current subtask area is 0, the decision error f r =0, otherwise, the decision correctness f r=1;fa represents attack benefit, f l represents loss amount, f s represents search benefit, and w 1、w2、w3、w4 represents weight coefficient;
the attack benefit f a is calculated as follows:
Wherein, For the enemy target value of the jth subtask area, j epsilon [1, m ], n is the class number of the unmanned platform,For the attack success rate of the ith unmanned platform to the enemy target in the jth subtask area, k ij is the number of the ith unmanned platforms distributed in the jth subtask area;
The loss f l is calculated as follows:
Wherein, For the value of class i unmanned platform,/>The attack success rate of an enemy target in the jth subtask area to the ith unmanned platform is achieved;
the search benefit f s is calculated as follows:
Wherein spow i is the searching capability of the i-th unmanned platform, S j is the area of the j-th subtask region, R j is the distance between the j-th subtask region and the starting point of the unmanned platform, and w 5 represents the weight coefficient.
6. The multi-unmanned platform multi-target joint detection task allocation system according to claim 5, wherein the parameter update interval of the target value network of the DDQN network is dynamically changed according to rewards obtained by the agent;
the parameter update interval is calculated by the following steps:
Wherein, N t is the update interval of the time t, step t is the iteration times of the time t, r t is the reward obtained at the time t, and r th is the reward threshold; η is a weight coefficient greater than 0 and equal to or less than 1.
7. The multi-unmanned platform multi-target joint detection task allocation system according to claim 5 or 6, wherein the region dividing module is specifically configured to obtain a target existence probability map of a task region; roughly dividing a task area by adopting a K-Means method, and obtaining an initial value of a Gaussian mixture model parameter; and estimating each parameter by using a maximum expected algorithm, so as to divide the task area into m subtask areas.
8. The multi-unmanned platform multi-target joint detection task allocation system according to claim 7, wherein after dividing the m subtask areas, the area of each subtask area, the enemy target value and the distance to the starting point of the unmanned platform are calculated according to the gaussian function corresponding to each subtask area.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210566512.9A CN114997617B (en) | 2022-05-23 | 2022-05-23 | Multi-unmanned platform multi-target combined detection task allocation method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210566512.9A CN114997617B (en) | 2022-05-23 | 2022-05-23 | Multi-unmanned platform multi-target combined detection task allocation method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114997617A CN114997617A (en) | 2022-09-02 |
CN114997617B true CN114997617B (en) | 2024-06-07 |
Family
ID=83027440
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210566512.9A Active CN114997617B (en) | 2022-05-23 | 2022-05-23 | Multi-unmanned platform multi-target combined detection task allocation method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114997617B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111786713A (en) * | 2020-06-04 | 2020-10-16 | 大连理工大学 | Unmanned aerial vehicle network hovering position optimization method based on multi-agent deep reinforcement learning |
CN112558642A (en) * | 2020-12-30 | 2021-03-26 | 上海大学 | Sea-air combined capturing method suitable for heterogeneous multi-unmanned system |
CN113589842A (en) * | 2021-07-26 | 2021-11-02 | 中国电子科技集团公司第五十四研究所 | Unmanned clustering task cooperation method based on multi-agent reinforcement learning |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
MX2019009470A (en) * | 2017-02-08 | 2019-09-16 | Walmart Apollo Llc | Task management in retail environment. |
-
2022
- 2022-05-23 CN CN202210566512.9A patent/CN114997617B/en active Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111786713A (en) * | 2020-06-04 | 2020-10-16 | 大连理工大学 | Unmanned aerial vehicle network hovering position optimization method based on multi-agent deep reinforcement learning |
CN112558642A (en) * | 2020-12-30 | 2021-03-26 | 上海大学 | Sea-air combined capturing method suitable for heterogeneous multi-unmanned system |
CN113589842A (en) * | 2021-07-26 | 2021-11-02 | 中国电子科技集团公司第五十四研究所 | Unmanned clustering task cooperation method based on multi-agent reinforcement learning |
Also Published As
Publication number | Publication date |
---|---|
CN114997617A (en) | 2022-09-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Fu | Deep belief network based ensemble approach for cooling load forecasting of air-conditioning system | |
CN111563188B (en) | Mobile multi-agent cooperative target searching method | |
Wang et al. | Shapley Q-value: A local reward approach to solve global reward games | |
Bingul | Adaptive genetic algorithms applied to dynamic multiobjective problems | |
CN104408518B (en) | Based on the neural network learning optimization method of particle swarm optimization algorithm | |
CN114815882B (en) | Unmanned aerial vehicle autonomous formation intelligent control method based on reinforcement learning | |
Zhao et al. | A decomposition-based many-objective ant colony optimization algorithm with adaptive solution construction and selection approaches | |
CN116307464A (en) | AGV task allocation method based on multi-agent deep reinforcement learning | |
CN111797966B (en) | Multi-machine collaborative global target distribution method based on improved flock algorithm | |
Gao et al. | Consensus evaluation method of multi-ground-target threat for unmanned aerial vehicle swarm based on heterogeneous group decision making | |
CN115047907B (en) | Air isomorphic formation command method based on multi-agent PPO algorithm | |
CN113780576A (en) | Cooperative multi-agent reinforcement learning method based on reward self-adaptive distribution | |
CN114281103B (en) | Aircraft cluster collaborative search method with zero interaction communication | |
Juang et al. | A self-generating fuzzy system with ant and particle swarm cooperative optimization | |
Pourpanah et al. | mBSO: A multi-population brain storm optimization for multimodal dynamic optimization problems | |
CN114997617B (en) | Multi-unmanned platform multi-target combined detection task allocation method and system | |
CN111967199A (en) | Agent contribution distribution method under reinforcement learning multi-agent cooperation task | |
Dias et al. | Quantum-inspired neuro coevolution model applied to coordination problems | |
CN117112176A (en) | Intelligent community fog calculation task scheduling method based on ant lion algorithm | |
CN116797116A (en) | Reinforced learning road network load balancing scheduling method based on improved reward and punishment mechanism | |
CN109359671B (en) | Classification intelligent extraction method for hydropower station reservoir dispatching rules | |
Gaowei et al. | Using multi-layer coding genetic algorithm to solve time-critical task assignment of heterogeneous UAV teaming | |
CN110505293A (en) | Cooperation caching method based on improved drosophila optimization algorithm in a kind of mist wireless access network | |
CN114166228B (en) | Unmanned aerial vehicle continuous monitoring path planning method | |
CN116165886A (en) | Multi-sensor intelligent cooperative control method, device, equipment and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |