CN114997617B - Multi-unmanned platform multi-target combined detection task allocation method and system - Google Patents

Multi-unmanned platform multi-target combined detection task allocation method and system Download PDF

Info

Publication number
CN114997617B
CN114997617B CN202210566512.9A CN202210566512A CN114997617B CN 114997617 B CN114997617 B CN 114997617B CN 202210566512 A CN202210566512 A CN 202210566512A CN 114997617 B CN114997617 B CN 114997617B
Authority
CN
China
Prior art keywords
area
subtask
unmanned platform
task
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210566512.9A
Other languages
Chinese (zh)
Other versions
CN114997617A (en
Inventor
杨卫东
王棋
钟胜
颜露新
邹旭
王伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN202210566512.9A priority Critical patent/CN114997617B/en
Publication of CN114997617A publication Critical patent/CN114997617A/en
Application granted granted Critical
Publication of CN114997617B publication Critical patent/CN114997617B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0631Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Data Mining & Analysis (AREA)
  • Operations Research (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Educational Administration (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Development Economics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a multi-unmanned platform multi-target combined detection task distribution method and a system, wherein the method comprises the following steps: acquiring a target existence probability map of a task area, and carrying out data clustering by using a Gaussian mixture model to divide the task area into a plurality of key subregions; obtaining the area, position and other information of each subarea by utilizing the dividing result; designing a reward function according to the area of each area and the fight capability of the two sides of the friend or foe such as defensive power, attack power and the like; and designing a dynamic update strategy on the basis of the reinforcement learning DDQN network, and distributing task areas for each unmanned platform. The task allocation method provided by the invention converts the allocation of the multi-unmanned platform to the multi-target into the allocation of the multi-unmanned platform to the sub-task area, so that the task scale is reduced, and the complexity is reduced; and comprehensively considering factors such as search time, attack and damage proportion, task distribution uniformity degree and the like, distributing tasks for each unmanned platform, ensuring self safety and ensuring that the unmanned platform can rapidly detect enemy targets.

Description

Multi-unmanned platform multi-target combined detection task allocation method and system
Technical Field
The invention belongs to the technical field of multi-agent task planning, and particularly relates to a multi-unmanned platform multi-target combined detection task distribution method and system.
Background
The multi-unmanned platform collaboration system is superior to a single unmanned platform (e.g., unmanned ship, etc.) in terms of efficiency, performance, robustness, etc. By utilizing the cooperation of multiple unmanned platforms, the method can integrate the search and the enclosure of the enemy targets, improve the task execution efficiency and maximize the income, and has important significance for the maintenance of rights and interests such as the territory, the territory and the like in China. The distribution of enemy targets in performing a probing task tends to be diffuse and uneven, which can reduce the collaborative performance of the unmanned platform. Therefore, it is very important to perform a reasonable task planning before executing the task, so that the advantages and resources of each unmanned platform can be effectively utilized, the system efficiency is maximized, and the task execution efficiency is improved.
The task allocation of the multi-unmanned platform cooperative system is a multi-objective optimization problem, and not only the attack and damage ratio of the two parties of the friend and the foe but also factors of various aspects such as search efficiency, resource consumption and the like are considered. The complexity of the multi-objective optimization problem increases exponentially as the task size becomes larger, and the actual task scenario is complex and variable, and task allocation by the conventional method often needs to be solved from scratch, which can take a lot of time and resources.
Disclosure of Invention
Aiming at the defects and improvement demands of the prior art, the invention discloses a multi-unmanned platform multi-target combined detection task distribution method and a system, and aims to solve the technical problems of high solving complexity and large calculation amount of the existing task distribution method.
In order to achieve the above purpose, the invention provides a multi-unmanned platform multi-target combined detection task allocation method, which comprises the following steps:
s1, acquiring a target existence probability map of a task area, and fitting the target existence probability map by using m Gaussian models, so that the task area is divided into m subtask areas;
S2, obtaining an optimal task allocation result by maximizing accumulated rewards based on a reinforcement learning DDQN network; for any subtask region, the bonus function R t is expressed as:
Wherein L is a positive number, f r represents decision correctness, when the number of various unmanned platforms dispatched by the decision exceeds the actual remaining number, or the total number of unmanned platforms dispatched by the current subtask area is 0, the decision error f r =0, otherwise, the decision correctness f r=1;fa represents attack benefit, f l represents loss amount, f s represents search benefit, and w 1、w2、w3、w4 represents weight coefficient;
the attack benefit f a is calculated as follows:
Wherein, For the enemy target value of the jth subtask area, j E [1, m ], n is the class number of the unmanned platform,/>For the attack success rate of the ith unmanned platform to the enemy target in the jth subtask area, k ij is the number of the ith unmanned platforms distributed in the jth subtask area;
The loss f l is calculated as follows:
Wherein, For the value of class i unmanned platform,/>The attack success rate of an enemy target in the jth subtask area to the ith unmanned platform is achieved;
the search benefit f s is calculated as follows:
Wherein spow i is the searching capability of the i-th unmanned platform, S j is the area of the j-th subtask region, R j is the distance between the j-th subtask region and the starting point of the unmanned platform, and w 5 represents the weight coefficient.
Further, dynamically changing DDQN a parameter update interval of a target value network of the network according to rewards obtained by the agent;
the parameter update interval is calculated by the following steps:
Wherein, N t is the update interval of the time t, step t is the iteration times of the time t, r t is the reward obtained at the time t, and r th is the reward threshold; η is a weight coefficient greater than 0 and equal to or less than 1.
Further, the step S1 includes:
Acquiring a target existence probability map of a task area; roughly dividing a task area by adopting a K-Means method, and obtaining an initial value of a Gaussian mixture model parameter; and estimating each parameter by using a maximum expected algorithm, so as to divide the task area into m subtask areas.
Further, after dividing the m subtask areas, the area of each subtask area, the enemy target value and the distance from the enemy target value to the starting point of the unmanned platform are obtained according to the Gaussian function corresponding to each subtask area.
In another aspect of the present invention, a multi-unmanned platform multi-target joint detection task allocation system is provided, including:
The region dividing module is used for acquiring a target existence probability map of a task region, and fitting the target existence probability map by using m Gaussian models so as to divide the task region into m subtask regions;
the task allocation module is used for obtaining an optimal task allocation result by maximizing accumulated rewards based on the reinforcement learning DDQN network; for any subtask region, the bonus function R t is expressed as:
Wherein L is a positive number, f r represents decision correctness, when the number of various unmanned platforms dispatched by the decision exceeds the actual remaining number, or the total number of unmanned platforms dispatched by the current subtask area is 0, the decision error f r =0, otherwise, the decision correctness f r=1;fa represents attack benefit, f l represents loss amount, f s represents search benefit, and w 1、w2、w3、w4 represents weight coefficient;
the attack benefit f a is calculated as follows:
Wherein, For the enemy target value of the jth subtask area, j E [1, m ], n is the class number of the unmanned platform,/>For the attack success rate of the ith unmanned platform to the enemy target in the jth subtask area, k ij is the number of the ith unmanned platforms distributed in the jth subtask area;
The loss f l is calculated as follows:
Wherein, For the value of class i unmanned platform,/>The attack success rate of an enemy target in the jth subtask area to the ith unmanned platform is achieved;
the search benefit f s is calculated as follows:
Wherein spow i is the searching capability of the i-th unmanned platform, S j is the area of the j-th subtask region, R j is the distance between the j-th subtask region and the starting point of the unmanned platform, and w 5 represents the weight coefficient.
Further, dynamically changing DDQN a parameter update interval of a target value network of the network according to rewards obtained by the agent;
the parameter update interval is calculated by the following steps:
Wherein, N t is the update interval of the time t, step t is the iteration times of the time t, r t is the reward obtained at the time t, and r th is the reward threshold; η is a weight coefficient greater than 0 and equal to or less than 1.
Further, the area dividing module is specific to acquiring a target existence probability map of the task area; roughly dividing a task area by adopting a K-Means method, and obtaining an initial value of a Gaussian mixture model parameter; and estimating each parameter by using a maximum expected algorithm, so as to divide the task area into m subtask areas.
Further, after dividing the m subtask areas, the area of each subtask area, the enemy target value and the distance from the enemy target value to the starting point of the unmanned platform are obtained according to the Gaussian function corresponding to each subtask area.
In general, through the above technical solutions conceived by the present invention, the following beneficial effects can be obtained:
(1) According to the invention, the known task area target existence probability map is fitted and clustered, the task area is divided into a plurality of subtask areas containing a plurality of targets, the task distribution problem of the multi-unmanned platform to the multi-enemy targets is converted into the task distribution problem of the multi-unmanned platform to the multi-subtask areas, the task scale is reduced, the calculated amount is reduced, and the solving complexity is reduced. The rewarding function is designed by comprehensively considering various factors such as the attack and defense attributes, the searching capability and the like, so that the intelligent agent can minimize own loss and maximize attack benefits while efficiently completing tasks.
(2) According to the invention, an intelligent updating strategy is designed on the basis of the original DDQN network, the parameter updating interval of the target value network is dynamically changed according to the real-time rewards, the stability of the network is improved, and the task allocation strategy with higher rewards is learned by an intelligent agent.
Drawings
FIG. 1 is a schematic flow chart of a multi-unmanned platform multi-target joint detection task allocation method according to an embodiment of the present invention;
FIG. 2 is a second schematic flow chart of a multi-unmanned platform multi-target joint detection task allocation method according to an embodiment of the present invention;
FIG. 3 is a third flow chart of a multi-unmanned platform multi-target joint detection task allocation method according to an embodiment of the present invention;
FIG. 4 is a graph of initial target existence probabilities provided by an embodiment of the present invention;
FIG. 5 is a graph of probability of existence of a target obtained by Gaussian fitting according to an embodiment of the invention;
FIG. 6 is a plot of the resulting partitions of a Gaussian fit provided by an embodiment of the invention;
FIG. 7 is a graph of average instant prize provided by an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
Deep reinforcement learning is a combination of deep learning and reinforcement learning, and is a method for adjusting self strategies by continuously performing trial and error with environment through an agent. The dynamic planning method can cope with environmental changes and is more suitable for task allocation than the traditional method. Therefore, the invention provides a multi-unmanned platform multi-target joint detection task distribution method and system based on deep reinforcement learning, different profit functions are designed aiming at the movement characteristics of the unmanned platform, the searching efficiency and the attack and damage advantages are considered at the same time when the decision is made, and the high efficiency and the synergy of executing tasks are ensured.
Referring to fig. 1, in combination with fig. 2 and fig. 3, the present invention provides a multi-unmanned platform multi-target joint detection task allocation method, which includes the following steps:
s1, acquiring a target existence probability map of a task area, and fitting the target existence probability map by using m Gaussian models, so that the task area is divided into m subtask areas.
In this embodiment, an initial target existence probability map of the task area is obtained by means of reconnaissance, detection, and the like, as shown in fig. 4, and is rasterized. And setting a plurality of sampling points for the task area, and dividing the sampling points into grids according to the probability of each grid. M gaussian models G n (x, y) (n=1, 2..m) were used to fit the target presence probability map for the task area, as shown in fig. 5. Each Gaussian model has a specific gravity of w n andThe target existence probability of the grid (x, y) can be expressed as follows by a gaussian mixture model:
Wherein, the formula of the gaussian density function G n (x, y) is as follows:
Where p= [ x, y ] T represents the center position vector of the grid (x, y). Firstly, a K-Means method is adopted to roughly divide a task area, and an initial value of a Gaussian mixture model parameter w n、μn、Cn is obtained. The final partition is then obtained by estimating the various parameters using the maximum expectation algorithm, as shown in fig. 6. After the partitions are obtained, the area and the regional value of each subtask region and the distance from the starting point of the unmanned platform, namely the origin, can be obtained by the Gaussian function corresponding to each region, as shown in Table 1.
TABLE 1 enemy target information table in subareas
Project Zone 1 Zone 2 Zone 3
Area of 108.9372 130.0092 267.7715
Distance of 34.9 56.9 40.2
Value of 21.44 25.56 52.99
Attack success rate [0.7,0.7,0.8] [0.8,0.7,0.8] [0.8,0.7,0.9]
S2, obtaining the optimal task allocation result by maximizing the accumulated rewards based on the reinforcement learning DDQN network.
Specifically, comprehensively evaluating the detection range of the unmanned platform carrying the detector in unit time to obtain the searching capability of various unmanned platforms; and comprehensively evaluating the attack and defense capabilities of the enemy targets and various unmanned platforms on the my sides to obtain the attack success rate of the various unmanned platforms on the enemy targets in the area, as shown in table 2. And obtaining the attack success rate of the enemy targets in each area to various unmanned platforms on the my side, as shown in table 1.
Table 2 unmanned platform information table
Project Type 1 Type 2 Type 3
Search capability 2 1 1
Value of 20 10 30
Quantity of 2 2 2
Attack success rate [0.8,0.7,0.7] [0.7,0.8,0.6] [0.8,0.7,0.8]
For any subtask region, the bonus function R t is expressed as:
Wherein L is a positive number, in this embodiment, l=100; f r represents decision correctness, when the number of various unmanned platforms dispatched by the decision exceeds the actual remaining number, or the total number of unmanned platforms dispatched by the current subtask area is 0, the decision error f r =0, otherwise, the decision correctness f r=1;fa represents attack benefit, f l represents loss amount, f s represents search benefit, and w 1、w2、w3、w4 represents weight coefficient;
the attack benefit f a is calculated as follows:
Wherein, For the enemy target value of the jth subtask area, j E [1, m ], n is the class number of the unmanned platform,/>For the attack success rate of the ith unmanned platform to the enemy target in the jth subtask area, k ij is the number of the ith unmanned platforms distributed in the jth subtask area;
The loss f l is calculated as follows:
Wherein, For the value of class i unmanned platform,/>The attack success rate of an enemy target in the jth subtask area to the ith unmanned platform is achieved;
the search benefit f s is calculated as follows:
Wherein spow i is the searching capability of the i-th unmanned platform, S j is the area of the j-th subtask region, R j is the distance between the j-th subtask region and the starting point of the unmanned platform, and w 5 represents the weight coefficient. For subtask areas with large areas or long distances, unmanned platforms with higher searching capability or more unmanned platforms are considered to be allocated.
Further, the number of the subtask area to be allocated and the number of the unmanned platforms are combined into a vector, namely a1×4 vector, which is used as the input of the network. The network calculates and outputs a vector of 1×27 dimensions, namely the value of all possible decisions in the current state, and selects the maximum value and converts the corresponding subscript and action. The actions are the number of unmanned platforms of each type dispatched by the subtask area. The network prototype is DDQN network, and the update interval of the target value network is fixed. The exploration force of the early-stage intelligent agent is large, and the network can be converged towards the direction of higher rewards by the small parameter updating interval; however, in the later stage of training, the smaller updating interval makes the parameters of the target value network updated frequently, and network convergence is difficult. Therefore, the invention provides an intelligent updating strategy, namely a strategy that a smaller parameter is given at a more interval in the early stage of network training, so that an intelligent agent can learn a higher rewarding more easily; when the instant benefit increases to a certain value, the parameter updating interval is gradually increased, so that frequent updating of the parameters is avoided, and the stability of the network is improved. The calculation formula is as follows:
Wherein, N t is the update interval of the time t, step t is the iteration times of the time t, r t is the reward obtained at the time t, and r th is the reward threshold; η is a weight coefficient greater than 0 and equal to or less than 1. In this embodiment, an initial parameter update interval N 0 =50 is set, and the prize threshold r th = -2, η=1.
The final distribution results are shown in Table 3, and the final profit map is shown in FIG. 7.
Table 3 distribution results table
The invention also provides a multi-unmanned platform multi-target combined detection task distribution system, which comprises:
The region dividing module is used for acquiring a target existence probability map of a task region, and fitting the target existence probability map by using m Gaussian models so as to divide the task region into m subtask regions;
And the task allocation module is used for obtaining the optimal task allocation result by maximizing the accumulated rewards based on the reinforcement learning DDQN network. The calculation mode of the reward function for any subtask region is the same as the multi-unmanned platform multi-target combined detection task allocation method, and is not repeated here.
It will be readily appreciated by those skilled in the art that the foregoing description is merely a preferred embodiment of the invention and is not intended to limit the invention, but any modifications, equivalents, improvements or alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (8)

1. The multi-unmanned platform multi-target combined detection task allocation method is characterized by comprising the following steps of:
s1, acquiring a target existence probability map of a task area, and fitting the target existence probability map by using m Gaussian models, so that the task area is divided into m subtask areas;
S2, obtaining an optimal task allocation result by maximizing accumulated rewards based on a reinforcement learning DDQN network; for any subtask region, the bonus function R t is expressed as:
Wherein L is a positive number, f r represents decision correctness, when the number of various unmanned platforms dispatched by the decision exceeds the actual remaining number, or the total number of unmanned platforms dispatched by the current subtask area is 0, the decision error f r =0, otherwise, the decision correctness f r=1;fa represents attack benefit, f l represents loss amount, f s represents search benefit, and w 1、w2、w3、w4 represents weight coefficient;
the attack benefit f a is calculated as follows:
Wherein, For the enemy target value of the jth subtask area, j epsilon [1, m ], n is the class number of the unmanned platform,For the attack success rate of the ith unmanned platform to the enemy target in the jth subtask area, k ij is the number of the ith unmanned platforms distributed in the jth subtask area;
The loss f l is calculated as follows:
Wherein, For the value of class i unmanned platform,/>The attack success rate of an enemy target in the jth subtask area to the ith unmanned platform is achieved;
the search benefit f s is calculated as follows:
Wherein spow i is the searching capability of the i-th unmanned platform, S j is the area of the j-th subtask region, R j is the distance between the j-th subtask region and the starting point of the unmanned platform, and w 5 represents the weight coefficient.
2. The multi-unmanned platform multi-target joint detection task allocation method according to claim 1, wherein the parameter update interval of the target value network of the DDQN network is dynamically changed according to rewards obtained by the agent;
the parameter update interval is calculated by the following steps:
Wherein, N t is the update interval of the time t, step t is the iteration times of the time t, r t is the reward obtained at the time t, and r th is the reward threshold; η is a weight coefficient greater than 0 and equal to or less than 1.
3. The multi-unmanned platform multi-target joint detection task allocation method according to claim 1 or 2, wherein the step S1 comprises:
Acquiring a target existence probability map of a task area; roughly dividing a task area by adopting a K-Means method, and obtaining an initial value of a Gaussian mixture model parameter; and estimating each parameter by using a maximum expected algorithm, so as to divide the task area into m subtask areas.
4. The multi-unmanned platform multi-target joint detection task allocation method according to claim 3, wherein after dividing m subtask areas, the area of each subtask area, the enemy target value and the distance from the enemy target value to the starting point of the unmanned platform are obtained according to a Gaussian function corresponding to each subtask area.
5. A multi-unmanned platform multi-target joint detection task allocation system, comprising:
The region dividing module is used for acquiring a target existence probability map of a task region, and fitting the target existence probability map by using m Gaussian models so as to divide the task region into m subtask regions;
the task allocation module is used for obtaining an optimal task allocation result by maximizing accumulated rewards based on the reinforcement learning DDQN network; for any subtask region, the bonus function R t is expressed as:
Wherein L is a positive number, f r represents decision correctness, when the number of various unmanned platforms dispatched by the decision exceeds the actual remaining number, or the total number of unmanned platforms dispatched by the current subtask area is 0, the decision error f r =0, otherwise, the decision correctness f r=1;fa represents attack benefit, f l represents loss amount, f s represents search benefit, and w 1、w2、w3、w4 represents weight coefficient;
the attack benefit f a is calculated as follows:
Wherein, For the enemy target value of the jth subtask area, j epsilon [1, m ], n is the class number of the unmanned platform,For the attack success rate of the ith unmanned platform to the enemy target in the jth subtask area, k ij is the number of the ith unmanned platforms distributed in the jth subtask area;
The loss f l is calculated as follows:
Wherein, For the value of class i unmanned platform,/>The attack success rate of an enemy target in the jth subtask area to the ith unmanned platform is achieved;
the search benefit f s is calculated as follows:
Wherein spow i is the searching capability of the i-th unmanned platform, S j is the area of the j-th subtask region, R j is the distance between the j-th subtask region and the starting point of the unmanned platform, and w 5 represents the weight coefficient.
6. The multi-unmanned platform multi-target joint detection task allocation system according to claim 5, wherein the parameter update interval of the target value network of the DDQN network is dynamically changed according to rewards obtained by the agent;
the parameter update interval is calculated by the following steps:
Wherein, N t is the update interval of the time t, step t is the iteration times of the time t, r t is the reward obtained at the time t, and r th is the reward threshold; η is a weight coefficient greater than 0 and equal to or less than 1.
7. The multi-unmanned platform multi-target joint detection task allocation system according to claim 5 or 6, wherein the region dividing module is specifically configured to obtain a target existence probability map of a task region; roughly dividing a task area by adopting a K-Means method, and obtaining an initial value of a Gaussian mixture model parameter; and estimating each parameter by using a maximum expected algorithm, so as to divide the task area into m subtask areas.
8. The multi-unmanned platform multi-target joint detection task allocation system according to claim 7, wherein after dividing the m subtask areas, the area of each subtask area, the enemy target value and the distance to the starting point of the unmanned platform are calculated according to the gaussian function corresponding to each subtask area.
CN202210566512.9A 2022-05-23 2022-05-23 Multi-unmanned platform multi-target combined detection task allocation method and system Active CN114997617B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210566512.9A CN114997617B (en) 2022-05-23 2022-05-23 Multi-unmanned platform multi-target combined detection task allocation method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210566512.9A CN114997617B (en) 2022-05-23 2022-05-23 Multi-unmanned platform multi-target combined detection task allocation method and system

Publications (2)

Publication Number Publication Date
CN114997617A CN114997617A (en) 2022-09-02
CN114997617B true CN114997617B (en) 2024-06-07

Family

ID=83027440

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210566512.9A Active CN114997617B (en) 2022-05-23 2022-05-23 Multi-unmanned platform multi-target combined detection task allocation method and system

Country Status (1)

Country Link
CN (1) CN114997617B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111786713A (en) * 2020-06-04 2020-10-16 大连理工大学 Unmanned aerial vehicle network hovering position optimization method based on multi-agent deep reinforcement learning
CN112558642A (en) * 2020-12-30 2021-03-26 上海大学 Sea-air combined capturing method suitable for heterogeneous multi-unmanned system
CN113589842A (en) * 2021-07-26 2021-11-02 中国电子科技集团公司第五十四研究所 Unmanned clustering task cooperation method based on multi-agent reinforcement learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
MX2019009470A (en) * 2017-02-08 2019-09-16 Walmart Apollo Llc Task management in retail environment.

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111786713A (en) * 2020-06-04 2020-10-16 大连理工大学 Unmanned aerial vehicle network hovering position optimization method based on multi-agent deep reinforcement learning
CN112558642A (en) * 2020-12-30 2021-03-26 上海大学 Sea-air combined capturing method suitable for heterogeneous multi-unmanned system
CN113589842A (en) * 2021-07-26 2021-11-02 中国电子科技集团公司第五十四研究所 Unmanned clustering task cooperation method based on multi-agent reinforcement learning

Also Published As

Publication number Publication date
CN114997617A (en) 2022-09-02

Similar Documents

Publication Publication Date Title
Fu Deep belief network based ensemble approach for cooling load forecasting of air-conditioning system
CN111563188B (en) Mobile multi-agent cooperative target searching method
Wang et al. Shapley Q-value: A local reward approach to solve global reward games
Bingul Adaptive genetic algorithms applied to dynamic multiobjective problems
CN104408518B (en) Based on the neural network learning optimization method of particle swarm optimization algorithm
CN114815882B (en) Unmanned aerial vehicle autonomous formation intelligent control method based on reinforcement learning
Zhao et al. A decomposition-based many-objective ant colony optimization algorithm with adaptive solution construction and selection approaches
CN116307464A (en) AGV task allocation method based on multi-agent deep reinforcement learning
CN111797966B (en) Multi-machine collaborative global target distribution method based on improved flock algorithm
Gao et al. Consensus evaluation method of multi-ground-target threat for unmanned aerial vehicle swarm based on heterogeneous group decision making
CN115047907B (en) Air isomorphic formation command method based on multi-agent PPO algorithm
CN113780576A (en) Cooperative multi-agent reinforcement learning method based on reward self-adaptive distribution
CN114281103B (en) Aircraft cluster collaborative search method with zero interaction communication
Juang et al. A self-generating fuzzy system with ant and particle swarm cooperative optimization
Pourpanah et al. mBSO: A multi-population brain storm optimization for multimodal dynamic optimization problems
CN114997617B (en) Multi-unmanned platform multi-target combined detection task allocation method and system
CN111967199A (en) Agent contribution distribution method under reinforcement learning multi-agent cooperation task
Dias et al. Quantum-inspired neuro coevolution model applied to coordination problems
CN117112176A (en) Intelligent community fog calculation task scheduling method based on ant lion algorithm
CN116797116A (en) Reinforced learning road network load balancing scheduling method based on improved reward and punishment mechanism
CN109359671B (en) Classification intelligent extraction method for hydropower station reservoir dispatching rules
Gaowei et al. Using multi-layer coding genetic algorithm to solve time-critical task assignment of heterogeneous UAV teaming
CN110505293A (en) Cooperation caching method based on improved drosophila optimization algorithm in a kind of mist wireless access network
CN114166228B (en) Unmanned aerial vehicle continuous monitoring path planning method
CN116165886A (en) Multi-sensor intelligent cooperative control method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant