CN115526417A

CN115526417A - Multi-unmanned vehicle task allocation method and device, vehicle and storage medium

Info

Publication number: CN115526417A
Application number: CN202211266909.2A
Authority: CN
Inventors: 王建强; 刘艺璁; 韩泽宇; 杨奕彬; 王裕宁; 许庆; 徐少兵
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2022-10-17
Filing date: 2022-10-17
Publication date: 2022-12-27

Abstract

The application relates to the technical field of multi-unmanned vehicles, in particular to a multi-unmanned vehicle task allocation method and device, a vehicle and a storage medium, wherein the method comprises the following steps: inputting the acquired target tasks of the multiple unmanned vehicles and the preset decomposition targets into a pre-established knowledge graph, and outputting a time sequence constraint relation between one or more subtasks of the target tasks; generating unmanned vehicle allocation schemes according to the preset allocation parameters of each subtask and the time sequence constraint relation between the subtasks; performing multi-objective optimization on the total time consumption and the success rate of each allocation scheme by using preset scheme evaluation indexes to obtain the optimization results of the total time consumption and the success rate of each allocation scheme; determining the optimal allocation scheme among all the allocation schemes in combination with the manual selection intention; and sending the subtasks in the optimal allocation scheme to the corresponding unmanned vehicles. This solves the problems in the related art that complex tasks cannot be decomposed into executable subtasks and that allocation results are incomplete and unreasonable.

Description

Multi-unmanned vehicle task allocation method and device, vehicle and storage medium

Technical Field

The present disclosure relates to the field of multi-unmanned vehicle technologies, and in particular, to a method and an apparatus for task allocation for multi-unmanned vehicle, an electronic device, and a storage medium.

Background

The development of unmanned vehicle technology not only helps to make civil vehicles and the transportation field more intelligent, but is also widely applied in other scenes, including warehousing, manufacturing, ports and airports, to perform mobile transportation or other specific tasks, such as search and rescue after natural disasters. Unmanned vehicles are mobile robots, and task allocation is a complex key problem in multi-robot systems. Human-computer collaborative decision technology integrates human intelligence and machine intelligence, so that the applicability and efficiency of task allocation for multiple unmanned vehicles are improved.

Because multiple unmanned vehicles differ in type and characteristics, task allocation is inherently multi-objective, and the objectives often cannot all be optimal at the same time. Most current applications consider only a single objective or a simple superposition of multiple objectives, so that the allocation results are incomplete and unreasonable. In addition, intelligent automatic allocation of complex tasks cannot be realized: complex tasks cannot be decomposed into executable subtasks, and the decomposition cannot be organically combined with the subsequent task allocation.

Disclosure of Invention

The application provides a multi-unmanned vehicle task allocation method and device, a vehicle and a storage medium, and aims to solve the problems in the related art that complex tasks cannot be decomposed into executable subtasks and that allocation results are incomplete and unreasonable.

The embodiment of the first aspect of the application provides a multi-unmanned vehicle task allocation method, which comprises the following steps: acquiring target tasks of a plurality of unmanned vehicles and preset decomposition targets of the target tasks; inputting the target task and the preset decomposition target into a preset knowledge graph which is established in advance, and outputting a time sequence constraint relation between one or more subtasks of the target task; generating each unmanned vehicle distribution scheme according to the preset distribution parameters of each subtask and the time sequence constraint relation between each subtask, and performing multi-target optimization on the total time consumption and the success rate of each distribution scheme by using the preset scheme evaluation indexes to obtain the optimization results of the total time consumption and the success rate of each distribution scheme; and determining the optimal distribution scheme in all the distribution schemes according to the optimization result and the manual selection intention, and respectively sending the subtasks in the optimal distribution scheme to the corresponding unmanned vehicles.

Optionally, in an embodiment of the present application, the preset decomposition target includes one or more of a task type vector, a task position vector, and an action object vector, and the inputting the target task and the preset decomposition target into a preset knowledge graph established in advance, and outputting a time sequence constraint relationship between one or more subtasks of the target task includes: inputting one or more of the task type vector, the task position vector and the action object vector into the pre-established preset knowledge graph as the vector of each node of layer 0, and, after message passing, outputting the directed edges between nodes as predicted values of the time sequence relation constraints between subtasks; and determining the time sequence constraint relation among the one or more subtasks of the target task according to the predicted values and/or the manual modification intention.

Optionally, in an embodiment of the present application, the preset knowledge graph is trained on training data carrying manual task decomposition results, and the training includes: acquiring training data carrying a manual task decomposition result, wherein the manual task decomposition result comprises the actual time sequence relation constraint characteristics among subtasks; modeling a task decomposition knowledge graph by using a preset relational graph convolutional neural network, and performing directed edge prediction training by using the training data to obtain predicted time sequence relation constraint knowledge among subtasks; and calculating the training loss according to the actual time sequence relation constraint characteristics and the predicted time sequence relation constraint knowledge, and training iteratively until the training loss meets a stopping condition, to obtain the preset knowledge graph.

Optionally, in an embodiment of the present application, the performing multi-objective optimization on the total time consumption and the success rate of each allocation scheme by using preset scheme evaluation indexes to obtain an optimization result of the total time consumption and the success rate of each allocation scheme includes: encoding each allocation scheme by using a preset encoding mode to obtain a genotype encoding result of each allocation scheme; performing phenotype decoding on the genotype encoding result of each allocation scheme, thereby mapping each allocation scheme from the genotype to a phenotype, and randomly generating a plurality of feasible scheme solutions meeting preset constraint conditions as an initialization solution set; and performing generation-by-generation evolutionary calculation on the initialization solution set, optimizing the total time consumption and the success rate of the schemes, and obtaining the optimization result of the total time consumption and the success rate of each allocation scheme.

Optionally, in an embodiment of the present application, the determining an optimal allocation scheme of all allocation schemes according to the optimization result and the manual selection intention includes: matching the time consumption level and the success rate level of each allocation scheme according to the optimization result; determining a semantic description of each allocation scheme according to its time consumption level and success rate level, and acquiring the manual selection intention of a user based on the semantic description; and determining the optimal allocation scheme among all the allocation schemes according to the manual selection intention.

Optionally, in an embodiment of the present application, before determining an optimal allocation scheme of all allocation schemes according to the optimization result and the manual selection intention, the method includes: and screening one or more distribution schemes meeting preset conditions from all the distribution schemes by adopting a preset space equidistant principle, wherein the optimal distribution scheme is the distribution scheme in the one or more distribution schemes.

The embodiment of the second aspect of the application provides a multi-unmanned vehicle task allocation device, which comprises: an acquisition module, used for acquiring target tasks of a plurality of unmanned vehicles and preset decomposition targets of the target tasks; a processing module, used for inputting the target task and the preset decomposition target into a preset knowledge graph which is established in advance and outputting a time sequence constraint relation between one or more subtasks of the target task; an optimization module, used for generating unmanned vehicle allocation schemes according to the preset allocation parameters of each subtask and the time sequence constraint relation between the subtasks, and performing multi-objective optimization on the total time consumption and the success rate of each allocation scheme by using preset scheme evaluation indexes to obtain the optimization results of the total time consumption and the success rate of each allocation scheme; and an allocation module, used for determining the optimal allocation scheme among all the allocation schemes according to the optimization result and the manual selection intention, and sending the subtasks in the optimal allocation scheme to the corresponding unmanned vehicles.

Optionally, in an embodiment of the present application, the preset decomposition target includes one or more of a task type vector, a task position vector, and an action object vector, and the processing module is further configured to input one or more of the task type vector, the task position vector, and the action object vector into the preset knowledge graph established in advance as the vector of each node of layer 0, and, after message passing, output the directed edges between nodes as predicted values of the time sequence relation constraints between subtasks; and determine the time sequence constraint relation among the one or more subtasks of the target task according to the predicted values and/or the manual modification intention.

Optionally, in an embodiment of the present application, the device further includes: a training module, used for acquiring training data carrying a manual task decomposition result, wherein the manual task decomposition result comprises the actual time sequence relation constraint characteristics among subtasks; modeling a task decomposition knowledge graph by using a preset relational graph convolutional neural network, and performing directed edge prediction training by using the training data to obtain predicted time sequence relation constraint knowledge among subtasks; and calculating the training loss according to the actual time sequence relation constraint characteristics and the predicted time sequence relation constraint knowledge, and training iteratively until the training loss meets a stopping condition, to obtain the preset knowledge graph.

Optionally, in an embodiment of the present application, the optimization module is further configured to encode each allocation scheme by using a preset encoding mode to obtain a genotype encoding result of each allocation scheme; perform phenotype decoding on the genotype encoding result of each allocation scheme, thereby mapping each allocation scheme from the genotype to a phenotype, and randomly generate a plurality of feasible scheme solutions meeting preset constraint conditions as an initialization solution set; and perform generation-by-generation evolutionary calculation on the initialization solution set, optimizing the total time consumption and the success rate of the schemes, to obtain the optimization result of the total time consumption and the success rate of each allocation scheme.

Optionally, in an embodiment of the present application, the allocation module is further configured to match the time consumption level and the success rate level of each allocation scheme according to the optimization result; determine a semantic description of each allocation scheme according to its time consumption level and success rate level, and acquire the manual selection intention of a user based on the semantic description; and determine the optimal allocation scheme among all the allocation schemes according to the manual selection intention.

Optionally, in an embodiment of the present application, the device further includes: a screening module, used for screening one or more allocation schemes meeting preset conditions from all the allocation schemes by adopting a preset space equidistance principle before the optimal allocation scheme among all the allocation schemes is determined according to the optimization result and the manual selection intention, wherein the optimal allocation scheme is one of the one or more screened allocation schemes.

An embodiment of a third aspect of the present application provides a vehicle, comprising: the system comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the multi-unmanned vehicle task allocation method according to the embodiment.

A fourth aspect of the present application provides a computer-readable storage medium, on which a computer program is stored, where the program is executed by a processor to implement the method for assigning tasks to multiple unmanned vehicles as described in the above embodiments.

Therefore, the application has at least the following beneficial effects:

According to the method and the device, the acquired target tasks of the multiple unmanned vehicles and the preset decomposition targets can be input into a pre-established preset knowledge graph to predict the time sequence relation constraints of the task decomposition; unmanned vehicle allocation schemes are generated by combining the allocation parameters of each subtask; multi-objective optimization is carried out on the total time and the success rate of the task allocation schemes by means of designed scheme evaluation indexes; and the optimal allocation scheme is determined in combination with the manual selection intention and handed to the multiple unmanned vehicles for cooperative execution. By introducing a human into the multi-unmanned vehicle task allocation decision-making loop, performing man-machine cooperative task decomposition based on a knowledge graph, and optimizing task allocation with multiple objectives in mind, efficient cooperative decision-making in which the machine assists the human is realized. This solves the problems in the related art that complex tasks cannot be decomposed into executable subtasks and that allocation results are incomplete and unreasonable.

Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.

Drawings

The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:

FIG. 1 is a flowchart of a task allocation method for multiple unmanned vehicles according to an embodiment of the present application;

FIG. 2 is a directed graph of sub-tasks and their timing relationship constraints provided in accordance with an embodiment of the present application;

FIG. 3 is a schematic diagram of knowledge graph modeling provided in accordance with an embodiment of the present application;

FIG. 4 is a general schematic diagram of a multi-unmanned vehicle task allocation method provided according to an embodiment of the application;

FIG. 5 is a rescue subtask and its timing relationship constraint directed graph provided in accordance with an embodiment of the present application;

FIG. 6 is a Gantt chart of a possible scenario provided according to an embodiment of the present application;

FIG. 7 is a schematic diagram of Monte Carlo simulation times provided in accordance with an embodiment of the present application;

FIG. 8 is a multi-objective optimization scheme solution set dot diagram provided in accordance with an embodiment of the present application;

FIG. 9 is a filtered allocation scheme provided in accordance with an embodiment of the present application;

FIG. 10 is a block diagram of a multi-unmanned vehicle task allocation device according to an embodiment of the present application;

FIG. 11 is a schematic structural diagram of a vehicle according to an embodiment of the present application.

Description of reference numerals: the system comprises an acquisition module-100, a processing module-200, an optimization module-300, a distribution module-400, a memory-1101, a processor-1102 and a communication interface-1103.

Detailed Description

Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.

A multi-unmanned vehicle task allocation method, apparatus, vehicle, and storage medium according to embodiments of the present application are described below with reference to the accompanying drawings. In order to solve the problems mentioned in the background, the application provides a multi-unmanned vehicle task allocation method. In the method, the acquired target tasks of a plurality of unmanned vehicles and the preset decomposition targets are input into a pre-established preset knowledge graph to predict the time sequence relation constraints of the task decomposition; unmanned vehicle allocation schemes are generated by combining the allocation parameters of each subtask; multi-objective optimization is carried out on the total time and the success rate of the task allocation schemes by means of designed scheme evaluation indexes; and an optimal allocation scheme is determined in combination with the manual selection intention and handed to the multiple unmanned vehicles for cooperative execution. A human is introduced into the multi-unmanned vehicle task allocation decision-making loop, man-machine cooperative task decomposition is carried out based on a knowledge graph, and task allocation is optimized with multiple objectives in mind, so that efficient cooperative decision-making in which the machine assists the human is realized. This solves the problems in the related art that complex tasks cannot be decomposed into executable subtasks and that allocation results are incomplete and unreasonable.

Specifically, fig. 1 is a schematic flow chart of a method for allocating tasks to multiple unmanned vehicles according to an embodiment of the present application.

As shown in fig. 1, the task allocation method for multiple unmanned vehicles comprises the following steps:

In step S101, target tasks of a plurality of unmanned vehicles and preset decomposition targets of the target tasks are acquired.

The preset decomposition target can comprise one or more of a task type vector, a task position vector and an action object vector.

It can be understood that the embodiment of the application adopts a man-machine cooperative decision idea: the target tasks of the multiple unmanned vehicles and the preset decomposition targets of the target tasks are acquired, a human is introduced into the multi-unmanned vehicle task allocation decision-making loop to carry out man-machine cooperative task decomposition, and task allocation is optimized with multiple objectives in mind, so that efficient cooperative decision-making in which the machine assists the human is realized.

In step S102, the target task and the preset decomposition target are input into a preset knowledge graph established in advance, and a time sequence constraint relationship between one or more subtasks of the target task is output.

According to the task decomposition method and device, based on the time sequence constraint relation among one or more subtasks of the target task, man-machine cooperation type task decomposition is achieved, and the intelligent and automatic level of task decomposition is improved.

In an embodiment of the present application, inputting the target task and the preset decomposition target into a preset knowledge graph established in advance, and outputting a time sequence constraint relationship between one or more subtasks of the target task, includes: inputting one or more of a task type vector, a task position vector and an action object vector into the preset knowledge graph established in advance as the vector of each node of layer 0, and, after message passing, outputting the directed edges between nodes as predicted values of the time sequence relation constraints between subtasks; and determining the time sequence constraint relation among the one or more subtasks of the target task according to the predicted values and/or a manual modification intention.

It can be understood that, in actual application, after obtaining perception and situation information, a person only needs to perform part of the task decomposition work: explicitly giving the task type vector t, the task position vector p and the action object vector o of each subtask. These are input into the knowledge graph as the layer-0 vector of each node. After message passing, the directed edges between nodes are output as predicted values of the time sequence relation constraints between subtasks, and the person decides whether to adopt the predicted values or to make necessary modifications, thereby determining the time sequence constraint relation among the one or more subtasks of the target task.
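The final decision step described above (adopt the predicted edges, or let a person modify them) can be sketched as follows. This is a minimal illustration only: the score matrix, the 0.5 threshold, the `overrides` dictionary and the function name `decide_constraints` are assumptions made for the example and do not come from the application, which does not specify how predicted values are converted into constraints.

```python
from typing import Dict, List, Tuple


def decide_constraints(
    predicted: List[List[float]],
    overrides: Dict[Tuple[int, int], int],
    threshold: float = 0.5,
) -> List[List[int]]:
    """Turn predicted edge scores into a 0-1 constraint matrix, then apply human edits.

    predicted[i][j] is a hypothetical score that subtask i must precede subtask j.
    overrides maps (i, j) to 0 or 1 and represents the manual modification intention.
    """
    k = len(predicted)
    g = [
        [1 if i != j and predicted[i][j] >= threshold else 0 for j in range(k)]
        for i in range(k)
    ]
    for (i, j), value in overrides.items():
        g[i][j] = value  # the human decision always wins over the prediction
    return g


# Hypothetical scores for three subtasks (0-based numbering).
predicted = [
    [0.0, 0.9, 0.2],
    [0.1, 0.0, 0.6],
    [0.0, 0.3, 0.0],
]
# The operator rejects the predicted edge 1 -> 2 and adds the edge 0 -> 2.
overrides = {(1, 2): 0, (0, 2): 1}
g_d = decide_constraints(predicted, overrides)
```

Keeping the overrides separate from the prediction makes it easy to record which constraints came from the model and which from the operator.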

In step S103, each unmanned vehicle allocation scheme is generated according to the preset allocation parameters of each subtask and the time sequence constraint relationship between each subtask, and multi-objective optimization is performed on the total time consumption and the success rate of each allocation scheme by using the preset scheme evaluation indexes, so as to obtain the optimization results of the total time consumption and the success rate of each allocation scheme.

The preset allocation parameters may include: the task allocation relation, the subtask execution time and the time sequence relation among subtasks.

After the task decomposition, each subtask and its time sequence relation constraints are obtained. In the embodiment of the present application, they can be represented by a directed graph, as shown in fig. 2, in which each block represents a subtask and each arrow points to the subtask that comes later in the time sequence.

According to the embodiment of the application, based on the distribution parameters of each subtask and the time sequence constraint relation between each subtask, a model can be established, and a task distribution feasible scheme can be generated and expressed in a Gantt chart form, which is specifically as follows.

The subtasks and their time sequence relation constraints are modeled with a directed graph adjacency matrix, as shown in the following formula:

$$G_d = [g_{ij}]_{K \times K}$$

where $G_d$ represents the directed graph adjacency matrix and $g_{ij}$ is a 0-1 variable: 0 means that there is no time sequence relation constraint between subtask i and subtask j, and 1 means that subtask i is performed before subtask j; K is the total number of subtasks.
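As an illustration of the adjacency matrix just defined, the sketch below builds a small constraint matrix from a list of precedence edges and checks that the constraints contain no cycle, since a cyclic set of precedence constraints could never be scheduled. The 0-based subtask numbering, the example edges and the function names are assumptions made for the example; the acyclicity check is an added sanity check, not a step stated in the application.

```python
from collections import deque
from typing import List, Tuple


def build_adjacency(k: int, edges: List[Tuple[int, int]]) -> List[List[int]]:
    """Return the K x K matrix with g[i][j] = 1 when subtask i must precede subtask j."""
    g = [[0] * k for _ in range(k)]
    for i, j in edges:
        g[i][j] = 1
    return g


def is_acyclic(g: List[List[int]]) -> bool:
    """Kahn's algorithm: the constraints are consistent only if every node can be ordered."""
    k = len(g)
    indegree = [sum(g[i][j] for i in range(k)) for j in range(k)]
    queue = deque(j for j in range(k) if indegree[j] == 0)
    visited = 0
    while queue:
        i = queue.popleft()
        visited += 1
        for j in range(k):
            if g[i][j]:
                indegree[j] -= 1
                if indegree[j] == 0:
                    queue.append(j)
    return visited == k


# Four hypothetical subtasks: 0 precedes 1 and 2, and both precede 3.
g_d = build_adjacency(4, [(0, 1), (0, 2), (1, 3), (2, 3)])
```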

The task allocation purpose is to allocate each subtask to each unmanned vehicle for execution, and the task allocation relationship can be represented by a matrix as follows:

$$A = [a_{ij}]_{N \times K}$$

where A represents the task allocation relation matrix and $a_{ij}$ is a 0-1 variable: 0 means that unmanned vehicle i does not execute subtask j, and 1 means that it does; N is the total number of unmanned vehicles.

Based on the task allocation relation, the execution time of each subtask is estimated in the embodiment of the application as the sum of two parts: the mobility time and the task completion time. After the map and the situation information are obtained, the position of the unmanned vehicle and the position of the task are known, and the mobility time can be calculated by combining the kinematic characteristics of the unmanned vehicle with a path planning algorithm such as RRT or A*. The task completion time varies with the type of task and may be given based on experience or historical data. A one-dimensional array is used to represent the execution time of each subtask, as shown in the following formula:

$$T = [t_j]_{1 \times K}$$

where T represents the execution time array of the subtasks and $t_j$ is the execution time of subtask j.

The directed graph adjacency matrix only represents the time sequence relation constraints that must be satisfied between subtasks, and the task allocation relation matrix does not represent any time sequence relation between subtasks. A one-dimensional array is therefore additionally used to represent the time sequence relation among subtasks, that is, the execution priority of each subtask:

$$O = [o_j]_{1 \times K}$$

where O represents the time sequence relation array among subtasks, $1 \le j \le K$, and each entry $o_j$ is the number of a subtask. A subtask that appears earlier in the array has a higher execution priority in the time sequence. Note that O must not violate the time sequence relation constraints $G_d$ obtained from the task decomposition.
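The requirement that the priority array must not violate the constraint matrix can be checked mechanically. The following is a minimal sketch, assuming 0-based subtask numbers and a hypothetical helper name `respects_constraints`; it simply verifies that every subtask appears after all of its predecessors.

```python
from typing import List


def respects_constraints(order: List[int], g: List[List[int]]) -> bool:
    """True if, for every constraint g[i][j] = 1, subtask i appears before subtask j."""
    position = {task: idx for idx, task in enumerate(order)}
    k = len(g)
    return all(
        position[i] < position[j]
        for i in range(k)
        for j in range(k)
        if g[i][j]
    )


# Hypothetical constraints: 0 precedes 1 and 2, and both precede 3.
g_d = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
valid = respects_constraints([0, 2, 1, 3], g_d)
invalid = respects_constraints([1, 0, 2, 3], g_d)
```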

Once the directed graph adjacency matrix $G_d$, the task allocation relation matrix A, the execution time array T of the subtasks and the time sequence relation array O among the subtasks are given, the task allocation scheme is uniquely determined. A scheme that satisfies the time sequence relation constraints of the task decomposition is called a feasible task allocation scheme. Its key content consists of two parts, the task allocation relation and the time sequence relation, which can be represented simultaneously and visually in the form of a Gantt chart.

To arrange the task allocation and time sequence relations given by $G_d$, A, T and O into a constraint-satisfying Gantt chart of a feasible task allocation scheme, the start time of each subtask needs to be calculated. The embodiment of the application can adopt the serial schedule generation scheme (SSGS), which comprises the following steps:

1) Selecting the subtasks $o_j$ (subtask numbers) from O in sequence;

2) Obtaining the executing vehicles required by the subtask from A, and the execution time required by the subtask from T;

3) Setting the start time of the subtask to the earliest feasible time that violates neither the time sequence relation constraints $G_d$ nor the resource constraint that an unmanned vehicle can execute only one subtask at a time;

4) Selecting the next subtask in O, and repeating the steps serially until the start times of all subtasks have been generated.

Supplementary notes on step 3): when subtask $o_j$ is being scheduled, all of its predecessor subtasks have already been scheduled. The end times of all predecessor subtasks are sorted, and the latest end time is the earliest time at which the current subtask can be scheduled. The time interval that the subtask would occupy is intersected with the time already occupied on each of its executing vehicles, and the resource constraint is met if every intersection is empty. If the resource constraint is met, the subtask is scheduled at that earliest time; if it is not met, the end time of the next subtask in chronological order is taken as the candidate start time and the check is repeated, until the resource constraint is met and the subtask is scheduled.
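The serial scheduling steps above can be sketched in a few lines of Python. This is an illustrative reading of the procedure, not the applicant's implementation: the function name `serial_sgs`, the 0-based numbering, the per-vehicle list of busy intervals and the example data are all assumptions. Candidate start times are the precedence-ready time followed by the later end times of intervals already booked on the required vehicles, which mirrors the "try the next end time" rule in the notes on step 3).

```python
from typing import Dict, List, Tuple

Matrix = List[List[int]]


def serial_sgs(
    g: Matrix, a: Matrix, t: List[float], order: List[int]
) -> Tuple[Dict[int, float], Dict[int, float]]:
    """Compute start and finish times for each subtask, taken in the order given by `order`.

    g[i][j] = 1: subtask i precedes subtask j.  a[v][j] = 1: vehicle v executes subtask j.
    `order` must respect g, otherwise a predecessor's finish time is missing (KeyError).
    """
    k = len(t)
    start: Dict[int, float] = {}
    finish: Dict[int, float] = {}
    busy: List[List[Tuple[float, float]]] = [[] for _ in a]  # booked intervals per vehicle
    for task in order:
        vehicles = [v for v in range(len(a)) if a[v][task]]
        # Precedence: the task is ready once its latest predecessor has finished.
        ready = max((finish[i] for i in range(k) if g[i][task]), default=0.0)
        # Resource: try the ready time, then each later end time on the required vehicles.
        ends = {end for v in vehicles for (_, end) in busy[v] if end > ready}
        for s in sorted({ready} | ends):
            e = s + t[task]
            if all(e <= b0 or s >= b1 for v in vehicles for (b0, b1) in busy[v]):
                break  # [s, e) overlaps no booked interval; the last candidate always fits
        start[task], finish[task] = s, e
        for v in vehicles:
            busy[v].append((s, e))
    return start, finish


# Hypothetical example: 4 subtasks, 2 vehicles; subtask 3 needs both vehicles.
g_d = [[0, 1, 1, 0], [0, 0, 0, 1], [0, 0, 0, 1], [0, 0, 0, 0]]
A = [[1, 1, 0, 1], [0, 0, 1, 1]]
T = [2.0, 3.0, 4.0, 1.0]
O = [0, 1, 2, 3]
start, finish = serial_sgs(g_d, A, T, O)
total_time = max(finish.values())
```

The resulting `start` and `finish` dictionaries contain exactly the information needed to draw one bar per subtask on each executing vehicle's row of a Gantt chart.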

Furthermore, the embodiment of the application can design scheme evaluation indexes, establish a multi-objective optimization problem, solve task allocation optimization schemes by adopting an evolutionary algorithm, and perform multi-objective optimization on the total time consumption and the scheme success rate of each allocation scheme to obtain a final optimization result.

Specifically, the total time of a scheme refers to the completion time of the whole scheme, that is, the end time of the last subtask. The scheme success rate mainly reflects the robustness of scheme execution: the planned time consumption and the actual time consumption of each subtask may differ, which can affect the smooth execution of the whole scheme. Providing a certain redundancy in the time consumption estimates improves the robustness of the scheme, so that the whole scheme copes better with uncertainty and the scheme success rate is improved.

Calculating the total time of a scheme first requires estimating the execution time of each subtask, which is divided into the mobility time and the task completion time.

After the map and the situation information are obtained, the position of the unmanned vehicle and the position of the task are known, and the mobility time can be calculated by combining the kinematic characteristics of the unmanned vehicle with a path planning algorithm such as RRT or A*; the task completion time varies with the type of task and may be given based on experience or historical data. The total time of the scheme is estimated as follows:

1) Evaluate the map and the situation information, set a risk threshold, and mark every area of the map whose risk is higher than the threshold as an impassable area, to be treated as an obstacle;

2) Determine the current position and the target position of the unmanned vehicle, roughly plan a path from the current position to the target position with a path planning algorithm such as RRT or A*, and carry out speed planning in combination with the kinematic model of the unmanned vehicle;

3) Calculate the mobility time for the unmanned vehicle to move from the current position to the target position according to the path and the speed;

4) According to the type of the task to be completed, give the task completion time based on experience or historical data, and add the task completion time to the mobility time to obtain the execution time of the subtask, from which the total time of the scheme is obtained.
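The four steps above can be sketched as follows. This is only an illustration under stated assumptions: the map is a small grid of risk values, the path planner is a 4-connected A* search, and speed planning is reduced to a constant speed. The names `astar_path_length`, `subtask_time`, `cell_size_m`, and `speed_mps` are hypothetical and do not come from the application.

```python
import heapq

def astar_path_length(risk, start, goal, risk_threshold):
    """Length (in cells) of the shortest 4-connected path, or None if unreachable.

    Cells whose risk exceeds risk_threshold are treated as impassable (step 1).
    """
    rows, cols = len(risk), len(risk[0])

    def passable(r, c):
        return 0 <= r < rows and 0 <= c < cols and risk[r][c] <= risk_threshold

    def heuristic(cell):
        # Manhattan distance is admissible for 4-connected unit-cost moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    if not (passable(*start) and passable(*goal)):
        return None
    open_heap = [(heuristic(start), 0, start)]
    best = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            return g
        if g > best[cell]:
            continue  # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cell[0] + dr, cell[1] + dc)
            if passable(*nxt) and g + 1 < best.get(nxt, float("inf")):
                best[nxt] = g + 1
                heapq.heappush(open_heap, (g + 1 + heuristic(nxt), g + 1, nxt))
    return None

def subtask_time(risk, start, goal, risk_threshold,
                 cell_size_m, speed_mps, task_time_s):
    """Mobility time (steps 2 and 3) plus task completion time (step 4)."""
    steps = astar_path_length(risk, start, goal, risk_threshold)
    if steps is None:
        return None
    mobility_time = steps * cell_size_m / speed_mps  # constant-speed assumption
    return mobility_time + task_time_s

# Hypothetical 3x3 risk map: the middle column is too risky except the bottom row.
risk_map = [[0.1, 0.9, 0.1],
            [0.1, 0.9, 0.1],
            [0.1, 0.1, 0.1]]
total = subtask_time(risk_map, (0, 0), (0, 2), risk_threshold=0.5,
                     cell_size_m=10.0, speed_mps=2.0, task_time_s=60.0)
```

A real implementation would replace the constant speed with a speed profile derived from the vehicle kinematic model, as step 2 describes.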

The embodiment of the application can calculate the scheme success rate by Monte Carlo simulation, which approximates the true solution of the problem with a large number of random samples. The estimated execution time of each subtask is defined as the reference duration, the duration actually allotted to the subtask when the scheme is generated is defined as the planned duration, and the actual duration of the subtask drawn in a Monte Carlo simulation is defined as the simulated duration. The simulated duration of each subtask is generated from the normal distribution, the most common distribution in practice (any other probability distribution can also be used, based on real data), with the reference duration as the mathematical expectation of that normal distribution. For the whole task allocation scheme, the schedule is generated once with the planned durations and once with the simulated durations, which gives two different total times. When calculating the scheme success rate, the scheme is considered to be executed successfully if its total time under the simulated durations does not exceed its total time under the planned durations; otherwise the execution is considered to have failed. After a large number of Monte Carlo simulations, the scheme success rate is calculated as follows:

$$SR = \frac{1}{N}\sum_{n=1}^{N} p_n$$

where SR is the scheme success rate, $p_n$ is an indicator variable (1 if true, 0 if false) of whether, in the n-th Monte Carlo simulation, the total time of the scheme under the simulated durations does not exceed the total time of the scheme under the planned durations, and N is the total number of Monte Carlo simulations.
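A minimal sketch of this Monte Carlo estimate is given below. It assumes that the standard deviation of each simulated duration is a fixed fraction of its reference duration and that the schedule is summarized by a caller-supplied `makespan` function. Both choices, and the names `success_rate` and `sigma_ratio`, are illustrative assumptions rather than details taken from the application.

```python
import random

def success_rate(reference, planned, makespan,
                 sigma_ratio=0.1, n_sim=2000, seed=0):
    """Estimate SR = (1/N) * sum(p_n) by Monte Carlo simulation.

    reference: reference duration of each subtask (mean of the normal law)
    planned:   planned duration of each subtask (reference plus redundancy)
    makespan:  function mapping a list of durations to the scheme total time
    """
    rng = random.Random(seed)
    planned_total = makespan(planned)
    hits = 0
    for _ in range(n_sim):
        # Simulated durations: normal around the reference, clipped at zero.
        simulated = [max(0.0, rng.gauss(mu, sigma_ratio * mu))
                     for mu in reference]
        # p_n = 1 when the simulated total does not exceed the planned total.
        hits += makespan(simulated) <= planned_total
    return hits / n_sim

# Hypothetical usage: a purely serial schedule, so the total time is the sum.
reference = [5.5, 5.5, 3.5, 6.0]
no_margin = success_rate(reference, reference, makespan=sum)
with_margin = success_rate(reference, [1.5 * d for d in reference], makespan=sum)
```

With no redundancy the estimate should sit near one half, and adding redundancy to the planned durations should push it toward one, which is the trade-off against total time that the optimization explores.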

In an embodiment of the present application, performing multi-objective optimization on the total time consumption and the success rate of each allocation scheme by using the preset scheme evaluation indexes, to obtain the optimization result of the total time consumption and the success rate of each allocation scheme, includes: encoding each allocation scheme with a preset encoding method to obtain the genotype encoding result of each allocation scheme; performing phenotype decoding on the genotype encoding result of each allocation scheme, so that each allocation scheme is mapped from its genotype to its phenotype, and randomly generating a plurality of feasible scheme solutions that satisfy preset constraint conditions as the initialization solution set; and performing evolutionary computation on the initialization solution set generation by generation to optimize the total time consumption and the success rate of the schemes, obtaining the optimization result of the total time consumption and the success rate of each allocation scheme.

It can be understood that, in the embodiment of the present application, the NSGA-II algorithm may be adopted to perform multi-objective optimization on the total time consumption and the scheme success rate of the task allocation scheme, so that the total time consumption and the success rate of each allocation scheme are optimal.

Specifically, since the total time and the success rate of a task allocation scheme conflict with each other and cannot both reach their optimum at the same time, the Pareto-optimal solutions are computed using the evolution-based multi-objective optimization algorithm NSGA-II, as detailed below.

1) Genotype encoding. The independent variables of the scheme, which are the independent variables of the multi-objective optimization problem, are encoded as the genotype in order to facilitate random sampling. When the subtask allocation relation A, the execution time T, or the timing relation O changes, the resulting scheme changes as well. These quantities are therefore the independent variables of the multi-objective optimization problem, are called the genotype in the NSGA-II algorithm, and require an appropriate encoding. The encoding methods and data types are shown in Table 1, the optimization variable encoding table.

TABLE 1

Optimization variable	Encoding method	Data type
Unmanned vehicle-subtask numbering	Permutation encoding	N×K integer array
Number of unmanned vehicles required by subtasks	Integer encoding	1×K integer array
Execution time T	Real-number encoding	1×K real array
Timing relation O	Integer encoding	1×K integer array

The execution time T and the timing relation O can be encoded directly in a suitable manner. The subtask allocation relation A is not easy to encode directly, so it is encoded indirectly through three variables: the unmanned vehicle-subtask numbering, the number of unmanned vehicles required by each subtask, and the timing relation O. For each subtask, the unmanned vehicles are assigned a random permutation of the numbers 1 to N, which represents their priority for executing that subtask; the larger the number, the higher the priority. Subtasks are selected in turn according to the timing relation O, the corresponding permutation of unmanned vehicle numbers is looked up in the unmanned vehicle-subtask numbering, and unmanned vehicles are chosen in priority order, up to the number required by the subtask, to execute it. In the subtask allocation relation A, the unmanned vehicle-subtask elements that have an allocation relation are then set to 1 and all others to 0, which completes the indirect encoding of A.
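The indirect encoding described above can be illustrated with a short decoding sketch. It assumes that every vehicle is able to perform every subtask, in which case each column of the numbering can be decoded independently and the visiting order O does not change the result. The function name `decode_assignment` and the sample arrays are hypothetical.

```python
def decode_assignment(priority, required):
    """Decode the N x K vehicle-subtask numbering into the 0/1 relation A.

    priority[i][k]: priority number (1..N) of vehicle i for subtask k,
                    where a larger number means a higher priority
    required[k]:    number of vehicles that subtask k needs
    """
    n_vehicles, n_tasks = len(priority), len(priority[0])
    A = [[0] * n_tasks for _ in range(n_vehicles)]
    for k in range(n_tasks):
        # Rank the vehicles for subtask k from highest to lowest priority.
        ranked = sorted(range(n_vehicles),
                        key=lambda i: priority[i][k], reverse=True)
        for i in ranked[:required[k]]:
            A[i][k] = 1
    return A

# Hypothetical genotype: 3 vehicles (rows) and 4 subtasks (columns).
priority = [[3, 1, 2, 1],
            [1, 3, 1, 2],
            [2, 2, 3, 3]]
required = [1, 2, 1, 1]
A = decode_assignment(priority, required)
```

If vehicles had different capabilities or limited availability, the decoding would have to visit the subtasks in the order given by O and skip unsuitable vehicles, which this sketch does not model.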

2) Phenotype decoding. The task allocation scheme and its corresponding Gantt chart are regarded as the phenotype in the algorithm, and the genotype is mapped to the phenotype through a decoding operation.

The emphasis of phenotype decoding is to highlight the properties of the task allocation. In the embodiment of the application, the serial schedule generation scheme (SSGS) algorithm can be used to arrange the task allocation and timing relations expressed by A, T, and O, subject to the constraints, into a Gantt chart of the task allocation scheme, which yields the phenotype.
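A serial schedule generation pass can be sketched as below. Each subtask, taken in the order O, starts as soon as all of its predecessors have finished and all of its assigned vehicles are free. The allocation matrix and the predecessor lists in the example are hypothetical, since the corresponding matrices are not reproduced in this text, and the name `serial_sgs` is an assumption.

```python
def serial_sgs(order, durations, A, predecessors):
    """Serial schedule generation: return (start, finish, makespan).

    order:        subtask indices (0-based) in scheduling order O
    durations:    execution time T of each subtask
    A:            A[i][k] == 1 if vehicle i executes subtask k
    predecessors: predecessors[k] lists the subtasks that must finish before k
    The order is assumed to respect the precedence constraints.
    """
    n_vehicles = len(A)
    vehicle_free = [0.0] * n_vehicles  # time at which each vehicle is idle again
    start, finish = {}, {}
    for k in order:
        crew = [i for i in range(n_vehicles) if A[i][k]]
        earliest = max([0.0]
                       + [finish[p] for p in predecessors[k]]
                       + [vehicle_free[i] for i in crew])
        start[k] = earliest
        finish[k] = earliest + durations[k]
        for i in crew:
            vehicle_free[i] = finish[k]
    return start, finish, max(finish.values())

# Hypothetical instance with 3 vehicles and 4 subtasks.
T = [5.5, 5.5, 3.5, 6.0]
O = [0, 2, 1, 3]                      # 0-based form of the order 1, 3, 2, 4
A = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 1, 1, 1]]
preds = {0: [], 1: [0], 2: [0], 3: [1, 2]}
start, finish, makespan = serial_sgs(O, T, A, preds)
```

The `start` and `finish` dictionaries contain what is needed to draw one Gantt bar per subtask and vehicle, and `makespan` is the total time of the scheme.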

3) Initialization. A series of feasible scheme solutions that satisfy the constraints are randomly generated as the initialization solution set.

The feasible scheme solution set is obtained by random sampling under the subtask timing constraints produced by the human-machine cooperative decision. A loop comparison function in the program checks whether a random timing relation O satisfies the subtask timing constraints represented by the fixed adjacency matrix $G_d$; O is retained if it satisfies the constraints and discarded otherwise, until the number of feasible scheme solutions reaches the initial population size $N_{pop}$.
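The retain-or-discard sampling loop can be sketched as follows, assuming that `G_d[i][j] == 1` means subtask i must precede subtask j. The function names and the `max_tries` safeguard are illustrative additions, not part of the application.

```python
import random

def satisfies(order, G_d):
    """Loop comparison: does the order respect every edge of G_d?"""
    position = {task: idx for idx, task in enumerate(order)}
    n = len(G_d)
    for i in range(n):
        for j in range(n):
            if G_d[i][j] == 1 and position[i] > position[j]:
                return False
    return True

def sample_feasible_orders(G_d, n_pop, seed=0, max_tries=100000):
    """Rejection sampling of timing relations O until n_pop are feasible."""
    rng = random.Random(seed)
    tasks = list(range(len(G_d)))
    population = []
    for _ in range(max_tries):
        if len(population) == n_pop:
            break
        candidate = tasks[:]
        rng.shuffle(candidate)
        if satisfies(candidate, G_d):   # keep feasible orders, drop the rest
            population.append(candidate)
    return population

# Hypothetical constraints: 0 before 1 and 2, and both 1 and 2 before 3.
G_d = [[0, 1, 1, 0],
       [0, 0, 0, 1],
       [0, 0, 0, 1],
       [0, 0, 0, 0]]
population = sample_feasible_orders(G_d, n_pop=10)
```

Rejection sampling becomes slow when the constraints are tight, because most random permutations are discarded. A topological-order sampler would be a possible alternative in that case.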

4) Generational evolution. Evolutionary computation is carried out generation by generation to optimize the total time and the success rate of the scheme, and the Pareto front solution set is output when the computation finishes.

First, the current offspring solution set is subjected to non-dominated sorting with the objectives of minimum total time and maximum success rate; both indexes are calculated as described in the above embodiment. After sorting, the crowding distance is calculated, selection is performed according to the elitist strategy, and the best individuals are retained to form a new parent population. Selection, crossover, and mutation are applied to the new parent population to generate a new offspring population, and the generation count is incremented until the preset total number of generations $N_{evo}$ is reached, at which point the corresponding solution set is output as the Pareto front solution set.
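The non-dominated sorting step can be illustrated with the simplified sketch below, which ranks schemes described by the pair (total time, success rate). It uses a plain quadratic comparison rather than the bookkeeping of the fast non-dominated sort in NSGA-II, and it omits crowding distance, selection, crossover, and mutation. The sample points are invented.

```python
def dominates(a, b):
    """True if scheme a is no worse than b in both objectives and better in one.

    Each scheme is (total_time, success_rate): time is minimized,
    success rate is maximized.
    """
    return (a[0] <= b[0] and a[1] >= b[1]) and (a[0] < b[0] or a[1] > b[1])

def non_dominated_sort(points):
    """Split scheme indices into fronts; the first front is the Pareto front."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i])
                       for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts

# Hypothetical (total time, success rate) pairs for five schemes.
schemes = [(20.5, 0.50), (22.0, 0.80), (25.0, 0.95), (23.0, 0.70), (26.0, 0.90)]
fronts = non_dominated_sort(schemes)
```

In this example the first three schemes are mutually non-dominated, while the fourth is dominated by the second and the fifth by the third, so they fall into a second front.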

In step S104, the optimal allocation plan of all allocation plans is determined according to the optimization result and the manual selection intention, and the subtasks in the optimal allocation plan are respectively assigned to the corresponding unmanned vehicles.

It can be understood that the embodiment of the application can draw on both human intelligence and machine intelligence: the optimal allocation scheme is determined through human-machine hybrid decision-making by the person and the decision machine, and is then executed cooperatively by the multiple unmanned vehicles.

In one embodiment of the present application, determining an optimal allocation scheme of all allocation schemes according to the optimization result and the manual selection intention includes: matching the time consumption grade and the success rate grade of each distribution scheme according to the optimization result; determining semantic description of each distribution scheme according to the time-use level and the success rate level of each distribution scheme, and acquiring a manual selection intention of a user based on the semantic description; and determining the optimal allocation scheme in all allocation schemes according to the manual selection intention.

Specifically, the embodiment of the application can grade the rapidity (the inverse index of total time) and the success rate of each allocation scheme, for example as high, medium, and low (or on a finer scale with more levels), and convert the grades into brief semantic descriptions that help a person better understand the machine decision results. Finally, the person selects the optimal task allocation scheme by referring to the brief semantic descriptions of the feature schemes provided by the decision machine.
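One possible way to turn the two indexes into grades and short descriptions is sketched below. The application does not state how the grade boundaries are chosen, so the equal-width bins over the observed range, the wording of the descriptions, and the names `grade` and `describe` are all assumptions made for illustration.

```python
def grade(values, higher_is_better=True, labels=("low", "medium", "high")):
    """Assign each value a label using equal-width bins over the observed range."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0            # avoid division by zero when all equal
    grades = []
    for v in values:
        score = (v - lo) / span if higher_is_better else (hi - v) / span
        idx = min(int(score * len(labels)), len(labels) - 1)
        grades.append(labels[idx])
    return grades

def describe(total_times, success_rates):
    """Brief semantic description of each scheme for the human decision maker."""
    rapidity = grade(total_times, higher_is_better=False)  # short time = rapid
    robustness = grade(success_rates)
    return [f"rapidity: {r}, success rate: {s}"
            for r, s in zip(rapidity, robustness)]

# Hypothetical feature schemes.
descriptions = describe([20.0, 25.0, 30.0], [0.50, 0.70, 0.95])
```

Fixed thresholds agreed with the operators, or quantile-based bins, would be equally plausible choices. The point of the sketch is only the mapping from numbers to labels.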

In one embodiment of the present application, before determining an optimal allocation scheme of all allocation schemes according to the optimization result and the manual selection intention, the method includes: and screening one or more distribution schemes meeting preset conditions from all the distribution schemes by adopting a preset space equidistant principle, wherein the optimal distribution scheme is the distribution scheme in the one or more distribution schemes.

After the multi-objective optimization of the total time and the success rate of the task allocation schemes has been solved, a Pareto front solution set of $N_{pop}$ scheme solutions is obtained, from which the embodiment of the application can screen one or more allocation schemes using the spatial equidistance principle.

Specifically, the embodiment of the application can use the spatial equidistance principle to screen out $N_{final}$ ($3 \le N_{final} \le 5$) feature schemes and present them to a person for the final decision. The screening process is as follows.

1) The two extreme solutions, the one with the minimum total time and the one with the maximum success rate, are retained and numbered 1 and $N_{pop}$ respectively;

2) According to the characteristics of the non-dominated solution set on the Pareto front, the remaining $N_{pop} - 2$ solutions can be numbered uniquely and in sequence by their proximity to the minimum-total-time solution (number 1) or to the maximum-success-rate solution (number $N_{pop}$);

3) The cumulative spatial distance $d_n$ of the scheme solution numbered n is defined as the accumulated spatial distance between all adjacent scheme solutions starting from the minimum-total-time solution (number 1), calculated as follows:

$$d_n = \sum_{k=1}^{n-1} \sqrt{\left(f^{(1)}_{k+1} - f^{(1)}_{k}\right)^2 + \left(f^{(2)}_{k+1} - f^{(2)}_{k}\right)^2}$$

where $f^{(1)}_k$ denotes the total time index of the scheme numbered k, and $f^{(2)}_k$ denotes the success rate index of the scheme numbered k.

4) According to the number of feature schemes $N_{final}$, the equal-division distance $d_{div}$ is determined, calculated as follows:

$$d_{div} = \frac{d_{N_{pop}}}{N_{final} - 1}$$

5) Starting from the minimum-total-time solution (number 1), the feature scheme whose cumulative distance is closest to $d_{div}$ is searched for in turn. Suppose its number is p; then the number q of the next feature scheme, whose distance from number p is closest to $d_{div}$, can be found with the following formula, and applying the formula cyclically yields the numbers of all feature schemes:

$$q = \arg\min_{n > p} \left| d_n - d_p - d_{div} \right|$$

Further, the embodiment of the application may retain only the $N_{final}$ feature scheme solutions on the Pareto front and discard all other scheme solutions, so that the optimal allocation scheme is selected from the retained allocation schemes.
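The screening procedure in steps 1) to 5) can be sketched as follows. The sketch assumes that the spatial distance is the Euclidean distance in the (total time, success rate) plane without normalization, and that the front can be numbered by sorting on total time. Both are assumptions, as is the name `screen_equidistant`.

```python
import math

def screen_equidistant(front, n_final):
    """Pick n_final feature schemes spaced roughly evenly along the front.

    front: list of (total_time, success_rate) points on the Pareto front
    """
    pts = sorted(front)                 # number 1 = minimum total time
    if len(pts) <= n_final:
        return pts
    # Cumulative distance d_n along the front, starting from number 1.
    d = [0.0]
    for a, b in zip(pts, pts[1:]):
        d.append(d[-1] + math.hypot(b[0] - a[0], b[1] - a[1]))
    d_div = d[-1] / (n_final - 1)       # equal-division distance
    chosen = [0]                        # always keep the minimum-time extreme
    p = 0
    while len(chosen) < n_final - 1:
        candidates = range(p + 1, len(pts) - 1)
        if not candidates:
            break
        # Next feature scheme: distance from p closest to d_div.
        q = min(candidates, key=lambda n: abs(d[n] - d[p] - d_div))
        chosen.append(q)
        p = q
    chosen.append(len(pts) - 1)         # always keep the max-success extreme
    return [pts[i] for i in chosen]

# Hypothetical front with nine evenly spaced points.
front = [(t, 0.5 + 0.05 * t) for t in range(9)]
three = screen_equidistant(front, 3)
five = screen_equidistant(front, 5)
```

Because total time and success rate usually have very different scales, a practical version would probably normalize both objectives before computing distances. Otherwise the larger-valued objective dominates the spacing.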

In an embodiment of the present application, the preset knowledge graph is obtained by training on training data that carry manual task decomposition results, which includes: acquiring training data carrying manual task decomposition results, where the manual task decomposition results include the actual timing constraint features among subtasks; modeling the task decomposition knowledge graph with a preset relational graph convolutional neural network, and performing directed edge prediction training with the training data to obtain predicted timing constraint knowledge among subtasks; and calculating the training loss from the actual timing constraint features and the predicted timing constraint knowledge, and iterating the training until the training loss meets a stopping condition, to obtain the preset knowledge graph.

Faced with a complex task in a specific scene, a person acquires perception and situation information and manually decomposes the complex task using human experience and intelligence, forming a series of subtasks executable by unmanned vehicles together with their timing constraint data; the subtasks thus carry timing constraint features. Considering that task decomposition has global properties among the subtasks, it is represented in the form of a graph G, as follows:

$$G = (V, E)$$

where $v_i, v_j \in V$, V is the set of nodes, $e_{ij} \in E$ is the directed edge pointing from node $v_i$ to node $v_j$, and E is the set of directed edges. In the embodiment of the application, relational graph convolutional networks (R-GCNs) are used to model the task decomposition knowledge graph, as shown in FIG. 3, where the nodes represent the features of the subtasks other than the executing vehicles and the execution time (these two are determined later, when the multi-objective optimization problem is modeled and solved), and the directed edges represent the temporal precedence relation (the arrow points to the subtask that comes later in time), as follows:

$$v_i = [t, p, o], \qquad h_i^{(l+1)} = \sigma\left(\sum_{m \in M_i} g_m\left(h_i^{(l)}, h_j^{(l)}\right)\right)$$

where t, p, and o respectively denote the feature vectors of the task type, the task position, and the action object, $h_i^{(l)}$ is the hidden vector of node $v_i$ at layer l of the neural network, $\sigma$ is the ReLU activation function, $g_m$ is the message passing function that takes the directed edges into account, and $M_i$ is the set of directed edge types connected to node $v_i$ (temporally preceding or succeeding node $v_i$).

Task decomposition data produced by people in similar specific scenes are used for supervised learning of the knowledge graph network. Directed edge prediction is performed with R-GCNs to extract the timing constraint knowledge among subtasks, so that the machine predicts the timing constraints among subtasks intelligently. The neural network is trained by minimizing a loss function L designed as follows:

where T denotes the task decomposition data set and f denotes the scoring function for node $v_i$ and node $v_j$ being connected by the directed edge $e_{ij}$; the higher the score, the higher the credibility.

The overall flow of the multi-unmanned vehicle task allocation method according to the embodiment of the application is shown in fig. 4, and a detailed description is given below with reference to a specific search and rescue task in a post-disaster urban environment.

After a person obtains the perception and situation map, the search and rescue task is divided into subtasks. The task types include information collection, rescue, aggregation, and the like; the task positions are given by designated points or areas on the map; an information collection task has no action object, and the action object of a rescue task is the trapped people. The i-th subtask is vectorized: the task type vector t is represented by one-hot encoding, the task position vector p by three-dimensional coordinate values, and the action object vector o by the number of trapped people, and together they form the feature vector $v_i$ of the subtask. The directed edge $e_{ij}$ is a one-dimensional vector indicating whether a temporal precedence relation exists between $v_i$ and $v_j$; the values 0 and 1 respectively indicate that there is no timing constraint and that $v_i$ must precede $v_j$. Task decomposition data produced by people in similar scenes are collected, and the nodes and directed edge vectors form the task decomposition data set T, which serves as human knowledge.
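The vectorization described above can be sketched as follows. The ordering of the task types in the one-hot vector, the function names, and the sample subtasks are illustrative assumptions.

```python
# Assumed ordering of the task types for the one-hot encoding.
TASK_TYPES = ("information_collection", "rescue", "aggregation")

def subtask_vector(task_type, position, n_trapped):
    """Concatenate t (one-hot type), p (3-D position) and o (trapped count)."""
    if task_type not in TASK_TYPES:
        raise ValueError(f"unknown task type: {task_type}")
    t = [1.0 if task_type == name else 0.0 for name in TASK_TYPES]
    p = [float(x) for x in position]
    o = [float(n_trapped)]              # 0 when the task has no action object
    return t + p + o

def edge_matrix(n_subtasks, precedences):
    """e_ij = 1 if subtask i must precede subtask j, otherwise 0."""
    E = [[0] * n_subtasks for _ in range(n_subtasks)]
    for i, j in precedences:
        E[i][j] = 1
    return E

# Hypothetical decomposition: collect information first, then rescue.
v0 = subtask_vector("information_collection", (100.0, 40.0, 0.0), 0)
v1 = subtask_vector("rescue", (120.0, 45.5, 0.0), 3)
E = edge_matrix(2, [(0, 1)])
```

Each decomposition recorded from a person would contribute one such set of node vectors and one edge matrix to the data set T.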

The knowledge graph is modeled with a graph neural network, and the message passing function $g_m$ is designed as a linear transformation, as follows:

$$g_m(h_i, h_j) = W_i h_i + W_j h_j$$

where $W_i$ and $W_j$ are weight matrices to be determined by learning. Layers 0 to L of the graph neural network model can be represented as:

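The scoring function f in the loss function can be designed as a factorization in a high-dimensional space, as follows:

$$f(v_i, e_{ij}, v_j) = \left(h_i^{(L)}\right)^{\top} R \, h_j^{(L)}$$

where $h_i^{(L)}$ and $h_j^{(L)}$ are the final-layer vectors of nodes $v_i$ and $v_j$, and R is a diagonal matrix to be determined by learning. The loss function minimized in neural network training can be expressed as: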

After training is finished, when the neural network on the decision machine receives the node vectors $v_i$ given by the person, it outputs the final node vectors through the learned message passing mechanism and scores the directed edge between every pair of nodes. The output is a continuous value between 0 and 1, which is rounded to the discrete value 0 or 1 and indicates whether a timing constraint exists between node $v_i$ and node $v_j$. The decision machine thus predicts the timing constraints among subtasks intelligently on the basis of human knowledge, which realizes human-machine cooperative task decomposition and raises the level of intelligence and automation of task decomposition.
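The forward pass and edge scoring can be sketched in plain Python as below. This is a loose illustration, not the network of the application: it assumes one self weight matrix plus one weight matrix per edge direction, squashes the diagonal-matrix score with a sigmoid to obtain a value between 0 and 1, and uses random untrained weights, so the predicted edges carry no meaning. All function names are hypothetical.

```python
import math
import random

def matvec(W, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def rgcn_layer(H, E, W_self, W_pred, W_succ):
    """One message passing layer with ReLU over two directed edge types."""
    n = len(H)
    out = []
    for i in range(n):
        acc = matvec(W_self, H[i])
        for j in range(n):
            if E[j][i]:                  # j precedes i
                msg = matvec(W_pred, H[j])
            elif E[i][j]:                # i precedes j
                msg = matvec(W_succ, H[j])
            else:
                continue
            acc = [a + m for a, m in zip(acc, msg)]
        out.append([max(0.0, a) for a in acc])   # ReLU activation
    return out

def edge_score(h_i, h_j, r_diag):
    """Sigmoid of h_i^T diag(r) h_j, giving a value between 0 and 1."""
    raw = sum(a * r * b for a, r, b in zip(h_i, r_diag, h_j))
    return 1.0 / (1.0 + math.exp(-raw))

def predict_edges(H, r_diag, threshold=0.5):
    """Round every pairwise score to 0 or 1 (no self edges)."""
    n = len(H)
    return [[int(i != j and edge_score(H[i], H[j], r_diag) >= threshold)
             for j in range(n)] for i in range(n)]

# Hypothetical untrained example: 4 subtasks with 3-dimensional features.
rng = random.Random(0)
dim = 3
rand_matrix = lambda: [[rng.uniform(-0.5, 0.5) for _ in range(dim)]
                       for _ in range(dim)]
H0 = [[rng.uniform(0.0, 1.0) for _ in range(dim)] for _ in range(4)]
E = [[0, 1, 1, 0], [0, 0, 0, 1], [0, 0, 0, 1], [0, 0, 0, 0]]
H1 = rgcn_layer(H0, E, rand_matrix(), rand_matrix(), rand_matrix())
r_diag = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
predicted = predict_edges(H1, r_diag)
```

In training, the weight matrices and the diagonal entries would be fitted by minimizing the loss over the data set T. The sketch only shows how node vectors flow through a layer and how a pair of final vectors is turned into a 0 or 1 edge decision.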

The task decomposition result directed graph is shown in fig. 5, and the adjacency matrix thereof is:

Assume that there are 3 unmanned vehicles in total, numbered 1 to 3, all of which can be used for information collection and rescue. The task allocation relation of the unmanned vehicles is as follows:

Based on the task allocation relation, the execution time of each subtask is estimated by combining the map and situation information, as follows:

T = [5.5 5.5 3.5 6.0]

The timing relation among the subtasks is given as follows, indicating that subtasks 1, 3, 2, and 4 are executed in sequence, which complies with the timing constraints of the task decomposition:

O = [1 3 2 4]

The start time of each subtask is calculated with the serial schedule generation algorithm, and the feasible task allocation scheme is arranged into a Gantt chart, as shown in FIG. 6.

Further, the embodiment of the present application may perform Monte Carlo simulation on the feasible task allocation scheme. The result is shown in FIG. 7, where the abscissa is the number of simulations and the ordinate is the scheme success rate; after several thousand simulations, a nearly converged scheme success rate index is obtained. The NSGA-II algorithm is then used to perform multi-objective optimization on the total time and the success rate of the task allocation scheme. The scheme solution set of any generation can be shown as a scatter plot, as in FIG. 8, where each point represents a feasible scheme and the horizontal and vertical coordinates are the total time and success rate evaluation indexes respectively. As the number of generations increases, the scheme solution set changes and the evaluation indexes improve.

The Pareto front solution set is obtained through the multi-objective optimization algorithm, and 5 feature schemes remain after screening with the above method, as shown in FIG. 9, where each point represents one feature scheme. Each scheme can be graded and, combined with the brief semantic descriptions shown in Table 2 below, a person selects the optimal task allocation scheme, which is handed to the multiple unmanned vehicles for cooperative execution. Table 2 is the scheme semantic description table.

TABLE 2

According to the multi-unmanned vehicle task allocation method provided by the embodiment of the application, the acquired target tasks of the multiple unmanned vehicles and their preset decomposition targets are input into a pre-established preset knowledge graph to predict the timing constraints of the task decomposition; the allocation schemes of the unmanned vehicles are generated by combining the allocation parameters of each subtask; the designed scheme evaluation indexes are used to perform multi-objective optimization on the total time and the success rate of the task allocation schemes; and the optimal allocation scheme is determined in combination with the manual selection intention and handed to the multiple unmanned vehicles for cooperative execution. By bringing a person into the decision loop of multi-unmanned vehicle task allocation, performing human-machine cooperative task decomposition based on the knowledge graph, and optimizing task allocation with multiple objectives in mind, efficient cooperative decision-making in which the machine assists the person is realized. This solves the problems in the related art that complex tasks cannot be decomposed into executable subtasks and that the allocation results are incomplete and unreasonable.

Next, a multi-unmanned vehicle task allocation device according to an embodiment of the present application will be described with reference to the accompanying drawings.

Fig. 10 is a block schematic diagram of a multi-drone vehicle task assignment device of an embodiment of the present application.

As shown in fig. 10, the multi-unmanned vehicle task assigning apparatus 10 includes: an acquisition module 100, a processing module 200, an optimization module 300, and an assignment module 400.

The acquiring module 100 is configured to acquire target tasks of multiple unmanned vehicles and preset decomposition targets of the target tasks; the processing module 200 is configured to input the target task and the preset decomposition target into a preset knowledge graph established in advance, and output a time sequence constraint relationship between one or more subtasks of the target task; the optimization module 300 is configured to generate each unmanned vehicle allocation scheme according to the preset allocation parameters of each subtask and the time sequence constraint relationship between each subtask, and perform multi-objective optimization on the total time consumption and the success rate of each allocation scheme by using preset scheme evaluation indexes to obtain an optimization result of the total time consumption and the success rate of each allocation scheme; the allocation module 400 is configured to determine an optimal allocation scheme among all allocation schemes according to the optimization result and the manual selection intention, and respectively send the subtasks in the optimal allocation scheme to the corresponding unmanned vehicles.

In an embodiment of the present application, the preset decomposition target includes one or more of a task type vector, a task position vector, and an action object vector, and the processing module 200 is further configured to input one or more of the task type vector, the task position vector, and the action object vector into the pre-established preset knowledge graph to serve as the layer-0 vector of each node, and to output the directed edges of each node after message passing as the predicted values of the timing constraints of the subtasks; and to determine the timing constraint relation among one or more subtasks of the target task according to the predicted values and/or a manual modification intention.

In one embodiment of the present application, the apparatus 10 of the embodiment of the present application further comprises a training module.

The training module is used for acquiring training data carrying a manual decomposition task result, wherein the manual decomposition task result comprises an actual time sequence relation constraint characteristic among subtasks; modeling a task decomposition knowledge graph by using a preset relational graph convolutional neural network, and performing directed edge prediction training by using training data to obtain prediction time sequence relation constraint knowledge among subtasks; and calculating training loss according to the actual time sequence relation constraint characteristics and the predicted time sequence relation constraint knowledge, and stopping iterative training until the training loss meets a stopping condition to obtain a preset knowledge graph.

In an embodiment of the present application, the optimization module 300 is further configured to encode each allocation scheme with a preset encoding method to obtain the genotype encoding result of each allocation scheme; to perform phenotype decoding on the genotype encoding result of each allocation scheme, so that each allocation scheme is mapped from its genotype to its phenotype, and to randomly generate a plurality of feasible scheme solutions that satisfy preset constraint conditions as the initialization solution set; and to perform evolutionary computation on the initialization solution set generation by generation to optimize the total time consumption and the success rate of the schemes, obtaining the optimization result of the total time consumption and the success rate of each allocation scheme.

In an embodiment of the present application, the allocating module 400 is further configured to match the time-use level and the success rate level of each allocating scheme according to the optimization result; determining semantic description of each distribution scheme according to the time-use level and the success rate level of each distribution scheme, and acquiring a manual selection intention of a user based on the semantic description; and determining the optimal allocation scheme in all allocation schemes according to the manual selection intention.

In one embodiment of the present application, the apparatus 10 of the embodiment of the present application further comprises a screening module.

The screening module is used for screening one or more distribution schemes meeting preset conditions from all the distribution schemes by adopting a preset space equidistance principle before determining the optimal distribution scheme in all the distribution schemes according to the optimization result and the manual selection intention, wherein the optimal distribution scheme is the distribution scheme in the one or more distribution schemes.

It should be noted that the foregoing explanation of the embodiment of the multi-unmanned vehicle task allocation method is also applicable to the multi-unmanned vehicle task allocation apparatus of this embodiment, and details are not repeated here.

According to the multi-unmanned vehicle task allocation device provided by the embodiment of the application, the acquired target tasks of the multiple unmanned vehicles and their preset decomposition targets are input into a pre-established preset knowledge graph to predict the timing constraints of the task decomposition; the allocation schemes of the unmanned vehicles are generated by combining the allocation parameters of each subtask; the designed scheme evaluation indexes are used to perform multi-objective optimization on the total time and the success rate of the task allocation schemes; and the optimal allocation scheme is determined in combination with the manual selection intention and handed to the multiple unmanned vehicles for cooperative execution. By bringing a person into the decision loop of multi-unmanned vehicle task allocation, performing human-machine cooperative task decomposition based on the knowledge graph, and optimizing task allocation with multiple objectives in mind, efficient cooperative decision-making in which the machine assists the person is realized. This solves the problems in the related art that complex tasks cannot be decomposed into executable subtasks and that the allocation results are incomplete and unreasonable.

Fig. 11 is a schematic structural diagram of a vehicle according to an embodiment of the present application. The vehicle may include:

a memory 1101, a processor 1102, and a computer program stored on the memory 1101 and executable on the processor 1102.

The processor 1102, when executing the program, implements the multi-drone vehicle task assignment method provided in the embodiments described above.

Further, the vehicle further includes:

a communication interface 1103 for communicating between the memory 1101 and the processor 1102.

A memory 1101 for storing computer programs that are executable on the processor 1102.

The memory 1101 may include a random access memory (RAM), and may also include a non-volatile memory, such as at least one disk memory.

If the memory 1101, the processor 1102 and the communication interface 1103 are implemented independently, the communication interface 1103, the memory 1101 and the processor 1102 may be connected to each other through a bus and communicate with each other. The bus may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 11, but this does not mean that there is only one bus or one type of bus.

Optionally, in a specific implementation, if the memory 1101, the processor 1102 and the communication interface 1103 are integrated on one chip, the memory 1101, the processor 1102 and the communication interface 1103 may complete communication with each other through an internal interface.

The processor 1102 may be a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits configured to implement embodiments of the present Application.

Embodiments of the present application also provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the multi-unmanned vehicle task allocation method as above.

In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or N embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.

Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or as implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "N" means at least two, e.g., two, three, etc., unless specifically limited otherwise.

Any process or method description in a flow chart or otherwise described herein may be understood as representing a module, segment, or portion of code that includes one or N executable instructions for implementing steps of a custom logic function or process. The scope of the preferred embodiments of the present application includes alternate implementations in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those skilled in the art to which the embodiments of the present application belong.

It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the N steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are well known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a programmable gate array, a field programmable gate array, or the like.

It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.

Claims

1. A task allocation method for multiple unmanned vehicles is characterized by comprising the following steps:

acquiring target tasks of a plurality of unmanned vehicles and preset decomposition targets of the target tasks;

inputting the target task and the preset decomposition target into a preset knowledge graph which is established in advance, and outputting a time sequence constraint relation between one or more subtasks of the target task;

generating each unmanned vehicle distribution scheme according to the preset distribution parameters of each subtask and the time sequence constraint relation between each subtask, and performing multi-target optimization on the total time consumption and the success rate of each distribution scheme by using the preset scheme evaluation indexes to obtain the optimization results of the total time consumption and the success rate of each distribution scheme;

and determining the optimal distribution scheme in all the distribution schemes according to the optimization result and the manual selection intention, and respectively sending the subtasks in the optimal distribution scheme to the corresponding unmanned vehicles.

2. The method of claim 1, wherein the preset decomposition objective comprises one or more of a task type vector, a task position vector and an action object vector, and the inputting the target task and the preset decomposition objective into a preset knowledge graph established in advance and outputting a time sequence constraint relation between one or more subtasks of the target task comprises:

inputting one or more of the task type vector, the task position vector and the action object vector into the pre-established preset knowledge graph as the vector of each node of the 0th layer, and, after message passing, outputting the directed edges between the nodes as predicted values of the time sequence relation constraints between the subtasks;

and determining a time sequence constraint relation among one or more subtasks of the target task according to the predicted value and/or the manual modification intention.
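By way of non-limiting illustration, the final step of claim 2 can be sketched as follows: directed-edge values output by the graph are thresholded, manual modifications are applied, and the resulting constraints are checked for consistency. The 0.5 threshold, the function names, the `add`/`remove` interface and the acyclicity check are all assumptions made for this sketch and are not specified in the application.

```python
from graphlib import CycleError, TopologicalSorter


def resolve_constraints(predicted, threshold=0.5, add=(), remove=()):
    """Return the set of timing constraints as (before, after) pairs.

    predicted: dict mapping (before, after) -> value in [0, 1] (assumed format).
    add / remove: manual modifications, given as iterables of (before, after).
    """
    edges = {edge for edge, value in predicted.items() if value >= threshold}
    edges |= set(add)
    edges -= set(remove)
    return edges


def execution_order(subtasks, edges):
    """Return one subtask order consistent with the edges; reject cycles."""
    graph = {task: set() for task in subtasks}
    for before, after in edges:
        # graphlib expects, for each node, the set of its predecessors.
        graph.setdefault(after, set()).add(before)
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError("timing constraints contain a cycle") from exc
```

A cycle check of this kind is one possible way to catch a manual modification that contradicts the predicted ordering before any allocation scheme is generated.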

3. The method according to claim 1 or 2, wherein the preset knowledge graph is obtained based on training data carrying results of a manual decomposition task, and comprises:

acquiring training data carrying a manual decomposition task result, wherein the manual decomposition task result comprises an actual time sequence relation constraint characteristic among subtasks;

modeling a task decomposition knowledge graph by using a preset relational graph convolutional neural network, and performing directed edge prediction training by using the training data to obtain prediction time sequence relation constraint knowledge among subtasks;

and calculating training loss according to the actual time sequence relation constraint characteristics and the prediction time sequence relation constraint knowledge, and stopping iterative training until the training loss meets a stopping condition to obtain the preset knowledge graph.
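By way of non-limiting illustration, the loss calculation and stopping condition of claim 3 might look like the sketch below. The application does not name a loss function or a stopping rule; binary cross-entropy over directed edges and a "no meaningful improvement for several epochs" rule are assumptions, as are all names and default values.

```python
import math


def edge_bce_loss(predicted, actual, eps=1e-12):
    """Mean binary cross-entropy over candidate directed edges.

    predicted: non-empty dict mapping (before, after) -> predicted probability.
    actual: set of (before, after) edges from the manual decomposition result.
    """
    total = 0.0
    for edge, p in predicted.items():
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        y = 1.0 if edge in actual else 0.0
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(predicted)


def should_stop(loss_history, tolerance=1e-3, patience=3):
    """Stop once the loss has improved by less than `tolerance`
    in each of the last `patience` epochs."""
    if len(loss_history) <= patience:
        return False
    recent = loss_history[-(patience + 1):]
    return all(prev - cur < tolerance for prev, cur in zip(recent, recent[1:]))
```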

4. The method according to claim 1, wherein the method for performing multi-objective optimization on the total time consumption and the success rate of each allocation scheme by using the preset scheme evaluation index to obtain the optimization result of the total time consumption and the success rate of each allocation scheme comprises the following steps:

correspondingly encoding each distribution scheme by using a preset encoding mode to obtain a genotype encoding result of each distribution scheme;

performing phenotype decoding on the genotype encoding result of each distribution scheme, mapping each distribution scheme from the genotype to a phenotype, and randomly generating a plurality of feasible scheme solutions meeting preset constraint conditions to serve as an initialization solution set;

and carrying out generation-by-generation evolutionary calculation on the initialization solution set, optimizing the total time consumption and the success rate of the schemes, and obtaining the optimization result of the total time consumption and the success rate of each distribution scheme.
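By way of non-limiting illustration, the encode, decode and evolve steps of claim 4 can be sketched as below. The encoding (one priority and one vehicle index per subtask), the per-vehicle `durations` and `success` tables, and the mutation-only elitist loop are assumptions chosen to keep the sketch short; the application does not specify the encoding or the evolutionary operators, and this is not presented as the claimed algorithm.

```python
import random


def decode(genotype, durations, success, edges):
    """Map a genotype to its phenotype objectives: (total_time, success_rate).

    genotype: (priorities, vehicles), one entry per subtask index.
    durations[t][v], success[t][v]: assumed per-subtask, per-vehicle parameters.
    edges: set of (before, after) timing constraints, assumed acyclic.
    """
    priorities, vehicles = genotype
    n = len(priorities)
    preds = {t: {b for b, a in edges if a == t} for t in range(n)}
    finish, vehicle_free, rate = {}, {}, 1.0
    while len(finish) < n:
        # Only subtasks whose predecessors are all finished may be scheduled,
        # so every decoded scheme satisfies the timing constraints.
        ready = [t for t in range(n) if t not in finish and preds[t].issubset(finish)]
        task = max(ready, key=lambda t: priorities[t])
        v = vehicles[task]
        start = max([vehicle_free.get(v, 0.0)] + [finish[p] for p in preds[task]])
        finish[task] = start + durations[task][v]
        vehicle_free[v] = finish[task]
        rate *= success[task][v]
    return max(finish.values()), rate


def dominates(a, b):
    """True if objectives a are no worse than b in both and differ from b."""
    return a[0] <= b[0] and a[1] >= b[1] and a != b


def pareto_front(scored):
    """Keep the (genotype, objectives) pairs not dominated by any other pair."""
    return [s for s in scored if not any(dominates(o[1], s[1]) for o in scored)]


def evolve(durations, success, edges, n_vehicles, pop_size=30, generations=40, seed=0):
    """Simplified elitist evolution: mutate, score, keep non-dominated schemes."""
    rng = random.Random(seed)
    n = len(durations)

    def random_genotype():
        return ([rng.random() for _ in range(n)],
                [rng.randrange(n_vehicles) for _ in range(n)])

    def mutate(genotype):
        priorities, vehicles = list(genotype[0]), list(genotype[1])
        i = rng.randrange(n)
        priorities[i] = rng.random()
        vehicles[i] = rng.randrange(n_vehicles)
        return (priorities, vehicles)

    population = [random_genotype() for _ in range(pop_size)]
    for _ in range(generations):
        candidates = population + [mutate(g) for g in population]
        scored = [(g, decode(g, durations, success, edges)) for g in candidates]
        survivors = [g for g, _ in pareto_front(scored)][:pop_size]
        while len(survivors) < pop_size:  # refill with fresh random schemes
            survivors.append(random_genotype())
        population = survivors
    return pareto_front([(g, decode(g, durations, success, edges)) for g in population])
```

Because the decoder only ever schedules subtasks whose predecessors have finished, every randomly generated genotype corresponds to a feasible scheme, which is one simple way to obtain the feasible initialization solution set.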

5. The method of claim 1, wherein determining an optimal allocation solution of all allocation solutions according to the optimization results and the manual selection intent comprises:

matching the time-consuming grade and the success rate grade of each distribution scheme according to the optimization result;

determining a semantic description of each distribution scheme according to the time-consumption grade and the success rate grade of each distribution scheme, and acquiring the manual selection intention of a user based on the semantic description;

and determining the optimal distribution scheme in all the distribution schemes according to the manual selection intention.
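By way of non-limiting illustration, the grade matching and semantic description of claim 5 can be sketched as a simple threshold lookup. The grade boundaries, the labels and the wording of the description are hypothetical values chosen for this sketch; the application does not disclose them here.

```python
# Hypothetical grade boundaries and labels (not taken from the application).
TIME_LIMITS = (60.0, 120.0)
TIME_LABELS = ("fast", "moderate", "slow")
RATE_LIMITS = (0.6, 0.85)
RATE_LABELS = ("low", "medium", "high")


def grade(value, limits, labels):
    """Return the label of the first limit that `value` does not exceed;
    values above every limit get the last label."""
    for limit, label in zip(limits, labels):
        if value <= limit:
            return label
    return labels[-1]


def describe(total_time, success_rate):
    """Build a short semantic description of one distribution scheme."""
    time_grade = grade(total_time, TIME_LIMITS, TIME_LABELS)
    rate_grade = grade(success_rate, RATE_LIMITS, RATE_LABELS)
    return f"{time_grade} completion, {rate_grade} success rate"


def choose(schemes, selected_description):
    """Return the first (total_time, success_rate) scheme whose description
    matches the one the user selected, or None if none matches."""
    for scheme in schemes:
        if describe(*scheme) == selected_description:
            return scheme
    return None
```

Presenting descriptions rather than raw numbers lets the user express a selection intention such as "fast completion, medium success rate" without inspecting each scheme's objective values.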

6. The method according to claim 1 or 5, comprising, before determining the optimal allocation scheme of all allocation schemes based on the optimization result and the manual selection intent:

and screening one or more distribution schemes meeting preset conditions from all the distribution schemes by adopting a preset space equidistant principle, wherein the optimal distribution scheme is the distribution scheme in the one or more distribution schemes.
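By way of non-limiting illustration, one possible reading of the "space equidistant principle" in claim 6 is to keep a small number of schemes that are roughly evenly spaced along the optimized front in normalized objective space. That interpretation, the arc-length spacing, and the function name are assumptions made for this sketch only.

```python
import math


def screen_equidistant(front, k):
    """Pick up to k schemes spread evenly along the front.

    front: list of (total_time, success_rate) objective pairs.
    """
    pts = sorted(set(front))
    if len(pts) <= k or k < 2:
        return pts[:max(k, 0)]
    t_lo, t_hi = pts[0][0], pts[-1][0]
    r_lo = min(p[1] for p in pts)
    r_hi = max(p[1] for p in pts)

    def norm(p):
        # Scale both objectives to [0, 1] so neither dominates the distance.
        return ((p[0] - t_lo) / ((t_hi - t_lo) or 1.0),
                (p[1] - r_lo) / ((r_hi - r_lo) or 1.0))

    normed = [norm(p) for p in pts]
    cum = [0.0]  # cumulative arc length along the sorted front
    for a, b in zip(normed, normed[1:]):
        cum.append(cum[-1] + math.dist(a, b))
    chosen = []
    for i in range(k):
        target = cum[-1] * i / (k - 1)
        idx = min(range(len(pts)), key=lambda j: abs(cum[j] - target))
        if pts[idx] not in chosen:
            chosen.append(pts[idx])
    return chosen
```

Screening in this way avoids showing the user several nearly identical schemes clustered at one end of the front.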

7. A multi-unmanned vehicle task allocation device, comprising:

the acquisition module is used for acquiring target tasks of a plurality of unmanned vehicles and preset decomposition targets of the target tasks;

the processing module is used for inputting the target task and the preset decomposition target into a preset knowledge graph which is established in advance and outputting a time sequence constraint relation between one or more subtasks of the target task;

the optimization module is used for generating each unmanned vehicle distribution scheme according to the preset distribution parameters of each subtask and the time sequence constraint relation between each subtask, and performing multi-target optimization on the total time consumption and the success rate of each distribution scheme by using preset scheme evaluation indexes to obtain the optimization results of the total time consumption and the success rate of each distribution scheme;

and the distribution module is used for determining the optimal distribution scheme in all the distribution schemes according to the optimization result and the manual selection intention, and respectively sending the subtasks in the optimal distribution scheme to the corresponding unmanned vehicles.

8. The apparatus of claim 7, wherein the preset decomposition goal comprises one or more of a task type vector, a task location vector, and an action object vector, and wherein the processing module is further configured to:

9. The apparatus of claim 7 or 8, further comprising: the training module is used for acquiring training data carrying a manual decomposition task result, wherein the manual decomposition task result comprises an actual time sequence relation constraint characteristic among subtasks;

and calculating training loss according to the actual time sequence relation constraint characteristics and the predicted time sequence relation constraint knowledge, and stopping iterative training until the training loss meets a stop condition to obtain the preset knowledge graph.

10. The apparatus of claim 7, wherein the optimization module is further configured to:

correspondingly encoding each distribution scheme by using a preset encoding mode to obtain a genotype encoding result of each distribution scheme; performing phenotype decoding on the genotype encoding result of each distribution scheme, mapping each distribution scheme from the genotype to a phenotype, and randomly generating a plurality of feasible scheme solutions meeting preset constraint conditions as an initialization solution set; and carrying out generation-by-generation evolutionary calculation on the initialization solution set, optimizing the total time consumption and the success rate of the schemes, and obtaining the optimization results of the total time consumption and the success rate of each distribution scheme.

11. The apparatus of claim 7, wherein the assignment module is further configured to:

matching the time consumption grade and the success rate grade of each distribution scheme according to the optimization result;

and determining the optimal distribution scheme in all distribution schemes according to the manual selection intention.

12. The apparatus of claim 7 or 11, further comprising: and the screening module is used for screening one or more distribution schemes meeting preset conditions from all the distribution schemes by adopting a preset space equidistance principle before determining the optimal distribution scheme in all the distribution schemes according to the optimization result and the manual selection intention, wherein the optimal distribution scheme is the distribution scheme in the one or more distribution schemes.

13. An electronic device, comprising: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor executing the program to implement the multi-unmanned vehicle task allocation method of any one of claims 1-6.

14. A computer-readable storage medium, on which a computer program is stored, characterized in that the program, when executed by a processor, implements the multi-unmanned vehicle task allocation method according to any one of claims 1-6.