CN112985414A - Multi-agent navigation method, device, equipment and medium

Info

Publication number: CN112985414A
Application number: CN202110380560.4A
Authority: CN
Inventors: 李冠毅
Original assignee: Beijing Orion Star Technology Co Ltd
Current assignee: Beijing Orion Star Technology Co Ltd
Priority date: 2021-04-09
Filing date: 2021-04-09
Publication date: 2021-06-18

Abstract

The invention discloses a multi-agent navigation method, a device, equipment and a medium, wherein a first agent in the method can determine an expected path based on the node cost value of a first node contained between a first position and a target position of the first agent, and control the first agent to move according to the determined expected path and acquired path information.

Description

Multi-agent navigation method, device, equipment and medium

Technical Field

The invention relates to the technical field of intelligent agents, and in particular to a multi-agent navigation method, device, equipment and medium.

Background

Agents, i.e., entities with intelligence, are an important concept in the field of artificial intelligence. Any independent entity that is capable of thinking and that can interact with the environment can be abstracted as an agent.

In a shared environment, when a plurality of agents execute navigation tasks, agents that travel in opposite directions through a narrow space may fail to yield to each other. This causes a conflict in which each agent blocks the progress of the other.

To solve this problem, the prior art commonly uses two methods: a central decision method and an avoidance method.

The central decision method establishes a central server that schedules the agents in a unified manner. However, this method places high demands on the network and the environment and can generally be applied only to highly controllable scenes such as factory freight. Its environmental adaptability is therefore poor, and the additional central server increases cost.

In the avoidance method, each agent judges in real time whether it is in a conflict state and, if so, moves to a fixed avoidance position chosen according to the conflict scene. This method is suitable only for large-space scenes with a small number of agents. In a scene where a conflict clearly occurs, one of the agents must back up to give way, so the avoidance method has poor flexibility.

Disclosure of Invention

The invention provides a multi-agent navigation method, device, equipment and medium, which are used for solving the prior-art problems of poor environmental adaptability and poor flexibility in avoiding conflicts while navigation tasks are performed.

The invention provides a multi-agent navigation method, which comprises the following steps:

acquiring path information sent by at least one second agent in the multi-agent;

determining an expected path of the first agent according to a first node contained between a current first position and a target position of the first agent and the node cost value of the first node;

and controlling the first agent to move according to the expected path and the path information.

Accordingly, the present invention provides an apparatus for multi-agent navigation, the apparatus comprising:

an obtaining module, configured to obtain path information sent by at least one second agent in the multi-agent;

a determining module, configured to determine an expected path of a first agent according to a first node contained between a current first position and a target position of the first agent and a node cost value of the first node;

and the control module is used for controlling the first intelligent agent to move according to the expected path and the path information.

Accordingly, the present invention provides an electronic device comprising a processor and a memory, said memory storing program instructions, said processor being adapted to carry out the steps of any of the above-described multi-agent navigation methods when executing a computer program stored in the memory.

Accordingly, the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of any of the above-described multi-agent navigation methods.

Accordingly, the present application also provides a computer program product comprising a computer program stored on a computer readable storage medium, the computer program comprising program instructions which, when executed by a processor, implement the steps of any of the above-described multi-agent navigation methods.

The invention provides a multi-agent navigation method, device, equipment and medium, wherein a first agent obtains path information sent by at least one second agent in the multi-agent, determines an expected path of the first agent according to a first node contained between a current first position and a target position of the first agent and the node cost value of the first node, and controls the first agent to move according to the determined expected path and the obtained path information. In other words, the first agent itself determines the expected path from the node cost values of the first nodes between its first position and its target position, and moves according to that path and the acquired path information.

Drawings

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on them without inventive effort.

FIG. 1 is a process diagram of a multi-agent navigation method according to an embodiment of the present invention;

FIG. 2 is a diagram illustrating a path in a map represented by nodes according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a path for controlling the movement of a first agent according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a desired path of an agent according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of a path for controlling the movement of a first agent according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of an apparatus for multi-agent navigation according to an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In order to improve the environmental adaptability and flexibility when performing navigation tasks, embodiments of the present invention provide a multi-agent navigation method, apparatus, device, and medium.

Example 1: FIG. 1 is a schematic process diagram of a multi-agent navigation method according to an embodiment of the present invention. The method is applied to a first agent in the multi-agent, where the first agent is any one of the multiple agents, and the process includes the following steps:

S101: obtaining path information sent by at least one second agent of the multi-agent.

The multi-agent navigation method provided by the embodiment of the invention is applied to the electronic equipment, and the electronic equipment can be a controller of the first agent and also can be the first agent. The intelligent agent provided by the invention is provided with the communication module, so that a first intelligent agent can communicate with other second intelligent agents based on the communication module. Wherein the second agent is any agent of the multi-agent other than the first agent.

A path travelled by the first agent may be wide enough for several agents to move side by side, or wide enough for only one agent. When the path is wide enough for only one agent and two agents need to use it, a conflict may occur during movement. For example, if a first agent and a second agent move in opposite directions in the path, the first agent and the second agent may conflict. As another example, if a first agent and a second agent with different moving speeds move in the same direction in the path, the first agent and the second agent may also conflict.

In order to avoid conflicts with other agents in controlling the first agent to move to the currently determined target location, the electronic device needs to acquire path information sent by at least one second agent in an operating state in the multi-agent, so as to plan a desired path of the first agent.

Specifically, the electronic device can obtain the path information sent by at least one second agent in the multi-agent through cloud communication, local router communication or broadcast communication. The embodiment of the invention does not limit the manner in which information is exchanged between the agents.

S102: determining an expected path of the first agent according to a first node contained between the current first position and the target position of the first agent and the node cost value of the first node.

Specifically, the first nodes included between the first position and the target position are determined according to the current first position and the target position of the first agent. The electronic device may obtain its current first position through Beidou navigation or GPS navigation, or by a robot positioning technology, for example positioning based on an established map in combination with vision; this is not described in detail in the embodiments of the present invention.

To control the movement of the first agent, the electronic device performs path planning after receiving a navigation command. The navigation command contains the determined target position, which is the end position to which the first agent is to move. Each first node between the first position and the target position is determined according to the current position of the first agent (denoted the first position) and the target position. Wherein:

The navigation command may be a voice command: for example, if the user issues the navigation command by voice to instruct the first agent to move to the target position, the first agent may capture the command through its own audio acquisition device. It may also be a touch command, for example a navigation command input by the user through a touch screen, or a navigation command sent by the user from another terminal, such as a mobile terminal or a PC.

In the embodiment of the invention, paths allowing the intelligent agents to move are drawn in advance in the known map, and each allowed path is dispersed into a plurality of nodes. Each node has its own corresponding identification information, for example, the identification information may be a number assigned to each node in advance.

Further, in the map, a width of each path allowing the agent to move is marked in advance, and the agent moves only within the width of each path allowing the agent to move.

In some embodiments, the distance between any two adjacent nodes is generally any value between 10 and 20 centimeters, wherein the distance between any two adjacent nodes may be the same or different.

In some embodiments, nodes are provided at intersections where multiple allowed paths intersect.
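As an illustration of how an allowed path could be dispersed into numbered nodes, the following minimal sketch splits one straight path segment into evenly spaced nodes. The function name `discretize_segment`, the `spacing_m` parameter and the sequential integer identifiers are assumptions made for this example; the source only states that adjacent nodes are generally 10 to 20 centimeters apart and that each node has identification information.

```python
import math

def discretize_segment(start, end, spacing_m=0.15, first_id=0):
    """Split a straight segment into nodes no more than spacing_m apart.

    Returns a dict mapping a hypothetical integer node id to (x, y) in meters.
    Both end points of the segment become nodes.
    """
    length = math.dist(start, end)
    # Number of equal steps needed so that each step is <= spacing_m.
    steps = max(1, math.ceil(length / spacing_m))
    nodes = {}
    for i in range(steps + 1):
        t = i / steps
        nodes[first_id + i] = (
            start[0] + (end[0] - start[0]) * t,
            start[1] + (end[1] - start[1]) * t,
        )
    return nodes
```

For a 1 m segment and a 0.15 m upper bound, this yields eight nodes about 14.3 cm apart. Very short segments would produce a spacing below 10 cm, which a real map tool would need to handle separately.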

Specifically, each first node between the first position and the target position is determined according to each node in a map stored in advance, and the first position and the target position on the map, wherein a first node is a node included in any possible allowed path that the first agent may pass along when moving from the first position to the target position.

In the embodiment of the invention, the node cost value of each first node can be determined according to the acquired path information sent by at least one second agent, and the expected path from the first position to the target position of the first agent can be determined according to the node cost value of each first node. The node cost value represents the possibility that the first agent conflicts with other agents at the first node, and the higher the node cost value is, the higher the possibility that the conflict occurs is. According to the node cost value of each first node, an expected path with the node cost value meeting the requirement can be determined, so that the possibility of conflict in the expected path is reduced, and the expected path is a navigation path for the first agent to move from the first position to the target position.

S103: controlling the first agent to move according to the expected path and the path information.

Specifically, according to the determined expected path and the acquired path information sent by the second agent, the electronic device controls the first agent to move on the expected path. The path information includes at least one of a current second position of the second agent, a target position of the second agent, and second node sequence information included in a path from the second agent to the target position.
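The path information described above could be represented by a small record such as the one sketched here. The class name `PathInfo` and all field names, including `agent_id`, are hypothetical; the source only lists the current second position, the target position and the second node sequence, of which at least one is present, so every field other than the identifier is optional.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PathInfo:
    """Path information broadcast by one second agent (illustrative layout)."""
    agent_id: str                        # assumed sender identifier
    current_node: Optional[int] = None   # current second position, as a node id
    target_node: Optional[int] = None    # target position of the second agent
    # Ids of the second nodes on the path from the current position to the target.
    node_sequence: List[int] = field(default_factory=list)
```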

Further, if it is determined according to the expected path and the path information that the first agent conflicts with another agent on the expected path, the first agent is controlled to avoid; if it is determined that the first agent does not conflict with other agents on the expected path, the first agent is controlled to move from the first position to the target position.

In the embodiment of the invention, the first agent acquires the path information sent by at least one second agent in the multi-agent, determines the expected path of the first agent according to the first node contained between the current first position and the target position of the first agent and the node cost value of the first node, and controls the first agent to move according to the determined expected path and the acquired path information. The first intelligent agent determines the expected path, so that the problem of poor environmental adaptability of the existing central decision method is solved, the environmental adaptability is improved, the flexibility of controlling the movement of the first intelligent agent is improved, and the possibility of collision can be reduced because the expected path is determined according to the node cost value of the first node when the first intelligent agent is controlled to move.

Example 2: in order to determine a desired path for a first agent to move from a first location to a target location, on the basis of the above embodiment, in an embodiment of the present invention, the determining the desired path of the first agent includes at least the following two ways:

Mode 1: starting from the first position, each target node satisfying the node cost value condition is determined in sequence from the first nodes, based on the node cost values of the first nodes, to determine the expected path.

In this mode, the next nodes adjacent to the first position, and the node cost value of each, are determined according to the current first position of the first agent and the nodes on the movable paths of the map saved in advance. The adjacent node with the lowest node cost value is determined as the next target node of the first position, that is, a target node included in the expected path. The step is then repeated from that target node: its adjacent nodes and their node cost values are determined, and the adjacent node with the lowest node cost value is determined as the next target node, and so on until the determined target node is adjacent to the target position. The node sequence formed by the first position, the determined target nodes and the target position is determined as the expected path of the first agent.

Specifically, in the embodiment of the present invention, each target node in the expected path from the first position to the target position may be determined by using the existing Dijkstra search algorithm.

Mode 2: determining each first path between the first position and the target position, determining a node cost total value of each first path according to the node cost value of the first node contained in each first path, and determining the first path with the node cost total value meeting the requirement as the expected path.
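A minimal sketch of a Dijkstra search driven by node cost values is given here for Mode 1. It is one possible reading, not the patented implementation: the cost of a path is taken as the sum of the node cost values of the nodes entered, plus a small `step_cost` per move. The `step_cost` parameter is an assumption added so that shorter paths are preferred when node cost values tie; `adjacency` and `node_cost` are hypothetical dictionaries keyed by node id.

```python
import heapq

def dijkstra_node_cost(adjacency, node_cost, start, goal, step_cost=0.01):
    """Return (path, total) minimizing the summed cost of entered nodes.

    adjacency: dict node id -> iterable of neighbouring node ids.
    node_cost: dict node id -> node cost value (missing ids count as 0).
    step_cost: assumed small per-move cost used only to break ties.
    """
    best = {start: 0.0}
    parent = {}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            # Walk the parent links back to the start to rebuild the path.
            path = [node]
            while node in parent:
                node = parent[node]
                path.append(node)
            return path[::-1], cost
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt in adjacency.get(node, ()):
            new_cost = cost + step_cost + node_cost.get(nxt, 0.0)
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    return None, float("inf")
```

Unlike a purely greedy choice of the cheapest neighbour at each step, this search cannot loop or run into a dead end, which is one reason a Dijkstra search is a natural fit for the node-by-node selection described above.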

In the method, each first path from the first position to the target position is determined according to the first position and the target position, and for each first path, according to the node cost value of each first node in the first path, the sum of the node cost values of each first node in the first path is determined as the total node cost value of the first path.

Further, according to the node cost total value of each first path, according to a preset requirement, determining that the first path with the node cost total value meeting the requirement is an expected path, wherein the preset requirement can be determined by considering the node cost value of each first node in the first path, the length of the first path, the node cost total value of the first path and the like.

For example, the preset requirement may be that the total node cost value is lower than a preset threshold, or the total node cost value is the lowest, or the total node cost value is lower than the preset threshold and the first path length is the shortest, and so on. The embodiment of the present invention is not limited thereto.
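The selection in Mode 2 can be sketched as follows, assuming the candidate first paths have already been enumerated as lists of node ids. The function names and the optional `max_total` threshold parameter are hypothetical; the tie-break on path length is one of the example requirements mentioned above rather than a fixed rule.

```python
def total_node_cost(path, node_cost):
    """Sum the node cost values of the nodes in a path (missing ids count as 0)."""
    return sum(node_cost.get(node, 0.0) for node in path)

def select_expected_path(candidate_paths, node_cost, max_total=None):
    """Pick the candidate with the lowest total cost, shortest path first on ties.

    If max_total is given, candidates whose total is not below it are discarded.
    Returns None when no candidate meets the requirement.
    """
    scored = [(total_node_cost(p, node_cost), len(p), p) for p in candidate_paths]
    if max_total is not None:
        scored = [s for s in scored if s[0] < max_total]
    if not scored:
        return None
    return min(scored, key=lambda s: (s[0], s[1]))[2]
```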

Example 3: in order to determine an expected path whose node total cost value meets the requirement, in the embodiment of the present invention, determining the first path whose node total cost value meets the requirement as the expected path includes at least the following three ways:

Mode a: determining the first path with the lowest total node cost value as the expected path.

The first path with the lowest total node cost value has the lowest possibility of conflict, so selecting it reduces the possibility of conflict while the first agent moves along the determined expected path.

In the method, each first path between an initial position and a target position is determined according to a first node included between the initial position (namely one implementation mode of the first position) and the target position of a first agent and a node cost value of the first node; determining a node cost total value of each first path according to the node cost value of the first node contained in each first path; and finally, determining the first path with the total node cost value meeting the requirement as the expected path.

Mode b: in each set period, according to the path information sent by the second agent acquired in the period, determining a first path with the lowest total node cost value, and determining a sub-path with a first set length with the first position as the starting point in the first path as a sub-path of the expected path in the period.

In the embodiment of the present invention, information transmission between agents may be periodic, for example, each agent periodically broadcasts and transmits its own path information, so that other agents may plan their own path and movement based on the latest path information.

Because the complete expected path is generally long, the first agent takes a long time to move along it to the target position. During that time the path information and position of a second agent change, and a new second agent may join and occupy some of the first nodes in the expected path at a certain moment. Many factors can therefore change while the first agent moves to the target position, and a path that is determined once and followed to the end may no longer be the optimal path. Re-determining the expected path from the first position and the target position of the first agent each time the path information of the second agent is acquired can effectively improve the reliability of path planning.

In some embodiments, when the expected path is determined based on the total node cost value, only the sub-path of the expected path corresponding to the present period may be determined, specifically, the first path with the lowest total node cost value is identified, and the sub-path with the first set length with the first position as the starting point in the first path is determined as the sub-path of the expected path in the present period. That is, only the sub-path with the first set length and with the first position as the starting point in the first path with the lowest total node cost value in the present period is planned as the sub-path of the expected path, and the sub-path is only a part of the expected path. And after receiving the path information of other agents again, determining the next sub-path of the first agent based on the received path information until the sub-path corresponding to a certain period comprises the target position, wherein the sub-path corresponding to each period forms an expected path for moving to the target position.

In this mode, the first position is the starting position of the first agent in the present period (that is, another implementation of the first position). In other words, in mode b, the sub-path of the expected path corresponding to each period is determined based on the starting position in that period and the path information of the second agent acquired in that period.
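One planning period of mode b can be sketched as follows. The sketch assumes that every candidate first path is a list of node ids starting at the first position of the present period, and that the first set length is expressed as a number of nodes (`horizon_nodes`); both are assumptions made for illustration.

```python
def plan_period_mode_b(candidate_paths, node_cost, horizon_nodes):
    """Choose the whole first path with the lowest total cost, keep only its start.

    Returns the first horizon_nodes nodes of that path as this period's sub-path.
    """
    best = min(
        candidate_paths,
        key=lambda p: sum(node_cost.get(node, 0.0) for node in p),
    )
    return best[:horizon_nodes]
```

The caller would invoke this once per period with node cost values recomputed from the newest path information, and stop when the returned sub-path contains the target position.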

Mode c: in each set period, according to the path information sent by the second agent acquired in the period, determining a node cost total value of a sub-path with a second set length in each first path, which takes the first position as a starting point, and determining the sub-path with the lowest node cost total value as the sub-path of the expected path in the period.

In the embodiment of the present invention, if a first agent may periodically obtain path information of at least one second agent, after obtaining the path information of the second agent, only a sub-path of an expected path that can move to a target location in the present period may be planned, where the sub-path is only a part of the expected path, and the sub-paths corresponding to multiple periods may form a complete expected path that moves to the target location. Therefore, when determining the sub-path of the expected path in the period, the determination can be performed according to the total node cost values of the partial nodes.

Therefore, when the path information of the second agent is periodically acquired, for each determined first path, according to each first node of the first path, a sub-path of a second set length with the first position as a starting point of the first path is determined, where the first position is a starting position of the first agent in the present period (that is, another implementation manner of the first position), and according to each determined sub-path and each first node of the first path, each first target node included in the sub-path is determined. Determining a total node cost value of the sub-path according to the node cost value of each first target node in the sub-path; and determining the sub-path with the lowest node cost total value as the sub-path of the expected path in the period according to the node cost total value of each sub-path.

The second set length and the first set length in mode b are preset. To improve the efficiency of determining the total node cost value, the first set length and the second set length may be set shorter; to avoid conflicts on the expected path as far as possible, they may be set longer. The first set length and the second set length may be the same or different.

In some embodiments, the first set length and the second set length may be set according to a length that the first agent is able to move within one period.
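One planning period of mode c can be sketched as follows. Unlike a selection based on the total cost of whole paths, only the sub-path of the second set length at the start of each candidate is scored. The second set length is expressed here as a number of nodes (`horizon_nodes`), and each candidate is assumed to start at the first position of the present period; both are assumptions made for illustration.

```python
def plan_period_mode_c(candidate_paths, node_cost, horizon_nodes):
    """Score only the leading sub-path of each candidate and return the cheapest."""
    sub_paths = [p[:horizon_nodes] for p in candidate_paths]
    return min(
        sub_paths,
        key=lambda sp: sum(node_cost.get(node, 0.0) for node in sp),
    )
```

With candidates `[1, 2, 4, 6]` and `[1, 3, 5, 6]`, node cost values `{2: 0.5, 5: 1.0}` and a horizon of two nodes, the sub-path `[1, 3]` is returned even though the whole path through node 5 has the higher total, because node 5 lies beyond the scored horizon.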

Example 4: in order to determine the node cost value of the first node, on the basis of the above embodiments, in an embodiment of the present invention, the node cost value is determined according to the target location and the current location of the agent (including the first agent and the second agent), the time when the agent is expected to reach each node, the target priority of the agent, and the like, wherein the time when the agent is expected to reach each node may be determined based on the location of the node and the moving speed of the agent.
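The expected arrival time at each node, which the paragraph above derives from the node positions and the moving speed of the agent, could be computed as in the following sketch. A constant speed and straight-line travel between consecutive nodes are assumed, and the function and parameter names are hypothetical.

```python
import math

def arrival_times(node_positions, speed_mps, start_time=0.0):
    """Return the expected arrival time at each node of a path.

    node_positions: list of (x, y) coordinates in meters, in travel order.
    speed_mps: assumed constant moving speed in meters per second.
    """
    if not node_positions:
        return []
    times = [start_time]
    for prev, cur in zip(node_positions, node_positions[1:]):
        # Time grows by the straight-line distance divided by the speed.
        times.append(times[-1] + math.dist(prev, cur) / speed_mps)
    return times
```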

In the embodiment of the present invention, the node cost value may represent the possibility of conflict: in general, the higher the node cost value, the higher the possibility of conflict. If a node is a path coincident node, the node cost value corresponding to the node is higher; correspondingly, if a node is a non-path coincident node, the node cost value corresponding to the node is lower.

In order to determine whether a node is a path coincident node or a non-path coincident node, in the embodiment of the present invention, it may be determined whether the identification information of the first node is the same as the identification information of any one of the second nodes according to the identification information of each first node in the first path and the identification information of each second node included in the path information, and if the identification information of the first node is the same as the identification information of any one of the second nodes, it is determined that the first node is a path coincident node of the first path and the path information; and if the identification information of the first node is different from the identification information of each second node, determining that the first node is a non-path coincident node of the first path and the path information. The identification information of the node (including the first node and the second node) is information that uniquely identifies the node, and may be location information of the node or a number that is previously assigned to each node in a map.

For example, the possibility of conflict at a non-path coincident node is very low, so its node cost value may be set to a low value, for example 0. The possibility of conflict at a path coincident node is much higher, so its node cost value may be set to a value higher than that of a non-path coincident node, that is, a value greater than 0, for example 0.1, 1, 2 or 3.

In a specific implementation, the possibility of conflict between the first agent and a second agent may differ from one path coincident node to another, so the node cost values of different path coincident nodes may also differ, and other information (such as priority and moving direction) needs to be combined to determine them. The higher the possibility of conflict at a path coincident node, the higher its node cost value; the lower the possibility of conflict, the lower its node cost value. In order to accurately determine the node cost value of the first node, in the embodiment of the present invention, the node cost value of the first node is determined in at least one of the following modes:
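The comparison of identification information and the resulting base cost values can be sketched as follows. Node identifiers are assumed to be hashable values such as the pre-assigned numbers, and `overlap_cost` is a hypothetical parameter standing for any value greater than 0.

```python
def base_node_costs(first_path, second_node_ids, overlap_cost=1.0):
    """Assign a base node cost value to every first node of a first path.

    A first node whose id also appears among the second nodes is a path
    coincident node and gets overlap_cost; every other node gets 0.
    """
    shared = set(second_node_ids)
    return {
        node: (overlap_cost if node in shared else 0.0)
        for node in first_path
    }
```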

Mode A: if the first node is determined to be the target position of the second agent, determining the node cost value of the first node according to the current second position of the second agent.

In this way, according to each first node in the first path and the second node included in the path information, it is determined whether the first node is a path coincident node of the first path and the path information.

In this manner, whether the first node is the target position of the second agent is determined according to the first node and the target position of the second agent in the path information. If it is, the current second position of the second agent is checked: if the second position is the same as the target position, it is determined that the second agent has reached the target position. Because it is unknown how long the second agent will stay at the target position, the possibility that the second agent conflicts with the first agent is considered extremely high. Therefore, the node cost value of this type of path coincident node may be set to a large value, for example 2 or 3.

Mode B: for a first node that is the same as any second node included in the path information, if, in a first path between the first position and the target position, the moving direction of the first agent through the first node is opposite to the moving direction of the second agent through the first node, determining that the node cost value of the first node is a preset value, or determining the node cost value of the first node according to a second target priority of the second agent.

In this manner, the first node is compared with each second node included in the path information, where a second node is a node included in the path along which the second agent moves from its current second position to its target position. If it is determined that the first node is the same as any one of the second nodes included in the path information, the moving direction of the first agent when passing through the first node and the moving direction of the second agent when passing through the second node that is the same as the first node are determined.
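The rule of mode A can be sketched as a small function. The name `mode_a_cost` and the default `occupied_cost` of 3.0 are illustrative assumptions (the source gives 2 and 3 as examples). The passage above only covers the case in which the second agent has already reached its target position, so the sketch returns `None` for every other case instead of guessing a value.

```python
from typing import Optional

def mode_a_cost(first_node, second_target, second_current,
                occupied_cost=3.0) -> Optional[float]:
    """Node cost value of first_node under mode A, or None if the rule is silent.

    first_node, second_target and second_current are node ids.
    """
    if first_node != second_target:
        return None  # mode A does not apply to this node
    if second_current == second_target:
        # The second agent is parked on the node for an unknown duration.
        return occupied_cost
    return None  # second agent not yet arrived: not covered by the passage above
```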

Specifically, the moving direction of the first agent from the first node to the next first node is determined according to the first node and the next first node of the first node, and the moving direction of the second agent from the second node to the next second node is determined according to the second node and the next second node of the second node.

Further, if the moving direction of the first agent passing through the first node is opposite to the moving direction of the second agent passing through the same second node, the node cost value of that first node is determined. Since the second agent and the first agent may collide at the first node when their moving directions through it are opposite, the node cost value of this type of path overlapping node may be a preset value, where the preset value is smaller than the node cost value determined in mode A; for example, the node cost value may be set to 0.7 or 1.3.

In some embodiments, to determine the node cost value more accurately, the second target priority of the second agent may also be considered, taking the preset value as the base node cost value of the first node with the opposite moving direction. For example, if the second target priority is lower than the first target priority of the first agent, the target increment of the node cost value corresponding to the second target priority is determined to be 0. If the second target priority is higher than the first target priority, the target increment of the node cost value corresponding to the second target priority is determined according to the second target priority and the increment of the node cost value set for each priority, where a higher priority has a larger increment. The sum of the node cost value of the first node with the opposite moving direction and the target increment is then determined as the node cost value of the first node.

In some embodiments, if the second target priority is lower than the first target priority, the target weight coefficient of the node cost value corresponding to the second target priority is determined to be 1. If the second target priority is higher than the first target priority, the target weight coefficient corresponding to the second target priority is determined according to the second target priority and the weight coefficient set for each priority, where the weight coefficient of any priority higher than the first target priority is a value greater than 1. The product of the node cost value of the first node with the opposite moving direction (the preset value) and the target weight coefficient is then taken as the node cost value of the first node.
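The mode B steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: nodes are modelled as integer grid coordinates, the preset value is taken as 1.3, and the names `direction_at`, `is_opposite` and `mode_b_cost` are hypothetical. The increment variant and the weight variant are alternatives in the description; the sketch exposes both through optional parameters whose defaults (0 and 1) leave the preset value unchanged.

```python
from typing import List, Optional, Tuple

Node = Tuple[int, int]        # assumed: nodes are grid coordinates
Direction = Tuple[int, int]

MODE_B_PRESET = 1.3           # assumed preset, smaller than the mode A value


def direction_at(path: List[Node], node: Node) -> Optional[Direction]:
    """Direction from `node` to the next node of `path`, or None at the end."""
    i = path.index(node)
    if i + 1 >= len(path):
        return None
    nxt = path[i + 1]
    return (nxt[0] - node[0], nxt[1] - node[1])


def is_opposite(d1: Optional[Direction], d2: Optional[Direction]) -> bool:
    """True when both directions exist and point exactly opposite ways."""
    if d1 is None or d2 is None:
        return False
    cross = d1[0] * d2[1] - d1[1] * d2[0]
    dot = d1[0] * d2[0] + d1[1] * d2[1]
    return cross == 0 and dot < 0


def mode_b_cost(first_path: List[Node], second_path: List[Node], node: Node,
                second_is_higher: bool,
                increment: float = 0.0, weight: float = 1.0) -> float:
    """Mode B cost of `node` with respect to one second agent."""
    if node not in first_path or node not in second_path:
        return 0.0            # not a shared node (assumed cost 0)
    if not is_opposite(direction_at(first_path, node),
                       direction_at(second_path, node)):
        return 0.0
    if not second_is_higher:
        return MODE_B_PRESET  # increment 0 / weight 1 for lower priority
    return MODE_B_PRESET * weight + increment
```

For two agents travelling along the same corridor in opposite directions, every shared interior node receives the preset value, raised by the increment or the weight only when the second agent has the higher target priority.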

Mode C: for a first node that is the same as any second node included in the path information, a difference between a first time at which the first agent is expected to reach the first node and a second time at which the second agent is expected to reach the first node is determined, and the node cost value of the first node is determined according to the difference, or according to the difference and the second target priority of the second agent.

In this manner, according to the first node and each second node included in the path information, if it is determined that the first node is the same as any of the second nodes included in the path information, a first time at which the first agent is expected to reach the first node and a second time at which the second agent is expected to reach the first node are determined.

Specifically, the first time at which the first agent is expected to reach the first node is determined according to the distance from the current first position of the first agent to the first node and the moving speed of the first agent, and the second time at which the second agent is expected to reach the first node is determined according to the distance from the current second position of the second agent to the first node and the moving speed of the second agent.

Further, the difference between the first time and the second time is determined and compared with a preset threshold. If the difference is greater than the preset threshold, the probability that the first agent and the second agent collide at the first node is considered extremely low, and the node cost value of the first node may, for example, be determined to be 0. If the difference is smaller than the preset threshold, the node cost value of the first node is determined according to the difference.

Specifically, the node cost value of the first node may be determined according to a predetermined cost value determination function and the difference. Alternatively, a node cost value may be preset for each difference range, and the node cost value corresponding to the target difference range in which the difference of the first node falls is determined as the node cost value of the first node.

When the first node is a path overlapping node, according to the difference between the first time and the second time, the probability that the first agent and the second agent collide at the first node is higher when the difference is smaller, and therefore the node cost value of the first node is higher when the difference between the first time and the second time is smaller.

If the difference between the first time and the second time is smaller than the preset threshold, the second target priority of the second agent is determined, and the node cost value of the first node is determined according to the second target priority and the difference between the first time and the second time. Specifically, the node cost value corresponding to the difference is determined according to the preset cost value determination function and the difference, where the smaller the difference, the higher the corresponding node cost value.

In some embodiments, in order to determine more accurately the node cost value of a first node whose difference is smaller than the preset threshold, the second target priority and the first target priority may further be considered on the basis of the node cost value corresponding to the difference. For example, if the second target priority is lower than the first target priority, the target increment of the node cost value corresponding to the second target priority is determined to be 0. If the second target priority is higher than the first target priority, the target increment of the node cost value corresponding to the second target priority is determined according to the second target priority and the increment of the node cost value set for each priority, where a higher priority has a larger increment. The sum of the node cost value corresponding to the difference and the target increment is then determined as the node cost value of the first node.

In some embodiments, if the second target priority is lower than the first target priority, the target weight coefficient of the node cost value corresponding to the second target priority is determined to be 1. If the second target priority is higher than the first target priority, the target weight coefficient corresponding to the second target priority is determined according to the second target priority and the weight coefficient set for each priority, where the weight coefficient of any priority higher than the first target priority is a value greater than 1. The product of the node cost value corresponding to the difference and the target weight coefficient is then taken as the node cost value of the first node.
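A minimal sketch of mode C, under stated assumptions, is given below. The description does not fix the cost value determination function; the linear function used here (highest cost at a difference of 0, falling to 0 at the threshold) is only one hypothetical choice that satisfies "the smaller the difference, the higher the cost". The names `arrival_time` and `mode_c_cost`, the parameter `max_cost`, and the treatment of a difference exactly equal to the threshold as cost 0 are likewise assumptions.

```python
def arrival_time(distance: float, speed: float) -> float:
    """Expected time to reach a node: distance divided by moving speed."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return distance / speed


def mode_c_cost(first_distance: float, first_speed: float,
                second_distance: float, second_speed: float,
                threshold: float, max_cost: float = 1.0,
                second_is_higher: bool = False,
                increment: float = 0.0, weight: float = 1.0) -> float:
    """Mode C cost of a shared node with respect to one second agent."""
    diff = abs(arrival_time(first_distance, first_speed)
               - arrival_time(second_distance, second_speed))
    if diff >= threshold:
        return 0.0                      # collision considered very unlikely
    # Assumed cost function: linear, decreasing in the time difference.
    base = max_cost * (1.0 - diff / threshold)
    if not second_is_higher:
        return base                     # increment 0 / weight 1
    return base * weight + increment
```

With a threshold of 4 s, agents expected to arrive 2 s apart yield a base cost of 0.5 in this sketch, and agents expected to arrive 16 s apart yield 0.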

In the embodiment of the present invention, more than one second agent may conflict with the first agent at the first node, and the node cost value of the first node may be determined by using at least one of the above modes A, B and C. Therefore, when determining the node cost value of the first node, the sum of the node cost values of the first node determined for the different second agents may be determined as the final node cost value of the first node. Alternatively, a weight value may be set for each mode, and the sum of the products of the cost value determined by each mode and the weight value of that mode is determined as the node cost value.

For example, if the first node is the target position of at least two second agents and the current second positions of these second agents are their target positions, the node cost value of the first node is determined for each second agent by using mode A, and the sum of the node cost values of the first node determined for each second agent is determined as the final node cost value of the first node.

If at least two second agents pass through the first node, and the moving direction of each of them through the first node is opposite to the moving direction of the first agent through the first node, the node cost value of the first node is determined for each second agent by using mode B, and the sum of the node cost values of the first node determined for each second agent is determined as the final node cost value of the first node.

If at least two second agents pass through the first node, and the difference between the first time at which the first agent is expected to reach the first node and each second time at which each of the at least two second agents is expected to reach the first node is smaller than the preset threshold, the node cost value of the first node is determined for each second agent by using mode C, and the sum of the node cost values of the first node determined for each second agent is determined as the final node cost value of the first node.

For example, the node cost value of the first node is determined by adopting mode A and mode B simultaneously. If the first node is the target position of a second agent and the current second position of that second agent is the target position, and at least one second agent passes through the first node in a moving direction opposite to the moving direction of the first agent through the first node, the sum of the node cost values of the first node determined by mode A and mode B is determined as the final node cost value of the first node.

For another example, the node cost value of the first node is determined by simultaneously adopting the mode A and the mode C; if the first node is a target position of a second agent, the current second position of the second agent is a target position, and at least one second agent passes through the first node, and the difference value between the first time when the first agent is expected to reach the first node and the second time when any one of the at least one second agent is expected to reach the first node is smaller than a preset threshold value, the sum of the node cost values of the first node determined by the method a and the method C is determined to be the final node cost value of the first node.

For another example, the node cost value of the first node is determined by adopting mode A, mode B and mode C simultaneously. If the first node is the target position of a second agent and the current second position of that second agent is the target position, the moving direction of at least one second agent passing through the first node is opposite to the moving direction of the first agent passing through the first node, and the difference between the first time at which the first agent is expected to reach the first node and the second time at which any one of the second agents is expected to reach the first node is smaller than the preset threshold, the sum of the node cost values of the first node determined by mode A, mode B and mode C is determined as the final node cost value of the first node.
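The aggregation described above (summing over second agents and over modes, optionally with a weight per mode) might be sketched as follows. The function name `combine_costs` and the dictionary layout are assumptions for illustration; the per-mode cost values are taken as already computed.

```python
from typing import Dict, Iterable, Optional


def combine_costs(per_agent_costs: Iterable[Dict[str, float]],
                  mode_weights: Optional[Dict[str, float]] = None) -> float:
    """Final node cost of one first node.

    per_agent_costs: one dict per second agent, mapping a mode label
        ("A", "B" or "C") to the cost that mode produced for the node.
    mode_weights: optional weight per mode; a missing weight counts as 1,
        which reduces the result to a plain sum.
    """
    weights = mode_weights or {}
    total = 0.0
    for costs in per_agent_costs:
        for mode, cost in costs.items():
            total += weights.get(mode, 1.0) * cost
    return total
```

For instance, one second agent contributing 2.0 through mode A and 1.3 through mode B, and another contributing 0.5 through mode C, give a final node cost value of 3.8 without weights.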

Example 5: in order to avoid collision during the moving of the agents, on the basis of the foregoing embodiments, in an embodiment of the present invention, the controlling the first agent to move according to the expected path and the path information includes:

determining, for the first node in the desired path, whether the first agent and the second agent would collide at the first node; and controlling the first agent to move according to the conflict determination result.

In the embodiment of the invention, for each first node in the expected path, whether the first agent and the second agent conflict at the first node is determined.

In a specific implementation process, the first agent is controlled to move according to the conflict determination result at each first node. Specifically, if a conflict occurs at a first node, the first agent is controlled to avoid; if no conflict occurs at any first node, the first agent is controlled to move from the first position to the target position without any avoidance during the movement.

To determine whether a first agent and a second agent may conflict at a first node, in an embodiment of the present invention, the determining whether the first agent and the second agent may conflict according to the first node in the desired path includes:

acquiring a second target priority and a second moving direction of the second agent; and

if the second target priority is higher than the first target priority of the first agent, the first moving direction of the first agent is opposite to the second moving direction, and a second node contained in the path information is the same as any first node in the expected path, it is determined that the first agent and the second agent will conflict in the expected path.

Otherwise, it is determined that the first agent and the second agent do not conflict in the expected path. For example, if the second moving direction is opposite to the first moving direction but each second node included in the path information is different from every first node in the expected path, it is determined that the first agent and the second agent do not conflict in the expected path. Likewise, if the second target priority is higher than the first target priority of the first agent and the first moving direction of the first agent is the same as the second moving direction, and the second node contained in the path information is determined to be different from every first node in the expected path, it is determined that the first agent and the second agent do not conflict in the expected path.

In order to determine whether a first agent and a second agent conflict at a first node in a specific implementation process, in an embodiment of the present invention, the electronic device may further obtain a second target priority and a second moving direction of the second agent. The second target priority is determined according to the agent priority, the service priority, the distance priority and the like of the second agent, and the second moving direction is the moving direction of the second agent.

And if the second target priority is higher than the first target priority of the first agent and the first moving direction and the second moving direction of the first agent are opposite, according to the second node and the first node in the expected path contained in the path information, if the second node is determined to be the same as any first node in the expected path, determining that the first agent and the second agent conflict in the expected path.

Specifically, according to the identification information of the second node and the identification information of the first node in the expected path included in the path information, if it is determined that the identification information of the second node is the same as the identification information of any first node in the expected path, it is determined that the first agent and the second agent are more likely to collide in the expected path.

In order to determine more accurately whether the second node included in the path information is the same as any first node in the expected path, on the basis of the foregoing embodiments, in an embodiment of the present invention, the determining that the second node included in the path information is the same as any first node in the expected path includes:

a second node corresponding to a second position where the second agent is located at present is the same as any first node in the expected path; or

a target second node in the path information is the same as any first node in the expected path, where the target second node is a second node located within a third set length range from the second position.

In some embodiments, the path information includes a second location of the second agent, and if the second location is any first node in the expected path, it is determined that a second node corresponding to the second location is the same as any first node in the expected path, and it is determined that the first agent and the second agent are more likely to collide in the expected path.

In some embodiments, a target second node located within a third set length range from the second position is determined according to each second node and the second position included in the path information, and if the target second node is the same as any first node in the expected path, it is determined that the first agent and the second agent have a greater possibility of colliding in the expected path.

The third set length, which may be the same as or different from the first set length of the mode b and the second set length of the mode c in the embodiment 3, may be set shorter if it is desired to improve the efficiency of determining whether or not a collision occurs in the desired path, or may be set longer if it is desired to avoid a collision occurring in the desired path as much as possible.

Specifically, according to the identification information of the target second node and the identification information of the first node, if it is determined that the identification information of any first node is the same as the identification information of the target second node, it is determined that the target second node is the same as any first node in the expected path.
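The conflict determination of this embodiment can be sketched as follows. This is only an illustration under stated assumptions: nodes are identified by their coordinates (so that equal identification information means equal coordinates), the third set length is measured as straight-line distance from the second position, and the name `will_conflict` is hypothetical.

```python
import math
from typing import List, Tuple

Node = Tuple[float, float]    # assumed: a node is identified by coordinates


def will_conflict(expected_path: List[Node],
                  second_position: Node,
                  second_path: List[Node],
                  second_is_higher: bool,
                  directions_opposite: bool,
                  third_set_length: float) -> bool:
    """Return True when the first and second agent conflict in expected_path."""
    # Priority and direction conditions must both hold.
    if not (second_is_higher and directions_opposite):
        return False
    first_nodes = set(expected_path)
    # Case 1: the second agent currently stands on a node of the expected path.
    if second_position in first_nodes:
        return True
    # Case 2: a target second node (within the third set length of the
    # second position) coincides with a node of the expected path.
    return any(node in first_nodes
               for node in second_path
               if math.dist(node, second_position) <= third_set_length)
```

A shorter third set length examines fewer second nodes and so is cheaper to evaluate, while a longer one detects conflicts earlier, which matches the trade-off described above.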

Example 6: to control the first agent to move to the target location, on the basis of the foregoing embodiments, in an embodiment of the present invention, the controlling the first agent to move according to the result of the conflict determination includes:

if at least one first node conflicts and a first target priority of the first agent is higher than a second target priority of the second agent, controlling the first agent to move to the target position according to the expected route; or

if at least one first node conflicts and the second target priority of the second agent is higher than the first target priority of the first agent, a stoppable node is determined, and the first agent is controlled to move to the target position after stopping at the stoppable node.

In the embodiment of the invention, if the first agent and any second agent are determined to conflict at least one first node, and the second target priority of the second agent is determined to be higher than the first target priority of the first agent, the first agent is controlled to avoid.

Specifically, the stoppable node may be a node corresponding to the first position, or may be another node, and after the stoppable node is determined, the first agent is controlled to stop waiting at the stoppable node.

Specifically, if the stoppable node is a node corresponding to the current first position of the first agent, the first agent is controlled to stop waiting at the first position; and if the stoppable node is other nodes, determining a path from the current first position to the stoppable node, and controlling the first agent to stop waiting after moving to the stoppable node.

In order to control the first agent to move to the target position, the first agent is controlled to move to the target position after it has stopped and waited at the stoppable node.

As a possible implementation manner, the electronic device may control the first agent to move to the target location along the expected path when determining that the second node corresponding to the second location does not exist in the expected path according to the path information acquired later.

As another possible implementation manner, in the embodiment of the present invention, when the expected path is the sub-path of the first set length that starts from the first position in the first path with the lowest total node cost value, as in embodiment 2 above, the electronic device determines the sub-path of the expected path periodically, once per cycle. The electronic device therefore controls the first agent to stop at the stoppable node until the current cycle ends, obtains the path information sent by the second agent in the next cycle, and determines the sub-path of the expected path for the next cycle. If it is determined that the first agent and the second agent still conflict in the expected path of the next cycle, the first agent moves to a new stoppable node and continues to wait. This is repeated until a cycle in which the first agent does not conflict with the second agent in the expected path, after which the first agent is controlled to move from the stoppable node toward the target position according to the sub-path of the new expected path.
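The decision rule of this embodiment and the cycle-by-cycle waiting can be sketched as follows. The action labels and the names `decide_action` and `run_cycles` are hypothetical, and the per-cycle conflict results are supplied as plain booleans rather than computed from path information.

```python
from typing import Iterable, List

STOP = "stop_at_stoppable_node"      # hypothetical action labels
FOLLOW = "follow_expected_path"


def decide_action(conflict: bool, second_is_higher: bool) -> str:
    """Avoid only when a conflict exists and the second agent has priority."""
    if conflict and second_is_higher:
        return STOP
    return FOLLOW


def run_cycles(conflict_by_cycle: Iterable[bool],
               second_is_higher: bool = True) -> List[str]:
    """Re-evaluate once per cycle; wait until a conflict-free cycle occurs."""
    actions: List[str] = []
    for conflict in conflict_by_cycle:
        action = decide_action(conflict, second_is_higher)
        actions.append(action)
        if action == FOLLOW:
            break                    # the agent resumes toward its target
    return actions
```

In this sketch a first agent that sees a conflict in two consecutive cycles and none in the third waits twice and then resumes along the newly determined sub-path.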

Example 7: in order to determine the target priority of the agent, on the basis of the foregoing embodiments, determining the target priority of the agent includes:

Mode 1: the highest priority among at least one of the agent priority, the service priority and the distance priority in the path information is determined as the target priority of the agent.

In this mode, the highest priority among at least one of the determined agent priority, service priority and distance priority is identified, and this highest priority is determined as the target priority of the agent.

The agent priority is the priority of the agent itself, and different agents may be configured with different priorities in advance; for example, an agent serving a VIP client has a higher agent priority than an agent serving a general client.

The service priority is the priority of the service executed by the agent, for example, in a restaurant scenario, the service priority of the agent executing the meal delivery service is higher than the service priority of the agent executing the chat service.

The distance priority is a priority determined according to the distance of the agent from the target position, and the closer the agent is to the target position, the higher the distance priority, for example, the agent 0.5 m away from the target position has a higher distance priority than the agent 3 m away from the target position.

For example, when the agent priority of the agent is a first priority, the service priority is a third priority, and the distance priority is a second priority, the first priority with the highest priority is determined as the target priority of the agent.

Mode 2: a weighted sum of priorities is determined according to the agent priority, the service priority and the distance priority in the path information and their respective weights, and the weighted sum is determined as the target priority of the agent.

In this mode, the weighted sum of priorities may be determined according to the agent priority, the service priority, the distance priority and their respective weights, and the weighted sum is determined as the target priority of the agent.

For example, when the agent priority is the first priority, the service priority is the third priority and the distance priority is the second priority, and the weight of the agent priority is 0.5, the weight of the service priority is 0.2 and the weight of the distance priority is 0.3, the weighted sum of priorities is 1 × 0.5 + 3 × 0.2 + 2 × 0.3 = 1.7; that is, the target priority of the agent is 1.7.
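Both ways of determining the target priority can be sketched as follows. The sketch assumes that priorities are encoded as ranks in which 1 denotes the first (highest) priority, as the worked examples suggest; the function names and dictionary keys are hypothetical.

```python
from typing import Dict, Iterable


def target_priority_highest(ranks: Iterable[int]) -> int:
    """Mode 1: take the highest of the given priorities.

    Assumption: rank 1 is the highest priority, so the highest
    priority is the numerically smallest rank.
    """
    return min(ranks)


def target_priority_weighted(ranks: Dict[str, int],
                             weights: Dict[str, float]) -> float:
    """Mode 2: weighted sum of the priority ranks."""
    return sum(ranks[name] * weights[name] for name in ranks)


ranks = {"agent": 1, "service": 3, "distance": 2}
weights = {"agent": 0.5, "service": 0.2, "distance": 0.3}
print(target_priority_highest(ranks.values()))               # first priority
print(round(target_priority_weighted(ranks, weights), 2))    # weighted sum
```

With the ranks and weights of the examples above, mode 1 yields the first priority and mode 2 yields 1.7.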

Example 8: in order to determine a stoppable node, on the basis of the foregoing embodiments, in an embodiment of the present invention, the determining a stoppable node includes:

if a first node different from each second node included in the path information exists in the desired path, a stoppable node may be selected from the different first nodes.

Since a target first node that is the same as a second node included in the path information exists in the expected path, a first node that differs from every second node may be located either before or after the target first node in the expected path. Because the first agent may be unable to pass through the target first node to reach a node located after it for avoidance, in the embodiment of the present invention, to increase the likelihood that the first agent can move to the stoppable node, the stoppable node is determined in the following manner.

Mode 1: if, in the part of the expected path before the target first node, there is a first node that is different from each second node contained in the path information, where the target first node is the first node at which the conflict occurs, the stoppable node is selected from these different first nodes.

In this mode, a stoppable node is first sought in the expected path. The electronic device determines the first nodes that precede the conflicting target first node in the expected path, and determines whether any of them differs from every second node included in the path information. If such first nodes exist, a stoppable node is selected from these non-coincident first nodes. Any of them may be selected as the stoppable node; preferably, the first node closer to the target position is selected.

Fig. 2 is a schematic diagram illustrating paths in a map represented by nodes according to an embodiment of the present invention. As shown in Fig. 2, the nodes included in the paths are, respectively, the nodes from the 1st node on the left to the 1st node on the right, the nodes from the 1st node at the bottom to the 8th node, and the nodes from the 1st node at the top to the 7th node, where left, right, top and bottom refer to the orientation of the figure.

Fig. 3 is a schematic diagram of a path for controlling the movement of the first agent according to an embodiment of the present invention. As shown in Fig. 3, the electronic device controls first agent A to move from the current first position A1 to the target position A2. The expected path of first agent A is the sequence of nodes between the A1 node and the A2 node, for example the nodes in the sequence that runs from left to right in Fig. 3 and ends below the A2 node, followed by the nodes in the sequence that runs from below the A2 node to the A2 node. The arrows in the figure indicate the moving directions of the agents.

Second agent B is moving from the second position B1 to its own target position B2, and the path information of second agent B includes the nodes from the B1 node to the B2 node, for example the nodes in the sequence that runs from the B1 node on the right toward the left in fig. 2 and ends above the B2 node, followed by the nodes in the sequence that runs from above the B2 node to the B2 node. The first target priority of first agent A is lower than the second target priority of second agent B, and the first moving direction of first agent A is opposite to the second moving direction of second agent B.

The first nodes that precede the target first node (where the conflict may occur) in the expected path and that differ from every second node in the path information are the 2nd to the 5th nodes from the left, and the 5th node is selected as the stoppable node. The electronic device therefore controls first agent A to move from the 2nd node from the left to the 5th node for avoidance, and, once it is determined that the first agent and the second agent no longer conflict in the expected path, controls first agent A to move to the target position A2.

Mode 2: selecting the stoppable node from nodes other than the second node included in the path information and the first node included in the desired path.

In this manner, if it is determined that all of the first nodes before the target first node are the same as the second nodes, it is determined that there is no stoppable node in the desired path, and therefore the electronic device may determine a stoppable node from the paths that are allowed to move other than the desired path, and select one node from the nodes other than the second node included in the path information and the first node included in the desired path to determine as a stoppable node.

In some embodiments, in order to reduce the time for moving to the target location, in an embodiment of the present invention, the selecting the stoppable node from nodes other than the second node included in the path information and the first node included in the desired path includes:

and selecting the node closest to the first position from the other nodes, and determining the node as a stoppable node.

That is, among the paths that allow movement, the node closest to the first position is found among the nodes other than the second nodes contained in the path information and the first nodes contained in the expected path, and this node is determined as the stoppable node.

For example, Fig. 4 is a schematic diagram of an expected path of an agent according to an embodiment of the present invention. As shown in Fig. 4, the electronic device controls first agent A to move from the current first position A1 to the target position A2. The expected path of first agent A is the sequence of nodes between the A1 node and the A2 node, for example the nodes in the sequence that runs from the A1 node on the left toward the right in Fig. 4 and ends below the A2 node, followed by the nodes in the sequence that runs from below the A2 node to the A2 node; that is, the path consists of the 8th to the 15th nodes from the left and the 3rd to the 7th nodes from the top.

Second agent B is moving from the second position B1 to its own target position B2, and the path information of second agent B includes the nodes from the B1 node to the B2 node, for example the nodes in the sequence that runs from the B1 node on the right toward the left in FIG. 2 and ends above the B2 node, followed by the nodes in the sequence that runs from above the B2 node to the B2 node; that is, the path consists of the 4th to the 17th nodes from the right and the 4th to the 8th nodes from the bottom. The first target priority of first agent A is lower than the second target priority of second agent B, and the first moving direction of first agent A is opposite to the second moving direction of second agent B.

Since there is no stoppable node in the desired path of the first agent A, the electronic device determines the node closest to the first position among the nodes other than the second nodes included in the path information and the first nodes included in the desired path, and controls the first agent to move from the first position to the determined node for avoidance.

Fig. 5 is a schematic diagram of a path for controlling the movement of a first agent according to an embodiment of the present invention. As shown in Fig. 5, the electronic device determines that the other path is the path formed by the 1st node to the 5th node counted from the left (left-right direction of the figure). The electronic device therefore controls the first agent to move from node A1 to the other path for avoidance, following the path formed by the 15th node to the 18th node counted from the right (left-right direction of the figure) in Fig. 4. Once it is determined that the first agent A and the second agent B no longer conflict in the desired path, the electronic device controls the first agent A to move to the target position A2.

Example 9: on the basis of the above embodiments, fig. 6 is a schematic structural diagram of an apparatus for multi-agent navigation according to an embodiment of the present invention, where the apparatus includes:

an obtaining module 601, configured to obtain path information sent by at least one second agent in the multi-agent;

a determining module 602, configured to determine an expected path of a first agent according to a first node included between a current first location and a target location of the first agent and a node cost value of the first node;

a control module 603, configured to control the first agent to move according to the expected path and the path information.

In a possible implementation manner, the determining module is specifically configured to: starting from the first position, determine in sequence from the first nodes, based on the node cost values of the first nodes, the target nodes that satisfy a node cost value condition, so as to determine the expected path; or determine each first path between the first position and the target position, determine the node cost total value of each first path according to the node cost values of the first nodes contained in that first path, and determine the first path whose node cost total value meets the requirement as the expected path.
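The two alternatives above can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the function names, the adjacency-map representation of the nodes, the base cost of 1.0 for nodes without an explicit cost value, and the reading of the "node cost value condition" as "cheapest unvisited neighbour" are all assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Optional, Sequence

Node = Hashable
BASE_COST = 1.0  # assumed cost of a node that has no explicit node cost value


def path_total_cost(path: Sequence[Node], node_cost: Dict[Node, float]) -> float:
    """Node cost total value of one candidate first path."""
    return sum(node_cost.get(node, BASE_COST) for node in path)


def select_by_total_cost(first_paths: Sequence[List[Node]],
                         node_cost: Dict[Node, float]) -> List[Node]:
    """Second alternative: evaluate every first path and keep the cheapest."""
    if not first_paths:
        raise ValueError("no first path between the first position and the target")
    return min(first_paths, key=lambda p: path_total_cost(p, node_cost))


def select_greedily(neighbors: Dict[Node, List[Node]],
                    node_cost: Dict[Node, float],
                    first_position: Node,
                    target: Node) -> Optional[List[Node]]:
    """First alternative: extend the path node by node from the first position.

    Assumed condition: step to the target if it is adjacent, otherwise to the
    unvisited neighbour with the lowest node cost value.
    """
    path = [first_position]
    visited = {first_position}
    current = first_position
    while current != target:
        candidates = [n for n in neighbors.get(current, []) if n not in visited]
        if not candidates:
            return None  # dead end; a real planner would backtrack or replan
        if target in candidates:
            current = target
        else:
            current = min(candidates, key=lambda n: node_cost.get(n, BASE_COST))
        path.append(current)
        visited.add(current)
    return path
```

The greedy variant is cheaper per step but can run into a dead end, whereas the total-cost variant needs all candidate first paths enumerated up front.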

In a possible implementation manner, the determining module is specifically configured to:

determining a first path with the lowest total node cost value as an expected path; or

in each set period, according to the path information sent by the second agent and acquired in that period, determining the first path with the lowest node cost total value, and determining the sub-path of a first set length in that first path, starting from the first position, as the sub-path of the expected path in that period; or

in each set period, according to the path information sent by the second agent and acquired in that period, determining the node cost total value of the sub-path of a second set length in each first path, starting from the first position, and determining the sub-path with the lowest node cost total value as the sub-path of the expected path in that period.
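The two per-period variants differ in what is compared: the whole first path or only its leading sub-path. A minimal Python sketch of the difference follows. The function names, the base cost of 1.0 for unlisted nodes, and measuring the "set length" as a number of nodes are assumptions for illustration; `node_cost` is assumed to have been rebuilt from the path information received in the current period.

```python
from typing import Dict, Hashable, List, Sequence

Node = Hashable
BASE_COST = 1.0  # assumed cost of a node that has no explicit node cost value


def path_total_cost(path: Sequence[Node], node_cost: Dict[Node, float]) -> float:
    """Node cost total value of a path or sub-path."""
    return sum(node_cost.get(node, BASE_COST) for node in path)


def sub_path_of_best_full_path(first_paths: Sequence[List[Node]],
                               node_cost: Dict[Node, float],
                               first_set_length: int) -> List[Node]:
    """Pick the cheapest complete first path, then keep its leading sub-path."""
    best = min(first_paths, key=lambda p: path_total_cost(p, node_cost))
    return best[:first_set_length]


def cheapest_sub_path(first_paths: Sequence[List[Node]],
                      node_cost: Dict[Node, float],
                      second_set_length: int) -> List[Node]:
    """Compare only the leading sub-paths and keep the cheapest of them."""
    prefixes = [p[:second_set_length] for p in first_paths]
    return min(prefixes, key=lambda p: path_total_cost(p, node_cost))
```

With the same inputs the two variants can disagree: a path that is cheap near the first position may be expensive further on, so the second variant trades global optimality for less computation per period.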

In a possible embodiment, the determining module is further configured to determine the node cost value of the first node according to at least one of the following:

if the first node is determined to be the target position of the second agent, determining the node cost value of the first node according to the current second position of the second agent;

for a first node that is the same as any second node included in the path information, if a moving direction of the first agent passing through the first node is opposite to a moving direction of the second agent passing through the first node in a first path between the first position and the target position, determining that a node cost value of the first node is a preset value, or determining the node cost value of the first node according to a second target priority of the second agent;

and for a first node which is the same as any second node contained in the path information, determining a difference value between a first time when the first agent expects to reach the first node and a second time when the second agent expects to reach the first node, and determining the node cost value of the first node according to the difference value or determining the node cost value of the first node according to the difference value and a second target priority of the second agent.
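As a hedged illustration of the second and third rules, the Python sketch below assigns a cost to a shared node. The text here does not fix the actual mappings, so everything numeric is an assumption: the preset value of 100.0, the 5-second window inside which close arrival times are penalised, the linear penalty, and the use of the second target priority as a simple multiplier. Directions are assumed to be grid steps `(dx, dy)`. The first rule is omitted because this passage does not say how the second position maps to a cost.

```python
from typing import Optional, Tuple

Direction = Tuple[int, int]      # assumed grid step, e.g. (1, 0) = moving right
PRESET_OPPOSING_COST = 100.0     # assumed preset value
CONFLICT_WINDOW = 5.0            # assumed window (seconds) for close arrivals


def is_opposite(a: Direction, b: Direction) -> bool:
    """True when the two moving directions are exactly opposed."""
    return a != (0, 0) and a == (-b[0], -b[1])


def opposing_direction_cost(first_dir: Direction,
                            second_dir: Direction,
                            second_priority: Optional[float] = None,
                            preset: float = PRESET_OPPOSING_COST) -> float:
    """Rule 2: cost of a shared node that the agents cross in opposite directions."""
    if not is_opposite(first_dir, second_dir):
        return 0.0
    # Assumed mapping: scale the preset value by the second target priority.
    return preset if second_priority is None else preset * second_priority


def arrival_gap_cost(first_eta: float,
                     second_eta: float,
                     second_priority: Optional[float] = None,
                     window: float = CONFLICT_WINDOW) -> float:
    """Rule 3: the closer the two expected arrival times, the higher the cost."""
    gap = abs(first_eta - second_eta)
    cost = max(0.0, window - gap)
    if second_priority is not None:
        cost *= second_priority
    return cost
```

When several rules apply to the same node, the resulting values could be summed or the maximum taken; the choice is left open here.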

In a possible implementation, the control module is specifically configured to determine, according to the first nodes in the desired path, whether the first agent and the second agent will conflict, and to control the first agent to move according to the conflict determination result.

In a possible implementation manner, the control module is specifically configured to obtain a second target priority and a second moving direction of the second agent; if the second target priority is higher than the first target priority of the first agent, the first moving direction of the first agent is opposite to the second moving direction, and the second node contained in the path information is the same as any first node in the expected path, it is determined that the first agent and the second agent conflict in the expected path.

In a possible implementation manner, the control module is specifically configured to determine that the second node contained in the path information is the same as any first node in the expected path when: the second node corresponding to the second position where the second agent is currently located is the same as any first node in the expected path; or a target second node in the path information is the same as any first node in the expected path, the target second node being a second node located within a third set length range from the second position.
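The conflict test described in the two preceding implementations combines three conditions: higher second target priority, opposite moving directions, and a shared node near the second agent. A minimal Python sketch follows. The names are hypothetical, directions are assumed to be grid steps `(dx, dy)`, `second_path[0]` is assumed to be the node at the second agent's current second position, and the third set length is assumed to be counted in nodes.

```python
from typing import Hashable, Sequence, Tuple

Node = Hashable
Direction = Tuple[int, int]  # assumed grid step, e.g. (-1, 0) = moving left


def is_opposite(a: Direction, b: Direction) -> bool:
    """True when the two moving directions are exactly opposed."""
    return a != (0, 0) and a == (-b[0], -b[1])


def shares_node(expected_path: Sequence[Node],
                second_path: Sequence[Node],
                third_set_length: int) -> bool:
    """True if the second agent's current node, or one of the next
    third_set_length nodes on its path, also lies on the expected path."""
    window = second_path[: third_set_length + 1]
    expected = set(expected_path)
    return any(node in expected for node in window)


def will_conflict(first_priority: float,
                  second_priority: float,
                  first_dir: Direction,
                  second_dir: Direction,
                  expected_path: Sequence[Node],
                  second_path: Sequence[Node],
                  third_set_length: int = 2) -> bool:
    """All three conditions must hold for a conflict in the expected path."""
    return (second_priority > first_priority
            and is_opposite(first_dir, second_dir)
            and shares_node(expected_path, second_path, third_set_length))
```

Restricting the check to a window around the second position keeps a distant overlap from triggering avoidance too early.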

In a possible implementation, the control module is specifically configured to: if a conflict occurs at at least one first node and the first target priority of the first agent is higher than the second target priority of the second agent, control the first agent to move to the target position according to the desired path; or, if a conflict occurs at at least one first node and the second target priority of the second agent is higher than the first target priority of the first agent, determine a stoppable node and control the first agent to move to the target position after stopping at the stoppable node.

In a possible implementation, the determining module is further configured to determine the target priority of an agent by: determining the highest priority among at least one of the agent priority, the service priority and the distance priority in the path information as the target priority of the agent; or determining the target priority of the agent according to at least one of the agent priority, the service priority and the distance priority in the path information and the corresponding weights; wherein the agent comprises the first agent and the second agent.
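The two ways of deriving a target priority can be sketched in a few lines of Python. The dictionary keys, the function names, and the interpretation of "according to the corresponding weights" as a weighted sum are assumptions for illustration only.

```python
from typing import Dict


def target_priority_highest(priorities: Dict[str, float]) -> float:
    """First option: the highest of the available priorities."""
    if not priorities:
        raise ValueError("at least one priority is required")
    return max(priorities.values())


def target_priority_weighted(priorities: Dict[str, float],
                             weights: Dict[str, float]) -> float:
    """Second option, assumed here to be a weighted sum of the priorities.

    A priority with no configured weight contributes nothing.
    """
    if not priorities:
        raise ValueError("at least one priority is required")
    return sum(weights.get(name, 0.0) * value for name, value in priorities.items())
```

For example, with priorities `{"agent": 2, "service": 5, "distance": 1}` the first option yields 5, while weights of 0.5, 0.25 and 0.25 yield 2.5 under the weighted-sum assumption.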

In a possible implementation manner, the control module is specifically configured to: if, in the part of the expected path before the conflicting first node, there is a first node different from every second node contained in the path information, select the stoppable node from those different first nodes; or select the stoppable node from the nodes other than the second nodes contained in the path information and the first nodes contained in the desired path.

In a possible implementation manner, the control module is specifically configured to select a node closest to the first position from the other nodes, and determine the node as a stoppable node.
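The stoppable-node selection described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions: nodes are grid coordinates `(x, y)`, distance is Manhattan distance, `movable_nodes` stands for the nodes on paths that allow movement, and when several on-path candidates exist the one nearest the conflict is chosen (the text does not fix that choice).

```python
from typing import List, Optional, Sequence, Tuple

Node = Tuple[int, int]  # assumed grid coordinates


def manhattan(a: Node, b: Node) -> int:
    """Assumed distance measure between two nodes."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def select_stoppable_node(expected_path: Sequence[Node],
                          conflict_index: int,
                          second_nodes: Sequence[Node],
                          movable_nodes: Sequence[Node],
                          first_position: Node) -> Optional[Node]:
    """Return a node where the first agent can wait, or None if there is none."""
    second = set(second_nodes)
    # Option 1: a node on the expected path, before the conflicting node,
    # that the second agent's path does not use.
    on_path: List[Node] = [n for n in expected_path[:conflict_index] if n not in second]
    if on_path:
        return on_path[-1]  # assumed choice: advance as far as possible
    # Option 2: any other movable node, outside both paths, closest to the
    # first position.
    excluded = second | set(expected_path)
    others = [n for n in movable_nodes if n not in excluded]
    if not others:
        return None
    return min(others, key=lambda n: manhattan(n, first_position))
```

In a corridor where the second agent's path covers the whole expected path, option 1 finds nothing and the agent sidesteps to the nearest free node, which matches the avoidance shown for the first agent in Fig. 5.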

Example 10: fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, and on the basis of the foregoing embodiments, an electronic device according to an embodiment of the present invention is further provided, where the electronic device includes a processor 701, a communication interface 702, a memory 703 and a communication bus 704, where the processor 701, the communication interface 702 and the memory 703 complete mutual communication through the communication bus 704;

the memory 703 has stored therein a computer program which, when executed by the processor 701, causes the processor 701 to perform the steps of:

acquiring path information sent by at least one second agent in the multi-agent; determining an expected path of the first agent according to a first node contained between a current first position and a target position of the first agent and the node cost value of the first node; and controlling the first agent to move according to the expected path and the path information.

Since the electronic device solves the problem on a principle similar to that of the multi-agent navigation method, the implementation of the electronic device may refer to the implementation of the method, and details already described are not repeated here.

The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus. The communication interface 702 is used for communication between the above-described electronic apparatus and other apparatuses. The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Alternatively, the memory may be at least one memory device located remotely from the processor.

The processor may be a general-purpose processor, including a central processing unit, a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an application specific integrated circuit, a field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like.

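Example 11: on the basis of the foregoing embodiments, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, performs the following steps:

acquiring path information sent by at least one second agent in the multi-agent; determining an expected path of the first agent according to a first node contained between a current first position and a target position of the first agent and the node cost value of the first node; and controlling the first agent to move according to the expected path and the path information.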

The computer-readable storage medium provided by the embodiment of the present invention solves the problem on the same principle as the multi-agent navigation method described above; for details, refer to the method embodiments.

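Example 12: on the basis of the above embodiments, the present invention further provides a computer program product comprising a computer program stored on a computer-readable storage medium, the computer program comprising program instructions which, when executed by a processor, perform the following steps:

acquiring path information sent by at least one second agent in the multi-agent; determining an expected path of the first agent according to a first node contained between a current first position and a target position of the first agent and the node cost value of the first node; and controlling the first agent to move according to the expected path and the path information.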

The computer program product provided by the embodiment of the present invention solves the problems by the same principle as the multi-agent navigation method described above, and the detailed contents can be referred to the above method embodiment.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims

1. A multi-agent navigation method, applied to a first agent of said multi-agent, said method comprising:

2. The method of claim 1, wherein the determining the desired path of the first agent comprises:

determining, in sequence starting from the first position and based on the node cost values of the first nodes, each target node among the first nodes that meets a node cost value condition, so as to determine the expected path; or

determining each first path between the first position and the target position, determining a node cost total value of each first path according to the node cost value of the first node contained in each first path, and determining the first path with the node cost total value meeting the requirement as the expected path.

3. The method according to claim 2, wherein the determining the first path with the total node cost value satisfying the requirement as the desired path comprises:

4. The method according to any of claims 1-3, wherein the node cost value of the first node is determined according to at least one of:

5. The method of claim 1, wherein controlling the first agent to move according to the desired path and the path information comprises:

determining whether the first agent and the second agent conflict with each other according to the first node in the expected path;

and controlling the first agent to move according to the conflict determination result.

6. The method of claim 5, wherein determining, according to the first node in the desired path, whether the first agent and the second agent conflict comprises:

acquiring a second target priority and a second moving direction of the second agent;

if the second target priority is higher than the first target priority of the first agent, the first moving direction of the first agent is opposite to the second moving direction, and the second node contained in the path information is the same as any first node in the expected path, it is determined that the first agent and the second agent conflict in the expected path.

7. The method of claim 6, wherein the step of identifying the second node included in the path information as the same as any of the first nodes in the desired path comprises:

8. The method of claim 5, wherein controlling the first agent to move based on the conflict determination comprises:

9. The method of any of claims 6-8, wherein determining the target priority of the agent comprises:

determining the highest priority in at least one of the agent priority, the service priority and the distance priority in the path information as the target priority of the agent; or

determining the target priority of the agent according to at least one of agent priority, service priority and distance priority in the path information and the corresponding weight thereof;

wherein the agent comprises the first agent and the second agent.

10. The method of claim 8, wherein determining a stoppable node comprises:

if a first node different from each second node contained in the path information exists in a path before the first node of the expected path where the conflict occurs, selecting the stoppable node from the different first nodes; or

selecting the stoppable node from nodes other than the second node included in the path information and the first node included in the desired path.

11. The method according to claim 10, wherein said selecting the stoppable node from nodes other than the second node included in the path information and the first node included in the desired path comprises:

12. An apparatus for multi-agent navigation, the apparatus comprising:

13. An electronic device, comprising: a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;

the memory has stored therein a computer program which, when executed by the processor, causes the processor to perform the method of any of claims 1-11.

14. A computer-readable storage medium, characterized in that it stores a computer program executable by a processor, which program, when run on the processor, causes the processor to carry out the method of any one of claims 1-11.