CN115759199A - Multi-robot environment exploration method and system based on hierarchical graph neural network - Google Patents

Multi-robot environment exploration method and system based on hierarchical graph neural network Download PDF

Info

Publication number
CN115759199A
CN115759199A (application CN202211454807.3A; granted publication CN115759199B)
Authority
CN
China
Prior art keywords
robot
graph
topological graph
topological
environment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211454807.3A
Other languages
Chinese (zh)
Other versions
CN115759199B (en
Inventor
程吉禹
张�浩
张伟
张�林
宋然
李晓磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University filed Critical Shandong University
Priority to CN202211454807.3A priority Critical patent/CN115759199B/en
Publication of CN115759199A publication Critical patent/CN115759199A/en
Application granted granted Critical
Publication of CN115759199B publication Critical patent/CN115759199B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Manipulator (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention provides a multi-robot environment exploration method and system based on a hierarchical graph neural network, relating to the field of multi-robot exploration of unknown environments. The method comprises the following steps: representing a continuous environment map as a topological graph using a topology-based environment modeling method; extracting features from the topological graph with a hierarchical graph neural network, aggregating feature information from different hop counts in the graph, and fusing the features of nodes and edges with a multi-head attention mechanism to obtain a final output topological graph; and taking the node features corresponding to each robot node in the final output topological graph as the state of the corresponding agent in reinforcement learning, and fusing the node features of the multiple robots with a multi-head attention mechanism to obtain the overall state value of the robot system. The invention can attentively extract feature information from the environment topological graph and performs policy learning within a multi-agent reinforcement learning framework, improving the overall cooperation and task-execution efficiency of the multi-robot system.

Description

Multi-robot environment exploration method and system based on hierarchical graph neural network
Technical Field
The invention belongs to the field of multi-robot unknown environment exploration, and particularly relates to a multi-robot environment exploration method and system based on a hierarchical graph neural network.
Background
The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.
Exploration of unknown environments, a basic problem of robotics, has long been both a central research topic and a major challenge in the field. In a multi-robot exploration task, the robots must locally sense the structure of the workspace through their onboard sensors, communicate with teammate robots under certain conditions, make decisions cooperatively, and thereby reconstruct an environment model quickly and accurately while in motion. The task has two major difficulties. The first is how to properly allocate task points among the robots, balancing the efficiency of each single robot against that of the whole multi-robot system while avoiding conflicts; the coordination problem between robots is itself NP-hard. The second is that, for the complex and numerous entities in the environment (robot teammates, obstacles, map-boundary information, etc.), it is challenging to model them appropriately and to select a feature extraction framework adapted to them so as to reduce decision complexity.
For task allocation among multiple robots, existing work can be roughly divided, by theoretical basis, into methods based on emotion modeling, methods based on auction mechanisms, and methods based on deep reinforcement learning. Although auction-based methods are widely applied to indoor exploration tasks, they can cause severe path repetition, which degrades the overall working efficiency of the system. Emotion-based methods alleviate the path-repetition problem, but both the emotional states and the feature extraction require manual design, and their generalization in complex environments is poor. Reinforcement-learning-based methods, which have emerged in recent years with the development of artificial intelligence, can handle complex state and action spaces while balancing decision efficiency and optimality.
For the complex and numerous entities in an environment, topological graphs are widely used to describe the spatial and structural relations among them. However, for the complex structural information in such graphs, most existing methods use a permutation-invariant graph neural network for feature extraction. Although such networks can extract features from topological graphs of variable size and structure, they easily ignore the hierarchical information embodied in the graph.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a multi-robot environment exploration method and a multi-robot environment exploration system based on a hierarchical graph neural network.
In order to achieve the above object, one or more embodiments of the present invention provide the following technical solutions:
the invention provides a multi-robot environment exploration method based on a hierarchical graph neural network.
A multi-robot environment exploration method based on a hierarchical graph neural network comprises the following steps:
representing a continuous environment map as a topological graph using a topology-based environment modeling method;
extracting features from the topological graph with a hierarchical graph neural network, aggregating feature information from different hop counts in the graph, and fusing the features of nodes and edges with a multi-head attention mechanism to obtain a final output topological graph;
and taking the node features corresponding to each robot node in the final output topological graph as the state of the corresponding agent in reinforcement learning, and fusing the node features of the multiple robots with a multi-head attention mechanism to obtain the overall state value of the robot system.
The invention provides a multi-robot environment exploration system based on a hierarchical graph neural network.
A multi-robot environment exploration system based on a hierarchical graph neural network comprises:
an environment modeling module configured to: represent a continuous environment map as a topological graph using a topology-based environment modeling method;
a feature extraction module configured to: extract features from the topological graph with a hierarchical graph neural network, aggregate feature information from different hop counts in the graph, and fuse the features of nodes and edges with a multi-head attention mechanism to obtain a final output topological graph;
a reinforcement learning module configured to: take the node features corresponding to each robot node in the final output topological graph as the state of the corresponding agent in reinforcement learning, and fuse the node features of the multiple robots with a multi-head attention mechanism to obtain the overall state value of the robot system.
A third aspect of the present invention provides a computer readable storage medium having stored thereon a program which, when executed by a processor, performs the steps in the hierarchical graph neural network based multi-robot environment exploration method according to the first aspect of the present invention.
A fourth aspect of the present invention provides an electronic device, comprising a memory, a processor and a program stored on the memory and executable on the processor, wherein the processor implements the steps of the hierarchical graph neural network-based multi-robot environment exploration method according to the first aspect of the present invention when executing the program.
The above one or more technical solutions have the following beneficial effects:
(1) The invention adopts a topological graph to model the entities in the environment, extracting the spatial feature information among key entities while reducing computational complexity, which helps the robots make decisions more reasonably and efficiently. On this basis, the invention adopts a hierarchical graph neural network as the policy network to distinguish and process information from different topological levels, improving the robots' perception of spatial features and making their decisions more targeted and interpretable.
(2) The invention adopts a "centralized training, decentralized execution" multi-agent deep reinforcement learning framework to encode the cooperativity of the robot system into each robot's independent policy, reducing conflicts such as repeated coverage and collisions and allowing each robot in a team to efficiently and autonomously complete the exploration of an unknown environment.
(3) The robots decide in a decentralized manner, and in a multi-robot team every robot shares the same policy network; meanwhile, the graph neural network can extract features from topological graphs of different input sizes. The proposed policy network therefore generalizes and scales well: a model trained in a small scene can be extended to a large scene, greatly saving training cost and improving training efficiency.
Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to provide a further understanding of the invention; they illustrate exemplary embodiments of the invention and, together with the description, serve to explain the invention without limiting it.
FIG. 1 is a schematic diagram of a topology establishment process according to a first embodiment of the present invention;
FIG. 2 is a flow chart of a multi-agent reinforcement learning framework employed by the first embodiment of the present invention;
FIG. 3 is a flow chart of a hierarchical neural policy network used in the first embodiment of the present invention.
Fig. 4 is a system configuration diagram of the second embodiment.
Detailed Description
It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the invention.
The embodiments and features of the embodiments of the present invention may be combined with each other without conflict.
The general idea provided by the invention is as follows:
In a first aspect, the invention provides a topology-based environment modeling method and a feature extraction and decision framework based on a hierarchical graph neural network. In the environment modeling, the environment is first discretized at a fixed distance interval, representing the continuous environment map (including the robots) as a discrete grid-point map. Each grid point in the passable area of the map is then used as a node in the topological graph, and edges are added between adjacent nodes, which completes the topological abstraction of the environment and yields a topological map. The entire modeling process is shown in fig. 1.
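The grid-to-graph step above can be sketched as follows. This is an illustrative helper under our own naming (not the patent's code), assuming a 4-connected grid in which free cells become nodes and adjacent free cells are joined by an edge:

```python
# Sketch of the grid-to-topology modeling step (hypothetical helper, not the
# patent's implementation): free cells of the discretized map become nodes,
# and 4-connected neighboring free cells are joined by an edge.
def grid_to_topology(grid):
    """grid: 2D list, 0 = passable, 1 = obstacle. Returns (nodes, edges)."""
    rows, cols = len(grid), len(grid[0])
    nodes = [(r, c) for r in range(rows) for c in range(cols) if grid[r][c] == 0]
    node_set = set(nodes)
    edges = []
    for (r, c) in nodes:
        for (dr, dc) in ((0, 1), (1, 0)):  # right and down: emit each edge once
            nb = (r + dr, c + dc)
            if nb in node_set:
                edges.append(((r, c), nb))
    # robot nodes are handled identically: they occupy passable cells
    return nodes, edges
```

A robot's grid point is simply one of these passable nodes, so no separate node type is needed at this stage.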
Then, a hierarchical graph neural network is proposed to extract information from this topological graph. Through the flow and aggregation of information, the hierarchical graph neural network updates each node's feature into an aggregation of its neighbors' features; after several iterative updates, aggregated information from different hop counts in the graph is obtained. Unlike existing methods, the invention uses a multi-head attention mechanism to weigh the importance of information from different hop counts, helping the robot perceive its surroundings in a more discriminating way.
In a second aspect, the present invention provides a "centralized training, decentralized execution" multi-agent reinforcement learning framework that implicitly integrates inter-agent collaboration into each agent's independent policy. When multiple robots are deployed on an exploration task, cooperation between the robots must be considered comprehensively to ensure efficiency and avoid conflicts such as repeated exploration. As shown in fig. 2, the invention proposes that the robots share a centralized value network and decentralized policy networks during training, where the final output of the centralized value network is a weighted combination of the per-robot value networks and the weights are assigned by an attention mechanism. In this manner, each robot's contribution to the overall task can be apportioned reasonably, and the cooperativity among the robots is implicitly encoded into each robot's independent policy. In addition, since the agents share parameters during training, the trained model shows good extensibility, making it possible to train the model in a small scene (with fewer robots) and extend it to a larger scene (with more robots).
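The attention-weighted value mixing described above can be illustrated with a minimal numeric sketch. `centralized_value` and `softmax` are our own names, and a real implementation would learn the attention scores from the robots' features rather than take them as inputs:

```python
import math

# Illustrative sketch (assumed form, not the patent's code): the centralized
# value is an attention-weighted sum of per-robot value estimates, so each
# robot's contribution to the team value is learned rather than fixed.
def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def centralized_value(robot_values, attention_scores):
    """robot_values: per-robot V_i; attention_scores: unnormalized logits."""
    weights = softmax(attention_scores)
    return sum(w * v for w, v in zip(weights, robot_values))
```

With equal attention scores this reduces to a plain average, so the attention mechanism only changes the mixing when some robots' states are judged more relevant than others.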
Example one
The embodiment discloses a multi-robot environment exploration method based on a hierarchical graph neural network.
As shown in fig. 1-3, the multi-robot environment exploration method based on the hierarchical graph neural network comprises the following steps:
representing a continuous environment map as a topological graph using a topology-based environment modeling method;
extracting features from the topological graph with a hierarchical graph neural network, aggregating feature information from different hop counts in the graph, and fusing the features of nodes and edges with a multi-head attention mechanism to obtain a final output topological graph;
and taking the node features corresponding to each robot node in the final output topological graph as the state of the corresponding agent in reinforcement learning, and fusing the node features of the multiple robots with a multi-head attention mechanism to obtain the overall state value of the robot system.
This embodiment mainly provides a multi-robot unknown-environment exploration method based on a hierarchical graph neural network, implemented in a simulated 2D map. The continuous map is first discretized into a 2D grid map. As shown in FIG. 1, the grid points corresponding to the passable region and to the robots are used as nodes in the topological graph, and edges are added between the nodes of adjacent grid points. Finally, after edges are added between each robot node and its adjacent nodes, the topological representation of the grid map is obtained. Specifically, this embodiment employs a three-dimensional vector $v_i^t$ to represent the feature of node $i$:

$$v_i^t = \left[\,\mathbb{1}(i \in R),\ \mathbb{1}(i \in W_t),\ \mathbb{1}(i \in F_t)\,\right]$$

where $R$ is the set of robots, $W_t$ is the set of passable areas, and $F_t$ is the set of boundary (frontier) points at time $t$; the indicator $\mathbb{1}(\cdot)$ marks whether node $i$ belongs to the corresponding set. The feature of edge $j$ in the topological graph is defined as the distance between the two nodes it connects, i.e. $e_j = d(u_j, w_j)$ for endpoints $u_j$ and $w_j$. Finally, the feature sets of nodes and edges are written $V = \{v_i\}$ and $E = \{e_j\}$ respectively, and the topological graph is represented as $G = \{V, E\}$.
For the topological graph G, the embodiment adopts a hierarchical graph neural network to extract features. The hierarchical graph neural network mainly comprises two modules: an underlying feature aggregation module and a hierarchical feature aggregation module.
Feature extraction of the topological graph based on the hierarchical graph neural network specifically comprises:
inputting the topological graph into the underlying feature aggregation module, updating each node's feature into an aggregation of its neighbors' features and each edge's feature into an aggregation of the features of its two endpoint nodes, and iterating the underlying feature aggregation module K times to obtain feature information of K levels and K+1 output topological graphs;
inputting the K+1 output topological graphs into the hierarchical feature aggregation module, and fusing the features of nodes and edges in the output topological graphs with a multi-head attention mechanism to obtain a hidden feature graph;
and inputting the hidden feature graph into an output layer to extract the topological features of the hidden layer, obtaining the final output topological graph.
The underlying feature aggregation module is responsible for aggregating feature information from different hop counts in the topological graph, and mainly comprises two update functions $\phi^e$, $\phi^v$ and an aggregation function $\rho^{e\to v}$. In each round of underlying feature aggregation, the feature of each node is updated through $\rho^{e\to v}$ and $\phi^v$ into an aggregation of its neighbors' features, while each edge's feature is updated through $\phi^e$ from the features of its two endpoint nodes. Iterating this module K times yields feature information from K levels. In a specific implementation, this embodiment defines an equivalent directed graph $G = \{V, E\}$ in which every undirected edge is replaced by a pair of opposite directed edges. The update functions $\phi^e$, $\phi^v$ and the aggregation function $\rho^{e\to v}$ are defined as follows:

$$e_j^{k+1} = \phi^e\!\left(e_j^{k},\ v_{s_j}^{k},\ v_{r_j}^{k}\right)$$

$$\bar e_i^{\,k+1} = \rho^{e\to v}\!\left(\left\{\,e_j^{k+1} : r_j = i\,\right\}\right)$$

$$v_i^{k+1} = \phi^v\!\left(v_i^{k},\ \bar e_i^{\,k+1}\right)$$

where $s_j$ and $r_j$ denote the sender and receiver nodes of directed edge $j$, and $\{e_j^{k+1}\}$, $\{v_i^{k+1}\}$ are respectively the sets of edge and node features in the output topological graph of round $k+1$. After the K rounds of underlying feature aggregation, K+1 output topological graphs are obtained in total. The hierarchical feature aggregation module takes these K+1 topological graphs as input and fuses the features of their nodes and edges with a multi-head attention mechanism. In a specific implementation, for each node $i$ and each edge $j$ of the $l$-th topological graph, this embodiment introduces learnable weights $W_v^{l,m}$ and $W_e^{l,m}$ to generate the corresponding keys $k_{v,i}^{l,m} = W_v^{l,m} v_i^{l}$ and $k_{e,j}^{l,m} = W_e^{l,m} e_j^{l}$, where $m$ indexes the heads of the current attention mechanism. This embodiment then introduces learnable queries $q_v^{m}$ and $q_e^{m}$; for each node $i$ and edge $j$, the features aggregated by the $m$-th attention head are

$$v_i^{m} = \sum_{l=0}^{K} \operatorname{softmax}_l\!\left(q_v^{m}\cdot k_{v,i}^{l,m}\right) v_i^{l}, \qquad e_j^{m} = \sum_{l=0}^{K} \operatorname{softmax}_l\!\left(q_e^{m}\cdot k_{e,j}^{l,m}\right) e_j^{l}.$$

In this way, information from different levels is fused into one topological graph through the multi-head attention mechanism. For the output graphs from the M different heads, the features of edges and nodes are concatenated and fed into an MLP to obtain the output graph $G_{out} = \{V_{out}, E_{out}\}$:

$$v_i^{out} = \operatorname{MLP}_v\!\left(\left[v_i^{1};\ \dots;\ v_i^{M}\right]\right), \qquad e_j^{out} = \operatorname{MLP}_e\!\left(\left[e_j^{1};\ \dots;\ e_j^{M}\right]\right).$$
In the final output topological graph, the node feature corresponding to robot node i is used as the state of the corresponding agent i in reinforcement learning. The edges connected to robot node i, serving as candidates for the robot's current action, are fed into a softmax layer to obtain an action probability distribution, and the robot finally determines its action at the current moment according to this distribution.
The policy network in this embodiment adopts a multi-agent reinforcement learning framework with an actor-critic structure: a centralized critic and decentralized actors. At each time t, the state of the robot system is the map detected by the robots before time t, i.e. $s_t = G_t$. The observation $o_t^i$ of robot i is the topological information within K hops obtained by agent i at time t. The action space of each agent is A = {up, down, left, right}; if the grid point adjacent to the robot in the chosen direction is impassable, the action is set to "stop".
This embodiment employs fully cooperative multi-agent reinforcement learning, so that the agents share a common reward objective: a robot's reward is the increase in explored area between the previous time step and the current one. Regarding the network structure, since the inputs of the policy function and the value function are the robot's local information, this embodiment shares the parameters of the first layers of the value network and the policy network to improve training stability. After the input graph passes through the hierarchical graph neural network, a hidden feature graph is produced, and the output layer then extracts the topological features of the hidden layer. The value-network output layer fuses the features from the n robots with a multi-head attention mechanism, and the features of the final output nodes are one-dimensional.
The feature of the node corresponding to robot i is taken as the state value of agent i, i.e. $V_i = v_i^{out}$, and finally the sum of the state values of all agents is used as the state value of the whole system. For the policy network, in order to fully utilize the topology, this embodiment feeds the features of the edges adjacent to an agent on the output graph into a softmax layer as its action probability distribution. Letting $E_i$ denote the set of edges adjacent to node i, the action of agent i can be represented as

$$a_i \sim \operatorname{softmax}\!\left(\left\{\,e_j^{out} : j \in E_i\,\right\}\right).$$
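The edge-softmax policy head described above can be sketched as follows. The masking of impassable directions (forcing them to zero probability, effectively the "stop" case) is our reading of the text, and the function name is illustrative:

```python
import math

# Illustrative policy head (our naming): the features of the edges adjacent
# to a robot's node pass through softmax to give a probability over moves.
# An impassable direction is masked to zero probability. Assumes at least
# one passable direction.
def action_distribution(edge_features, passable_mask):
    logits = [f if ok else float("-inf")
              for f, ok in zip(edge_features, passable_mask)]
    m = max(l for l in logits if l != float("-inf"))
    exps = [math.exp(l - m) if l != float("-inf") else 0.0 for l in logits]
    s = sum(exps)
    return [e / s for e in exps]
```

The robot can then sample its next move from this distribution, or take the argmax during evaluation.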
the training algorithm adopts an operator-critic structure, the operator output action enables the obtained accumulated expected reward to be maximum, the critic network scores the operator action, loss is calculated according to the reward obtained by the operator action and the score given by the critic, and the critic network parameters are updated by gradient reduction. After a large amount of training, the rewards acquired by the robots tend to be stable, the algorithm is gradually converged, a plurality of robots can complete exploration on the unknown environment, and the task is completed.
Example two
The embodiment discloses a multi-robot environment exploration system based on a hierarchical graph neural network.
As shown in fig. 4, the multi-robot environment exploration system based on the hierarchical graph neural network includes:
an environment modeling module configured to: represent a continuous environment map as a topological graph using a topology-based environment modeling method;
a feature extraction module configured to: extract features from the topological graph with a hierarchical graph neural network, aggregate feature information from different hop counts in the graph, and fuse the features of nodes and edges with a multi-head attention mechanism to obtain a final output topological graph;
a reinforcement learning module configured to: take the node features corresponding to each robot node in the final output topological graph as the state of the corresponding agent in reinforcement learning, and fuse the node features of the multiple robots with a multi-head attention mechanism to obtain the overall state value of the robot system.
EXAMPLE III
An object of the present embodiment is to provide a computer-readable storage medium.
A computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps in the hierarchical graph neural network-based multi-robot environment exploration method according to embodiment 1 of the present disclosure.
Example four
An object of the present embodiment is to provide an electronic device.
An electronic device comprising a memory, a processor and a program stored on the memory and executable on the processor, wherein the processor implements the steps of the method for multi-robot environment exploration based on hierarchical graph neural network according to embodiment 1 of the present disclosure when executing the program.
The steps involved in the apparatuses of the above second, third and fourth embodiments correspond to the first embodiment of the method, and the detailed description thereof can be found in the relevant description of the first embodiment. The term "computer-readable storage medium" should be taken to include a single medium or multiple media containing one or more sets of instructions; it should also be understood to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by a processor and that cause the processor to perform any of the methods of the present invention.
Those skilled in the art will appreciate that the modules or steps of the present invention described above can be implemented using general purpose computer means, or alternatively, they can be implemented using program code that is executable by computing means, such that they are stored in memory means for execution by the computing means, or they are separately fabricated into individual integrated circuit modules, or multiple modules or steps of them are fabricated into a single integrated circuit module. The present invention is not limited to any specific combination of hardware and software.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, it is not intended to limit the scope of the present invention, and it should be understood by those skilled in the art that various modifications and variations can be made without inventive efforts by those skilled in the art based on the technical solution of the present invention.

Claims (10)

1. A multi-robot environment exploration method based on a hierarchical graph neural network, characterized by comprising the following steps:
representing a continuous environment map as a topological graph using a topology-based environment modeling method;
extracting features from the topological graph with a hierarchical graph neural network, aggregating feature information from different hop counts in the graph, and fusing the features of nodes and edges with a multi-head attention mechanism to obtain a final output topological graph;
and taking the node features corresponding to each robot node in the final output topological graph as the state of the corresponding agent in reinforcement learning, and fusing the node features of the multiple robots with a multi-head attention mechanism to obtain the overall state value of the robot system.
2. The multi-robot environment exploration method based on the hierarchical graph neural network as claimed in claim 1, wherein the topology-based environment modeling method represents a continuous environment map as a topological graph, specifically: discretizing the environment according to a certain distance specification, taking the grid points corresponding to the passable area in the environment and to the robots as nodes in the topological graph, adding edges between adjacent nodes as well as between each robot node and its adjacent nodes, and representing the environment map as the topological graph.
3. The multi-robot environment exploration method based on the hierarchical graph neural network as claimed in claim 1, wherein the feature extraction of the topological graph based on the hierarchical graph neural network specifically comprises:
inputting the topological graph into a bottom-layer feature aggregation module, updating the feature of each node in the topological graph to the aggregation of its neighbors' features, updating the feature of each edge to the aggregation of the features of its two endpoint nodes, and iterating the bottom-layer feature aggregation module K times to obtain K levels of feature information and K+1 output topological graphs;
inputting the K+1 output topological graphs into a hierarchical feature aggregation module, and fusing the features of nodes and edges in the output topological graphs with a multi-head attention mechanism to obtain a hidden feature graph;
and inputting the hidden feature graph into an output layer to extract the hidden-layer topological features, thereby obtaining the final output topological graph.
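One possible reading of the bottom-layer aggregation in claim 3 can be sketched as follows, assuming mean aggregation (the patent does not fix the aggregation function) and hypothetical names throughout:

```python
import numpy as np

def bottom_layer_aggregation(node_feats, edge_index, K):
    """Iterate a neighbour-mean aggregation K times, keeping all K+1
    intermediate graphs so a hierarchical module can later attend over
    every hop level. Each edge feature is the mean of its two endpoints.

    node_feats: (N, D) array; edge_index: list of (u, v) node-id pairs.
    Returns a list of K+1 (node_feats, edge_feats) snapshots.
    """
    N = node_feats.shape[0]
    nbrs = [[] for _ in range(N)]          # build adjacency lists
    for u, v in edge_index:
        nbrs[u].append(v)
        nbrs[v].append(u)

    def edge_feats(x):
        return np.stack([(x[u] + x[v]) / 2 for u, v in edge_index])

    snapshots = [(node_feats, edge_feats(node_feats))]
    x = node_feats
    for _ in range(K):
        # each node's feature becomes the mean of its neighbours' features
        x = np.stack([x[nbrs[i]].mean(axis=0) if nbrs[i] else x[i]
                      for i in range(N)])
        snapshots.append((x, edge_feats(x)))
    return snapshots
```

With K = 2 this produces three snapshots, matching the K+1 output topological graphs of the claim.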
4. The multi-robot environment exploration method based on the hierarchical graph neural network as claimed in claim 1, wherein a multi-head attention mechanism is used to fuse the node features from the multiple robots, and the final output feature is one-dimensional.
5. The multi-robot environment exploration method based on the hierarchical graph neural network as claimed in claim 1, wherein at each time t the state of the robot system is the map detected by the multiple robots up to time t, and the sum of the state values of all the agents is taken as the state value of the whole system.
6. The multi-robot environment exploration method based on the hierarchical graph neural network as claimed in claim 1, further comprising: taking the edges connected to a robot node in the final output topological graph as candidates for that robot's current action, feeding them into a softmax layer to obtain an action probability distribution, and determining the robot's action at the current moment from this distribution.
7. The multi-robot environment exploration method based on the hierarchical graph neural network as claimed in claim 6, wherein the action of robot i is expressed as:

a_i ~ softmax({ h(e_{ij}) | e_{ij} ∈ E_i })

wherein E_i represents the set of edges adjacent to node i, and e_{ij} represents the edge in the final output topological graph connecting node i to node j.
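The edge-softmax action selection of claims 6 and 7 amounts to the following sketch (illustrative only; the claims do not specify whether the action is sampled or taken greedily, so sampling is assumed here, and all names are hypothetical):

```python
import numpy as np

def select_action(edge_scores, adjacent_edge_ids, rng):
    """Scores of the edges adjacent to a robot's node pass through a
    softmax to give an action probability distribution; the robot's
    action (which adjacent edge to traverse) is sampled from it.

    edge_scores: dict mapping edge id -> scalar score from the final
    output topological graph; adjacent_edge_ids: edges at this robot.
    """
    scores = np.array([edge_scores[e] for e in adjacent_edge_ids])
    scores = scores - scores.max()                 # numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()  # softmax over E_i
    idx = rng.choice(len(probs), p=probs)          # sample an action
    return adjacent_edge_ids[idx], probs
```

Only the edges incident to the robot's node enter the softmax, so the action space automatically shrinks and grows with the local topology.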
8. A multi-robot environment exploration system based on a hierarchical graph neural network, characterized by comprising:
an environment modeling module configured to: represent a continuous environment map as a topological graph by the environment modeling method based on the topological graph;
a feature extraction module configured to: extract features of the topological graph based on a hierarchical graph neural network, aggregate feature information across different hop counts in the topological graph, and fuse the features of nodes and edges in the topological graph with a multi-head attention mechanism to obtain a final output topological graph;
and a reinforcement learning module configured to: take the node features corresponding to each individual robot node in the final output topological graph as the state of the corresponding agent in reinforcement learning, and fuse the node features from the multiple robots with a multi-head attention mechanism to obtain the total state value of the robot system.
9. A computer-readable storage medium on which a program is stored, characterized in that the program, when executed by a processor, implements the steps of the multi-robot environment exploration method based on the hierarchical graph neural network according to any one of claims 1 to 7.
10. An electronic device comprising a memory, a processor, and a program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the multi-robot environment exploration method based on the hierarchical graph neural network according to any one of claims 1 to 7.
CN202211454807.3A 2022-11-21 2022-11-21 Multi-robot environment exploration method and system based on hierarchical graph neural network Active CN115759199B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211454807.3A CN115759199B (en) 2022-11-21 2022-11-21 Multi-robot environment exploration method and system based on hierarchical graph neural network

Publications (2)

Publication Number Publication Date
CN115759199A true CN115759199A (en) 2023-03-07
CN115759199B CN115759199B (en) 2023-09-26

Family

ID=85333480

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211454807.3A Active CN115759199B (en) 2022-11-21 2022-11-21 Multi-robot environment exploration method and system based on hierarchical graph neural network

Country Status (1)

Country Link
CN (1) CN115759199B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113128657A (en) * 2021-06-17 2021-07-16 中国科学院自动化研究所 Multi-agent behavior decision method and device, electronic equipment and storage medium
US20220245365A1 (en) * 2020-05-20 2022-08-04 Tencent Technology (Shenzhen) Company Limited Translation method and apparatus based on multimodal machine learning, device, and storage medium
CN114973125A (en) * 2022-05-12 2022-08-30 武汉大学 Method and system for assisting navigation in intelligent navigation scene by using knowledge graph
CN115130663A (en) * 2022-08-30 2022-09-30 中国海洋大学 Heterogeneous network attribute completion method based on graph neural network and attention mechanism

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
STEVE P ET AL: "Learning Scalable Policies over Graphs for Multi-Robot Task Allocation Using Capsule Attention Networks", IEEE, pages 1-9 *
ZHANG H ET AL: "H2GNN: Hierarchical-Hops Graph Neural Networks for Multi-Robot Exploration in Unknown Environments", IEEE, pages 1-3 *
LUO AO: "Research on Scene Understanding Algorithms Based on Graph Neural Networks", China Doctoral Dissertations Full-text Database, Information Science and Technology, no. 2021, pages 138-40 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116755397A (en) * 2023-05-26 2023-09-15 北京航空航天大学 Multi-machine collaborative task scheduling method based on graph convolution strategy gradient
CN116755397B (en) * 2023-05-26 2024-01-23 北京航空航天大学 Multi-machine collaborative task scheduling method based on graph convolution strategy gradient

Also Published As

Publication number Publication date
CN115759199B (en) 2023-09-26

Similar Documents

Publication Publication Date Title
Mossalam et al. Multi-objective deep reinforcement learning
CN113919485A (en) Multi-agent reinforcement learning method and system based on dynamic hierarchical communication network
Luo et al. Multi-agent collaborative exploration through graph-based deep reinforcement learning
Tan et al. Multi-type task allocation for multiple heterogeneous unmanned surface vehicles (USVs) based on the self-organizing map
Taghizadeh et al. A novel graphical approach to automatic abstraction in reinforcement learning
CN115759199B (en) Multi-robot environment exploration method and system based on hierarchical graph neural network
Sadhu et al. Aerial-DeepSearch: Distributed multi-agent deep reinforcement learning for search missions
CN114815801A (en) Adaptive environment path planning method based on strategy-value network and MCTS
Barrett Making friends on the fly: advances in ad hoc teamwork
Li et al. A graph-based reinforcement learning method with converged state exploration and exploitation
Ghazanfari et al. Extracting bottlenecks for reinforcement learning agent by holonic concept clustering and attentional functions
CN113408949B (en) Robot time sequence task planning method and device and electronic equipment
Byeon Advances in Value-based, Policy-based, and Deep Learning-based Reinforcement Learning
Hong et al. Deterministic policy gradient based formation control for multi-agent systems
Lauri et al. Robust multi-agent patrolling strategies using reinforcement learning
Manoury et al. Chime: An adaptive hierarchical representation for continuous intrinsically motivated exploration
Marzi et al. Feudal graph reinforcement learning
Guisi et al. Reinforcement learning with multiple shared rewards
Rai et al. Membrane computing based scalable distributed learning and collaborative decision making for cyber physical systems
Gao et al. A Survey of Markov Model in Reinforcement Learning
Ji et al. Research on Path Planning of Mobile Robot Based on Reinforcement Learning
US20230289563A1 (en) Multi-node neural network constructed from pre-trained small networks
CN114723005B (en) Multi-layer network collapse strategy deducing method based on depth map representation learning
Zhang Ant Colony Algorithm for Distributed Constrained Optimization
Mohammed Modified ant colony optimization for solving traveling salesman problem

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant