CN111738372A - Distributed multi-agent space-time feature extraction method and behavior decision method - Google Patents


Info

Publication number
CN111738372A
Authority
CN
China
Prior art keywords
agent
space
vector
time
feature
Prior art date
Legal status
Granted
Application number
CN202010873794.8A
Other languages
Chinese (zh)
Other versions
CN111738372B (en)
Inventor
蒲志强
刘振
王彗木
丘腾海
易建强
Current Assignee
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CN202010873794.8A
Publication of CN111738372A
Application granted
Publication of CN111738372B
Status: Active

Classifications

    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods (G06F Electric digital data processing; G06F18/00 Pattern recognition)
    • G06N3/045 Combinations of networks (G06N Computing arrangements based on specific computational models; G06N3/04 Architecture, e.g. interconnection topology)
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs (G06N Computing arrangements based on specific computational models; G06N3/04 Architecture, e.g. interconnection topology)


Abstract

The invention provides a distributed multi-agent space-time feature extraction method and a behavior decision method. The behavior decision method comprises the following steps: acquire the state information that agent $i$ can sense at the current moment and several preceding moments, and construct a spatiotemporal state vector; input the spatiotemporal state vector into a graph network generation layer and output the original feature vector of agent $i$; input the original feature vector into a spatial feature extraction layer and output a spatial relationship feature vector; input the spatial relationship feature vector into a spatiotemporal feature extraction layer and output a spatiotemporal relationship feature vector; compute the behavior decision of agent $i$ at the current moment based on the obtained spatiotemporal relationship feature vector; update the time step and compute the spatiotemporal features and behavior decision of agent $i$ at the next moment. The invention extracts the distributed spatiotemporal feature relationships of a multi-agent system under constraints such as complex environments, time-varying topology, and limited resources, and improves the adaptive capability and performance robustness of multi-agent systems in large-scale complex tasks.

Description

Distributed multi-agent space-time feature extraction method and behavior decision method
Technical Field
The invention belongs to the field of multi-agent systems and swarm intelligence, and particularly relates to a distributed multi-agent space-time feature extraction method and a behavior decision method.
Background
Multi-agent systems offer distribution, simplicity, flexibility, robustness, and related advantages, providing a brand-new solution to many highly challenging complex problems; swarm intelligence, of which multi-agent systems are representative, is also one of the five intelligence forms prioritized for development in China's New Generation Artificial Intelligence Development Plan. With the rapid development of emerging technologies such as micro-nano electronics, computing platforms, and autonomous control, multi-agent systems composed of unmanned autonomous platforms such as unmanned aerial vehicles and unmanned ground vehicles are increasingly applied in important scenes concerning the national economy, people's livelihood, and national defense security. An unmanned autonomous multi-agent system can quickly form area coverage in a networked, distributed, collaborative mode, achieve optimized scheduling of cluster resources, and improve task completion rate and response speed. On the one hand, it can serve, as a normally deployed system, fields such as mountain patrol, disaster early warning, environment monitoring, and regional logistics; on the other hand, it can serve as a rapid-response system for emergencies, providing capabilities such as rapid material scheduling, disaster monitoring and assessment, and communication support in scenes such as epidemic prevention and control, sudden disasters, and security for large-scale events.
Multi-agent behavior decision-making comes mainly in centralized and distributed modes. Centralized decision-making has a central decision point: all information is gathered at the central node through a communication network, the behavior decision instructions of all agents are computed by a centralized planning and decision algorithm, and the instructions are then issued to each agent through the communication network for execution. The centralized mode places high demands on the reliability of the communication network and the central node and suffers larger decision latency; facing real application scenes, agents find it hard to make adaptive, autonomous behavior decisions as tasks and environments change, which greatly limits the intelligent synergy of a multi-agent system. In real scenes, a multi-agent system often covers a large area where a centralized network is hard to form, a single agent often has only limited perception, communication, and action capabilities, and the communication topology among agents changes constantly in dynamic tasks; distributed decision-making therefore brings the multi-agent system better adaptability and task performance in complex environments and tasks.
When an agent makes a decision, its basis is the state information of the current task and environment. Facing large-scale clusters and complex tasks and environments, abstracting and reducing this state information by effective means, and further extracting the spatiotemporal feature relationships among agents and between agents and task-environment elements, is the key to enabling a multi-agent system to understand tasks and environments abstractly and, in turn, to achieve autonomous decision-making and intelligent control.
The graph attention network is a machine learning method that has emerged in recent years: many real-world problems are abstracted into graph structures, a graph neural network is used for feature extraction, and an attention mechanism is further used to fuse different feature representation spaces; the related techniques have gradually been validated and applied in scenes such as social networks and traffic forecasting. The long short-term memory (LSTM) network, for its part, is an important recurrent neural network widely applied to sequential problems, notably in speech recognition and semantic analysis. In particular, the peephole LSTM repeatedly refines the historical state and therefore performs better on complex, time-varying systems. The inherent spatial topology of a multi-agent system and the temporal dependency of its tasks give the graph attention network and the peephole LSTM natural application advantages for multi-agent problems. However, because these techniques have only recently emerged, their application to unmanned autonomous multi-agent systems composed of unmanned aerial vehicles, unmanned ground vehicles, and the like has rarely been reported, especially the combination of the two networks for extracting the spatiotemporal relation features of a multi-agent system, which has important forward-looking and innovative significance in both the machine learning field and the multi-agent systems field.
Disclosure of Invention
In order to solve the above problems in the prior art, that is, to solve the problem of efficient spatiotemporal feature relation extraction for multi-agent systems in dynamic and complex tasks, a first aspect of the present invention provides a distributed multi-agent spatiotemporal feature extraction method comprising the following steps:

Step S100: at time $t$, splice the spatial state vectors $s_i^\tau$ observable by agent $i$ at each moment $\tau$ from $t-T_1$ to $t$ to obtain the spatiotemporal state vector $S_i^t = [s_i^{t-T_1}, \dots, s_i^t]$ of agent $i$ at the current moment $t$; where $T_1$ is a preset number of historical states.

Step S200: based on $S_i^t$, obtain the original feature vector ${}^0h_i^t$ of agent $i$ through a graph network generation layer; where ${}^0h_i^t \in \mathbb{R}^{D_h}$, with $D_h$ the selected feature space dimension.

Step S300: based on ${}^0h_i^t$, obtain the spatial relationship feature vector $f_i^t$ of agent $i$ at the current moment $t$ through a spatial feature relation extraction layer; the spatial feature relation extraction layer is constructed by alternately stacking graph attention network modules and fully connected network modules.

Step S400: acquire the spatial relationship feature vectors $f_i^{t-T_2}, \dots, f_i^{t-1}$ of agent $i$ at the $T_2$ moments preceding time $t$; input $f_i^t, f_i^{t-1}, \dots, f_i^{t-T_2}$ into a spatiotemporal feature extraction layer, and output the spatiotemporal relationship feature vector $g_i^t$ of agent $i$ at the current moment $t$ using a peephole long short-term memory network based on graph convolution operations.
In some preferred embodiments, the spatial state vector observable by agent $i$ at each moment in step S100 includes the agent's own state, the task goal state, observable states of other agents, and observable environmental element states;

the agent's own state includes the agent's own position, velocity, and acceleration;

the task goal state includes the goal's position and velocity;

the observable states of other agents include the observable positions and velocities of other agents;

the observable environmental element states include the observable positions and velocities of obstacles in the environment, and the positions of no-pass zones in the environment.
In some preferred embodiments, the graph network generation layer in step S200 is composed of a plurality of layers of fully-connected neural networks.
In some preferred embodiments, the method for extracting spatial feature relationships through the spatial feature relation extraction layer in step S300 comprises:

Step S310: taking the original feature vector ${}^0h_i^t$ of agent $i$ and the original feature vectors ${}^0h_j^t$ of all neighboring agents as input, obtain the spatial relationship feature vector ${}^1h_i^t$ through the first graph attention network module; where $j \in \mathcal{N}_i$, $\mathcal{N}_i$ is the set of neighbor agents with which agent $i$ can communicate directly, and ${}^0h_i^t, {}^0h_j^t \in \mathbb{R}^{D_h}$.

Step S320: taking ${}^1h_i^t$ as input, obtain the spatial relationship feature vector ${}^2h_i^t$ through the first fully connected network module.

Step S330: based on the spatial relationship feature vector obtained in step S320, iteratively compute the $f$-th order spatial relationship feature vectors ${}^{2f-1}h_i^t$ and ${}^{2f}h_i^t$ through the stacked graph attention network modules and fully connected network modules by the methods of steps S310 and S320; where $f \in \{2, 3, \dots, k_2-1\}$, and $k_2$ is the number of stacked pairs of graph attention network modules and fully connected network modules.

Step S340: in the $k_2$-th iterative computation, based on ${}^{2(k_2-1)}h_i^t$, obtain the spatial relationship feature vector ${}^{2k_2-1}h_i^t$ through the $k_2$-th graph attention network module by the method of step S310; splice the vectors $[{}^0h_i^t, {}^2h_i^t, {}^4h_i^t, \dots, {}^{2(k_2-1)}h_i^t, {}^{2k_2-1}h_i^t]$ and input them into the $k_2$-th fully connected network module to obtain the $2k_2$-th spatial relationship feature vector ${}^{2k_2}h_i^t$, the final output of the spatial feature relation extraction layer for agent $i$ at time $t$, denoted $f_i^t = {}^{2k_2}h_i^t$.
In some preferred embodiments, the spatial relationship feature vector ${}^1h_i^t$ is obtained as follows:

Step S311: use a learnable matrix $W$ to linearly transform the feature vectors ${}^0h_i^t$ and ${}^0h_j^t$ and splice them into a new relation feature vector ${}^0e_{ij}^t = [W\,{}^0h_i^t, W\,{}^0h_j^t]$; input ${}^0e_{ij}^t$ into a fully connected neural network and output the attention coefficient $c_{ij}$ of agent $i$ with respect to agent $j$; obtain the normalized attention coefficient $\alpha_{ij}$ of agent $i$ with respect to agent $j$:

$$\alpha_{ij} = \frac{\exp(c_{ij})}{\sum_{k \in \mathcal{N}_i} \exp(c_{ik})}$$

Step S312: adopting a multi-head attention mechanism, compute the normalized attention coefficient $\alpha_{ij}^m$ of the $m$-th head by the method of step S311; compute the spatial relationship feature vector ${}^1h_i^t$ of agent $i$ under the fusion of the multi-head attention mechanism:

$${}^1h_i^t = \big\Vert_{m=1}^{M}\, \sigma\Big(\sum_{j \in \mathcal{N}_i} \alpha_{ij}^m\, W^m\, {}^0h_j^t\Big)$$

where $M$ is the number of attention heads, $\sigma$ is the sigmoid activation function, $W^m$ is the linear transformation matrix selected for the $m$-th attention head, and $\Vert$ denotes vector concatenation.
In some preferred embodiments, the peephole long short-term memory network based on graph convolution operations comprises $T_2+1$ peephole long short-term memory network units connected in series; the input gate, forget gate, and output gate of each long short-term memory network unit are constructed from graph convolutional neural networks.

In some preferred embodiments, the spatiotemporal relationship feature vector $g_i^t$ is obtained as follows:

denote the long short-term memory network unit nearest the output as unit 1, and increase the serial numbers in the reverse direction;

the $p$-th long short-term memory network unit $U_p$ has cell state $C_p$ and outputs the spatiotemporal relationship feature vector $g_p$; its inputs are the spatial relationship feature vector $f_i^{t-p+1}$ at time $t-p+1$, and the spatiotemporal relationship feature vector $g_{p+1}$ and cell state $C_{p+1}$ output by the $(p+1)$-th unit $U_{p+1}$; where $p \in \{1, 2, \dots, T_2+1\}$;

based on the cell states $C_p$ and spatiotemporal relationship feature vectors $g_p$, feeding the spatial relationship feature vectors $f_i^t, f_i^{t-1}, \dots, f_i^{t-T_2}$ through the long short-term memory network yields the spatiotemporal relationship feature vector $g_i^t = g_1$.
In some preferred embodiments, the spatiotemporal relationship feature vector $g_p$ is computed as follows:

Step 401: input the spatial relationship feature vector $f_i^{t-p+1}$ at time $t-p+1$, the cell state $C_{p+1}$ of the $(p+1)$-th unit $U_{p+1}$, and the spatiotemporal relationship feature vector $g_{p+1}$ output by the $(p+1)$-th unit $U_{p+1}$ into the forget gate built on a graph convolutional neural network, and compute the forget gate output variable $F_p$:

$$F_p = \sigma\big(W_F *_g [\, f_i^{t-p+1},\, g_{p+1},\, C_{p+1}\,] + b_F\big)$$

where $*_g$ denotes the graph convolution operation, $W_F$ and $b_F$ are respectively the weight coefficient matrix and bias of the forget-gate graph convolutional neural network, and $\sigma$ is the sigmoid activation function;

Step 402: input the spatial relationship feature vector $f_i^{t-p+1}$ at time $t-p+1$, the spatiotemporal relationship feature vector $g_{p+1}$ output by the $(p+1)$-th unit $U_{p+1}$, and its cell state $C_{p+1}$ into the input gate built on a graph convolutional neural network, and update the cell state $C_p$ by the formulas:

$$I_p = \sigma\big(W_I *_g [\, f_i^{t-p+1},\, g_{p+1},\, C_{p+1}\,] + b_I\big)$$
$$\tilde{C}_p = \tanh\big(W_C *_g [\, f_i^{t-p+1},\, g_{p+1}\,] + b_C\big)$$
$$C_p = F_p \odot C_{p+1} + I_p \odot \tilde{C}_p$$

where $I_p$ is the output of the input gate, $\tilde{C}_p$ is the transitional cell state, $W_I$ and $W_C$ are the weight coefficient matrices of the input-gate graph convolutional neural networks, $b_I$ and $b_C$ are the corresponding biases, $\tanh$ is the tanh activation function, and $\odot$ is the Hadamard product;

Step 403: input the spatial relationship feature vector $f_i^{t-p+1}$ at time $t-p+1$, the spatiotemporal relationship feature vector $g_{p+1}$ output by the $(p+1)$-th unit $U_{p+1}$, and the cell state $C_p$ of the $p$-th unit $U_p$ updated in step 402 into the output gate built on a graph convolutional neural network, to obtain the spatiotemporal relationship feature vector $g_p$ output by the $p$-th unit $U_p$ by the formulas:

$$O_p = \sigma\big(W_O *_g [\, f_i^{t-p+1},\, g_{p+1},\, C_p\,] + b_O\big)$$
$$g_p = O_p \odot \tanh(C_p)$$

where $O_p$ is the output variable of the output gate, and $W_O$ and $b_O$ are the weight coefficient matrix and bias of the output-gate graph convolutional neural network.
In some preferred embodiments, all agents in the distributed multi-agent system share learnable parameters.
A second aspect of the invention provides a distributed multi-agent behavior decision method. Based on the above distributed multi-agent spatiotemporal feature extraction method, the method obtains the spatiotemporal relationship feature vector $g_i^t$ of agent $i$ at the current moment $t$, and computes the behavior decision set $a_i^t$ of agent $i$ at the current moment $t$ using a model/knowledge-driven method or a reinforcement-learning data-driven method; where $a_i^t \in \mathbb{R}^{D_a}$, with $D_a$ the selected decision space dimension.
In some preferred embodiments of the distributed multi-agent behavior decision method, all agents in the distributed multi-agent system share learnable parameters.
A third aspect of the invention provides a distributed multi-agent spatiotemporal feature extraction system, which comprises a state vector acquisition module, an original feature generation module, a spatial relation calculation module, and a spatiotemporal relation calculation module;

the state vector acquisition module is configured to, at time $t$, splice the spatial state vectors $s_i^\tau$ observable by agent $i$ at each moment $\tau$ from $t-T_1$ to $t$ to obtain the spatiotemporal state vector $S_i^t$ of agent $i$ at the current moment $t$; where $T_1$ is a preset number of historical states;

the original feature generation module is configured to obtain, based on $S_i^t$, the original feature vector ${}^0h_i^t$ of agent $i$ through a graph network generation layer; where ${}^0h_i^t \in \mathbb{R}^{D_h}$, with $D_h$ the selected feature space dimension;

the spatial relation calculation module is configured to obtain, based on ${}^0h_i^t$, the spatial relationship feature vector $f_i^t$ of agent $i$ at the current moment $t$ through the spatial feature relation extraction layer; the spatial feature relation extraction layer is constructed by alternately stacking graph attention network modules and fully connected network modules;

the spatiotemporal relation calculation module is configured to acquire the spatial relationship feature vectors $f_i^{t-T_2}, \dots, f_i^{t-1}$ of agent $i$ at the $T_2$ moments preceding time $t$, input $f_i^t, f_i^{t-1}, \dots, f_i^{t-T_2}$ into the spatiotemporal feature extraction layer, and output the spatiotemporal relationship feature vector $g_i^t$ of agent $i$ at the current moment $t$ using the peephole long short-term memory network based on graph convolution operations.
A fourth aspect of the invention provides a distributed multi-agent behavior decision system which, based on the above distributed multi-agent spatiotemporal feature extraction system, further comprises a behavior decision calculation module;

the behavior decision calculation module is configured to compute, based on the spatiotemporal relationship feature vector $g_i^t$ of agent $i$ at the current moment $t$, the behavior decision set $a_i^t$ of agent $i$ at the current moment $t$ using a model/knowledge-driven method or a reinforcement-learning data-driven method; where $a_i^t \in \mathbb{R}^{D_a}$, with $D_a$ the selected decision space dimension.
In a fifth aspect of the present invention, a storage device is proposed, in which a plurality of programs are stored, said programs being adapted to be loaded and executed by a processor to implement the above-mentioned distributed multi-agent spatiotemporal feature extraction method, and/or the above-mentioned distributed multi-agent behaviour decision method.
In a sixth aspect of the present invention, a processing apparatus is provided, which includes a processor, a storage device; a processor adapted to execute various programs; a storage device adapted to store a plurality of programs; the program is suitable to be loaded and executed by a processor to realize the above-mentioned distributed multi-agent space-time feature extraction method and/or the above-mentioned distributed multi-agent behavior decision method.
The invention has the beneficial effects that:
By adopting a distributed feature extraction and behavior decision mode, the method is closer than a centralized mode to the actual application scenes of large-scale multi-agent systems and can fully exploit the distributed, networked, collaborative advantages of such systems. Extracting the spatiotemporal feature relationships contained in a multi-agent system through the graph attention mechanism and the long short-term memory network provides an important basis for the system's subsequent intelligent behavior decisions, enabling agents to make autonomous behavior decisions in dynamic, complex tasks. Moreover, building the feature extraction layers from models with learnable parameters, such as the graph neural network and the peephole long short-term memory network, allows hidden and changing features among agents to be extracted, improving the agents' adaptability to tasks and environments.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is a flow chart of a distributed multi-agent behavior decision method according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of the graph network generation layer for a single agent $i$ in the distributed multi-agent spatiotemporal feature extraction method in an embodiment of the invention;

FIG. 3 is a schematic diagram of the spatial feature extraction layer for a single agent $i$ in the distributed multi-agent spatiotemporal feature extraction method in an embodiment of the invention;

FIG. 4 is a schematic diagram of the spatiotemporal feature extraction layer for a single agent $i$ in the distributed multi-agent spatiotemporal feature extraction method in an embodiment of the invention;

FIG. 5 is a structural diagram of the 1st long short-term memory network unit in the spatiotemporal feature extraction layer for a single agent $i$ in an embodiment of the invention;

FIG. 6 is a schematic diagram of agent collaboration behavior in an embodiment of the invention;

FIG. 7 is a block diagram of a distributed multi-agent behavior decision system according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
Aiming at the problems of efficient spatiotemporal feature relation extraction and intelligent decision-making for multi-agent systems in dynamic, complex tasks, the invention provides a novel multi-agent spatiotemporal feature extraction method: a graph structure expresses the spatiotemporal relationships between agent individuals and the environment; a graph neural network with an attention mechanism extracts the spatial feature relationships; a peephole long short-term memory network extracts the temporal feature relationships; and the agents thereby achieve autonomous spatiotemporal feature relation extraction and intelligent behavior decision-making in dynamic, complex tasks.
The invention discloses a distributed multi-agent spatiotemporal feature extraction method, which comprises the following steps:

Step S100: at time $t$, splice the spatial state vectors $s_i^\tau$ observable by agent $i$ at each moment $\tau$ from $t-T_1$ to $t$ to obtain the spatiotemporal state vector $S_i^t$ of agent $i$ at the current moment $t$; where $T_1$ is a preset number of historical states.

Step S200: based on $S_i^t$, obtain the original feature vector ${}^0h_i^t$ of agent $i$ through a graph network generation layer; where ${}^0h_i^t \in \mathbb{R}^{D_h}$, with $D_h$ the selected feature space dimension.

Step S300: based on ${}^0h_i^t$, obtain the spatial relationship feature vector $f_i^t$ of agent $i$ at the current moment $t$ through a spatial feature relation extraction layer; the spatial feature relation extraction layer is constructed by alternately stacking graph attention network modules and fully connected network modules.

Step S400: acquire the spatial relationship feature vectors $f_i^{t-T_2}, \dots, f_i^{t-1}$ of agent $i$ at the $T_2$ moments preceding time $t$; input $f_i^t, f_i^{t-1}, \dots, f_i^{t-T_2}$ into a spatiotemporal feature extraction layer, and output the spatiotemporal relationship feature vector $g_i^t$ of agent $i$ at the current moment $t$ using a peephole long short-term memory network based on graph convolution operations.
In order to explain the present invention more clearly, the steps of an embodiment of the distributed multi-agent spatiotemporal feature extraction method according to the present invention are described in detail below with reference to the drawings.
In one embodiment of the present invention, a multi-agent system comprising $N$ agents uses the method; the spatiotemporal feature extraction method for the $i$-th agent ($i \in \{1, 2, \dots, N\}$) comprises steps S100 to S400, as shown in FIG. 1. (FIG. 1, the flow chart of the distributed multi-agent behavior decision method, contains the content of the distributed multi-agent spatiotemporal feature extraction method; this embodiment therefore describes only the relevant content. Other parts, such as distributed shared parameters, agent behavior decisions based on the spatiotemporal relationship feature vectors, and updating the time step to return to step S100 for the next moment's calculation, are described in the corresponding embodiments below.)
Step S100: at the current time $t$, splice the spatial state vectors $s_i^\tau$ observable by agent $i$ at each moment $\tau$ from $t-T_1$ to $t$ to obtain the spatiotemporal state vector $S_i^t$ of agent $i$ at the current moment $t$; where $T_1$, the preset number of historical states, is an adjustable non-negative integer.

In this step, the spatial state observable by agent $i$ at each discrete moment is obtained, including the agent's own state, the task goal state, observable states of other agents, and observable environmental element states. The agent's own state includes but is not limited to its own position, velocity, and acceleration; the task goal state includes but is not limited to the goal's position and velocity; the observable states of other agents include but are not limited to their observable positions and velocities; and the observable environmental element states include but are not limited to the observable positions and velocities of obstacles in the environment, the positions of no-pass zones in the environment, and other environmental state information affecting the multi-agent system's task.
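For concreteness, the splicing in step S100 can be sketched as follows (a minimal Python/NumPy sketch; the flat fixed-length per-moment encoding, the padding convention, and all names are illustrative assumptions, not prescribed by the patent):

```python
import numpy as np

def build_spacetime_state(obs_history, T1):
    """Splice the T1+1 most recent observable state vectors
    s_i^{t-T1}, ..., s_i^t into the space-time state vector S_i^t.

    obs_history: list of 1-D arrays, oldest first; each entry is one
    moment's observable state (own state, goal state, other agents'
    states, environment states) flattened to a fixed length."""
    window = obs_history[-(T1 + 1):]
    # Before T1 histories exist, pad with the oldest available state.
    while len(window) < T1 + 1:
        window.insert(0, window[0])
    return np.concatenate(window)

# Example: T1 = 3 historical states plus the current moment, 12-dim states.
history = [np.random.rand(12) for _ in range(6)]
S_it = build_spacetime_state(history, T1=3)  # shape (48,)
```

Padding with the oldest available state is one simple convention for the start-up moments before $T_1$ histories exist; the patent does not prescribe a particular one.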
Step S200: based on $S_i^t$, obtain the original feature vector ${}^0h_i^t$ of agent $i$ through the graph network generation layer; where ${}^0h_i^t \in \mathbb{R}^{D_h}$, with $D_h$ the selected feature space dimension.

In this step, the graph network generation layer is realized with a multilayer fully connected neural network.

FIG. 2 shows the graph network generation layer for a single agent $i$ in the distributed multi-agent spatiotemporal feature extraction method. After the spatiotemporal state vector $S_i^t$ is input into the graph network generation layer, it passes through several fully connected neural network layers that extract the original features. Because the network input is already digitized state information, no convolutional network or the like is needed for feature extraction; the multilayer fully connected network strengthens the original feature vector's capacity to fit complex inter-agent relationships.
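As an illustration of such a generation layer, a minimal PyTorch sketch follows (the two hidden layers, widths, and ReLU activations are illustrative assumptions; the patent specifies only a multilayer fully connected network):

```python
import torch
import torch.nn as nn

class GraphGenerationLayer(nn.Module):
    """Multilayer fully connected network mapping the space-time state
    vector S_i^t to the original feature vector h0_i^t in R^{D_h}."""
    def __init__(self, state_dim, D_h, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, D_h),
        )

    def forward(self, S_it):
        return self.net(S_it)

gen = GraphGenerationLayer(state_dim=48, D_h=32)
h0_it = gen(torch.rand(48))  # original feature vector, shape (32,)
```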
Step S300: based on ${}^0h_i^t$, obtain the spatial relationship feature vector $f_i^t$ of agent $i$ at the current moment $t$ through the spatial feature relation extraction layer; the spatial feature relation extraction layer is constructed by alternately stacking graph attention network modules and fully connected network modules.

In this embodiment, the graph attention network module uses a multi-head attention mechanism to aggregate, under different attentions, the features of any agent with respect to surrounding agents and environmental information.

FIG. 3 shows the spatial feature extraction layer for a single agent $i$ in the distributed multi-agent spatiotemporal feature extraction method of the invention. As shown in the figure, the method for extracting spatial feature relationships through the spatial feature relation extraction layer in this embodiment comprises:
step S310, using the agent
Figure 196625DEST_PATH_IMAGE003
Of the original feature vector0
Figure 319302DEST_PATH_IMAGE008
And all neighbors intelligenceOriginal feature vector of a volume0
Figure 810326DEST_PATH_IMAGE092
Obtaining, as input, a spatial relationship feature vector by a first graph attention network module1
Figure 993045DEST_PATH_IMAGE008
(ii) a Wherein,
Figure 303941DEST_PATH_IMAGE017
Figure 230309DEST_PATH_IMAGE018
as an agent
Figure 844348DEST_PATH_IMAGE003
A set of neighbor agents capable of direct communication;
Figure 197969DEST_PATH_IMAGE019
0
Figure 996161DEST_PATH_IMAGE008
Figure 726220DEST_PATH_IMAGE093
0
Figure 926257DEST_PATH_IMAGE092
0
Figure 450779DEST_PATH_IMAGE092
transmission to agents via a communication network
Figure 1846DEST_PATH_IMAGE003
Intelligent agent
Figure 270016DEST_PATH_IMAGE003
Set of directly communicable neighbor agents
Figure 590139DEST_PATH_IMAGE018
The obtaining method may be: setting a determination parameter of observable distance
Figure 285563DEST_PATH_IMAGE094
When another agent is present
Figure 58347DEST_PATH_IMAGE030
And an agent
Figure 130208DEST_PATH_IMAGE003
Is less than
Figure 39258DEST_PATH_IMAGE094
Then, it is determined
Figure 171162DEST_PATH_IMAGE017
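The distance-threshold rule above can be sketched in a few lines (illustrative only; the 2-D positions and the strict inequality follow the description):

```python
import numpy as np

def neighbor_set(positions, i, d_c):
    """Return N_i: indices of agents whose distance to agent i is
    less than the observable-distance parameter d_c (excluding i)."""
    dists = np.linalg.norm(positions - positions[i], axis=1)
    return [j for j in range(len(positions)) if j != i and dists[j] < d_c]

pos = np.array([[0.0, 0.0], [1.0, 0.5], [5.0, 5.0]])
print(neighbor_set(pos, i=0, d_c=2.0))  # -> [1]
```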
In this embodiment, the spatial relationship feature vector ${}^1h_i^t$ is obtained as follows:

Step S311: use a learnable matrix $W$ to linearly transform the feature vectors ${}^0h_i^t$ and ${}^0h_j^t$ and splice them into a new relation feature vector ${}^0e_{ij}^t = [W\,{}^0h_i^t, W\,{}^0h_j^t]$; input ${}^0e_{ij}^t$ into a fully connected neural network and output the attention coefficient $c_{ij}$ of agent $i$ with respect to agent $j$; obtain the normalized attention coefficient $\alpha_{ij}$ of agent $i$ with respect to agent $j$ by the formula

$$\alpha_{ij} = \frac{\exp(c_{ij})}{\sum_{k \in \mathcal{N}_i} \exp(c_{ik})}$$

Step S312: adopting a multi-head attention mechanism, compute the normalized attention coefficient $\alpha_{ij}^m$ of the $m$-th head by the method of step S311; compute the spatial relationship feature vector ${}^1h_i^t$ of agent $i$ under the fusion of the multi-head attention mechanism by the formula

$${}^1h_i^t = \big\Vert_{m=1}^{M}\, \sigma\Big(\sum_{j \in \mathcal{N}_i} \alpha_{ij}^m\, W^m\, {}^0h_j^t\Big)$$

where $M$ is the number of attention heads, $\sigma$ is the sigmoid activation function, $W^m$ is the linear transformation matrix selected for the $m$-th attention head, and $\Vert$ denotes vector concatenation.

Higher-order spatial relationship feature vectors ${}^y h_i^t$ ($y$ odd, $3 \le y \le 2k_2-1$) can likewise be computed by the methods of steps S311 and S312, with ${}^1h_i^t$ replaced by ${}^y h_i^t$, and ${}^0h_i^t$, ${}^0h_j^t$, ${}^0e_{ij}^t$ replaced by ${}^{y-1}h_i^t$, ${}^{y-1}h_j^t$, ${}^{y-1}e_{ij}^t$.
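A minimal PyTorch sketch of one graph attention network module following the formulas of steps S311 and S312 (the single-layer attention network, the head count, and all sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn

class GraphAttentionModule(nn.Module):
    """One graph attention module (steps S311-S312): per head m, a
    learnable map W^m, an attention network producing c_ij, softmax
    normalization to alpha_ij over N_i, sigmoid aggregation, and
    concatenation of the M head outputs."""
    def __init__(self, D_h, heads=4):
        super().__init__()
        self.W = nn.ModuleList(nn.Linear(D_h, D_h, bias=False)
                               for _ in range(heads))
        self.attn = nn.ModuleList(nn.Linear(2 * D_h, 1)
                                  for _ in range(heads))

    def forward(self, h_i, h_nb):
        # h_i: (D_h,) feature of agent i; h_nb: (|N_i|, D_h) neighbors
        outs = []
        for W, attn in zip(self.W, self.attn):
            wi = W(h_i).expand(h_nb.size(0), -1)     # (|N_i|, D_h)
            wj = W(h_nb)                             # (|N_i|, D_h)
            e_ij = torch.cat([wi, wj], dim=-1)       # spliced e_ij
            c_ij = attn(e_ij).squeeze(-1)            # (|N_i|,)
            alpha = torch.softmax(c_ij, dim=0)       # normalized coeffs
            outs.append(torch.sigmoid((alpha.unsqueeze(-1) * wj).sum(0)))
        return torch.cat(outs, dim=-1)               # (heads * D_h,)

gat = GraphAttentionModule(D_h=32, heads=4)
h1_it = gat(torch.rand(32), torch.rand(5, 32))  # shape (128,)
```

Concatenating the $M$ head outputs matches the $\Vert$ fusion in step S312; averaging the heads instead would be the other common choice.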
Step S320: taking ${}^1h_i^t$ as input, obtain the spatial relationship feature vector ${}^2h_i^t$ through the first fully connected network module, ${}^2h_i^t \in \mathbb{R}^{D_h}$.

In this step, the fully connected network module is composed of several fully connected neural network layers, which strengthen the representation of the features and compress their dimension.
Step S330: based on the spatial relationship feature vector obtained in step S320, iteratively compute the $f$-th order spatial relationship feature vectors ${}^{2f-1}h_i^t$ and ${}^{2f}h_i^t$ through the stacked graph attention network modules and fully connected network modules by the methods of steps S310 and S320; where $f \in \{2, 3, \dots, k_2-1\}$, and $k_2$ is the number of stacked pairs of graph attention network modules and fully connected network modules.

Step S340: in the $k_2$-th iterative computation, based on ${}^{2(k_2-1)}h_i^t$, obtain the spatial relationship feature vector ${}^{2k_2-1}h_i^t$ through the $k_2$-th graph attention network module by the method of step S310; splice the vectors $[{}^0h_i^t, {}^2h_i^t, {}^4h_i^t, \dots, {}^{2(k_2-1)}h_i^t, {}^{2k_2-1}h_i^t]$ and input them into the $k_2$-th fully connected network module to obtain the $2k_2$-th spatial relationship feature vector ${}^{2k_2}h_i^t$, the final output of the spatial feature relation extraction layer for agent $i$ at time $t$, denoted $f_i^t$.

In this embodiment, steps S330 and S340 perform iterative computations with the methods of steps S310 and S320 over the alternately stacked graph attention network modules and fully connected network modules. Taking each stacked pair of one graph attention network module and one fully connected network module as a unit, this embodiment contains $k_2$ such units and therefore performs $k_2$ iterative computations:

The first iteration is steps S310 and S320, which yield ${}^1h_i^t$ and ${}^2h_i^t$. The second iteration uses the graph attention network module and the fully connected network module of the second unit: taking as input the feature vector ${}^2h_i^t$ and the spatial relationship feature vectors ${}^2h_j^t$ of all neighbor agents ($j \in \mathcal{N}_i$; the ${}^2h_j^t$ are transmitted to agent $i$ via the communication network), the methods of steps S310 and S320 yield the spatial relationship feature vectors ${}^3h_i^t$ and ${}^4h_i^t$ respectively. Repeating in the same way, the method operations of steps S310 and S320 are performed $k_2-1$ times in total (including the first and second iterations), where the $f$-th spatial relationship feature vectors are ${}^{2f-1}h_i^t$ and ${}^{2f}h_i^t$, $f \in \{2, \dots, k_2-1\}$. In the $k_2$-th iteration, based on ${}^{2(k_2-1)}h_i^t$, the $k_2$-th graph attention network module yields the spatial relationship feature vector ${}^{2k_2-1}h_i^t$ by the method of step S310; the original feature vector ${}^0h_i^t$, the spatial relationship feature vectors ${}^2h_i^t, {}^4h_i^t, \dots, {}^{2(k_2-1)}h_i^t$ obtained by the method of step S320, and the feature vector ${}^{2k_2-1}h_i^t$ obtained from the last step-S310 operation are then spliced; the spliced feature vector $[{}^0h_i^t, {}^2h_i^t, {}^4h_i^t, \dots, {}^{2(k_2-1)}h_i^t, {}^{2k_2-1}h_i^t]$ is input into the last fully connected network module, which outputs the spatial relationship feature vector ${}^{2k_2}h_i^t$, i.e. the final output of the spatial feature relation extraction layer for agent $i$ at the current moment $t$, denoted $f_i^t = {}^{2k_2}h_i^t$.
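Putting steps S310 to S340 together, the alternating stack can be sketched as follows (reusing the GraphAttentionModule sketched above; in a real distributed run the neighbors' intermediate vectors ${}^{2f}h_j^t$ arrive over the communication network, which this single-process sketch stubs out with placeholder copies):

```python
import torch
import torch.nn as nn

class SpatialFeatureLayer(nn.Module):
    """Alternating stack of k2 (graph attention, fully connected) pairs.
    The last FC module consumes the splice [h0, h2, ..., h_{2(k2-1)},
    h_{2k2-1}] and outputs the final spatial relation vector f_i^t."""
    def __init__(self, D_h, k2, heads=4):
        super().__init__()
        self.gats = nn.ModuleList(GraphAttentionModule(D_h, heads)
                                  for _ in range(k2))
        # the first k2-1 FC modules compress heads*D_h back to D_h
        self.fcs = nn.ModuleList(nn.Linear(heads * D_h, D_h)
                                 for _ in range(k2 - 1))
        # the k2-th FC module consumes k2 even-order vectors plus h_{2k2-1}
        self.fc_last = nn.Linear(k2 * D_h + heads * D_h, D_h)

    def forward(self, h0_i, h0_nb):
        evens = [h0_i]                    # h0, h2, h4, ...
        h_i, h_nb = h0_i, h0_nb
        last_odd = None
        for step, gat in enumerate(self.gats):
            odd = gat(h_i, h_nb)          # h_{2f-1} via step S310
            if step < len(self.fcs):
                h_i = torch.relu(self.fcs[step](odd))  # h_{2f}, step S320
                evens.append(h_i)
                # placeholder for neighbors' h_{2f} (networked in reality)
                h_nb = torch.stack([h_i] * h_nb.size(0))
            else:
                last_odd = odd            # h_{2k2-1}, the k2-th attention
        return self.fc_last(torch.cat(evens + [last_odd], dim=-1))

layer = SpatialFeatureLayer(D_h=32, k2=3)
f_it = layer(torch.rand(32), torch.rand(5, 32))  # f_i^t, shape (32,)
```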
Step S400: acquire the spatial relationship feature vectors $f_i^{t-T_2}, \dots, f_i^{t-1}$ of agent $i$ at the $T_2$ moments preceding time $t$ ($T_2$ is an adjustable positive integer); input $f_i^t, f_i^{t-1}, \dots, f_i^{t-T_2}$ into the spatiotemporal feature extraction layer, and output the spatiotemporal relationship feature vector $g_i^t$ of agent $i$ at the current moment $t$ using the peephole long short-term memory network based on graph convolution operations.

FIG. 4 shows the spatiotemporal feature extraction layer for a single agent $i$ in the distributed multi-agent spatiotemporal feature extraction method of the invention. In this embodiment, the peephole long short-term memory network based on graph convolution operations comprises $T_2+1$ peephole long short-term memory network units connected in series; the input gate, forget gate, and output gate of each long short-term memory network unit are constructed from graph convolutional neural networks.

The spatiotemporal relationship feature vector $g_i^t$ is obtained as follows: denote the long short-term memory network unit nearest the output as unit 1, and increase the serial numbers in the reverse direction; the $p$-th long short-term memory network unit $U_p$ has cell state $C_p$ and outputs the spatiotemporal relationship feature vector $g_p$; its inputs are the spatial relationship feature vector $f_i^{t-p+1}$ at time $t-p+1$, and the spatiotemporal relationship feature vector $g_{p+1}$ and cell state $C_{p+1}$ output by the $(p+1)$-th unit $U_{p+1}$; where $p \in \{1, 2, \dots, T_2+1\}$. Feeding the spatial relationship feature vectors $f_i^t, f_i^{t-1}, \dots, f_i^{t-T_2}$ through the long short-term memory network yields the spatiotemporal relationship feature vector $g_i^t = g_1$.
In this embodiment, the structure of the 1st long short-term memory network unit is shown in FIG. 5, where the dotted lines represent the states of the previous long short-term memory network unit serving as signal sources. The spatiotemporal relationship feature vector $g_p$ output by the $p$-th long short-term memory network unit is computed as follows:

Step 401: input the spatial relationship feature vector $f_i^{t-p+1}$ at time $t-p+1$, the cell state $C_{p+1}$ of the $(p+1)$-th unit $U_{p+1}$, and the spatiotemporal relationship feature vector $g_{p+1}$ output by the $(p+1)$-th unit $U_{p+1}$ into the forget gate built on a graph convolutional neural network, and compute the forget gate output variable $F_p$:

$$F_p = \sigma\big(W_F *_g [\, f_i^{t-p+1},\, g_{p+1},\, C_{p+1}\,] + b_F\big)$$

where $*_g$ denotes the graph convolution operation, $W_F$ and $b_F$ are respectively the weight coefficient matrix and bias of the forget-gate graph convolutional neural network, and $\sigma$ is the sigmoid activation function;

Step 402: input the spatial relationship feature vector $f_i^{t-p+1}$ at time $t-p+1$, the spatiotemporal relationship feature vector $g_{p+1}$ output by the $(p+1)$-th unit $U_{p+1}$, and its cell state $C_{p+1}$ into the input gate built on a graph convolutional neural network, and update the cell state $C_p$ by the formulas:

$$I_p = \sigma\big(W_I *_g [\, f_i^{t-p+1},\, g_{p+1},\, C_{p+1}\,] + b_I\big)$$
$$\tilde{C}_p = \tanh\big(W_C *_g [\, f_i^{t-p+1},\, g_{p+1}\,] + b_C\big)$$
$$C_p = F_p \odot C_{p+1} + I_p \odot \tilde{C}_p$$

where $I_p$ is the output of the input gate, $\tilde{C}_p$ is the transitional cell state, $W_I$ and $W_C$ are the weight coefficient matrices of the input-gate graph convolutional neural networks, $b_I$ and $b_C$ are the corresponding biases, $\tanh$ is the tanh activation function, and $\odot$ is the Hadamard product;

Step 403: input the spatial relationship feature vector $f_i^{t-p+1}$ at time $t-p+1$, the spatiotemporal relationship feature vector $g_{p+1}$ output by the $(p+1)$-th unit $U_{p+1}$, and the cell state $C_p$ of the $p$-th unit $U_p$ updated in step 402 into the output gate built on a graph convolutional neural network, to obtain the spatiotemporal relationship feature vector $g_p$ output by the $p$-th unit $U_p$ by the formulas:

$$O_p = \sigma\big(W_O *_g [\, f_i^{t-p+1},\, g_{p+1},\, C_p\,] + b_O\big)$$
$$g_p = O_p \odot \tanh(C_p)$$

where $O_p$ is the output variable of the output gate, and $W_O$ and $b_O$ are the weight coefficient matrix and bias of the output-gate graph convolutional neural network.
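The three gate computations of steps 401 to 403 can be collected into one recurrent unit. In this minimal PyTorch sketch the graph convolution $*_g$ is approximated by plain linear maps over the spliced inputs (a single-node simplification; a faithful implementation would also aggregate over the agent's neighborhood, as the patent's graph convolutional gates do), and sharing parameters across the $T_2+1$ units is an assumption:

```python
import torch
import torch.nn as nn

class PeepholeGraphLSTMUnit(nn.Module):
    """One unit U_p of the peephole LSTM chain (steps 401-403); the
    graph convolution *_g is stubbed by linear maps on spliced inputs."""
    def __init__(self, D):
        super().__init__()
        self.forget = nn.Linear(3 * D, D)  # sees [f, g_{p+1}, C_{p+1}]
        self.input_ = nn.Linear(3 * D, D)  # sees [f, g_{p+1}, C_{p+1}]
        self.cand = nn.Linear(2 * D, D)    # sees [f, g_{p+1}]
        self.output = nn.Linear(3 * D, D)  # peephole on the updated C_p

    def forward(self, f_t, g_next, C_next):
        z = torch.cat([f_t, g_next, C_next], dim=-1)
        F_p = torch.sigmoid(self.forget(z))                    # step 401
        I_p = torch.sigmoid(self.input_(z))                    # step 402
        C_tilde = torch.tanh(self.cand(torch.cat([f_t, g_next], dim=-1)))
        C_p = F_p * C_next + I_p * C_tilde                     # cell update
        O_p = torch.sigmoid(self.output(
            torch.cat([f_t, g_next, C_p], dim=-1)))            # step 403
        return O_p * torch.tanh(C_p), C_p                      # g_p, C_p

# Unroll over f_i^{t-T2}, ..., f_i^t, oldest first (unit T2+1 down to 1):
D, T2 = 32, 3
unit = PeepholeGraphLSTMUnit(D)
g, C = torch.zeros(D), torch.zeros(D)  # assumed zero initial state
for f in [torch.rand(D) for _ in range(T2 + 1)]:  # f_i^{t-T2} ... f_i^t
    g, C = unit(f, g, C)
g_it = g  # spatiotemporal relation feature vector g_i^t
```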
A distributed multi-agent behavior decision method according to a second embodiment of the present invention, as shown in FIG. 1, based on the above distributed multi-agent spatiotemporal feature extraction method, further comprises step S500: obtain the spatiotemporal relationship feature vector $g_i^t$ of agent $i$ at the current moment $t$, and compute the behavior decision set $a_i^t$ of agent $i$ at the current moment $t$ using a model/knowledge-driven method or a reinforcement-learning data-driven method (the latter preferably training and learning the agents' behaviors with an Actor-Critic architecture); where $a_i^t \in \mathbb{R}^{D_a}$, with $D_a$ the selected decision space dimension.
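For the data-driven variant, a minimal sketch of an Actor-Critic style actor head mapping $g_i^t$ to a $D_a$-dimensional continuous action (the network shape, the bounded tanh output, and all names are illustrative assumptions; the patent leaves the decision network unspecified beyond the Actor-Critic preference):

```python
import torch
import torch.nn as nn

class ActorHead(nn.Module):
    """Maps the spatiotemporal relation feature vector g_i^t to the
    behavior decision a_i^t in R^{D_a} (bounded continuous action)."""
    def __init__(self, D_g, D_a, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(D_g, hidden), nn.ReLU(),
            nn.Linear(hidden, D_a), nn.Tanh(),
        )

    def forward(self, g_it):
        return self.net(g_it)

actor = ActorHead(D_g=32, D_a=2)
a_it = actor(torch.rand(32))  # behavior decision a_i^t, shape (2,)
```

A critic head of similar shape that outputs a scalar value estimate would complete the Actor-Critic pair used during training.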
In the embodiments of the above distributed multi-agent spatiotemporal feature extraction method and distributed multi-agent behavior decision method, the parameters shared across all agents in the distributed multi-agent system may include the learnable parameters of the relevant neural networks in the graph network generation layer of step S200, the spatial feature extraction layer of step S300, and the spatiotemporal feature extraction layer of step S400, as well as the control and decision parameters used to make an agent's behavior decision based on the extracted spatiotemporal features.
FIG. 6 is a diagram illustrating the cooperative behavior of the agents in an embodiment of the invention. The embodiment includes eight pursuing agents (small black circles), one target agent (small triangle), and two obstacle agents (pentagons); for clarity, the pursuing agents are connected by lines in the figure. The goal of the pursuing agents is to encircle the target agent while avoiding collisions with the obstacles and with one another; the target agent, in turn, tries to escape the encirclement. The embodiment spans four phases. Throughout, the pursuing agents extract space-time features with the method provided by the invention and, on that basis, make behavior decisions with an Actor-Critic architecture; the target agent makes behavior decisions with a traditional artificial potential field method; and the obstacle agents are set as static obstacles. Within a certain time the pursuing agents learn cooperative behaviors and finally achieve encirclement of the target agent, demonstrating the adaptive, distributed cooperative advantages of the proposed method in complex and dynamic multi-agent behavior decision.
The above embodiment describes the data processing at the current time $t$. At the next time $t+1$, space-time features are extracted and behavior decisions generated by the same method as at time $t$; that is, the time step is updated, the procedure returns to step S100, and the computation for the next time is carried out.
The above embodiment describes the space-time feature extraction and behavior decision generation of a single agent; every agent performs this computation simultaneously by the same method. Each agent may compute independently, or the computation for all agents may be performed on a cloud platform and the results then distributed to each agent.
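The rolling computation can be sketched as a per-agent loop; the callables below are hypothetical stand-ins, and in the cloud variant the same loop body would run on the platform for all agents at once before the results are distributed.

```python
import numpy as np

def run_agent(agent_id, horizon, get_observation, extract, decide):
    """At each time t, redo steps S100-S400 and the decision step,
    then advance the time step."""
    trace = []
    for t in range(horizon):
        obs = get_observation(agent_id, t)   # S100: observable states up to time t
        H_it = extract(obs)                  # S200-S400: spatio-temporal feature
        a_it = decide(H_it)                  # S500: behavior decision
        trace.append((t, a_it))
    return trace

rng = np.random.default_rng(3)
trace = run_agent(
    agent_id=0, horizon=5,
    get_observation=lambda i, t: rng.normal(size=12),
    extract=lambda obs: np.tanh(obs[:8]),
    decide=lambda H: np.tanh(H[:2]),
)
```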
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process and related description of the above-described distributed multi-agent behavior decision method may refer to the corresponding process in the foregoing embodiment of the distributed multi-agent spatiotemporal feature extraction method, and details are not repeated herein.
A distributed multi-agent space-time feature extraction system according to a third embodiment of the present invention comprises a state vector acquisition module, an original feature generation module, a spatial relation calculation module, and a spatio-temporal relation calculation module;

the state vector acquisition module is configured to, at time $t$, splice the observable spatial state vectors $s_i^\tau$ of agent $i$ at each moment $\tau$ from time $t-K$ to time $t$ to obtain agent $i$'s space-time state vector $S_i^t$ at the current time $t$; wherein $K$ is a preset historical state number;

the original feature generation module is configured to obtain, based on $S_i^t$, agent $i$'s original feature vector $f_i^t$ through the graph network generation layer; wherein $f_i^t \in \mathbb{R}^D$, and $D$ is the selected feature space dimension;

the spatial relation calculation module is configured to obtain, based on $f_i^t$, agent $i$'s spatial relation feature vector $F_i^t$ at the current time $t$ through the spatial feature relation extraction layer; wherein the spatial feature relation extraction layer is constructed by alternately stacking graph attention network modules and fully connected network modules;

the spatio-temporal relation calculation module is configured to acquire the spatial relation feature vectors $F_i^{t-1}, \dots, F_i^{t-m}$ of agent $i$ at the $m$ moments before time $t$, input $F_i^t$ and $F_i^{t-1}, \dots, F_i^{t-m}$ into the spatio-temporal feature extraction layer, and output, by means of a long short-term memory network with peepholes based on graph convolution operations, agent $i$'s spatio-temporal relation feature vector $H_i^t$ at the current time $t$.
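Assuming simple stand-ins for each layer (a real system would substitute the graph attention stacks and the graph-convolution LSTM described above), the data flow through the four modules can be sketched as follows; all sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
K, obs_dim, D = 4, 6, 16   # K history steps, D feature dimensions

def acquire_state(s_history):
    # State vector acquisition: splice the K+1 most recent observable
    # spatial state vectors into one space-time state vector S_i^t.
    return np.concatenate(s_history[-(K + 1):])

def graph_net_layer(S_it, W):
    # Original feature generation: fully connected map into R^D.
    return np.tanh(S_it @ W)

def spatial_layer(f_it, f_neighbors, W):
    # Spatial relation stand-in: pool neighbor features with one's own,
    # then a linear map (the real layer alternates graph attention and FC).
    return np.tanh(np.mean(np.vstack([f_it] + f_neighbors), axis=0) @ W)

def temporal_layer(F_seq, W):
    # Spatio-temporal stand-in for the graph-conv peephole LSTM:
    # a linear read-out over the stacked sequence.
    return np.tanh(np.concatenate(F_seq) @ W)

W1 = 0.1 * rng.normal(size=((K + 1) * obs_dim, D))
W2 = 0.1 * rng.normal(size=(D, D))
W3 = 0.1 * rng.normal(size=(3 * D, D))   # here m = 2 past steps plus the current one

s_hist = [rng.normal(size=obs_dim) for _ in range(K + 1)]
f_it = graph_net_layer(acquire_state(s_hist), W1)
F_it = spatial_layer(f_it, [rng.normal(size=D) for _ in range(3)], W2)
H_it = temporal_layer([F_it, rng.normal(size=D), rng.normal(size=D)], W3)
```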
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process and related description of the above-described distributed multi-agent spatiotemporal feature extraction system may refer to the corresponding process in the foregoing embodiment of the distributed multi-agent spatiotemporal feature extraction method, and will not be described herein again.
A distributed multi-agent behavior decision system according to a fourth embodiment of the present invention, based on the above distributed multi-agent space-time feature extraction system, further includes a behavior decision calculation module;

the behavior decision calculation module is configured to compute, based on agent $i$'s spatio-temporal relation feature vector $H_i^t$ at the current time $t$, agent $i$'s behavior decision set $a_i^t$ at the current time $t$ using a model-knowledge-driven method or a reinforcement-learning data-driven method; wherein $a_i^t \in \mathbb{R}^{D_a}$, and $D_a$ is the chosen decision space dimension.
The flow chart of this embodiment is shown in FIG. 7.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process and related description of the above-described distributed multi-agent behavior decision system may refer to the corresponding process in the foregoing embodiment of the distributed multi-agent behavior decision method, and details are not repeated herein.
It should be noted that, the distributed multi-agent spatiotemporal feature extraction system and the distributed multi-agent behavior decision system provided in the foregoing embodiments are only exemplified by the division of the above functional modules, and in practical applications, the above functions may be allocated by different functional modules according to needs, that is, the modules or steps in the embodiments of the present invention are further decomposed or combined, for example, the modules in the foregoing embodiments may be combined into one module, or may be further split into multiple sub-modules, so as to complete all or part of the above described functions. The names of the modules and steps involved in the embodiments of the present invention are only for distinguishing the modules or steps, and are not to be construed as unduly limiting the present invention.
A storage device according to a fifth embodiment of the present invention stores a plurality of programs, the programs being suitable to be loaded and executed by a processor to implement the above-mentioned distributed multi-agent space-time feature extraction method and/or the above-mentioned distributed multi-agent behavior decision method.
A processing apparatus according to a sixth embodiment of the present invention includes a processor and a storage device; the processor is adapted to execute various programs; the storage device is adapted to store a plurality of programs; the programs are suitable to be loaded and executed by the processor to implement the above-mentioned distributed multi-agent space-time feature extraction method and/or the above-mentioned distributed multi-agent behavior decision method.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes and related descriptions of the storage device and the processing device described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section, and/or installed from a removable medium. The computer program, when executed by a Central Processing Unit (CPU), performs the above-described functions defined in the method of the present application. It should be noted that the computer readable medium mentioned above in the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The terms "first," "second," and the like are used for distinguishing between similar elements and not necessarily for describing or implying a particular order or sequence.
The terms "comprises," "comprising," or any other similar term are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.

Claims (13)

1. A distributed multi-agent space-time feature extraction method, characterized by comprising the following steps:

step S100, at time $t$, splicing the observable spatial state vectors $s_i^\tau$ of agent $i$ at each moment $\tau$ from time $t-K$ to time $t$ to obtain agent $i$'s space-time state vector $S_i^t$ at the current time $t$; wherein $K$ is a preset historical state number;

step S200, based on $S_i^t$, obtaining agent $i$'s original feature vector $f_i^t$ through a graph network generation layer; wherein $f_i^t \in \mathbb{R}^D$, and $D$ is the selected feature space dimension;

step S300, based on $f_i^t$, obtaining agent $i$'s spatial relation feature vector $F_i^t$ at the current time $t$ through a spatial feature relation extraction layer; wherein the spatial feature relation extraction layer is constructed by alternately stacking graph attention network modules and fully connected network modules;

step S400, acquiring the spatial relation feature vectors $F_i^{t-1}, \dots, F_i^{t-m}$ of agent $i$ at the $m$ moments before time $t$, inputting $F_i^t$ and $F_i^{t-1}, \dots, F_i^{t-m}$ into a spatio-temporal feature extraction layer, and outputting, by means of a long short-term memory network with peepholes based on graph convolution operations, agent $i$'s spatio-temporal relation feature vector $H_i^t$ at the current time $t$.
2. The distributed multi-agent space-time feature extraction method according to claim 1, wherein the observable spatial state vector of agent $i$ at each moment in step S100 includes the agent's own state, the task target state, observable states of other agents, and observable environmental element states;

the agent's own state includes the agent's own position, velocity, and acceleration states;

the task target state includes the target position and velocity states; the observable states of other agents include the observable position and velocity states of other agents;

the observable environmental element states include the observable position and velocity states of obstacles in the environment and the position states of no-traffic zones in the environment.
3. The distributed multi-agent space-time feature extraction method according to claim 1, wherein the graph network generation layer in step S200 is composed of a multi-layer fully connected neural network.
4. The distributed multi-agent space-time feature extraction method according to claim 1, wherein the method for extracting spatial relation features through the spatial feature relation extraction layer in step S300 comprises:

step S310, taking agent $i$'s original feature vector ${}^{0}F_i^t$ and the original feature vectors ${}^{0}F_j^t$ of all neighboring agents as input, obtaining the spatial relation feature vector ${}^{1}F_i^t$ through the first graph attention network module; wherein $j \in \mathcal{N}_i$, $\mathcal{N}_i$ is the set of neighbor agents with which agent $i$ can directly communicate, ${}^{0}F_i^t = f_i^t$, and ${}^{0}F_j^t = f_j^t$;

step S320, taking ${}^{1}F_i^t$ as input, obtaining the spatial relation feature vector ${}^{2}F_i^t$ through the first fully connected network module;

step S330, based on the spatial relation feature vector obtained in step S320, iteratively computing, through the stacked graph attention network modules and fully connected network modules by the methods of steps S310 and S320, the spatial relation feature vectors ${}^{2l-1}F_i^t$ and ${}^{2l}F_i^t$ of orders $2l-1$ and $2l$; wherein $l \in \{2, \dots, k-1\}$, and $k$ is the number of stacked layers of graph attention network modules and fully connected network modules;

step S340, after iterating to the $k$-th layer, based on ${}^{2(k-1)}F_i^t$, obtaining the spatial relation feature vector ${}^{2k-1}F_i^t$ through the $k$-th graph attention network module by the method of step S310; splicing the vectors $\left[{}^{0}F_i^t, {}^{2}F_i^t, {}^{4}F_i^t, \dots, {}^{2(k-1)}F_i^t, {}^{2k-1}F_i^t\right]$ and inputting them into the $k$-th fully connected network module to obtain the $2k$-th order spatial relation feature vector ${}^{2k}F_i^t$, taken as agent $i$'s final output $F_i^t$ of the spatial feature relation extraction layer at time $t$.
5. The distributed multi-agent space-time feature extraction method according to claim 4, wherein the method for acquiring the spatial relation feature vector ${}^{1}F_i^t$ comprises:

step S311, using a learnable matrix $W$ to linearly transform the relation feature vectors ${}^{0}F_i^t$ and ${}^{0}F_j^t$ and splicing them into a new relation feature vector ${}^{0}\bar{F}_{ij}^t = \left[W\,{}^{0}F_i^t, W\,{}^{0}F_j^t\right]$; inputting ${}^{0}\bar{F}_{ij}^t$ into a fully connected neural network and outputting agent $i$'s attention coefficient $e_{ij}$ for agent $j$; obtaining agent $i$'s normalized attention coefficient $\alpha_{ij}$ for agent $j$:

$$\alpha_{ij} = \frac{\exp\left(e_{ij}\right)}{\sum_{j' \in \mathcal{N}_i} \exp\left(e_{ij'}\right)}$$

step S312, adopting a multi-head attention mechanism: for the $h$-th head, computing the normalized attention coefficient $\alpha_{ij}^{h}$ by the method of step S311, and computing agent $i$'s spatial relation feature vector ${}^{1}F_i^t$ under the fusion of the multi-head attention mechanism:

$${}^{1}F_i^t = \Big\Vert_{h=1}^{H}\, \sigma\Big(\sum_{j \in \mathcal{N}_i} \alpha_{ij}^{h}\, W^{h}\, {}^{0}F_j^t\Big)$$

wherein $H$ is the number of attention heads, $\sigma$ is the sigmoid activation function, $W^{h}$ is the linear transformation matrix chosen for the $h$-th head, and $\Vert$ denotes the concatenation operation of vectors.
6. The distributed multi-agent space-time feature extraction method according to claim 1, wherein the long short-term memory network with peepholes based on graph convolution operations comprises $m$ serially connected long short-term memory network units provided with peepholes; and the input gate, forget gate, and output gate of each long short-term memory network unit are constructed based on a graph convolutional neural network.
7. The distributed multi-agent space-time feature extraction method according to claim 6, wherein the method for acquiring the spatio-temporal relation feature vector $H_i^t$ comprises:

the long short-term memory network unit nearest the output end is recorded as $L_1$, with serial numbers increasing in the reverse direction;

the cell state of the $p$-th long short-term memory network unit $L_p$ is recorded as $C_p$; its output is the spatio-temporal relation feature vector $H_p$; and its inputs are the spatial relation feature vector $F_i^{t-p}$ at time $t-p$, the output spatio-temporal relation feature vector $H_{p+1}$ of the $(p+1)$-th unit $L_{p+1}$, and its cell state $C_{p+1}$; wherein $p \in \{1, 2, \dots, m\}$;

with the cell states $C_p$ and spatio-temporal relation feature vectors $H_p$ so denoted, the spatio-temporal relation feature vector $H_i^t$ is obtained by passing $F_i^t$ and $F_i^{t-1}, \dots, F_i^{t-m}$ through the long short-term memory network.
8. The distributed multi-agent space-time feature extraction method according to claim 7, wherein the method for calculating the spatio-temporal relation feature vector $H_p$ comprises:

step 401, inputting the spatial relation feature vector $F_i^{t-p}$ at time $t-p$, the network element state $C_{p+1}$ of the $(p+1)$-th unit $L_{p+1}$, and the output spatio-temporal relation feature vector $H_{p+1}$ of the $(p+1)$-th unit $L_{p+1}$ into the forget gate based on a graph convolutional neural network, and computing the forget gate output variable $f_p$:

$$f_p = \sigma\left(W_f *_{\mathcal{G}} \left[H_{p+1}, F_i^{t-p}, C_{p+1}\right] + b_f\right)$$

wherein $*_{\mathcal{G}}$ denotes the graph convolution operation, $W_f$ and $b_f$ are respectively the weight coefficient matrix and bias of the forget-gate graph convolutional neural network, and $\sigma$ is the sigmoid activation function;

step 402, inputting the spatial relation feature vector $F_i^{t-p}$ at time $t-p$, the output spatio-temporal relation feature vector $H_{p+1}$ of the $(p+1)$-th unit $L_{p+1}$, and its cell state $C_{p+1}$ into the input gate based on a graph convolutional neural network, and updating the cell state $C_p$ according to:

$$i_p = \sigma\left(W_i *_{\mathcal{G}} \left[H_{p+1}, F_i^{t-p}, C_{p+1}\right] + b_i\right)$$

$$\tilde{C}_p = \tanh\left(W_c *_{\mathcal{G}} \left[H_{p+1}, F_i^{t-p}\right] + b_c\right)$$

$$C_p = f_p \odot C_{p+1} + i_p \odot \tilde{C}_p$$

wherein $i_p$ is the output of the input gate, $\tilde{C}_p$ is the transitional cell state, $W_i$ and $W_c$ are the weight coefficient matrices of the input-gate graph convolutional neural networks, $b_i$ and $b_c$ are the corresponding biases, $\tanh$ is the tanh activation function, and $\odot$ is the Hadamard product;

step 403, inputting the spatial relation feature vector $F_i^{t-p}$ at time $t-p$, the output spatio-temporal relation feature vector $H_{p+1}$ of the $(p+1)$-th unit $L_{p+1}$, and the network element state $C_p$ of the $p$-th unit $L_p$ updated in step 402 into the output gate based on a graph convolutional neural network, to obtain the output spatio-temporal relation feature vector $H_p$ of the $p$-th unit $L_p$:

$$o_p = \sigma\left(W_o *_{\mathcal{G}} \left[H_{p+1}, F_i^{t-p}, C_p\right] + b_o\right)$$

$$H_p = o_p \odot \tanh\left(C_p\right)$$

wherein $o_p$ is the output variable of the output gate, and $W_o$ and $b_o$ are respectively the weight coefficient matrix and bias of the output-gate graph convolutional neural network.
9. The distributed multi-agent space-time feature extraction method according to any one of claims 1 to 8, characterized in that all agents in the distributed multi-agent system share learnable parameters.
10. A distributed multi-agent behavior decision method, characterized in that agent $i$'s spatio-temporal relation feature vector $H_i^t$ at the current time $t$ is obtained based on the distributed multi-agent space-time feature extraction method of any one of claims 1 to 9, and agent $i$'s behavior decision set $a_i^t$ at the current time $t$ is computed using a model-knowledge-driven method or a reinforcement-learning data-driven method; wherein $a_i^t \in \mathbb{R}^{D_a}$, and $D_a$ is the chosen decision space dimension.
11. The distributed multi-agent behavior decision method according to claim 10, characterized in that all agents in the distributed multi-agent system share learnable parameters.
12. A distributed multi-agent space-time feature extraction system, characterized by comprising a state vector acquisition module, an original feature generation module, a spatial relation calculation module, and a spatio-temporal relation calculation module;

the state vector acquisition module is configured to, at time $t$, splice the observable spatial state vectors $s_i^\tau$ of agent $i$ at each moment $\tau$ from time $t-K$ to time $t$ to obtain agent $i$'s space-time state vector $S_i^t$ at the current time $t$; wherein $K$ is a preset historical state number;

the original feature generation module is configured to obtain, based on $S_i^t$, agent $i$'s original feature vector $f_i^t$ through a graph network generation layer; wherein $f_i^t \in \mathbb{R}^D$, and $D$ is the selected feature space dimension;

the spatial relation calculation module is configured to obtain, based on $f_i^t$, agent $i$'s spatial relation feature vector $F_i^t$ at the current time $t$ through a spatial feature relation extraction layer; wherein the spatial feature relation extraction layer is constructed by alternately stacking graph attention network modules and fully connected network modules;

the spatio-temporal relation calculation module is configured to acquire the spatial relation feature vectors $F_i^{t-1}, \dots, F_i^{t-m}$ of agent $i$ at the $m$ moments before time $t$, input $F_i^t$ and $F_i^{t-1}, \dots, F_i^{t-m}$ into a spatio-temporal feature extraction layer, and output, by means of a long short-term memory network with peepholes based on graph convolution operations, agent $i$'s spatio-temporal relation feature vector $H_i^t$ at the current time $t$.
13. A distributed multi-agent behavior decision making system, based on the distributed multi-agent spatiotemporal feature extraction system of claim 12, further comprising a behavior decision calculation module; the behavioral decision calculation module is configured to be based on an agent
Figure 858354DEST_PATH_IMAGE003
At the current moment
Figure 73434DEST_PATH_IMAGE001
Space-time relation characteristic vector of
Figure 100296DEST_PATH_IMAGE015
Computing agents using model-knowledge-driven methods or reinforcement learning data-driven methods
Figure 160656DEST_PATH_IMAGE003
At the current moment
Figure 589363DEST_PATH_IMAGE001
Behavioral decision set of
Figure 709766DEST_PATH_IMAGE075
(ii) a Wherein,
Figure 958345DEST_PATH_IMAGE076
Figure 822396DEST_PATH_IMAGE077
is the chosen decision space dimension.
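By way of illustration, the spatial feature relation extraction layer of claims 4 and 5 can be sketched in NumPy as below. The per-pair scoring network of step S311 is reduced to one learned vector per head, neighbor features are kept at their original values rather than re-exchanged after every layer, and all names, dimensions, and the head count are assumptions; the sketch mirrors only the attention, stacking, and splicing structure of steps S310 to S340.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def gat_module(F_i, F_nbrs, heads):
    # Multi-head attention in the spirit of steps S311-S312: per head,
    # linearly transform, score each (i, j) pair, softmax-normalize,
    # aggregate, squash with sigmoid, then concatenate the heads.
    outs = []
    for W, a_vec in heads:
        Zi = W @ F_i
        Zj = [W @ F_j for F_j in F_nbrs]
        e = np.array([np.concatenate([Zi, z]) @ a_vec for z in Zj])  # e_ij
        alpha = softmax(e)                                           # normalized coefficients
        outs.append(sigmoid(sum(a * z for a, z in zip(alpha, Zj))))
    return np.concatenate(outs)

def fc_module(x, W):
    return np.tanh(x @ W)

def spatial_extraction_layer(F0_i, F0_nbrs, gat_params, fc_Ws, fc_final_W):
    # k stacked (attention, fully connected) pairs; the k-th FC module
    # consumes the splice of the even-order vectors and the last attention
    # output, mirroring step S340.
    k = len(gat_params)
    evens, F = [F0_i], F0_i
    for l in range(k - 1):
        F = gat_module(F, F0_nbrs, gat_params[l])        # order 2l+1
        F = fc_module(F, fc_Ws[l])                       # order 2l+2
        evens.append(F)
    F_last = gat_module(F, F0_nbrs, gat_params[k - 1])   # order 2k-1
    return fc_module(np.concatenate(evens + [F_last]), fc_final_W)  # order 2k

D, d_h, n_heads, k, n = 8, 4, 2, 3, 3   # with n_heads * d_h = D the dimensions close
rng = np.random.default_rng(5)
mk_heads = lambda: [(0.3 * rng.normal(size=(d_h, D)),
                     0.3 * rng.normal(size=2 * d_h)) for _ in range(n_heads)]
gat_params = [mk_heads() for _ in range(k)]
fc_Ws = [0.3 * rng.normal(size=(D, D)) for _ in range(k - 1)]
fc_final_W = 0.3 * rng.normal(size=((k + 1) * D, D))
F_i_t = spatial_extraction_layer(rng.normal(size=D),
                                 [rng.normal(size=D) for _ in range(n)],
                                 gat_params, fc_Ws, fc_final_W)
```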
CN202010873794.8A 2020-08-26 2020-08-26 Distributed multi-agent space-time feature extraction method and behavior decision method Active CN111738372B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010873794.8A CN111738372B (en) 2020-08-26 2020-08-26 Distributed multi-agent space-time feature extraction method and behavior decision method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010873794.8A CN111738372B (en) 2020-08-26 2020-08-26 Distributed multi-agent space-time feature extraction method and behavior decision method

Publications (2)

Publication Number Publication Date
CN111738372A (en) 2020-10-02
CN111738372B (en) 2020-11-17

Family

ID=72658807

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010873794.8A Active CN111738372B (en) 2020-08-26 2020-08-26 Distributed multi-agent space-time feature extraction method and behavior decision method

Country Status (1)

Country Link
CN (1) CN111738372B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190251358A1 (en) * 2017-03-30 2019-08-15 Hrl Laboratories, Llc System and method for neuromorphic visual activity classification based on foveated detection and contextual filtering
CN111401557A (en) * 2020-06-03 2020-07-10 超参数科技(深圳)有限公司 Agent decision making method, AI model training method, server and medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PAULA FRAGA-LAMAS et al.: "A Review on IoT Deep Learning UAV Systems for Autonomous Obstacle Detection and Collision Avoidance", Remote Sensing *
LIU Ming: "Research and Application of Information Evolution Networks in Distributed Collaborative Decision-Making", China Doctoral Dissertations Full-text Database, Information Science and Technology *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112203291A (en) * 2020-12-03 2021-01-08 中国科学院自动化研究所 Cluster control method for area coverage and connectivity maintenance based on knowledge embedding
CN113296502A (en) * 2021-05-08 2021-08-24 华东师范大学 Multi-robot collaborative navigation method based on hierarchical relation graph learning in dynamic environment
CN113253684A (en) * 2021-05-31 2021-08-13 杭州蓝芯科技有限公司 Multi-AGV (automatic guided vehicle) scheduling method and device based on graph convolution neural network and electronic equipment
CN113253684B (en) * 2021-05-31 2021-09-21 杭州蓝芯科技有限公司 Multi-AGV (automatic guided vehicle) scheduling method and device based on graph convolution neural network and electronic equipment
CN114118400A (en) * 2021-10-11 2022-03-01 中国科学院自动化研究所 Concentration network-based cluster countermeasure method and device
CN114118400B (en) * 2021-10-11 2023-01-03 中国科学院自动化研究所 Concentration network-based cluster countermeasure method and device
CN114371719A (en) * 2021-12-09 2022-04-19 湖南国天电子科技有限公司 SAC-based autonomous control method for underwater robot
CN114371719B (en) * 2021-12-09 2023-08-08 湖南国天电子科技有限公司 SAC-based autonomous control method for underwater robot
CN115268481A (en) * 2022-07-06 2022-11-01 中国航空工业集团公司沈阳飞机设计研究所 Unmanned aerial vehicle countermeasure strategy decision method and system
CN115439932A (en) * 2022-09-02 2022-12-06 四元生命工程有限公司 Behavior and information detection method taking regular tetrahedron framework as logic thinking
CN115439932B (en) * 2022-09-02 2023-10-10 四元生命工程有限公司 Behavior and information detection method taking regular tetrahedral architecture as logic thinking

Also Published As

Publication number Publication date
CN111738372B (en) 2020-11-17

Similar Documents

Publication Publication Date Title
CN111738372B (en) Distributed multi-agent space-time feature extraction method and behavior decision method
Tzafestas Synergy of IoT and AI in modern society: The robotics and automation case
Odonkor et al. Distributed operation of collaborating unmanned aerial vehicles for time-sensitive oil spill mapping
CN110737968B (en) Crowd trajectory prediction method and system based on deep convolutional long and short memory network
Hussein et al. A bi-directional agent-based pedestrian microscopic model
CN112686281A (en) Vehicle track prediction method based on space-time attention and multi-stage LSTM information expression
CN113159283B (en) Model training method based on federal transfer learning and computing node
CN113313947A (en) Road condition evaluation method of short-term traffic prediction graph convolution network
CN114519932B (en) Regional traffic condition integrated prediction method based on space-time relation extraction
Wei et al. Learning motion rules from real data: Neural network for crowd simulation
Gollapalli et al. A Neuro-Fuzzy Approach to Road Traffic Congestion Prediction.
CN113191241A (en) Model training method and related equipment
CN113253733A (en) Navigation obstacle avoidance method, device and system based on learning and fusion
Roldán et al. Swarmcity project: Can an aerial swarm monitor traffic in a smart city?
CN114639233A (en) Congestion state prediction method and device, electronic equipment and storage medium
CN112115744B (en) Point cloud data processing method and device, computer storage medium and electronic equipment
Li et al. Multi-mechanism swarm optimization for multi-UAV task assignment and path planning in transmission line inspection under multi-wind field
Xu et al. Integration of Mixture of Experts and Multimodal Generative AI in Internet of Vehicles: A Survey
CN113537267A (en) Method and device for generating countermeasure sample, storage medium and electronic equipment
CN107225571A (en) Motion planning and robot control method and apparatus, robot
Jolfaei et al. Guest editorial introduction to the special issue on deep learning models for safe and secure intelligent transportation systems
CN111814915B (en) Multi-agent space-time feature extraction method and system and behavior decision method and system
Lei et al. Digital twin‐based multi‐objective autonomous vehicle navigation approach as applied in infrastructure construction
CN115762147A (en) Traffic flow prediction method based on adaptive graph attention neural network
Skulimowski et al. Communication quality in anticipatory vehicle swarms: A simulation-based model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant