CN116449836A - Reconfigurable intelligent surface-assisted multi-robot system track planning method - Google Patents

Reconfigurable intelligent surface-assisted multi-robot system track planning method

Info

Publication number
CN116449836A
Authority
CN
China
Prior art keywords
network
mobile robot
robot
robot system
duel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310365852.XA
Other languages
Chinese (zh)
Other versions
CN116449836B (en)
Inventor
刘元玮
高新宇
董杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Tiantan Intelligent Technology Co ltd
Original Assignee
Beijing Tiantan Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Tiantan Intelligent Technology Co ltd
Priority to CN202310365852.XA
Publication of CN116449836A
Application granted
Publication of CN116449836B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/02 Control of position or course in two dimensions
    • G05D1/021 Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0212 Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory
    • G05D1/0221 Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory involving a learning process
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/02 Control of position or course in two dimensions
    • G05D1/021 Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0276 Control of position or course in two dimensions specially adapted to land vehicles using signals provided by a source external to the vehicle
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)

Abstract

The invention discloses a trajectory planning method for a reconfigurable intelligent surface-assisted multi-robot system. First, a communication model of the multi-robot system is established. An integrated machine learning scheme is then proposed that combines a long short-term memory-autoregressive integrated moving average (LSTM-ARIMA) model with a dueling double deep Q-network algorithm: the former predicts the initial and final positions of the robots, and the latter plans their trajectories while avoiding overestimation of action values. Finally, the scheme is used to optimize and obtain the optimal trajectory of the multi-robot system. The invention provides a novel trajectory planning method for reconfigurable intelligent surface-assisted multi-robot systems and has good application value.

Description

Reconfigurable intelligent surface-assisted multi-robot system track planning method
Technical Field
The invention relates to the field of wireless communication, and in particular to a trajectory planning method for multiple communicating robots in an indoor environment.
Background
Robots are now widely regarded as far less capable when working alone; their real strength lies in the cooperation of multiple robots. As a result, multi-robot systems operating in shared environments have attracted considerable attention in emerging applications such as cargo transportation, automatic patrol and emergency rescue. In these scenarios, robots need to coordinate with one another to achieve a well-defined goal, for example moving from one given location to another. However, as application environments grow more complex, cooperative task processing in a multi-robot system consumes significant local computing resources. Because of these collaboration requirements and the high computational complexity of trajectory planning, wireless communication using advanced multiple access techniques is of great importance for multi-robot systems.
In certain communication areas, such as blind spots, spectrum shortage remains a problem. Reconfigurable intelligent surfaces, which are passive devices that reflect incident signals toward users in a controllable way, are potential candidates for improving spectral efficiency. In particular, when a robot is located in a communication blind zone, a reconfigurable intelligent surface can create a virtual line-of-sight link between the base station and the robot. In view of the advantages of reconfigurable intelligent surfaces and non-orthogonal multiple access techniques, reconfigurable intelligent surface assistance is considered a potential solution to the trajectory design problem of multi-robot systems. Reconfigurable intelligent surface-assisted multi-robot trajectory planning has therefore been regarded as one of the candidates for next-generation robot navigation based on wireless communication.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention aims to provide a trajectory planning method for a reconfigurable intelligent surface-assisted multi-robot system.
To achieve the above purpose, the invention adopts the following technical solution:
A trajectory planning method for a reconfigurable intelligent surface-assisted multi-robot system comprises the following steps:
Step one: establish a communication model of the reconfigurable intelligent surface-assisted multi-robot system; the communication model comprises a single-antenna base station, L mobile robots and a reconfigurable intelligent surface with K reflecting elements;
the channels from the single-antenna base station to the reconfigurable intelligent surface, from the reconfigurable intelligent surface to the i-th mobile robot, and from the single-antenna base station to the i-th mobile robot are defined as $h^{H} \in \mathbb{C}^{1 \times K}$, $g_i \in \mathbb{C}^{K \times 1}$ and $I_i \in \mathbb{C}^{1 \times 1}$, respectively; in addition, the reflection coefficient matrix of the reconfigurable intelligent surface at time $t \in [0, T]$ is expressed as:
where $\beta_k$ and $\theta_k$ denote the amplitude and phase of the k-th reflecting element, respectively; the signal received by the i-th mobile robot at position $q_i(t)$ at time t is:
where $n \sim \mathcal{CN}(0, \delta^2)$ denotes additive white Gaussian noise with zero mean and variance $\delta^2$, and $S_i(t)$ and $S_j(t)$ are the transmitted symbols of the i-th and j-th mobile robots, respectively; in addition, the decoding orders satisfy $O(j) > O(i)$, indicating that the i-th mobile robot is decoded before the j-th mobile robot; the received rate of the i-th mobile robot and the rate of decoding the j-th mobile robot are expressed as:
where $p_i(t)$ and $p_j(t)$ denote the transmit power of the i-th and j-th mobile robots, respectively; the trajectory optimization problem is then expressed as maximizing the total communication rate of all users over the period from 0 to T;
Step two: use integrated machine learning, which combines a long short-term memory-autoregressive integrated moving average (LSTM-ARIMA) model with a dueling double deep Q-network algorithm, to predict the initial and final positions of the robots and to plan their trajectories while avoiding overestimation of action values;
Step three: on the basis of step two, optimize to obtain the optimal trajectory of the multi-robot system; once the neural network has converged through training, the optimal robot trajectory can be output.
Further, the specific process of step two is as follows:
the sets of possible initial and final positions of all robots are predicted separately by the long short-term memory (LSTM) network and by the autoregressive integrated moving average (ARIMA) model, as follows:
where α, β and γ denote the parameters of the LSTM network and the parameters of the ARIMA model; $S_{max}$ and $S_{min}$ denote the maximum and minimum values in the training sample, and the two remaining terms denote the predicted value of the autoregressive model and the predicted value of the moving average model, respectively; weights are then assigned according to the CRITIC weighting method, and the LSTM and ARIMA predictions are fused as follows:
where $w_1$ and $w_2$ denote the assigned weights; optimization is then carried out through dueling double deep Q-network learning; in the dueling double deep Q-network, the single-antenna base station is defined as the agent, and centering (mean removal) is applied at the same time to avoid failure to converge, so that the loss function is obtained as follows:
where $\mu_C$ and two further parameter sets denote the parameters of the convolutional layer, of the first dense layer and of the second dense layer in the dueling double deep Q-network, respectively; $\Gamma(e')$, $Q_e$, $Q_f$ and $f_{max}(e', \mu)$ denote the state feature vector, the state Q estimation function, the action Q estimation function and the action corresponding to the maximum Q value, respectively; furthermore, the remaining symbols, including $e'$ and $\mu$, denote the current action, the next action and the current network parameters; based on the loss function, the agent learns until convergence, after which the optimal network parameters can be output.
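The prediction-fusion part of step two can be illustrated with a short Python sketch. The fusion formula itself does not appear in this text, so the sketch assumes a plain weighted sum of the two predictions, with weights from the standard CRITIC rule (contrast measured by the standard deviation of min-max scaled values, conflict measured by one minus the Pearson correlation). The function names `min_max`, `critic_weights` and `fuse`, the choice of feeding each model's predictions over a validation window into CRITIC, and all sample numbers are illustrative assumptions, not the patent's implementation.

```python
import math
from statistics import mean, pstdev


def min_max(values):
    """Scale values to [0, 1] using the sample minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return cov / den if den else 0.0


def critic_weights(columns):
    """CRITIC weights: w_j proportional to std_j * sum_k (1 - corr_jk).

    Each column holds one predictor's outputs (assumed: a validation window).
    """
    scaled = [min_max(c) for c in columns]
    info = []
    for j, cj in enumerate(scaled):
        conflict = sum(1.0 - pearson(cj, ck) for k, ck in enumerate(scaled) if k != j)
        info.append(pstdev(cj) * conflict)
    total = sum(info)
    if total <= 0:
        # Degenerate case (e.g. identical predictors): fall back to equal weights.
        return [1.0 / len(columns)] * len(columns)
    return [c / total for c in info]


def fuse(pred_lstm, pred_arima, w1, w2):
    """Assumed fusion rule: element-wise weighted sum of the two predictions."""
    return [w1 * a + w2 * b for a, b in zip(pred_lstm, pred_arima)]


if __name__ == "__main__":
    lstm_window = [1.0, 2.0, 3.5, 4.0]    # hypothetical LSTM predictions
    arima_window = [1.2, 1.9, 3.0, 4.4]   # hypothetical ARIMA predictions
    w1, w2 = critic_weights([lstm_window, arima_window])
    print(w1, w2, fuse(lstm_window, arima_window, w1, w2))
```

With only two predictors the conflict term is the same for both, so under these assumptions the weights reduce to the ratio of the scaled standard deviations.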
The present invention also provides a computer readable storage medium having stored therein a computer program which when executed by a processor implements the above method.
The invention also provides a reconfigurable intelligent surface-assisted multi-robot system, which comprises a processor and a memory, wherein the memory is used for storing a computer program; the processor is configured to execute the computer program to implement the above method.
The beneficial effects of the invention are as follows: the invention provides an integrated machine learning solution to the trajectory planning problem of a reconfigurable intelligent surface-assisted multi-robot system. First, a communication model of the multi-robot system is established. An integrated machine learning scheme is then proposed that combines a long short-term memory-autoregressive integrated moving average (LSTM-ARIMA) model with a dueling double deep Q-network algorithm: the former predicts the initial and final positions of the robots, and the latter plans their trajectories while avoiding overestimation of action values. Finally, the scheme is used to optimize and obtain the optimal trajectory of the multi-robot system. The invention provides a solution for trajectory planning of reconfigurable intelligent surface-assisted multi-robot systems and has good application value.
Drawings
FIG. 1 shows the overall flow of the method according to an embodiment of the invention.
Detailed Description
The invention is further described below with reference to the accompanying drawing. The present embodiment is based on the above technical solution and provides a detailed implementation and a specific operating process, but the protection scope of the invention is not limited to this embodiment.
This embodiment provides a trajectory planning method for a reconfigurable intelligent surface-assisted multi-robot system, which addresses the navigation problem of the multi-robot system. As shown in FIG. 1, the method first establishes a communication model of the multi-robot system. It then proposes an integrated machine learning scheme that combines a long short-term memory-autoregressive integrated moving average (LSTM-ARIMA) model with a dueling double deep Q-network algorithm: the former predicts the initial and final positions of the robots, and the latter plans their trajectories while avoiding overestimation of action values. Finally, the scheme is used to optimize and obtain the optimal trajectory of the multi-robot system.
The trajectory planning method for a reconfigurable intelligent surface-assisted multi-robot system specifically comprises the following steps:
Step one: establish a communication model of the reconfigurable intelligent surface-assisted multi-robot system; the communication model comprises a single-antenna base station, L mobile robots and a reconfigurable intelligent surface with K reflecting elements;
the channels from the single-antenna base station to the reconfigurable intelligent surface, from the reconfigurable intelligent surface to the i-th mobile robot, and from the single-antenna base station to the i-th mobile robot are defined as $h^{H} \in \mathbb{C}^{1 \times K}$, $g_i \in \mathbb{C}^{K \times 1}$ and $I_i \in \mathbb{C}^{1 \times 1}$, respectively. In addition, the reflection coefficient matrix of the reconfigurable intelligent surface at time $t \in [0, T]$ can be expressed as:
where $\beta_k$ and $\theta_k$ denote the amplitude and phase of the k-th reflecting element, respectively; the signal received by the i-th mobile robot at position $q_i(t)$ at time t is:
where $n \sim \mathcal{CN}(0, \delta^2)$ denotes additive white Gaussian noise with zero mean and variance $\delta^2$, and $S_i(t)$ and $S_j(t)$ are the transmitted symbols of the i-th and j-th mobile robots, respectively. In addition, the decoding orders satisfy $O(j) > O(i)$, indicating that the i-th mobile robot is decoded before the j-th mobile robot; the received rate of the i-th mobile robot and the rate of decoding the j-th mobile robot may be expressed as:
where $p_i(t)$ and $p_j(t)$ denote the transmit power of the i-th and j-th mobile robots, respectively. The trajectory optimization problem is then expressed as maximizing the total communication rate of all users over the period from 0 to T.
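The expressions for the reflection coefficient matrix, the received signal and the rates are not reproduced in this text. As a rough illustration only, the following Python sketch assumes the forms commonly used for RIS-assisted two-user downlink non-orthogonal multiple access: diagonal reflection entries $\beta_k e^{j\theta_k}$, an effective channel equal to the cascaded base station-surface-robot link plus the direct link, and successive interference cancellation in which the i-th robot's signal is decoded first. The function names, the restriction to two robots and the numeric values are assumptions made for this example.

```python
import cmath
import math


def reflection_coefficients(amplitudes, phases):
    """Diagonal entries beta_k * exp(j * theta_k) of the assumed reflection matrix."""
    return [b * cmath.exp(1j * th) for b, th in zip(amplitudes, phases)]


def effective_channel(h_row, g_i, direct_i, coeffs):
    """Cascaded BS-RIS-robot channel plus the direct BS-robot channel.

    h_row holds the K entries of the 1xK row vector h^H, g_i the K entries of
    the Kx1 vector g_i, and direct_i the scalar direct channel I_i.
    """
    cascaded = sum(hk * ck * gk for hk, ck, gk in zip(h_row, coeffs, g_i))
    return cascaded + direct_i


def noma_rates(c_i, c_j, p_i, p_j, noise_var):
    """Assumed two-robot rates in bit/s/Hz with O(j) > O(i).

    Robot i decodes its own signal while treating robot j's signal as
    interference; robot j first removes robot i's signal and then decodes
    its own signal free of interference.
    """
    gain_i, gain_j = abs(c_i) ** 2, abs(c_j) ** 2
    rate_i = math.log2(1.0 + gain_i * p_i / (gain_i * p_j + noise_var))
    rate_j = math.log2(1.0 + gain_j * p_j / noise_var)
    return rate_i, rate_j


if __name__ == "__main__":
    coeffs = reflection_coefficients([1.0, 1.0], [0.0, math.pi / 2])
    c_i = effective_channel([0.8, 0.5j], [0.6, 0.7], 0.1, coeffs)
    c_j = effective_channel([0.8, 0.5j], [0.9, 0.2], 0.3, coeffs)
    r_i, r_j = noma_rates(c_i, c_j, p_i=3.0, p_j=1.0, noise_var=1.0)
    print(r_i + r_j)  # instantaneous sum rate; the objective sums this over [0, T]
```

Under these assumptions, the trajectory objective would accumulate the printed sum rate over the sampled time instants between 0 and T for every robot position along the path.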
Step two: use integrated machine learning, which combines a long short-term memory-autoregressive integrated moving average (LSTM-ARIMA) model with a dueling double deep Q-network algorithm, to predict the initial and final positions of the robots and to plan their trajectories while avoiding overestimation of action values.
As a variant of the recurrent neural network, the long short-term memory (LSTM) network performs very well at handling non-stationary and nonlinear sequence data efficiently. However, LSTM does not completely solve the vanishing-gradient problem for long sequences. The autoregressive integrated moving average (ARIMA) model does not suffer from this problem and provides an effective solution for linear sequence data. However, it is a time-series prediction model that essentially captures linear relationships and cannot handle nonlinear ones. In this embodiment, the sets of possible initial and final positions of all robots are predicted separately by the LSTM network and by the ARIMA model, as follows:
where α, β and γ denote the parameters of the LSTM network and the parameters of the ARIMA model. $S_{max}$ and $S_{min}$ denote the maximum and minimum values in the training sample, and the two remaining terms denote the predicted value of the autoregressive model and the predicted value of the moving average model, respectively. Weights are then assigned according to the CRITIC weighting method, and the LSTM and ARIMA predictions are fused as follows:
where $w_1$ and $w_2$ denote the assigned weights; optimization is then carried out through dueling double deep Q-network learning. In the dueling double deep Q-network, the single-antenna base station is defined as the agent, and centering (mean removal) is applied at the same time to avoid failure to converge, so that the loss function can be obtained as follows:
where $\mu_C$ and two further parameter sets denote the parameters of the convolutional layer, of the first dense layer and of the second dense layer in the dueling double deep Q-network, respectively. $\Gamma(e')$, $Q_e$, $Q_f$ and $f_{max}(e', \mu)$ denote the state feature vector, the state Q estimation function, the action Q estimation function and the action corresponding to the maximum Q value, respectively. In addition, the remaining symbols, including $e'$ and $\mu$, denote the current action, the next action and the current network parameters. Based on the loss function, the agent learns until convergence, after which the optimal network parameters can be output.
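Because the loss function is not reproduced in this text, the following Python sketch shows only the generic textbook form of a dueling double deep Q-network update, which may differ from the patent's exact expression. It assumes that the state and action Q estimates are combined with mean removal, that the online network selects the next action while a separate target network evaluates it (the mechanism that limits overestimation of action values), and that the loss is a squared temporal-difference error. The names `reward` and `discount`, the existence of a target network and the numbers are assumptions for illustration.

```python
def dueling_q(state_value, advantages):
    """Combine the two streams: Q(e, f) = V(e) + A(e, f) - mean over f of A(e, f)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def double_dqn_target(reward, discount, q_next_online, q_next_target, done=False):
    """Double-Q target: the online network picks the action, the target network scores it."""
    if done:
        return reward
    best_action = max(range(len(q_next_online)), key=q_next_online.__getitem__)
    return reward + discount * q_next_target[best_action]


def squared_td_loss(q_taken, target):
    """Squared temporal-difference error for a single transition."""
    return (target - q_taken) ** 2


if __name__ == "__main__":
    # Hypothetical stream outputs for the next state e'.
    q_online = dueling_q(0.4, [0.1, 0.9, 0.2])
    q_target = dueling_q(0.5, [0.6, 0.3, 0.1])
    target = double_dqn_target(1.0, 0.9, q_online, q_target)
    print(squared_td_loss(q_taken=1.2, target=target))
```

Removing the mean of the action stream makes the split between the state term and the action term unique, which is one plausible reading of the centering step mentioned above.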
Step three: on the basis of step two, optimize to obtain the optimal trajectory of the multi-robot system. Once the neural network has converged through training, the optimal robot trajectory can be output.
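How a trajectory might be read out of a converged Q function can be sketched as a greedy rollout. The text does not describe the state space or the action set, so the grid positions, the four-move action set in `ACTIONS`, the `q_values` callback and the toy scoring function below are all hypothetical stand-ins for the trained network.

```python
# Hypothetical action set: index -> (dx, dy) for up, down, left, right.
ACTIONS = {0: (0, 1), 1: (0, -1), 2: (-1, 0), 3: (1, 0)}


def greedy_trajectory(q_values, start, goal, max_steps=100):
    """Follow the highest-valued action from start until goal or max_steps.

    q_values(position) must return one score per action in ACTIONS.
    """
    position = start
    path = [position]
    for _ in range(max_steps):
        if position == goal:
            break
        scores = q_values(position)
        action = max(range(len(scores)), key=scores.__getitem__)
        dx, dy = ACTIONS[action]
        position = (position[0] + dx, position[1] + dy)
        path.append(position)
    return path


def toy_q_values(goal):
    """Stand-in for a trained network: scores a move by the negative distance left."""
    def q(position):
        scores = []
        for dx, dy in ACTIONS.values():
            nxt = (position[0] + dx, position[1] + dy)
            scores.append(-(abs(goal[0] - nxt[0]) + abs(goal[1] - nxt[1])))
        return scores
    return q


if __name__ == "__main__":
    print(greedy_trajectory(toy_q_values((2, 1)), start=(0, 0), goal=(2, 1)))
```

In a real system the callback would evaluate the trained network, and its scores would reflect the achievable communication rate along the path rather than distance alone.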
Various modifications and variations of the present invention will be apparent to those skilled in the art in light of the foregoing teachings and are intended to be included within the scope of the following claims.

Claims (4)

1. A trajectory planning method for a reconfigurable intelligent surface-assisted multi-robot system, characterized by comprising the following steps:
step one: establishing a communication model of the reconfigurable intelligent surface-assisted multi-robot system; the communication model comprises a single-antenna base station, L mobile robots and a reconfigurable intelligent surface with K reflecting elements;
the channels from the single-antenna base station to the reconfigurable intelligent surface, from the reconfigurable intelligent surface to the i-th mobile robot, and from the single-antenna base station to the i-th mobile robot are defined as $h^{H} \in \mathbb{C}^{1 \times K}$, $g_i \in \mathbb{C}^{K \times 1}$ and $I_i \in \mathbb{C}^{1 \times 1}$, respectively; in addition, the reflection coefficient matrix of the reconfigurable intelligent surface at time $t \in [0, T]$ is expressed as:
where $\beta_k$ and $\theta_k$ denote the amplitude and phase of the k-th reflecting element, respectively; the signal received by the i-th mobile robot at position $q_i(t)$ at time t is:
where $n \sim \mathcal{CN}(0, \delta^2)$ denotes additive white Gaussian noise with zero mean and variance $\delta^2$, and $S_i(t)$ and $S_j(t)$ are the transmitted symbols of the i-th and j-th mobile robots, respectively; in addition, the decoding orders satisfy $O(j) > O(i)$, indicating that the i-th mobile robot is decoded before the j-th mobile robot; the received rate of the i-th mobile robot and the rate of decoding the j-th mobile robot are expressed as:
where $p_i(t)$ and $p_j(t)$ denote the transmit power of the i-th and j-th mobile robots, respectively; the trajectory optimization problem is then expressed as maximizing the total communication rate of all users over the period from 0 to T;
step two: using integrated machine learning, which combines a long short-term memory-autoregressive integrated moving average (LSTM-ARIMA) model with a dueling double deep Q-network algorithm, to predict the initial and final positions of the robots and to plan their trajectories while avoiding overestimation of action values;
step three: on the basis of step two, optimizing to obtain the optimal trajectory of the multi-robot system; once the neural network has converged through training, the optimal robot trajectory can be output.
2. The method according to claim 1, wherein the specific process of step two is:
the sets of possible initial and final positions of all robots are predicted separately by the long short-term memory (LSTM) network and by the autoregressive integrated moving average (ARIMA) model, as follows:
where α, β and γ denote the parameters of the LSTM network and the parameters of the ARIMA model; $S_{max}$ and $S_{min}$ denote the maximum and minimum values in the training sample, and the two remaining terms denote the predicted value of the autoregressive model and the predicted value of the moving average model, respectively; weights are then assigned according to the CRITIC weighting method, and the LSTM and ARIMA predictions are fused as follows:
where $w_1$ and $w_2$ denote the assigned weights; optimization is then carried out through dueling double deep Q-network learning; in the dueling double deep Q-network, the single-antenna base station is defined as the agent, and centering (mean removal) is applied at the same time to avoid failure to converge, so that the loss function is obtained as follows:
where $\mu_C$ and two further parameter sets denote the parameters of the convolutional layer, of the first dense layer and of the second dense layer in the dueling double deep Q-network, respectively; $\Gamma(e')$, $Q_e$, $Q_f$ and $f_{max}(e', \mu)$ denote the state feature vector, the state Q estimation function, the action Q estimation function and the action corresponding to the maximum Q value, respectively; furthermore, the remaining symbols, including $e'$ and $\mu$, denote the current action, the next action and the current network parameters; based on the loss function, the agent learns until convergence, after which the optimal network parameters can be output.
3. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program which, when executed by a processor, implements the method of any of claims 1-2.
4. A reconfigurable intelligent surface-assisted multi-robot system comprising a processor and a memory for storing a computer program; the processor being adapted to implement the method of any of claims 1-2 when the computer program is executed.
CN202310365852.XA 2023-04-07 2023-04-07 Reconfigurable intelligent surface-assisted multi-robot system track planning method Active CN116449836B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310365852.XA CN116449836B (en) 2023-04-07 2023-04-07 Reconfigurable intelligent surface-assisted multi-robot system track planning method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310365852.XA CN116449836B (en) 2023-04-07 2023-04-07 Reconfigurable intelligent surface-assisted multi-robot system track planning method

Publications (2)

Publication Number Publication Date
CN116449836A (en) 2023-07-18
CN116449836B (en) 2024-09-06

Family

ID=87123218

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310365852.XA Active CN116449836B (en) 2023-04-07 2023-04-07 Reconfigurable intelligent surface-assisted multi-robot system track planning method

Country Status (1)

Country Link
CN (1) CN116449836B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113162679A (en) * 2021-04-01 2021-07-23 南京邮电大学 DDPG algorithm-based IRS (intelligent reflecting surface)-assisted unmanned aerial vehicle communication joint optimization method
CN114422363A (en) * 2022-01-11 2022-04-29 北京科技大学 Unmanned aerial vehicle loaded RIS auxiliary communication system capacity optimization method and device
WO2022241808A1 (en) * 2021-05-19 2022-11-24 广州中国科学院先进技术研究所 Multi-robot trajectory planning method
CN115412159A (en) * 2022-09-01 2022-11-29 大连理工大学 Safety communication method based on assistance of aerial intelligent reflecting surface
CN115665759A (en) * 2022-10-24 2023-01-31 江苏海洋大学 Auxiliary wireless safety communication transmission method based on UAV-RIS

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
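李思贤 (Li Sixian): "Research on UAV Communication Based on Intelligent Reflecting Surfaces" (基于智能反射面的无人机通信研究), China Master's Theses Full-text Database, Engineering Science and Technology II, 15 January 2022 (2022-01-15) *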

Also Published As

Publication number Publication date
CN116449836B (en) 2024-09-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant