CN114815834A - A dynamic path planning method for mobile agents in a stage environment - Google Patents
A dynamic path planning method for mobile agents in a stage environment
- Publication number
- CN114815834A (application number CN202210465123.7A)
- Authority
- CN
- China
- Prior art keywords
- mobile agent
- dynamic
- obstacles
- state
- obstacle
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000000034 method Methods 0.000 title claims abstract description 40
- 230000003068 static effect Effects 0.000 claims abstract description 28
- 238000004088 simulation Methods 0.000 claims abstract description 12
- 230000007246 mechanism Effects 0.000 claims abstract description 10
- 230000006870 function Effects 0.000 claims description 28
- 230000009471 action Effects 0.000 claims description 21
- 238000013528 artificial neural network Methods 0.000 claims description 14
- 230000008569 process Effects 0.000 claims description 11
- 230000003993 interaction Effects 0.000 claims description 8
- 230000007704 transition Effects 0.000 claims description 8
- 230000006403 short-term memory Effects 0.000 claims description 5
- 230000003044 adaptive effect Effects 0.000 claims description 3
- 230000001186 cumulative effect Effects 0.000 claims description 3
- 230000004913 activation Effects 0.000 claims description 2
- 230000000694 effects Effects 0.000 description 5
- 238000010586 diagram Methods 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 238000010606 normalization Methods 0.000 description 1
- 230000000644 propagated effect Effects 0.000 description 1
- 230000002787 reinforcement Effects 0.000 description 1
- 239000002699 waste material Substances 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
- G05D1/02—Control of position or course in two dimensions
- G05D1/021—Control of position or course in two dimensions specially adapted to land vehicles
- G05D1/0212—Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory
- G05D1/0223—Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory involving speed control of the vehicle
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
- G05D1/02—Control of position or course in two dimensions
- G05D1/021—Control of position or course in two dimensions specially adapted to land vehicles
- G05D1/0212—Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory
- G05D1/0221—Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory involving a learning process
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
- G05D1/02—Control of position or course in two dimensions
- G05D1/021—Control of position or course in two dimensions specially adapted to land vehicles
- G05D1/0276—Control of position or course in two dimensions specially adapted to land vehicles using signals provided by a source external to the vehicle
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Aviation & Aerospace Engineering (AREA)
- Radar, Positioning & Navigation (AREA)
- Remote Sensing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Automation & Control Theory (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Description
Technical Field
The invention relates to the technical field of intelligent robot path planning, and in particular to a dynamic path planning method for a mobile agent in a stage environment.
Background Art
To meet the diverse cultural service needs of grassroots communities, cultural service functions such as cultural performances, meetings and deliberations, exhibitions and reading, and folk activities need to be carried out within the activity space of small grassroots cultural service complexes. Cultural facilities configured by separate categories cannot meet residents' expectations for comprehensive cultural services. Small cultural service complexes can solve this problem well: their focus is the construction of multi-functional activity spaces that integrate folk activities, exhibitions, meetings, reading, and other functions into a single venue service carrier, which both satisfies the diversity and self-organization needs of rural grassroots culture and explores a new model that meets the requirements of public cultural services in China's new rural areas.
To reduce the waste of land resources and improve space utilization efficiency, various intelligent mobile bodies are needed to assist in rapidly combining and switching multiple functional spaces. Because the service spaces differ, the equipment configuration and usage requirements of each functional space also differ. To meet the "one hall, many uses" requirement of a small cultural complex, intelligent mobile bodies are usually needed to assist in rapidly combining and switching the functional spaces and to perform dynamic path planning in the crowded environment of the small complex, thereby satisfying multiple cultural service needs within a single space.
The space of a small cultural complex is a typical environment in which people, machines, and objects coexist. While functional spaces are being switched, multiple pieces of equipment in the space must plan and execute trajectories in the presence of other equipment in order to realize the switching of spatial functions. Quickly avoiding dynamic and static obstacles and reaching the target point during switching therefore requires a dynamic path planning algorithm to control the intelligent mobile body, and the crowded environment inside the cultural complex places high demands on that algorithm.
Traditional dynamic path planning algorithms rely on rapid sensor refreshes to perceive information about surrounding obstacles, and the planned path changes as the dynamic obstacles move, which leads to detours or unnatural trajectories. Such algorithms cannot predict the motion trend of surrounding dynamic obstacles and lack adaptability. To address these problems, a dynamic path planning method that can distinguish between dynamic and static obstacles and predict the trend of dynamic obstacles is urgently needed.
Summary of the Invention
In view of the problems in the prior art, the purpose of the present invention is to provide a dynamic path planning method for a mobile agent in a stage environment. Building on a deep reinforcement learning method, the method designs a new Markov decision process and network structure, introduces a social attention mechanism to assign attention scores to dynamic obstacles, uses a long short-term memory neural network to handle the variable input dimension of the feedforward network, constructs a new reward function for the different avoidance requirements of dynamic and static obstacles, and proposes a new experience pool update method to improve the convergence speed of network training, so that the mobile agent can distinguish between dynamic and static obstacles and predict the motion trend of dynamic obstacles. To achieve the above object, the technical solution adopted by the present invention is as follows:
A dynamic path planning method for a mobile agent in a stage environment, comprising the following steps:
1) Build a simulation environment model of the mobile agent and the dynamic and static obstacles based on the gym library;
2) Design the Markov decision process: define the state space S, action space A, transition probability P, reward R, and discount factor γ;
3) Design the neural network structure;
4) Initialize the network parameters by imitation-learning pre-training with the Optimal Reciprocal Collision Avoidance (ORCA) algorithm; after imitation learning, optimize the network parameters by training on the mobile agent's actual interaction with the simulation environment;
5) Train the neural network with the adaptive moment estimation method (Adam) to obtain the optimal value function:
V^*(u_t) = \sum \gamma^{\Delta t \cdot V_{pref}} \cdot P(u_t, a_t)
6) Set the optimal policy by maximizing the cumulative return:
\pi^*(u_t) = \arg\max_{a_t} \Big[ R(u_t, a_t) + \gamma^{\Delta t \cdot V_{pref}} \int P(u_t, u_{t+\Delta t} \mid a_t)\, V^*(u_{t+\Delta t})\, du_{t+\Delta t} \Big]
where u_t denotes the current joint state of the mobile agent and the obstacles, a_t denotes an action from the action space, γ denotes the decay factor, Δt denotes the time interval between two actions, V_pref denotes the preferred speed, V^* denotes the optimal value function, P denotes the state transition function, R denotes the reward function, and u_{t+Δt} denotes the joint state at the next time step.
7) Select the action a_t at the current time step according to the optimal policy until the mobile agent reaches the target, as illustrated in the sketch below.
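As an illustration of steps 5) to 7), the following is a minimal, hedged sketch of greedy action selection against the learned value function. The environment helpers (one_step_lookahead, reset, step), the control interval DT, and the value_net callable are assumptions introduced for this sketch, not elements of the patent.

```python
# Hedged sketch: choose a_t = argmax_a [ R(u_t, a) + gamma^(dt*v_pref) * V*(u_{t+dt}) ].
# All names below are illustrative assumptions, not the patent's implementation.
import numpy as np

GAMMA = 0.9          # discount factor from step 2)
DT = 0.25            # assumed control interval Delta t (not specified in the text)

def select_action(env, value_net, joint_state, actions, v_pref):
    """Greedy one-step lookahead over the 90 discrete (v, w) actions."""
    best_a, best_q = None, -np.inf
    for a in actions:
        next_state, reward = env.one_step_lookahead(joint_state, a)   # hypothetical helper
        q = reward + GAMMA ** (DT * v_pref) * value_net(next_state)
        if q > best_q:
            best_a, best_q = a, q
    return best_a

def run_episode(env, value_net, actions, v_pref, max_steps=200):
    state, done, t = env.reset(), False, 0
    while not done and t < max_steps:
        a = select_action(env, value_net, state, actions, v_pref)
        state, _, done, _ = env.step(a)     # classic gym 4-tuple step API
        t += 1
```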
Further, in step 1), the mobile agent and the dynamic obstacles are modeled as circles with a radius of 0.3 m, while a static obstacle is defined as either a circle with a radius between 0.5 m and 1 m or a quadrilateral with an area between 1 m² and 1.5 m².
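For step 1), the following is a minimal sketch of how such a gym-based environment model might be laid out; the class name StageEnv, its fields, and the obstacle counts are assumptions for illustration (quadrilateral static obstacles are omitted for brevity).

```python
# Hedged sketch of the step-1) simulation model: a gym-style environment holding
# a 0.3 m mobile agent, 0.3 m dynamic obstacles, and larger static obstacles.
import gym
import numpy as np
from gym import spaces

class StageEnv(gym.Env):
    """Gym-style stage environment with the geometry described in step 1)."""

    def __init__(self, n_dynamic=5, n_static=3, map_size=10.0):
        super().__init__()
        self.map_size = map_size
        # mobile agent and dynamic obstacles: circles of radius 0.3 m
        self.agent = {"pos": np.zeros(2), "vel": np.zeros(2), "r": 0.3, "v_pref": 1.0}
        self.dynamic = [{"pos": np.random.uniform(-4, 4, 2),
                         "vel": np.zeros(2), "r": 0.3}
                        for _ in range(n_dynamic)]
        # static obstacles: circles with radius in [0.5, 1.0] m (the text also
        # allows quadrilaterals of 1-1.5 m^2, not modeled here)
        self.static = [{"pos": np.random.uniform(-4, 4, 2),
                        "r": np.random.uniform(0.5, 1.0)}
                       for _ in range(n_static)]
        self.action_space = spaces.Discrete(90)            # see step 2)
        # placeholder observation space; the real joint state is variable-length
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(9,), dtype=np.float32)
```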
Further, in step 2), the state space S is defined as follows: the state of a dynamic obstacle is S_D = [P_x, P_y, V_x, V_y, r, V_pref], the state of a static obstacle is S_S = [P_x, P_y, r], the state of the mobile agent is S_T = [P_x, P_y, G_x, G_y, V_x, V_y, θ, r, V_pref], and the joint state is u_t = [S_T, S_S, S_D], where (P_x, P_y) is the current position of the mobile agent or obstacle, (G_x, G_y) is the position of the set target point, θ is the heading angle of the mobile agent, r is the radius of the mobile agent or obstacle, V_pref is the preferred speed of the mobile agent, and (V_x, V_y) is the moving velocity of the mobile agent or dynamic obstacle;
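The state vectors above can be packed as fixed-layout arrays; a small sketch follows, with helper names that are assumptions rather than the patent's code.

```python
# Hedged sketch of the step-2) state vectors S_T, S_S, S_D and joint state u_t.
import numpy as np

def agent_state(px, py, gx, gy, vx, vy, theta, r, v_pref):
    # S_T = [Px, Py, Gx, Gy, Vx, Vy, theta, r, Vpref]
    return np.array([px, py, gx, gy, vx, vy, theta, r, v_pref])

def dynamic_obstacle_state(px, py, vx, vy, r, v_pref):
    # S_D = [Px, Py, Vx, Vy, r, Vpref]
    return np.array([px, py, vx, vy, r, v_pref])

def static_obstacle_state(px, py, r):
    # S_S = [Px, Py, r]
    return np.array([px, py, r])

def joint_state(s_t, s_s_list, s_d_list):
    # u_t = [S_T, S_S, S_D]; kept as a tuple because the numbers of static and
    # dynamic obstacles (and hence the dimensions) can vary between scenes.
    return (s_t, list(s_s_list), list(s_d_list))
```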
The action space A consists of a linear velocity and an angular velocity. To satisfy the kinematic constraints, the angular velocity is divided into 18 equal parts over the interval [-π/4, π/4], and the linear velocity is obtained by evaluating a smoothing function at x = 1, 2, 3, 4, 5, yielding 5 smoothly varying linear velocities; the action space therefore contains 90 action combinations;
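A sketch of the 90-action discrete space follows: 18 angular velocities over [-π/4, π/4] combined with 5 linear speeds. The patent's exact smoothing function for the linear speed is not reproduced in this text, so the mapping used below is only an illustrative assumption.

```python
# Hedged sketch of the step-2) action space: 18 angular x 5 linear = 90 actions.
import numpy as np

def build_action_space(v_pref=1.0):
    angular = np.linspace(-np.pi / 4, np.pi / 4, 18)       # 18 equal parts
    # The text uses a smooth function of x = 1..5 for the linear speed; the
    # exact function is not given here, so this exponential mapping is assumed.
    linear = [(np.exp(x / 5.0) - 1) / (np.e - 1) * v_pref for x in range(1, 6)]
    return [(v, w) for v in linear for w in angular]        # 90 (v, w) pairs

actions = build_action_space()
assert len(actions) == 90
```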
The transition probability P is approximated by a trajectory prediction model;
The reward R is defined piecewise in terms of the position of the target point G_{x,y}, the current position of the mobile agent P_{x,y}, the distance d_s between the mobile agent and static obstacles, and the distance d_d between the mobile agent and dynamic obstacles; the discount factor γ is set to 0.9.
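The reward formula itself is not reproduced in this text; the piecewise sketch below is purely an assumed illustration of the structure described above (a bonus for reaching the target, penalties for approaching or hitting static and dynamic obstacles), and its thresholds and values are not the patent's.

```python
# Hedged, illustrative reward sketch; the numeric thresholds and penalty values
# are assumptions, NOT the patent's actual reward function.
import numpy as np

def reward(goal, pos, d_s, d_d, goal_tol=0.3, r_agent=0.3):
    if np.linalg.norm(np.asarray(goal) - np.asarray(pos)) < goal_tol:
        return 1.0                   # reached the target point
    if d_s <= r_agent or d_d <= r_agent:
        return -0.25                 # collision with a static or dynamic obstacle
    if d_d < 0.2 + r_agent:
        return -0.1 + 0.5 * d_d      # discomfort zone around dynamic obstacles
    return 0.0
```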
Further, the network structure in step 3) consists of the following modules. 1. Input layer: the input is the joint state u_t = [S_T, S_S, S_D] defined in the preceding steps. 2. Long short-term memory (LSTM) module: the LSTM module sorts the obstacles around the mobile agent and fixes the output dimension of this network layer. 3. Social attention mechanism: the social attention module estimates the probability that the mobile agent will collide with the surrounding dynamic obstacles and expresses it in the form of attention scores. 4. Output layer: the output layer produces the optimal value function V*(u_t) through a weighted linear combination of the network features.
Further, the network in step 3) operates as follows: first, the state information of the mobile agent and the obstacles is fed into the network; the obstacles are then split into dynamic and static obstacles according to their state information; the states of the mobile agent and of the dynamic obstacles are fed into the LSTM module and then into the social attention module; next, the processed states, the resulting interaction features, and the static obstacle states are fed into two fully connected layers; finally, an activation function normalizes the output to obtain the optimal value function.
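A compact PyTorch sketch of the described pipeline (LSTM over the sorted dynamic-obstacle states, social-attention weighting, two fully connected layers, and a normalizing activation) is given below; the layer sizes, the attention form, and the single-static-obstacle simplification are assumptions, since the text does not fix them.

```python
# Hedged PyTorch sketch of the step-3) value network. Sizes and the exact
# attention formulation are illustrative assumptions.
import torch
import torch.nn as nn

class ValueNet(nn.Module):
    def __init__(self, agent_dim=9, dyn_dim=6, stat_dim=3, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(dyn_dim, hidden, batch_first=True)
        self.attn = nn.Linear(hidden + agent_dim, 1)        # social attention scores
        self.fc = nn.Sequential(
            nn.Linear(agent_dim + hidden + stat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, s_agent, s_dyn, s_stat):
        # s_agent: (B, 9); s_dyn: (B, N, 6) sorted dynamic obstacles;
        # s_stat: (B, 3) — a single static obstacle kept for brevity.
        h, _ = self.lstm(s_dyn)                              # (B, N, hidden)
        q = s_agent.unsqueeze(1).expand(-1, h.size(1), -1)   # broadcast agent state
        scores = torch.softmax(self.attn(torch.cat([h, q], -1)), dim=1)
        crowd = (scores * h).sum(dim=1)                      # attention-weighted feature
        v = self.fc(torch.cat([s_agent, crowd, s_stat], -1))
        return torch.sigmoid(v)                              # normalized value output
```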
Further, in step 4), while the mobile agent interacts with the simulation environment, the current state information, action information, and reward information are stored in the experience pool as one experience; when the pool reaches its maximum capacity, a new experience replaces the stored old experience with the lowest reward, which raises the probability of sampling good experiences and improves the convergence speed of the network. During each episode of interaction, the current episode ends when the mobile agent hits an obstacle or exceeds the maximum running time of a single episode. The experiences are then used to update the network parameters through gradient backpropagation.
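A sketch of the experience pool rule described here (overwrite the lowest-reward stored experience when full), together with a one-step Adam value update as in step 5), might look as follows; all names, hyperparameters, and the value-target construction are assumptions.

```python
# Hedged sketch of the step-4) experience pool and a step-5) Adam value update.
import random
import torch
import torch.nn as nn

class ExperiencePool:
    def __init__(self, capacity=100_000):
        self.capacity = capacity
        self.data = []                 # each item: (state, action, reward, value_target)

    def push(self, state, action, reward, value_target):
        item = (state, action, reward, value_target)
        if len(self.data) < self.capacity:
            self.data.append(item)
            return
        # Pool full: overwrite the stored experience with the lowest reward,
        # biasing the pool toward higher-reward experiences.
        worst = min(range(len(self.data)), key=lambda i: self.data[i][2])
        if reward > self.data[worst][2]:
            self.data[worst] = item

    def sample(self, batch_size):
        return random.sample(self.data, min(batch_size, len(self.data)))

def train_step(value_net, optimizer, states, targets):
    """One Adam update regressing the predicted values toward the value targets."""
    # states: batched network input; targets: (B, 1) tensor built from the
    # sampled experiences, e.g. r + gamma^(dt*v_pref) * V(next state).
    loss = nn.functional.mse_loss(value_net(states), targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Typical optimizer construction (assumed hyperparameters):
# optimizer = torch.optim.Adam(value_net.parameters(), lr=1e-3)
```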
The beneficial effects of the invention are as follows: a new Markov decision process is designed to handle the complex obstacle situation in the complex space; a new network structure is designed to classify and process dynamic and static obstacles and to predict dynamic obstacles; a new reward function is designed to cope with different kinds of obstacles; a new experience pool update method is proposed to improve the training efficiency of the neural network; and imitation learning is used to pre-train the network and improve its convergence speed, so that the mobile agent achieves efficient dynamic path planning in the complex space.
Brief Description of the Drawings
Fig. 1 is a flowchart of the method implementation in an embodiment of the present invention;
Fig. 2 is a diagram of the network structure in an embodiment of the present invention;
Fig. 3 is a diagram of the simulation results in an embodiment of the present invention;
Fig. 4 is a graph of the total network-training reward in an embodiment of the present invention.
Detailed Description of the Embodiments
The present invention is further described in detail below with reference to the accompanying drawings and specific embodiments.
The purpose of the present invention is to provide a dynamic path planning method for a mobile agent in a stage environment. Building on a deep reinforcement learning method, the method designs a new Markov decision process and network structure, introduces a social attention mechanism to assign attention scores to dynamic obstacles, uses a long short-term memory neural network to handle the variable input dimension of the feedforward network, constructs a new reward function for the different avoidance requirements of dynamic and static obstacles, and proposes a new experience pool update method to improve the convergence speed of network training, so that the mobile agent can distinguish between dynamic and static obstacles and predict the motion trend of dynamic obstacles.
In this embodiment, the simulation environment is shown in Fig. 3. The planning map has a size of 10×10, the starting point of the path planning is (0, -10), the target point is (0, 7.5), the static obstacles are rectangles or squares placed at random positions, and the dynamic obstacles are circles with a radius of 0.5.
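As a usage illustration only, the embodiment's scenario could be configured on top of the hypothetical StageEnv sketch given in the summary above; the argument and field names remain assumptions.

```python
# Hedged configuration of the embodiment scenario (10x10 map, start (0, -10),
# goal (0, 7.5), 0.5-radius dynamic obstacles), reusing the StageEnv sketch.
import numpy as np

env = StageEnv(map_size=10.0, n_dynamic=5, n_static=3)
env.agent["pos"] = np.array([0.0, -10.0])    # start point of the path planning
goal = np.array([0.0, 7.5])                  # target point
for ob in env.dynamic:
    ob["r"] = 0.5                            # embodiment uses 0.5-radius dynamic obstacles
```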
A dynamic path planning algorithm for a mobile agent in a stage environment, with the following specific steps:
1) Build a simulation environment model of the mobile agent and the dynamic and static obstacles based on the gym library;
2) Design the Markov decision process: define the state space S, action space A, transition probability P, reward R, and discount factor γ;
3) Design the neural network structure;
4) Use the Optimal Reciprocal Collision Avoidance (ORCA) algorithm to initialize the network parameters through 3000 episodes of imitation-learning pre-training; then optimize the network parameters by training on the mobile agent's actual interaction with the simulation environment.
5) Train the neural network with the adaptive moment estimation method (Adam) to obtain the optimal value function:
V^*(u_t) = \sum \gamma^{\Delta t \cdot V_{pref}} \cdot P(u_t, a_t)
6) Set the optimal policy by maximizing the cumulative return:
\pi^*(u_t) = \arg\max_{a_t} \Big[ R(u_t, a_t) + \gamma^{\Delta t \cdot V_{pref}} \int P(u_t, u_{t+\Delta t} \mid a_t)\, V^*(u_{t+\Delta t})\, du_{t+\Delta t} \Big]
where u_t denotes the current joint state of the mobile agent and the obstacles, a_t denotes an action from the action space, γ denotes the decay factor, Δt denotes the time interval between two actions, V_pref denotes the preferred speed, V^* denotes the optimal value function, P denotes the state transition function, R denotes the reward function, and u_{t+Δt} denotes the joint state at the next time step.
7) Select the action a_t at the current time step according to the optimal policy until the mobile agent reaches the target.
The above are preferred embodiments of the present invention. Any changes made according to the technical solution of the present invention, whose resulting functional effects do not exceed the scope of the technical solution of the present invention, fall within the protection scope of the present invention.
Claims (6)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210465123.7A CN114815834B (en) | 2022-04-29 | 2022-04-29 | Dynamic path planning method for mobile intelligent body in stage environment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210465123.7A CN114815834B (en) | 2022-04-29 | 2022-04-29 | Dynamic path planning method for mobile intelligent body in stage environment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114815834A (en) | 2022-07-29 |
CN114815834B (en) | 2024-11-29 |
Family
ID=82509534
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210465123.7A Active CN114815834B (en) | 2022-04-29 | 2022-04-29 | Dynamic path planning method for mobile intelligent body in stage environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114815834B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116090688A (en) * | 2023-04-10 | 2023-05-09 | 中国人民解放军国防科技大学 | Moving Target Traversal Access Sequence Planning Method Based on Improved Pointer Network |
CN118394109A (en) * | 2024-06-26 | 2024-07-26 | 烟台中飞海装科技有限公司 | Simulated countermeasure training method based on multi-agent reinforcement learning |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110632931A (en) * | 2019-10-09 | 2019-12-31 | 哈尔滨工程大学 | Collision avoidance planning method for mobile robot based on deep reinforcement learning in dynamic environment |
CN112666939A (en) * | 2020-12-09 | 2021-04-16 | 深圳先进技术研究院 | Robot path planning algorithm based on deep reinforcement learning |
CN113341958A (en) * | 2021-05-21 | 2021-09-03 | 西北工业大学 | Multi-agent reinforcement learning movement planning method with mixed experience |
CN113342047A (en) * | 2021-06-23 | 2021-09-03 | 大连大学 | Unmanned aerial vehicle path planning method for improving artificial potential field method based on obstacle position prediction in unknown environment |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110632931A (en) * | 2019-10-09 | 2019-12-31 | 哈尔滨工程大学 | Collision avoidance planning method for mobile robot based on deep reinforcement learning in dynamic environment |
CN112666939A (en) * | 2020-12-09 | 2021-04-16 | 深圳先进技术研究院 | Robot path planning algorithm based on deep reinforcement learning |
CN113341958A (en) * | 2021-05-21 | 2021-09-03 | 西北工业大学 | Multi-agent reinforcement learning movement planning method with mixed experience |
CN113342047A (en) * | 2021-06-23 | 2021-09-03 | 大连大学 | Unmanned aerial vehicle path planning method for improving artificial potential field method based on obstacle position prediction in unknown environment |
Non-Patent Citations (1)
Title |
---|
CHEN Wu et al.: "A multi-agent cooperative information consensus algorithm", Acta Aeronautica et Astronautica Sinica, vol. 38, no. 12, 25 December 2017 (2017-12-25) *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116090688A (en) * | 2023-04-10 | 2023-05-09 | 中国人民解放军国防科技大学 | Moving Target Traversal Access Sequence Planning Method Based on Improved Pointer Network |
CN118394109A (en) * | 2024-06-26 | 2024-07-26 | 烟台中飞海装科技有限公司 | Simulated countermeasure training method based on multi-agent reinforcement learning |
Also Published As
Publication number | Publication date |
---|---|
CN114815834B (en) | 2024-11-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zhao et al. | The experience-memory Q-learning algorithm for robot path planning in unknown environment | |
Shah et al. | Long-distance path planning for unmanned surface vehicles in complex marine environment | |
CN114397896B (en) | A Dynamic Path Planning Method Based on Improved Particle Swarm Optimization Algorithm | |
Sundarraj et al. | Route planning for an autonomous robotic vehicle employing a weight-controlled particle swarm-optimized Dijkstra algorithm | |
Lv et al. | Blind travel prediction based on obstacle avoidance in indoor scene | |
CN111611749B (en) | Simulation method and system for automatic guidance of indoor crowd evacuation based on RNN | |
CN114815834A (en) | A dynamic path planning method for mobile agents in a stage environment | |
CN112799386A (en) | Robot Path Planning Method Based on Artificial Potential Field and Reinforcement Learning | |
Raheem et al. | Development of a* algorithm for robot path planning based on modified probabilistic roadmap and artificial potential field | |
Chang et al. | Interpretable fuzzy logic control for multirobot coordination in a cluttered environment | |
Lamouik et al. | Deep neural network dynamic traffic routing system for vehicles | |
CN117289691A (en) | Training method for path planning agent for reinforcement learning in navigation scene | |
CN114089751A (en) | A Path Planning Method for Mobile Robots Based on Improved DDPG Algorithm | |
CN118348975A (en) | Path planning method, amphibious unmanned platform, storage medium and program product | |
CN113391633A (en) | Urban environment-oriented mobile robot fusion path planning method | |
Ou et al. | Hybrid path planning based on adaptive visibility graph initialization and edge computing for mobile robots | |
Lei et al. | Digital twin‐based multi‐objective autonomous vehicle navigation approach as applied in infrastructure construction | |
Xue et al. | Multi-agent path planning based on MPC and DDPG | |
Jiang et al. | Fuzzy neural network based dynamic path planning | |
CN115202357A (en) | An autonomous mapping method based on spiking neural network | |
Wang et al. | A mapless navigation method based on deep reinforcement learning and path planning | |
CN119289981A (en) | A mobile robot path planning method based on SAC algorithm | |
Kodagoda et al. | Socially aware path planning for mobile robots | |
Hliwa et al. | Optimal path planning of mobile robot using hybrid tabu search-firefly algorithm | |
Tran et al. | Mobile robot planner with low-cost cameras using deep reinforcement learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |