CN111230875B - Double-arm robot humanoid operation planning method based on deep learning - Google Patents
Double-arm robot humanoid operation planning method based on deep learning Download PDFInfo
- Publication number
- CN111230875B CN111230875B CN202010081726.8A CN202010081726A CN111230875B CN 111230875 B CN111230875 B CN 111230875B CN 202010081726 A CN202010081726 A CN 202010081726A CN 111230875 B CN111230875 B CN 111230875B
- Authority
- CN
- China
- Prior art keywords
- mechanical arm
- evaluation function
- strategy network
- motion
- tree
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
Images
Classifications
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1656—Programme controls characterised by programming, planning systems for manipulators
- B25J9/1664—Programme controls characterised by programming, planning systems for manipulators characterised by motion, path, trajectory planning
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1628—Programme controls characterised by the control loop
- B25J9/163—Programme controls characterised by the control loop learning, adaptive, model based, rule based expert control
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1679—Programme controls characterised by the tasks executed
- B25J9/1682—Dual arm manipulator; Coordination of several manipulators
Landscapes
- Engineering & Computer Science (AREA)
- Robotics (AREA)
- Mechanical Engineering (AREA)
- Manipulator (AREA)
Abstract
The invention provides a double-arm robot humanoid operation planning method based on deep learning, which comprises the following steps: step S1, training a strategy network; the strategy network can calculate the probability of the mechanical arm in each state at the next moment according to the current state and the target state of the mechanical arm, and give out various motion schemes; s2, determining an evaluation function; the evaluation function is used for evaluating the motion scheme provided by the strategy network; and step S3, performing tree search by combining the strategy network and the evaluation function to obtain a result of the path planning of the mechanical arm. According to the invention, the human motion sample is used as a guide, and the human motion sample is selected or changed by combining with the specific environment of the current robot, so that the double-arm robot can perform humanoid operation planning.
Description
Technical Field
The invention relates to a robot mechanical arm operation planning method, in particular to a double-arm robot humanoid operation planning method based on deep learning, and belongs to the technical field of robots.
Background
In dangerous complex working environments such as anti-riot, battlefield combat, harmful gas and nuclear radiation of armed police, the main consideration of path planning is how a robot can move from an initial state to a target state without collision, and the technical level of path planning is an important index for measuring the intelligent degree of the robot. Because of the strong learning ability of human beings, a large number of exercise action samples are accumulated through continuous exercise learning and adaptation training of different environments. When the motion task similar to the previous motion task is executed, according to the motion sample, the required motion can be quickly and effectively completed by slightly changing according to the actual environment.
The existing path planning methods mainly include path planning algorithms based on graphs, such as an a-x algorithm, a visual method, an artificial potential field algorithm, a probability road graph algorithm (PRM) based on random sampling, a rapid spread random tree algorithm (RRT), and the like.
The dimension of the mechanical arm is higher than that of the mobile robot, the traditional path planning algorithm has the problems of large calculated amount, difficult description of obstacles, incomplete algorithm, non-optimal algorithm and the like, the planning algorithm based on sampling satisfies the problem of complete probability, but the planning result is possibly different each time due to the fact that certain randomness is introduced in sampling, and the problem that the planning result cannot be prejudged exists.
Based on this problem, there is a need in the art for a method for performing humanoid work planning by using a human motion sample as a guide, and on the other hand, selecting or changing the human motion sample in combination with a specific environment in which the current robot is located, thereby implementing a double-arm robot.
Disclosure of Invention
The invention aims to provide a motion planning method which can take human motion actions as guidance of robot motion planning and can consider the specific environment of the robot at present.
The technical scheme of the invention is as follows.
The first aspect of the invention provides a double-arm robot humanoid operation planning method based on deep learning, which comprises the following steps:
step S1, training a strategy network; the strategy network can calculate the probability of each state of the mechanical arm at the next moment according to the current state and the target state of the mechanical arm, and give out various motion schemes
S2, determining an evaluation function; the evaluation function is used for evaluating the motion scheme provided by the strategy network;
and step S3, performing tree search by combining the strategy network and the evaluation function to obtain a result of the path planning of the mechanical arm.
Preferably, the policy network employs human locomotor action samples.
Preferably, the human locomotion activity sample is obtained by a motion capture device.
Preferably, the evaluation function is used to evaluate the cost of moving the robot arm from the current state to the states at the next moment.
Preferably, the evaluation function is further used for evaluating whether the mechanical arm collides with itself and the surrounding environment during the movement.
Preferably, the tree search treats each motion state of the mechanical arm as one node of a tree; and performing multi-step expansion on the motion of the mechanical arm through the strategy network to obtain a plurality of nodes of the tree, evaluating the expanded leaf nodes according to the evaluation function, returning and updating the values of the root nodes from the leaf nodes, and finally selecting the optimal node according to the values of the nodes as a mechanical arm path planning result.
Preferably, the step S3 further includes:
step S31, selecting; according to possible human motion actions obtained by the strategy network, selecting an action q with larger probability;
step S32, an expanding step; after the action q, continuing to utilize the strategy network to select the child node with larger probability to expand the tree to obtain a plurality of subtrees, and stopping after expanding a certain step number;
step S33, an evaluation step; evaluating each node by using an evaluation function from the leaf node of the tree of which expansion is completed;
step S34, returning the step; returning the result of the evaluation function from the leaf node to the root node of the tree, wherein the evaluation value of each leaf node is updated along with the evaluation value of the returned child node; after all the backpasses are completed, the final evaluation value of the node q selected initially is obtained, so that the node with the highest evaluation value is selected as the next movement of the mechanical arm.
A second aspect of the present invention provides a two-arm robot humanoid job planning system, comprising:
a policy network training device; the strategy network can calculate the probability of each state of the mechanical arm at the next moment according to the current state and the target state of the mechanical arm, and give out various motion schemes
An evaluation function determination means; the evaluation function is used for evaluating the movement scheme provided by the strategy network
Tree searching means; the tree search device can perform tree search by combining the strategy network and the evaluation function to obtain a result of the path planning of the mechanical arm.
Preferably, the policy network employs human motion action samples; the evaluation function is used for evaluating the motion scheme provided by the strategy network; the tree search treats each motion state of the mechanical arm as one node of a tree; and performing multi-step expansion on the motion of the mechanical arm through the strategy network to obtain a plurality of nodes of the tree, evaluating the expanded leaf nodes according to the evaluation function, returning and updating the values of the root nodes from the leaf nodes, and finally selecting the optimal node according to the values of the nodes as a mechanical arm path planning result.
A third aspect of the present invention provides a two-arm robot comprising two substantially symmetrical mechanical arms, characterized in that the two-arm robot performs path planning for the operation of the two substantially symmetrical mechanical arms according to the deep learning-based two-arm robot humanoid operation planning method according to any one of the first aspects of the present invention.
By the technical scheme, the invention can obtain the following technical effects.
(1) The invention combines a strategy network based on deep learning, an evaluation function based on a motion environment and tree search to form a novel robot motion planning method.
(2) When the strategy network is trained, human motion is taken as a sample, the characteristics of human motion are used for reference, and the humanoid characteristics of operation planning of the double-arm robot are realized.
Drawings
Fig. 1 is a schematic diagram of a two-arm robot humanoid work planning method based on deep learning of the present invention.
Detailed Description
Example 1
In fig. 1, embodiment 1 of the present invention provides a two-arm robot humanoid job planning method based on deep learning, which includes the following steps:
step S1, training a strategy network; the strategy network can calculate the probability of each state of the mechanical arm at the next moment according to the current state and the target state of the mechanical arm, and give out various motion schemes
S2, determining an evaluation function; the evaluation function is used for evaluating the motion scheme provided by the strategy network;
and step S3, performing tree search by combining the strategy network and the evaluation function to obtain a result of the path planning of the mechanical arm.
In a preferred embodiment, the policy network employs human locomotor action samples.
In a preferred embodiment, the human locomotion activity sample is obtained by a motion capture device.
In a preferred embodiment, the evaluation function is used to evaluate the cost of moving the robot arm from the current state to the states at the next moment.
In a preferred embodiment, the evaluation function is also used to evaluate whether the robot arm collides with itself and the surrounding environment during the movement.
In a preferred embodiment, the tree search treats each motion state of the robotic arm as a node of a tree; and performing multi-step expansion on the motion of the mechanical arm through the strategy network to obtain a plurality of nodes of the tree, evaluating the expanded leaf nodes according to the evaluation function, returning and updating the values of the root nodes from the leaf nodes, and finally selecting the optimal node according to the values of the nodes as a mechanical arm path planning result.
In a preferred embodiment, the step S3 further includes:
step S31, selecting; according to possible human motion actions obtained by the strategy network, selecting an action q with larger probability;
step S32, an expanding step; after the action q, continuing to utilize the strategy network to select the child node with larger probability to expand the tree to obtain a plurality of subtrees, and stopping after expanding a certain step number;
step S33, an evaluation step; evaluating each node by using an evaluation function from the leaf node of the tree of which expansion is completed;
step S34, returning the step; returning the result of the evaluation function from the leaf node to the root node of the tree, wherein the evaluation value of each leaf node is updated along with the evaluation value of the returned child node; after all the backpasses are completed, the final evaluation value of the node q selected initially is obtained, so that the node with the highest evaluation value is selected as the next movement of the mechanical arm.
Example 2
The embodiment 2 of the invention provides a double-arm robot humanoid operation planning system, which comprises:
a policy network training device; the strategy network can calculate the probability of each state of the mechanical arm at the next moment according to the current state and the target state of the mechanical arm, and give out various motion schemes
An evaluation function determination means; the evaluation function is used for evaluating the movement scheme provided by the strategy network
Tree searching means; the tree search device can perform tree search by combining the strategy network and the evaluation function to obtain a result of the path planning of the mechanical arm.
In a preferred embodiment, the policy network employs human locomotor action samples; the evaluation function is used for evaluating the motion scheme provided by the strategy network; the tree search treats each motion state of the mechanical arm as one node of a tree; and performing multi-step expansion on the motion of the mechanical arm through the strategy network to obtain a plurality of nodes of the tree, evaluating the expanded leaf nodes according to the evaluation function, returning and updating the values of the root nodes from the leaf nodes, and finally selecting the optimal node according to the values of the nodes as a mechanical arm path planning result.
Example 3
Embodiment 3 of the present invention provides a two-arm robot, which comprises two substantially symmetrical mechanical arms, wherein the two-arm robot performs path planning on the operations of the two substantially symmetrical mechanical arms according to the two-arm robot humanoid operation planning method based on deep learning according to any one of embodiment 1 of the present invention.
Claims (6)
1. A double-arm robot humanoid operation planning method based on deep learning comprises the following steps:
step S1, training a strategy network; the strategy network can calculate the probability of the mechanical arm in each state at the next moment according to the current state and the target state of the mechanical arm, and give out various motion schemes;
s2, determining an evaluation function; the evaluation function is used for evaluating the motion scheme provided by the strategy network;
step S3, performing tree search by combining the strategy network and the evaluation function to obtain a result of mechanical arm path planning;
the method is characterized in that the strategy network adopts a human motion action sample;
the evaluation function is used for evaluating whether the mechanical arm collides with the mechanical arm and the surrounding environment in the movement process;
the tree search treats each motion state of the mechanical arm as one node of a tree; and performing multi-step expansion on the motion of the mechanical arm through the strategy network to obtain a plurality of nodes of the tree, evaluating the expanded leaf nodes according to the evaluation function, returning and updating the values of the root nodes from the leaf nodes, and finally selecting the optimal node according to the values of the nodes as a mechanical arm path planning result.
2. The deep learning-based two-arm robot humanoid job planning method of claim 1, wherein the human motion samples are acquired through a motion capture device.
3. The deep learning-based two-arm robot humanoid job planning method of claim 1, wherein the evaluation function is used for evaluating the cost of each state of the mechanical arm from the current state to the next moment.
4. The method for planning a humanoid job of a double-arm robot based on deep learning according to claim 1, wherein the step S3 further comprises:
step S31, selecting; according to possible human motion actions obtained by the strategy network, selecting an action q with larger probability;
step S32, an expanding step; after the action q, continuing to utilize the strategy network to select the child node with larger probability to expand the tree to obtain a plurality of subtrees, and stopping after expanding a certain step number;
step S33, an evaluation step; evaluating each node by using an evaluation function from the leaf node of the tree of which expansion is completed;
step S34, returning the step; returning the result of the evaluation function from the leaf node to the root node of the tree, wherein the evaluation value of each leaf node is updated along with the evaluation value of the returned child node; after all the backpasses are completed, the final evaluation value of the node q selected initially is obtained, so that the node with the highest evaluation value is selected as the next movement of the mechanical arm.
5. A two-arm robotic humanoid job planning system comprising:
a policy network training device; the strategy network can calculate the probability of the mechanical arm in each state at the next moment according to the current state and the target state of the mechanical arm, and give out various motion schemes;
an evaluation function determination means; the evaluation function is used for evaluating the motion scheme provided by the strategy network;
tree searching means; the tree searching device can be used for executing tree searching by combining the strategy network and the evaluation function to obtain a result of mechanical arm path planning;
the strategy network adopts a human motion action sample; the tree search treats each motion state of the mechanical arm as one node of a tree; performing multi-step expansion on the motion of the mechanical arm through the strategy network to obtain a plurality of nodes of a tree, evaluating the expanded leaf nodes according to the evaluation function, returning and updating the values of all the root nodes from the leaf nodes, and finally selecting an optimal node according to the values of all the nodes as a mechanical arm path planning result;
the evaluation function is also used for evaluating whether the mechanical arm collides with the mechanical arm and the surrounding environment during the movement process.
6. A two-arm robot comprising two substantially symmetrical robotic arms, wherein the two-arm robot routes the work of the two substantially symmetrical robotic arms according to the deep learning based two-arm robot humanoid work planning method of any one of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010081726.8A CN111230875B (en) | 2020-02-06 | 2020-02-06 | Double-arm robot humanoid operation planning method based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010081726.8A CN111230875B (en) | 2020-02-06 | 2020-02-06 | Double-arm robot humanoid operation planning method based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111230875A CN111230875A (en) | 2020-06-05 |
CN111230875B true CN111230875B (en) | 2023-05-12 |
Family
ID=70861916
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010081726.8A Active CN111230875B (en) | 2020-02-06 | 2020-02-06 | Double-arm robot humanoid operation planning method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111230875B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113084807B (en) * | 2021-03-31 | 2022-04-19 | 中国科学技术大学 | Method for searching toxic gas leakage source of multi-robot system |
CN117021101B (en) * | 2023-08-24 | 2024-03-19 | 中国矿业大学 | Multi-arm path planning method for belt conveyor dismounting robot |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108284436A (en) * | 2018-03-17 | 2018-07-17 | 北京工业大学 | Remote mechanical dual arm system and method with learning by imitation mechanism |
CN109405843A (en) * | 2018-09-21 | 2019-03-01 | 北京三快在线科技有限公司 | A kind of paths planning method and device and mobile device |
WO2019178147A1 (en) * | 2018-03-12 | 2019-09-19 | Virginia Polytechnic Institute And State University | Intelligent distribution of data for robotic and autonomous systems |
CN110297490A (en) * | 2019-06-17 | 2019-10-01 | 西北工业大学 | Heterogeneous module robot via Self-reconfiguration planing method based on nitrification enhancement |
CN110509279A (en) * | 2019-09-06 | 2019-11-29 | 北京工业大学 | A kind of trajectory path planning method and system of apery mechanical arm |
-
2020
- 2020-02-06 CN CN202010081726.8A patent/CN111230875B/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019178147A1 (en) * | 2018-03-12 | 2019-09-19 | Virginia Polytechnic Institute And State University | Intelligent distribution of data for robotic and autonomous systems |
CN108284436A (en) * | 2018-03-17 | 2018-07-17 | 北京工业大学 | Remote mechanical dual arm system and method with learning by imitation mechanism |
CN109405843A (en) * | 2018-09-21 | 2019-03-01 | 北京三快在线科技有限公司 | A kind of paths planning method and device and mobile device |
CN110297490A (en) * | 2019-06-17 | 2019-10-01 | 西北工业大学 | Heterogeneous module robot via Self-reconfiguration planing method based on nitrification enhancement |
CN110509279A (en) * | 2019-09-06 | 2019-11-29 | 北京工业大学 | A kind of trajectory path planning method and system of apery mechanical arm |
Non-Patent Citations (1)
Title |
---|
基于轨迹匹配的模仿学习在类人机器人运动行为中的研究;门玉森;《中国优秀硕士学位论文全文数据库(信息科技辑)》;20170315(第3期);第17-40页 * |
Also Published As
Publication number | Publication date |
---|---|
CN111230875A (en) | 2020-06-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111230875B (en) | Double-arm robot humanoid operation planning method based on deep learning | |
Wu et al. | Online planning for ad hoc autonomous agent teams | |
CN109960880A (en) | A kind of industrial robot obstacle-avoiding route planning method based on machine learning | |
CN102402712A (en) | Robot reinforced learning initialization method based on neural network | |
WO2011115960A1 (en) | Temporal tracking robot control system | |
Guigue et al. | Pareto optimality and multiobjective trajectory planning for a 7-DOF redundant manipulator | |
CN110275528B (en) | Improved path optimization method for RRT algorithm | |
CN115958590A (en) | RRT-based mechanical arm deep frame obstacle avoidance motion planning method and device | |
CN110530373B (en) | Robot path planning method, controller and system | |
Jankowski et al. | Vp-sto: Via-point-based stochastic trajectory optimization for reactive robot behavior | |
CN110705803A (en) | Route planning method based on triangle inner center guide RRT algorithm | |
Parhi et al. | Smart navigation of humanoid robots using DAYKUN-BIP virtual target displacement and petri-net strategy | |
Ponsini et al. | Analysis of soccer robot behaviors using time petri nets | |
Yun et al. | Enhanced D∗ Lite Algorithm for mobile robot navigation | |
Zhang et al. | A closed-loop perception, decision-making and reasoning mechanism for human-like navigation | |
Zare et al. | Cyrus 2D Simulation 2019 | |
CN112131754A (en) | Extended POMDP planning method and system based on robot accompanying behavior model | |
Panda et al. | Autonomous mobile robot path planning using hybridization of particle swarm optimization and Tabu search | |
Liu et al. | Evolving hidden Markov model based human intention learning and inference | |
CN116360454A (en) | Robot path collision avoidance planning method based on deep reinforcement learning in pedestrian environment | |
Tang et al. | Reinforcement learning for robots path planning with rule-based shallow-trial | |
CN117400269B (en) | Mechanical arm path planning method based on bidirectional sampling and virtual potential field guiding | |
CN113204238A (en) | Path planning method and device for mobile robot and storage medium | |
Chung et al. | Using ALC-PSO algorithm with particle growing method path planning in dynamic environments | |
Tabata et al. | Casting manipulation of unknown string by robot arm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |