CN111230875B - Double-arm robot humanoid operation planning method based on deep learning - Google Patents

Double-arm robot humanoid operation planning method based on deep learning Download PDF

Info

Publication number
CN111230875B
CN111230875B CN202010081726.8A CN202010081726A CN111230875B CN 111230875 B CN111230875 B CN 111230875B CN 202010081726 A CN202010081726 A CN 202010081726A CN 111230875 B CN111230875 B CN 111230875B
Authority
CN
China
Prior art keywords
mechanical arm
evaluation function
strategy network
motion
tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010081726.8A
Other languages
Chinese (zh)
Other versions
CN111230875A (en
Inventor
请求不公布姓名
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Fanchuan Intelligent Robot Technology Co ltd
Original Assignee
Beijing Fanchuan Intelligent Robot Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Fanchuan Intelligent Robot Technology Co ltd filed Critical Beijing Fanchuan Intelligent Robot Technology Co ltd
Priority to CN202010081726.8A priority Critical patent/CN111230875B/en
Publication of CN111230875A publication Critical patent/CN111230875A/en
Application granted granted Critical
Publication of CN111230875B publication Critical patent/CN111230875B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1656Programme controls characterised by programming, planning systems for manipulators
    • B25J9/1664Programme controls characterised by programming, planning systems for manipulators characterised by motion, path, trajectory planning
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1628Programme controls characterised by the control loop
    • B25J9/163Programme controls characterised by the control loop learning, adaptive, model based, rule based expert control
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1679Programme controls characterised by the tasks executed
    • B25J9/1682Dual arm manipulator; Coordination of several manipulators

Landscapes

  • Engineering & Computer Science (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Manipulator (AREA)

Abstract

The invention provides a double-arm robot humanoid operation planning method based on deep learning, which comprises the following steps: step S1, training a strategy network; the strategy network can calculate the probability of the mechanical arm in each state at the next moment according to the current state and the target state of the mechanical arm, and give out various motion schemes; s2, determining an evaluation function; the evaluation function is used for evaluating the motion scheme provided by the strategy network; and step S3, performing tree search by combining the strategy network and the evaluation function to obtain a result of the path planning of the mechanical arm. According to the invention, the human motion sample is used as a guide, and the human motion sample is selected or changed by combining with the specific environment of the current robot, so that the double-arm robot can perform humanoid operation planning.

Description

Double-arm robot humanoid operation planning method based on deep learning
Technical Field
The invention relates to a robot mechanical arm operation planning method, in particular to a double-arm robot humanoid operation planning method based on deep learning, and belongs to the technical field of robots.
Background
In dangerous complex working environments such as anti-riot, battlefield combat, harmful gas and nuclear radiation of armed police, the main consideration of path planning is how a robot can move from an initial state to a target state without collision, and the technical level of path planning is an important index for measuring the intelligent degree of the robot. Because of the strong learning ability of human beings, a large number of exercise action samples are accumulated through continuous exercise learning and adaptation training of different environments. When the motion task similar to the previous motion task is executed, according to the motion sample, the required motion can be quickly and effectively completed by slightly changing according to the actual environment.
The existing path planning methods mainly include path planning algorithms based on graphs, such as an a-x algorithm, a visual method, an artificial potential field algorithm, a probability road graph algorithm (PRM) based on random sampling, a rapid spread random tree algorithm (RRT), and the like.
The dimension of the mechanical arm is higher than that of the mobile robot, the traditional path planning algorithm has the problems of large calculated amount, difficult description of obstacles, incomplete algorithm, non-optimal algorithm and the like, the planning algorithm based on sampling satisfies the problem of complete probability, but the planning result is possibly different each time due to the fact that certain randomness is introduced in sampling, and the problem that the planning result cannot be prejudged exists.
Based on this problem, there is a need in the art for a method for performing humanoid work planning by using a human motion sample as a guide, and on the other hand, selecting or changing the human motion sample in combination with a specific environment in which the current robot is located, thereby implementing a double-arm robot.
Disclosure of Invention
The invention aims to provide a motion planning method which can take human motion actions as guidance of robot motion planning and can consider the specific environment of the robot at present.
The technical scheme of the invention is as follows.
The first aspect of the invention provides a double-arm robot humanoid operation planning method based on deep learning, which comprises the following steps:
step S1, training a strategy network; the strategy network can calculate the probability of each state of the mechanical arm at the next moment according to the current state and the target state of the mechanical arm, and give out various motion schemes
S2, determining an evaluation function; the evaluation function is used for evaluating the motion scheme provided by the strategy network;
and step S3, performing tree search by combining the strategy network and the evaluation function to obtain a result of the path planning of the mechanical arm.
Preferably, the policy network employs human locomotor action samples.
Preferably, the human locomotion activity sample is obtained by a motion capture device.
Preferably, the evaluation function is used to evaluate the cost of moving the robot arm from the current state to the states at the next moment.
Preferably, the evaluation function is further used for evaluating whether the mechanical arm collides with itself and the surrounding environment during the movement.
Preferably, the tree search treats each motion state of the mechanical arm as one node of a tree; and performing multi-step expansion on the motion of the mechanical arm through the strategy network to obtain a plurality of nodes of the tree, evaluating the expanded leaf nodes according to the evaluation function, returning and updating the values of the root nodes from the leaf nodes, and finally selecting the optimal node according to the values of the nodes as a mechanical arm path planning result.
Preferably, the step S3 further includes:
step S31, selecting; according to possible human motion actions obtained by the strategy network, selecting an action q with larger probability;
step S32, an expanding step; after the action q, continuing to utilize the strategy network to select the child node with larger probability to expand the tree to obtain a plurality of subtrees, and stopping after expanding a certain step number;
step S33, an evaluation step; evaluating each node by using an evaluation function from the leaf node of the tree of which expansion is completed;
step S34, returning the step; returning the result of the evaluation function from the leaf node to the root node of the tree, wherein the evaluation value of each leaf node is updated along with the evaluation value of the returned child node; after all the backpasses are completed, the final evaluation value of the node q selected initially is obtained, so that the node with the highest evaluation value is selected as the next movement of the mechanical arm.
A second aspect of the present invention provides a two-arm robot humanoid job planning system, comprising:
a policy network training device; the strategy network can calculate the probability of each state of the mechanical arm at the next moment according to the current state and the target state of the mechanical arm, and give out various motion schemes
An evaluation function determination means; the evaluation function is used for evaluating the movement scheme provided by the strategy network
Tree searching means; the tree search device can perform tree search by combining the strategy network and the evaluation function to obtain a result of the path planning of the mechanical arm.
Preferably, the policy network employs human motion action samples; the evaluation function is used for evaluating the motion scheme provided by the strategy network; the tree search treats each motion state of the mechanical arm as one node of a tree; and performing multi-step expansion on the motion of the mechanical arm through the strategy network to obtain a plurality of nodes of the tree, evaluating the expanded leaf nodes according to the evaluation function, returning and updating the values of the root nodes from the leaf nodes, and finally selecting the optimal node according to the values of the nodes as a mechanical arm path planning result.
A third aspect of the present invention provides a two-arm robot comprising two substantially symmetrical mechanical arms, characterized in that the two-arm robot performs path planning for the operation of the two substantially symmetrical mechanical arms according to the deep learning-based two-arm robot humanoid operation planning method according to any one of the first aspects of the present invention.
By the technical scheme, the invention can obtain the following technical effects.
(1) The invention combines a strategy network based on deep learning, an evaluation function based on a motion environment and tree search to form a novel robot motion planning method.
(2) When the strategy network is trained, human motion is taken as a sample, the characteristics of human motion are used for reference, and the humanoid characteristics of operation planning of the double-arm robot are realized.
Drawings
Fig. 1 is a schematic diagram of a two-arm robot humanoid work planning method based on deep learning of the present invention.
Detailed Description
Example 1
In fig. 1, embodiment 1 of the present invention provides a two-arm robot humanoid job planning method based on deep learning, which includes the following steps:
step S1, training a strategy network; the strategy network can calculate the probability of each state of the mechanical arm at the next moment according to the current state and the target state of the mechanical arm, and give out various motion schemes
S2, determining an evaluation function; the evaluation function is used for evaluating the motion scheme provided by the strategy network;
and step S3, performing tree search by combining the strategy network and the evaluation function to obtain a result of the path planning of the mechanical arm.
In a preferred embodiment, the policy network employs human locomotor action samples.
In a preferred embodiment, the human locomotion activity sample is obtained by a motion capture device.
In a preferred embodiment, the evaluation function is used to evaluate the cost of moving the robot arm from the current state to the states at the next moment.
In a preferred embodiment, the evaluation function is also used to evaluate whether the robot arm collides with itself and the surrounding environment during the movement.
In a preferred embodiment, the tree search treats each motion state of the robotic arm as a node of a tree; and performing multi-step expansion on the motion of the mechanical arm through the strategy network to obtain a plurality of nodes of the tree, evaluating the expanded leaf nodes according to the evaluation function, returning and updating the values of the root nodes from the leaf nodes, and finally selecting the optimal node according to the values of the nodes as a mechanical arm path planning result.
In a preferred embodiment, the step S3 further includes:
step S31, selecting; according to possible human motion actions obtained by the strategy network, selecting an action q with larger probability;
step S32, an expanding step; after the action q, continuing to utilize the strategy network to select the child node with larger probability to expand the tree to obtain a plurality of subtrees, and stopping after expanding a certain step number;
step S33, an evaluation step; evaluating each node by using an evaluation function from the leaf node of the tree of which expansion is completed;
step S34, returning the step; returning the result of the evaluation function from the leaf node to the root node of the tree, wherein the evaluation value of each leaf node is updated along with the evaluation value of the returned child node; after all the backpasses are completed, the final evaluation value of the node q selected initially is obtained, so that the node with the highest evaluation value is selected as the next movement of the mechanical arm.
Example 2
The embodiment 2 of the invention provides a double-arm robot humanoid operation planning system, which comprises:
a policy network training device; the strategy network can calculate the probability of each state of the mechanical arm at the next moment according to the current state and the target state of the mechanical arm, and give out various motion schemes
An evaluation function determination means; the evaluation function is used for evaluating the movement scheme provided by the strategy network
Tree searching means; the tree search device can perform tree search by combining the strategy network and the evaluation function to obtain a result of the path planning of the mechanical arm.
In a preferred embodiment, the policy network employs human locomotor action samples; the evaluation function is used for evaluating the motion scheme provided by the strategy network; the tree search treats each motion state of the mechanical arm as one node of a tree; and performing multi-step expansion on the motion of the mechanical arm through the strategy network to obtain a plurality of nodes of the tree, evaluating the expanded leaf nodes according to the evaluation function, returning and updating the values of the root nodes from the leaf nodes, and finally selecting the optimal node according to the values of the nodes as a mechanical arm path planning result.
Example 3
Embodiment 3 of the present invention provides a two-arm robot, which comprises two substantially symmetrical mechanical arms, wherein the two-arm robot performs path planning on the operations of the two substantially symmetrical mechanical arms according to the two-arm robot humanoid operation planning method based on deep learning according to any one of embodiment 1 of the present invention.

Claims (6)

1. A double-arm robot humanoid operation planning method based on deep learning comprises the following steps:
step S1, training a strategy network; the strategy network can calculate the probability of the mechanical arm in each state at the next moment according to the current state and the target state of the mechanical arm, and give out various motion schemes;
s2, determining an evaluation function; the evaluation function is used for evaluating the motion scheme provided by the strategy network;
step S3, performing tree search by combining the strategy network and the evaluation function to obtain a result of mechanical arm path planning;
the method is characterized in that the strategy network adopts a human motion action sample;
the evaluation function is used for evaluating whether the mechanical arm collides with the mechanical arm and the surrounding environment in the movement process;
the tree search treats each motion state of the mechanical arm as one node of a tree; and performing multi-step expansion on the motion of the mechanical arm through the strategy network to obtain a plurality of nodes of the tree, evaluating the expanded leaf nodes according to the evaluation function, returning and updating the values of the root nodes from the leaf nodes, and finally selecting the optimal node according to the values of the nodes as a mechanical arm path planning result.
2. The deep learning-based two-arm robot humanoid job planning method of claim 1, wherein the human motion samples are acquired through a motion capture device.
3. The deep learning-based two-arm robot humanoid job planning method of claim 1, wherein the evaluation function is used for evaluating the cost of each state of the mechanical arm from the current state to the next moment.
4. The method for planning a humanoid job of a double-arm robot based on deep learning according to claim 1, wherein the step S3 further comprises:
step S31, selecting; according to possible human motion actions obtained by the strategy network, selecting an action q with larger probability;
step S32, an expanding step; after the action q, continuing to utilize the strategy network to select the child node with larger probability to expand the tree to obtain a plurality of subtrees, and stopping after expanding a certain step number;
step S33, an evaluation step; evaluating each node by using an evaluation function from the leaf node of the tree of which expansion is completed;
step S34, returning the step; returning the result of the evaluation function from the leaf node to the root node of the tree, wherein the evaluation value of each leaf node is updated along with the evaluation value of the returned child node; after all the backpasses are completed, the final evaluation value of the node q selected initially is obtained, so that the node with the highest evaluation value is selected as the next movement of the mechanical arm.
5. A two-arm robotic humanoid job planning system comprising:
a policy network training device; the strategy network can calculate the probability of the mechanical arm in each state at the next moment according to the current state and the target state of the mechanical arm, and give out various motion schemes;
an evaluation function determination means; the evaluation function is used for evaluating the motion scheme provided by the strategy network;
tree searching means; the tree searching device can be used for executing tree searching by combining the strategy network and the evaluation function to obtain a result of mechanical arm path planning;
the strategy network adopts a human motion action sample; the tree search treats each motion state of the mechanical arm as one node of a tree; performing multi-step expansion on the motion of the mechanical arm through the strategy network to obtain a plurality of nodes of a tree, evaluating the expanded leaf nodes according to the evaluation function, returning and updating the values of all the root nodes from the leaf nodes, and finally selecting an optimal node according to the values of all the nodes as a mechanical arm path planning result;
the evaluation function is also used for evaluating whether the mechanical arm collides with the mechanical arm and the surrounding environment during the movement process.
6. A two-arm robot comprising two substantially symmetrical robotic arms, wherein the two-arm robot routes the work of the two substantially symmetrical robotic arms according to the deep learning based two-arm robot humanoid work planning method of any one of claims 1-4.
CN202010081726.8A 2020-02-06 2020-02-06 Double-arm robot humanoid operation planning method based on deep learning Active CN111230875B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010081726.8A CN111230875B (en) 2020-02-06 2020-02-06 Double-arm robot humanoid operation planning method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010081726.8A CN111230875B (en) 2020-02-06 2020-02-06 Double-arm robot humanoid operation planning method based on deep learning

Publications (2)

Publication Number Publication Date
CN111230875A CN111230875A (en) 2020-06-05
CN111230875B true CN111230875B (en) 2023-05-12

Family

ID=70861916

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010081726.8A Active CN111230875B (en) 2020-02-06 2020-02-06 Double-arm robot humanoid operation planning method based on deep learning

Country Status (1)

Country Link
CN (1) CN111230875B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113084807B (en) * 2021-03-31 2022-04-19 中国科学技术大学 Method for searching toxic gas leakage source of multi-robot system
CN117021101B (en) * 2023-08-24 2024-03-19 中国矿业大学 Multi-arm path planning method for belt conveyor dismounting robot

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108284436A (en) * 2018-03-17 2018-07-17 北京工业大学 Remote mechanical dual arm system and method with learning by imitation mechanism
CN109405843A (en) * 2018-09-21 2019-03-01 北京三快在线科技有限公司 A kind of paths planning method and device and mobile device
WO2019178147A1 (en) * 2018-03-12 2019-09-19 Virginia Polytechnic Institute And State University Intelligent distribution of data for robotic and autonomous systems
CN110297490A (en) * 2019-06-17 2019-10-01 西北工业大学 Heterogeneous module robot via Self-reconfiguration planing method based on nitrification enhancement
CN110509279A (en) * 2019-09-06 2019-11-29 北京工业大学 A kind of trajectory path planning method and system of apery mechanical arm

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019178147A1 (en) * 2018-03-12 2019-09-19 Virginia Polytechnic Institute And State University Intelligent distribution of data for robotic and autonomous systems
CN108284436A (en) * 2018-03-17 2018-07-17 北京工业大学 Remote mechanical dual arm system and method with learning by imitation mechanism
CN109405843A (en) * 2018-09-21 2019-03-01 北京三快在线科技有限公司 A kind of paths planning method and device and mobile device
CN110297490A (en) * 2019-06-17 2019-10-01 西北工业大学 Heterogeneous module robot via Self-reconfiguration planing method based on nitrification enhancement
CN110509279A (en) * 2019-09-06 2019-11-29 北京工业大学 A kind of trajectory path planning method and system of apery mechanical arm

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于轨迹匹配的模仿学习在类人机器人运动行为中的研究;门玉森;《中国优秀硕士学位论文全文数据库(信息科技辑)》;20170315(第3期);第17-40页 *

Also Published As

Publication number Publication date
CN111230875A (en) 2020-06-05

Similar Documents

Publication Publication Date Title
CN111230875B (en) Double-arm robot humanoid operation planning method based on deep learning
Wu et al. Online planning for ad hoc autonomous agent teams
CN109960880A (en) A kind of industrial robot obstacle-avoiding route planning method based on machine learning
CN102402712A (en) Robot reinforced learning initialization method based on neural network
WO2011115960A1 (en) Temporal tracking robot control system
Guigue et al. Pareto optimality and multiobjective trajectory planning for a 7-DOF redundant manipulator
CN110275528B (en) Improved path optimization method for RRT algorithm
CN115958590A (en) RRT-based mechanical arm deep frame obstacle avoidance motion planning method and device
CN110530373B (en) Robot path planning method, controller and system
Jankowski et al. Vp-sto: Via-point-based stochastic trajectory optimization for reactive robot behavior
CN110705803A (en) Route planning method based on triangle inner center guide RRT algorithm
Parhi et al. Smart navigation of humanoid robots using DAYKUN-BIP virtual target displacement and petri-net strategy
Ponsini et al. Analysis of soccer robot behaviors using time petri nets
Yun et al. Enhanced D∗ Lite Algorithm for mobile robot navigation
Zhang et al. A closed-loop perception, decision-making and reasoning mechanism for human-like navigation
Zare et al. Cyrus 2D Simulation 2019
CN112131754A (en) Extended POMDP planning method and system based on robot accompanying behavior model
Panda et al. Autonomous mobile robot path planning using hybridization of particle swarm optimization and Tabu search
Liu et al. Evolving hidden Markov model based human intention learning and inference
CN116360454A (en) Robot path collision avoidance planning method based on deep reinforcement learning in pedestrian environment
Tang et al. Reinforcement learning for robots path planning with rule-based shallow-trial
CN117400269B (en) Mechanical arm path planning method based on bidirectional sampling and virtual potential field guiding
CN113204238A (en) Path planning method and device for mobile robot and storage medium
Chung et al. Using ALC-PSO algorithm with particle growing method path planning in dynamic environments
Tabata et al. Casting manipulation of unknown string by robot arm

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant