CN110861084A - Four-legged robot falling self-resetting control method based on deep reinforcement learning - Google Patents

Info

Publication number
CN110861084A
Authority
CN
China
Prior art keywords
robot
joint
falling
output
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911128299.8A
Other languages
Chinese (zh)
Other versions
CN110861084B (en)
Inventor
宋光明
何淼
韦中
宋爱国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-11-18
Filing date
2019-11-18
Publication date
2020-03-06
Application filed by Southeast University
Priority to CN201911128299.8A
Publication of CN110861084A
Application granted
Publication of CN110861084B
Legal status: Active
Anticipated expiration

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00: Programme-controlled manipulators
    • B25J9/16: Programme controls
    • B25J9/1628: Programme controls characterised by the control loop
    • B25J9/163: Programme controls characterised by the control loop learning, adaptive, model based, rule based expert control
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00: Programme-controlled manipulators
    • B25J9/16: Programme controls
    • B25J9/1656: Programme controls characterised by programming, planning systems for manipulators
    • B25J9/1664: Programme controls characterised by programming, planning systems for manipulators characterised by motion, path, trajectory planning
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B62: LAND VEHICLES FOR TRAVELLING OTHERWISE THAN ON RAILS
    • B62D: MOTOR VEHICLES; TRAILERS
    • B62D57/00: Vehicles characterised by having other propulsion or other ground-engaging means than wheels or endless track, alone or in addition to wheels or endless track
    • B62D57/02: Vehicles characterised by having other propulsion or other ground-engaging means than wheels or endless track, alone or in addition to wheels or endless track with ground-engaging propulsion means, e.g. walking members
    • B62D57/032: Vehicles characterised by having other propulsion or other ground-engaging means than wheels or endless track, alone or in addition to wheels or endless track with ground-engaging propulsion means, e.g. walking members with alternately or sequentially lifted supporting base and legs; with alternately or sequentially lifted feet or skid

Landscapes

  • Engineering & Computer Science (AREA)
  • Mechanical Engineering (AREA)
  • Robotics (AREA)
  • Chemical & Material Sciences (AREA)
  • Combustion & Propulsion (AREA)
  • Transportation (AREA)
  • Manipulator (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention provides a fall self-resetting control method for a quadruped robot based on deep reinforcement learning, belonging to the technical field of machine learning and robot control. The method comprises four steps: establishing a quadruped robot model, constructing and learning an actuator network, training a control strategy, and execution by the low-level system. Using a deep reinforcement learning algorithm, the robot can reset itself autonomously on flat ground from any fallen posture, without pre-programming or human intervention, which improves the intelligence, flexibility, and environmental adaptability of the robot.

Description

Four-legged robot falling self-resetting control method based on deep reinforcement learning
Technical Field
The invention belongs to the technical field of machine learning and robot control, and particularly relates to a four-legged robot falling self-resetting control method based on deep reinforcement learning.
Background
Legged robots are an important branch of robotics. They can replace humans in search and operation tasks in unknown, complex, and harsh environments such as earthquakes, nuclear radiation, and fires, and have broad application prospects. Most large land animals are quadrupeds, and they are found on cliffs, hills, grasslands, and deserts alike, which shows that natural selection favors quadrupedal locomotion. Quadruped robots take quadruped animals as their bionic model and have the potential to move as flexibly as those animals.
In recent years, quadruped robots have made clear progress in gait planning, obstacle crossing, and similar tasks, but a large gap remains before they can move as autonomously as quadruped animals. One part of that gap is the ability to reset quickly and flexibly after a fall. Existing control methods are mostly model-based and task-specific, so almost every behavior has to be developed from scratch.
Disclosure of Invention
The fall self-resetting control method based on deep reinforcement learning enables a quadruped robot to reset itself autonomously without human assistance. Different tasks can be executed independently and efficiently on demand simply by swapping the neural network parameter configuration, which greatly shortens the development cycle.
The invention provides a fall self-resetting control method for a quadruped robot based on deep reinforcement learning, comprising four steps: establishing a quadruped robot model, constructing and learning an actuator network, training a control strategy, and execution by the low-level system. Specifically:
step 1, establishing a quadruped robot model: determine the physical parameters of the robot; the key to fall self-resetting lies in the coordination among the legs and among the joints of each leg;
step 2, building a deep reinforcement learning framework and learning an actuator network: learn an actuator network on the system through self-supervised learning, and use it in the simulation model of the 12 joints of the quadruped robot;
step 3, training the controller: train a simple parameterized controller using the models generated in steps 1 and 2, generate foot trajectories in sine-wave form, determine each joint coordinate system and the centroid coordinate system by coordinate transformation, and calculate the corresponding joint positions during the reset process by inverse kinematics;
step 4, execution by the low-level system: randomly set the initial fall position and posture of the robot, take the output of the neural network trained in step 3 as the actions executed by the robot's 12 joints, and determine the motion scheme of each joint to drive the joints and complete the fall self-resetting task.
In a further refinement of the invention, step 2 specifically comprises the following:
2.1: The state is the robot state measurement provided to the controller. The state space S is described as a 9-dimensional vector space, comprising
Figure BDA0002277541980000021
whose components respectively represent:
Figure BDA0002277541980000022: robot direction vector measured by an IMU (inertial measurement unit).
$r_z$: robot base height.
$v$: base linear velocity.
$w$: base angular velocity.
Figure BDA0002277541980000031: joint position.
$\Phi$: joint velocity.
$\Theta$: history of the joint state (at $t = t_k - 0.01\,\mathrm{s}$ and $t = t_k - 0.02\,\mathrm{s}$).
$\alpha_{k-1}$: the previous action of the robot.
$C$: constant.
An action is a command provided to an actuator. The action space A is described as a two-dimensional discrete vector space,
Figure BDA0002277541980000032
whose components represent the joint torque speed and the joint torque position, respectively.
The reward is specified to induce the robot to produce the desired behavior. A reward function π is set, and the policy that maximizes the discounted sum of rewards determines the actions that the robot selects and executes.
The reward function is:
Figure BDA0002277541980000033
where $\gamma \in (0,1)$ is the discount factor and $\tau(\pi)$ is the trajectory distribution under the reward function π.
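The equation image itself is not reproduced in this text. As a hedged reading only, the standard discounted-return objective is consistent with the two symbols defined above (the discount factor and the trajectory distribution). In this standard form π denotes the policy, although the text labels it a reward function; the per-step reward $r(s_t, a_t)$ and the symbol $\pi^{*}$ are notation assumed for illustration and are not taken from the patent figure.

```latex
\pi^{*} = \arg\max_{\pi} \; \mathbb{E}_{\tau(\pi)}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \right],
\qquad \gamma \in (0, 1)
```

Because $\gamma < 1$, rewards that arrive later in the trajectory contribute less to the sum, so a policy that resets the robot sooner scores higher than one that reaches the same state later.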
2.2: A deep neural network N is constructed to evaluate the return of the robot's fall self-resetting, specifically:
An MLP (multi-layer perceptron) four-layer neural network N is constructed to evaluate the return of the robot's fall self-resetting. The MLP comprises an input layer $L_i$, two hidden layers $L_h$, and an output layer $L_o$. The inputs to the input layer are the historical states of the robot in terms of the generalized coordinates q and the generalized velocity v.
The output of the output layer $L_o$ has two dimensions, which represent the speed estimation deviation S and the position estimation deviation P of each joint torque of the robot. The speed estimation deviation S is the deviation between the actual speed and the target speed of the current robot joint torque, and the position estimation deviation P is the deviation between the actual position and the target position of the current robot joint torque. Assuming each leg of the robot has 3 degrees of freedom, there are 3 × 4 = 12 joint torques in total, and the output of the output layer is a 2 × 12 matrix.
The activation functions of the deep neural network N are set as follows.
The input layer activation function of the deep neural network N is the ReLU function:
$f(x) = \max(0, x)$
The output layer activation function is:
Figure BDA0002277541980000041
The input to the network is the vector x, and the output of hidden layer 1 is:
$f(w_1 x + b_1)$
The output of hidden layer 2 is:
$f(w_2\, f(w_1 x + b_1) + b_2)$
The final output layer output is then:
$f(x) = f(b_2 + w_2\, f(b_1 + w_1 x))$
where the function f is the tanh function:
Figure BDA0002277541980000042
w is the weight and b is the bias.
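The forward pass described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: it evaluates the composed formula with f taken as tanh and reshapes the 24 outputs into the 2 × 12 matrix of speed and position deviations. The input size, hidden width, random weights, and the helper names `affine`, `forward`, and `random_matrix` are all assumptions made for the example.

```python
import math
import random

def affine(weights, bias, x):
    """Return W x + b for a row-major weight matrix and a bias vector."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(weights, bias)]

def forward(x, w1, b1, w2, b2):
    """Evaluate tanh(b2 + W2 tanh(b1 + W1 x)) and reshape it to 2 x n_joints."""
    hidden = [math.tanh(v) for v in affine(w1, b1, x)]
    flat = [math.tanh(v) for v in affine(w2, b2, hidden)]
    n_joints = len(flat) // 2
    # Row 0: speed deviation S per joint; row 1: position deviation P per joint.
    return [flat[:n_joints], flat[n_joints:]]

def random_matrix(rows, cols, rng):
    """Small random weights, standing in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
N_IN, N_HIDDEN, N_OUT = 36, 32, 24   # hypothetical sizes; 24 = 2 x 12 joints
w1, b1 = random_matrix(N_HIDDEN, N_IN, rng), [0.0] * N_HIDDEN
w2, b2 = random_matrix(N_OUT, N_HIDDEN, rng), [0.0] * N_OUT

output = forward([0.0] * N_IN, w1, b1, w2, b2)
print(len(output), len(output[0]))   # number of rows and columns of the output
```

Because tanh is bounded, every entry of the output lies strictly between -1 and 1, so a real controller would need to scale these values to physical units.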
In a further refinement of the invention, step 4 specifically comprises the following:
4.1 Randomly set the initial fall position and posture of the robot.
4.2 The deep neural network N outputs the actions to be executed by the robot's 12 joints.
4.3 Simulate the output position trajectory, assuming that the robot fully follows the joint torque speed command and the joint torque position command.
4.4 Determine whether the joint motion exceeds the available spatial range. If so, reject the sample, reset the position to the previous position, and sample the output command again; if not, execute the action.
4.5 Determine whether the robot has recovered its initial normal state. If not, execute the output command again according to the sampled command; if so, the robot has completed the fall self-resetting task.
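Steps 4.2 to 4.5 amount to a sample, check, and retry loop, which can be sketched as follows. This is an illustrative sketch under stated assumptions: the joint limits, the stand-in policy `toy_policy`, the upright test `toy_upright`, and the step budget are all hypothetical, and the simulation of step 4.3 is reduced to assuming the joints reach the commanded positions exactly.

```python
import random

JOINT_LIMITS = [(-1.0, 1.0)] * 12   # hypothetical limits in radians, 12 joints

def within_limits(positions, limits=JOINT_LIMITS):
    """Step 4.4 check: every joint must stay inside its available range."""
    return all(lo <= p <= hi for p, (lo, hi) in zip(positions, limits))

def self_reset(positions, sample_command, is_upright, max_steps=1000):
    """Repeat steps 4.2 to 4.5 until the robot is upright or the budget runs out."""
    for step in range(max_steps):
        if is_upright(positions):               # step 4.5
            return positions, step
        command = sample_command(positions)     # step 4.2
        # Step 4.3: assume the joints track the commanded positions exactly.
        candidate = list(command)
        if within_limits(candidate):            # step 4.4
            positions = candidate
        # Otherwise the sample is rejected and the previous position is kept.
    return positions, max_steps

rng = random.Random(0)

def toy_policy(positions):
    # Stand-in for the trained network: move halfway to zero, plus small noise.
    return [0.5 * p + rng.uniform(-0.02, 0.02) for p in positions]

def toy_upright(positions):
    # Stand-in for the "initial normal state" test of step 4.5.
    return all(abs(p) < 0.1 for p in positions)

start = [rng.uniform(-1.0, 1.0) for _ in range(12)]   # step 4.1
final, steps = self_reset(start, toy_policy, toy_upright)
print(steps, toy_upright(final))
```

Keeping the previous position on rejection means an out-of-range command never reaches the joints; the loop simply draws a new command on the next iteration.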
Compared with the prior art, the invention has the following beneficial effects:
The invention applies deep reinforcement learning to the fall self-resetting function of a quadruped robot, which avoids the tedious manual tuning required when humans are involved; the automatic reset shortens task completion time and is highly flexible; and the robot can keep learning and accumulating experience, so that it can complete tasks smoothly in unknown environments it was not trained in and from different fallen states.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic view of the hip coordinate system of the present invention;
FIG. 3 is a schematic view of the thigh and shank coordinate system of the present invention;
FIG. 4 is a schematic diagram of the fall self-resetting procedure of the present invention;
FIG. 5 is a diagram of the neural network architecture of the present invention;
FIG. 6 is a schematic of the overall control strategy of the present invention.
Detailed Description
As shown in FIG. 1, the fall self-resetting control method for a quadruped robot based on deep reinforcement learning provided by this embodiment comprises four steps: establishing a quadruped robot model, constructing and learning an actuator network, training a control strategy, and execution by the low-level system.
The specific contents are as follows:
Step 1: establish a quadruped robot model and determine the physical parameters of the robot. The key to fall self-resetting lies in the coordination among the legs and among the joints of each leg. The physical parameters of the quadruped robot include the lengths of the hip, thigh, and shank; the motion parameters include the degrees of freedom of each joint, and the available spatial position of each joint is limited to match actual biological motion. Each leg of the quadruped robot provided by the invention has three joints (hip, thigh, and shank), that is, three degrees of freedom.
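One way to hold the model parameters named in step 1 is a small table of link lengths and joint limits per leg. The sketch below is only an illustration: the class name `JointSpec`, the leg names, and every numeric value are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class JointSpec:
    """One joint of a leg: its link length and its allowed angular range."""
    name: str
    link_length: float   # metres (hypothetical value)
    lower: float         # radians (hypothetical value)
    upper: float         # radians (hypothetical value)

    def clamp(self, angle):
        """Limit an angle to the available range of this joint."""
        return min(max(angle, self.lower), self.upper)

# Three joints per leg: hip, thigh, and shank, i.e. three degrees of freedom.
LEG = (
    JointSpec("hip", 0.05, -0.6, 0.6),
    JointSpec("thigh", 0.20, -1.2, 1.2),
    JointSpec("shank", 0.20, 0.2, 2.4),
)

# Four identical legs give the 12 joints referred to throughout the method.
ROBOT = {leg: LEG for leg in ("front_left", "front_right", "rear_left", "rear_right")}
n_joints = sum(len(joints) for joints in ROBOT.values())
print(n_joints)
```

Storing the limits next to the link lengths keeps the range check used later in the execution stage in one place.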
Step 2: build a deep reinforcement learning framework, learn an actuator network on the system through self-supervised learning, and use it in the simulation model of the 12 joints of the quadruped robot.
2.1: The state is the robot state measurement provided to the controller. The state space S is described as a 9-dimensional vector space, comprising
Figure BDA0002277541980000061
whose components respectively represent:
Figure BDA0002277541980000062: robot direction vector measured by an IMU (inertial measurement unit).
$r_z$: robot base height.
$v$: base linear velocity.
$w$: base angular velocity.
Figure BDA0002277541980000071: joint position.
$\Phi$: joint velocity.
$\Theta$: history of the joint state (at $t = t_k - 0.01\,\mathrm{s}$ and $t = t_k - 0.02\,\mathrm{s}$).
$\alpha_{k-1}$: the previous action of the robot.
$C$: constant.
An action is a command provided to an actuator. The action space A is described as a two-dimensional discrete vector space,
Figure BDA0002277541980000073
whose components represent the joint torque speed and the joint torque position, respectively.
The reward is specified to induce the robot to produce the desired behavior. A reward function π is set, and the policy that maximizes the discounted sum of rewards determines the actions that the robot selects and executes.
The reward function is:
Figure BDA0002277541980000072
where $\gamma \in (0,1)$ is the discount factor and $\tau(\pi)$ is the trajectory distribution under the reward function π.
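The discounted sum mentioned above can be computed for a single trajectory as in the following sketch. The function name `discounted_return` and the sample rewards are assumptions for illustration; the patent does not specify the per-step reward terms.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over one trajectory, evaluated back to front."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie in the open interval (0, 1)")
    total = 0.0
    for reward in reversed(rewards):
        # Each step back in time discounts everything accumulated so far.
        total = reward + gamma * total
    return total

rewards = [1.0, 1.0, 1.0]   # hypothetical per-step rewards
value = discounted_return(rewards, gamma=0.5)
print(value)
```

With these inputs the sum is 1 + 0.5 + 0.25, so later rewards count for less than earlier ones, which favors policies that recover quickly.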
2.2: A deep neural network N is constructed to evaluate the return of the robot's fall self-resetting, specifically:
An MLP (multi-layer perceptron) four-layer neural network N is constructed to evaluate the return of the robot's fall self-resetting. The MLP comprises an input layer $L_i$, two hidden layers $L_h$, and an output layer $L_o$. The inputs to the input layer are the historical states of the robot in terms of the generalized coordinates q and the generalized velocity v.
The output of the output layer $L_o$ has two dimensions, which represent the speed estimation deviation S and the position estimation deviation P of each joint torque of the robot. Assuming each leg of the robot has 3 degrees of freedom, there are 3 × 4 = 12 joint torques in total, and the output of the output layer is a 2 × 12 matrix.
2.3: Set the activation functions of the deep neural network N:
The input layer activation function of the deep neural network N is the ReLU function $f(x) = \max(0, x)$, and the output layer activation function is
Figure BDA0002277541980000081
The input to the network is the vector x. The output of hidden layer 1 is $f(w_1 x + b_1)$, the output of hidden layer 2 is $f(w_2\, f(w_1 x + b_1) + b_2)$, and the final output layer output is $f(x) = f(b_2 + w_2\, f(b_1 + w_1 x))$, where the function f is the tanh function:
Figure BDA0002277541980000082
w is the weight and b is the bias.
Step 3: train a simple parameterized controller using the models generated in steps 1 and 2, generate foot trajectories in sine-wave form, establish each joint coordinate system and the centroid coordinate system by coordinate transformation, and calculate the corresponding joint positions during the reset process by inverse kinematics. The coordinate systems are established as shown in FIG. 2 and FIG. 3.
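A sine-wave foot trajectory followed by inverse kinematics can be sketched for a single leg as follows. This is a simplified illustration, not the patent's formulation: it covers only the thigh and shank joints in the leg's vertical plane and omits the hip joint and the coordinate frames of FIG. 2 and FIG. 3. The link lengths, stride, lift, body height, and the function names `foot_target`, `leg_ik`, and `leg_fk` are hypothetical.

```python
import math

THIGH, SHANK = 0.20, 0.20   # hypothetical link lengths in metres

def foot_target(t, period=1.0, stride=0.08, lift=0.04, height=0.30):
    """Sinusoidal foot path in the leg plane (x forward, z up, hip at origin)."""
    phase = 2.0 * math.pi * t / period
    x = 0.5 * stride * math.sin(phase)
    z = -height + lift * max(0.0, math.cos(phase))   # lift only on the swing half
    return x, z

def leg_ik(x, z, l1=THIGH, l2=SHANK):
    """Two-link inverse kinematics: thigh angle from vertical, knee angle."""
    u, v = -z, x   # u points down from the hip, v points forward
    cos_knee = (u * u + v * v - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(cos_knee) > 1.0:
        raise ValueError("foot target is out of reach")
    knee = math.acos(cos_knee)
    thigh = math.atan2(v, u) - math.atan2(l2 * math.sin(knee),
                                          l1 + l2 * math.cos(knee))
    return thigh, knee

def leg_fk(thigh, knee, l1=THIGH, l2=SHANK):
    """Forward kinematics, used here only to check the inverse solution."""
    x = l1 * math.sin(thigh) + l2 * math.sin(thigh + knee)
    z = -l1 * math.cos(thigh) - l2 * math.cos(thigh + knee)
    return x, z

# Joint-angle targets for ten samples over one period of the foot path.
trajectory = [leg_ik(*foot_target(0.1 * k)) for k in range(10)]
print(len(trajectory))
```

Feeding each solved pair of angles back through `leg_fk` reproduces the commanded foot position, which is a quick sanity check on the sign conventions chosen here.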
Step 4: in the execution stage of the low-level system, randomly set the initial fall position and posture of the robot, take the output of the deep neural network trained in step 3 as the actions executed by the robot's 12 joints, and determine the motion scheme of each joint to drive the joints and complete the fall self-resetting task.
Step 4 specifically comprises the following:
4.1 Randomly set the initial fall position and posture of the robot.
4.2 The deep neural network N outputs the actions to be executed by the robot's 12 joints.
4.3 Simulate the output position trajectory, assuming that the robot fully follows the joint torque speed command and the joint torque position command.
4.4 Determine whether the joint motion exceeds the available spatial range. If so, reject the sample, reset the position to the previous position, and sample the output command again; if not, execute the action, as shown in the process of FIG. 4.
4.5 Determine whether the robot has recovered its initial normal state. If not, execute the output command again according to the sampled command; if so, the robot has completed the fall self-resetting task, and the final reset state is shown in FIG. 5.

Claims (4)

1. A fall self-resetting control method for a quadruped robot based on deep reinforcement learning, characterized by comprising the following steps:
step 1, establishing a quadruped robot model: determine the physical parameters of the robot; the key to fall self-resetting lies in the coordination among the legs and among the joints of each leg;
step 2, building a deep reinforcement learning framework and learning an actuator network: learn an actuator network on the system through self-supervised learning, and use it in the simulation model of the 12 joints of the quadruped robot;
step 3, training the controller: train a simple parameterized controller using the models generated in steps 1 and 2, generate foot trajectories in sine-wave form, determine each joint coordinate system and the centroid coordinate system by coordinate transformation, and calculate the corresponding joint positions during the reset process by inverse kinematics;
step 4, execution by the low-level system: randomly set the initial fall position and posture of the robot, take the output of the neural network trained in step 3 as the actions executed by the robot's 12 joints, and determine the motion scheme of each joint to drive the joints and complete the fall self-resetting task.
2. The fall self-resetting control method for a quadruped robot based on deep reinforcement learning of claim 1, characterized in that each leg of the quadruped robot in step 1 has three joints (hip, thigh, and shank), that is, three degrees of freedom; the physical parameters of the quadruped robot include the lengths of the hip, thigh, and shank; and the motion parameters include the degrees of freedom of each joint, with the available spatial position of each joint limited to match actual biological motion.
3. The fall self-resetting control method for a quadruped robot based on deep reinforcement learning of claim 1, characterized in that
the specific steps of building the deep reinforcement learning framework in step 2 are as follows:
2.1 The state is the robot state measurement provided to the controller; the state space S is described as a 9-dimensional vector space, comprising
Figure FDA0002277541970000021
wherein:
Figure FDA0002277541970000023: robot direction vector measured by an IMU (inertial measurement unit);
$r_z$: robot base height; $v$: base linear velocity; $w$: base angular velocity;
Figure FDA0002277541970000025: joint position; $\Phi$: joint velocity; $\Theta$: sparsely sampled history of the joint state (at $t = t_k - 0.01\,\mathrm{s}$ and $t = t_k - 0.02\,\mathrm{s}$); $\alpha_{k-1}$: the previous action of the robot; $C$: constant;
2.2 An action is a command provided to the actuator; the action space A is described as a two-dimensional discrete vector space,
Figure FDA0002277541970000024
whose components represent the joint torque speed and the joint torque position, respectively;
2.3 The reward is specified to induce the robot to produce the desired behavior; a reward function π is set, and the policy that maximizes the discounted sum of rewards determines the actions that the robot selects and executes;
the reward function is:
Figure FDA0002277541970000022
wherein $\gamma \in (0,1)$ is the discount factor and $\tau(\pi)$ is the trajectory distribution under the reward function π;
the specific steps of learning the actuator network in step 2 are as follows:
2.4 An MLP (multi-layer perceptron) four-layer neural network N is constructed to evaluate the return of the robot's fall self-resetting, the MLP comprising an input layer $L_i$, two hidden layers $L_h$, and an output layer $L_o$; the inputs to the input layer are the historical states of the robot in terms of the generalized coordinates q and the generalized velocity v;
the output of the output layer $L_o$ has two dimensions, which represent the speed estimation deviation S and the position estimation deviation P of each joint torque of the robot; the speed estimation deviation S is the deviation between the actual speed and the target speed of the current robot joint torque, and the position estimation deviation P is the deviation between the actual position and the target position of the current robot joint torque; assuming each leg of the robot has 3 degrees of freedom, there are 3 × 4 = 12 joint torques in total, and the output of the output layer is a 2 × 12 matrix;
2.5 Set the activation functions of the neural network N:
the input layer activation function of the neural network N is the ReLU function:
$f(x) = \max(0, x)$
the output layer activation function is
Figure FDA0002277541970000031
the input to the network is the vector x, and the output of hidden layer 1 is:
$f(w_1 x + b_1)$
the output of hidden layer 2 is:
$f(w_2\, f(w_1 x + b_1) + b_2)$
the final output layer output is then:
$f(x) = f(b_2 + w_2\, f(b_1 + w_1 x))$
wherein the function f is the tanh function:
Figure FDA0002277541970000032
w is the weight and b is the bias.
4. The fall self-resetting control method for a quadruped robot based on deep reinforcement learning of claim 1, characterized in that the specific steps executed by the low-level system in step 4 are as follows:
4.1: randomly set the initial fall position and posture of the robot;
4.2: the deep neural network N outputs the actions to be executed by the robot's 12 joints;
4.3: simulate the output position trajectory, assuming that the robot fully follows the joint torque speed command and the joint torque position command;
4.4: determine whether the joint motion exceeds the available spatial range; if so, reject the sample, reset the position to the previous position, and sample the output command again; if not, execute the action;
4.5: determine whether the robot has recovered its initial normal state; if not, execute the output command again according to the sampled command; if so, the robot has completed the fall self-resetting task.
CN201911128299.8A 2019-11-18 2019-11-18 Four-legged robot falling self-resetting control method based on deep reinforcement learning Active CN110861084B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911128299.8A CN110861084B (en) 2019-11-18 2019-11-18 Four-legged robot falling self-resetting control method based on deep reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911128299.8A CN110861084B (en) 2019-11-18 2019-11-18 Four-legged robot falling self-resetting control method based on deep reinforcement learning

Publications (2)

Publication Number Publication Date
CN110861084A (en) 2020-03-06
CN110861084B (en) 2022-04-05

Family

ID=69654912

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911128299.8A Active CN110861084B (en) 2019-11-18 2019-11-18 Four-legged robot falling self-resetting control method based on deep reinforcement learning

Country Status (1)

Country Link
CN (1) CN110861084B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6330494B1 (en) * 1998-06-09 2001-12-11 Sony Corporation Robot and method of its attitude control
CN1297805A (en) * 1999-11-24 2001-06-06 索尼公司 Movable robot with legs and its controlling and operating method
CN1518488A (en) * 2002-03-15 2004-08-04 索尼公司 Operation control device for leg-type mobile robot and operation control method and robot device
CN102372042A (en) * 2011-09-07 2012-03-14 广东工业大学 Motion planning system for biped robot
CN106886155A (en) * 2017-04-28 2017-06-23 齐鲁工业大学 A kind of quadruped robot control method of motion trace based on PSO PD neutral nets
CN107450555A (en) * 2017-08-30 2017-12-08 唐开强 A kind of Hexapod Robot real-time gait planing method based on deeply study
CN108983804A (en) * 2018-08-27 2018-12-11 燕山大学 A kind of biped robot's gait planning method based on deeply study
CN109483530A (en) * 2018-10-18 2019-03-19 北京控制工程研究所 A kind of legged type robot motion control method and system based on deeply study

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨茜等: "一种弹跳机器人姿态调节中离散和连续运动建模与实验研究", 《机器人》 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111409073A (en) * 2020-04-02 2020-07-14 深圳国信泰富科技有限公司 Tumbling self-recovery method and system for high-intelligence robot
CN111506100A (en) * 2020-06-15 2020-08-07 深圳市优必选科技股份有限公司 Multi-legged robot joint control method and device and multi-legged robot
CN111506100B (en) * 2020-06-15 2020-10-02 深圳市优必选科技股份有限公司 Multi-legged robot joint control method and device and multi-legged robot
CN112405568A (en) * 2020-10-20 2021-02-26 同济大学 Humanoid robot falling prediction method
CN112859904A (en) * 2021-01-25 2021-05-28 乐聚(深圳)机器人技术有限公司 Method, device and equipment for recovering standing posture of robot and storage medium
CN113110459A (en) * 2021-04-20 2021-07-13 上海交通大学 Motion planning method for multi-legged robot
WO2022223056A1 (en) * 2021-07-12 2022-10-27 上海微电机研究所(中国电子科技集团公司第二十一研究所) Robot motion parameter adaptive control method and system based on deep reinforcement learning
CN115407790A (en) * 2022-08-16 2022-11-29 中国北方车辆研究所 Four-legged robot lateral velocity estimation method based on deep learning
CN115407790B (en) * 2022-08-16 2024-04-26 中国北方车辆研究所 Four-foot robot lateral speed estimation method based on deep learning
TWI811156B (en) * 2022-11-16 2023-08-01 英業達股份有限公司 Transition method of locomotion gait of robot
CN116898583A (en) * 2023-06-21 2023-10-20 北京长木谷医疗科技股份有限公司 Deep learning-based intelligent rasping control method and device for orthopedic operation robot
CN116898583B (en) * 2023-06-21 2024-04-26 北京长木谷医疗科技股份有限公司 Deep learning-based intelligent rasping control method and device for orthopedic operation robot

Also Published As

Publication number Publication date
CN110861084B (en) 2022-04-05

Similar Documents

Publication Publication Date Title
CN110861084B (en) Four-legged robot falling self-resetting control method based on deep reinforcement learning
Vukobratovic When were active exoskeletons actually born?
Atmeh et al. Implementation of an adaptive, model free, learning controller on the Atlas robot
Pérez-Higueras et al. Hunavsim: A ros 2 human navigation simulator for benchmarking human-aware robot navigation
Tang et al. Humanmimic: Learning natural locomotion and transitions for humanoid robot via wasserstein adversarial imitation
Kuo et al. Development of humanoid robot simulator for gait learning by using particle swarm optimization
Rokbani et al. Prototyping a biped robot using an educational robotics kit
Ammar et al. Learning to walk using a recurrent neural network with time delay
Ferreira et al. Diagonal walk reference generator based on Fourier approximation of ZMP trajectory
Li et al. Agile and versatile bipedal robot tracking control through reinforcement learning
Wei et al. Learning Gait-conditioned Bipedal Locomotion with Motor Adaptation
Soyguder et al. Slegs robot: development and design of a novel flexible and self-reconfigurable robot leg
Belter et al. Evolving feasible gaits for a hexapod robot by reducing the space of possible solutions
Fachantidis et al. Model-based reinforcement learning for humanoids: A study on forming rewards with the iCub platform
Steinhauser Habitat-Lab Quadruped Embodied AI Research
Bentrah et al. Full body adjustment using iterative inverse kinematic and body parts correlation
Shafii et al. Two humanoid simulators: Comparison and synthesis
Vollaro et al. Application of Block-Based Programming to the Selected Open-Source Quadrupedal Platform for Improving Robotics Training
Huan et al. Adaptive evolutionary neural network gait generation for humanoid robot optimized with modified differential evolution algorithm
Issa et al. Learning the Quadruped Robot by Reinforcement Learning (RL).
Mortazi et al. Using embodiment theory to train a set of actuators with different expertise to accomplish a duty: An application to train a quadruped robot for walking
Verner et al. Experiential learning through designing robots and motion behaviors: A tiered approach
Liu et al. A Reinforcement Learning Toolkit for Quadruped Robots With Pybullet
Agarwal Interaction Between Artificial Intelligence and Mechanical Engineering
Amirshirzad et al. Context based echo state networks for robot movement primitives

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant