CN110861084A - Four-legged robot falling self-resetting control method based on deep reinforcement learning - Google Patents
Info
- Publication number
- CN110861084A (application CN201911128299.8A)
- Authority
- CN
- China
- Prior art keywords
- robot
- joint
- falling
- output
- layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1628—Programme controls characterised by the control loop
- B25J9/163—Programme controls characterised by the control loop learning, adaptive, model based, rule based expert control
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1656—Programme controls characterised by programming, planning systems for manipulators
- B25J9/1664—Programme controls characterised by programming, planning systems for manipulators characterised by motion, path, trajectory planning
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B62—LAND VEHICLES FOR TRAVELLING OTHERWISE THAN ON RAILS
- B62D—MOTOR VEHICLES; TRAILERS
- B62D57/00—Vehicles characterised by having other propulsion or other ground-engaging means than wheels or endless track, alone or in addition to wheels or endless track
- B62D57/02—Vehicles characterised by having other propulsion or other ground-engaging means than wheels or endless track, alone or in addition to wheels or endless track with ground-engaging propulsion means, e.g. walking members
- B62D57/032—Vehicles characterised by having other propulsion or other ground-engaging means than wheels or endless track, alone or in addition to wheels or endless track with ground-engaging propulsion means, e.g. walking members with alternately or sequentially lifted supporting base and legs; with alternately or sequentially lifted feet or skid
Landscapes
- Engineering & Computer Science (AREA)
- Mechanical Engineering (AREA)
- Robotics (AREA)
- Chemical & Material Sciences (AREA)
- Combustion & Propulsion (AREA)
- Transportation (AREA)
- Manipulator (AREA)
- Feedback Control In General (AREA)
Abstract
The invention provides a quadruped robot fall self-resetting control method based on deep reinforcement learning, and belongs to the technical field of machine learning and robot control. The method comprises four steps: establishing a quadruped robot model, constructing and learning an actuator network, training a control strategy, and execution by the bottom-level system. By using a deep reinforcement learning algorithm, the robot can autonomously reset itself on flat ground from any falling posture without pre-programming or human intervention, which improves the intelligence, flexibility and environmental adaptability of the robot.
Description
Technical Field
The invention belongs to the technical field of machine learning and robot control, and particularly relates to a four-legged robot falling self-resetting control method based on deep reinforcement learning.
Background
As an important branch of robotics, legged robots can replace humans in search and operation tasks in unknown, complex and harsh environments such as earthquakes, nuclear radiation zones and fires, and therefore have broad application prospects. Most large land animals are quadrupeds, and they can be found on cliffs, hills, grasslands and deserts, which shows how strongly natural selection has favoured the quadrupedal mode of locomotion. The quadruped robot takes quadruped animals as its bionic model, has the potential to move as flexibly as they do, and is a mobile robot with wide application prospects.
In recent years, quadruped robots have made obvious progress in gait planning, obstacle crossing and related areas, but there is still a large gap before they achieve autonomous motion comparable to quadruped animals. A key missing capability is a self-resetting function that lets the quadruped robot recover quickly and flexibly after falling. Existing control methods are mostly model-based and task-specific, so almost every new operation has to be developed from scratch.
Disclosure of Invention
The quadruped robot fall self-resetting control method based on deep reinforcement learning provided by the invention enables the quadruped robot to reset autonomously without human assistance; moreover, different tasks can be executed independently and efficiently on demand simply by replacing the neural network parameter configuration, which greatly shortens the development cycle.
The invention provides a quadruped robot fall self-resetting control method based on deep reinforcement learning, which comprises four steps: establishing a quadruped robot model, constructing and learning an actuator network, training a control strategy, and execution by the bottom-level system, specifically as follows:
Step 1, establishing a quadruped robot model: the physical parameters of the robot are determined; the key to realizing the fall self-resetting function lies in the coordination among the legs and among the joints of each leg.
Step 2, building a deep reinforcement learning framework and learning an actuator network: the actuator network is learned on the system through self-supervised learning and is used in the simulation modeling of the 12 joints of the quadruped robot.
Step 3, training the controller: a simple parameterized controller is trained using the models generated in steps 1 and 2; foot trajectories are generated in the form of sine waves; each joint coordinate system and the centroid coordinate system are established by a coordinate transformation method; and the corresponding joint positions in the resetting process are calculated by inverse kinematics.
Step 4, execution by the bottom-level system: the initial falling position and the initial falling posture of the robot are set randomly; the output of the neural network trained in step 3 is used as the execution actions of the 12 joints of the robot; the motion scheme of each joint is determined so as to drive the joints to move; and the fall self-resetting task is completed.
A further improvement of the invention is that step 2 specifically comprises the following steps:
2.1: The state is a robot state measurement provided to the controller. The state space S is described as a 9-dimensional vector space, whose components respectively represent:
the robot orientation vector measured by an IMU (Inertial Measurement Unit);
r_z - the robot base height;
v - the base linear velocity;
w - the base angular velocity;
the joint positions;
Φ - the joint velocities;
Θ - the historical state of the joints (sparse samples at t = t_k - 0.01 s and t = t_k - 0.02 s);
α_(k-1) - the previous action of the robot;
C - a constant.
An action is a command provided to the actuators. The action space A is described as a 2-dimensional discrete vector space whose components represent the joint torque speed and the joint torque position, respectively.
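The following is a purely illustrative sketch, not part of the patent, of how such a state observation and action command might be assembled in code; every robot accessor name used here (read_imu_orientation, joint_positions and so on) is a hypothetical placeholder.

```python
# Illustrative sketch only: assembling the 9-component state observation and the
# 2 x 12 action command described above. Every robot accessor used here is a
# hypothetical placeholder, not an API defined by the patent.
import numpy as np

def build_observation(robot):
    """Concatenate the state components of step 2.1 into a single vector."""
    orientation = robot.read_imu_orientation()        # orientation vector from the IMU
    r_z = np.array([robot.base_height()])             # r_z: base height
    v = robot.base_linear_velocity()                  # base linear velocity
    w = robot.base_angular_velocity()                 # base angular velocity
    q = robot.joint_positions()                       # 12 joint positions
    dq = robot.joint_velocities()                     # 12 joint velocities (Phi)
    hist = robot.joint_history(delays=(0.01, 0.02))   # sparse joint history (Theta)
    a_prev = robot.previous_action()                  # previous action alpha_(k-1)
    c = np.array([1.0])                               # constant C
    return np.concatenate([orientation, r_z, v, w, q, dq, hist, a_prev, c])

# For each of the 12 joints the action carries a torque speed and a torque
# position command, i.e. a 2 x 12 array.
action = np.zeros((2, 12))
```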
The reward is specified to induce the robot to produce the desired behaviour. A reward function is set, and the policy π corresponding to the maximum discounted sum of rewards determines the actions the robot selects and executes according to the policy's commands.
The reward objective is the expected discounted return
J(π) = E_(τ~τ(π))[ Σ_t γ^t r_t ],
where γ ∈ (0,1) is the discount factor, r_t is the reward at step t, and τ(π) is the trajectory distribution under the policy π.
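As a small illustrative sketch, not taken from the patent, the discounted return that this objective maximizes can be computed for one sampled trajectory as follows:

```python
# Illustrative sketch: the discounted sum of rewards for one sampled trajectory.
def discounted_return(rewards, gamma=0.99):
    """Compute sum_t gamma^t * r_t for a list of per-step rewards."""
    total, g = 0.0, 1.0
    for r in rewards:
        total += g * r
        g *= gamma
    return total

# Example: hypothetical rewards collected while the simulated robot rights itself.
print(discounted_return([0.1, 0.2, 0.5, 1.0], gamma=0.9))  # prints approximately 1.414
```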
2.2: A deep neural network N for evaluating the fall self-resetting benefit of the robot is constructed, specifically:
An MLP (Multi-Layer Perceptron) four-layer neural network N is constructed to evaluate the fall self-resetting benefit of the robot. It comprises an input layer L_i, two hidden layers L_h and an output layer L_o. The input items of the input layer are the historical states of the robot in the generalized coordinates q and the generalized velocities v.
The output items of the output layer L_o have two dimensions per joint, representing the speed estimation deviation S and the position estimation deviation P of each joint torque of the robot: the speed estimation deviation S is the deviation between the actual speed and the target speed of the current joint torque, and the position estimation deviation P is the deviation between the actual position and the target position of the current joint torque. Each leg of the robot has 3 degrees of freedom, giving 3 × 4 = 12 joint torques in total, so the output of the output layer is a 2 × 12 matrix.
The activation functions of the deep neural network N are set as follows:
the input-layer activation function of the deep neural network N is the ReLU function
f(x) = max(0, x),
and the output-layer activation function is the tanh function.
With input vector x, the output of hidden layer 1 is
f(w_1 · x + b_1),
the output of hidden layer 2 is
f(w_2 · f(w_1 · x + b_1) + b_2),
and the final output of the output layer is
f(x) = f(b_2 + w_2 · f(b_1 + w_1 · x)),
where the function f is the tanh function
f(x) = (e^x - e^(-x)) / (e^x + e^(-x)),
w are the weights and b are the biases.
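For readers who find code easier to follow, here is a minimal, hypothetical sketch of a four-layer MLP with the layer roles and activations described above, written with PyTorch; the hidden width of 256 and the input dimension are assumptions, not values given in the patent.

```python
# Illustrative sketch of the MLP described above (assumed hidden width 256,
# ReLU on the input/hidden layers, tanh on the output layer). Not patent code.
import torch
import torch.nn as nn

class FallResetNet(nn.Module):
    def __init__(self, state_dim, hidden=256, num_joints=12):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),      # input layer L_i
            nn.Linear(hidden, hidden), nn.ReLU(),         # hidden layers L_h
            nn.Linear(hidden, 2 * num_joints), nn.Tanh()  # output layer L_o
        )
        self.num_joints = num_joints

    def forward(self, x):
        # Reshape to a 2 x 12 matrix: row 0 is the speed estimation deviation S,
        # row 1 the position estimation deviation P, one column per joint torque.
        return self.layers(x).view(-1, 2, self.num_joints)

net = FallResetNet(state_dim=48)         # 48 is an assumed example dimension
print(net(torch.zeros(1, 48)).shape)     # torch.Size([1, 2, 12])
```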
A further improvement of the invention is that step 4 specifically comprises the following steps:
4.1 The initial falling position and the initial falling posture of the robot are set randomly.
4.2 The deep neural network N outputs the execution actions of the 12 joints of the robot.
4.3 Assuming that the robot fully follows the joint torque speed command and the joint torque position command, the output position trajectory is simulated.
4.4 It is judged whether the joint movement exceeds the available workspace; if it does, the sample is rejected, the position is reset to the previous position, and the output command is re-sampled; if not, the action is executed.
4.5 It is judged whether the robot has recovered its initial normal state; if not, the output commands obtained by sampling continue to be executed; if so, the robot has completed the fall self-resetting task.
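The bottom-level execution of steps 4.1 to 4.5 can be pictured with the following illustrative sketch, which is not part of the patent; build_observation refers to the earlier sketch, and joint_limits, simulate_position_trajectory, set_random_fall_pose and is_upright are hypothetical placeholders.

```python
# Illustrative sketch of the bottom-level execution loop in steps 4.1-4.5.
# All robot/simulation helpers used here are hypothetical placeholders.
import numpy as np

def fall_self_reset(robot, net, joint_limits, max_steps=1000):
    robot.set_random_fall_pose()                    # 4.1 random initial fall pose
    prev_q = robot.joint_positions()
    for _ in range(max_steps):
        obs = build_observation(robot)              # state observation (earlier sketch)
        action = net(obs)                           # 4.2 network outputs 2 x 12 commands
        target_q = simulate_position_trajectory(robot, action)  # 4.3 assumed perfect tracking
        low, high = joint_limits
        if np.any(target_q < low) or np.any(target_q > high):
            robot.set_joint_positions(prev_q)       # 4.4 reject the sample, reset position
            continue                                #     and re-sample on the next pass
        robot.apply(action)                         # 4.4 execute the action
        prev_q = robot.joint_positions()
        if robot.is_upright():                      # 4.5 initial normal state recovered
            return True
    return False
```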
Compared with the prior art, the invention has the following beneficial effects:
By applying deep reinforcement learning to the fall self-resetting function of the quadruped robot, the invention avoids the tedious manual tuning required when humans participate; automatic resetting shortens the time needed to complete the task and offers high flexibility; and the robot continuously learns and accumulates experience, so that it can complete the task smoothly in untrained, unknown environments and from different falling states.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a schematic view of the hip coordinate system of the present invention;
FIG. 3 is a schematic view of the thigh and calf coordinate system of the present invention;
FIG. 4 is a schematic diagram of the fall self-resetting procedure of the present invention;
FIG. 5 is a diagram of a neural network architecture of the present invention;
FIG. 6 is a schematic of the overall control strategy of the present invention;
Detailed Description
As shown in FIG. 1, the quadruped robot fall self-resetting method based on deep reinforcement learning provided by this embodiment comprises four steps: establishing a quadruped robot model, constructing and learning an actuator network, training a control strategy, and execution by the bottom-level system.
The specific contents are as follows:
Step 2: a deep reinforcement learning framework is built, an actuator network is learned on the system through self-supervised learning, and the actuator network is used in the simulation modeling of the 12 joints of the quadruped robot.
2.1: The state is a robot state measurement provided to the controller. The state space S is described as a 9-dimensional vector space, whose components respectively represent:
the robot orientation vector measured by an IMU (Inertial Measurement Unit);
r_z - the robot base height;
v - the base linear velocity;
w - the base angular velocity;
the joint positions;
Φ - the joint velocities;
Θ - the historical state of the joints (sparse samples at t = t_k - 0.01 s and t = t_k - 0.02 s);
α_(k-1) - the previous action of the robot;
C - a constant.
An action is a command provided to the actuators. The action space A is described as a two-dimensional discrete vector space whose components represent the joint torque speed and the joint torque position, respectively.
The reward is specified to induce the robot to produce the desired behaviour. A reward function is set, and the policy π corresponding to the maximum discounted sum of rewards determines the actions the robot selects and executes according to the policy's commands.
The reward objective is the expected discounted return J(π) = E_(τ~τ(π))[ Σ_t γ^t r_t ], where γ ∈ (0,1) is the discount factor, r_t is the reward at step t, and τ(π) is the trajectory distribution under the policy π.
2.2: A deep neural network N for evaluating the fall self-resetting benefit of the robot is constructed, specifically:
An MLP (Multi-Layer Perceptron) four-layer neural network N is constructed to evaluate the fall self-resetting benefit of the robot. It comprises an input layer L_i, two hidden layers L_h and an output layer L_o. The input items of the input layer are the historical states of the robot in the generalized coordinates q and the generalized velocities v.
The output items of the output layer L_o have two dimensions per joint, representing the speed estimation deviation S and the position estimation deviation P of each joint torque of the robot. Each leg of the robot has 3 degrees of freedom, giving 3 × 4 = 12 joint torques in total, so the output of the output layer is a 2 × 12 matrix.
2.3: The activation functions of the deep neural network N are set as follows:
the input-layer activation function of the deep neural network N is the ReLU function f(x) = max(0, x), and the output-layer activation function is the tanh function.
With input vector x, the output of hidden layer 1 is f(w_1 · x + b_1), the output of hidden layer 2 is f(w_2 · f(w_1 · x + b_1) + b_2), and the final output of the output layer is f(x) = f(b_2 + w_2 · f(b_1 + w_1 · x)), where the function f is the tanh function f(x) = (e^x - e^(-x)) / (e^x + e^(-x)), w are the weights and b are the biases.
Step 3: a simple parameterized controller is trained using the models generated in steps 1 and 2; foot trajectories are generated in the form of sine waves; each joint coordinate system and the centroid coordinate system are established by a coordinate transformation method; and the corresponding joint positions in the resetting process are calculated by inverse kinematics. The coordinate systems are established as shown in FIG. 2 and FIG. 3.
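As an illustration of the sine-wave foot trajectories and the inverse-kinematics step, the following sketch, which is not part of the patent, generates a sinusoidal foot target in the hip plane and solves a two-link inverse-kinematics problem for the thigh and calf joints; the link lengths, amplitudes and frequency are assumed demonstration values.

```python
# Illustrative sketch of step 3: a sinusoidal foot trajectory and a planar
# two-link inverse-kinematics solution for the thigh/calf joints. Link lengths,
# amplitudes and frequency are assumed demonstration values, not patent data.
import numpy as np

def foot_trajectory(t, x0=0.0, z0=-0.35, ax=0.05, az=0.03, freq=1.0):
    """Sine-wave foot target (x, z) expressed in the hip coordinate system."""
    x = x0 + ax * np.sin(2.0 * np.pi * freq * t)
    z = z0 + az * np.sin(2.0 * np.pi * freq * t)
    return x, z

def leg_ik(x, z, l_thigh=0.25, l_calf=0.25):
    """One inverse-kinematics solution for the thigh and calf joint angles."""
    d2 = x * x + z * z
    c_knee = (d2 - l_thigh**2 - l_calf**2) / (2.0 * l_thigh * l_calf)
    knee = np.arccos(np.clip(c_knee, -1.0, 1.0))
    hip = np.arctan2(x, -z) - np.arctan2(l_calf * np.sin(knee),
                                         l_thigh + l_calf * np.cos(knee))
    return hip, knee

for t in (0.0, 0.25, 0.5):
    print(leg_ik(*foot_trajectory(t)))  # joint angles along the reset trajectory
```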
Step 4: in the execution phase of the bottom-level system, the initial falling position and posture of the robot are set randomly; the deep neural network trained in step 3 outputs the execution actions of the 12 joints of the robot; the motion scheme of each joint is determined so as to drive the joints to move; and the fall self-resetting task is completed.
Step 4 specifically comprises the following steps:
4.1 The initial falling position and the initial falling posture of the robot are set randomly.
4.2 The deep neural network N outputs the execution actions of the 12 joints of the robot.
4.3 Assuming that the robot fully follows the joint torque speed command and the joint torque position command, the output position trajectory is simulated.
4.4 It is judged whether the joint movement exceeds the available workspace; if it does, the sample is rejected, the position is reset to the previous position, and the output command is re-sampled; if not, the action is executed, as shown in the process of FIG. 4.
4.5 It is judged whether the robot has recovered its initial normal state; if not, the output commands obtained by sampling continue to be executed; if so, the robot has completed the fall self-resetting task, and the final reset completion state is shown in FIG. 5.
Claims (4)
1. A four-footed robot falling self-resetting control method based on deep reinforcement learning is characterized by comprising the following steps:
step 1, establishing a four-footed robot model: determining various physical parameters of the robot; the key point for realizing the falling self-resetting function lies in the mutual matching among the legs and the joints of each leg;
step 2, building a deep reinforcement learning framework and learning an actuator network: learning an actuator network on the system through self-supervision learning, and using the actuator network in simulation modeling of 12 joints of the quadruped robot;
step 3, training the controller: training a simple parameterized controller by using the models generated in steps 1 and 2, generating foot trajectories in the form of sine waves, establishing each joint coordinate system and the centroid coordinate system by a coordinate transformation method, and calculating the corresponding joint positions in the resetting process by inverse kinematics;
step 4, execution by the bottom-level system: randomly setting the initial falling position and the initial falling posture of the robot, using the output of the neural network trained in step 3 as the execution actions of the 12 joints of the robot, determining the motion scheme of each joint so as to drive the joints to move, and completing the fall self-resetting task.
2. The four-footed robot falling self-resetting control method based on deep reinforcement learning of claim 1, characterized in that the four-footed robot in step 1 has three joints, namely the hip, thigh and shank, i.e. three degrees of freedom, on each leg; the physical parameters of the four-footed robot include the lengths of the hip, thigh and shank of the robot, and the motion parameters include the degrees of freedom of each joint and limit the available spatial position of each joint so as to accord with actual biological motion.
3. The four-footed robot falling self-resetting control method based on deep reinforcement learning of claim 1, characterized in that the specific steps of building the deep reinforcement learning framework in step 2 are as follows:
2.1 The state is a robot state measurement provided to the controller; the state space S is described as a 9-dimensional vector space, whose components respectively represent:
the robot orientation vector measured by an IMU (Inertial Measurement Unit);
r_z - the robot base height; v - the base linear velocity; w - the base angular velocity; the joint positions; Φ - the joint velocities; Θ - the historical state of the joints (sparse samples at t = t_k - 0.01 s and t = t_k - 0.02 s); α_(k-1) - the previous action of the robot; C - a constant;
2.2 An action is a command provided to the actuators; the action space A is described as a two-dimensional discrete vector space whose components respectively represent the joint torque speed and the joint torque position;
2.3 The reward is specified to induce the robot to produce the desired behaviour; a reward function is set, and the policy π corresponding to the maximum discounted sum of rewards determines the actions the robot selects and executes according to the policy's commands;
the reward objective is the expected discounted return J(π) = E_(τ~τ(π))[ Σ_t γ^t r_t ],
where γ ∈ (0,1) is the discount factor, r_t is the reward at step t, and τ(π) is the trajectory distribution under the policy π;
the specific steps of learning the actuator network in step 2 are as follows:
2.4 An MLP (Multi-Layer Perceptron) four-layer neural network N for evaluating the fall self-resetting benefit of the robot is constructed, comprising: an input layer L_i, two hidden layers L_h and an output layer L_o; the input items of the input layer are the historical states of the robot in the generalized coordinates q and the generalized velocities v;
the output items of the output layer L_o have two dimensions per joint, representing the speed estimation deviation S and the position estimation deviation P of each joint torque of the robot; the speed estimation deviation S is the deviation between the actual speed and the target speed of the current joint torque, and the position estimation deviation P is the deviation between the actual position and the target position of the current joint torque; each leg of the robot has 3 degrees of freedom, giving 3 × 4 = 12 joint torques in total, so the output of the output layer is a 2 × 12 matrix;
2.5 The activation functions of the neural network N are set:
the input-layer activation function of the neural network N is the ReLU function
f(x) = max(0, x),
and the output-layer activation function is the tanh function;
with input vector x, the output of hidden layer 1 is
f(w_1 · x + b_1),
the output of hidden layer 2 is
f(w_2 · f(w_1 · x + b_1) + b_2),
and the final output of the output layer is
f(x) = f(b_2 + w_2 · f(b_1 + w_1 · x)),
where the function f is the tanh function
f(x) = (e^x - e^(-x)) / (e^x + e^(-x)),
w are the weights and b are the biases.
4. The four-footed robot falling self-resetting control method based on deep reinforcement learning of claim 1 is characterized in that the specific steps executed by the bottom layer system in step 4 are as follows:
4.1: randomly setting the falling initial position and the falling initial posture of the robot;
4.2: the deep neural network N outputs the execution actions of 12 joints of the robot;
4.3: assuming that the robot completely follows the joint torque speed command and the joint torque position command, and simulating an output position track;
4.4: judging whether the joint movement exceeds the available workspace; if so, rejecting the sample, resetting the position to the previous position and re-sampling the output command; if not, executing the action;
4.5: judging whether the robot has recovered its initial normal state; if not, continuing to execute the output commands obtained by sampling; if so, the robot completes the fall self-resetting task.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911128299.8A CN110861084B (en) | 2019-11-18 | 2019-11-18 | Four-legged robot falling self-resetting control method based on deep reinforcement learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911128299.8A CN110861084B (en) | 2019-11-18 | 2019-11-18 | Four-legged robot falling self-resetting control method based on deep reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110861084A true CN110861084A (en) | 2020-03-06 |
CN110861084B CN110861084B (en) | 2022-04-05 |
Family
ID=69654912
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911128299.8A Active CN110861084B (en) | 2019-11-18 | 2019-11-18 | Four-legged robot falling self-resetting control method based on deep reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110861084B (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111409073A (en) * | 2020-04-02 | 2020-07-14 | 深圳国信泰富科技有限公司 | Tumbling self-recovery method and system for high-intelligence robot |
CN111506100A (en) * | 2020-06-15 | 2020-08-07 | 深圳市优必选科技股份有限公司 | Multi-legged robot joint control method and device and multi-legged robot |
CN112405568A (en) * | 2020-10-20 | 2021-02-26 | 同济大学 | Humanoid robot falling prediction method |
CN112859904A (en) * | 2021-01-25 | 2021-05-28 | 乐聚(深圳)机器人技术有限公司 | Method, device and equipment for recovering standing posture of robot and storage medium |
CN113110459A (en) * | 2021-04-20 | 2021-07-13 | 上海交通大学 | Motion planning method for multi-legged robot |
WO2022223056A1 (en) * | 2021-07-12 | 2022-10-27 | 上海微电机研究所(中国电子科技集团公司第二十一研究所) | Robot motion parameter adaptive control method and system based on deep reinforcement learning |
CN115407790A (en) * | 2022-08-16 | 2022-11-29 | 中国北方车辆研究所 | Four-legged robot lateral velocity estimation method based on deep learning |
TWI811156B (en) * | 2022-11-16 | 2023-08-01 | 英業達股份有限公司 | Transition method of locomotion gait of robot |
CN116898583A (en) * | 2023-06-21 | 2023-10-20 | 北京长木谷医疗科技股份有限公司 | Deep learning-based intelligent rasping control method and device for orthopedic operation robot |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1297805A (en) * | 1999-11-24 | 2001-06-06 | 索尼公司 | Movable robot with legs and its controlling and operating method |
US6330494B1 (en) * | 1998-06-09 | 2001-12-11 | Sony Corporation | Robot and method of its attitude control |
CN1518488A (en) * | 2002-03-15 | 2004-08-04 | 索尼公司 | Operation control device for leg-type mobile robot and operation control method and robot device |
CN102372042A (en) * | 2011-09-07 | 2012-03-14 | 广东工业大学 | Motion planning system for biped robot |
CN106886155A (en) * | 2017-04-28 | 2017-06-23 | 齐鲁工业大学 | A kind of quadruped robot control method of motion trace based on PSO PD neutral nets |
CN107450555A (en) * | 2017-08-30 | 2017-12-08 | 唐开强 | A kind of Hexapod Robot real-time gait planing method based on deeply study |
CN108983804A (en) * | 2018-08-27 | 2018-12-11 | 燕山大学 | A kind of biped robot's gait planning method based on deeply study |
CN109483530A (en) * | 2018-10-18 | 2019-03-19 | 北京控制工程研究所 | A kind of legged type robot motion control method and system based on deeply study |
- 2019-11-18 CN CN201911128299.8A patent/CN110861084B/en active Active
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6330494B1 (en) * | 1998-06-09 | 2001-12-11 | Sony Corporation | Robot and method of its attitude control |
CN1297805A (en) * | 1999-11-24 | 2001-06-06 | 索尼公司 | Movable robot with legs and its controlling and operating method |
CN1518488A (en) * | 2002-03-15 | 2004-08-04 | 索尼公司 | Operation control device for leg-type mobile robot and operation control method and robot device |
CN102372042A (en) * | 2011-09-07 | 2012-03-14 | 广东工业大学 | Motion planning system for biped robot |
CN106886155A (en) * | 2017-04-28 | 2017-06-23 | 齐鲁工业大学 | A kind of quadruped robot control method of motion trace based on PSO PD neutral nets |
CN107450555A (en) * | 2017-08-30 | 2017-12-08 | 唐开强 | A kind of Hexapod Robot real-time gait planing method based on deeply study |
CN108983804A (en) * | 2018-08-27 | 2018-12-11 | 燕山大学 | A kind of biped robot's gait planning method based on deeply study |
CN109483530A (en) * | 2018-10-18 | 2019-03-19 | 北京控制工程研究所 | A kind of legged type robot motion control method and system based on deeply study |
Non-Patent Citations (1)
Title |
---|
杨茜 et al.: "Modeling and experimental research on discrete and continuous motion in the attitude adjustment of a hopping robot", 《机器人》 (Robot) *
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111409073A (en) * | 2020-04-02 | 2020-07-14 | 深圳国信泰富科技有限公司 | Tumbling self-recovery method and system for high-intelligence robot |
CN111506100A (en) * | 2020-06-15 | 2020-08-07 | 深圳市优必选科技股份有限公司 | Multi-legged robot joint control method and device and multi-legged robot |
CN111506100B (en) * | 2020-06-15 | 2020-10-02 | 深圳市优必选科技股份有限公司 | Multi-legged robot joint control method and device and multi-legged robot |
CN112405568A (en) * | 2020-10-20 | 2021-02-26 | 同济大学 | Humanoid robot falling prediction method |
CN112859904A (en) * | 2021-01-25 | 2021-05-28 | 乐聚(深圳)机器人技术有限公司 | Method, device and equipment for recovering standing posture of robot and storage medium |
CN113110459A (en) * | 2021-04-20 | 2021-07-13 | 上海交通大学 | Motion planning method for multi-legged robot |
WO2022223056A1 (en) * | 2021-07-12 | 2022-10-27 | 上海微电机研究所(中国电子科技集团公司第二十一研究所) | Robot motion parameter adaptive control method and system based on deep reinforcement learning |
CN115407790A (en) * | 2022-08-16 | 2022-11-29 | 中国北方车辆研究所 | Four-legged robot lateral velocity estimation method based on deep learning |
CN115407790B (en) * | 2022-08-16 | 2024-04-26 | 中国北方车辆研究所 | Four-foot robot lateral speed estimation method based on deep learning |
TWI811156B (en) * | 2022-11-16 | 2023-08-01 | 英業達股份有限公司 | Transition method of locomotion gait of robot |
CN116898583A (en) * | 2023-06-21 | 2023-10-20 | 北京长木谷医疗科技股份有限公司 | Deep learning-based intelligent rasping control method and device for orthopedic operation robot |
CN116898583B (en) * | 2023-06-21 | 2024-04-26 | 北京长木谷医疗科技股份有限公司 | Deep learning-based intelligent rasping control method and device for orthopedic operation robot |
Also Published As
Publication number | Publication date |
---|---|
CN110861084B (en) | 2022-04-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110861084B (en) | Four-legged robot falling self-resetting control method based on deep reinforcement learning | |
Vukobratovic | When were active exoskeletons actually born? | |
Atmeh et al. | Implementation of an adaptive, model free, learning controller on the Atlas robot | |
Pérez-Higueras et al. | Hunavsim: A ros 2 human navigation simulator for benchmarking human-aware robot navigation | |
Tang et al. | Humanmimic: Learning natural locomotion and transitions for humanoid robot via wasserstein adversarial imitation | |
Kuo et al. | Development of humanoid robot simulator for gait learning by using particle swarm optimization | |
Rokbani et al. | Prototyping a biped robot using an educational robotics kit | |
Ammar et al. | Learning to walk using a recurrent neural network with time delay | |
Ferreira et al. | Diagonal walk reference generator based on Fourier approximation of ZMP trajectory | |
Li et al. | Agile and versatile bipedal robot tracking control through reinforcement learning | |
Wei et al. | Learning Gait-conditioned Bipedal Locomotion with Motor Adaptation | |
Soyguder et al. | Slegs robot: development and design of a novel flexible and self-reconfigurable robot leg | |
Belter et al. | Evolving feasible gaits for a hexapod robot by reducing the space of possible solutions | |
Fachantidis et al. | Model-based reinforcement learning for humanoids: A study on forming rewards with the iCub platform | |
Steinhauser | Habitat-Lab Quadruped Embodied AI Research | |
Bentrah et al. | Full body adjustment using iterative inverse kinematic and body parts correlation | |
Shafii et al. | Two humanoid simulators: Comparison and synthesis | |
Vollaro et al. | Application of Block-Based Programming to the Selected Open-Source Quadrupedal Platform for Improving Robotics Training | |
Huan et al. | Adaptive evolutionary neural network gait generation for humanoid robot optimized with modified differential evolution algorithm | |
Issa et al. | Learning the Quadruped Robot by Reinforcement Learning (RL). | |
Mortazi et al. | Using embodiment theory to train a set of actuators with different expertise to accomplish a duty: An application to train a quadruped robot for walking | |
Verner et al. | Experiential learning through designing robots and motion behaviors: A tiered approach | |
Liu et al. | A Reinforcement Learning Toolkit for Quadruped Robots With Pybullet | |
Agarwal | Interaction Between Artificial Intelligence and Mechanical Engineering | |
Amirshirzad et al. | Context based echo state networks for robot movement primitives |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |