CN103204193A - Under-actuated biped robot walking control method - Google Patents

Publication number
CN103204193A
Authority
CN
China
Prior art keywords
robot
under-actuated biped
inverted pendulum
formula
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2013101202519A
Other languages
Chinese (zh)
Other versions
CN103204193B (en)
Inventor
刘道远
潘刚
彭自强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN201310120251.9A priority Critical patent/CN103204193B/en
Publication of CN103204193A publication Critical patent/CN103204193A/en
Application granted granted Critical
Publication of CN103204193B publication Critical patent/CN103204193B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Manipulator (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention discloses an under-actuated biped robot walking control method that addresses planar walking control of a biped robot. By adopting MACCEPA compliant actuators and exploiting the robot's own dynamics, fast walking is achieved effectively. Through continuous interaction with the ground, the robot learns to walk autonomously by trial and error using the Q-learning method, achieving stable, natural and periodic fast walking. The method has high application value.

Description

An under-actuated biped robot walking control method
Technical field
The present invention relates to dynamic walking methods for biped robots, and in particular to a walking control method for an under-actuated biped robot.
Background art
At present, biped robot walking methods mainly comprise walking based on the ZMP criterion and limit cycle walking. ZMP-based walking requires the robot's zero moment point (ZMP) to remain inside the support polygon formed by the feet at all times. Joint trajectories planned under the ZMP stability criterion can realize a variety of gaits. The best-known successful example of ZMP-based walking is ASIMO from Honda of Japan. However, ZMP-based walking imposes many artificial constraints and relies on high-stiffness, high-inertia, high-gain motors to track the desired trajectory accurately. It does not make full use of the robot's own dynamics, which leads to high energy consumption. In addition, the ZMP criterion applies only to robots with flat soles; for robots without soles or with arc-shaped feet the ZMP cannot be defined. Trajectory planning and tracking based on the ZMP stability criterion have therefore been applied widely and successfully to conventional biped robots, but are not suitable for passive robots.
Limit cycle walking is a walking theory that emerged at the end of the twentieth century. Inspired by human walking, it requires only cyclic stability: the gait sequence forms a stable limit cycle in state space, but the gait need not be locally stable at any given instant of the cycle. This approach places fewer artificial constraints on the robot and can make full use of the robot's own dynamics, so it offers higher energy efficiency, walking speed and disturbance rejection. Successful examples of under-actuated biped robots based on the limit cycle walking principle include the biped robot of Cornell University. That robot uses a PD controller whose parameters must be tuned manually, which involves a heavy workload.
The servo motors of conventional rigid actuators have high inertia and high energy consumption, cannot make full use of the robot's own dynamics, and are not suitable for under-actuated walking control. By contrast, a compliant actuator can be regarded as a special spring and makes full use of the biped robot's dynamics. Moreover, a biped robot experiences impacts when walking fast or running; a compliant actuator absorbs these impacts effectively and helps to realize fast walking.
Summary of the invention
The object of the present invention is to provide an under-actuated biped robot walking control method that addresses the deficiencies of the prior art.
The object of the invention is achieved through the following technical solution: a control method for fast walking of an under-actuated biped robot, comprising the following steps:
Step 1: the host computer acquires the initial state of the robot from the sensors mounted on the trunk and limbs of the under-actuated biped robot, comprising the angles $(\theta_1, \theta_2, \theta_3, \theta_4, \theta_5)$ of the trunk, first thigh, second thigh, first shank and second shank with respect to the vertical, and the corresponding angular velocities $(\dot{\theta}_1, \dot{\theta}_2, \dot{\theta}_3, \dot{\theta}_4, \dot{\theta}_5)$;
Step 2: model the biped robot, which comprises establishing the motion control model of the under-actuated biped robot and its equivalent inverted pendulum model;
Step 3: initialize the Q-learning network, which comprises initializing the RBF neural network, the eligibility trace $\Phi_0$ and the action vector A;
Step 4: compute the RBF neural network output $Q(s_t, a)$;
Step 5: select the action vector $a_t$ using an ε-greedy policy;
Step 6: run a dynamics simulation of the robot and solve the under-actuated biped robot model to obtain the new states $x_{t+1}$, $s_{t+1}$ and the reward signal (reinforcement value) $r_t$;
Step 7: update the Q-learning network, which comprises updating the eligibility trace $\Phi_t$, calculating the TD error e, and updating the RBF network weights;
Step 8: repeat steps 4 to 7 until the new state $x_{t+1}$ of the under-actuated biped robot is identical to the previous state $x_t$, that is, until a fixed point is found;
Step 9: the host computer outputs the robot state $x_t$ corresponding to the fixed point and the corresponding action vector $a_t$ to the under-actuated biped robot, controlling it to obtain a stable, fast periodic gait.
The beneficial effects of the invention are as follows. The invention is a control method for an under-actuated biped robot that uses MACCEPA compliant actuators. These actuators make full use of the biped robot's own dynamics and reduce its energy consumption. They also absorb the impacts that occur when the robot collides with the ground, which gives the robot a degree of protection. The method quickly and successfully controls the biped robot to achieve a stable, natural, periodic dynamic gait with low energy consumption.
Description of drawings
Fig. 1 is a diagram of the under-actuated biped robot and its equivalent inverted pendulum model;
Fig. 2 is a schematic of the MACCEPA actuator;
Fig. 3 is a schematic of the RBF neural network;
Fig. 4 is the control block diagram of the biped robot;
Fig. 5 is the control flow chart.
Specific embodiments
As shown in Fig. 1, the biped robot comprises a trunk 1, a first thigh 2, a second thigh 3, a first shank 4 and a second shank 5. The trunk 1 is connected to the first thigh 2 by a first motor 6 and to the second thigh 3 by a second motor 7; the first thigh 2 is connected to the first shank 4 by a third motor 8, and the second thigh 3 is connected to the second shank 5 by a fourth motor 9. The angles of the trunk 1, first thigh 2, second thigh 3, first shank 4 and second shank 5 with respect to the vertical are $\theta_1$, $\theta_2$, $\theta_3$, $\theta_4$ and $\theta_5$, respectively. The lengths and masses of the trunk 1, first thigh 2, second thigh 3, first shank 4 and second shank 5 are $l_i$ and $m_i$, $i = 1, 2, \ldots, 5$, respectively. To simplify the calculation, the method reduces the robot model to an equivalent inverted pendulum model. The angle between the equivalent inverted pendulum 10 and the vertical is
Figure BDA00003021398700033
The first motor 6, second motor 7, third motor 8 and fourth motor 9 are all MACCEPA (Mechanically Adjustable Compliance and Controllable Equilibrium Position Actuator) compliant drive motors. As shown in Fig. 2, each actuator comprises a first bar 11, a second bar 12 and a connecting auxiliary rod 13. For the first motor 6, for example, the first bar 11 is fixed to the trunk 1 and the second bar 12 is fixed to the first thigh 2; the remaining motors are connected analogously.
The characteristic equation of the MACCEPA compliant actuator is:
$$\tau = -k(\alpha - \phi) - b\dot{\alpha},$$
where $\tau$ is the joint torque, $\alpha$ is the relative joint angle of the biped robot, $\dot{\alpha}$ is the relative joint angular velocity, $k$ is the stiffness, $\phi$ is the joint equilibrium angle, and $b$ is the damping constant of the actuator, taken as a fixed value; $k$ and $\phi$ are adjustable. Each MACCEPA motor therefore has two control variables, $k$ and $\phi$. The control signal input of each motor is connected to a control signal output port of the host computer. The host computer is implemented by an industrial computer, for example a PC104 industrial computer.
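The characteristic equation of the actuator can be evaluated directly. The following is a minimal Python sketch; the function name `maccepa_torque` and all numeric values are illustrative assumptions and are not taken from the patent.

```python
def maccepa_torque(alpha, alpha_dot, k, phi, b):
    """Joint torque tau = -k * (alpha - phi) - b * alpha_dot.

    alpha     : relative joint angle (rad)
    alpha_dot : relative joint angular velocity (rad/s)
    k, phi    : adjustable stiffness and equilibrium angle (the two controls)
    b         : fixed damping constant
    """
    return -k * (alpha - phi) - b * alpha_dot


# Hypothetical values, chosen only to exercise the formula.
tau = maccepa_torque(alpha=0.30, alpha_dot=0.5, k=20.0, phi=0.10, b=0.4)
print(tau)
```

When the joint sits at its equilibrium angle and is at rest, the torque is zero, which is why changing $\phi$ shifts the rest posture while changing $k$ changes how strongly the joint is pulled toward it.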
The control method for fast walking of the under-actuated biped robot according to the present invention comprises the following steps:
Step 1: the host computer acquires the initial state of the robot from the sensors mounted on the trunk and limbs of the under-actuated biped robot, comprising the angles $(\theta_1, \theta_2, \theta_3, \theta_4, \theta_5)$ of the trunk 1, first thigh 2, second thigh 3, first shank 4 and second shank 5 with respect to the vertical, and the corresponding angular velocities
$(\dot{\theta}_1, \dot{\theta}_2, \dot{\theta}_3, \dot{\theta}_4, \dot{\theta}_5)$.
Step 2: model the under-actuated biped robot, as shown in Fig. 1. This comprises establishing the motion control model of the under-actuated biped robot and its equivalent inverted pendulum model. A complete walking cycle of the robot comprises a swing phase and a collision phase. In the swing phase the stance leg is on the ground and rotates forward about its foot end as the pivot, while the swing leg swings to the front of the stance leg until it touches the ground. In the collision phase, at the end of the swing phase, the end of the swing leg collides instantaneously with the ground and, at the same time, the stance leg lifts off. After the collision the stance leg becomes the swing leg and the swing leg becomes the stance leg. One walking step of the robot begins after one collision, passes through a swing phase, and ends after the next collision.
The motion control model of the under-actuated biped robot in the swing phase is:
$$D(\theta)\ddot{\theta} + C(\theta, \dot{\theta})\dot{\theta} + G(\theta) = u,$$
where $D$ is the generalized inertia matrix, $C$ is the centrifugal and Coriolis term, $G$ is the gravity term, $u = (u_1, u_2, u_3, u_4)'$ is the external torque, and $\theta = (\theta_1, \theta_2, \theta_3, \theta_4, \theta_5)'$.
The motion control model of the under-actuated biped robot is converted into the state equation:
$$\dot{x} = f(x) + g(x)u,$$
where:
$$f(x) = \begin{pmatrix} \dot{\theta} \\ D^{-1}(\theta)\left(-C(\theta, \dot{\theta})\dot{\theta} - G(\theta)\right) \end{pmatrix}, \qquad g(x) = \begin{pmatrix} 0 \\ D^{-1}(\theta) \end{pmatrix},$$
Here $x = (\theta_1, \theta_2, \theta_3, \theta_4, \theta_5, \dot{\theta}_1, \dot{\theta}_2, \dot{\theta}_3, \dot{\theta}_4, \dot{\theta}_5)'$, and $f(x)$ and $g(x)$ are nonlinear functions.
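The conversion from the second-order model to the state equation can be illustrated numerically. The sketch below is an assumption-laden example, not the patent's implementation: the callables `D`, `C`, `G` are hypothetical user-supplied functions, and for simplicity the input `u` is taken to have the same dimension as $\theta$ (the patent lists four motor torques for five angles, so a real model would need an additional torque-mapping matrix). A single hanging pendulum stands in for the five-link robot.

```python
import numpy as np


def f(x, D, C, G):
    """Drift term: stacks theta_dot on top of D^{-1}(-C theta_dot - G)."""
    n = x.size // 2
    theta, theta_dot = x[:n], x[n:]
    acc = np.linalg.solve(D(theta), -C(theta, theta_dot) @ theta_dot - G(theta))
    return np.concatenate([theta_dot, acc])


def g(x, D):
    """Input matrix: a zero block stacked on top of D^{-1}(theta)."""
    n = x.size // 2
    return np.vstack([np.zeros((n, n)), np.linalg.inv(D(x[:n]))])


def x_dot(x, u, D, C, G):
    """State derivative x_dot = f(x) + g(x) u."""
    return f(x, D, C, G) + g(x, D) @ u


# Toy stand-in model: unit-mass, unit-length hanging pendulum.
D = lambda th: np.array([[1.0]])
C = lambda th, thd: np.array([[0.0]])
G = lambda th: 9.81 * np.sin(th)

x = np.array([np.pi / 2, 0.0])              # horizontal, at rest
print(x_dot(x, np.array([0.0]), D, C, G))   # free fall: acceleration -9.81
print(x_dot(x, np.array([9.81]), D, C, G))  # torque cancels gravity
```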
The collision between the robot and the ground is a transient process in which, at the end of the swing phase, the end of the swing leg collides instantaneously with the ground. Applying the impulse-momentum theorem gives:
$$\int_{t^-}^{t^+} \left( D(\theta)\ddot{\theta} + C(\theta, \dot{\theta})\dot{\theta} + G(\theta) \right) dt = \int_{t^-}^{t^+} (u + F)\, dt,$$
where $F$ is the external force during the collision and $t^-$, $t^+$ are the instants immediately before and after the collision;
The above equation can be rewritten as:
$$x^+ = \Delta(x^-),$$
For ease of calculation, the robot state $x$ is converted into the equivalent inverted pendulum model. The parameters of the equivalent inverted pendulum model of the under-actuated biped robot comprise the length $L$, the angle and the kinetic energy $E$ of the inverted pendulum;
The center-of-gravity positions $G_i = \begin{pmatrix} G_{x_i} \\ G_{y_i} \end{pmatrix}$, $i = 1, 2, \ldots, 5$, of the trunk 1, first thigh 2, second thigh 3, first shank 4 and second shank 5 of the biped robot are:
$$G_4 = \begin{pmatrix} \frac{l_4}{2}\sin\theta_4 \\ \frac{l_4}{2}\cos\theta_4 \end{pmatrix}$$
$$G_2 = \begin{pmatrix} l_4\sin\theta_4 + \frac{l_2}{2}\sin\theta_2 \\ l_4\cos\theta_4 + \frac{l_2}{2}\cos\theta_2 \end{pmatrix}$$
$$G_1 = \begin{pmatrix} l_4\sin\theta_4 + l_2\sin\theta_2 + \frac{l_1}{2}\sin\theta_1 \\ l_4\cos\theta_4 + l_2\cos\theta_2 + \frac{l_1}{2}\cos\theta_1 \end{pmatrix}$$
$$G_3 = \begin{pmatrix} l_4\sin\theta_4 + l_2\sin\theta_2 - \frac{l_3}{2}\sin\theta_3 \\ l_4\cos\theta_4 + l_2\cos\theta_2 - \frac{l_3}{2}\cos\theta_3 \end{pmatrix}$$
$$G_5 = \begin{pmatrix} l_4\sin\theta_4 + l_2\sin\theta_2 - l_3\sin\theta_3 - \frac{l_5}{2}\sin\theta_5 \\ l_4\cos\theta_4 + l_2\cos\theta_2 - l_3\cos\theta_3 - \frac{l_5}{2}\cos\theta_5 \end{pmatrix}$$
The center-of-gravity position $G = \begin{pmatrix} G_x \\ G_y \end{pmatrix}$ of the equivalent inverted pendulum is $$G = \frac{\sum_{i=1}^{5} m_i G_i}{\sum_{i=1}^{5} m_i},$$
From the position of the center of gravity, the angle of the inverted pendulum
Figure BDA00003021398700059
and its length $L$ can be calculated. The kinetic energy $E$ of the inverted pendulum is the sum of the kinetic energies $E_1, E_2, E_3, E_4, E_5$ of the trunk 1, first thigh 2, second thigh 3, first shank 4 and second shank 5 of the under-actuated biped robot. The equivalent inverted pendulum model
Figure BDA000030213987000510
is computed as:
Figure BDA00003021398700058
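The center-of-gravity calculation for the equivalent inverted pendulum can be sketched as follows. This is a hedged illustration: the function names are hypothetical, the vertical component of every link uses the cosine of its angle (assumed by symmetry across the links), and because the patent's own formula for the pendulum is only available as an image, the length is assumed here to be the distance from the stance foot to the overall center of gravity and the angle is assumed to be measured from the vertical with `arctan2`. The kinetic energy is omitted because it needs link velocities and inertias not given in the text.

```python
import numpy as np


def link_centers(th, l):
    """Centers of gravity of links 1..5; th and l are dicts keyed by link number."""
    d = lambda i: np.array([np.sin(th[i]), np.cos(th[i])])  # unit vector of link i
    knee = l[4] * d(4)                 # top of stance shank 4
    hip = knee + l[2] * d(2)           # top of stance thigh 2
    swing_knee = hip - l[3] * d(3)     # lower end of swing thigh 3
    return {
        4: 0.5 * l[4] * d(4),
        2: knee + 0.5 * l[2] * d(2),
        1: hip + 0.5 * l[1] * d(1),
        3: hip - 0.5 * l[3] * d(3),
        5: swing_knee - 0.5 * l[5] * d(5),
    }


def equivalent_pendulum(th, l, m):
    """Mass-weighted center of gravity, plus assumed pendulum length and angle."""
    G = link_centers(th, l)
    com = sum(m[i] * G[i] for i in G) / sum(m.values())
    length = float(np.hypot(com[0], com[1]))   # assumption: L = |G|
    angle = float(np.arctan2(com[0], com[1]))  # assumption: angle from vertical
    return com, length, angle


# Hypothetical robot: unit lengths and masses, standing fully upright.
links = range(1, 6)
th = {i: 0.0 for i in links}
l = {i: 1.0 for i in links}
m = {i: 1.0 for i in links}
com, L, angle = equivalent_pendulum(th, l, m)
print(com, L, angle)
```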
Step 3: initialize the Q-learning network, which comprises initializing the RBF neural network, the eligibility trace $\Phi_0$ and the action vector A;
A three-layer multi-input multi-output RBF neural network as shown in Fig. 3 is used. Its input is the continuous state vector of the equivalent inverted pendulum and its outputs are the Q values corresponding to the action set.
Input layer: the input is the state vector of the equivalent inverted pendulum
Figure BDA00003021398700064
Hidden layer: the hidden layer uses the Gaussian function
Figure BDA00003021398700061
where $\sigma_j$ and $c_j$ are the width and the center vector of the $j$-th hidden-layer neuron, respectively.
Output layer: the output of the $m$-th output node is the value $Q(s_t, a_m)$ corresponding to the $m$-th action vector $a_m$ of the action vector A at walking step $t$, where $s_t$ is the equivalent inverted pendulum state at step $t$.
The network weight matrix between the hidden layer and the output layer is $W_{jk}$, where $j = 1, 2, \ldots, H$ and $k = 1, 2, \ldots, M$; $H$ is the number of hidden-layer nodes and $M$ is the number of output nodes.
The eligibility trace is defined as:
$$\Phi_t = \sum_{p=1}^{t} \lambda^{t-p}\, \nabla_w Q(s_p, a),$$
where
$$\nabla_w Q(s_p, a) = \frac{\partial Q(s_p, a)}{\partial W_t}.$$
Here $t$ denotes the current step in the walking process, $p$ denotes an earlier step $p$, $s_p$ denotes the equivalent inverted pendulum state of the under-actuated biped robot at step $p$, $W_t$ denotes the network weights at step $t$, and $\lambda$ is the decay rate of the eligibility trace.
The eligibility trace is initialized as $\Phi_0 = 0$.
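The sum that defines $\Phi_t$ does not have to be recomputed from scratch at every step. Splitting off the last term gives the recursion $\Phi_t = \lambda \Phi_{t-1} + \nabla_w Q(s_t, a)$ with $\Phi_0 = 0$. The short sketch below checks this equivalence numerically; the function names and the random gradients are illustrative assumptions only.

```python
import numpy as np


def trace_sum(grads, lam):
    """Phi_t as the explicit sum over p = 1..t of lam**(t - p) * grad_p."""
    t = len(grads)
    return sum(lam ** (t - p) * grad for p, grad in enumerate(grads, start=1))


def trace_recursive(grads, lam):
    """Phi_t via the recursion Phi_t = lam * Phi_{t-1} + grad_t, Phi_0 = 0."""
    phi = np.zeros_like(grads[0])
    for grad in grads:
        phi = lam * phi + grad
    return phi


# Hypothetical gradients for five walking steps.
rng = np.random.default_rng(0)
grads = [rng.normal(size=4) for _ in range(5)]
print(np.allclose(trace_sum(grads, 0.8), trace_recursive(grads, 0.8)))
```

The recursive form needs only the previous trace and the current gradient, so memory use stays constant as the number of walking steps grows.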
The action vector is $A = (k_1, \phi_1, k_2, \phi_2, k_3, \phi_3, k_4, \phi_4)$, where $k_1, k_2, k_3, k_4$ are the stiffnesses of the four motors and $\phi_1, \phi_2, \phi_3, \phi_4$ are the equilibrium angles of the four motors.
Step 4: compute the RBF neural network output $Q(s_t, a)$;
It is calculated as:
Figure BDA00003021398700071
where
Figure BDA00003021398700074
and $w_{mj}$ is the network weight from the $j$-th hidden-layer node to the $m$-th output-layer node.
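A forward pass through such a network can be sketched in a few lines. Because the patent's Gaussian and output formulas are only available as images, the sketch assumes the common form $h_j(s) = \exp(-\lVert s - c_j \rVert^2 / (2\sigma_j^2))$ and a linear output layer $Q(s, a_m) = \sum_j w_{mj} h_j(s)$; the names `rbf_features` and `q_values` and the three-dimensional state are hypothetical.

```python
import numpy as np


def rbf_features(s, centers, widths):
    """Gaussian hidden-layer activations (assumed standard form)."""
    d2 = np.sum((centers - s) ** 2, axis=1)      # squared distance to each center
    return np.exp(-d2 / (2.0 * widths ** 2))


def q_values(s, centers, widths, W):
    """Q(s, a_m) for all M actions; W has shape (M, H) with entries w_mj."""
    return W @ rbf_features(s, centers, widths)


# Hypothetical network: H = 2 hidden nodes, M = 2 candidate actions.
centers = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
widths = np.array([1.0, 1.0])
W = np.eye(2)
s = np.zeros(3)                                  # state sitting on the first center
print(q_values(s, centers, widths, W))
```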
Step 5: select the action vector $a_t$ using an ε-greedy policy;
Combining the ideas of simulated annealing and the Boltzmann-Gibbs distribution with the ε-greedy algorithm, the present invention adopts an ε-decay greedy algorithm in which the random probability ε decays with the number of consecutive walking steps. The decay of the random probability ε is:
$$\varepsilon = \varepsilon_0 \cdot \exp(-\mathrm{step}/N),$$
where $\varepsilon_0$ is an arbitrary constant initial value with $\varepsilon_0 \in (0, 1)$, step is the number of consecutive walking steps, and $N$ is an integer chosen according to the experimental conditions. The action selection module uses the ε-decay greedy algorithm to select the next action $a_t$.
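The decaying exploration rule can be sketched as below. The function names and the default values of `eps0` and `N` are illustrative assumptions; the patent leaves both to the experimenter. With probability ε a random action index is drawn, otherwise the action with the largest Q value is taken.

```python
import math
import random


def epsilon(step, eps0=0.5, N=50):
    """Exploration probability eps = eps0 * exp(-step / N)."""
    return eps0 * math.exp(-step / N)


def select_action(q, step, eps0=0.5, N=50, rng=random):
    """Return an action index using the decaying epsilon-greedy rule."""
    if rng.random() < epsilon(step, eps0, N):
        return rng.randrange(len(q))             # explore: uniform random action
    return max(range(len(q)), key=lambda i: q[i])  # exploit: greedy action


q = [0.1, 0.7, 0.3]                              # hypothetical Q values
print(epsilon(0), epsilon(50), select_action(q, step=10**6))
```

Early in a run the robot explores often; as it strings together more consecutive steps, ε shrinks and the policy becomes almost purely greedy.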
Step 6: run a dynamics simulation of the robot and solve the under-actuated biped robot model according to the following equations to obtain the new states $x_{t+1}$, $s_{t+1}$ and the reward signal (reinforcement value) $r_t$;
$$\begin{cases} \dot{x} = f(x) + g(x)u \\ x^+ = \Delta(x^-) \end{cases}$$
In reinforcement learning the reward signal directly reflects the learning result. When the robot successfully completes a step, the host computer proceeds to test the next walking step; when the robot falls, the host computer restarts with the next trial. If, after a successful step, the robot's angles and angular velocities are identical to those of the previous step, a fixed point is considered to have been found and a larger reward signal is given. The reinforcement value $r$ is set as follows:
Figure BDA00003021398700073
Step 7: update the Q-learning network. This comprises updating the eligibility trace $\Phi_t$, calculating the TD error $e$, and updating the RBF network weights.
The eligibility trace update formula is:
$$\Phi_t = \sum_{p=1}^{t} \lambda^{t-p}\, \nabla_w Q(s_p, a),$$
where:
$$\nabla_w Q(s_p, a) = \frac{\partial Q(s_p, a)}{\partial W_t}.$$
The TD (temporal difference) error $e$ is introduced and back-propagated through the network to correct the weights and thresholds:
$$e = r_t + \gamma \max_{a \in A} Q(s_{t+1}, a) - Q(s_t, a_t),$$
where $r_t$ is the reinforcement value at walking step $t$ of the under-actuated biped robot, $Q(s_t, a_t)$ is the Q value of the action selected at step $t$, and $Q(s_{t+1}, a)$ is the Q value of action $a$ at step $t+1$.
When the neural network weights are corrected, the eligibility trace is incorporated, and the correction of the RBF network weights is:
$$\Delta W_t = \eta\, e\, \Phi_t = \eta \left[ r_t + \gamma \max_{a \in A} Q(s_{t+1}, a) - Q(s_t, a_t) \right] \Phi_t,$$
where $\eta$ is the learning rate and $\gamma$ is the discount factor, both in the interval (0, 1); $r$ is the reinforcement value; $s_t$ is the state of the equivalent inverted pendulum of the biped robot at step $t$ and $s_{t+1}$ is the state at step $t+1$; $a$ is an action and $a_t$ is the action selected at step $t$; $\Phi_t$ is the eligibility trace at step $t$. In the multi-input multi-output RBF neural network, the weight adjustment changes only the network weights associated with the selected action $a_t$; the weights associated with the other actions are not adjusted. The specific weight update formulas are as follows:
The correction formula for the output-layer weight increment $\Delta w_{jk}$ is:
Figure BDA00003021398700085
The correction formula for the width parameter increment $\Delta \sigma_j$ of the hidden-layer nodes is:
Figure BDA00003021398700086
The correction formula for the center vector increment $\Delta c_{ij}$ of the hidden-layer nodes is:
Figure BDA00003021398700091
where $\lambda$ is the decay rate of the eligibility trace, $\eta$ is the learning rate and $\alpha$ is the momentum factor, with $\eta$ and $\alpha$ both in the interval (0, 1),
Figure BDA00003021398700092
is the Gaussian function of the hidden layer, and $t$ denotes the current step in the walking process.
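The core of the update in Step 7 can be sketched for the output-layer weights alone. This is a simplified illustration under stated assumptions: the output layer is taken to be linear in the hidden activations, so the gradient of $Q(s_t, a_t)$ with respect to the selected action's weight row is just the activation vector; the trace is kept in its recursive form; and the momentum term and the width and center corrections are omitted because their formulas are only available as images. The function name `td_update` and the default hyperparameters are hypothetical.

```python
import numpy as np


def td_update(W, trace, h_t, a_t, r_t, h_next, eta=0.1, gamma=0.9, lam=0.8):
    """One TD update of the output weights W (shape M x H).

    h_t, h_next : hidden-layer activations for s_t and s_{t+1}
    a_t         : index of the action selected at step t
    Only the row of W that belongs to a_t is changed.
    """
    q_t = W[a_t] @ h_t
    e = r_t + gamma * np.max(W @ h_next) - q_t   # TD error
    trace = lam * trace                          # decay the old trace
    trace[a_t] += h_t                            # add gradient of Q(s_t, a_t)
    W = W.copy()
    W[a_t] += eta * e * trace[a_t]               # Delta W = eta * e * Phi
    return W, trace, e


# Hypothetical single update: 2 actions, 3 hidden nodes, reward 1.
W0 = np.zeros((2, 3))
W1, trace1, e1 = td_update(W0, np.zeros((2, 3)),
                           h_t=np.array([1.0, 0.0, 0.0]), a_t=0, r_t=1.0,
                           h_next=np.array([0.0, 1.0, 0.0]))
print(e1, W1)
```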
Step 8: repeat steps 4 to 7 until the new state $x_{t+1}$ of the under-actuated biped robot is identical to the previous state $x_t$, that is, until a fixed point is found.
Step 9: the host computer outputs the robot state $x_t$ corresponding to the fixed point and the corresponding action vector $a_t$ to the under-actuated biped robot, controlling it to obtain a stable, fast periodic gait.
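The fixed-point search of Step 8 can be sketched as a loop over a step-to-step map. In floating-point simulation two successive states are rarely bit-for-bit identical, so this sketch assumes a tolerance `tol` that is not part of the patent. The callable `step_fn`, which stands in for "simulate one walking step under the current policy", and the contraction used in the example are hypothetical.

```python
import numpy as np


def is_fixed_point(x_prev, x_new, tol=1e-3):
    """True if successive post-step states agree within tol (assumed criterion)."""
    diff = np.asarray(x_new, dtype=float) - np.asarray(x_prev, dtype=float)
    return bool(np.max(np.abs(diff)) <= tol)


def find_fixed_point(step_fn, x0, max_steps=1000, tol=1e-3):
    """Iterate x_{t+1} = step_fn(x_t) until a fixed point is found or steps run out."""
    x = np.asarray(x0, dtype=float)
    for n in range(1, max_steps + 1):
        x_new = step_fn(x)
        if is_fixed_point(x, x_new, tol):
            return x_new, n
        x = x_new
    return None, max_steps


# Toy step map with a known fixed point at x = 2 (stands in for one walking step).
x_star, n_steps = find_fixed_point(lambda x: 0.5 * x + 1.0, [0.0])
print(x_star, n_steps)
```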
The present invention is a control method for an under-actuated biped robot that uses MACCEPA compliant actuators. These actuators make full use of the biped robot's own dynamics and reduce its energy consumption. They also absorb the impacts that occur when the robot collides with the ground, which gives the robot a degree of protection. The method quickly and successfully controls the under-actuated biped robot to achieve a stable, natural, periodic dynamic gait with low energy consumption.

Claims (6)

1. A control method for fast walking of an under-actuated biped robot, characterized in that it comprises the following steps: Step 1: the host computer acquires the initial state of the robot from the sensors mounted on the trunk and limbs of the under-actuated biped robot, comprising the angles $(\theta_1, \theta_2, \theta_3, \theta_4, \theta_5)$ of the trunk, first thigh, second thigh, first shank and second shank with respect to the vertical, and the corresponding angular velocities
$(\dot{\theta}_1, \dot{\theta}_2, \dot{\theta}_3, \dot{\theta}_4, \dot{\theta}_5)$;
Step 2: model the biped robot, which comprises establishing the motion control model of the under-actuated biped robot and its equivalent inverted pendulum model;
Step 3: initialize the Q-learning network, which comprises initializing the RBF neural network, the eligibility trace $\Phi_0$ and the action vector A;
Step 4: compute the RBF neural network output $Q(s_t, a)$;
Step 5: select the action vector $a_t$ using an ε-greedy policy;
Step 6: run a dynamics simulation of the robot and solve the under-actuated biped robot model to obtain the new states $x_{t+1}$, $s_{t+1}$ and the reward signal (reinforcement value) $r_t$;
Step 7: update the Q-learning network, which comprises updating the eligibility trace $\Phi_t$, calculating the TD error e, and updating the RBF network weights;
Step 8: repeat steps 4 to 7 until the new state $x_{t+1}$ of the under-actuated biped robot is identical to the previous state $x_t$, that is, until a fixed point is found;
Step 9: the host computer outputs the robot state $x_t$ corresponding to the fixed point and the corresponding action vector $a_t$ to the under-actuated biped robot, controlling it to obtain a stable, fast periodic gait.
2. The control method for fast walking of an under-actuated biped robot according to claim 1, characterized in that step 2 is specifically: the motion control model of the under-actuated biped robot in the swing phase is:
$$D(\theta)\ddot{\theta} + C(\theta, \dot{\theta})\dot{\theta} + G(\theta) = u,$$
where $D$ is the generalized inertia matrix, $C$ is the centrifugal and Coriolis term, $G$ is the gravity term, $u = (u_1, u_2, u_3, u_4)'$ is the external torque, and $\theta = (\theta_1, \theta_2, \theta_3, \theta_4, \theta_5)'$,
The motion control model of the under-actuated biped robot is converted into the state equation:
$$\dot{x} = f(x) + g(x)u,$$
where:
Figure FDA00003021398600022
Here,
Figure FDA00003021398600023
and $f(x)$ and $g(x)$ are nonlinear functions;
The collision between the robot and the ground is a transient process in which, at the end of the swing phase, the end of the swing leg collides instantaneously with the ground. Applying the impulse-momentum theorem gives:
Figure FDA00003021398600024
where $F$ is the external force during the collision and $t^-$, $t^+$ are the instants immediately before and after the collision;
The above equation can be rewritten as:
$$x^+ = \Delta(x^-),$$
For ease of calculation, the robot state $x$ is converted into the equivalent inverted pendulum model. The parameters of the equivalent inverted pendulum model of the under-actuated biped robot comprise the length $L$ of the inverted pendulum, the angle
Figure FDA00003021398600028
and the kinetic energy $E$;
The center-of-gravity positions of the trunk 1, first thigh 2, second thigh 3, first shank 4 and second shank 5 of the biped robot
Figure FDA00003021398600025
are:
Figure FDA00003021398600026
Figure FDA00003021398600027
Figure FDA00003021398600031
Figure FDA00003021398600033
The center-of-gravity position of the equivalent inverted pendulum is
Figure FDA00003021398600035
From the position of the center of gravity, the angle of the inverted pendulum
Figure FDA00003021398600038
and its length $L$ can be calculated. The kinetic energy $E$ of the inverted pendulum is the sum of the kinetic energies $E_1, E_2, E_3, E_4, E_5$ of the trunk 1, first thigh 2, second thigh 3, first shank 4 and second shank 5 of the under-actuated biped robot. The equivalent inverted pendulum model
Figure FDA00003021398600039
is computed as:
Figure FDA00003021398600036
3. The control method for fast walking of an under-actuated biped robot according to claim 1, characterized in that step 3 is specifically: a three-layer multi-input multi-output RBF neural network is used, whose input is the continuous state vector of the equivalent inverted pendulum and whose outputs are the Q values corresponding to the action set,
Input layer: the input is the state vector of the equivalent inverted pendulum
Figure FDA000030213986000310
Hidden layer: the hidden layer uses the Gaussian function
Figure FDA00003021398600037
where $\sigma_j$ and $c_j$ are the width and the center vector of the $j$-th hidden-layer neuron, respectively,
Output layer: the output of the $m$-th output node is the value $Q(s_t, a_m)$ corresponding to the $m$-th action vector $a_m$ of the action vector A at walking step $t$, where $s_t$ is the equivalent inverted pendulum state at step $t$,
The network weight matrix between the hidden layer and the output layer is $W_{jk}$, where $j = 1, 2, \ldots, H$ and $k = 1, 2, \ldots, M$; $H$ is the number of hidden-layer nodes and $M$ is the number of output nodes,
The eligibility trace is defined as:
Figure FDA00003021398600041
where
Figure FDA00003021398600042
Here $t$ denotes the current step in the walking process, $p$ denotes an earlier step $p$, $s_p$ denotes the equivalent inverted pendulum state of the under-actuated biped robot at step $p$, $W_t$ denotes the network weights at step $t$, and $\lambda$ is the decay rate of the eligibility trace. The eligibility trace is initialized as $\Phi_0 = 0$,
The action vector is $A = (k_1, \phi_1, k_2, \phi_2, k_3, \phi_3, k_4, \phi_4)$, where $k_1, k_2, k_3, k_4$ are the stiffnesses of the four motors and $\phi_1, \phi_2, \phi_3, \phi_4$ are the equilibrium angles of the four motors.
4. The control method for fast walking of an under-actuated biped robot according to claim 1, characterized in that step 4 is specifically: the RBF neural network output $Q(s_t, a)$ is calculated by the following formula:
Figure FDA00003021398600043
where
Figure FDA00003021398600044
and $w_{mj}$ is the network weight from the $j$-th hidden-layer node to the $m$-th output-layer node.
5. The control method for fast walking of an under-actuated biped robot according to claim 1, characterized in that step 5 is specifically: the action is selected by an ε-decay greedy algorithm in which the random probability ε decays with the number of consecutive walking steps:
$$\varepsilon = \varepsilon_0 \cdot \exp(-\mathrm{step}/N),$$
where $\varepsilon_0$ is an arbitrary constant initial value with $\varepsilon_0 \in (0, 1)$, step is the number of consecutive walking steps, and $N$ is an integer chosen according to the experimental conditions.
6. The control method for fast walking of an under-actuated biped robot according to claim 1, characterized in that step 6 is specifically: run a dynamics simulation of the robot and solve the under-actuated biped robot model according to the following equations to obtain the new states $x_{t+1}$, $s_{t+1}$ and the reward signal (reinforcement value) $r_t$;
Figure FDA00003021398600051
In reinforcement learning the reward signal directly reflects the learning result. When the robot successfully completes a step, the host computer proceeds to test the next walking step; when the robot falls, the host computer restarts with the next trial. If, after a successful step, the robot's angles and angular velocities are identical to those of the previous step, a fixed point is considered to have been found and a larger reward signal is given. The reinforcement value $r$ is set as follows:
Figure FDA00003021398600052
Step 7: update the Q-learning network, which comprises updating the eligibility trace $\Phi_t$, calculating the TD error $e$, and updating the RBF network weights,
The eligibility trace update formula is:
Figure FDA00003021398600053
where:
Figure FDA00003021398600054
The TD (temporal difference) error $e$ is introduced and back-propagated through the network to correct the weights and thresholds:
Figure FDA00003021398600055
where $r_t$ is the reinforcement value at walking step $t$ of the under-actuated biped robot, $Q(s_t, a_t)$ is the Q value of the action selected at step $t$, and $Q(s_{t+1}, a)$ is the Q value of action $a$ at step $t+1$,
When the neural network weights are corrected, the eligibility trace is incorporated, and the correction of the RBF network weights is:
Figure FDA00003021398600061
where $\eta$ is the learning rate and $\gamma$ is the discount factor, both in the interval (0, 1); $r$ is the reinforcement value; $s_t$ is the state of the equivalent inverted pendulum of the biped robot at step $t$ and $s_{t+1}$ is the state at step $t+1$; $a$ is an action and $a_t$ is the action selected at step $t$; $\Phi_t$ is the eligibility trace at step $t$. In the multi-input multi-output RBF neural network, the weight adjustment changes only the network weights associated with the selected action $a_t$; the weights associated with the other actions are not adjusted. The specific weight update formulas are as follows:
The correction formula for the output-layer weight increment $\Delta w_{jk}$ is:
Figure FDA00003021398600062
The correction formula for the width parameter increment $\Delta \sigma_j$ of the hidden-layer nodes is:
Figure FDA00003021398600063
The correction formula for the center vector increment $\Delta c_{ij}$ of the hidden-layer nodes is:
Figure FDA00003021398600064
where $\lambda$ is the decay rate of the eligibility trace, $\eta$ is the learning rate and $\alpha$ is the momentum factor, with $\eta$ and $\alpha$ both in the interval (0, 1),
Figure FDA00003021398600065
is the Gaussian function of the hidden layer, and $t$ denotes the current step in the walking process.
CN201310120251.9A 2013-04-08 2013-04-08 Under-actuated biped robot walking control method Expired - Fee Related CN103204193B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310120251.9A CN103204193B (en) 2013-04-08 2013-04-08 Under-actuated biped robot walking control method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310120251.9A CN103204193B (en) 2013-04-08 2013-04-08 Under-actuated biped robot walking control method

Publications (2)

Publication Number Publication Date
CN103204193A true CN103204193A (en) 2013-07-17
CN103204193B CN103204193B (en) 2015-10-07

Family

ID=48751651

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310120251.9A Expired - Fee Related CN103204193B (en) 2013-04-08 2013-04-08 Under-actuated biped robot walking control method

Country Status (1)

Country Link
CN (1) CN103204193B (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104932264A (en) * 2015-06-03 2015-09-23 华南理工大学 Humanoid robot stable control method of RBF-Q learning frame
CN105329333A (en) * 2015-11-20 2016-02-17 清华大学 Delayed feedback-based biped robot walking non-monocyclic gait control method
CN105938364A (en) * 2016-01-15 2016-09-14 浙江大学 Calculation method of kinetic model of 3D under-actuated biped robot
CN106094813A (en) * 2016-05-26 2016-11-09 华南理工大学 It is correlated with based on model humanoid robot gait's control method of intensified learning
CN106096286A (en) * 2016-06-15 2016-11-09 北京千安哲信息技术有限公司 Clinical path formulating method and device
CN104331081B (en) * 2014-10-10 2017-11-07 北京理工大学 A kind of gait planning method of biped robot inclined-plane walking
CN107891920A (en) * 2017-11-08 2018-04-10 北京理工大学 A kind of leg joint offset angle automatic obtaining method for biped robot
CN111142378A (en) * 2020-01-07 2020-05-12 四川省桑瑞光辉标识系统股份有限公司 Neural network optimization method of biped robot neural network controller
CN111198581A (en) * 2020-01-17 2020-05-26 同济大学 Speed adjusting method and device for virtual passive walking robot and storage medium terminal
CN111891249A (en) * 2020-06-19 2020-11-06 浙江大学 Hydraulic hexapod robot and walking gait control method based on centroid fluctuation
CN112446289A (en) * 2020-09-25 2021-03-05 华南理工大学 Method for improving performance of P300 spelling device
CN112859901A (en) * 2021-01-21 2021-05-28 北京理工大学 Continuous dynamic stable jumping control method of humanoid robot
CN113050409A (en) * 2019-12-28 2021-06-29 深圳市优必选科技股份有限公司 Humanoid robot, control method thereof, and computer-readable storage medium
CN113220004A (en) * 2021-04-15 2021-08-06 海南大熊软件科技有限公司 Gait control method for quadruped robot, and computer-readable storage medium
CN113467235A (en) * 2021-06-10 2021-10-01 清华大学 Biped robot gait control method and control device
CN113467481A (en) * 2021-08-11 2021-10-01 哈尔滨工程大学 Path planning method based on improved Sarsa algorithm

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20080001228A (en) * 2006-06-29 2008-01-03 명지대학교 산학협력단 Method of stair walking for biped robots
CN101323325A (en) * 2008-07-04 2008-12-17 清华大学 Power type walking method of dual-foot robot
KR20100093834A (en) * 2009-02-17 2010-08-26 동아대학교 산학협력단 Method for generating optimal trajectory of a biped robot for walking up a staircase
CN102910218A (en) * 2012-10-17 2013-02-06 同济大学 Double-feet passive walking state control method with knee bending behavior

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
彭自强: "Biped robot control based on Q-learning and neural networks", China Masters' Theses Full-text Database *

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104331081B (en) * 2014-10-10 2017-11-07 北京理工大学 Gait planning method for biped robot walking on an inclined plane
CN104932264B (en) * 2015-06-03 2018-07-20 华南理工大学 Humanoid robot stability control method using a Q-learning framework based on RBF networks
CN104932264A (en) * 2015-06-03 2015-09-23 华南理工大学 Humanoid robot stable control method of RBF-Q learning frame
CN105329333A (en) * 2015-11-20 2016-02-17 清华大学 Delayed feedback-based biped robot walking non-monocyclic gait control method
CN105938364A (en) * 2016-01-15 2016-09-14 浙江大学 Calculation method of kinetic model of 3D under-actuated biped robot
CN105938364B (en) * 2016-01-15 2018-09-25 浙江大学 Calculation method of kinetic model of 3D under-actuated biped robot
CN106094813A (en) * 2016-05-26 2016-11-09 华南理工大学 Humanoid robot gait control method based on model-related reinforcement learning
CN106094813B (en) * 2016-05-26 2019-01-18 华南理工大学 Humanoid robot gait control method based on model-related reinforcement learning
CN106096286A (en) * 2016-06-15 2016-11-09 北京千安哲信息技术有限公司 Clinical path formulating method and device
CN107891920A (en) * 2017-11-08 2018-04-10 北京理工大学 Automatic leg joint offset angle acquisition method for biped robot
CN113050409A (en) * 2019-12-28 2021-06-29 深圳市优必选科技股份有限公司 Humanoid robot, control method thereof, and computer-readable storage medium
CN113050409B (en) * 2019-12-28 2023-12-01 深圳市优必选科技股份有限公司 Humanoid robot, control method thereof and computer-readable storage medium
CN111142378A (en) * 2020-01-07 2020-05-12 四川省桑瑞光辉标识系统股份有限公司 Neural network optimization method of biped robot neural network controller
CN111198581B (en) * 2020-01-17 2021-02-12 同济大学 Speed adjusting method and device for virtual passive walking robot and storage medium terminal
CN111198581A (en) * 2020-01-17 2020-05-26 同济大学 Speed adjusting method and device for virtual passive walking robot and storage medium terminal
CN111891249A (en) * 2020-06-19 2020-11-06 浙江大学 Hydraulic hexapod robot and walking gait control method based on centroid fluctuation
CN112446289A (en) * 2020-09-25 2021-03-05 华南理工大学 Method for improving performance of P300 spelling device
CN112446289B (en) * 2020-09-25 2023-08-22 华南理工大学 Method for improving P300 spelling device performance
CN112859901A (en) * 2021-01-21 2021-05-28 北京理工大学 Continuous dynamic stable jumping control method of humanoid robot
CN113220004A (en) * 2021-04-15 2021-08-06 海南大熊软件科技有限公司 Gait control method for quadruped robot, and computer-readable storage medium
CN113467235A (en) * 2021-06-10 2021-10-01 清华大学 Biped robot gait control method and control device
CN113467235B (en) * 2021-06-10 2022-09-02 清华大学 Biped robot gait control method and control device
CN113467481A (en) * 2021-08-11 2021-10-01 哈尔滨工程大学 Path planning method based on improved Sarsa algorithm

Also Published As

Publication number Publication date
CN103204193B (en) 2015-10-07

Similar Documents

Publication Publication Date Title
CN103204193A (en) Under-actuated biped robot walking control method
Dadashzadeh et al. From template to anchor: A novel control strategy for spring-mass running of bipedal robots
Fujiwara et al. Towards an optimal falling motion for a humanoid robot
Park et al. Switching control design for accommodating large step-down disturbances in bipedal robot walking
Remy et al. A matlab framework for efficient gait creation
Deng et al. Level-ground walking for a bipedal robot with a torso via hip series elastic actuators and its gait bifurcation control
Visser et al. Control strategy for energy-efficient bipedal walking with variable leg stiffness
CN103750927B (en) Artificial leg knee joint adaptive iterative learning control method
CN104898672A (en) Optimized control method of humanoid robot walking track
Ma et al. Efficient HZD gait generation for three-dimensional underactuated humanoid running
Added et al. Control of the passive dynamic gait of the bipedal compass-type robot through trajectory tracking
Nakada et al. Learning arm motion strategies for balance recovery of humanoid robots
Chen et al. A strategy for push recovery in quadruped robot based on reinforcement learning
CN105329333B (en) Delayed feedback-based non-monocyclic gait control method for biped robot walking
An et al. Gait transition of quadruped robot using rhythm control and stability analysis
Godage et al. Energy based control of compass gait soft limbed bipeds
Chen et al. Realization of complex terrain and disturbance adaptation for hydraulic quadruped robot under flying trot gait
Yi et al. Variable speed running on kneed biped robot with underactuation degree two
Yang et al. Truncated Fourier series formulation for bipedal walking balance control
Takano et al. Analysis of biped running with rotational inerter
Kobayashi et al. Optimal use of arm-swing for bipedal walking control
Senthilkumar et al. Model-based Controller Design for Automatic Two Wheeled Self Balancing Robot
Chew et al. Frontal plane algorithms for dynamic bipedal walking
Danesh et al. Stabilization of Unstable Limit Cycles in a Push-off Based Dynamic Walker by Reversible Switching Surfaces
CN105467841A (en) Artificial neural control method for upper limb motions of humanoid robot

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 2015-10-07

Termination date: 2019-04-08
