CN103204193A — Underactuated biped robot walking control method — Google Patents
Publication number: CN103204193A
Application number: CN201310120251.9
Authority: CN (China)
Abstract
The invention discloses an underactuated biped robot walking control method, aimed at the problem of planar walking control of a biped robot. By adopting MACCEPA flexible actuators and exploiting the biped robot's own dynamics, fast walking is achieved effectively. Through continuous interaction with the ground, the robot learns to walk autonomously by trial and error, fully exploiting the trial-and-error learning capability of the Q-learning method; stable, natural, periodic fast walking of the robot is achieved, and the method has high application value.
Description
Technical field
The present invention relates to a dynamic walking method for biped robots, and in particular to a walking control method for an underactuated biped robot.
Background technology
At present, biped robot walking methods mainly comprise ZMP-criterion walking and limit cycle walking. ZMP-criterion walking requires the zero moment point (ZMP) of the robot to remain at all times inside the support polygon formed by the two feet; by planning joint motion trajectories under the ZMP stability criterion, a robot can realize walking with multiple gaits. The best-known success of ZMP-criterion walking is ASIMO of Honda, Japan. However, ZMP-criterion walking imposes many artificial constraints: it uses high-stiffness, high-inertia, high-gain motors to track desired trajectories accurately, does not fully exploit the robot's own dynamics, and therefore consumes much energy. In addition, the ZMP criterion applies only to robots with flat feet; for robots without soles, or with arc feet, the ZMP point cannot be defined. Trajectory planning and tracking based on the ZMP stability criterion has been widely and successfully applied to traditional biped robots, but is unsuitable for passive-dynamic robots.
Limit cycle walking is a new walking theory that appeared at the end of the twentieth century, inspired by human walking. Its walking is cyclically stable: the gait sequence forms a stable limit cycle in state space, while no instant of the gait cycle need be locally stable. This approach imposes fewer artificial constraints on the robot and can fully exploit the robot's own dynamics, thus achieving higher energy efficiency, walking speed and disturbance rejection. A successful underactuated biped robot based on the limit cycle walking principle is the biped robot of Cornell University; however, that robot uses PD controllers whose parameters must be tuned by hand, which requires an enormous amount of work.
The servomotors of traditional rigid actuators have high inertia and high energy consumption, cannot fully exploit the robot's own dynamics, and are unsuitable for underactuated walking control. By contrast, a flexible actuator can be regarded as a special spring that makes full use of the biped robot's dynamics. Moreover, fast walking or running of a biped robot involves impacts; a flexible actuator can absorb impacts effectively, which helps realize fast walking.
Summary of the invention
The objective of the present invention is to address the deficiencies of the prior art by providing a walking control method for an underactuated biped robot.
The objective is achieved through the following technical solution: a fast-walking control method for an underactuated biped robot, comprising the following steps:
Step 1: the host computer collects the robot's initial state from sensors mounted on the trunk and limbs of the underactuated biped robot, including the angles (θ_1, θ_2, θ_3, θ_4, θ_5) between the vertical direction and the trunk, first thigh, second thigh, first shank and second shank, and the corresponding angular velocities;
Step 2: model the biped robot, including establishing the motion control model of the underactuated biped robot and its equivalent inverted pendulum model;
Step 3: initialize the Q-learning network, including initializing the RBF neural network, the eligibility trace Φ_0 and the action vector A;
Step 4: compute the RBF neural network output Q(s_t, a);
Step 5: select the action vector a_t with an ε-greedy policy;
Step 6: run a dynamics simulation of the robot, solving the underactuated biped robot model according to the formula below to obtain the new states x_{t+1} and s_{t+1} and the reinforcement (reward) value r_t;
Step 7: update the Q-learning network, including updating the eligibility trace Φ_t, computing the TD error e and updating the RBF network weights;
Step 8: repeat steps 4–7 until the new state x_{t+1} of the underactuated biped robot is identical to the previous state x_t, i.e., a fixed point is found;
Step 9: the host computer outputs the state x_t corresponding to the fixed point and the corresponding action vector a_t to the underactuated biped robot, controlling the robot to obtain a stable, fast periodic gait.
The beneficial effects of the invention are as follows: the present invention is an underactuated biped robot control method adopting MACCEPA flexible actuators. By adopting MACCEPA flexible actuators, the biped robot's own dynamics can be fully exploited and the robot's energy consumption is reduced; moreover, the impact at each collision is effectively absorbed, giving the robot a certain degree of protection. The method has the advantages of quickly and successfully controlling the biped robot to realize a stable, natural, periodic dynamic gait with low energy consumption.
Description of drawings
Fig. 1 is a diagram of the underactuated biped robot and its equivalent inverted pendulum model;
Fig. 2 is a schematic diagram of the MACCEPA actuator;
Fig. 3 is a schematic diagram of the RBF neural network;
Fig. 4 is the control block diagram of the biped robot;
Fig. 5 is the control flow chart.
The specific embodiment
As shown in Fig. 1, the biped robot comprises a trunk 1, a first thigh 2, a second thigh 3, a first shank 4 and a second shank 5. The trunk 1 is connected to the first thigh 2 through a first motor 6 and to the second thigh 3 through a second motor 7; the first thigh 2 is connected to the first shank 4 through a third motor 8, and the second thigh 3 is connected to the second shank 5 through a fourth motor 9. The angles between the vertical direction and the trunk 1, first thigh 2, second thigh 3, first shank 4 and second shank 5 are θ_1, θ_2, θ_3, θ_4 and θ_5, respectively. The lengths and masses of the trunk 1, first thigh 2, second thigh 3, first shank 4 and second shank 5 are l_i and m_i, i = 1, 2, …, 5. To simplify the calculation, the method approximates the robot model by an equivalent inverted pendulum model; the equivalent inverted pendulum 10 makes an angle with the vertical direction as shown in Fig. 1.
The first motor 6, second motor 7, third motor 8 and fourth motor 9 all adopt MACCEPA (Mechanically Adjustable Compliance and Controllable Equilibrium Position Actuator) compliant drive motors. As shown in Fig. 2, the first motor 6 comprises a first rod 11, a second rod 12 and an auxiliary rod 13 for connection; the first rod 11 is fixedly connected to the trunk 1 and the second rod 12 is fixedly connected to the first thigh 2, and the connection relations of the remaining motors follow by analogy.
The characteristic equation of the MACCEPA flexible actuator is as follows:
In the formula, τ is the joint torque, α is the relative joint angle of the biped robot, α̇ is the relative joint angular velocity, k is the elastic modulus, φ is the joint equilibrium angle, and b is the damping constant of the actuator, which takes a fixed value; k and φ are adjustable. Thus each MACCEPA motor has two control quantities, k and φ. The control signal input of each motor is connected to a control signal output of the host computer; the host computer is implemented by an industrial PC, for example a PC104 industrial computer.
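Since the actuator's characteristic equation itself is given only in the original figures, here is a hedged sketch of a MACCEPA-style joint torque, assuming a simple linearized spring-damper form with adjustable stiffness k and equilibrium angle φ (the patent's exact expression may differ):

```python
def maccepa_torque(alpha, alpha_dot, k, phi, b=0.05):
    """Assumed linearized spring-damper torque for a MACCEPA-style
    compliant joint: the spring pulls the joint toward the adjustable
    equilibrium angle phi with adjustable stiffness k; b is the fixed
    damping constant (b=0.05 is an illustrative value)."""
    return k * (phi - alpha) - b * alpha_dot
```

Note that the two control quantities per motor are exactly k and φ, matching the text.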
The fast-walking control method for the underactuated biped robot of the present invention comprises the following steps:
Step 1: the host computer collects the robot's initial state from sensors mounted on the trunk and limbs of the underactuated biped robot, including the angles (θ_1, θ_2, θ_3, θ_4, θ_5) between the vertical direction and the trunk 1, first thigh 2, second thigh 3, first shank 4 and second shank 5, and the corresponding angular velocities.
Step 2: model the underactuated biped robot, as shown in Fig. 1, including establishing the motion control model of the underactuated biped robot and its equivalent inverted pendulum model. A complete walking cycle of the robot comprises a swing process and a collision process. In the swing process the support leg stays on the ground and rotates forward about its tip, while the swing leg swings to the front of the support leg until it contacts the ground. The collision process refers to the instantaneous collision between the tip of the swing leg and the ground at the end of the swing process; at the same moment, the support leg leaves the ground. After the collision, the support leg becomes the swing leg and the swing leg becomes the support leg. One walking step of the robot is the interval from the end of one collision, through the swing process, to the end of the next collision.
The motion control model of the underactuated biped robot during the swing process is:
where D is the generalized inertia matrix, C is the centrifugal and Coriolis term, G is the gravity term, u = (u_1, u_2, u_3, u_4)′ is the external torque vector, and θ = (θ_1, θ_2, θ_3, θ_4, θ_5)′.
The motion control model of the underactuated biped robot is converted into the state equation:
ẋ = f(x) + g(x)u,
where x = (θ_1, θ_2, θ_3, θ_4, θ_5, θ̇_1, θ̇_2, θ̇_3, θ̇_4, θ̇_5)′, and f(x) and g(x) are nonlinear functions.
The collision process between the robot and the ground is a transient process: at the end of the swing process the tip of the swing leg collides instantaneously with the ground. Applying the impulse theorem gives:
where F is the external force during the collision and t⁻, t⁺ are the instants immediately before and after the collision.
The above formula can be rewritten as:
x⁺ = Δ(x⁻),
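The hybrid structure just described — a continuous swing phase terminated by the instantaneous impact map x⁺ = Δ(x⁻) — can be sketched as follows; f, impact_map and touchdown are placeholders for the swing dynamics, the collision map Δ and the ground-contact test, and forward Euler integration is used only for illustration:

```python
def walk_one_step(x0, f, impact_map, touchdown, dt=1e-3, t_max=2.0):
    """One walking step of the hybrid model: integrate the swing
    dynamics x' = f(x) until the swing-leg tip touches the ground,
    then apply the impact map x+ = Delta(x-)."""
    x = list(x0)
    t = 0.0
    while t < t_max:
        dx = f(x)
        x = [xi + dt * di for xi, di in zip(x, dx)]  # swing phase
        t += dt
        if touchdown(x):           # swing-leg tip hits the ground
            return impact_map(x)   # instantaneous collision: x+ = Delta(x-)
    return x                       # no touchdown within t_max
```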
For convenience of calculation, the robot state x needs to be converted into the equivalent inverted pendulum model. The parameters of the equivalent inverted pendulum model of the underactuated biped robot comprise the length L of the pendulum, its angle with the vertical direction, and its kinetic energy E.
The centers of gravity G_i = (G_{x_i}, G_{y_i})′, i = 1, 2, …, 5, of the trunk 1, first thigh 2, second thigh 3, first shank 4 and second shank 5 are given by:
The center of gravity G = (G_x, G_y)′ of the equivalent inverted pendulum is
G = (Σ_{i=1}^{5} m_i G_i) / (Σ_{i=1}^{5} m_i).
From the position of the center of gravity, the angle and the length L of the inverted pendulum can be calculated; the kinetic energy E of the inverted pendulum is the sum of the kinetic energies E_1, E_2, E_3, E_4, E_5 of the trunk 1, first thigh 2, second thigh 3, first shank 4 and second shank 5. The computing formula of the equivalent inverted pendulum model is:
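A sketch of the mapping from the links' masses and centers of gravity to the equivalent pendulum's length L and angle (the support point is assumed at the origin, and the angle is measured from the vertical, per Fig. 1):

```python
import math

def equivalent_pendulum(masses, cogs):
    """Map link masses and per-link centers of gravity to the
    equivalent inverted pendulum: the pendulum tip is at the
    mass-weighted center of gravity G = sum(m_i * G_i) / sum(m_i).

    masses: [m_1, ...]; cogs: [(Gx_i, Gy_i), ...] with the support
    point at (0, 0). Returns (L, angle from the vertical)."""
    m_total = sum(masses)
    gx = sum(m * x for m, (x, y) in zip(masses, cogs)) / m_total
    gy = sum(m * y for m, (x, y) in zip(masses, cogs)) / m_total
    L = math.hypot(gx, gy)       # pendulum length: distance to G
    angle = math.atan2(gx, gy)   # angle from the vertical direction
    return L, angle
```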
Step 3: initialize the Q-learning network, including initializing the RBF neural network, the eligibility trace Φ_0 and the action vector A.
A three-layer multi-input multi-output RBF neural network as shown in Fig. 3 is adopted; its input is the continuous state vector of the equivalent inverted pendulum, and its output is the Q value corresponding to each action in the action set.
Input layer: the input is the state vector of the equivalent inverted pendulum.
Hidden layer: the hidden layer uses Gaussian functions, where the center width and center vector of the j-th hidden neuron are σ_j and c_j, respectively.
Output layer: the output value of the m-th output node is Q(s_t, a_m), the Q value corresponding to the m-th action vector a_m in the action set A at walking step t, where s_t is the equivalent inverted pendulum state at step t.
The network weight matrix between the hidden layer and the output layer is W_{jk}, where j = 1, 2, …, H and k = 1, 2, …, M; H is the number of hidden nodes and M is the number of output nodes.
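A forward pass of such a network might look like the following sketch; the Gaussian form h_j = exp(−‖s − c_j‖² / 2σ_j²) is a common RBF convention assumed here, since the text does not spell out the exact normalization:

```python
import math

def rbf_q_values(s, centers, widths, W):
    """Forward pass of the 3-layer RBF Q-network: Gaussian hidden
    activations, then Q(s, a_m) = sum_j W[j][m] * h_j per action.

    s: state vector; centers/widths: c_j and sigma_j per hidden node;
    W: H x M weight matrix between hidden and output layers."""
    h = [
        math.exp(-sum((si - ci) ** 2 for si, ci in zip(s, c))
                 / (2.0 * w * w))
        for c, w in zip(centers, widths)
    ]
    n_actions = len(W[0])
    return [sum(W[j][m] * h[j] for j in range(len(h)))
            for m in range(n_actions)]
```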
The eligibility trace is defined as:
where t denotes the current step t of the walking process, p denotes a step p walked before, s_p denotes the equivalent inverted pendulum state at step p, W_t denotes the network weights at step t, and λ is the eligibility trace discount rate.
The eligibility trace is initialized as Φ_0 = 0.
The action vector is A = (k_1, φ_1, k_2, φ_2, k_3, φ_3, k_4, φ_4), where k_1, k_2, k_3, k_4 are the elastic moduli of the four motors and φ_1, φ_2, φ_3, φ_4 are the equilibrium angles of the four motors.
Step 4: compute the RBF neural network output Q(s_t, a), which can be calculated as:
where w_{mj} is the network weight from the j-th hidden node to the m-th output node.
Step 5: select the action vector a_t with an ε-greedy policy.
Drawing on the ideas of simulated annealing and the Boltzmann–Gibbs distribution, the present invention adopts an ε-decay greedy algorithm in which the random probability ε decays with the number of consecutive steps:
ε = ε_0 · exp(−step/N),
where ε_0 ∈ (0, 1) is an arbitrary constant initial value, step is the number of consecutive walking steps, and N is an integer chosen according to the experimental situation. The action-selection module uses this ε-decay greedy algorithm to select the next action a_t.
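A sketch of this decayed ε-greedy rule over the Q values of the discrete action set; eps0=0.5 and N=100 are illustrative values, since the patent leaves ε_0 and N experiment-defined:

```python
import math
import random

def select_action(q_values, step, eps0=0.5, N=100, rng=random):
    """epsilon-decay greedy selection: with probability
    eps = eps0 * exp(-step / N) pick a random action index,
    otherwise pick the greedy (maximum-Q) action."""
    eps = eps0 * math.exp(-step / N)
    if rng.random() < eps:
        return rng.randrange(len(q_values))                     # explore
    return max(range(len(q_values)), key=q_values.__getitem__)  # exploit
```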
Step 6: run a dynamics simulation of the robot, solving the underactuated biped robot model according to the formula above to obtain the new states x_{t+1} and s_{t+1} and the reinforcement (reward) value r_t.
The reward signal directly reflects the learning effect in reinforcement learning. When the robot successfully completes a step, the host computer continues with the next walking trial; when the robot falls, the host computer restarts the next round of trials. If, after a successful step, the robot's angles and angular velocities are identical to those of the previous step, a fixed point is considered found, and a larger reward signal is given. The reinforcement value r in the above reinforcement learning is set as follows:
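Since the table of reinforcement values is given only in the original figures, here is a hedged sketch of the scheme the paragraph describes; the numeric values −1, 1 and 10 are illustrative assumptions, not the patent's actual settings:

```python
def reward(fell, found_fixed_point):
    """Reward scheme sketch: falling is penalized, a successful step
    gets a small positive reward, and reaching a fixed point (same
    state as the previous step) gets a larger one."""
    if fell:
        return -1.0        # fall: restart the trial
    if found_fixed_point:
        return 10.0        # periodic gait found: larger reward
    return 1.0             # ordinary successful step
```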
Step 7: update the Q-learning network, including updating the eligibility trace Φ_t, computing the TD error e, and updating the RBF network weights.
The eligibility trace update formula is:
where:
The TD (Temporal Difference) error e is introduced and back-propagated through the network to correct the weights and thresholds:
where r_t denotes the reinforcement value of walking step t of the underactuated biped robot, Q(s_t, a_t) denotes the Q value of the action selected at step t, and Q(s_{t+1}, a) denotes the Q value of the action selected at step t+1.
When correcting the neural network weights, combined with the eligibility trace, the error formula for the RBF network weights is:
where η is the learning rate and γ is the discount factor, both in the interval (0, 1); r is the reinforcement value, s_t is the state of the equivalent inverted pendulum at step t, s_{t+1} is the state at step t+1, a is an action, a_t denotes the action selected at step t, and Φ_t is the eligibility trace at step t. In the multi-input multi-output RBF neural network, the weight adjustment only updates the weights corresponding to the selected action a_t; the weights corresponding to the other actions are not adjusted. The concrete weight update formulas are as follows:
The error correction formula for the output-layer weight increment Δw_{jk} is:
The error correction formula for the width parameter increment Δσ_j of a hidden node is:
The error correction formula for the center vector increment Δc_{ij} of a hidden node is:
where λ is the eligibility trace discount rate, η is the learning rate, α is the momentum factor, η and α are both in the interval (0, 1), the Gaussian function of the hidden layer is as defined above, and t denotes the current step t of the walking process.
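The TD-error and eligibility-trace correction for the output layer can be sketched as below. Only the selected action a_t accumulates trace, so only its weight column (and columns of previously selected actions, through their decaying traces) is adjusted, matching the text. The hidden-layer center and width updates are omitted for brevity, and the η, γ, λ defaults are illustrative values in (0, 1):

```python
def td_update(W, h, a_t, r_t, q_sa, q_next, trace,
              eta=0.1, gamma=0.95, lam=0.8):
    """One TD(lambda) update of the output-layer weights:
    e = r_t + gamma * max_a Q(s_{t+1}, a) - Q(s_t, a_t).

    W: H x M weight matrix (lists), h: hidden activations (len H),
    trace: H x M eligibility traces, q_next: Q values at s_{t+1}."""
    e = r_t + gamma * max(q_next) - q_sa        # TD error
    H, M = len(W), len(W[0])
    for j in range(H):
        for m in range(M):
            trace[j][m] = gamma * lam * trace[j][m]  # decay all traces
            if m == a_t:
                trace[j][m] += h[j]             # bump the chosen action
            W[j][m] += eta * e * trace[j][m]    # weight correction
    return e
```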
Step 8: repeat steps 4–7 until the new state x_{t+1} of the underactuated biped robot is identical to the previous state x_t, i.e., a fixed point is found.
Step 9: the host computer outputs the state x_t corresponding to the fixed point and the corresponding action vector a_t to the underactuated biped robot, controlling the robot to obtain a stable, fast periodic gait.
The present invention is an underactuated biped robot control method adopting MACCEPA flexible actuators. The MACCEPA flexible actuator fully exploits the biped robot's own dynamics and reduces the robot's energy consumption; it also effectively absorbs the impact of the robot's collisions, giving the robot a certain degree of protection. The method has the advantages of quickly and successfully controlling the underactuated biped robot to realize a stable, natural, periodic dynamic gait with low energy consumption.
Claims (6)
1. A fast-walking control method for an underactuated biped robot, characterized by comprising the steps of:
Step 1: the host computer collects the robot's initial state from sensors mounted on the trunk and limbs of the underactuated biped robot, including the angles (θ_1, θ_2, θ_3, θ_4, θ_5) between the vertical direction and the trunk, first thigh, second thigh, first shank and second shank, and the corresponding angular velocities;
Step 2: model the biped robot, including establishing the motion control model of the underactuated biped robot and its equivalent inverted pendulum model;
Step 3: initialize the Q-learning network, including initializing the RBF neural network, the eligibility trace Φ_0 and the action vector A;
Step 4: compute the RBF neural network output Q(s_t, a);
Step 5: select the action vector a_t with an ε-greedy policy;
Step 6: run a dynamics simulation of the robot, solving the underactuated biped robot model according to the formula below to obtain the new states x_{t+1} and s_{t+1} and the reinforcement value r_t;
Step 7: update the Q-learning network, including updating the eligibility trace Φ_t, computing the TD error e and updating the RBF network weights;
Step 8: repeat steps 4–7 until the new state x_{t+1} of the underactuated biped robot is identical to the previous state x_t, i.e., a fixed point is found;
Step 9: the host computer outputs the state x_t corresponding to the fixed point and the corresponding action vector a_t to the underactuated biped robot, controlling the robot to obtain a stable, fast periodic gait.
2. The fast-walking control method for an underactuated biped robot according to claim 1, characterized in that said step 2 is specifically: the motion control model of the underactuated biped robot during the swing process is:
where D is the generalized inertia matrix, C is the centrifugal and Coriolis term, G is the gravity term, u = (u_1, u_2, u_3, u_4)′ is the external torque vector, and θ = (θ_1, θ_2, θ_3, θ_4, θ_5)′;
the motion control model of the underactuated biped robot is converted into the state equation:
where f(x) and g(x) are nonlinear functions;
the collision process between the robot and the ground is a transient process: at the end of the swing process the tip of the swing leg collides instantaneously with the ground; applying the impulse theorem gives:
where F is the external force during the collision and t⁻, t⁺ are the instants immediately before and after the collision;
the above formula can be rewritten as x⁺ = Δ(x⁻);
for convenience of calculation, the robot state x is converted into the equivalent inverted pendulum model; the parameters of the equivalent inverted pendulum model of the underactuated biped robot comprise the length L of the pendulum, its angle with the vertical direction, and its kinetic energy E;
the centers of gravity G_i, i = 1, 2, …, 5, of the trunk 1, first thigh 2, second thigh 3, first shank 4 and second shank 5 are given by:
the center of gravity G of the equivalent inverted pendulum is
G = (Σ_{i=1}^{5} m_i G_i) / (Σ_{i=1}^{5} m_i);
from the position of the center of gravity, the angle and the length L of the inverted pendulum can be calculated; the kinetic energy E of the inverted pendulum is the sum of the kinetic energies E_1, E_2, E_3, E_4, E_5 of the trunk 1, first thigh 2, second thigh 3, first shank 4 and second shank 5; the computing formula of the equivalent inverted pendulum model is:
3. The fast-walking control method for an underactuated biped robot according to claim 1, characterized in that said step 3 is specifically: a three-layer multi-input multi-output RBF neural network is adopted, whose input is the continuous state vector of the equivalent inverted pendulum and whose output is the Q value corresponding to each action in the action set;
input layer: the input is the state vector of the equivalent inverted pendulum;
hidden layer: the hidden layer uses Gaussian functions, where the center width and center vector of the j-th hidden neuron are σ_j and c_j, respectively;
output layer: the output value of the m-th output node is Q(s_t, a_m), the Q value corresponding to the m-th action vector a_m in the action set A at walking step t, where s_t is the equivalent inverted pendulum state at step t;
the network weight matrix between the hidden layer and the output layer is W_{jk}, where j = 1, 2, …, H and k = 1, 2, …, M; H is the number of hidden nodes and M is the number of output nodes;
the eligibility trace is defined as:
where t denotes the current step t of the walking process, p denotes a step p walked before, s_p denotes the equivalent inverted pendulum state at step p, W_t denotes the network weights at step t, and λ is the eligibility trace discount rate; the eligibility trace is initialized as Φ_0 = 0;
the action vector is A = (k_1, φ_1, k_2, φ_2, k_3, φ_3, k_4, φ_4), where k_1, k_2, k_3, k_4 are the elastic moduli of the four motors and φ_1, φ_2, φ_3, φ_4 are the equilibrium angles of the four motors.
4. The fast-walking control method for an underactuated biped robot according to claim 1, characterized in that said step 4 is specifically: compute the RBF neural network output Q(s_t, a) by the following formula:
where w_{mj} is the network weight from the j-th hidden node to the m-th output node.
5. The fast-walking control method for an underactuated biped robot according to claim 1, characterized in that said step 5 is specifically: an ε-decay greedy algorithm in which the random probability ε decays with the number of consecutive steps is adopted to select the action:
ε = ε_0 · exp(−step/N),
where ε_0 ∈ (0, 1) is an arbitrary constant initial value, step is the number of consecutive walking steps, and N is an integer chosen according to the experimental situation.
6. The fast-walking control method for an underactuated biped robot according to claim 1, characterized in that said step 6 is specifically: run a dynamics simulation of the robot, solving the underactuated biped robot model according to the formula above to obtain the new states x_{t+1} and s_{t+1} and the reinforcement value r_t;
the reward signal directly reflects the learning effect in reinforcement learning: when the robot successfully completes a step, the host computer continues with the next walking trial; when the robot falls, the host computer restarts the next round of trials; if, after a successful step, the robot's angles and angular velocities are identical to those of the previous step, a fixed point is considered found and a larger reward signal is given; the reinforcement value r in the above reinforcement learning is set as follows:
step 7: update the Q-learning network, including updating the eligibility trace Φ_t, computing the TD error e and updating the RBF network weights;
the eligibility trace update formula is:
where:
the TD (Temporal Difference) error e is introduced and back-propagated through the network to correct the weights and thresholds:
where r_t denotes the reinforcement value of walking step t of the underactuated biped robot, Q(s_t, a_t) denotes the Q value of the action selected at step t, and Q(s_{t+1}, a) denotes the Q value of the action selected at step t+1;
when correcting the neural network weights, combined with the eligibility trace, the error formula for the RBF network weights is:
where η is the learning rate and γ is the discount factor, both in the interval (0, 1); r is the reinforcement value, s_t is the state of the equivalent inverted pendulum at step t, s_{t+1} is the state at step t+1, a is an action, a_t denotes the action selected at step t, and Φ_t is the eligibility trace at step t; in the multi-input multi-output RBF neural network, the weight adjustment only updates the weights corresponding to the selected action a_t, and the weights corresponding to the other actions are not adjusted; the concrete weight update formulas are as follows:
the error correction formula for the output-layer weight increment Δw_{jk} is:
the error correction formula for the width parameter increment Δσ_j of a hidden node is:
the error correction formula for the center vector increment Δc_{ij} of a hidden node is:
where λ is the eligibility trace discount rate, η is the learning rate, α is the momentum factor, η and α are both in the interval (0, 1), the Gaussian function of the hidden layer is as defined above, and t denotes the current step t of the walking process.
Priority Applications (1)
CN201310120251.9A (granted as CN103204193B), priority date 2013-04-08, filing date 2013-04-08: A kind of underactuated biped robot walking control method
Publications (2)
CN103204193A, published 2013-07-17
CN103204193B, granted 2015-10-07
Family ID: 48751651
Also Published As
Publication number  Publication date 

CN103204193B (en)  2015-10-07 
Similar Documents
Publication  Publication Date  Title 

Geng et al.  Fast biped walking with a sensor-driven neuronal controller and real-time online learning  
Vanderborght et al.  Development of a compliance controller to reduce energy consumption for bipedal robots  
Hamed et al.  Event-based stabilization of periodic orbits for underactuated 3D bipedal robots with left-right symmetry  
Plestan et al.  Stable walking of a 7-DOF biped robot  
Stephens  Humanoid push recovery  
US8781624B2 (en)  Systems and methods for tracking and balancing robots for imitating motion capture data  
Byl et al.  Approximate optimal control of the compass gait on rough terrain  
Ono et al.  Self-excited walking of a biped mechanism with feet  
Van Der Linde  Active leg compliance for passive walking  
Park et al.  General ZMP preview control for bipedal walking  
Vanderborght et al.  Overview of the Lucy project: Dynamic stabilization of a biped powered by pneumatic artificial muscles  
Park et al.  Quadruped bounding control with variable duty cycle via vertical impulse scaling  
US8204626B2 (en)  Control device for mobile body  
Li et al.  A passivity based admittance control for stabilizing the compliant humanoid COMAN  
Park et al.  ZMP trajectory generation for reduced trunk motions of biped robots  
Fujiwara et al.  An optimal planning of falling motions of a humanoid robot  
CN102749919B (en)  Balance control method for a multi-legged robot  
Wensing et al.  High-speed humanoid running through control with a 3D-SLIP model  
CN105159304B (en)  Finite-time fault-tolerant control method for approaching and tracking a non-cooperative space target  
Sayyad et al.  Single-legged hopping robotics research: A review  
Khadiv et al.  Step timing adjustment: A step toward generating robust gaits  
Karssen et al.  The optimal swingleg retraction rate for running  
CN100569579C (en)  Dynamic walking method for a biped robot  
Ajallooeian et al.  Central pattern generators augmented with virtual model control for quadruped rough terrain locomotion  
Hyon et al.  Symmetric walking control: Invariance and global stability 
Legal Events
Date  Code  Title  Description 

C06  Publication  
PB01  Publication  
C10  Entry into substantive examination  
SE01  Entry into force of request for substantive examination  
C14  Grant of patent or utility model  
GR01  Patent grant  
CF01  Termination of patent right due to non-payment of annual fee  
Granted publication date: 2015-10-07; Termination date: 2019-04-08 