CN108153153A - Learning variable impedance control system and control method - Google Patents
Learning variable impedance control system and control method
- Publication number: CN108153153A (application CN201711393308.7A)
- Authority: CN (China)
- Prior art keywords: impedance, control, Gaussian process, learning, model
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
- G05B13/048—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators using a predictor
Abstract
The present invention provides a learning variable impedance control system and control method. The system mainly comprises four parts: an impedance controller, a Gaussian process model of the system, a variable impedance control strategy and a policy learning algorithm. No prior knowledge of the environment is needed: the Gaussian process model of the system is built from interaction data, and long-term inference and planning over the system are carried out in a Bayesian manner. More useful information is extracted from limited observation data, so complex force-control tasks can be learned with a minimal number of interactions. By adding an energy-loss term to the cost function, a trade-off between error and energy is realized, giving the robot good compliance. Finally, the learned impedance control strategy can adjust the target stiffness and damping parameters simultaneously, according to the system state, at the different phases of a task. The invention can be widely applied to compliant-control tasks such as dual-arm assembly, multi-arm cooperation and biped robot gait control, ensuring the safety and robustness of interactive operation.
Description
Technical field
The present invention relates to robot compliant control, and in particular to an efficient learning variable impedance control system and control method.
Background technology
Contact operation tasks are increasingly assigned to robots in unstructured environments, for example compliant assembly and human-robot interaction. Because such tasks are complex and the contact environment is changeable and unpredictable, it is difficult to establish an accurate dynamic model of the system. How to let a robot safely, efficiently and quickly perform new tasks and accurately control the contact force in varying environments is a new challenge facing robotics. Impedance control is widely used in robot interaction control tasks because of its good adaptability and robustness. Since the force-control characteristics are determined by the impedance parameters of the robot, and the choice of inertia, stiffness and damping parameters is highly task-dependent and generally hard to infer a priori, obtaining good control performance usually requires a deep understanding of the controller design and its parameters, and the control parameters still need to be tuned manually. For complex tasks in particular, since the environmental conditions generally contain nonlinear and time-varying factors, an impedance controller with fixed, preset parameters can hardly accomplish the goal task. If the impedance control parameters can instead be dynamically planned and adjusted according to changes in the task and the environment, the control performance is significantly better than with fixed impedance parameters. The ability to learn variable impedance control is therefore the key for a modern robot to complete complex operation tasks safely and quickly.
For operation tasks that require force control, the fewer learning trials the better: a large number of physical interaction attempts may damage the robot or the workpiece, and collecting a large amount of sampled data is time-consuming and expensive, which is impractical. Improving the learning efficiency of a learning impedance control algorithm and reducing the number of trial-and-error interactions required is therefore crucial for a robot to quickly learn to complete new tasks.
Invention content
The purpose of the present invention is to provide a learning variable impedance control system with high learning efficiency that can be widely applied to compliant-control tasks such as dual-arm assembly, multi-arm cooperation and biped robot gait control, ensuring the safety and robustness of interactive operation. A further purpose of the present invention is to provide a control method based on the learning variable impedance control system.
The learning variable impedance control system of the present invention comprises an impedance controller, a Gaussian process model module of the system, a variable impedance control strategy module and a policy learning algorithm module.
The Gaussian process model module of the system establishes a Gaussian process model of the system from the actual position of the robot end-effector and the force sensor information, serving as the transition dynamics model of the control system.
The policy learning algorithm module, based on the Gaussian process model of the system, infers the long-term distribution of the control system state by cascading one-step predictions, and then uses this model to carry out internal simulation and predict the behavior of the control system.
The variable impedance control strategy module computes the impedance parameters, i.e. the target stiffness and damping coefficient, in real time from the control system state, i.e. the end-effector position and the actual contact force, and passes them to the impedance controller.
The impedance controller corrects the desired reference trajectory according to the time-varying target stiffness, the damping coefficient and the current contact-force error, and outputs the desired position increment of the end-effector.
The control method based on the learning variable impedance control system of the present invention comprises the following steps:
(1) randomly initialize the control variable u = [K_d(t) B_d(t)], apply it to the control system, and record the initial data [X F_a], where K_d(t) is the target stiffness, B_d(t) is the damping coefficient, X is the end-effector position and F_a is the actual contact force;
(2) from the historical sampled data [X F_a], establish the Gaussian process dynamics model of the system as its transition dynamics model;
(3) search for the optimal impedance control strategy π(θ) with the policy learning algorithm;
(4) set the policy π* ← π(θ), apply it to the impedance controller to perform force control, and collect new data [X F_a];
(5) repeat steps (2)-(4) until a satisfactory force-tracking effect is obtained and a satisfactory control strategy has been learned.
The control method based on the learning variable impedance control system of the present invention may further include:
1. Establishing the Gaussian process dynamics model of the system, which specifically includes:
(1) the Gaussian process model is f ~ GP(m, k), where the prior mean is chosen as m ≡ 0 and the squared exponential kernel function is chosen as k(x_p, x_q) = α² exp(−(x_p − x_q)ᵀ Λ⁻¹ (x_p − x_q)/2), with Λ = diag(l_1², …, l_D²);
(2) the state and the control quantity form the input tuple of the Gaussian process, and the state increment is the training target;
(3) given N groups of training inputs X = [x_1, …, x_n] and the corresponding training targets y = [y_1, …, y_n]ᵀ, the hyperparameters of the Gaussian process model are learned with the evidence maximization algorithm.
Here x_t is the observable state, y_t is the training target, Δ_t is the state increment, ε_t is independent identically distributed system noise, α² is the signal variance of the latent function f, and l_i is the characteristic length of each input dimension.
2. The policy learning algorithm, which specifically includes:
(1) apply the control strategy π to the Gaussian process model of the system and carry out internal simulation to predict the behavior and performance of the system;
(2) use the learned Gaussian process model to make the long-term inference p(x_1 | π), …, p(x_T | π) over the states;
(3) evaluate the expected total cost J^π(θ) over the time horizon T;
(4) compute the gradient dJ^π(θ)/dθ of the cost with respect to the policy parameters, find the optimal policy π* ← π(θ) with a gradient-based policy search algorithm, and update the policy parameters θ;
(5) repeat steps (1)-(4) until the policy parameters θ converge.
3. The impedance controller, which is a position-based indirect impedance controller: it corrects the desired reference trajectory according to the contact-force error and the time-varying target stiffness and damping coefficient, and obtains the desired position increment δX of the end-effector.
The concrete form of the impedance controller is:
ω₁ = 4M_d(t) + 2B_d(t)T + K_d(t)T²
ω₂ = −8M_d(t) + 2K_d(t)T²
ω₃ = 4M_d(t) − 2B_d(t)T + K_d(t)T²
where T is the control period.
The present invention provides an efficient learning variable impedance control system and control method, enabling a robot to learn efficiently and autonomously to complete force-control tasks.
The main features of the technical scheme of the present invention are:
(1) the impedance controller can correct the desired reference trajectory according to the time-varying target stiffness and damping coefficient;
(2) the Gaussian process model of the system is a probabilistic model established from actual sampled data, serving as the transition dynamics model of the system;
(3) the impedance control strategy is a probabilistic Gaussian process control strategy, represented by a mean function and a variance function; it computes the impedance parameters, i.e. the target stiffness K_d(t) and damping coefficient B_d(t), in real time from the system state, i.e. the end-effector position X and the actual contact force F_a, and passes them to the impedance controller;
(4) the policy learning algorithm learns with a model-based reinforcement learning algorithm: by cascading one-step predictions it infers the long-term distribution of the system state, and then uses this model to carry out internal simulation and predict the behavior of the system. An energy-loss term is added to the cost function; by penalizing the control action it reduces the impedance gains required to complete the task, realizing a trade-off between error and energy minimization.
The impedance controller is specifically:
(1) the impedance controller is a position-based indirect impedance controller; it corrects the desired reference trajectory according to the contact-force error and the time-varying target stiffness and damping coefficient, and obtains the desired position increment δX of the end-effector;
(2) with control period T, and M_d(t), K_d(t), B_d(t) respectively the target inertia, the time-varying target stiffness and the time-varying damping coefficient, the concrete form of the impedance controller is:
ω₁ = 4M_d(t) + 2B_d(t)T + K_d(t)T²
ω₂ = −8M_d(t) + 2K_d(t)T²
ω₃ = 4M_d(t) − 2B_d(t)T + K_d(t)T²
The Gaussian process model of the system is specifically:
(1) the Gaussian process model is f ~ GP(m, k), where the prior mean is chosen as m ≡ 0 and the squared exponential kernel function is chosen as k(x_p, x_q) = α² exp(−(x_p − x_q)ᵀ Λ⁻¹ (x_p − x_q)/2);
(2) the state and the control quantity form the input tuple of the Gaussian process, and the state increment is the training target;
(3) given N groups of training inputs X = [x_1, …, x_n] and the corresponding training targets y = [y_1, …, y_n]ᵀ, the hyperparameters of the Gaussian process model are learned with the evidence maximization algorithm.
The impedance control strategy is specifically:
(1) the impedance control strategy is a Gaussian process controller u_t = π(x_t, θ), a probabilistic Gaussian process control strategy represented by a mean function and a variance function, where x_t is the observable state of the robot, the policy output u_t is the target stiffness K_d(t) and damping coefficient B_d(t) of the impedance controller, and θ is the control policy parameter to be learned;
(2) a bounded, differentiable trapezoidal saturation function S(π̃_t) = u_min + u_max/2 + (u_max/2)·[9 sin(π̃_t) + sin(3π̃_t)]/8 limits the physical bounds of the control parameter u, confining the control variable u to the interval [u_min, u_min + u_max], where u_max is the maximum limit of the control variable and u_min is its minimum limit.
The policy learning algorithm is specifically:
(1) apply the control strategy π to the system model, i.e. the Gaussian process model, carry out internal simulation, and predict the behavior and performance of the system;
(2) use the Gaussian process model to make the long-term inference p(x_1 | π), …, p(x_T | π) over the states;
(3) evaluate the expected total cost over the time horizon T, where the instantaneous cost function consists of two parts, a state-error cost c_b(x_t) and an energy-loss term c_e(u_t) = c_e(π(x_t)) = ζ(u_t/u_max)², in which d(·) is the Euclidean distance, σ_c is the width of the cost function, ζ is the energy-loss factor and u_t is the current control quantity;
(4) compute the gradient dJ^π(θ)/dθ of the cost with respect to the policy parameters, find the optimal policy π* ← π(θ) with a gradient-based policy search algorithm, and update the policy parameters θ.
In order to enable a robot to autonomously learn to complete complex force-control tasks in unstructured environments and accurately control the contact force in contact operation tasks, the present invention proposes a new scheme that uses a model-based reinforcement learning algorithm to learn and adjust the impedance parameters of the robot. It mainly comprises four parts: the impedance controller, the Gaussian process model of the system, the impedance control strategy and the policy learning algorithm. Its feature is that no prior knowledge of the environment is needed: the Gaussian process model of the system is built from interaction data, and long-term inference and planning over the system are carried out in a Bayesian manner. In this way, more useful information can be extracted from limited observation data, and complex force-control tasks can be learned with a minimal number of interactions. By adding an energy-loss term to the cost function, a trade-off between error and energy is realized, giving the robot good compliance. Finally, the learned impedance control strategy can adjust the target stiffness and damping parameters simultaneously, according to the system state, at the different phases of a task. The present invention enables a robot to efficiently and autonomously learn to complete complex force-control tasks in unstructured environments; only a few interactions are needed to learn the optimal control strategy, so the method is data-efficient. It can be widely applied to compliant-control tasks such as dual-arm assembly, multi-arm cooperation and biped robot gait control, ensuring the safety and robustness of interactive operation.
The present invention solves the efficiency problem in robot learning impedance control: by extracting more useful information from the observation data, it reduces the number of interactions needed to learn a force-control task and realizes efficient autonomous learning for the robot to the greatest extent. It is an important reference for learning compliant control and can be directly applied to robots that require contact-force control.
Description of the drawings
Fig. 1 is the structural diagram of the system of the present invention;
Fig. 2 is the flow chart of the method of the present invention;
Fig. 3 is the flow chart of the policy learning algorithm of the present invention.
Specific embodiment
The present invention is described in more detail below.
Fig. 1 shows the system structure of the learning variable impedance control method. Each part inside the dashed frame is a concrete structure of the invention, including the impedance controller, the Gaussian process model of the system, the impedance control strategy and the policy learning algorithm. Specifically:
1) the impedance controller corrects the desired reference trajectory according to the time-varying target stiffness, the damping coefficient and the current contact-force error F_e, and outputs the desired position increment δX of the end-effector;
2) the Gaussian process model of the system is established from the sampled data, i.e. the actual position X of the robot end-effector and the force sensor information F_a, and serves as the transition dynamics model of the system;
3) the policy learning algorithm, based on the Gaussian process model, infers the long-term distribution of the system state by cascading one-step predictions, then uses this model to carry out internal simulation and predict the behavior of the system, and obtains the impedance control strategy π by minimizing the expected cost with a model-based reinforcement learning algorithm;
4) the impedance control strategy is a probabilistic Gaussian process control strategy, represented by a mean function and a variance function; it computes the impedance parameters, i.e. the target stiffness K_d(t) and damping coefficient B_d(t), in real time from the system state, i.e. the end-effector position X and the actual contact force F_a, and passes them to the impedance controller.
In Fig. 1, F_d is the desired contact force, X_d is the desired position, X_e is the total desired position of the end-effector, q_d is the desired joint position computed from the inverse kinematics of the robot, q is the measured actual joint position, and K_E, B_E are the unknown environment stiffness and damping, respectively.
As shown in Fig. 2, the method of the present invention mainly includes five steps:
1) randomly initialize the control variable u = [K_d(t) B_d(t)], apply it to the system, and record the initial data [X F_a];
2) from the historical sampled data [X F_a], establish the Gaussian process dynamics model of the system as its transition dynamics model;
3) search for the optimal impedance control strategy π(θ) with the policy learning algorithm;
4) set the policy π* ← π(θ), apply it to the system to perform force control, and collect new data [X F_a];
5) repeat steps (2)-(4) until a satisfactory force-tracking effect is obtained and a satisfactory control strategy has been learned.
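The five steps above form an episodic model-based learning loop. The following sketch illustrates that data flow on a toy one-dimensional contact task; the plant, the "model fit" and the random policy search are simplified stand-ins (assumptions for illustration, not the patent's Gaussian process model or gradient-based search).

```python
import numpy as np

rng = np.random.default_rng(0)

def rollout(policy, T=30):
    """Step (1)/(4): apply a (Kd, Bd) policy to a toy 1-D contact system
    and record the data [X, Fa].  The plant here is a stand-in, not a robot."""
    X, F_des, k_env = 0.0, 1.0, 5.0          # position, desired force, env. stiffness
    data = []
    for _ in range(T):
        Kd, Bd = policy(X)
        Fa = k_env * X                        # toy contact-force model Fa = KE * X
        X += 0.05 * (Kd * (F_des - Fa) - Bd * X)   # toy closed-loop update
        data.append((X, Fa))
    return np.array(data)

def fit_model(data):
    """Step (2) stand-in: the patent fits a Gaussian process dynamics model;
    here only summary statistics are kept as a placeholder 'model'."""
    return data.mean(axis=0)

def policy_search(model):
    """Step (3) stand-in: random search over constant (Kd, Bd) gains instead
    of the patent's gradient-based policy search."""
    best, best_cost = (1.0, 1.0), np.inf
    for _ in range(50):
        Kd, Bd = rng.uniform(0.1, 2.0, size=2)
        Fa_trace = rollout(lambda X: (Kd, Bd))[:, 1]   # internal simulation
        cost = np.abs(Fa_trace - 1.0).sum()            # force-tracking cost
        if cost < best_cost:
            best, best_cost = (Kd, Bd), cost
    return best

data = rollout(lambda X: tuple(rng.uniform(0.1, 2.0, size=2)))   # step (1)
for _ in range(3):                                               # steps (2)-(5)
    model = fit_model(data)                                      # step (2)
    Kd, Bd = policy_search(model)                                # step (3)
    new = rollout(lambda X: (Kd, Bd))                            # step (4)
    data = np.vstack([data, new])
final_force_error = abs(new[-1, 1] - 1.0)
```

After three episodes the best constant gains track the desired contact force closely on this toy plant, which mirrors the patent's claim that only a few interactions are needed.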
(1) Impedance controller
To make the end-effector exhibit the desired dynamic behavior, a second-order impedance model is used:
M_d(t) δẌ + B_d(t) δẊ + K_d(t) δX = E(t)   (1)
where M_d(t), B_d(t), K_d(t) are respectively the time-varying target inertia matrix, target damping matrix and target stiffness matrix of the impedance model; Ẍ, Ẋ, X are the actual acceleration, velocity and position of the robot end-effector in Cartesian space; Ẍ_d, Ẋ_d, X_d are respectively its desired acceleration, velocity and position; F_d and F are respectively the desired and actual contact forces between the robot end-effector and the environment; δX is the position correction and E is the contact-force error.
To obtain the corrected desired position increment, the Laplace transform of the second-order impedance model,
δX(s) = E(s) / (M_d(t)s² + B_d(t)s + K_d(t))   (2)
is discretized with the bilinear transform s = 2T⁻¹(z − 1)(z + 1)⁻¹, which gives:
ω₁ = 4M_d(t) + 2B_d(t)T + K_d(t)T²   (3)
ω₂ = −8M_d(t) + 2K_d(t)T²   (4)
ω₃ = 4M_d(t) − 2B_d(t)T + K_d(t)T²   (5)
where T is the control period. The difference equation of the impedance controller, i.e. the desired position increment of the end-effector, is then:
δX(n) = [T²(E(n) + 2E(n−1) + E(n−2)) − ω₂ δX(n−1) − ω₃ δX(n−2)] / ω₁   (6)
To simplify the computation, the target inertia matrix is set to the constant M_d(t) = I, so the impedance controller only needs to adjust the desired position through the time-varying target stiffness K_d(t), the damping coefficient B_d(t) and the contact-force error E(n).
(2) Gaussian process model of the system
A Gaussian process model is a non-parametric probabilistic model, represented by a mean function m(·) and a positive semi-definite covariance function k(·,·). Let the dynamics of the system be described by:
x_t = f(x_{t−1}, u_{t−1})   (7)
y_t = x_t + ε_t   (8)
where x_t is the observable state, here the actual position X of the robot end-effector and the actual contact force F_a; u_t is the control input, with K_d(t) the target stiffness and B_d(t) the damping coefficient; the training target is the state increment Δ_t; and ε_t is independent identically distributed system noise. f is the latent function modeled by the Gaussian process, f ~ GP(m, k), with training input tuples x̃ = (x, u) and independent identically distributed measurement noise of variance σ_ε².
In order to account for model uncertainty during prediction and planning and to avoid the certainty-equivalence assumption of learned models, we use a Gaussian process to infer the posterior distribution of the latent function f from the collected sampled data, describing all plausible dynamics models. For computational simplicity, the prior mean is chosen as m ≡ 0 and the squared exponential kernel is chosen:
k(x̃_p, x̃_q) = α² exp(−(x̃_p − x̃_q)ᵀ Λ⁻¹ (x̃_p − x̃_q)/2)   (9)
where α² is the signal variance of the latent function f, Λ = diag(l_1², …, l_D²), and l_i is the characteristic length of each input dimension. Given N groups of training inputs X = [x_1, …, x_n] and the corresponding training targets y = [y_1, …, y_n]ᵀ, the hyperparameters of the Gaussian process model,
θ_GP = [l_1, …, l_D, α², σ_ε²]   (10)
can be learned with the evidence maximization algorithm.
Given a deterministic test input x*, the posterior predictive distribution p(f* | x*) of the function value f* = f(x*) is Gaussian:
p(f* | x*) = N(f* | m_f(x*), σ_f²(x*))   (11)
m_f(x*) = k*ᵀ(K + σ_ε²I)⁻¹ y   (12)
σ_f²(x*) = k(x*, x*) − k*ᵀ(K + σ_ε²I)⁻¹ k*   (13)
where k* = k(X, x*) and K = k(X, X) is the kernel matrix.
(3) Impedance control strategy
The impedance control strategy is defined as u_t = π(x_t, θ), where x_t is the observable state of the robot, the policy output u_t is the target stiffness K_d(t) and damping coefficient B_d(t) of the impedance controller, and θ is the control policy parameter to be learned. A Gaussian process controller, with mean
π̃(x*) = k_π(X_π, x*)ᵀ(K_π + σ_π²I)⁻¹ y_π
is selected as the control strategy π, where n is the number of basis elements of the Gaussian process controller, X_π is the training input, y_π is the training target initialized to random values close to zero, l_π is the characteristic length of each state, and α_π² is the signal variance, here fixed so that the controller is functionally similar to an RBF network; σ_π² is the measurement noise variance. The hyperparameters of the Gaussian process control strategy π are therefore θ = {X_π, y_π, l_π}.
In a real control system the physical bounds of the control parameter u must be considered. The present invention chooses a bounded, differentiable trapezoidal saturation function to confine the control variable u to the interval [u_min, u_min + u_max]:
S(π̃_t) = u_min + u_max/2 + (u_max/2)·[9 sin(π̃_t) + sin(3π̃_t)]/8
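The saturation above can be checked numerically: since [9 sin x + sin 3x]/8 stays in [−1, 1], with extrema at x = ±π/2, the squashed output stays in [u_min, u_min + u_max]. The limit values below are illustrative.

```python
import numpy as np

def squash(pi_raw, u_min, u_max):
    """Bounded, differentiable saturation: sigma(x) = (9 sin x + sin 3x)/8
    lies in [-1, 1]; it is shifted and scaled so that the output lies in
    [u_min, u_min + u_max], the physical bounds of the control variable."""
    sigma = (9.0 * np.sin(pi_raw) + np.sin(3.0 * pi_raw)) / 8.0
    return u_min + 0.5 * u_max * (sigma + 1.0)

# Sweep a wide range of raw policy outputs, e.g. for a stiffness gain
# bounded to [0, 50] N/m (illustrative limits).
u = squash(np.linspace(-10.0, 10.0, 2001), u_min=0.0, u_max=50.0)
```

Because the function is smooth everywhere, gradients of the cost can be propagated through the saturation during policy search, unlike with hard clipping.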
(4) Policy learning algorithm
As shown in Fig. 3, the policy learning algorithm mainly includes five steps:
1) apply the control strategy π to the system model, i.e. the Gaussian process model, carry out internal simulation, and predict the behavior and performance of the system;
2) use the learned Gaussian process model to make the long-term inference p(x_1 | π), …, p(x_T | π) over the states;
3) evaluate the expected total cost J^π(θ) over the time horizon T;
4) compute the gradient dJ^π(θ)/dθ of the cost with respect to the policy parameters, and find the optimal policy π* ← π(θ) with a gradient-based policy search algorithm, updating the policy parameters θ;
5) repeat steps (1)-(4) until the policy parameters θ converge.
To obtain the optimal control strategy, the policy must be improved according to the long-term prediction of the state evolution, finding the policy parameters θ* that minimize the cost J^π(θ). We use the Gaussian process model to represent the transition dynamics of the real system, and obtain the long-term predictions p(x_1), …, p(x_T) of the state distribution by cascading one-step predictions. Since the Gaussian process model propagates the uncertainty of an input Gaussian distribution when mapping the state space to the target space, the model uncertainty is carried through the long-term planning, which reduces the negative effect of model bias. The one-step state prediction can be summarized as:
p(x_{t−1}) → p(u_{t−1}) → p(x_{t−1}, u_{t−1}) → p(Δ_t) → p(x_t)   (18)
Assume p(x_{t−1}) is known. To predict p(x_t) from p(x_{t−1}), the joint distribution p(x̃_{t−1}) = p(x_{t−1}, u_{t−1}) must be computed from the distribution of the control variable u_{t−1} = π(x_{t−1}): first compute the predicted control distribution p(u_{t−1}), then the cross-covariance cov[x_{t−1}, u_{t−1}], and finally obtain the approximate Gaussian joint distribution of x̃_{t−1} = (x_{t−1}, u_{t−1}).
The predictive distribution of the training target Δ_t is obtained by integrating the posterior predictive distribution of the transition function, computed according to formulas (11)-(13), over p(x̃_{t−1}). Using moment matching, the distribution p(Δ_t) of the training target is approximated by a Gaussian N(μ_Δ, Σ_Δ). The predicted state distribution p(x_t) is then approximated by a Gaussian with:
μ_t = μ_{t−1} + μ_Δ   (22)
Σ_t = Σ_{t−1} + Σ_Δ + cov[x_{t−1}, Δ_t] + cov[Δ_t, x_{t−1}]   (23)
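Equations (22)-(23) can be illustrated with a stand-in linear transition model, for which the moment-matched Gaussian is exact: with Δ_t = A x_{t−1} + w, we have μ_Δ = Aμ, Σ_Δ = AΣAᵀ + Q and cov[x_{t−1}, Δ_t] = ΣAᵀ. The matrices below are illustrative assumptions, not a learned Gaussian process.

```python
import numpy as np

A = np.array([[-0.10, 0.05],
              [0.00, -0.20]])    # stand-in linear mean dynamics for Delta_t
Q = 1e-4 * np.eye(2)             # process-noise covariance of Delta_t

def one_step(mu, Sigma):
    """Cascaded one-step prediction (18) with updates (22)-(23):
    mu_t = mu_{t-1} + mu_D,  Sigma_t = Sigma_{t-1} + Sigma_D + C + C^T."""
    mu_D = A @ mu                # mean of Delta_t
    Sigma_D = A @ Sigma @ A.T + Q
    C = Sigma @ A.T              # cov[x_{t-1}, Delta_t]
    return mu + mu_D, Sigma + Sigma_D + C + C.T

# Long-term prediction p(x_1), ..., p(x_50) from an initial Gaussian state.
mu, Sigma = np.array([1.0, -1.0]), 0.01 * np.eye(2)
for _ in range(50):
    mu, Sigma = one_step(mu, Sigma)
```

For this linear case the recursion reproduces exactly x_t = (I + A)x_{t−1} + w, which confirms that (22)-(23) are the correct moment updates; for the Gaussian process model the same updates hold with moment-matched μ_Δ, Σ_Δ and cross-covariances.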
To assess the performance of the control strategy π, the total expected cost J^π(θ) over the time horizon T is used as the evaluation criterion. The control strategy π is applied to the system, and the total expected cost is computed from the long-term evolution of the predicted state:
J^π(θ) = Σ_{t=1}^{T} E[c(x_t)]   (24)
where c(x_t) is the instantaneous cost at time t and E[c(x_t)] is its expectation with respect to the predicted state distribution:
E[c(x_t)] = ∫ c(x_t) N(x_t | μ_t, Σ_t) dx_t   (25)
To realize the trade-off between error and energy minimization, give the robot the desired impedance characteristics, constrain the contact force to ensure safety, and obtain better compliance, we add an energy-loss term to the cost function and penalize the control action to reduce the impedance gains required to complete the task. The instantaneous cost function is defined as:
c_t = c_b(x_t) + c_e(u_t)   (27)
c_b(x_t) = 1 − exp(−d²(x_t, x_target)/(2σ_c²))   (28)
c_e(u_t) = c_e(π(x_t)) = ζ(u_t/u_max)²   (29)
The instantaneous cost function c_t mainly consists of two terms. c_b(x_t) is the state-error cost, a quadratic saturating cost function that saturates at 1 when the deviation from the target state is large; d(·) is the Euclidean distance and σ_c is the width of the cost function. c_e(u_t) is the energy-loss term, i.e. the squared energy loss of the impedance gain; ζ is the energy-loss factor, u_t is the current control quantity and u_max is the maximum limit of the control quantity.
Then, according to the chain rule, the gradient of the expected cost with respect to the controller parameters θ is computed, and a gradient-based policy search method is used to obtain the controller parameters θ* that minimize J^π(θ).
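The patent computes this gradient analytically through the chain rule; the sketch below replaces it with finite differences on a toy deterministic internal simulation (the plant, cost weights and step sizes are illustrative assumptions) purely to show the gradient-descent structure of the policy search.

```python
import numpy as np

def expected_cost(theta):
    """Stand-in for J_pi(theta): roll a toy deterministic model forward
    under gains theta = [Kd, Bd] and sum an error + energy cost over T = 30."""
    Kd, Bd = theta
    x, J = 1.0, 0.0
    for _ in range(30):
        u = -Kd * x - 0.1 * Bd * x       # toy feedback action
        x = 0.9 * x + 0.1 * u            # toy transition model
        J += x**2 + 0.01 * u**2          # error term + energy term
    return J

def policy_search(theta0, lr=0.05, iters=200, h=1e-5):
    """Gradient descent on J with finite-difference dJ/dtheta."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(iters):
        grad = np.zeros_like(theta)
        for i in range(len(theta)):
            e = np.zeros_like(theta)
            e[i] = h
            grad[i] = (expected_cost(theta + e) - expected_cost(theta - e)) / (2*h)
        theta -= lr * grad
    return theta

theta_star = policy_search([0.5, 0.5])
```

The analytic chain-rule gradient used in the patent gives the same search direction but scales to many policy parameters without the 2·dim(θ) cost evaluations per step that finite differences require.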
Claims (6)
1. A learning variable impedance control system, characterized in that it comprises an impedance controller, a Gaussian process model module of the system, a variable impedance control strategy module and a policy learning algorithm module, wherein:
the Gaussian process model module of the system establishes a Gaussian process model of the system from the actual position of the robot end-effector and the force sensor information, serving as the transition dynamics model of the control system;
the policy learning algorithm module, based on the Gaussian process model of the system, infers the long-term distribution of the control system state by cascading one-step predictions, and then uses this model to carry out internal simulation and predict the behavior of the control system;
the variable impedance control strategy module computes the impedance parameters, i.e. the target stiffness and damping coefficient, in real time from the control system state, i.e. the end-effector position and the actual contact force, and passes them to the impedance controller;
the impedance controller corrects the desired reference trajectory according to the time-varying target stiffness, the damping coefficient and the current contact-force error, and outputs the desired position increment of the end-effector.
2. A control method based on the learning variable impedance control system of claim 1, characterized by:
(1) randomly initializing the control variable u = [K_d(t) B_d(t)], applying it to the control system, and recording the initial data [X F_a], where K_d(t) is the target stiffness, B_d(t) is the damping coefficient, X is the end-effector position and F_a is the actual contact force;
(2) from the historical sampled data [X F_a], establishing the Gaussian process dynamics model of the system as its transition dynamics model;
(3) searching for the optimal impedance control strategy π(θ) with the policy learning algorithm;
(4) setting the policy π* ← π(θ), applying it to the impedance controller to perform force control, and collecting new data [X F_a];
(5) repeating steps (2)-(4) until a satisfactory force-tracking effect is obtained and a satisfactory control strategy has been learned.
3. The control method based on the learning variable impedance control system according to claim 2, characterized in that establishing the Gaussian process dynamics model of the system specifically includes:
(1) the Gaussian process model is f ~ GP(m, k), where the prior mean is chosen as m ≡ 0 and the squared exponential kernel function is chosen as k(x_p, x_q) = α² exp(−(x_p − x_q)ᵀ Λ⁻¹ (x_p − x_q)/2);
(2) the state and the control quantity form the input tuple of the Gaussian process, and the state increment is the training target;
(3) given N groups of training inputs X = [x_1, …, x_n] and the corresponding training targets y = [y_1, …, y_n]ᵀ, the hyperparameters of the Gaussian process model are learned with the evidence maximization algorithm;
where x_t is the observable state, y_t is the training target, Δ_t is the state increment, ε_t is independent identically distributed system noise, α² is the signal variance of the latent function f, Λ = diag(l_1², …, l_D²), and l_i is the characteristic length of each input dimension.
4. The control method based on the learning variable impedance control system according to claim 2 or 3, characterized in that the policy learning algorithm specifically includes:
(1) apply the control strategy π to the Gaussian process model of the system and perform internal simulation to predict the behavior and performance of the system;
(2) perform long-term inference on the state using the learned Gaussian process model, predicting p(x1|π), ..., p(xT|π);
(3) evaluate the expected total cost Jπ(θ) over the horizon T;
(4) compute the gradient of the cost with respect to the policy parameters, dJπ(θ)/dθ, search for the optimal policy π* ← π(θ) using a gradient-based policy search algorithm, and update the policy parameters θ;
(5) repeat steps (1)-(4) until the policy parameters θ converge.
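The steps above can be sketched as a gradient-based policy search on a learned model. To stay self-contained, this sketch substitutes a known toy linear model for the Gaussian process prediction and a central finite difference for the analytic gradient; the function names and the scalar policy u = −θ·x are illustrative assumptions, not the patent's controller.

```python
import numpy as np

def predict_trajectory(theta, x0, T=20):
    """Stand-in for GP long-term prediction p(x_1|pi), ..., p(x_T|pi):
    here a toy linear model x_{t+1} = 0.9 x_t + u with policy u = -theta*x_t."""
    xs, x = [], x0
    for _ in range(T):
        x = 0.9 * x - theta * x       # internal simulation on the model
        xs.append(x)
    return np.array(xs)

def expected_cost(theta, x0=1.0):
    """Expected total cost J_pi(theta) over horizon T (here: sum of x^2)."""
    return float(np.sum(predict_trajectory(theta, x0) ** 2))

theta, lr, eps = 0.0, 0.01, 1e-5
for _ in range(100):                  # repeat (1)-(4) until convergence
    # (4) gradient dJ/dtheta, here by central finite difference
    g = (expected_cost(theta + eps) - expected_cost(theta - eps)) / (2 * eps)
    theta -= lr * g                   # gradient step on the policy parameter
print(expected_cost(theta) < expected_cost(0.0))
```

In the patent's setting the gradient would instead be propagated analytically through the Gaussian process predictions, but the outer structure (simulate, evaluate cost, differentiate, update) is the same.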
5. The control method based on the learning variable impedance control system according to claim 2 or 3, characterized in that: the variable impedance controller is a position-based indirect impedance controller, which corrects the desired reference trajectory according to the contact-force error and the time-varying target stiffness and damping coefficients to obtain the desired position increment δX of the manipulator end-effector;
the concrete form of the impedance controller is:
ω1 = 4Md(t) + 2Bd(t)T + Kd(t)T²
ω2 = −8Md(t) + 2Kd(t)T²
ω3 = 4Md(t) − 2Bd(t)T + Kd(t)T²
where T is the control period.
6. The control method based on the learning variable impedance control system according to claim 4, characterized in that: the impedance controller is a position-based indirect impedance controller, which corrects the desired reference trajectory according to the contact-force error and the time-varying target stiffness and damping coefficients to obtain the desired position increment δX of the manipulator end-effector;
the concrete form of the impedance controller is:
ω1 = 4Md(t) + 2Bd(t)T + Kd(t)T²
ω2 = −8Md(t) + 2Kd(t)T²
ω3 = 4Md(t) − 2Bd(t)T + Kd(t)T²
where T is the control period.
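The claims list only the ω coefficients; the governing difference equation itself was an image and is not reproduced in this text. The coefficients are consistent with a bilinear (Tustin) discretization of the target impedance Md(t)·δẌ + Bd(t)·δẊ + Kd(t)·δX = Fe, which would give the recursion ω1·δX(k) = T²(Fe(k) + 2Fe(k−1) + Fe(k−2)) − ω2·δX(k−1) − ω3·δX(k−2). Under that assumption, a sketch:

```python
def impedance_increment(Fe_hist, dX_hist, Md, Bd, Kd, T):
    """One step of the position-based impedance law. Assumes the omega
    coefficients come from a Tustin discretization of
        Md*ddX + Bd*dX + Kd*X = Fe
    (an assumption: the governing equation is missing from the patent text).
    Fe_hist = [Fe(k), Fe(k-1), Fe(k-2)], dX_hist = [dX(k-1), dX(k-2)]."""
    w1 = 4 * Md + 2 * Bd * T + Kd * T**2
    w2 = -8 * Md + 2 * Kd * T**2
    w3 = 4 * Md - 2 * Bd * T + Kd * T**2
    num = T**2 * (Fe_hist[0] + 2 * Fe_hist[1] + Fe_hist[2])
    return (num - w2 * dX_hist[0] - w3 * dX_hist[1]) / w1

# sanity check: a constant force error should settle at the static
# deflection Fe/Kd, since w1 + w2 + w3 = 4*Kd*T^2
Md, Bd, Kd, T, Fe = 1.0, 40.0, 400.0, 0.01, 10.0
dX = [0.0, 0.0]
for _ in range(2000):
    dX = [impedance_increment([Fe, Fe, Fe], dX, Md, Bd, Kd, T), dX[0]]
print(round(dX[0], 4))
```

The steady-state check follows directly from the coefficients: summing them cancels the Md and Bd terms, leaving 4KdT², so a constant Fe yields δX = Fe/Kd (here 10/400 = 0.025), as expected for a pure stiffness response.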
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711393308.7A CN108153153B (en) | 2017-12-19 | 2017-12-19 | Learning variable impedance control system and control method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108153153A true CN108153153A (en) | 2018-06-12 |
CN108153153B CN108153153B (en) | 2020-09-11 |
Family
ID=62464705
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711393308.7A Active CN108153153B (en) | 2017-12-19 | 2017-12-19 | Learning variable impedance control system and control method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108153153B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104626168A (en) * | 2014-12-16 | 2015-05-20 | 苏州大学 | Robot force position compliant control method based on intelligent algorithm |
US20170007308A1 (en) * | 2015-07-08 | 2017-01-12 | Research & Business Foundation Sungkyunkwan University | Apparatus and method for discriminating biological tissue, surgical apparatus using the apparatus |
CN105213153A (en) * | 2015-09-14 | 2016-01-06 | 西安交通大学 | Based on the lower limb rehabilitation robot control method of brain flesh information impedance |
CN106406098A (en) * | 2016-11-22 | 2017-02-15 | 西北工业大学 | Man-machine interaction control method of robot system in unknown environment |
CN106938470A (en) * | 2017-03-22 | 2017-07-11 | 华中科技大学 | A kind of device and method of Robot Force control teaching learning by imitation |
Non-Patent Citations (2)
Title |
---|
GUIHUA XIA et al.: "Hybrid force/position control of industrial robotic manipulator based on Kalman filter", 2016 IEEE International Conference on Mechatronics and Automation * |
LI Erchao et al.: "Fuzzy adaptive impedance control of robot based on neural-network visual servoing", Transactions of China Electrotechnical Society * |
Cited By (35)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108972546A (en) * | 2018-06-22 | 2018-12-11 | 华南理工大学 | A kind of robot constant force curved surface tracking method based on intensified learning |
CN109062032A (en) * | 2018-10-19 | 2018-12-21 | 江苏省(扬州)数控机床研究院 | A kind of robot PID impedance control method based on Approximate dynamic inversion |
CN109702740A (en) * | 2018-12-14 | 2019-05-03 | 中国科学院深圳先进技术研究院 | Robot compliance control method, apparatus, equipment and storage medium |
CN109702740B (en) * | 2018-12-14 | 2020-12-04 | 中国科学院深圳先进技术研究院 | Robot compliance control method, device, equipment and storage medium |
CN111352384A (en) * | 2018-12-21 | 2020-06-30 | 罗伯特·博世有限公司 | Method and evaluation unit for controlling an automated or autonomous movement mechanism |
CN113966264A (en) * | 2019-05-17 | 2022-01-21 | 西门子股份公司 | Method, computer program product and robot control device for positioning an object that is movable during manipulation by a robot on the basis of contact, and robot |
CN111673733B (en) * | 2020-03-26 | 2022-03-29 | 华南理工大学 | Intelligent self-adaptive compliance control method of robot in unknown environment |
CN111673733A (en) * | 2020-03-26 | 2020-09-18 | 华南理工大学 | Intelligent self-adaptive compliance control method of robot in unknown environment |
CN111687833A (en) * | 2020-04-30 | 2020-09-22 | 广西科技大学 | Manipulator inverse priority impedance control system and control method |
CN111687832A (en) * | 2020-04-30 | 2020-09-22 | 广西科技大学 | Reverse priority impedance control system and method for redundant manipulator of space manipulator |
CN111687835A (en) * | 2020-04-30 | 2020-09-22 | 广西科技大学 | Reverse priority impedance control system and method for redundant manipulator of underwater manipulator |
CN111687834A (en) * | 2020-04-30 | 2020-09-22 | 广西科技大学 | Reverse priority impedance control system and method for redundant mechanical arm of mobile manipulator |
CN111640495A (en) * | 2020-05-29 | 2020-09-08 | 北京机械设备研究所 | Variable force tracking control method and device based on impedance control |
CN111640495B (en) * | 2020-05-29 | 2024-05-31 | 北京机械设备研究所 | Variable force tracking control method and device based on impedance control |
CN111904795A (en) * | 2020-08-28 | 2020-11-10 | 中山大学 | Variable impedance control method for rehabilitation robot combined with trajectory planning |
CN111904795B (en) * | 2020-08-28 | 2022-08-26 | 中山大学 | Variable impedance control method for rehabilitation robot combined with trajectory planning |
CN112372630A (en) * | 2020-09-24 | 2021-02-19 | 哈尔滨工业大学(深圳) | Multi-mechanical-arm cooperative polishing force compliance control method and system |
CN112372630B (en) * | 2020-09-24 | 2022-02-22 | 哈尔滨工业大学(深圳) | Multi-mechanical-arm cooperative polishing force compliance control method and system |
CN112428278A (en) * | 2020-10-26 | 2021-03-02 | 北京理工大学 | Control method and device of mechanical arm and training method of man-machine cooperation model |
CN112743540A (en) * | 2020-12-09 | 2021-05-04 | 华南理工大学 | Hexapod robot impedance control method based on reinforcement learning |
CN112743540B (en) * | 2020-12-09 | 2022-05-24 | 华南理工大学 | Hexapod robot impedance control method based on reinforcement learning |
CN112859868A (en) * | 2021-01-19 | 2021-05-28 | 武汉大学 | KMP (Kernel Key P) -based lower limb exoskeleton rehabilitation robot and motion trajectory planning algorithm |
CN113427483A (en) * | 2021-05-19 | 2021-09-24 | 广州中国科学院先进技术研究所 | Double-machine manpower/bit multivariate data driving method based on reinforcement learning |
CN113641099B (en) * | 2021-07-13 | 2023-02-10 | 西北工业大学 | Impedance control imitation learning training method for surpassing expert demonstration |
CN113641099A (en) * | 2021-07-13 | 2021-11-12 | 西北工业大学 | Impedance control imitation learning training method for surpassing expert demonstration |
CN114378820A (en) * | 2022-01-18 | 2022-04-22 | 中山大学 | Robot impedance learning method based on safety reinforcement learning |
CN114193458A (en) * | 2022-01-25 | 2022-03-18 | 中山大学 | Robot control method based on Gaussian process online learning |
CN114193458B (en) * | 2022-01-25 | 2024-04-09 | 中山大学 | Robot control method based on Gaussian process online learning |
CN114789444A (en) * | 2022-05-05 | 2022-07-26 | 山东省人工智能研究院 | Compliant human-computer contact method based on deep reinforcement learning and impedance control |
CN114789444B (en) * | 2022-05-05 | 2022-12-16 | 山东省人工智能研究院 | Compliant human-computer contact method based on deep reinforcement learning and impedance control |
CN115496099A (en) * | 2022-09-20 | 2022-12-20 | 哈尔滨工业大学 | Filtering and high-order state observation method for mechanical arm sensor |
CN115421387A (en) * | 2022-09-22 | 2022-12-02 | 中国科学院自动化研究所 | Variable impedance control system and control method based on inverse reinforcement learning |
CN116643501A (en) * | 2023-07-18 | 2023-08-25 | 湖南大学 | Variable impedance control method and system for aerial working robot under stability constraint |
CN116643501B (en) * | 2023-07-18 | 2023-10-24 | 湖南大学 | Variable impedance control method and system for aerial working robot under stability constraint |
CN117817674A (en) * | 2024-03-05 | 2024-04-05 | 纳博特控制技术(苏州)有限公司 | Self-adaptive impedance control method for robot |
Also Published As
Publication number | Publication date |
---|---|
CN108153153B (en) | 2020-09-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108153153A (en) | Learning variable impedance control system and control method | |
Li et al. | A policy search method for temporal logic specified reinforcement learning tasks | |
Wu et al. | Dynamic fuzzy neural networks-a novel approach to function approximation | |
Zhao et al. | Model-free optimal control for affine nonlinear systems with convergence analysis | |
Hou et al. | Fuzzy logic-driven variable time-scale prediction-based reinforcement learning for robotic multiple peg-in-hole assembly | |
CN103324196A (en) | Multi-robot path planning and coordination collision prevention method based on fuzzy logic | |
CN105425820A (en) | Unmanned aerial vehicle cooperative search method for moving object with perception capability | |
CN102819264A (en) | Path planning Q-learning initial method of mobile robot | |
Liang et al. | Search-based task planning with learned skill effect models for lifelong robotic manipulation | |
Vecchietti et al. | Sampling rate decay in hindsight experience replay for robot control | |
US20230144995A1 (en) | Learning options for action selection with meta-gradients in multi-task reinforcement learning | |
Su et al. | Robot path planning based on random coding particle swarm optimization | |
Zhong et al. | Modeling-learning-based actor-critic algorithm with Gaussian process approximator | |
Wu et al. | Optimized least-squares support vector machine for predicting aero-optic imaging deviation based on chaotic particle swarm optimization | |
Komeno et al. | Deep koopman with control: Spectral analysis of soft robot dynamics | |
Wang et al. | Policy learning for nonlinear model predictive control with application to USVs | |
Li et al. | Improved Q-learning based route planning method for UAVs in unknown environment | |
Lai et al. | Deep neural network-based real-time trajectory planning for an automatic guided vehicle with obstacles | |
Contardo et al. | Learning states representations in pomdp | |
Yongqiang et al. | Path‐Integral‐Based Reinforcement Learning Algorithm for Goal‐Directed Locomotion of Snake‐Shaped Robot | |
Sharma et al. | Wavelet reduced order observer based adaptive tracking control for a class of uncertain nonlinear systems using reinforcement learning | |
Wang et al. | A compensation method for random error of gyroscopes based on support vector machine and beetle antennae search algorithm | |
Zhou et al. | Switching deep reinforcement learning based intelligent online decision making for autonomous systems under uncertain environment | |
Yu et al. | An intelligent robot motion planning method and application via lppo in unknown environment | |
Zhou et al. | Research on the fuzzy algorithm of path planning of mobile robot |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||