CN108153153A - A learning impedance control system and control method - Google Patents

A learning impedance control system and control method

Info

Publication number
CN108153153A
CN201711393308.7A (application) · CN108153153A (publication) · CN108153153B (granted publication)
Authority
CN
China
Prior art keywords
impedance
control
gaussian process
learning
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711393308.7A
Other languages
Chinese (zh)
Other versions
CN108153153B (en)
Inventor
夏桂华
李超
张智
谢心如
朱齐丹
蔡成涛
吕晓龙
刘志林
班瑞阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Engineering University
Priority to CN201711393308.7A
Publication of CN108153153A
Application granted
Publication of CN108153153B
Legal status: Active (granted)


Classifications

    • G — PHYSICS
    • G05 — CONTROLLING; REGULATING
    • G05B — CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B 13/00 — Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B 13/02 — Adaptive control systems ... electric
    • G05B 13/04 — Adaptive control systems ... electric, involving the use of models or simulators
    • G05B 13/042 — ... in which a parameter or coefficient is automatically adjusted to optimise the performance
    • G05B 13/048 — ... using a predictor

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Feedback Control In General (AREA)

Abstract

The present invention provides a learning impedance control system and control method. The system mainly comprises four parts: an impedance controller, a Gaussian process model of the system, a variable impedance control strategy, and a policy learning algorithm. No prior knowledge of the environment is required: the Gaussian process model of the system is built from interaction data, and long-term inference and planning are carried out on the system in a Bayesian manner. More useful information can thus be extracted from limited observation data, so that complex force control tasks can be learned with a minimum number of interactions. By adding an energy loss term to the cost function, a trade-off between error and energy is realized, giving the robot good compliance. Finally, the learned impedance control strategy can adjust the target stiffness and damping parameters simultaneously according to the system state in different phases of a task. The invention can be widely applied to compliance control tasks such as dual-arm assembly, multi-arm cooperation, and biped robot gait control, ensuring the safety and robustness of interactive operation.

Description

A learning impedance control system and control method
Technical field
The present invention relates to robot compliance control, and in particular to an efficient learning impedance control system and control method.
Background art
Robots are increasingly used for contact operation tasks in unstructured environments, such as compliant assembly and human-robot interaction. Because such tasks are complex and the contact environment is changeable and unpredictable, it is difficult to establish a precise dynamic model of the system. How to let a robot safely, efficiently, and rapidly perform new tasks and accurately control the contact force in varying environments is therefore a new challenge. Impedance control is widely used in robot interactive control tasks because of its good adaptability and robustness. Since the force control characteristics are determined by the impedance parameters of the robot, and the choice of inertia, stiffness, and damping parameters is highly task dependent and generally difficult to infer from prior knowledge, obtaining good control performance usually requires a deep understanding of the controller design and its parameters, and the control parameters still need to be tuned manually. For complex tasks in particular, since the environmental conditions generally contain nonlinear and time-varying factors, an impedance controller with fixed, preset parameters can hardly accomplish the goal task. If the impedance control parameters can instead be dynamically planned and adjusted according to changes in the task and the environment, the control performance is significantly better than with fixed impedance parameters. Learning variable impedance control is thus the key to modern robots completing complex operation tasks safely and quickly.
For operation tasks that require force control, the fewer learning trials the better: a large number of physical interaction attempts may damage the robot or the workpiece, and collecting large amounts of sampled data is time-consuming and expensive, which makes it impractical. Improving the learning efficiency of a learning impedance control algorithm and reducing the number of trial-and-error interactions required is therefore essential for a robot to quickly learn to complete new tasks.
Summary of the invention
The purpose of the present invention is to provide a learning impedance control system with high learning efficiency that can be widely applied to compliance control tasks such as dual-arm assembly, multi-arm cooperation, and biped robot gait control, ensuring the safety and robustness of interactive operation. A further purpose of the present invention is to provide a control method based on the learning impedance control system.
The learning impedance control system of the present invention comprises an impedance controller, a Gaussian process model module of the system, a variable impedance control strategy module, and a policy learning algorithm module, wherein:
the Gaussian process model module of the system establishes a Gaussian process model of the system from the actual position of the robot end-effector and the force sensor information, serving as the state-transition dynamics model of the control system;
the policy learning algorithm module uses the Gaussian process model of the system to infer and predict the long-term distribution of the control system state by cascading one-step predictions, and then performs internal simulation with this model to predict the behavior of the control system;
the variable impedance control strategy module computes the impedance parameters, i.e. the target stiffness and damping coefficient, in real time from the control system state, i.e. the end-effector position and the actual contact force, and passes them to the impedance controller;
the impedance controller corrects the desired reference trajectory according to the time-varying target stiffness, damping coefficient, and current contact force error, and outputs the desired position increment of the end-effector.
The control method based on the learning impedance control system of the present invention includes the following steps:
(1) randomly initialize the control variable u = [K_d(t), B_d(t)], apply it to the control system, and record the initial data [X, F_a], where K_d(t) is the target stiffness, B_d(t) is the damping coefficient, X is the end-effector position, and F_a is the actual contact force;
(2) establish the Gaussian process dynamics model of the system from the historical sampled data [X, F_a], serving as the state-transition dynamics model of the system;
(3) search for the optimal impedance control strategy π(θ) using the policy learning algorithm;
(4) set the strategy π* ← π(θ), apply it to the impedance controller to perform force control, and collect new data [X, F_a];
(5) repeat steps (2)-(4) until a satisfactory force tracking effect is obtained and a satisfactory control strategy has been learned.
The control method based on the learning impedance control system of the present invention can further include:
1. Establishing the Gaussian process dynamics model of the system specifically includes:
(1) the Gaussian process model is f ~ GP(m, k), where the prior mean is chosen as m ≡ 0 and the squared-exponential kernel is chosen as k(x_p, x_q) = α² exp(−(x_p − x_q)ᵀ Λ⁻¹ (x_p − x_q)/2);
(2) the state and the control input form the input tuple of the Gaussian process, and the state increment is the training target;
(3) given N groups of training inputs X = [x_1, ..., x_n] and corresponding training targets y = [y_1, ..., y_n]ᵀ, the hyperparameters of the Gaussian process model are learned using the evidence maximization algorithm.
Here x_t ∈ R^D is the observable state, y_t = Δ_t + ε is the training target, Δ_t is the state increment, ε ~ N(0, Σ_ε) is independent identically distributed system noise, α² is the signal variance of the latent function f, Λ = diag([l_1², ..., l_D²]), and l_i is the characteristic length-scale of each input dimension.
2. The policy learning algorithm specifically includes:
(1) apply the control strategy π to the Gaussian process model of the system and perform internal simulation to predict the behavior and performance of the system;
(2) carry out long-term inference and prediction of the state, p(x_1 | π), ..., p(x_T | π), using the learned Gaussian process model;
(3) evaluate the expected total cost J^π(θ) over the time horizon T;
(4) compute the gradient dJ^π(θ)/dθ of the cost with respect to the policy parameters, find the optimal policy π* ← π(θ) using a gradient-based policy search algorithm, and update the policy parameters θ;
(5) repeat steps (1)-(4) until the policy parameters θ converge.
3. The impedance controller is a position-based indirect impedance controller; it corrects the desired reference trajectory according to the contact force error and the time-varying target stiffness and damping coefficient, and obtains the desired position increment δX of the end-effector.
The concrete form of the impedance controller is:
ω_1 = 4M_d(t) + 2B_d(t)T + K_d(t)T²
ω_2 = −8M_d(t) + 2K_d(t)T²
ω_3 = 4M_d(t) − 2B_d(t)T + K_d(t)T²
where T is the control period.
The present invention provides an efficient learning impedance control system and control method, enabling a robot to learn efficiently and autonomously to complete force control tasks.
The main features of the technical scheme of the present invention are:
(1) the impedance controller can correct the desired reference trajectory according to the time-varying target stiffness and damping coefficient;
(2) the Gaussian process model of the system is a probabilistic model established from actual sampled data, serving as the state-transition dynamics model of the system;
(3) the impedance control strategy is a probabilistic Gaussian process control strategy, represented by a mean function and a variance function; it computes the impedance parameters, the target stiffness K_d(t) and damping coefficient B_d(t), in real time from the system state, the end-effector position X and the actual contact force F_a, and passes them to the impedance controller;
(4) the policy learning algorithm learns with a model-based reinforcement learning algorithm: by cascading one-step predictions it infers the long-term distribution of the system state, and then performs internal simulation with this model to predict the behavior of the system; an energy loss term is added to the cost function, so that the impedance gains required to complete the task are reduced by penalizing control actions, realizing a trade-off between error and energy minimization.
The impedance controller is specifically:
(1) a position-based indirect impedance controller, which corrects the desired reference trajectory according to the contact force error and the time-varying target stiffness and damping coefficient, and obtains the desired position increment δX of the end-effector;
(2) with control period T and M_d(t), K_d(t), B_d(t) respectively the target inertia, time-varying target stiffness, and time-varying damping coefficient, the concrete form of the impedance controller is:
ω_1 = 4M_d(t) + 2B_d(t)T + K_d(t)T²
ω_2 = −8M_d(t) + 2K_d(t)T²
ω_3 = 4M_d(t) − 2B_d(t)T + K_d(t)T²
The Gaussian process model of the system is specifically:
(1) the Gaussian process model is f ~ GP(m, k), where the prior mean is chosen as m ≡ 0 and the squared-exponential kernel is chosen as k(x_p, x_q) = α² exp(−(x_p − x_q)ᵀ Λ⁻¹ (x_p − x_q)/2);
(2) the state and the control input form the input tuple of the Gaussian process, and the state increment is the training target;
(3) given N groups of training inputs X = [x_1, ..., x_n] and corresponding training targets y = [y_1, ..., y_n]ᵀ, the hyperparameters of the Gaussian process model are learned using the evidence maximization algorithm.
The impedance control strategy is specifically:
(1) a Gaussian process controller π(x_t), a probabilistic Gaussian process control strategy represented by a mean function and a variance function, where x_t is the observable state of the robot, the strategy output u_t is the target stiffness K_d(t) and damping coefficient B_d(t) of the impedance controller, and θ is the control policy parameter vector to be learned;
(2) a bounded, differentiable trapezoidal saturation function S(π_t) = u_min + u_max/2 + (u_max/2)[9 sin(π_t) + sin(3π_t)]/8 is used to respect the physical bounds of the control parameter u, limiting the control variable u to the interval [u_min, u_min + u_max], where u_max is the maximum amplitude range of the control variable and u_min is its minimum limit.
The policy learning algorithm is specifically:
(1) apply the control strategy π to the system model, the Gaussian process model, and perform internal simulation to predict the behavior and performance of the system;
(2) carry out long-term inference and prediction of the state, p(x_1 | π), ..., p(x_T | π), using the Gaussian process model;
(3) evaluate the expected total cost over the time horizon T, J^π(θ) = Σ_{t=1}^{T} E[c(x_t)], where the instantaneous cost function comprises two parts, a state error cost c_b(x_t) = 1 − exp(−d(x_t, x_target)²/(2σ_c²)) and an energy loss term c_e(u_t) = c_e(π(x_t)) = ζ(u_t/u_max)², in which d(·) is the Euclidean distance, σ_c is the width of the cost function, ζ is the energy loss factor, and u_t is the current control input;
(4) compute the gradient dJ^π(θ)/dθ of the cost with respect to the policy parameters, find the optimal policy π* ← π(θ) using a gradient-based policy search algorithm, and update the policy parameters θ.
In order to enable a robot to autonomously learn to complete complex force control tasks in unstructured environments and accurately control the contact force in contact operation tasks, the present invention proposes a new scheme in which a model-based reinforcement learning algorithm learns to adjust the robot's impedance parameters. The scheme mainly comprises four parts: the impedance controller, the Gaussian process model of the system, the impedance control strategy, and the policy learning algorithm. Its main feature is that no prior knowledge of the environment is needed: the Gaussian process model of the system is built from interaction data, and long-term inference and planning are carried out on the system in a Bayesian manner. In this way, more useful information can be extracted from limited observation data, and complex force control tasks can be learned with a minimum number of interactions. By adding an energy loss term to the cost function, a trade-off between error and energy is realized, giving the robot good compliance. Finally, the learned impedance control strategy can adjust the target stiffness and damping parameters simultaneously according to the system state in different phases of a task. The present invention enables a robot to efficiently and autonomously learn to complete complex force control tasks in unstructured environments; only a few interactions are needed to learn the optimal control strategy, so the method is data efficient. It can be widely applied to compliance control tasks such as dual-arm assembly, multi-arm cooperation, and biped robot gait control, ensuring the safety and robustness of interactive operation.
The present invention addresses the sample-efficiency problem in robot learning impedance control: by extracting more useful information from observation data, it reduces the number of interactions needed to learn a force control task and realizes efficient autonomous learning of compliance control to the greatest extent. It provides an important reference for learning compliance control and may be directly applied to robots that require contact force control.
Description of the drawings
Fig. 1 is the structure diagram of the system of the present invention;
Fig. 2 is the flow chart of the method for the present invention;
Fig. 3 is the policy learning algorithm flow chart of the present invention.
Specific embodiment
The present invention is described below in more detail with examples.
Fig. 1 shows the system structure for the learning impedance control method; each part inside the dashed box is a concrete structure of the invention, including the impedance controller, the Gaussian process model of the system, the impedance control strategy, and the policy learning algorithm. Specifically:
1) the impedance controller corrects the desired reference trajectory according to the time-varying target stiffness, damping coefficient, and current contact force error F_e, and computes and outputs the desired position increment δX of the end-effector;
2) the Gaussian process model of the system is established from the sampled data, the actual position X of the robot end-effector and the force sensor information F_a, serving as the state-transition dynamics model of the system;
3) the policy learning algorithm uses the Gaussian process model to infer and predict the long-term distribution of the system state by cascading one-step predictions, then performs internal simulation with this model to predict the behavior of the system, and obtains the impedance control strategy π by minimizing the expected cost with a model-based reinforcement learning algorithm;
4) the impedance control strategy is a probabilistic Gaussian process control strategy, represented by a mean function and a variance function; it computes the impedance parameters, the target stiffness K_d(t) and damping coefficient B_d(t), in real time from the system state, the end-effector position X and the actual contact force F_a, and passes them to the impedance controller.
In Fig. 1, F_d is the desired contact force, X_d is the desired position, X_e is the total desired position of the end-effector, q_d is the desired joint position computed from the inverse kinematics of the robot, q is the measured joint position, and K_E, B_E are the unknown environment stiffness and damping, respectively.
As shown in Fig. 2, the method of the present invention mainly includes five steps:
1) randomly initialize the control variable u = [K_d(t), B_d(t)], apply it to the system, and record the initial data [X, F_a];
2) establish the Gaussian process dynamics model of the system from the historical sampled data [X, F_a], serving as the state-transition dynamics model of the system;
3) search for the optimal impedance control strategy π(θ) using the policy learning algorithm;
4) set the strategy π* ← π(θ), apply it to the system to perform force control, and collect new data [X, F_a];
5) repeat steps 2)-4) until a satisfactory force tracking effect is obtained and a satisfactory control strategy has been learned.
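The five steps above form a data-collection / model-fitting / policy-search loop. The following is a minimal illustrative sketch of that loop only; the three helper functions are deliberately trivial stubs (a toy linear "robot", least squares in place of the Gaussian process, a random perturbation in place of the gradient-based search) standing in for the components described in the following sections, and none of the names or dynamics come from the patent itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Illustrative stubs for the real components ---

def rollout_on_robot(theta, horizon, random=False):
    """Stub for one force-control episode: returns (state, action) input
    tuples and the observed state increments.  A real system would run the
    impedance controller on the robot and log X and F_a."""
    xs, ys = [], []
    x = np.zeros(2)                          # toy state: [position, contact force]
    for _ in range(horizon):
        u = rng.normal(size=2) if random else np.tanh(theta * x.sum())
        delta = -0.1 * x + 0.05 * u          # toy transition dynamics
        xs.append(np.concatenate([x, u]))    # input tuple (state, control)
        ys.append(delta)                     # training target: state increment
        x = x + delta
    return xs, ys

def fit_gp_model(X, y):
    """Stub for GP training (step 2): plain least squares here."""
    W, *_ = np.linalg.lstsq(X, y, rcond=None)
    return W

def policy_search(model, theta):
    """Stub for the gradient-based policy search (step 3)."""
    return theta + 0.01 * rng.normal(size=theta.shape)

# --- The five-step learning loop itself ---

def learn_impedance_control(n_episodes=3, horizon=20):
    theta = rng.normal(size=2)                                    # step 1
    data_X, data_y = rollout_on_robot(theta, horizon, random=True)
    for _ in range(n_episodes):
        model = fit_gp_model(np.array(data_X), np.array(data_y))  # step 2
        theta = policy_search(model, theta)                       # step 3
        xs, ys = rollout_on_robot(theta, horizon)                 # step 4
        data_X += xs; data_y += ys                                # step 5: repeat
    return theta, len(data_X)

theta, n_samples = learn_impedance_control()
```

The point of the sketch is the data flow: every interaction enlarges the dataset, the model is refit, and the policy is re-optimized against the model rather than against the robot.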
(1) The impedance controller
To make the end-effector exhibit the desired dynamic behavior, a second-order impedance model is used:
M_d(t)(Ẍ − Ẍ_d) + B_d(t)(Ẋ − Ẋ_d) + K_d(t)(X − X_d) = E = F_d − F (1)
where M_d(t), B_d(t), K_d(t) are the time-varying target inertia, target damping, and target stiffness matrices of the impedance model; Ẍ, Ẋ, X are the actual acceleration, velocity, and position of the robot end-effector in Cartesian space; Ẍ_d, Ẋ_d, X_d are the desired acceleration, velocity, and position of the end-effector; and F_d and F are the desired and actual contact forces between the end-effector and the environment.
To obtain the modified desired position increment, the Laplace transform is applied to the second-order impedance model, which is then discretized with the bilinear transform s = 2T⁻¹(z − 1)(z + 1)⁻¹, giving:
δX(z)(ω_1 z² + ω_2 z + ω_3) = T²(z + 1)² E(z) (2)
ω_1 = 4M_d(t) + 2B_d(t)T + K_d(t)T² (3)
ω_2 = −8M_d(t) + 2K_d(t)T² (4)
ω_3 = 4M_d(t) − 2B_d(t)T + K_d(t)T² (5)
where T is the control period. The difference equation of the impedance controller, i.e. the desired position increment of the end-effector, is then:
δX(n) = {T²[E(n) + 2E(n−1) + E(n−2)] − ω_2 δX(n−1) − ω_3 δX(n−2)}/ω_1 (6)
To simplify the computation, the target inertia matrix is set to the constant M_d(t) = I, so the impedance controller only needs to adjust the desired position according to the time-varying target stiffness K_d(t), damping coefficient B_d(t), and contact force error E(n).
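The discretized controller of equations (3)-(6) can be sketched for the scalar case as a single update function; the reconstructed difference equation, the function name, and the numeric gains used below are illustrative, not taken from the patent.

```python
import numpy as np

def impedance_position_increment(E_hist, dX_hist, Md, Bd, Kd, T):
    """One step of the position-based impedance filter (scalar sketch).

    E_hist  = [E(n), E(n-1), E(n-2)]  contact force errors
    dX_hist = [dX(n-1), dX(n-2)]      previous position increments
    Md, Bd, Kd : target inertia, damping, stiffness
    T : control period

    The coefficients follow from the bilinear (Tustin) discretization of
    the transfer function 1 / (Md*s^2 + Bd*s + Kd).
    """
    w1 = 4 * Md + 2 * Bd * T + Kd * T**2
    w2 = -8 * Md + 2 * Kd * T**2
    w3 = 4 * Md - 2 * Bd * T + Kd * T**2
    num = T**2 * (E_hist[0] + 2 * E_hist[1] + E_hist[2])
    return (num - w2 * dX_hist[0] - w3 * dX_hist[1]) / w1
```

A quick sanity check on the design: the DC gain of the filter is 1/Kd, so under a constant force error E the position increment should settle at E/Kd, which is what makes a low learned stiffness correspond to a compliant response.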
(2) The Gaussian process model of the system
The Gaussian process model is a nonparametric probabilistic model, represented by a mean function m(·) and a positive semi-definite covariance function k(·, ·). Let the dynamics equation describing the system be:
x_t = f(x_{t−1}, u_{t−1}) (7)
y_t = x_t + ε_t (8)
where x_t ∈ R^D is the observable state, here the actual position X of the robot end-effector and the actual contact force F_a; u_t is the control input, with K_d(t) the target stiffness and B_d(t) the damping coefficient; y_t = Δ_t + ε is the training target, where Δ_t is the state increment and ε ~ N(0, Σ_ε) is independent identically distributed system noise. The latent function f is modelled as a Gaussian process f ~ GP(m, k) with training input tuples x̃ = (x, u) and independent identically distributed measurement noise.
In order to account for model uncertainty in prediction and planning and avoid the certainty-equivalence assumption of learned models, the posterior distribution of the latent function f is inferred with a Gaussian process from the collected sampled data, describing all plausible dynamics models. For computational simplicity, the prior mean is chosen as m ≡ 0, and the squared-exponential kernel is chosen:
k(x_p, x_q) = α² exp(−(x_p − x_q)ᵀ Λ⁻¹ (x_p − x_q)/2) (9)
where α² is the signal variance of the latent function f, Λ = diag([l_1², ..., l_D²]), and l_i is the characteristic length-scale of each input dimension. Given N groups of training inputs X = [x_1, ..., x_n] and corresponding training targets y = [y_1, ..., y_n]ᵀ, the hyperparameters of the Gaussian process model
θ_GP = [α², Λ, σ_ε²] (10)
can be learned using the evidence maximization algorithm.
Given a deterministic test input x*, the posterior predictive distribution p(f* | x*) of the function value f* = f(x*) is Gaussian:
p(f* | x*) = N(m_f(x*), σ_f²(x*)) (11)
m_f(x*) = k*ᵀ(K + σ_ε² I)⁻¹ y (12)
σ_f²(x*) = k** − k*ᵀ(K + σ_ε² I)⁻¹ k* (13)
where k* = k(X, x*), k** = k(x*, x*), and K = k(X, X) is the kernel matrix.
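The kernel (9) and posterior prediction (12)-(13) can be implemented directly with NumPy. This is a generic GP regression sketch with fixed hyperparameters (evidence maximization is omitted); the function names and default values are illustrative.

```python
import numpy as np

def se_kernel(Xp, Xq, alpha2, lengthscales):
    """Squared-exponential kernel, eq. (9):
    k(x_p, x_q) = alpha^2 exp(-1/2 (x_p-x_q)^T Lambda^-1 (x_p-x_q))."""
    d = (Xp[:, None, :] - Xq[None, :, :]) / lengthscales
    return alpha2 * np.exp(-0.5 * np.sum(d**2, axis=-1))

def gp_posterior(X, y, x_star, alpha2=1.0, lengthscales=None, noise=1e-2):
    """Posterior predictive mean and variance at test inputs x_star,
    eqs. (12)-(13), with fixed (not learned) hyperparameters."""
    if lengthscales is None:
        lengthscales = np.ones(X.shape[1])
    K = se_kernel(X, X, alpha2, lengthscales) + noise * np.eye(len(X))
    k_star = se_kernel(X, x_star, alpha2, lengthscales)   # k(X, x*)
    k_ss = se_kernel(x_star, x_star, alpha2, lengthscales)
    mean = k_star.T @ np.linalg.solve(K, y)               # eq. (12)
    cov = k_ss - k_star.T @ np.linalg.solve(K, k_star)    # eq. (13)
    return mean, np.diag(cov)
```

Note how the predictive variance (13) behaves: near training data it shrinks toward the noise level, and far from the data it reverts to the prior variance α², which is exactly the model-uncertainty signal the planner exploits.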
(3) The impedance control strategy
The impedance control strategy is defined as u_t = π(x_t, θ), where x_t is the observable state of the robot, the strategy output u_t is the target stiffness K_d(t) and damping coefficient B_d(t) of the impedance controller, and θ is the control policy parameter vector to be learned. A Gaussian process controller is selected as the control strategy π:
π(x_t) = k_π(X_π, x_t)ᵀ(K_π + σ_π² I)⁻¹ y_π (14)
where n is the number of basis functions of the Gaussian process controller, X_π is the training input, y_π is the training target, initialized to random values close to zero, Λ_π holds the characteristic length of each state, and α_π² is the signal variance; here α_π² ≡ 1, which makes the controller functionally similar to an RBF network, and σ_π² is the measurement noise variance. The hyperparameters of the Gaussian process control strategy π are therefore θ = {X_π, y_π, Λ_π, σ_π²}.
In a practical control system, the physical bounds of the control parameter u must be considered. The present invention selects the bounded, differentiable trapezoidal saturation function to limit the control variable u to the interval [u_min, u_min + u_max]:
S(π_t) = u_min + u_max/2 + (u_max/2)[9 sin(π_t) + sin(3π_t)]/8 (17)
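The saturation function is a one-liner; the form below is a reconstruction of eq. (17) from the stated bounds (the inner term (9 sin x + sin 3x)/8 lies in [−1, 1], so the output covers exactly [u_min, u_min + u_max]), and the function name is illustrative.

```python
import numpy as np

def trapezoidal_saturation(pi_t, u_min, u_max):
    """Bounded, differentiable squashing of the raw policy output pi_t
    into [u_min, u_min + u_max] (assumed reconstruction of eq. (17)).
    Since 9*sin(x) + sin(3x) = 12*sin(x) - 4*sin(x)**3, the bracketed
    term is exactly bounded by [-8, 8], hence s is in [-1, 1]."""
    s = (9.0 * np.sin(pi_t) + np.sin(3.0 * pi_t)) / 8.0
    return u_min + u_max / 2.0 + (u_max / 2.0) * s
```

Unlike a hard clamp, this squashing stays differentiable everywhere, which matters because the policy search propagates gradients through the control bounds.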
(4) The policy learning algorithm
As shown in Fig. 3, the policy learning algorithm mainly includes five steps:
1) apply the control strategy π to the system model, the Gaussian process model, and perform internal simulation to predict the behavior and performance of the system;
2) carry out long-term inference and prediction of the state, p(x_1 | π), ..., p(x_T | π), using the learned Gaussian process model;
3) evaluate the expected total cost J^π(θ) over the time horizon T;
4) compute the gradient dJ^π(θ)/dθ of the cost with respect to the policy parameters, find the optimal policy π* ← π(θ) using a gradient-based policy search algorithm, and update the policy parameters θ;
5) repeat steps 1)-4) until the policy parameters θ converge.
To obtain the optimal control strategy π*, the state must be evolved according to its long-term prediction, finding the policy parameters θ* that minimize the cost J^π(θ). The transition dynamics of the real system are represented by the Gaussian process model, and the long-term predictions p(x_1), ..., p(x_T) of the state distribution are obtained by cascading one-step predictions. Since the Gaussian process model propagates the input uncertainty, mapping a Gaussian-distributed state space into the target space, the model uncertainty is contained in the long-term planning, reducing the negative effect of model bias. The one-step state prediction can be summarized as:
p(x_{t−1}) → p(u_{t−1}) → p(x_{t−1}, u_{t−1}) → p(Δ_t) → p(x_t) (18)
Given p(x_{t−1}), predicting p(x_t) requires computing the joint distribution p(x̃_{t−1}) = p(x_{t−1}, u_{t−1}) according to the distribution of the control variable u_{t−1} = π(x_{t−1}): first compute the distribution p(u_{t−1}) of the predicted control variable, then the cross-covariance cov[x_{t−1}, u_{t−1}], and finally approximate p(x̃_{t−1}) as a Gaussian.
The predictive distribution of the training target Δ_t is
p(Δ_t) = ∫ p(f(x̃_{t−1}) | x̃_{t−1}) p(x̃_{t−1}) dx̃_{t−1} (20)
where the posterior predictive distribution p(f(x̃_{t−1}) | x̃_{t−1}) of the transition function can be computed from equations (11)-(13). Using moment matching, the distribution p(Δ_t) of the training target is approximated as a Gaussian N(μ_Δ, Σ_Δ). The desired state distribution p(x_t) is then approximated as a Gaussian with:
μ_t = μ_{t−1} + μ_Δ (22)
Σ_t = Σ_{t−1} + Σ_Δ + cov[x_{t−1}, Δ_t] + cov[Δ_t, x_{t−1}] (23)
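The analytic moment-matching propagation above is involved; purely as an illustration of the cascaded prediction chain (18), the same distribution rollout can be approximated by propagating samples through a known toy policy and transition function. Everything below (the Monte-Carlo substitution, the function names, the linear toy dynamics) is an assumption for illustration, not the patent's method.

```python
import numpy as np

def propagate_state_distribution(mu0, var0, policy, transition,
                                 T=20, n=5000, rng=None):
    """Monte-Carlo stand-in for the cascaded one-step prediction (18):
    p(x_{t-1}) -> p(u_{t-1}) -> p(Delta_t) -> p(x_t).
    The patent propagates Gaussians analytically by moment matching;
    here samples are propagated and the mean/variance of p(x_t) are
    re-estimated at every step."""
    rng = rng or np.random.default_rng(0)
    x = rng.normal(mu0, np.sqrt(var0), size=n)   # samples of p(x_0)
    means, variances = [], []
    for _ in range(T):
        u = policy(x)                            # p(u_{t-1})
        delta = transition(x, u)                 # p(Delta_t)
        x = x + delta                            # x_t = x_{t-1} + Delta_t
        means.append(x.mean())
        variances.append(x.var())
    return np.array(means), np.array(variances)

# toy example: stable scalar dynamics under a proportional policy
means, variances = propagate_state_distribution(
    mu0=1.0, var0=0.1,
    policy=lambda x: -0.5 * x,
    transition=lambda x, u: -0.2 * x + 0.3 * u)
```

For this stable toy system each step contracts the state by a factor 0.65, so both the predicted mean and variance shrink over the horizon, mirroring how the analytic recursion (22)-(23) tracks the evolving Gaussian.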
To assess the performance of the control strategy π, the total expected cost J^π(θ) over the time horizon T is used as the evaluation criterion. Applying the control strategy π to the system and evolving the state according to its long-term prediction, the total expected cost is:
J^π(θ) = Σ_{t=1}^{T} E[c(x_t)] (24)
where c(x_t) is the instantaneous cost at time t and E[c(x_t)] is the expectation of the instantaneous cost with respect to the predicted state distribution:
E[c(x_t)] = ∫ c(x_t) N(x_t | μ_t, Σ_t) dx_t (25)
To let the robot trade off error against energy minimization, exhibit the desired impedance characteristics, constrain the contact force to ensure safety, and achieve better compliance, an energy loss term is added to the cost function, so that the impedance gains required to complete the task are reduced by penalizing control actions. The instantaneous cost function is defined as:
c_t = c_b(x_t) + c_e(u_t) (27)
c_b(x_t) = 1 − exp(−d(x_t, x_target)²/(2σ_c²)) (28)
c_e(u_t) = c_e(π(x_t)) = ζ(u_t/u_max)² (29)
The instantaneous cost function c_t mainly includes two terms. c_b(x_t) is the state error cost, a quadratic saturating cost function: it is quadratic near the target and saturates at 1 when the deviation from the target state is large; d(·) is the Euclidean distance and σ_c is the width of the cost function. c_e(u_t) is the energy loss term, i.e. the squared energy loss of the impedance gain; ζ is the energy loss factor, u_t is the current control input, and u_max is the maximum limit of the control input.
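The two cost terms (27)-(29) translate directly to code. The saturating form of c_b is a reconstruction of eq. (28) from its description; function names and the default parameter values are illustrative.

```python
import numpy as np

def saturating_state_cost(x, x_target, sigma_c):
    """State error cost c_b, assumed reconstruction of eq. (28):
    quadratic near the target, saturating at 1 far from it."""
    d2 = np.sum((np.asarray(x) - np.asarray(x_target))**2)  # d(x, x_target)^2
    return 1.0 - np.exp(-d2 / (2.0 * sigma_c**2))

def energy_loss_cost(u, u_max, zeta):
    """Energy loss term c_e = zeta * (u / u_max)^2, eq. (29); penalizing
    control actions keeps the learned stiffness and damping gains low."""
    return zeta * np.sum((np.asarray(u) / u_max)**2)

def instantaneous_cost(x, u, x_target, sigma_c=0.5, u_max=100.0, zeta=0.1):
    """c_t = c_b(x_t) + c_e(u_t), eq. (27)."""
    return (saturating_state_cost(x, x_target, sigma_c)
            + energy_loss_cost(u, u_max, zeta))
```

Saturation is the design choice worth noting: because c_b tops out at 1, states far from the target all look equally bad, so the learner is not punished extra for early exploratory excursions, while the ζ-weighted energy term steers it toward the most compliant gains that still track the force.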
Then, according to the chain rule, the gradient of the expected cost with respect to the controller parameters θ is computed, and a gradient-based policy search method is used to obtain the controller parameters θ* that minimize J^π(θ).
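The patent computes dJ^π(θ)/dθ analytically via the chain rule; as a self-contained sketch of the surrounding gradient-based search, the snippet below substitutes a finite-difference gradient and plain gradient descent on an arbitrary cost function. The optimizer shape is generic, not the patent's specific derivation.

```python
import numpy as np

def policy_search(J, theta0, lr=0.1, iters=200, eps=1e-5):
    """Gradient-based policy search sketch: estimate dJ/dtheta by central
    finite differences (standing in for the analytic chain-rule gradient)
    and descend with a fixed learning rate."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(iters):
        grad = np.zeros_like(theta)
        for i in range(theta.size):
            e = np.zeros_like(theta)
            e[i] = eps
            grad[i] = (J(theta + e) - J(theta - e)) / (2.0 * eps)
        theta = theta - lr * grad            # update the policy parameters
    return theta

# toy expected-cost surface with its minimum at theta* = [1, -2]
theta_star = policy_search(
    lambda th: np.sum((th - np.array([1.0, -2.0]))**2),
    theta0=[0.0, 0.0])
```

In practice an analytic gradient (or a quasi-Newton method on top of it) converges in far fewer cost evaluations than finite differences, which is why the chain-rule computation matters for data efficiency.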

Claims (6)

1. A learning impedance control system, characterized in that it comprises an impedance controller, a Gaussian process model module of the system, a variable impedance control strategy module, and a policy learning algorithm module, wherein:
the Gaussian process model module of the system establishes a Gaussian process model of the system from the actual position of the robot end-effector and the force sensor information, serving as the state-transition dynamics model of the control system;
the policy learning algorithm module uses the Gaussian process model of the system to infer and predict the long-term distribution of the control system state by cascading one-step predictions, and then performs internal simulation with this model to predict the behavior of the control system;
the variable impedance control strategy module computes the impedance parameters, i.e. the target stiffness and damping coefficient, in real time from the control system state, i.e. the end-effector position and the actual contact force, and passes them to the impedance controller;
the impedance controller corrects the desired reference trajectory according to the time-varying target stiffness, damping coefficient, and current contact force error, and outputs the desired position increment of the end-effector.
2. A control method based on the learning impedance control system according to claim 1, characterized by:
(1) randomly initializing the control variable u = [K_d(t), B_d(t)], applying it to the control system, and recording the initial data [X, F_a], where K_d(t) is the target stiffness, B_d(t) is the damping coefficient, X is the end-effector position, and F_a is the actual contact force;
(2) establishing the Gaussian process dynamics model of the system from the historical sampled data [X, F_a], serving as the state-transition dynamics model of the system;
(3) searching for the optimal impedance control strategy π(θ) using the policy learning algorithm;
(4) setting the strategy π* ← π(θ), applying it to the impedance controller to perform force control, and collecting new data [X, F_a];
(5) repeating steps (2)-(4) until a satisfactory force tracking effect is obtained and a satisfactory control strategy has been learned.
3. the control method according to claim 2 based on study impedance control system, it is characterized in that the foundation The Gaussian process dynamic model of system specifically includes:
(1) Gaussian process model isIt is m ≡ 0 wherein to choose priori mean value, chooses square index core Function is
(2) take the state and the control quantity as the input tuple of the Gaussian process and the state increment as the training target;
(3) given N groups of training inputs X = [x1, ..., xN] and corresponding training targets y = [y1, ..., yN]^T, learn the hyperparameters of the Gaussian process model using the evidence maximization algorithm;
where (x, u) is the observable state tuple, y = Δt + ε is the training target, Δt is the state increment, ε ~ N(0, σε^2) is the independent identically distributed system noise, α^2 is the signal variance of the latent function f, Λ = diag([l1^2, ..., lD^2]), and li is the characteristic length of each input dimension.
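The model in claim 3 — zero prior mean, squared exponential kernel, (state, control) input tuples, state-increment targets — can be sketched as a small GP regressor. The hyperparameters (α^2, length-scales, noise) are fixed here rather than learned by evidence maximisation, and the toy dynamics are an assumption.

```python
import numpy as np

def se_kernel(A, B, alpha2, lengthscales):
    """Squared exponential kernel matrix between row sets A and B:
    k(xp, xq) = alpha2 * exp(-0.5 * sum(((xp - xq) / l)**2))."""
    D = (A[:, None, :] - B[None, :, :]) / lengthscales
    return alpha2 * np.exp(-0.5 * np.sum(D ** 2, axis=2))

def gp_posterior_mean(X, y, Xstar, alpha2=1.0, lengthscales=None, noise=1e-4):
    """Posterior mean of a zero-prior-mean GP at test inputs Xstar."""
    if lengthscales is None:
        lengthscales = np.ones(X.shape[1])
    K = se_kernel(X, X, alpha2, lengthscales) + noise * np.eye(len(X))
    Ks = se_kernel(Xstar, X, alpha2, lengthscales)
    return Ks @ np.linalg.solve(K, y)

# toy data: input tuple (state x, control u), target = state increment
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(40, 2))       # [state, control] input tuples
y = 0.1 * X[:, 1] - 0.05 * X[:, 0]         # training targets (increments)

pred = gp_posterior_mean(X, y, X[:5])      # predict at training inputs
```

Because the target function is smooth, the posterior mean interpolates the training targets almost exactly at the training inputs.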
4. The control method based on the learning variable impedance control system according to claim 2 or 3, characterized in that the policy learning algorithm specifically comprises:
(1) apply the control strategy π to the Gaussian process model of the system and perform internal simulation to predict the behavior and performance of the system;
(2) perform long-term inference and prediction of the states p(x1|π), ..., p(xT|π) using the learned Gaussian process model;
(3) evaluate the expected total cost Jπ(θ) over the time horizon T;
(4) compute the gradient dJπ(θ)/dθ of the cost with respect to the policy parameters, search for the optimal policy π* ← π(θ) using a gradient-based policy search algorithm, and update the policy parameters θ;
(5) repeat steps (1)-(4) until the policy parameters θ converge.
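The steps above can be sketched with a known 1-D model x' = x + u standing in for the learned Gaussian process, and a finite-difference gradient standing in for the analytic gradient dJπ(θ)/dθ; both substitutions are for brevity only.

```python
def rollout_cost(theta, x0=1.0, T=10):
    """Steps (1)-(3): internal simulation of the linear policy
    u = -theta * x on the model x' = x + u; total cost over horizon T."""
    x, cost = x0, 0.0
    for _ in range(T):
        u = -theta * x            # linear policy pi(theta)
        x = x + u                 # cascaded one-step model prediction
        cost += x ** 2
    return cost

def policy_search(theta=0.2, lr=0.05, eps=1e-5, tol=1e-8, max_iter=500):
    """Steps (4)-(5): gradient descent on J until theta converges."""
    for _ in range(max_iter):
        grad = (rollout_cost(theta + eps)
                - rollout_cost(theta - eps)) / (2 * eps)
        new_theta = theta - lr * grad
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    return theta

theta_star = policy_search()
# for this model the deadbeat policy theta = 1 drives x to zero in one step
```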
5. The control method based on the learning variable impedance control system according to claim 2 or 3, characterized in that: the variable impedance controller is a position-based indirect impedance controller that corrects the desired reference trajectory according to the contact force error and the time-varying target stiffness and damping coefficient, obtaining the desired position increment δX of the manipulator end-effector;
The concrete form of the impedance controller is:
ω1 = 4Md(t) + 2Bd(t)T + Kd(t)T^2
ω2 = -8Md(t) + 2Kd(t)T^2
ω3 = 4Md(t) - 2Bd(t)T + Kd(t)T^2
where T is the control cycle.
6. The control method based on the learning variable impedance control system according to claim 4, characterized in that: the impedance controller is a position-based indirect impedance controller that corrects the desired reference trajectory according to the contact force error and the time-varying target stiffness and damping coefficient, obtaining the desired position increment δX of the manipulator end-effector;
The concrete form of the impedance controller is:
ω1 = 4Md(t) + 2Bd(t)T + Kd(t)T^2
ω2 = -8Md(t) + 2Kd(t)T^2
ω3 = 4Md(t) - 2Bd(t)T + Kd(t)T^2
where T is the control cycle.
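A direct implementation of the discrete impedance law using the ω coefficients above. The recursion ω1·δX(k) + ω2·δX(k-1) + ω3·δX(k-2) = T^2·(Fe(k) + 2Fe(k-1) + Fe(k-2)) is reconstructed here from those coefficients (they match the Tustin discretisation of Md·δẌ + Bd·δẊ + Kd·δX = Fe); the claim text itself gives only the ω definitions.

```python
import numpy as np

def impedance_filter(fe, Md, Bd, Kd, T):
    """Desired position increments dX for a force-error series fe,
    via the second-order difference equation with Tustin coefficients."""
    w1 = 4 * Md + 2 * Bd * T + Kd * T ** 2
    w2 = -8 * Md + 2 * Kd * T ** 2
    w3 = 4 * Md - 2 * Bd * T + Kd * T ** 2
    dX = np.zeros(len(fe))
    for k in range(len(fe)):
        f1 = fe[k - 1] if k >= 1 else 0.0
        f2 = fe[k - 2] if k >= 2 else 0.0
        x1 = dX[k - 1] if k >= 1 else 0.0
        x2 = dX[k - 2] if k >= 2 else 0.0
        dX[k] = (T ** 2 * (fe[k] + 2 * f1 + f2) - w2 * x1 - w3 * x2) / w1
    return dX

# a constant force error of 5 N with Kd = 100 N/m should settle at the
# static compliance dX = Fe/Kd = 0.05 m (since w1 + w2 + w3 = 4*Kd*T^2)
fe = np.full(400, 5.0)
dX = impedance_filter(fe, Md=1.0, Bd=40.0, Kd=100.0, T=0.01)
```

A useful sanity check on the coefficients: summing them gives ω1 + ω2 + ω3 = 4Kd·T^2, so the steady-state increment is exactly Fe/Kd regardless of Md and Bd.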
CN201711393308.7A 2017-12-19 2017-12-19 Learning variable impedance control system and control method Active CN108153153B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711393308.7A CN108153153B (en) 2017-12-19 2017-12-19 Learning variable impedance control system and control method


Publications (2)

Publication Number Publication Date
CN108153153A true CN108153153A (en) 2018-06-12
CN108153153B CN108153153B (en) 2020-09-11

Family

ID=62464705

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711393308.7A Active CN108153153B (en) 2017-12-19 2017-12-19 Learning variable impedance control system and control method

Country Status (1)

Country Link
CN (1) CN108153153B (en)



Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104626168A (en) * 2014-12-16 2015-05-20 苏州大学 Robot force position compliant control method based on intelligent algorithm
US20170007308A1 (en) * 2015-07-08 2017-01-12 Research & Business Foundation Sungkyunkwan University Apparatus and method for discriminating biological tissue, surgical apparatus using the apparatus
CN105213153A (en) * 2015-09-14 2016-01-06 西安交通大学 Based on the lower limb rehabilitation robot control method of brain flesh information impedance
CN106406098A (en) * 2016-11-22 2017-02-15 西北工业大学 Man-machine interaction control method of robot system in unknown environment
CN106938470A (en) * 2017-03-22 2017-07-11 华中科技大学 A kind of device and method of Robot Force control teaching learning by imitation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GUIHUA XIA et al.: "Hybrid force/position control of industrial robotic manipulator based on Kalman filter", 2016 IEEE International Conference on Mechatronics and Automation *
LI Erchao et al.: "Fuzzy adaptive impedance control of robots based on neural-network visual servoing", Transactions of China Electrotechnical Society *

Cited By (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108972546A (en) * 2018-06-22 2018-12-11 华南理工大学 A kind of robot constant force curved surface tracking method based on intensified learning
CN109062032A (en) * 2018-10-19 2018-12-21 江苏省(扬州)数控机床研究院 A kind of robot PID impedance control method based on Approximate dynamic inversion
CN109702740A (en) * 2018-12-14 2019-05-03 中国科学院深圳先进技术研究院 Robot compliance control method, apparatus, equipment and storage medium
CN109702740B (en) * 2018-12-14 2020-12-04 中国科学院深圳先进技术研究院 Robot compliance control method, device, equipment and storage medium
CN111352384A (en) * 2018-12-21 2020-06-30 罗伯特·博世有限公司 Method and evaluation unit for controlling an automated or autonomous movement mechanism
CN113966264A (en) * 2019-05-17 2022-01-21 西门子股份公司 Method, computer program product and robot control device for positioning an object that is movable during manipulation by a robot on the basis of contact, and robot
CN111673733B (en) * 2020-03-26 2022-03-29 华南理工大学 Intelligent self-adaptive compliance control method of robot in unknown environment
CN111673733A (en) * 2020-03-26 2020-09-18 华南理工大学 Intelligent self-adaptive compliance control method of robot in unknown environment
CN111687833A (en) * 2020-04-30 2020-09-22 广西科技大学 Manipulator inverse priority impedance control system and control method
CN111687832A (en) * 2020-04-30 2020-09-22 广西科技大学 Reverse priority impedance control system and method for redundant manipulator of space manipulator
CN111687835A (en) * 2020-04-30 2020-09-22 广西科技大学 Reverse priority impedance control system and method for redundant manipulator of underwater manipulator
CN111687834A (en) * 2020-04-30 2020-09-22 广西科技大学 Reverse priority impedance control system and method for redundant mechanical arm of mobile manipulator
CN111640495A (en) * 2020-05-29 2020-09-08 北京机械设备研究所 Variable force tracking control method and device based on impedance control
CN111640495B (en) * 2020-05-29 2024-05-31 北京机械设备研究所 Variable force tracking control method and device based on impedance control
CN111904795A (en) * 2020-08-28 2020-11-10 中山大学 Variable impedance control method for rehabilitation robot combined with trajectory planning
CN111904795B (en) * 2020-08-28 2022-08-26 中山大学 Variable impedance control method for rehabilitation robot combined with trajectory planning
CN112372630A (en) * 2020-09-24 2021-02-19 哈尔滨工业大学(深圳) Multi-mechanical-arm cooperative polishing force compliance control method and system
CN112372630B (en) * 2020-09-24 2022-02-22 哈尔滨工业大学(深圳) Multi-mechanical-arm cooperative polishing force compliance control method and system
CN112428278A (en) * 2020-10-26 2021-03-02 北京理工大学 Control method and device of mechanical arm and training method of man-machine cooperation model
CN112743540A (en) * 2020-12-09 2021-05-04 华南理工大学 Hexapod robot impedance control method based on reinforcement learning
CN112743540B (en) * 2020-12-09 2022-05-24 华南理工大学 Hexapod robot impedance control method based on reinforcement learning
CN112859868A (en) * 2021-01-19 2021-05-28 武汉大学 KMP (Kernel Key P) -based lower limb exoskeleton rehabilitation robot and motion trajectory planning algorithm
CN113427483A (en) * 2021-05-19 2021-09-24 广州中国科学院先进技术研究所 Double-machine manpower/bit multivariate data driving method based on reinforcement learning
CN113641099B (en) * 2021-07-13 2023-02-10 西北工业大学 Impedance control imitation learning training method for surpassing expert demonstration
CN113641099A (en) * 2021-07-13 2021-11-12 西北工业大学 Impedance control imitation learning training method for surpassing expert demonstration
CN114378820A (en) * 2022-01-18 2022-04-22 中山大学 Robot impedance learning method based on safety reinforcement learning
CN114193458A (en) * 2022-01-25 2022-03-18 中山大学 Robot control method based on Gaussian process online learning
CN114193458B (en) * 2022-01-25 2024-04-09 中山大学 Robot control method based on Gaussian process online learning
CN114789444A (en) * 2022-05-05 2022-07-26 山东省人工智能研究院 Compliant human-computer contact method based on deep reinforcement learning and impedance control
CN114789444B (en) * 2022-05-05 2022-12-16 山东省人工智能研究院 Compliant human-computer contact method based on deep reinforcement learning and impedance control
CN115496099A (en) * 2022-09-20 2022-12-20 哈尔滨工业大学 Filtering and high-order state observation method for mechanical arm sensor
CN115421387A (en) * 2022-09-22 2022-12-02 中国科学院自动化研究所 Variable impedance control system and control method based on inverse reinforcement learning
CN116643501A (en) * 2023-07-18 2023-08-25 湖南大学 Variable impedance control method and system for aerial working robot under stability constraint
CN116643501B (en) * 2023-07-18 2023-10-24 湖南大学 Variable impedance control method and system for aerial working robot under stability constraint
CN117817674A (en) * 2024-03-05 2024-04-05 纳博特控制技术(苏州)有限公司 Self-adaptive impedance control method for robot

Also Published As

Publication number Publication date
CN108153153B (en) 2020-09-11

Similar Documents

Publication Publication Date Title
CN108153153A (en) A kind of study impedance control system and control method
Li et al. A policy search method for temporal logic specified reinforcement learning tasks
Wu et al. Dynamic fuzzy neural networks-a novel approach to function approximation
Zhao et al. Model-free optimal control for affine nonlinear systems with convergence analysis
Hou et al. Fuzzy logic-driven variable time-scale prediction-based reinforcement learning for robotic multiple peg-in-hole assembly
CN103324196A (en) Multi-robot path planning and coordination collision prevention method based on fuzzy logic
CN105425820A (en) Unmanned aerial vehicle cooperative search method for moving object with perception capability
CN102819264A (en) Path planning Q-learning initial method of mobile robot
Liang et al. Search-based task planning with learned skill effect models for lifelong robotic manipulation
Vecchietti et al. Sampling rate decay in hindsight experience replay for robot control
US20230144995A1 (en) Learning options for action selection with meta-gradients in multi-task reinforcement learning
Su et al. Robot path planning based on random coding particle swarm optimization
Zhong et al. Modeling-learning-based actor-critic algorithm with Gaussian process approximator
Wu et al. Optimized least-squares support vector machine for predicting aero-optic imaging deviation based on chaotic particle swarm optimization
Komeno et al. Deep koopman with control: Spectral analysis of soft robot dynamics
Wang et al. Policy learning for nonlinear model predictive control with application to USVs
Li et al. Improved Q-learning based route planning method for UAVs in unknown environment
Lai et al. Deep neural network-based real-time trajectory planning for an automatic guided vehicle with obstacles
Contardo et al. Learning states representations in pomdp
Yongqiang et al. Path‐Integral‐Based Reinforcement Learning Algorithm for Goal‐Directed Locomotion of Snake‐Shaped Robot
Sharma et al. Wavelet reduced order observer based adaptive tracking control for a class of uncertain nonlinear systems using reinforcement learning
Wang et al. A compensation method for random error of gyroscopes based on support vector machine and beetle antennae search algorithm
Zhou et al. Switching deep reinforcement learning based intelligent online decision making for autonomous systems under uncertain environment
Yu et al. An intelligent robot motion planning method and application via lppo in unknown environment
Zhou et al. Research on the fuzzy algorithm of path planning of mobile robot

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant