WO2023124346A1 - Method and system for learning and regulating variable-stiffness motion skills of a collaborative robot - Google Patents

Method and system for learning and regulating variable-stiffness motion skills of a collaborative robot

Info

Publication number
WO2023124346A1
Authority
WO
WIPO (PCT)
Prior art keywords
stiffness
force
collaborative robot
matrix
data set
Prior art date
Application number
PCT/CN2022/123754
Other languages
English (en)
French (fr)
Inventor
吴鸿敏
徐智浩
周学峰
廖昭洋
瞿弘毅
Original Assignee
广东省科学院智能制造研究所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广东省科学院智能制造研究所
Publication of WO2023124346A1

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00: Programme-controlled manipulators
    • B25J9/16: Programme controls

Definitions

  • The invention relates to the technical field of automation control, and in particular to a method and system for learning and regulating variable-stiffness motion skills of a collaborative robot.
  • Collaborative robot motor skills imitation learning can acquire motor skills through the interaction between collaborative robots and humans, transfer human motor skills to collaborative robots, make full use of human operating experience, and enable collaborative robots to obtain highly human-like operating capabilities. It is recognized by the academic community as one of the most effective solutions for fast programming, efficient adaptation and dexterous operation of collaborative robots.
  • Existing imitation learning methods for motion skills focus on imitating motion features such as position and velocity rather than features such as posture, stiffness, and manipulability, and they rely on time-driven motion trajectories. They therefore cannot handle skill learning with high-dimensional inputs, and they have major deficiencies in both task generalization and environmental adaptability.
  • The purpose of the present invention is to overcome these deficiencies of the prior art.
  • The present invention provides a method and system for learning and regulating variable-stiffness motion skills of a collaborative robot, which realizes adaptive adjustment and control of the end stiffness of the collaborative robot when the contact force changes, and meets the technical requirements that complex tasks such as precision assembly and rehabilitation training place on collaborative robots.
  • an embodiment of the present invention provides a method for learning and regulating motor skills with variable stiffness of a collaborative robot.
  • the method includes:
  • the end force-stiffness matrix data set is vectorized to obtain the end force-stiffness vector data set;
  • conditional probability distribution is used as the mean function of the Gaussian process to perform end force-stiffness adjustment and control processing of the collaborative robot.
  • performing motion demonstration processing on the collaborative robot based on the operation task associated with the external contact force-stiffness, obtaining the end position, velocity, acceleration, and external contact force of the collaborative robot, and forming a data set including:
  • $x_{t,m}$, $\dot{x}_{t,m}$, $\ddot{x}_{t,m}$, and $f_{t,m}$ denote the end position, end velocity, end acceleration, and end force of the collaborative robot at time step $t$ of the $m$-th training sample; $T$ represents the length of the sample; $M$ represents the number of training samples.
  • the construction of the interaction model between the collaborative robot and the environment based on the data set, and obtaining the relationship between the force-stiffness matrix at the end of the collaborative robot include:
  • the control force at the end of the collaborative robot is generated by a virtual spring-damper system, so that the dynamics of the collaborative robot at each moment are equivalent to:
  • constructing the interaction model between the collaborative robot and the environment based on the data set, and obtaining the end force-stiffness matrix relationship of the collaborative robot, further includes:
  • the unknown damping matrix is set to the critical damping value, namely:
  • $Q$ and $\Lambda$ are the eigenvectors and eigenvalues of the unknown symmetric positive definite stiffness matrix; $\xi$ represents the adjustment coefficient
  • $\tau_d$ represents the command with which the Cartesian damping controller executes the ideal force at the end
  • $J'$ represents the transpose of the known Jacobian matrix
  • $F_d$ represents the ideal force trajectory at the end of the collaborative robot
  • the ideal force at the end of the collaborative robot is determined by a unique symmetric positive definite stiffness matrix $K_P$, which lies on the symmetric positive definite manifold;
  • motion skill learning with adjustable stiffness is realized by constructing the skill model $f(K_P \mid F)$
  • performing a stiffness matrix estimation process based on the relationship between the terminal force-stiffness matrix of the collaborative robot to form a terminal force-stiffness matrix data set including:
  • T represents the length of the sample
  • M represents the number of training samples
  • the terminal force-stiffness matrix data set is vectorized based on the matrix triangular decomposition method to obtain the end force-stiffness vector data set, including:
  • the matrix triangular decomposition method is used to vectorize the stiffness matrices in the end force-stiffness matrix data set and convert them into stiffness vectors represented in Euclidean space, forming the end force-stiffness vector data set
  • $T$ represents the length of the sample
  • $M$ represents the number of training samples
  • $A_{t,m}$ represents the upper triangular matrix obtained by decomposing the stiffness matrix at time step $t$ of the $m$-th training sample
  • $A'_{t,m}$ is the transpose of $A_{t,m}$.
  • the motor skill modeling and online learning of the terminal force-stiffness vector based on the terminal force-stiffness vector data set includes:
  • the Gaussian mixture model is used to jointly model the input and output of the terminal force-stiffness vector data set, and the modeling results are as follows:
  • $C$ represents the number of Gaussian distributions in the Gaussian mixture model
  • $\pi_c$ represents the prior probability of the $c$-th Gaussian component
  • $\mu_c$ and $\Sigma_c$ respectively represent the mean and covariance of the $c$-th Gaussian component
  • $F$ represents the end force
  • the parameters of the Gaussian mixture model are obtained by iterative optimization with the expectation-maximization algorithm;
  • for a new input force $F^{*}$, the conditional probability distribution $f(V_P^{*} \mid F^{*})$ of the corresponding stiffness vector $V_P^{*}$ is estimated by Gaussian mixture regression.
  • conditional probability distribution as a mean function of a Gaussian process to perform end force-stiffness adjustment and control processing of the collaborative robot includes:
  • conditional probability distribution as the mean function of the Gaussian process, that is:
  • $V_P$ is the vectorized description of the upper triangular matrix elements of the end stiffness matrix of the collaborative robot
  • $F$ represents the input force at the end of the collaborative robot
  • the Gaussian process covariance function is used to handle externally specified stiffness constraints, and the stiffness trajectory is adjusted automatically;
  • $I$ represents the $N$-dimensional identity matrix
  • $k_{*} = [k(F^{*},F_1), k(F^{*},F_2), \ldots, k(F^{*},F_N)]$
  • $K$ represents the covariance matrix defined by the kernel function;
  • $k_{*}'$ represents the transpose of $k_{*}$.
  • the covariance matrix defined by the kernel function is described as follows:
  • $k(\cdot,\cdot)$ represents the kernel function operation.
  • the embodiment of the present invention also provides a collaborative robot variable stiffness motion skill learning and regulation system, the system includes:
  • Obtaining module used to perform motion demonstration processing on the collaborative robot based on the operation task associated with the external contact force-stiffness, obtain the end position, speed, acceleration, and external contact force of the collaborative robot, and form a data set;
  • Construction module for constructing the interaction model between the collaborative robot and the environment based on the data set, and obtaining the relationship between the force-stiffness matrix at the end of the collaborative robot;
  • Estimation module used to perform stiffness matrix estimation processing based on the relationship between the end force-stiffness matrix of the collaborative robot, and form an end force-stiffness matrix data set;
  • Vectorization module used to vectorize the terminal force-stiffness matrix data set based on the matrix triangular decomposition method to obtain the terminal force-stiffness vector data set;
  • Online learning module used for performing motor skill modeling and online learning of the terminal force-stiffness vector based on the terminal force-stiffness vector data set, and obtaining learning results, the learning results being conditional probability distributions;
  • Modulation control module used to use the conditional probability distribution as the mean function of the Gaussian process to perform end force-stiffness adjustment and control processing of the collaborative robot.
  • a stiffness matrix that satisfies symmetry and positive definiteness is estimated from the high-dimensional motion data (end position, velocity, acceleration, and external contact force) of the collaborative robot; motion skill imitation is performed on the end force-stiffness vector of the collaborative robot, so that the end stiffness vector can be estimated for any unknown external contact force used as input; on this basis, through the inverse transformation of the triangular decomposition and the impedance controller of the collaborative robot, the end stiffness of the collaborative robot is adaptively adjusted and controlled when the contact force changes, meeting the technical requirements that complex tasks such as precision assembly and rehabilitation training place on collaborative robots.
  • Fig. 1 is a schematic flow diagram of a collaborative robot variable stiffness motor skill learning and control method in an embodiment of the present invention
  • Fig. 2 is a schematic diagram of the structure and composition of the collaborative robot variable stiffness motor skill learning and control system in the embodiment of the present invention.
  • the present invention proposes a learning and control method of variable stiffness motion skills for collaborative robots, which enables collaborative robots to learn variable stiffness motion skills based on external forces from the position-external force trajectory demonstrated by humans many times.
  • the end stiffness of the collaborative robot can be adaptively adjusted and controlled to realize the imitation of human operation skills for complex tasks such as precision assembly and rehabilitation training, which will effectively improve the environmental adaptability and operational dexterity of the collaborative robot.
  • Specifically: (1) From the perspective of imitation learning, the present invention proposes a motion skill learning method suitable for collaborative robots in highly dynamic, unstructured, and uncertain environments. The data-driven variable impedance motion model is expected to solve the two major problems of existing fixed-impedance approaches, namely modeling and parameter identification.
  • (2) The end dynamics model of the collaborative robot is simplified by assuming that the end of the collaborative robot is a unit mass body affected by the control force and the external force. The end control force of the collaborative robot can be generated by a virtual spring-damper system, from which the equivalent relationship between the end stiffness matrix of the collaborative robot and the end position, velocity, acceleration, and external force is constructed. The present invention first makes a rough estimate of the stiffness matrix from the human demonstration trajectories (end position, velocity, acceleration, external force) by the least squares method, and then makes a fine estimate by a stiffness matrix approximation method to ensure the symmetry and positive definiteness of the stiffness matrix, effectively solving the low computational efficiency of the traditional convex optimization method for stiffness matrix estimation.
  • (3) Because a symmetric positive definite stiffness matrix lies in a Riemannian space rather than the conventional Euclidean space, it cannot be operated on directly, and the external force is a high-dimensional input that cannot be modeled by the traditional dynamic movement primitive model. The present invention therefore first decomposes the stiffness matrix into triangular form to obtain the stiffness vector, and then proposes a Gaussian process modeling method with a mean prior to construct a skill model of the external force-stiffness vector. Through kernel function operations, the proposed method can simultaneously model and learn the diversity and trajectory uncertainty of multiple demonstration trajectories, so that the acquired skills can quickly adapt to changes in the external environment while maintaining the characteristics of the demonstrated movement, improving environmental adaptability.
  • a six-dimensional force sensor is installed at the end of the collaborative robot, and an interactive skill data acquisition and labeling system is used to drag the collaborative robot in zero-force dragging mode through multiple motion demonstrations of operation tasks in which external force and stiffness are related; for each demonstration, the high-dimensional motion trajectory of the end position, velocity, acceleration, and external contact force of the collaborative robot is recorded to form a data set.
  • a stiffness matrix estimation method combining "rough calculation" and "fine approximation" is constructed to obtain, at each moment, the mapping between the external contact force and the stiffness matrix at the end of the collaborative robot;
  • the stiffness matrix triangular decomposition method is used to transform the distance measure between stiffness matrices in the Riemannian space into the Euclidean space, and the matrices are vectorized to form stiffness vectors;
  • in the Euclidean space, a Gaussian process modeling method with a mean prior is proposed to construct a motion skill imitation learning method relating the contact force and stiffness at the end of the collaborative robot.
  • the end stiffness matrix is then estimated and, through the impedance controller of the collaborative robot, the end stiffness of the collaborative robot is adaptively adjusted and controlled when the contact force changes, which meets the technical requirements that complex tasks such as precision assembly and rehabilitation training place on collaborative robots.
  • FIG. 1 is a schematic flow chart of a method for learning and regulating motor skills with variable stiffness of a collaborative robot in an embodiment of the present invention.
  • a method for learning and regulating motor skills with variable stiffness of a collaborative robot includes:
  • S11 Perform motion demonstration processing on the collaborative robot based on the operation task associated with the external contact force-stiffness, obtain the end position, speed, acceleration, and external contact force of the collaborative robot, and form a data set;
  • performing motion demonstration processing on the collaborative robot based on the operation task associated with the external contact force-stiffness, obtaining the end position, velocity, acceleration, and external contact force of the collaborative robot, and forming a data set includes: installing a six-dimensional force sensor at the end of the collaborative robot, and using the interactive skill data collection and labeling system to drag the collaborative robot in zero-force dragging mode through multiple motion demonstrations of operation tasks associated with the external contact force-stiffness; the high-dimensional motion trajectory of the end position, velocity, acceleration, and external contact force of the collaborative robot is obtained for each demonstration to form a data set;
  • $x_{t,m}$, $\dot{x}_{t,m}$, $\ddot{x}_{t,m}$, and $f_{t,m}$ denote the end position, end velocity, end acceleration, and end force of the collaborative robot at time step $t$ of the $m$-th training sample; $T$ represents the length of the sample; $M$ represents the number of training samples.
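
The data set of step S11 can be pictured as an array indexed by demonstration and time step. The following is only a minimal sketch under stated assumptions: it assumes that position and force are logged at a fixed sampling period `dt` and that velocity and acceleration are obtained by finite differences, which the source does not specify. The function name `build_dataset` and the array layout are hypothetical.

```python
import numpy as np

def build_dataset(positions, forces, dt):
    """Stack position, velocity, acceleration and force per time step.

    positions, forces: arrays of shape (M, T, d) for M demonstrations of
    length T in d Cartesian dimensions. Returns an array of shape (M, T, 4*d).
    Velocity and acceleration are finite-difference estimates (an assumption).
    """
    vel = np.gradient(positions, dt, axis=1)   # first derivative along time
    acc = np.gradient(vel, dt, axis=1)         # second derivative along time
    return np.concatenate([positions, vel, acc, forces], axis=-1)

# Synthetic demonstrations: x(t) = (t^2, t), zero contact force.
M, T, d, dt = 3, 100, 2, 0.01
t = np.arange(T) * dt
positions = np.stack([np.stack([t ** 2, t], axis=-1)] * M)   # shape (M, T, d)
forces = np.zeros((M, T, d))
D = build_dataset(positions, forces, dt)
```

In practice the velocity and acceleration may come directly from the robot controller, in which case the differentiation step is unnecessary.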
  • S12 Construct an interaction model between the collaborative robot and the environment based on the data set, and obtain the relationship between the force-stiffness matrix at the end of the collaborative robot;
  • constructing the interaction model between the collaborative robot and the environment based on the data set, and obtaining the end force-stiffness matrix relationship of the collaborative robot, includes: on the basis of the data set, assuming that the end of the collaborative robot is a unit mass body $I_M$ affected by the control force $f_c$ and the external force $f_e$, so that its dynamic model is simplified as:
  • the control force at the end of the collaborative robot is generated by a virtual spring-damper system, so that the dynamics of the collaborative robot at each moment are equivalent to:
  • constructing the interaction model between the collaborative robot and the environment based on the data set, and obtaining the end force-stiffness matrix relationship of the collaborative robot, further includes: to enable the collaborative robot to return quickly to its equilibrium position without periodic oscillation, the unknown damping matrix is set to the critical damping value, namely:
  • $\tau_d$ represents the command with which the Cartesian damping controller executes the ideal force at the end
  • $J'$ represents the transpose of the known Jacobian matrix
  • $F_d$ represents the ideal force trajectory at the end of the collaborative robot
  • the ideal force at the end of the collaborative robot is determined by a unique symmetric positive definite stiffness matrix $K_P$, which lies on the Symmetric Positive Definite manifold (SPD manifold for short); motion skill learning with adjustable stiffness is realized by constructing the skill model $f(K_P \mid F)$
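
The unit-mass model with a virtual spring-damper and critical damping can be illustrated numerically. This is a sketch under assumptions, not the patent's equations, which did not survive extraction: it assumes the dynamics $\ddot{x} = f_c + f_e$ with $f_c = K_P (x_d - x) - K_V \dot{x}$, and the damping form $K_V = Q\,(2\xi\Lambda^{1/2})\,Q'$ built from the eigendecomposition of $K_P$. The function names, the target position `x_d`, and all numeric values are hypothetical.

```python
import numpy as np

def critical_damping(K_P, xi=1.0):
    """Damping K_V = Q (2*xi*sqrt(Lambda)) Q' for an SPD stiffness K_P (assumed form)."""
    eigvals, Q = np.linalg.eigh(K_P)            # K_P = Q diag(eigvals) Q'
    return Q @ np.diag(2.0 * xi * np.sqrt(eigvals)) @ Q.T

def simulate(K_P, x0, x_d, f_e, dt=1e-3, steps=5000):
    """Semi-implicit Euler integration of a unit mass: x_ddot = f_c + f_e."""
    K_V = critical_damping(K_P)
    x, v = np.array(x0, float), np.zeros(len(x0))
    traj = []
    for _ in range(steps):
        f_c = K_P @ (x_d - x) - K_V @ v         # virtual spring-damper control force
        a = f_c + f_e                           # unit mass: acceleration equals net force
        v = v + dt * a
        x = x + dt * v
        traj.append(x.copy())
    return np.array(traj)

K_P = np.array([[400.0, 50.0], [50.0, 200.0]])  # example SPD stiffness
x_d = np.array([0.1, 0.0])                      # equilibrium (target) position
traj = simulate(K_P, x0=[0.0, 0.0], x_d=x_d, f_e=np.zeros(2))
```

With `xi=1.0` each eigen-direction of the stiffness is critically damped, so the simulated end point settles at `x_d` without oscillating; smaller values of `xi` would give an under-damped response.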
  • performing the stiffness matrix estimation process based on the end force-stiffness matrix relationship of the collaborative robot to form an end force-stiffness matrix data set includes:
  • T represents the length of the sample
  • M represents the number of training samples
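
The "rough calculation, fine approximation" idea behind the stiffness estimation can be sketched as a least-squares fit followed by a projection onto symmetric positive definite matrices. This is a sketch under assumptions: it assumes the damping and external-force terms have already been moved to the left-hand side so that $y_t \approx K_P e_t$ with $e_t = x_d - x_t$, and it uses symmetrization plus eigenvalue clipping as a stand-in for the patent's unspecified approximation method. All names and data are hypothetical.

```python
import numpy as np

def rough_stiffness(E, Y):
    """Least-squares K with Y[t] ~= K @ E[t]; E and Y have shape (n_samples, d)."""
    K_T, *_ = np.linalg.lstsq(E, Y, rcond=None)   # solves E @ K.T ~= Y
    return K_T.T

def nearest_spd(K, eps=1e-6):
    """Symmetrize K and clip its eigenvalues to at least eps."""
    B = 0.5 * (K + K.T)
    eigvals, Q = np.linalg.eigh(B)
    return Q @ np.diag(np.maximum(eigvals, eps)) @ Q.T

# Synthetic window of samples generated from a known stiffness plus noise.
rng = np.random.default_rng(0)
K_true = np.array([[300.0, 40.0], [40.0, 150.0]])
E = rng.normal(size=(50, 2)) * 0.01               # position errors
Y = E @ K_true.T + rng.normal(size=(50, 2)) * 0.05
K_rough = rough_stiffness(E, Y)                   # generally not symmetric
K_fine = nearest_spd(K_rough)                     # symmetric positive definite
```

Both steps are closed-form linear algebra, which is one way to avoid an iterative convex solver at every time window.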
  • vectorizing the end force-stiffness matrix data set based on the matrix triangular decomposition method to obtain the end force-stiffness vector data set includes:
  • the matrix triangular decomposition method is used to vectorize the stiffness matrices in the end force-stiffness matrix data set and convert them into stiffness vectors represented in Euclidean space, forming the end force-stiffness vector data set
  • $T$ represents the length of the sample
  • $M$ represents the number of training samples
  • $A_{t,m}$ represents the upper triangular matrix obtained by decomposing the stiffness matrix at time step $t$ of the $m$-th training sample
  • $A'_{t,m}$ is the transpose of $A_{t,m}$.
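
The triangular decomposition and its inverse can be sketched with a Cholesky factorization. This assumes the decomposition is $K = A'A$ with $A$ upper triangular, which matches the definitions of $A_{t,m}$ and $A'_{t,m}$ above; the function names are hypothetical.

```python
import numpy as np

def spd_to_vector(K):
    """Factor K = A' A with A upper triangular and stack A's upper-triangular entries."""
    A = np.linalg.cholesky(K).T                # numpy returns lower L with K = L L'; A = L'
    return A[np.triu_indices(K.shape[0])]

def vector_to_spd(v, d):
    """Inverse map: rebuild A from the vector and return A' A."""
    A = np.zeros((d, d))
    A[np.triu_indices(d)] = v
    return A.T @ A

K = np.array([[4.0, 2.0], [2.0, 3.0]])
v = spd_to_vector(K)          # d*(d+1)/2 = 3 entries for d = 2
K_back = vector_to_spd(v, 2)  # recovers K
```

A useful property of this parameterization is that any vector maps back to a symmetric positive semidefinite matrix, so regression can be carried out on the vector in Euclidean space without leaving the set of valid stiffness matrices.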
  • S15 Perform motor skill modeling and online learning of the terminal force-stiffness vector based on the terminal force-stiffness vector data set, and obtain a learning result, and the learning result is a conditional probability distribution;
  • performing motion skill modeling and online learning of the end force-stiffness vector based on the end force-stiffness vector data set includes: using a Gaussian mixture model to jointly model the input and output of the end force-stiffness vector data set, with the modeling result as follows:
  • $C$ represents the number of Gaussian distributions in the Gaussian Mixture Model (GMM);
  • $\pi_c$ represents the prior probability of the $c$-th Gaussian component, and $\mu_c$ and $\Sigma_c$ respectively represent the mean and covariance of the $c$-th Gaussian component;
  • $F$ represents the end force;
  • for a new input force $F^{*}$, the conditional probability distribution of the corresponding stiffness vector is estimated by Gaussian Mixture Regression (GMR).
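
Gaussian mixture regression conditions each joint Gaussian component on the input and blends the results. The sketch below takes the GMM parameters as given (in the method they would come from expectation maximization) and uses a one-dimensional force and a one-dimensional stiffness entry for brevity. The function names and the numeric parameters are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal at x."""
    d = len(mean)
    diff = x - mean
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm

def gmr(F_star, priors, means, covs, n_in):
    """Conditional mean and covariance of the output block given input F_star."""
    h, cond_means, cond_covs = [], [], []
    for pi_c, mu, S in zip(priors, means, covs):
        mu_F, mu_V = mu[:n_in], mu[n_in:]
        S_FF, S_FV = S[:n_in, :n_in], S[:n_in, n_in:]
        S_VF, S_VV = S[n_in:, :n_in], S[n_in:, n_in:]
        h.append(pi_c * gaussian_pdf(F_star, mu_F, S_FF))   # component responsibility
        gain = S_VF @ np.linalg.inv(S_FF)
        cond_means.append(mu_V + gain @ (F_star - mu_F))
        cond_covs.append(S_VV - gain @ S_FV)
    h = np.array(h) / np.sum(h)
    mean = sum(w * m for w, m in zip(h, cond_means))
    second = sum(w * (S + np.outer(m, m)) for w, m, S in zip(h, cond_means, cond_covs))
    return mean, second - np.outer(mean, mean)

priors = [0.5, 0.5]
means = [np.array([0.0, 1.0]), np.array([10.0, 5.0])]        # [force, stiffness entry]
covs = [np.array([[1.0, 0.5], [0.5, 1.0]])] * 2
mean, cov = gmr(np.array([0.0]), priors, means, covs, n_in=1)
```

At a query force of 0 the first component dominates, so the returned mean and covariance are essentially those of that component conditioned on the input.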
  • the present invention introduces a multi-output Gaussian process to model unknown environments or new task constraints, and realize online learning of skills.
  • performing the end force-stiffness adjustment and control processing of the collaborative robot by using the conditional probability distribution as the mean function of the Gaussian process includes: for online control and adjustment of the end force-stiffness motion skill,
  • the conditional probability distribution obtained by GMM-GMR is used as the mean function of the Gaussian process (GP), that is:
  • the GP covariance function is used to handle externally specified stiffness constraints, and the stiffness trajectory is automatically adjusted to adapt to changes in the environment.
  • $I$ represents the $N$-dimensional identity matrix
  • $k_{*} = [k(F^{*}, F_1), k(F^{*}, F_2), \ldots, k(F^{*}, F_N)]$
  • $K$ represents the covariance matrix defined by the kernel function
  • $k_{*}'$ represents the transpose of $k_{*}$.
  • the covariance matrix defined by the kernel function is described as follows:
  • $k(\cdot,\cdot)$ represents the kernel function operation.
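
A Gaussian process with a non-zero prior mean predicts the prior mean plus a kernel-weighted correction from the observed residuals. The sketch below uses the standard posterior formulas $m(F^{*}) + k_{*}(K + \sigma^2 I)^{-1}(V - m(F))$ and $k(F^{*},F^{*}) - k_{*}(K + \sigma^2 I)^{-1}k_{*}'$; the squared-exponential kernel, the noise level, and the linear `prior_mean` standing in for the GMR mean are all assumptions, since the source equations were lost in extraction.

```python
import numpy as np

def rbf(A, B, length=1.0, sigma_f=1.0):
    """Squared-exponential kernel between rows of A and rows of B (assumed kernel)."""
    d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return sigma_f ** 2 * np.exp(-0.5 * d2 / length ** 2)

def gp_predict(F_star, F, V, mean_fn, noise=1e-3):
    """GP posterior mean and variance at F_star with prior mean function mean_fn."""
    K = rbf(F, F)                                  # N x N covariance matrix
    k_star = rbf(F_star, F)                        # rows are k_* for each query
    A = K + noise * np.eye(len(F))                 # K + sigma^2 I
    resid = V - mean_fn(F)                         # deviation of data from the prior mean
    mean = mean_fn(F_star) + k_star @ np.linalg.solve(A, resid)
    alpha = np.linalg.solve(A, k_star.T)
    var = rbf(F_star, F_star).diagonal() - np.sum(k_star * alpha.T, axis=1)
    return mean, var

def prior_mean(F):
    """Hypothetical stand-in for the GMR conditional mean."""
    return 2.0 * F

F = np.array([[0.0], [1.0], [2.0]])                       # observed forces
V = prior_mean(F) + np.array([[0.0], [0.5], [0.0]])       # one imposed stiffness constraint
mean_near, var_near = gp_predict(np.array([[1.0]]), F, V, prior_mean)
mean_far, var_far = gp_predict(np.array([[50.0]]), F, V, prior_mean)
```

Near the constrained sample the prediction follows the imposed value, while far from any data it falls back to the prior mean, which is the behaviour that lets a learned skill be adjusted locally without being forgotten elsewhere.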
  • the Gaussian process is combined to automatically adjust the end force-stiffness so as to adapt efficiently to changes in the environment.
  • through the inverse transformation of the triangular decomposition, $f(V_P^{*} \mid F^{*})$ is restored from the Euclidean space to the stiffness matrix $f(K_P \mid F^{*})$;
  • the stiffness matrix is then converted to the end force, and an impedance controller is designed to realize adaptive control and adjustment of the end force-stiffness of the collaborative robot.
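
The final step, from a predicted stiffness vector to a joint command, can be sketched as follows. It assumes the stiffness is rebuilt as $A'A$ from the upper-triangular vector, that the ideal end force is $F_d = K_P (x_d - x) - K_V \dot{x}$ with the critical damping form $K_V = Q\,(2\xi\Lambda^{1/2})\,Q'$, and that the command is $\tau_d = J' F_d$. These forms and all names are assumptions, since the source equations are missing.

```python
import numpy as np

def stiffness_from_vector(v, d):
    """Rebuild the SPD stiffness K_P = A' A from its upper-triangular vector."""
    A = np.zeros((d, d))
    A[np.triu_indices(d)] = v
    return A.T @ A

def impedance_torque(v, x, x_dot, x_d, J, xi=1.0):
    """Joint torque tau_d = J' F_d for a predicted stiffness vector v."""
    K_P = stiffness_from_vector(v, len(x))
    eigvals, Q = np.linalg.eigh(K_P)
    K_V = Q @ np.diag(2.0 * xi * np.sqrt(eigvals)) @ Q.T   # assumed critical damping
    F_d = K_P @ (x_d - x) - K_V @ x_dot                    # ideal end force
    return J.T @ F_d

v = np.array([2.0, 1.0, np.sqrt(2.0)])           # encodes K_P = [[4, 2], [2, 3]]
J = np.array([[1.0, 0.0, 0.0],                   # hypothetical 2 x 3 Jacobian
              [0.0, 1.0, 0.0]])
tau = impedance_torque(v, x=np.zeros(2), x_dot=np.zeros(2),
                       x_d=np.array([1.0, 0.0]), J=J)
```

A real controller would also include gravity and friction compensation, which this sketch omits.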
  • a stiffness matrix that satisfies symmetry and positive definiteness is estimated from the high-dimensional motion data (end position, velocity, acceleration, and external contact force) of the collaborative robot; motion skill imitation is performed on the end force-stiffness vector of the collaborative robot, so that the end stiffness vector can be estimated for any unknown external contact force used as input; on this basis, through the inverse transformation of the triangular decomposition and the impedance controller of the collaborative robot, the end stiffness of the collaborative robot is adaptively adjusted and controlled when the contact force changes, meeting the technical requirements that complex tasks such as precision assembly and rehabilitation training place on collaborative robots.
  • FIG. 2 is a schematic diagram of the structure and composition of the collaborative robot variable stiffness motor skill learning and control system in the embodiment of the present invention.
  • a collaborative robot variable stiffness motor skills learning and regulation system the system includes:
  • Obtaining module 21 used to perform motion demonstration processing on the collaborative robot based on the operation task associated with the external contact force-stiffness, obtain the end position, speed, acceleration, and external contact force of the collaborative robot, and form a data set;
  • performing motion demonstration processing on the collaborative robot based on the operation task associated with the external contact force-stiffness, obtaining the end position, velocity, acceleration, and external contact force of the collaborative robot, and forming a data set includes: installing a six-dimensional force sensor at the end of the collaborative robot, and using the interactive skill data collection and labeling system to drag the collaborative robot in zero-force dragging mode through multiple motion demonstrations of operation tasks associated with the external contact force-stiffness; the high-dimensional motion trajectory of the end position, velocity, acceleration, and external contact force of the collaborative robot is obtained for each demonstration to form a data set;
  • $x_{t,m}$, $\dot{x}_{t,m}$, $\ddot{x}_{t,m}$, and $f_{t,m}$ denote the end position, end velocity, end acceleration, and end force of the collaborative robot at time step $t$ of the $m$-th training sample; $T$ represents the length of the sample; $M$ represents the number of training samples.
  • Construction module 22 for constructing the interaction model between the collaborative robot and the environment based on the data set, and obtaining the relationship between the force-stiffness matrix at the end of the collaborative robot;
  • constructing the interaction model between the collaborative robot and the environment based on the data set, and obtaining the end force-stiffness matrix relationship of the collaborative robot, includes: on the basis of the data set, assuming that the end of the collaborative robot is a unit mass body $I_M$ affected by the control force $f_c$ and the external force $f_e$, so that its dynamic model is simplified as:
  • the control force at the end of the collaborative robot is generated by a virtual spring-damper system, so that the dynamics of the collaborative robot at each moment are equivalent to:
  • constructing the interaction model between the collaborative robot and the environment based on the data set, and obtaining the end force-stiffness matrix relationship of the collaborative robot, further includes: to enable the collaborative robot to return quickly to its equilibrium position without periodic oscillation, the unknown damping matrix is set to the critical damping value, namely:
  • $\tau_d$ represents the command with which the Cartesian damping controller executes the ideal force at the end
  • $J'$ represents the transpose of the known Jacobian matrix
  • $F_d$ represents the ideal force trajectory at the end of the collaborative robot
  • the ideal force at the end of the collaborative robot is determined by a unique symmetric positive definite stiffness matrix $K_P$, which lies on the symmetric positive definite manifold; motion skill learning with adjustable stiffness is realized by constructing the skill model $f(K_P \mid F)$
  • Estimation module 23 for carrying out the stiffness matrix estimation process based on the end force-stiffness matrix correlation of the collaborative robot, forming the end force-stiffness matrix data set;
  • performing the stiffness matrix estimation process based on the end force-stiffness matrix relationship of the collaborative robot to form the end force-stiffness matrix data set includes:
  • T represents the length of the sample
  • M represents the number of training samples
  • Vectorization module 24 used for vectorizing the terminal force-stiffness matrix data set based on the matrix triangular decomposition method to obtain the terminal force-stiffness vector data set;
  • vectorizing the end force-stiffness matrix data set based on the matrix triangular decomposition method to obtain the end force-stiffness vector data set includes:
  • the matrix triangular decomposition method is used to vectorize the stiffness matrices in the end force-stiffness matrix data set and convert them into stiffness vectors represented in Euclidean space, forming the end force-stiffness vector data set
  • $T$ represents the length of the sample
  • $M$ represents the number of training samples
  • $A_{t,m}$ represents the upper triangular matrix obtained by decomposing the stiffness matrix at time step $t$ of the $m$-th training sample
  • $A'_{t,m}$ is the transpose of $A_{t,m}$.
  • Online learning module 25 used for performing motor skill modeling and online learning of the terminal force-stiffness vector based on the terminal force-stiffness vector data set, and obtaining learning results, the learning results being conditional probability distributions;
  • performing motion skill modeling and online learning of the end force-stiffness vector based on the end force-stiffness vector data set includes: using a Gaussian mixture model to jointly model the input and output of the end force-stiffness vector data set, with the modeling result as follows:
  • $C$ represents the number of Gaussian distributions in the Gaussian Mixture Model (GMM);
  • $\pi_c$ represents the prior probability of the $c$-th Gaussian component, and $\mu_c$ and $\Sigma_c$ respectively represent the mean and covariance of the $c$-th Gaussian component;
  • $F$ represents the end force;
  • for a new input force $F^{*}$, the conditional probability distribution of the corresponding stiffness vector is estimated by Gaussian Mixture Regression (GMR).
  • the present invention introduces a multi-output Gaussian process to model unknown environments or new task constraints, and realize online learning of skills.
  • Modulation control module 26 used to use the conditional probability distribution as a mean function of a Gaussian process to perform end force-stiffness adjustment and control processing of the collaborative robot.
  • performing the end force-stiffness adjustment and control processing of the collaborative robot by using the conditional probability distribution as the mean function of the Gaussian process includes: for online control and adjustment of the end force-stiffness motion skill,
  • the conditional probability distribution obtained by GMM-GMR is used as the mean function of the Gaussian process (GP), that is:
  • the GP covariance function is used to handle externally specified stiffness constraints, and the stiffness trajectory is automatically adjusted to adapt to changes in the environment; here, $V_P$ is the vectorized description of the upper triangular matrix elements of the end stiffness matrix of the collaborative robot, and $F$ represents the input force at the end of the collaborative robot;
  • assuming that the end force-stiffness vector data set contains $N$ samples:
  • $I$ represents the $N$-dimensional identity matrix
  • $k_{*} = [k(F^{*},F_1), k(F^{*},F_2), \ldots, k(F^{*},F_N)]$
  • $K$ represents the covariance matrix defined by the kernel function
  • $k_{*}'$ represents the transpose of $k_{*}$.
  • the covariance matrix defined by the kernel function is described as follows:
  • $k(\cdot,\cdot)$ represents the kernel function operation.
  • the Gaussian process is combined to automatically adjust the end force-stiffness so as to adapt efficiently to changes in the environment.
  • through the inverse transformation of the triangular decomposition, $f(V_P^{*} \mid F^{*})$ is restored from the Euclidean space to the stiffness matrix $f(K_P \mid F^{*})$;
  • the stiffness matrix is then converted to the end force, and an impedance controller is designed to realize adaptive control and adjustment of the end force-stiffness of the collaborative robot.
  • a stiffness matrix that satisfies symmetry and positive definiteness is estimated from the high-dimensional motion data of the end motion position, velocity, acceleration, and external contact force of the collaborative robot; the motion skill simulation is performed on the end force-stiffness vector of the collaborative robot , to realize the estimation of the end stiffness vector when any unknown external contact force is used as input, and on this basis, through the triangular decomposition inverse transformation and the collaborative robot impedance controller, the adaptive adjustment and adjustment of the end stiffness of the collaborative robot when the contact force changes Control to meet the technical requirements of collaborative robots for complex tasks such as precision assembly and rehabilitation training.

Landscapes

  • Engineering & Computer Science (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Manipulator (AREA)

Abstract

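A method and system for variable-stiffness motor skill learning and regulation of a collaborative robot. The method comprises: performing motion demonstrations with the collaborative robot on operation tasks in which external contact force and stiffness are correlated, to form a data set (S11); constructing a robot-environment interaction model from the data set to obtain the relationship between the end-effector force and the stiffness matrix (S12); estimating the stiffness matrix to form an end-effector force-stiffness matrix data set (S13); vectorizing the force-stiffness matrix data set by triangular matrix decomposition to obtain an end-effector force-stiffness vector data set (S14); modeling and learning online the motor skill of the force-stiffness vector to obtain a learning result (S15); and using the resulting conditional probability distribution as the mean function of a Gaussian process to adjust and control the end-effector force-stiffness of the collaborative robot (S16). The system comprises an acquisition module (21), a construction module (22), an estimation module (23), a vectorization module (24), an online learning module (25), and a modulation control module (26). The method and system achieve adaptive modulation and control of the end-effector stiffness of the collaborative robot when the contact force changes.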

Description

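A Method and System for Variable-Stiffness Motor Skill Learning and Regulation of a Collaborative Robot
Technical Field
The present invention relates to the technical field of automation control, and in particular to a method and system for variable-stiffness motor skill learning and regulation of a collaborative robot.
Background
As manufacturing becomes more intelligent and flexible, robot programming technology faces considerable challenges. Traditional industrial robot programming, which relies on human intervention and repeated debugging, suits only specific tasks. When a similar task or a different environment is encountered, the robot must be reprogrammed and cannot draw on past operating experience, which leads to low efficiency, poor adaptability, and insufficient dexterity. Meanwhile, a new generation of artificial intelligence technology has made major progress, and collaborative robots are being applied ever more widely and deeply. How to use artificial intelligence to give a collaborative robot system a degree of autonomous decision-making and learning ability, so that it can learn motor skills suited to different tasks and environments and avoid tedious programming for every task, is a frontier research topic and a difficult problem in the collaborative robot field.
Imitation learning of motor skills lets a collaborative robot acquire skills through interaction with humans: human motor skills are transferred to the robot, human operating experience is fully exploited, and the robot gains highly human-like operating capability. Industry and academia widely regard it as one of the most effective solutions for fast programming, efficient adaptation, and dexterous operation of collaborative robots. However, existing imitation learning methods focus on motion features in Euclidean space (such as position and velocity) and neglect motion features with strict geometric constraints in the special orthogonal group and in Riemannian spaces (such as orientation, stiffness, and manipulability). They also rely on time-driven motion trajectories and cannot handle skill learning with high-dimensional inputs, so they fall short in both task generalization and environmental adaptability.
Summary of the Invention
The purpose of the present invention is to overcome the deficiencies of the prior art. The invention provides a method and system for variable-stiffness motor skill learning and regulation of a collaborative robot that adaptively adjusts and controls the end-effector stiffness when the contact force changes, meeting the technical requirements that complex tasks such as precision assembly and rehabilitation training place on collaborative robots.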
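To solve the above technical problem, an embodiment of the present invention provides a method for variable-stiffness motor skill learning and regulation of a collaborative robot, the method comprising:
performing motion demonstrations with the collaborative robot on operation tasks in which external contact force and stiffness are correlated, and obtaining the end-effector position, velocity, acceleration, and external contact force of the collaborative robot to form a data set;
constructing a robot-environment interaction model from the data set to obtain the relationship between the end-effector force and the stiffness matrix of the collaborative robot;
estimating the stiffness matrix from that relationship to form an end-effector force-stiffness matrix data set;
vectorizing the end-effector force-stiffness matrix data set by triangular matrix decomposition to obtain an end-effector force-stiffness vector data set;
modeling and learning online the motor skill of the end-effector force-stiffness vector from the force-stiffness vector data set to obtain a learning result, the learning result being a conditional probability distribution;
using the conditional probability distribution as the mean function of a Gaussian process to adjust and control the end-effector force-stiffness of the collaborative robot.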
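Optionally, performing the motion demonstrations and obtaining the end-effector position, velocity, acceleration, and external contact force to form a data set comprises:
mounting a six-axis force sensor on the end-effector of the collaborative robot and, through an interactive skill data acquisition and annotation system, guiding the collaborative robot by hand in zero-force drag mode through multiple demonstrations of the operation task in which external contact force and stiffness are correlated, recording for each demonstration the high-dimensional trajectory of end-effector position, velocity, acceleration, and external contact force to form a data set;
the data set has the form
Figure PCTCN2022123754-appb-000001
where $x_{t,m}$ denotes the end-effector position of the collaborative robot at time step t of the m-th demonstration;
Figure PCTCN2022123754-appb-000002
denotes the end-effector velocity at time step t of the m-th demonstration;
Figure PCTCN2022123754-appb-000003
denotes the end-effector acceleration at time step t of the m-th demonstration;
Figure PCTCN2022123754-appb-000004
denotes the end-effector force at time step t of the m-th demonstration; T denotes the length of each sample (number of time steps); M denotes the number of training samples (demonstrations).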
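Optionally, constructing the robot-environment interaction model from the data set and obtaining the relationship between the end-effector force and the stiffness matrix comprises:
on the basis of the data set, assuming that the end-effector of the collaborative robot is a unit mass $I_M$ acted on by a control force $f^c$ and an external force $f^e$, so that its dynamic model simplifies to:
Figure PCTCN2022123754-appb-000005
the control force at the end-effector is generated by a virtual spring-damper system, so that at each time step the dynamics of the collaborative robot are equivalent to:
Figure PCTCN2022123754-appb-000006
on this basis, assuming that the ideal end-effector force trajectory obtained through skill learning is $F_d$, the virtual spring-damper system expresses $F_d$ as:
Figure PCTCN2022123754-appb-000007
where
Figure PCTCN2022123754-appb-000008
denotes the end-effector acceleration of the collaborative robot;
Figure PCTCN2022123754-appb-000009
denotes the end-effector acceleration at time step t;
Figure PCTCN2022123754-appb-000010
denotes the end-effector stiffness matrix at time step t;
Figure PCTCN2022123754-appb-000011
denotes the desired end-effector position; $x_t$ denotes the end-effector position;
Figure PCTCN2022123754-appb-000012
denotes the end-effector damping matrix at time step t;
Figure PCTCN2022123754-appb-000013
denotes the end-effector velocity;
Figure PCTCN2022123754-appb-000014
denotes the external force on the end-effector at time step t;
Figure PCTCN2022123754-appb-000015
denotes the unknown stiffness matrix; $\Delta x$ denotes the known displacement;
Figure PCTCN2022123754-appb-000016
denotes the unknown damping matrix; J denotes the known Jacobian matrix;
Figure PCTCN2022123754-appb-000017
denotes the known joint velocity.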
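Optionally, after constructing the robot-environment interaction model and obtaining the relationship between the end-effector force and the stiffness matrix, the method further comprises:
so that the collaborative robot returns quickly to its equilibrium position without oscillating, setting the unknown damping matrix
Figure PCTCN2022123754-appb-000018
to the critical damping value, namely:
Figure PCTCN2022123754-appb-000019
where Q and Λ denote, respectively, the eigenvectors and eigenvalues of the symmetric positive definite unknown stiffness matrix
Figure PCTCN2022123754-appb-000020
and η denotes a tuning coefficient,
Figure PCTCN2022123754-appb-000021
in the torque control mode of the collaborative robot, a Cartesian damping controller realizes the ideal end-effector force as follows:
Figure PCTCN2022123754-appb-000022
where $\tau_d$ denotes the joint torque with which the Cartesian damping controller realizes the ideal end-effector force; J′ denotes the transpose of the known Jacobian matrix; $F_d$ denotes the ideal end-effector force trajectory;
Figure PCTCN2022123754-appb-000023
denotes the robot kinematic model, in which θ denotes the known joint angle,
Figure PCTCN2022123754-appb-000024
denotes the known joint velocity,
Figure PCTCN2022123754-appb-000025
denotes the known joint acceleration;
from the ideal end-effector force trajectory $F_d$ and the controller torque $\tau_d$ it follows that the ideal end-effector force of the collaborative robot is determined by a unique symmetric positive definite stiffness matrix
Figure PCTCN2022123754-appb-000026
where
Figure PCTCN2022123754-appb-000027
denotes the symmetric positive definite (SPD) manifold;
variable-stiffness motor skill learning is realized by constructing the skill model $f(K^P \mid F)$ of the end-effector force-stiffness matrix.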
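Optionally, estimating the stiffness matrix from the end-effector force-stiffness matrix relationship to form the end-effector force-stiffness matrix data set comprises:
estimating the stiffness matrix within the end-effector force-stiffness matrix relationship to form the end-effector force-stiffness matrix data set
Figure PCTCN2022123754-appb-000028
where
Figure PCTCN2022123754-appb-000029
denotes the end-effector force at time step t of the m-th demonstration; T denotes the length of each sample; M denotes the number of training samples;
Figure PCTCN2022123754-appb-000030
denotes the stiffness matrix at time step t of the m-th demonstration.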
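Optionally, vectorizing the end-effector force-stiffness matrix data set by triangular matrix decomposition to obtain the end-effector force-stiffness vector data set comprises:
in accordance with the geometric constraints of symmetry and positive definiteness of the stiffness matrix, vectorizing each stiffness matrix in the end-effector force-stiffness matrix data set by triangular matrix decomposition
Figure PCTCN2022123754-appb-000031
and converting it into a stiffness vector represented in Euclidean space, to form the end-effector force-stiffness vector data set
Figure PCTCN2022123754-appb-000032
where
Figure PCTCN2022123754-appb-000033
denotes the end-effector force at time step t of the m-th demonstration; T denotes the length of each sample; M denotes the number of training samples;
Figure PCTCN2022123754-appb-000034
denotes the stiffness matrix at time step t of the m-th demonstration;
Figure PCTCN2022123754-appb-000035
denotes the vectorized description of the upper triangular matrix elements at time step t of the m-th demonstration, of length m(m+1)/2, where m in this expression denotes the dimension of the stiffness matrix; $A_{t,m}$ denotes the upper triangular matrix obtained from the decomposition at time step t of the m-th demonstration; $A'_{t,m}$ is the transpose of $A_{t,m}$.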
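Optionally, modeling and learning online the motor skill of the end-effector force-stiffness vector from the force-stiffness vector data set comprises:
jointly modeling the inputs and outputs of the end-effector force-stiffness vector data set with a Gaussian mixture model, giving the following model:
Figure PCTCN2022123754-appb-000036
where C denotes the number of Gaussian components in the Gaussian mixture model; $\pi_c$ denotes the prior probability of the c-th Gaussian component, with
Figure PCTCN2022123754-appb-000037
$\mu_c$ and $\Sigma_c$ denote, respectively, the mean and covariance of the c-th Gaussian component; F denotes the end-effector force;
Figure PCTCN2022123754-appb-000038
denotes the Gaussian distribution; $V^P$ denotes the vectorized description of the upper triangular matrix elements of the collaborative robot;
given the training samples and the number of Gaussian components C, the parameters of the Gaussian mixture model are obtained by iterative optimization with the expectation-maximization algorithm;
once the parameters of the Gaussian mixture model have been obtained, for any unseen end-effector force input $F_*$, Gaussian mixture regression is used to estimate the conditional probability distribution $f(V^P \mid F_*)$ of the corresponding stiffness vector $V^P_*$.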
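Optionally, using the conditional probability distribution as the mean function of a Gaussian process to adjust and control the end-effector force-stiffness of the collaborative robot comprises:
taking the conditional probability distribution as the mean function of the Gaussian process, that is:
$\mu(F) = p(V^P \mid F)$;
where $V^P$ denotes the vectorized description of the upper triangular matrix elements of the collaborative robot; F denotes the input force at the end-effector;
the covariance function of the Gaussian process is used to handle externally imposed stiffness constraints and to adjust the stiffness trajectory automatically;
assume the end-effector force-stiffness vector data set contains N samples
Figure PCTCN2022123754-appb-000039
with the underlying functional relationship $V^P = f(F) + \varepsilon$, where $\varepsilon \sim \mathcal{N}(0, \sigma^2)$ is noise with unknown variance $\sigma^2$;
given a new test input $F_*$, the corresponding function value $f(F_*)$ and the existing output samples
Figure PCTCN2022123754-appb-000040
have the following joint distribution:
Figure PCTCN2022123754-appb-000041
where I denotes the N-dimensional identity matrix; $k_* = [k(F_*, F_1), k(F_*, F_2), \ldots, k(F_*, F_N)]$; K denotes the covariance matrix defined by the kernel function;
the joint distribution yields the multivariate Gaussian conditional probability distribution $P(f(F_*) \mid V^P)$, whose mean $\mu(F_*) = p(V^P_* \mid F_*)$ is obtained by GMM-GMR (Gaussian mixture model with Gaussian mixture regression), and whose variance is:
$D(f(F_*)) = k(F_*, F_*) - k_* (K + \sigma^2 I)^{-1} k_*'$;
where $k_*'$ denotes the transpose of $k_*$.
Optionally, the covariance matrix defined by the kernel function is described as follows:
Figure PCTCN2022123754-appb-000042
where $k(\cdot,\cdot)$ denotes the kernel function.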
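In addition, an embodiment of the present invention further provides a system for variable-stiffness motor skill learning and regulation of a collaborative robot, the system comprising:
an acquisition module, configured to perform motion demonstrations with the collaborative robot on operation tasks in which external contact force and stiffness are correlated, and to obtain the end-effector position, velocity, acceleration, and external contact force of the collaborative robot to form a data set;
a construction module, configured to construct a robot-environment interaction model from the data set and obtain the relationship between the end-effector force and the stiffness matrix of the collaborative robot;
an estimation module, configured to estimate the stiffness matrix from that relationship and form an end-effector force-stiffness matrix data set;
a vectorization module, configured to vectorize the end-effector force-stiffness matrix data set by triangular matrix decomposition and obtain an end-effector force-stiffness vector data set;
an online learning module, configured to model and learn online the motor skill of the end-effector force-stiffness vector from the force-stiffness vector data set and obtain a learning result, the learning result being a conditional probability distribution;
a modulation control module, configured to use the conditional probability distribution as the mean function of a Gaussian process to adjust and control the end-effector force-stiffness of the collaborative robot.
In the embodiments of the present invention, a stiffness matrix that satisfies symmetry and positive definiteness is estimated from the high-dimensional motion data of end-effector position, velocity, acceleration, and external contact force. Motor skill imitation is performed on the end-effector force-stiffness vector, so that the end-effector stiffness vector can be estimated for any unseen external contact force given as input. On this basis, the inverse of the triangular decomposition together with the impedance controller of the collaborative robot achieves adaptive adjustment and control of the end-effector stiffness when the contact force changes, meeting the technical requirements that complex tasks such as precision assembly and rehabilitation training place on collaborative robots.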
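Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for their description are briefly introduced below. The drawings described below are merely some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of the method for variable-stiffness motor skill learning and regulation of a collaborative robot in an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of the system for variable-stiffness motor skill learning and regulation of a collaborative robot in an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
Embodiment One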
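The present invention proposes a method for variable-stiffness motor skill learning and regulation of a collaborative robot. It lets the robot learn an external-force-based variable-stiffness motor skill from position and external-force trajectories demonstrated several times by a human. When the external contact force changes, the end-effector stiffness is adaptively adjusted and controlled, so that the robot imitates human operating skills in complex tasks such as precision assembly and rehabilitation training, which effectively improves its environmental adaptability and operating dexterity. Specifically:
(1) From the perspective of imitation learning, the invention proposes a motor skill learning method that links external force to end-effector stiffness and is suited to highly dynamic, unstructured, and uncertain environments. A data-driven variable-impedance motion model is built by learning, which is expected to resolve the two difficulties of existing fixed-impedance approaches: modeling and parameter identification.
(2) The end-effector dynamic model is simplified by assuming that the end-effector is a unit mass acted on by a control force and an external force. The control force can be generated by a virtual spring-damper system, which establishes an equivalence between the end-effector stiffness matrix and the end-effector position, velocity, acceleration, and external force. On this basis, the invention makes a coarse estimate of the stiffness matrix by least squares from the human-demonstrated trajectories (end-effector position, velocity, acceleration, and external force), then refines it with a stiffness matrix approximation method that guarantees symmetry and positive definiteness. This addresses the low computational efficiency of traditional stiffness estimation by convex optimization.
(3) A symmetric positive definite stiffness matrix lies in a Riemannian space rather than the usual Euclidean space, so it cannot be operated on directly. Moreover, the external force used as input is a multi-variable input, so the traditional dynamic movement primitive model cannot be used. To address this, the invention first applies a triangular decomposition to the stiffness matrix to obtain a stiffness vector, then proposes a Gaussian process modeling method based on a mean prior to build the external force-stiffness vector skill model. With the prior and efficient kernel computations, the method can model and learn both the diversity and the trajectory uncertainty of multiple demonstrations, so that the learned skill adapts quickly to changes in the external environment while retaining the characteristics of the demonstrated motion.
First, a six-axis force sensor is mounted on the end-effector of the collaborative robot. Through an interactive skill data acquisition and annotation system, the robot is guided by hand in zero-force drag mode through multiple demonstrations of an operation task in which external force and stiffness are correlated, and the high-dimensional trajectory of end-effector position, velocity, acceleration, and external contact force is recorded for each demonstration to form a data set. Then, under the geometric conditions of symmetry and positive definiteness, a two-stage stiffness estimation method combining "coarse computation" and "fine approximation" is built to obtain the mapping between external contact force and the end-effector stiffness matrix at each time step. Next, the triangular decomposition of the stiffness matrix converts the distance metric between stiffness matrices in Riemannian space into Euclidean space, and the matrices are vectorized to form stiffness vectors. Finally, in Euclidean space, a Gaussian process modeling method based on a mean prior is proposed to build the imitation learning method for the motor skill that links external contact force and stiffness. The end-effector stiffness matrix is estimated for any unseen external contact force given as input, and the impedance controller of the collaborative robot adaptively adjusts and controls the end-effector stiffness when the contact force changes, meeting the technical requirements that complex tasks such as precision assembly and rehabilitation training place on collaborative robots.
Specifically, refer to FIG. 1, which is a schematic flowchart of the method for variable-stiffness motor skill learning and regulation of a collaborative robot in an embodiment of the present invention.
As shown in FIG. 1, a method for variable-stiffness motor skill learning and regulation of a collaborative robot comprises: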
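S11: Perform motion demonstrations with the collaborative robot on operation tasks in which external contact force and stiffness are correlated, and obtain the end-effector position, velocity, acceleration, and external contact force of the collaborative robot to form a data set.
In a specific implementation of the present invention, this step comprises: mounting a six-axis force sensor on the end-effector of the collaborative robot and, through an interactive skill data acquisition and annotation system, guiding the collaborative robot by hand in zero-force drag mode through multiple demonstrations of the operation task in which external contact force and stiffness are correlated, recording for each demonstration the high-dimensional trajectory of end-effector position, velocity, acceleration, and external contact force to form a data set.
The data set has the form
Figure PCTCN2022123754-appb-000043
where $x_{t,m}$ denotes the end-effector position of the collaborative robot at time step t of the m-th demonstration;
Figure PCTCN2022123754-appb-000044
denotes the end-effector velocity at time step t of the m-th demonstration;
Figure PCTCN2022123754-appb-000045
denotes the end-effector acceleration at time step t of the m-th demonstration;
Figure PCTCN2022123754-appb-000046
denotes the end-effector force at time step t of the m-th demonstration; T denotes the length of each sample (number of time steps); M denotes the number of training samples (demonstrations).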
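S12: Construct a robot-environment interaction model from the data set to obtain the relationship between the end-effector force and the stiffness matrix of the collaborative robot.
In a specific implementation of the present invention, this step comprises: on the basis of the data set, assuming that the end-effector of the collaborative robot is a unit mass $I_M$ acted on by a control force $f^c$ and an external force $f^e$, so that its dynamic model simplifies to:
Figure PCTCN2022123754-appb-000047
The control force at the end-effector is generated by a virtual spring-damper system, so that at each time step the dynamics of the collaborative robot are equivalent to:
Figure PCTCN2022123754-appb-000048
On this basis, assuming that the ideal end-effector force trajectory obtained through skill learning is $F_d$, the virtual spring-damper system expresses $F_d$ as:
Figure PCTCN2022123754-appb-000049
where
Figure PCTCN2022123754-appb-000050
denotes the end-effector acceleration of the collaborative robot;
Figure PCTCN2022123754-appb-000051
denotes the end-effector acceleration at time step t;
Figure PCTCN2022123754-appb-000052
denotes the end-effector stiffness matrix at time step t;
Figure PCTCN2022123754-appb-000053
denotes the desired end-effector position; $x_t$ denotes the end-effector position;
Figure PCTCN2022123754-appb-000054
denotes the end-effector damping matrix at time step t;
Figure PCTCN2022123754-appb-000055
denotes the end-effector velocity;
Figure PCTCN2022123754-appb-000056
denotes the external force on the end-effector at time step t;
Figure PCTCN2022123754-appb-000057
denotes the unknown stiffness matrix; $\Delta x$ denotes the known displacement;
Figure PCTCN2022123754-appb-000058
denotes the unknown damping matrix; J denotes the known Jacobian matrix;
Figure PCTCN2022123754-appb-000059
denotes the known joint velocity.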
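Further, after the robot-environment interaction model has been constructed and the relationship between the end-effector force and the stiffness matrix obtained, the method also comprises: so that the collaborative robot returns quickly to its equilibrium position without oscillating, setting the unknown damping matrix
Figure PCTCN2022123754-appb-000060
to the critical damping value, namely:
Figure PCTCN2022123754-appb-000061
where Q and Λ denote, respectively, the eigenvectors and eigenvalues of the symmetric positive definite unknown stiffness matrix
Figure PCTCN2022123754-appb-000062
and η denotes a tuning coefficient,
Figure PCTCN2022123754-appb-000063
In the torque control mode of the collaborative robot, a Cartesian damping controller realizes the ideal end-effector force as follows:
Figure PCTCN2022123754-appb-000064
where $\tau_d$ denotes the joint torque with which the Cartesian damping controller realizes the ideal end-effector force; J′ denotes the transpose of the known Jacobian matrix; $F_d$ denotes the ideal end-effector force trajectory;
Figure PCTCN2022123754-appb-000065
denotes the robot kinematic model, in which θ denotes the known joint angle,
Figure PCTCN2022123754-appb-000066
denotes the known joint velocity,
Figure PCTCN2022123754-appb-000067
denotes the known joint acceleration. From the ideal end-effector force trajectory $F_d$ and the controller torque $\tau_d$ it follows that the ideal end-effector force of the collaborative robot is determined by a unique symmetric positive definite stiffness matrix
Figure PCTCN2022123754-appb-000068
where
Figure PCTCN2022123754-appb-000069
denotes the symmetric positive definite manifold (SPD manifold). Variable-stiffness motor skill learning is realized by constructing the skill model $f(K^P \mid F)$ of the end-effector force-stiffness matrix.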
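The equations of step S12 survive in this text only as figure placeholders. As a hedged reading, and not a transcription of those figures, the symbol definitions above are consistent with the usual unit-mass, virtual spring-damper form below. The tilde notation for the unknown matrices, the factor $2\eta$ in the damping law, and the model term $h(\cdot)$ are assumptions introduced here for illustration; the original expressions may differ.

```latex
\begin{aligned}
I_M \ddot{x} &= f^c + f^e
  && \text{(unit mass under control and external force)} \\
I_M \ddot{x}_t &= K^P_t(\hat{x} - x_t) - K^V_t \dot{x}_t + f^e_t
  && \text{(control force from a virtual spring-damper)} \\
F_d &= \tilde{K}^P \Delta x - \tilde{K}^V J \dot{\theta}
  && \text{(ideal end-effector force)} \\
\tilde{K}^V &= 2\eta\, Q \Lambda^{1/2} Q'
  && \text{(critical damping, assumed form)} \\
\tau_d &= J' F_d + h(\theta, \dot{\theta}, \ddot{\theta})
  && \text{(joint torque, } h \text{ a hypothetical model term)}
\end{aligned}
```

Under this reading, once the damping is tied to the stiffness through its eigendecomposition, the desired force depends on a single symmetric positive definite matrix, which is why the skill model can be written as a map from force to stiffness alone.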
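S13: Estimate the stiffness matrix from the end-effector force-stiffness matrix relationship to form an end-effector force-stiffness matrix data set.
In a specific implementation of the present invention, this step comprises:
estimating the stiffness matrix within the end-effector force-stiffness matrix relationship to form the end-effector force-stiffness matrix data set
Figure PCTCN2022123754-appb-000070
where
Figure PCTCN2022123754-appb-000071
denotes the end-effector force at time step t of the m-th demonstration; T denotes the length of each sample; M denotes the number of training samples;
Figure PCTCN2022123754-appb-000072
denotes the stiffness matrix at time step t of the m-th demonstration.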
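The description mentions a coarse least-squares estimate followed by a "stiffness matrix approximation" that enforces symmetry and positive definiteness, but it does not say which approximation is used. One common choice, shown here purely as an assumption, is to symmetrize the coarse estimate and clip its eigenvalues from below. The sketch handles the 2x2 case in closed form so that it needs only the standard library; the function name `project_to_spd_2x2` and the floor `eps` are hypothetical.

```python
import math

def project_to_spd_2x2(B, eps=1e-6):
    """Symmetrize a 2x2 matrix B, then raise its eigenvalues to at least eps.

    This is an illustrative stand-in for the refinement step; the source
    does not specify the approximation method it uses.
    """
    a, d = B[0][0], B[1][1]
    b = 0.5 * (B[0][1] + B[1][0])            # off-diagonal of the symmetric part
    half_trace = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    lam1, lam2 = half_trace + radius, half_trace - radius   # eigenvalues
    # Rotation angle of the eigenvector basis of the symmetric part.
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    l1, l2 = max(lam1, eps), max(lam2, eps)  # clip to keep the result positive definite
    off = (l1 - l2) * c * s
    return [[l1 * c * c + l2 * s * s, off],
            [off, l1 * s * s + l2 * c * c]]

# A coarse estimate that is neither symmetric nor positive definite.
coarse = [[100.0, 30.0], [10.0, -5.0]]
refined = project_to_spd_2x2(coarse)

# A matrix that is already SPD passes through unchanged.
unchanged = project_to_spd_2x2([[4.0, 1.0], [1.0, 3.0]])
```

For larger matrices the same idea applies with a general symmetric eigendecomposition in place of the closed-form 2x2 rotation.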
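S14: Vectorize the end-effector force-stiffness matrix data set by triangular matrix decomposition to obtain an end-effector force-stiffness vector data set.
In a specific implementation of the present invention, this step comprises:
in accordance with the geometric constraints of symmetry and positive definiteness of the stiffness matrix, vectorizing each stiffness matrix in the end-effector force-stiffness matrix data set by triangular matrix decomposition
Figure PCTCN2022123754-appb-000073
and converting it into a stiffness vector represented in Euclidean space, to form the end-effector force-stiffness vector data set
Figure PCTCN2022123754-appb-000074
where
Figure PCTCN2022123754-appb-000075
denotes the end-effector force at time step t of the m-th demonstration; T denotes the length of each sample; M denotes the number of training samples;
Figure PCTCN2022123754-appb-000076
denotes the stiffness matrix at time step t of the m-th demonstration;
Figure PCTCN2022123754-appb-000077
denotes the vectorized description of the upper triangular matrix elements at time step t of the m-th demonstration, of length m(m+1)/2, where m in this expression denotes the dimension of the stiffness matrix; $A_{t,m}$ denotes the upper triangular matrix obtained from the decomposition at time step t of the m-th demonstration; $A'_{t,m}$ is the transpose of $A_{t,m}$.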
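The vectorization in S14 can be sketched as follows, assuming that the triangular decomposition is a Cholesky factorization $K = A'A$ with $A$ upper triangular, which matches the description of $A$ and its transpose but is not confirmed by the equation figure. The helper names and the example stiffness values are hypothetical.

```python
import math

def cholesky_upper(K):
    """Return upper-triangular A with A' A = K for a symmetric positive definite K."""
    n = len(K)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Subtract the contribution of the rows already computed.
            s = K[i][j] - sum(A[k][i] * A[k][j] for k in range(i))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                A[i][i] = math.sqrt(s)
            else:
                A[i][j] = s / A[i][i]
    return A

def to_vector(A):
    """Stack the upper-triangular entries row by row: length n(n+1)/2."""
    n = len(A)
    return [A[i][j] for i in range(n) for j in range(i, n)]

def from_vector(v, n):
    """Inverse transform: rebuild A from the vector, then return K = A' A."""
    A = [[0.0] * n for _ in range(n)]
    entries = iter(v)
    for i in range(n):
        for j in range(i, n):
            A[i][j] = next(entries)
    return [[sum(A[k][i] * A[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Example 3x3 translational stiffness matrix (illustrative values).
K = [[400.0, 20.0, 0.0], [20.0, 300.0, 10.0], [0.0, 10.0, 200.0]]
v = to_vector(cholesky_upper(K))   # Euclidean stiffness vector, 6 entries
K_back = from_vector(v, 3)         # recovered stiffness matrix
```

A useful property of this parameterization is that any vector mapped back through `from_vector` yields a symmetric positive semidefinite matrix, so a regressor operating on the vectors cannot produce an asymmetric stiffness.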
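S15: Model and learn online the motor skill of the end-effector force-stiffness vector from the force-stiffness vector data set to obtain a learning result, the learning result being a conditional probability distribution.
In a specific implementation of the present invention, this step comprises: jointly modeling the inputs and outputs of the end-effector force-stiffness vector data set with a Gaussian mixture model, giving the following model:
Figure PCTCN2022123754-appb-000078
where C denotes the number of Gaussian components in the Gaussian mixture model (GMM); $\pi_c$ denotes the prior probability of the c-th Gaussian component, with
Figure PCTCN2022123754-appb-000079
$\mu_c$ and $\Sigma_c$ denote, respectively, the mean and covariance of the c-th Gaussian component; F denotes the end-effector force;
Figure PCTCN2022123754-appb-000080
denotes the Gaussian distribution; $V^P$ denotes the vectorized description of the upper triangular matrix elements of the collaborative robot. Given the training samples and the number of Gaussian components C, the parameters of the Gaussian mixture model are obtained by iterative optimization with the expectation-maximization algorithm. Once the parameters have been obtained, for any unseen end-effector force input $F_*$, Gaussian mixture regression (GMR) is used to estimate the conditional probability distribution $f(V^P \mid F_*)$ of the corresponding stiffness vector $V^P_*$, that is:
Figure PCTCN2022123754-appb-000081
The above is likewise a Gaussian distribution, described by the corresponding mean
Figure PCTCN2022123754-appb-000082
and variance
Figure PCTCN2022123754-appb-000083
which simplifies to:
Figure PCTCN2022123754-appb-000084
Thus GMM can learn the diversity and probabilistic characteristics of the motion from multiple demonstrated trajectories, which suffices for modeling unknown external disturbances. However, updating the GMM parameters requires re-running the iterative optimization, which makes it hard to apply the learned skill to situations that differ substantially from the demonstrated states. For this reason, the invention builds on the traditional skill learning method GMM-GMR and introduces a multi-output Gaussian process to model unknown environments or new task constraints, achieving online learning of the skill.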
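The GMR conditioning step can be illustrated with a minimal sketch for a scalar force and a single entry of the stiffness vector. The mixture parameters below are hand-written, hypothetical values rather than the result of an EM fit, and the function name `gmr` is an assumption. The returned mean and variance are those of the single Gaussian that matches the first two moments of the conditional mixture, which is the usual GMR simplification.

```python
import math

def gauss_pdf(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gmr(F_star, components):
    """Condition a two-dimensional GMM over (F, V) on F = F_star.

    Each component is (prior, mu_F, mu_V, var_F, cov_FV, var_V).
    """
    # Responsibility of each component for the query force.
    w = [p * gauss_pdf(F_star, mF, vF) for p, mF, _, vF, _, _ in components]
    total = sum(w)
    h = [wi / total for wi in w]
    # Per-component conditional mean and variance of V given F_star.
    means = [mV + cFV / vF * (F_star - mF) for _, mF, mV, vF, cFV, _ in components]
    variances = [vV - cFV * cFV / vF for _, _, _, vF, cFV, vV in components]
    # Moment matching of the conditional mixture.
    mean = sum(hi * mi for hi, mi in zip(h, means))
    var = sum(hi * (si + mi * mi) for hi, si, mi in zip(h, variances, means)) - mean * mean
    return mean, var

# Hypothetical two-component model: low force maps to soft, high force to stiff.
gmm = [(0.5, 2.0, 10.0, 1.0, 0.5, 1.0),
       (0.5, 8.0, 20.0, 1.0, 0.5, 1.0)]
mean_lo, var_lo = gmr(2.0, gmm)
mean_mid, var_mid = gmr(5.0, gmm)
mean_hi, var_hi = gmr(8.0, gmm)
```

Between the two component centres the variance grows, because the query is explained equally well by both components; this is the uncertainty that the Gaussian process stage can then exploit.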
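S16: Use the conditional probability distribution as the mean function of a Gaussian process to adjust and control the end-effector force-stiffness of the collaborative robot.
In a specific implementation of the present invention, this step comprises: for online control and adjustment of the end-effector force-stiffness motor skill, taking the conditional probability distribution obtained by GMM-GMR as the mean function of the Gaussian process, that is:
$\mu(F) = p(V^P \mid F)$;
It encapsulates the probabilistic characteristics (such as shape) of the multiple demonstrated trajectories. The covariance function of the Gaussian process (GP) is then used to handle externally imposed stiffness constraints and to adjust the stiffness trajectory automatically to adapt to changes in the environment. Here $V^P$ denotes the vectorized description of the upper triangular matrix elements of the collaborative robot, and F denotes the input force at the end-effector. Assume the end-effector force-stiffness vector data set contains N samples
Figure PCTCN2022123754-appb-000085
with the underlying functional relationship $V^P = f(F) + \varepsilon$, where
Figure PCTCN2022123754-appb-000086
is noise with unknown variance $\sigma^2$. Given a new test input $F_*$, the corresponding function value $f(F_*)$ and the existing output samples
Figure PCTCN2022123754-appb-000087
have the following joint distribution:
Figure PCTCN2022123754-appb-000088
where I denotes the N-dimensional identity matrix; $k_* = [k(F_*, F_1), k(F_*, F_2), \ldots, k(F_*, F_N)]$; K denotes the covariance matrix defined by the kernel function. The joint distribution yields the multivariate Gaussian conditional probability distribution $P(f(F_*) \mid V^P)$, whose mean $\mu(F_*) = p(V^P_* \mid F_*)$ is obtained by GMM-GMR (Gaussian mixture model with Gaussian mixture regression), and whose variance is:
$D(f(F_*)) = k(F_*, F_*) - k_* (K + \sigma^2 I)^{-1} k_*'$;
where $k_*'$ denotes the transpose of $k_*$.
The covariance matrix defined by the kernel function is described as follows:
Figure PCTCN2022123754-appb-000089
where $k(\cdot,\cdot)$ denotes the kernel function.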
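A minimal sketch of a Gaussian process with a prior mean function is given below for scalar inputs and outputs. The variance follows the expression $D(f(F_*))$ stated above. The correction term added to the prior mean is the standard GP posterior formula and is an assumption here, since the text only states that the mean comes from GMM-GMR. The RBF kernel, the linear `prior_mean` stand-in for the GMR mean, the noise level, and the sample values are all hypothetical.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential kernel k(a, b)."""
    return math.exp(-0.5 * (a - b) ** 2 / length ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gp_predict(F_star, F_train, V_train, prior_mean, noise_var=1e-2):
    """Posterior mean and variance at F_star for a GP with a given prior mean."""
    n = len(F_train)
    # K + sigma^2 I
    Kn = [[rbf(F_train[i], F_train[j]) + (noise_var if i == j else 0.0)
           for j in range(n)] for i in range(n)]
    k_star = [rbf(F_star, Fi) for Fi in F_train]
    residual = [Vi - prior_mean(Fi) for Fi, Vi in zip(F_train, V_train)]
    alpha = solve(Kn, residual)
    beta = solve(Kn, k_star)
    mean = prior_mean(F_star) + sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(F_star, F_star) - sum(k * b for k, b in zip(k_star, beta))
    return mean, var

def prior_mean(F):
    """Stand-in for the GMR conditional mean (hypothetical linear trend)."""
    return 10.0 + 2.0 * F

F_train = [1.0, 2.0, 3.0]
V_train = [12.5, 14.0, 15.5]   # observations that deviate slightly from the prior
m_near, v_near = gp_predict(2.0, F_train, V_train, prior_mean)
m_far, v_far = gp_predict(20.0, F_train, V_train, prior_mean)
```

Far from the observed forces the prediction falls back to the prior mean with the full kernel variance, while near the observations the variance shrinks; this is the behaviour that lets new constraints reshape the stiffness trajectory locally without refitting the mixture model.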
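Thus, combining the Gaussian process with traditional GMM-GMR adjusts the end-effector force-stiffness automatically and adapts efficiently to environmental changes. Finally, the inverse of the triangular matrix decomposition restores the resulting stiffness vector $f(V^P \mid F_*)$ from Euclidean space to the stiffness matrix $f(K^P \mid F_*)$ in Riemannian space, the end-effector stiffness matrix is converted into an end-effector force, and an impedance controller is designed to realize adaptive control and adjustment of the end-effector force-stiffness of the collaborative robot.
In this embodiment of the present invention, a stiffness matrix that satisfies symmetry and positive definiteness is estimated from the high-dimensional motion data of end-effector position, velocity, acceleration, and external contact force. Motor skill imitation is performed on the end-effector force-stiffness vector, so that the end-effector stiffness vector can be estimated for any unseen external contact force given as input. On this basis, the inverse of the triangular decomposition together with the impedance controller of the collaborative robot achieves adaptive adjustment and control of the end-effector stiffness when the contact force changes, meeting the technical requirements that complex tasks such as precision assembly and rehabilitation training place on collaborative robots.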
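Embodiment Two
Refer to FIG. 2, which is a schematic structural diagram of the system for variable-stiffness motor skill learning and regulation of a collaborative robot in an embodiment of the present invention.
As shown in FIG. 2, a system for variable-stiffness motor skill learning and regulation of a collaborative robot comprises: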
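Acquisition module 21: configured to perform motion demonstrations with the collaborative robot on operation tasks in which external contact force and stiffness are correlated, and to obtain the end-effector position, velocity, acceleration, and external contact force of the collaborative robot to form a data set.
In a specific implementation of the present invention, this comprises: mounting a six-axis force sensor on the end-effector of the collaborative robot and, through an interactive skill data acquisition and annotation system, guiding the collaborative robot by hand in zero-force drag mode through multiple demonstrations of the operation task in which external contact force and stiffness are correlated, recording for each demonstration the high-dimensional trajectory of end-effector position, velocity, acceleration, and external contact force to form a data set.
The data set has the form
Figure PCTCN2022123754-appb-000090
where $x_{t,m}$ denotes the end-effector position of the collaborative robot at time step t of the m-th demonstration;
Figure PCTCN2022123754-appb-000091
denotes the end-effector velocity at time step t of the m-th demonstration;
Figure PCTCN2022123754-appb-000092
denotes the end-effector acceleration at time step t of the m-th demonstration;
Figure PCTCN2022123754-appb-000093
denotes the end-effector force at time step t of the m-th demonstration; T denotes the length of each sample (number of time steps); M denotes the number of training samples (demonstrations).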
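Construction module 22: configured to construct a robot-environment interaction model from the data set and obtain the relationship between the end-effector force and the stiffness matrix of the collaborative robot.
In a specific implementation of the present invention, this comprises: on the basis of the data set, assuming that the end-effector of the collaborative robot is a unit mass $I_M$ acted on by a control force $f^c$ and an external force $f^e$, so that its dynamic model simplifies to:
Figure PCTCN2022123754-appb-000094
The control force at the end-effector is generated by a virtual spring-damper system, so that at each time step the dynamics of the collaborative robot are equivalent to:
Figure PCTCN2022123754-appb-000095
On this basis, assuming that the ideal end-effector force trajectory obtained through skill learning is $F_d$, the virtual spring-damper system expresses $F_d$ as:
Figure PCTCN2022123754-appb-000096
where
Figure PCTCN2022123754-appb-000097
denotes the end-effector acceleration of the collaborative robot;
Figure PCTCN2022123754-appb-000098
denotes the end-effector acceleration at time step t;
Figure PCTCN2022123754-appb-000099
denotes the end-effector stiffness matrix at time step t;
Figure PCTCN2022123754-appb-000100
denotes the desired end-effector position; $x_t$ denotes the end-effector position;
Figure PCTCN2022123754-appb-000101
denotes the end-effector damping matrix at time step t;
Figure PCTCN2022123754-appb-000102
denotes the end-effector velocity;
Figure PCTCN2022123754-appb-000103
denotes the external force on the end-effector at time step t;
Figure PCTCN2022123754-appb-000104
denotes the unknown stiffness matrix; $\Delta x$ denotes the known displacement;
Figure PCTCN2022123754-appb-000105
denotes the unknown damping matrix; J denotes the known Jacobian matrix;
Figure PCTCN2022123754-appb-000106
denotes the known joint velocity.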
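Further, after the robot-environment interaction model has been constructed and the relationship between the end-effector force and the stiffness matrix obtained, this also comprises: so that the collaborative robot returns quickly to its equilibrium position without oscillating, setting the unknown damping matrix
Figure PCTCN2022123754-appb-000107
to the critical damping value, namely:
Figure PCTCN2022123754-appb-000108
where Q and Λ denote, respectively, the eigenvectors and eigenvalues of the symmetric positive definite unknown stiffness matrix
Figure PCTCN2022123754-appb-000109
and η denotes a tuning coefficient,
Figure PCTCN2022123754-appb-000110
In the torque control mode of the collaborative robot, a Cartesian damping controller realizes the ideal end-effector force as follows:
Figure PCTCN2022123754-appb-000111
where $\tau_d$ denotes the joint torque with which the Cartesian damping controller realizes the ideal end-effector force; J′ denotes the transpose of the known Jacobian matrix; $F_d$ denotes the ideal end-effector force trajectory;
Figure PCTCN2022123754-appb-000112
denotes the robot kinematic model, in which θ denotes the known joint angle,
Figure PCTCN2022123754-appb-000113
denotes the known joint velocity,
Figure PCTCN2022123754-appb-000114
denotes the known joint acceleration. From the ideal end-effector force trajectory $F_d$ and the controller torque $\tau_d$ it follows that the ideal end-effector force of the collaborative robot is determined by a unique symmetric positive definite stiffness matrix
Figure PCTCN2022123754-appb-000115
where
Figure PCTCN2022123754-appb-000116
denotes the symmetric positive definite manifold. Variable-stiffness motor skill learning is realized by constructing the skill model $f(K^P \mid F)$ of the end-effector force-stiffness matrix.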
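Estimation module 23: configured to estimate the stiffness matrix from the end-effector force-stiffness matrix relationship and form an end-effector force-stiffness matrix data set.
In a specific implementation of the present invention, this comprises:
estimating the stiffness matrix within the end-effector force-stiffness matrix relationship to form the end-effector force-stiffness matrix data set
Figure PCTCN2022123754-appb-000117
where
Figure PCTCN2022123754-appb-000118
denotes the end-effector force at time step t of the m-th demonstration; T denotes the length of each sample; M denotes the number of training samples;
Figure PCTCN2022123754-appb-000119
denotes the stiffness matrix at time step t of the m-th demonstration.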
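Vectorization module 24: configured to vectorize the end-effector force-stiffness matrix data set by triangular matrix decomposition and obtain an end-effector force-stiffness vector data set.
In a specific implementation of the present invention, this comprises:
in accordance with the geometric constraints of symmetry and positive definiteness of the stiffness matrix, vectorizing each stiffness matrix in the end-effector force-stiffness matrix data set by triangular matrix decomposition
Figure PCTCN2022123754-appb-000120
and converting it into a stiffness vector represented in Euclidean space, to form the end-effector force-stiffness vector data set
Figure PCTCN2022123754-appb-000121
where
Figure PCTCN2022123754-appb-000122
denotes the end-effector force at time step t of the m-th demonstration; T denotes the length of each sample; M denotes the number of training samples;
Figure PCTCN2022123754-appb-000123
denotes the stiffness matrix at time step t of the m-th demonstration;
Figure PCTCN2022123754-appb-000124
denotes the vectorized description of the upper triangular matrix elements at time step t of the m-th demonstration, of length m(m+1)/2, where m in this expression denotes the dimension of the stiffness matrix; $A_{t,m}$ denotes the upper triangular matrix obtained from the decomposition at time step t of the m-th demonstration; $A'_{t,m}$ is the transpose of $A_{t,m}$.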
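Online learning module 25: configured to model and learn online the motor skill of the end-effector force-stiffness vector from the force-stiffness vector data set and obtain a learning result, the learning result being a conditional probability distribution.
In a specific implementation of the present invention, this comprises: jointly modeling the inputs and outputs of the end-effector force-stiffness vector data set with a Gaussian mixture model, giving the following model:
Figure PCTCN2022123754-appb-000125
where C denotes the number of Gaussian components in the Gaussian mixture model (GMM); $\pi_c$ denotes the prior probability of the c-th Gaussian component, with
Figure PCTCN2022123754-appb-000126
$\mu_c$ and $\Sigma_c$ denote, respectively, the mean and covariance of the c-th Gaussian component; F denotes the end-effector force;
Figure PCTCN2022123754-appb-000127
denotes the Gaussian distribution; $V^P$ denotes the vectorized description of the upper triangular matrix elements of the collaborative robot. Given the training samples and the number of Gaussian components C, the parameters of the Gaussian mixture model are obtained by iterative optimization with the expectation-maximization algorithm. Once the parameters have been obtained, for any unseen end-effector force input $F_*$, Gaussian mixture regression (GMR) is used to estimate the conditional probability distribution $f(V^P \mid F_*)$ of the corresponding stiffness vector $V^P_*$, that is:
Figure PCTCN2022123754-appb-000128
The above is likewise a Gaussian distribution, described by the corresponding mean
Figure PCTCN2022123754-appb-000129
and variance
Figure PCTCN2022123754-appb-000130
which simplifies to:
Figure PCTCN2022123754-appb-000131
Thus GMM can learn the diversity and probabilistic characteristics of the motion from multiple demonstrated trajectories, which suffices for modeling unknown external disturbances. However, updating the GMM parameters requires re-running the iterative optimization, which makes it hard to apply the learned skill to situations that differ substantially from the demonstrated states. For this reason, the invention builds on the traditional skill learning method GMM-GMR and introduces a multi-output Gaussian process to model unknown environments or new task constraints, achieving online learning of the skill.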
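Modulation control module 26: configured to use the conditional probability distribution as the mean function of a Gaussian process to adjust and control the end-effector force-stiffness of the collaborative robot.
In a specific implementation of the present invention, this comprises: for online control and adjustment of the end-effector force-stiffness motor skill, taking the conditional probability distribution obtained by GMM-GMR as the mean function of the Gaussian process, that is:
$\mu(F) = p(V^P \mid F)$;
It encapsulates the probabilistic characteristics (such as shape) of the multiple demonstrated trajectories. The covariance function of the Gaussian process (GP) is then used to handle externally imposed stiffness constraints and to adjust the stiffness trajectory automatically to adapt to changes in the environment. Here $V^P$ denotes the vectorized description of the upper triangular matrix elements of the collaborative robot, and F denotes the input force at the end-effector. Assume the end-effector force-stiffness vector data set contains N samples
Figure PCTCN2022123754-appb-000132
with the underlying functional relationship $V^P = f(F) + \varepsilon$, where
Figure PCTCN2022123754-appb-000133
is noise with unknown variance $\sigma^2$. Given a new test input $F_*$, the corresponding function value $f(F_*)$ and the existing output samples
Figure PCTCN2022123754-appb-000134
have the following joint distribution:
Figure PCTCN2022123754-appb-000135
where I denotes the N-dimensional identity matrix; $k_* = [k(F_*, F_1), k(F_*, F_2), \ldots, k(F_*, F_N)]$; K denotes the covariance matrix defined by the kernel function. The joint distribution yields the multivariate Gaussian conditional probability distribution $P(f(F_*) \mid V^P)$, whose mean $\mu(F_*) = p(V^P_* \mid F_*)$ is obtained by GMM-GMR (Gaussian mixture model with Gaussian mixture regression), and whose variance is:
$D(f(F_*)) = k(F_*, F_*) - k_* (K + \sigma^2 I)^{-1} k_*'$;
where $k_*'$ denotes the transpose of $k_*$.
The covariance matrix defined by the kernel function is described as follows:
Figure PCTCN2022123754-appb-000136
where $k(\cdot,\cdot)$ denotes the kernel function.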
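Thus, combining the Gaussian process with traditional GMM-GMR adjusts the end-effector force-stiffness automatically and adapts efficiently to environmental changes. Finally, the inverse of the triangular matrix decomposition restores the resulting stiffness vector $f(V^P \mid F_*)$ from Euclidean space to the stiffness matrix $f(K^P \mid F_*)$ in Riemannian space, the end-effector stiffness matrix is converted into an end-effector force, and an impedance controller is designed to realize adaptive control and adjustment of the end-effector force-stiffness of the collaborative robot.
In this embodiment of the present invention, a stiffness matrix that satisfies symmetry and positive definiteness is estimated from the high-dimensional motion data of end-effector position, velocity, acceleration, and external contact force. Motor skill imitation is performed on the end-effector force-stiffness vector, so that the end-effector stiffness vector can be estimated for any unseen external contact force given as input. On this basis, the inverse of the triangular decomposition together with the impedance controller of the collaborative robot achieves adaptive adjustment and control of the end-effector stiffness when the contact force changes, meeting the technical requirements that complex tasks such as precision assembly and rehabilitation training place on collaborative robots.
A person of ordinary skill in the art will understand that all or part of the steps of the methods in the above embodiments can be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may include read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and the like.
The method and system for variable-stiffness motor skill learning and regulation of a collaborative robot provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the invention, and the description of the embodiments is intended only to help in understanding the method and its core idea. A person of ordinary skill in the art may, following the idea of the invention, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the invention.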

Claims (10)

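  1. A method for variable-stiffness motor skill learning and regulation of a collaborative robot, characterized in that the method comprises:
    performing motion demonstrations with the collaborative robot on operation tasks in which external contact force and stiffness are correlated, and obtaining the end-effector position, velocity, acceleration, and external contact force of the collaborative robot to form a data set;
    constructing a robot-environment interaction model from the data set to obtain the relationship between the end-effector force and the stiffness matrix of the collaborative robot;
    estimating the stiffness matrix from that relationship to form an end-effector force-stiffness matrix data set;
    vectorizing the end-effector force-stiffness matrix data set by triangular matrix decomposition to obtain an end-effector force-stiffness vector data set;
    modeling and learning online the motor skill of the end-effector force-stiffness vector from the force-stiffness vector data set to obtain a learning result, the learning result being a conditional probability distribution;
    using the conditional probability distribution as the mean function of a Gaussian process to adjust and control the end-effector force-stiffness of the collaborative robot.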
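  2. The method according to claim 1, characterized in that performing the motion demonstrations and obtaining the end-effector position, velocity, acceleration, and external contact force to form a data set comprises:
    mounting a six-axis force sensor on the end-effector of the collaborative robot and, through an interactive skill data acquisition and annotation system, guiding the collaborative robot by hand in zero-force drag mode through multiple demonstrations of the operation task in which external contact force and stiffness are correlated, recording for each demonstration the high-dimensional trajectory of end-effector position, velocity, acceleration, and external contact force to form a data set;
    the data set has the form
    Figure PCTCN2022123754-appb-100001
    where $x_{t,m}$ denotes the end-effector position of the collaborative robot at time step t of the m-th demonstration;
    Figure PCTCN2022123754-appb-100002
    denotes the end-effector velocity at time step t of the m-th demonstration;
    Figure PCTCN2022123754-appb-100003
    denotes the end-effector acceleration at time step t of the m-th demonstration;
    Figure PCTCN2022123754-appb-100004
    denotes the end-effector force at time step t of the m-th demonstration; T denotes the length of each sample (number of time steps); M denotes the number of training samples (demonstrations).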
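  3. The method according to claim 1, characterized in that constructing the robot-environment interaction model from the data set and obtaining the relationship between the end-effector force and the stiffness matrix comprises:
    on the basis of the data set, assuming that the end-effector of the collaborative robot is a unit mass $I_M$ acted on by a control force $f^c$ and an external force $f^e$, so that its dynamic model simplifies to:
    Figure PCTCN2022123754-appb-100005
    the control force at the end-effector is generated by a virtual spring-damper system, so that at each time step the dynamics of the collaborative robot are equivalent to:
    Figure PCTCN2022123754-appb-100006
    on this basis, assuming that the ideal end-effector force trajectory obtained through skill learning is $F_d$, the virtual spring-damper system expresses $F_d$ as:
    Figure PCTCN2022123754-appb-100007
    where
    Figure PCTCN2022123754-appb-100008
    denotes the end-effector acceleration of the collaborative robot;
    Figure PCTCN2022123754-appb-100009
    denotes the end-effector acceleration at time step t;
    Figure PCTCN2022123754-appb-100010
    denotes the end-effector stiffness matrix at time step t;
    Figure PCTCN2022123754-appb-100011
    denotes the desired end-effector position; $x_t$ denotes the end-effector position;
    Figure PCTCN2022123754-appb-100012
    denotes the end-effector damping matrix at time step t;
    Figure PCTCN2022123754-appb-100013
    denotes the end-effector velocity;
    Figure PCTCN2022123754-appb-100014
    denotes the external force on the end-effector at time step t;
    Figure PCTCN2022123754-appb-100015
    denotes the unknown stiffness matrix; $\Delta x$ denotes the known displacement;
    Figure PCTCN2022123754-appb-100016
    denotes the unknown damping matrix; J denotes the known Jacobian matrix;
    Figure PCTCN2022123754-appb-100017
    denotes the known joint velocity.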
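  4. The method according to claim 1, characterized in that, after constructing the robot-environment interaction model and obtaining the relationship between the end-effector force and the stiffness matrix, the method further comprises:
    so that the collaborative robot returns quickly to its equilibrium position without oscillating, setting the unknown damping matrix
    Figure PCTCN2022123754-appb-100018
    to the critical damping value, namely:
    Figure PCTCN2022123754-appb-100019
    where Q and Λ denote, respectively, the eigenvectors and eigenvalues of the symmetric positive definite unknown stiffness matrix
    Figure PCTCN2022123754-appb-100020
    and η denotes a tuning coefficient,
    Figure PCTCN2022123754-appb-100021
    in the torque control mode of the collaborative robot, a Cartesian damping controller realizes the ideal end-effector force as follows:
    Figure PCTCN2022123754-appb-100022
    where $\tau_d$ denotes the joint torque with which the Cartesian damping controller realizes the ideal end-effector force; J′ denotes the transpose of the known Jacobian matrix; $F_d$ denotes the ideal end-effector force trajectory;
    Figure PCTCN2022123754-appb-100023
    denotes the robot kinematic model, in which θ denotes the known joint angle,
    Figure PCTCN2022123754-appb-100024
    denotes the known joint velocity,
    Figure PCTCN2022123754-appb-100025
    denotes the known joint acceleration;
    from the ideal end-effector force trajectory $F_d$ and the controller torque $\tau_d$ it follows that the ideal end-effector force of the collaborative robot is determined by a unique symmetric positive definite stiffness matrix
    Figure PCTCN2022123754-appb-100026
    where
    Figure PCTCN2022123754-appb-100027
    denotes the symmetric positive definite manifold;
    variable-stiffness motor skill learning is realized by constructing the skill model $f(K^P \mid F)$ of the end-effector force-stiffness matrix.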
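  5. The method according to claim 1, characterized in that estimating the stiffness matrix from the end-effector force-stiffness matrix relationship to form the end-effector force-stiffness matrix data set comprises:
    estimating the stiffness matrix within the end-effector force-stiffness matrix relationship to form the end-effector force-stiffness matrix data set
    Figure PCTCN2022123754-appb-100028
    where
    Figure PCTCN2022123754-appb-100029
    denotes the end-effector force at time step t of the m-th demonstration; T denotes the length of each sample; M denotes the number of training samples;
    Figure PCTCN2022123754-appb-100030
    denotes the stiffness matrix at time step t of the m-th demonstration.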
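  6. The method according to claim 1, characterized in that vectorizing the end-effector force-stiffness matrix data set by triangular matrix decomposition to obtain the end-effector force-stiffness vector data set comprises:
    in accordance with the geometric constraints of symmetry and positive definiteness of the stiffness matrix, vectorizing each stiffness matrix in the end-effector force-stiffness matrix data set by triangular matrix decomposition
    Figure PCTCN2022123754-appb-100031
    and converting it into a stiffness vector represented in Euclidean space, to form the end-effector force-stiffness vector data set
    Figure PCTCN2022123754-appb-100032
    where
    Figure PCTCN2022123754-appb-100033
    denotes the end-effector force at time step t of the m-th demonstration; T denotes the length of each sample; M denotes the number of training samples;
    Figure PCTCN2022123754-appb-100034
    denotes the stiffness matrix at time step t of the m-th demonstration;
    Figure PCTCN2022123754-appb-100035
    denotes the vectorized description of the upper triangular matrix elements at time step t of the m-th demonstration, of length m(m+1)/2, where m in this expression denotes the dimension of the stiffness matrix; $A_{t,m}$ denotes the upper triangular matrix obtained from the decomposition at time step t of the m-th demonstration; $A'_{t,m}$ is the transpose of $A_{t,m}$.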
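  7. The method according to claim 1, characterized in that modeling and learning online the motor skill of the end-effector force-stiffness vector from the force-stiffness vector data set comprises:
    jointly modeling the inputs and outputs of the end-effector force-stiffness vector data set with a Gaussian mixture model, giving the following model:
    Figure PCTCN2022123754-appb-100036
    where C denotes the number of Gaussian components in the Gaussian mixture model; $\pi_c$ denotes the prior probability of the c-th Gaussian component, with
    Figure PCTCN2022123754-appb-100037
    $\mu_c$ and $\Sigma_c$ denote, respectively, the mean and covariance of the c-th Gaussian component; F denotes the end-effector force;
    Figure PCTCN2022123754-appb-100038
    denotes the Gaussian distribution; $V^P$ denotes the vectorized description of the upper triangular matrix elements of the collaborative robot;
    given the training samples and the number of Gaussian components C, the parameters of the Gaussian mixture model are obtained by iterative optimization with the expectation-maximization algorithm;
    once the parameters of the Gaussian mixture model have been obtained, for any unseen end-effector force input $F_*$, Gaussian mixture regression is used to estimate the conditional probability distribution $f(V^P \mid F_*)$ of the corresponding stiffness vector $V^P_*$.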
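  8. The method according to claim 1, characterized in that using the conditional probability distribution as the mean function of a Gaussian process to adjust and control the end-effector force-stiffness of the collaborative robot comprises:
    taking the conditional probability distribution as the mean function of the Gaussian process, that is:
    $\mu(F) = p(V^P \mid F)$;
    where $V^P$ denotes the vectorized description of the upper triangular matrix elements of the collaborative robot; F denotes the input force at the end-effector;
    the covariance function of the Gaussian process is used to handle externally imposed stiffness constraints and to adjust the stiffness trajectory automatically;
    assume the end-effector force-stiffness vector data set contains N samples
    Figure PCTCN2022123754-appb-100039
    with the underlying functional relationship $V^P = f(F) + \varepsilon$, where $\varepsilon \sim \mathcal{N}(0, \sigma^2)$ is noise with unknown variance $\sigma^2$;
    given a new test input $F_*$, the corresponding function value $f(F_*)$ and the existing output samples
    Figure PCTCN2022123754-appb-100040
    have the following joint distribution:
    Figure PCTCN2022123754-appb-100041
    where I denotes the N-dimensional identity matrix; $k_* = [k(F_*, F_1), k(F_*, F_2), \ldots, k(F_*, F_N)]$; K denotes the covariance matrix defined by the kernel function;
    the joint distribution yields the multivariate Gaussian conditional probability distribution $P(f(F_*) \mid V^P)$, whose mean $\mu(F_*) = p(V^P_* \mid F_*)$ is obtained by GMM-GMR (Gaussian mixture model with Gaussian mixture regression), and whose variance is:
    $D(f(F_*)) = k(F_*, F_*) - k_* (K + \sigma^2 I)^{-1} k_*'$;
    where $k_*'$ denotes the transpose of $k_*$.
  9. The method according to claim 8, characterized in that the covariance matrix defined by the kernel function is described as follows:
    Figure PCTCN2022123754-appb-100042
    where $k(\cdot,\cdot)$ denotes the kernel function.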
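  10. A system for variable-stiffness motor skill learning and regulation of a collaborative robot, characterized in that the system comprises:
    an acquisition module, configured to perform motion demonstrations with the collaborative robot on operation tasks in which external contact force and stiffness are correlated, and to obtain the end-effector position, velocity, acceleration, and external contact force of the collaborative robot to form a data set;
    a construction module, configured to construct a robot-environment interaction model from the data set and obtain the relationship between the end-effector force and the stiffness matrix of the collaborative robot;
    an estimation module, configured to estimate the stiffness matrix from that relationship and form an end-effector force-stiffness matrix data set;
    a vectorization module, configured to vectorize the end-effector force-stiffness matrix data set by triangular matrix decomposition and obtain an end-effector force-stiffness vector data set;
    an online learning module, configured to model and learn online the motor skill of the end-effector force-stiffness vector from the force-stiffness vector data set and obtain a learning result, the learning result being a conditional probability distribution;
    a modulation control module, configured to use the conditional probability distribution as the mean function of a Gaussian process to adjust and control the end-effector force-stiffness of the collaborative robot.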
PCT/CN2022/123754 2021-12-28 2022-10-08 一种协作机器人可变刚度运动技能学习与调控方法及系统 WO2023124346A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111628215.4A CN114310888B (zh) 2021-12-28 2021-12-28 一种协作机器人可变刚度运动技能学习与调控方法及系统
CN202111628215.4 2021-12-28

Publications (1)

Publication Number Publication Date
WO2023124346A1 true WO2023124346A1 (zh) 2023-07-06

Family

ID=81015974

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/123754 WO2023124346A1 (zh) 2021-12-28 2022-10-08 一种协作机器人可变刚度运动技能学习与调控方法及系统

Country Status (2)

Country Link
CN (1) CN114310888B (zh)
WO (1) WO2023124346A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114310888B (zh) * 2021-12-28 2024-05-31 广东省科学院智能制造研究所 一种协作机器人可变刚度运动技能学习与调控方法及系统
CN115730475B (zh) * 2023-01-09 2023-05-19 广东省科学院智能制造研究所 一种云边端协同的柔性产线机器人学习系统及方法

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130151442A1 (en) * 2011-12-13 2013-06-13 Iucf-Hyu (Industry-University Cooperation Foundation Hanyang University) Method for learning task skill and robot using thereof
CN111230873A (zh) * 2020-01-31 2020-06-05 武汉大学 一种基于示教学习的协作搬运控制系统及方法
WO2020118730A1 (zh) * 2018-12-14 2020-06-18 中国科学院深圳先进技术研究院 机器人柔顺性控制方法、装置、设备及存储介质
CN112605973A (zh) * 2020-11-19 2021-04-06 广东省科学院智能制造研究所 一种机器人运动技能学习方法及系统
CN114310888A (zh) * 2021-12-28 2022-04-12 广东省科学院智能制造研究所 一种协作机器人可变刚度运动技能学习与调控方法及系统

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2569405B2 (ja) * 1987-11-30 1997-01-08 工業技術院長 力制御機能を有する制御装置
US20200156241A1 (en) * 2018-11-21 2020-05-21 Ford Global Technologies, Llc Automation safety and performance robustness through uncertainty driven learning and control
US11607806B2 (en) * 2019-10-21 2023-03-21 Autodesk, Inc. Techniques for generating controllers for robots
CN111037571B (zh) * 2019-12-31 2022-12-16 广东工业大学 一种机器人自适应变阻尼阻抗控制方法
CN113084812B (zh) * 2021-04-09 2022-06-21 吉林大学 一种机器人末端刚度性能评价方法
CN113352328B (zh) * 2021-06-28 2023-04-07 深圳亿嘉和科技研发有限公司 一种铰接模型的辨识方法及机器人操作方法
CN113386136B (zh) * 2021-06-30 2022-05-20 华中科技大学 一种基于标准球阵目标估计的机器人位姿矫正方法及系统

Also Published As

Publication number Publication date
CN114310888A (zh) 2022-04-12
CN114310888B (zh) 2024-05-31

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22913611

Country of ref document: EP

Kind code of ref document: A1