WO2022088593A1 - Robotic arm control method and device, and human-machine cooperation model training method - Google Patents

Robotic arm control method and device, and human-machine cooperation model training method

Info

Publication number
WO2022088593A1
Authority
WO
WIPO (PCT)
Prior art keywords
human
robotic arm
model
machine
pose
Prior art date
Application number
PCT/CN2021/082254
Other languages
French (fr)
Chinese (zh)
Inventor
段星光
田焕玉
温浩
李长胜
李建玺
田野
靳励行
孟繁盛
Original Assignee
北京理工大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京理工大学 filed Critical 北京理工大学
Publication of WO2022088593A1 publication Critical patent/WO2022088593A1/en

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J11/00Manipulators not otherwise provided for
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J18/00Arms
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1602Programme controls characterised by the control system, structure, architecture
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1656Programme controls characterised by programming, planning systems for manipulators
    • B25J9/1664Programme controls characterised by programming, planning systems for manipulators characterised by motion, path, trajectory planning
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1679Programme controls characterised by the tasks executed

Definitions

  • the present application relates to the field of robotic arms, and in particular, to a control method and device for a robotic arm and a training method for a human-machine collaborative model.
  • the main purpose of this application is to provide a method for controlling a robotic arm, so as to solve the problem that the robot cannot move along the trajectory intended by humans.
  • the present application provides a control method and device for a robotic arm and a training method for a human-machine collaborative model.
  • the present application provides a method for controlling a robotic arm.
  • the control method of the robotic arm according to the present application includes:
  • the man-machine collaboration model is a model for determining the desired pose of the robotic arm according to the human-machine interaction force
  • the robotic arm is controlled according to the optimal trajectory.
  • generating the optimal trajectory of the motion of the robotic arm according to the pose at the current moment and the desired pose corresponding to the human-computer interaction force at the current moment includes:
  • An optimal trajectory is selected from the set of random trajectories.
  • selecting the optimal trajectory from the multiple groups of random trajectories includes:
  • an optimal trajectory is selected through an optimal trajectory control algorithm.
  • controlling the robotic arm according to the optimal trajectory includes:
  • the first mode control is performed on the normal component of the position and attitude angle motion information of the robotic arm;
  • the second mode control is performed on the tangential component of the position and attitude angle motion information of the robotic arm; wherein the first mode is a robot-guided mode in which the robotic arm admittance is greater than that of the second mode, and the second mode is a human-guided mode in which the human admittance is greater than that of the first mode.
  • the present application provides a training method for a human-machine collaboration model, which is used to obtain the human-machine collaboration model in the control method for a robotic arm in the first aspect.
  • the training method of the human-machine collaborative model according to the present application includes:
  • a human-machine collaboration model is established according to the multiple sets of human-computer interaction forces and the multiple sets of robotic arm poses.
  • the method further includes:
  • the human-machine collaborative model is optimized according to the supervised learning method.
  • the present application provides a control device for a robotic arm.
  • the control device of the robotic arm according to the present application includes:
  • a model acquisition module for acquiring a man-machine collaborative model, wherein the man-machine collaborative model is a model for determining the desired pose of the robotic arm according to the man-machine interaction force;
  • a pose obtaining module configured to obtain the pose at the current moment, and obtain the desired pose corresponding to the human-computer interaction force at the current moment according to the human-machine collaboration model
  • a trajectory generation module, configured to generate the optimal trajectory of the motion of the robotic arm according to the pose at the current moment and the desired pose corresponding to the human-computer interaction force at the current moment;
  • the control module is used to control the robotic arm according to the optimal trajectory.
  • model acquisition module includes:
  • the optimization unit is used to optimize the human-machine collaborative model according to the supervised learning method.
  • trajectory generation module includes:
  • a random trajectory generation unit, configured to generate multiple sets of random trajectories through the model predictive control (MPC) algorithm according to the pose at the current moment and the desired pose corresponding to the human-computer interaction force at the current moment;
  • An optimal trajectory generation unit configured to select an optimal trajectory from the multiple groups of random trajectories.
  • the optimal trajectory generation unit further includes:
  • control module includes:
  • the controller control unit is used for controlling the manipulator through the controller of the manipulator according to the optimal trajectory, wherein the controller includes an inner controller that controls the manipulator and an outer controller that controls the human-machine collaboration model.
  • the present application provides a non-transitory computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the control method for a robotic arm provided in the first aspect and/or the training method for the human-machine collaboration model provided in the second aspect are implemented.
  • the present application provides a robot, including a robotic arm, a sensor, a controller, a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, the steps of the control method of the robotic arm provided by the first aspect and/or the training method of the human-machine collaboration model provided by the second aspect are implemented.
  • in the embodiments of the present application, the desired pose corresponding to the human-machine interaction force at the current moment is determined by the human-machine collaboration model, the optimal trajectory of the desired motion of the robotic arm is generated according to the current pose of the robotic arm and that desired pose, and the robotic arm is controlled according to this optimal trajectory. The robot can thus move along the trajectory intended by the human, achieving the technical effect of controlling the robot to accurately understand the doctor's intention and optimize the human-machine interaction experience, and solving the problem that the robot cannot move along the trajectory of human intention.
  • FIG. 1 is a schematic flowchart of a method for controlling a robotic arm according to an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a method for training a human-machine collaborative model according to an embodiment of the present application
  • FIG. 3 is a structural block diagram of a control device of a robotic arm according to an embodiment of the present application.
  • the terms “installed”, “arranged”, “provided with”, “connected”, “interconnected” and “socketed” should be construed in a broad sense.
  • a connection may be a fixed connection, a detachable connection, or a unitary structure; it may be a mechanical connection or an electrical connection; it may be a direct connection, an indirect connection through an intermediary, or internal communication between two devices, elements, or components.
  • the specific meanings of the above terms in the present invention can be understood according to specific situations.
  • the method includes the following steps S11 to S14:
  • S11 Obtain a human-machine collaboration model, wherein the human-machine collaboration model is a model for determining the desired pose of the robotic arm according to the human-machine interaction force.
  • the human-machine collaborative model can be a pre-stored model in the control system of the robotic arm, or a human-machine collaborative model can be obtained by training with a machine learning method, or it can be a human-machine collaborative model optimized after training with a machine learning method.
  • the human-machine collaboration model is obtained by training through a machine learning method.
  • the human-machine collaborative model may be any of various neural network models or Gaussian process models, with a Gaussian Mixture Model (hereinafter referred to as GMM) used for pre-training.
  • the human-computer interaction force can be directly obtained through the force sensor installed on the robotic arm.
  • the force sensor is a multi-dimensional force sensor.
  • the human-computer interaction force is acquired by a three-dimensional or six-dimensional force sensor.
  • the obtained human-computer interaction force at the current moment is input into the human-computer collaboration model, and the predicted expected pose of the robotic arm at the next moment can be obtained.
  • the desired pose is used to control the tangent direction of the path within the limited area, and the control method is exited when the desired pose deviates greatly.
  • the human-computer interaction force may also be a human-computer interaction force that includes a human-machine impedance force.
  • the human-machine interaction force including the human-machine impedance force is obtained as follows: the interaction force is first measured by the force sensor installed on the manipulator, the measured force and the corresponding pose at the current moment are then solved to obtain the virtual constraint of the manipulator (that is, the human-machine impedance force), and the human-machine interaction force including the impedance force is determined by summing the force obtained by the force sensor and the virtual constraint obtained by the solution.
  • “generating the optimal trajectory of the motion of the manipulator according to the pose at the current moment and the desired pose corresponding to the human-computer interaction force at the current moment” specifically means: through the model predictive control (hereinafter referred to as MPC) algorithm, multiple groups of random trajectories are generated according to the pose at the current moment and the desired pose corresponding to the human-computer interaction force at the current moment, and the optimal trajectory is selected from the multiple groups of random trajectories.
  • MPC is an algorithm that, based on the model at the current moment, predicts the process output over a period of time in the future, selects an objective optimization function, predicts the future output sequence, and outputs the control quantity at the current moment; at the next moment, the latest measured data are used for feedback correction of the output sequence predicted at the previous moment. That is, MPC allows the human-machine interaction model at the current moment to predict the desired poses output over a period of time in the future.
  • the expected pose in the future time can be predicted through MPC, multiple sets of random trajectories can be generated, and the optimal trajectory of the multiple sets of random trajectories can be selected.
  • the optimal trajectory of the movement of the robotic arm generated in this step is the optimal trajectory of movement within the limited area. Its characteristic is that the operator controls forward and backward motion in the tangential direction, while the normal direction is controlled autonomously by the robot. Since humans have strong control ability along the tangent while the robot has strong control ability along the normal, the operator transmits the desired position to the robotic arm through the human-machine collaboration model described in claim 1, and the robotic arm tracks the projection of the desired position onto the path to achieve the dragging effect, as sketched below.
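A minimal sketch of this sampling-based MPC step, assuming a simple pose-increment model of the arm; the horizon length, noise scale, and quadratic cost weights are illustrative assumptions rather than values from the application:

```python
import numpy as np

def sample_trajectories(x_now, x_desired, horizon=20, n_samples=64, step_noise=0.01, seed=0):
    """Generate random candidate trajectories drifting from the current pose
    toward the desired pose predicted by the human-machine collaboration model."""
    rng = np.random.default_rng(seed)
    nominal_step = (x_desired - x_now) / horizon
    # Perturb the nominal step with Gaussian noise to obtain random candidates.
    steps = nominal_step + step_noise * rng.standard_normal((n_samples, horizon, x_now.shape[0]))
    return x_now + np.cumsum(steps, axis=1)          # shape: (n_samples, horizon, dim)

def select_optimal(trajectories, x_desired, w_goal=1.0, w_smooth=0.1):
    """Score each candidate with a quadratic cost and return the lowest-cost one."""
    goal_cost = np.sum((trajectories[:, -1, :] - x_desired) ** 2, axis=1)
    smooth_cost = np.sum(np.diff(trajectories, axis=1) ** 2, axis=(1, 2))
    return trajectories[np.argmin(w_goal * goal_cost + w_smooth * smooth_cost)]

# Example with a 6-D pose (x, y, z, roll, pitch, yaw).
x_t = np.zeros(6)                                    # pose at the current moment
x_d = np.array([0.0, 0.1, 0.1, 0.3, 0.1, 0.2])       # desired pose from the collaboration model
best_trajectory = select_optimal(sample_trajectories(x_t, x_d), x_d)
```

The cost here simply trades off reaching the model-predicted desired pose against trajectory smoothness, which is one common way to rank sampled candidates.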
  • Selecting the optimal trajectory from the multiple groups of random trajectories is specifically: selecting the optimal trajectory from the multiple groups of random trajectories through an optimal trajectory control algorithm.
  • the selection of the optimal trajectory may be determined by a linear quadratic regulator algorithm, an iterative linear quadratic regulator (Iterative Linear Quadratic Regulator, hereinafter referred to as iLQR) algorithm, or differential dynamic programming, which is not limited here.
  • the optimal trajectory is selected and determined by the iLQR algorithm in the optimal trajectory control algorithm.
  • the iLQR algorithm can obtain the optimal control law of state nonlinear feedback, which is easy to form closed-loop optimal control. That is, the optimal trajectory among multiple groups of random trajectories can be determined by the iLQR algorithm.
  • the optimal trajectory is gradually updated by an iterative optimization algorithm; when the iteration converges, the resulting trajectory is considered to be the optimal trajectory.
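For linear pose-error dynamics, the iLQR iteration reduces to a single finite-horizon LQR backward pass; the following sketch makes that simplifying assumption, and the single-integrator model and cost weights are illustrative:

```python
import numpy as np

def finite_horizon_lqr(A, B, Q, R, Qf, horizon):
    """Backward Riccati recursion; returns the time-varying feedback gains K_0..K_{N-1}."""
    P = Qf
    gains = []
    for _ in range(horizon):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]

# Illustrative single-integrator model of the 6-D pose error: e_{t+1} = e_t + u_t.
dim = 6
A, B = np.eye(dim), np.eye(dim)
Q, R, Qf = np.eye(dim), 0.1 * np.eye(dim), 10.0 * np.eye(dim)
gains = finite_horizon_lqr(A, B, Q, R, Qf, horizon=20)

# Roll the closed loop forward from the current pose error to obtain the refined trajectory.
e = np.array([0.0, -0.1, -0.1, -0.3, -0.1, -0.2])    # current pose minus desired pose
trajectory = []
for K in gains:
    e = (A - B @ K) @ e
    trajectory.append(e.copy())
```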
  • the movement trajectory (position and velocity) is optimized over a horizon of 10 ms to 500 ms according to the pose at the current moment and the desired pose corresponding to the human-computer interaction force at the current moment.
  • the robot has a large weight on the position along the normal so as to control the position precisely, while the human has a large admittance value in the tangential direction to realize human-guided dragging.
  • humans have stronger control capability than the robot along the tangential component, while the robot has stronger control capability than humans along the normal component.
  • the user transmits the desired position to the robotic arm through the human-machine collaboration model obtained in the above step S11, and the robotic arm realizes the dragging effect by tracking the projected point of the desired position on the path.
  • controlling the robotic arm according to the optimal trajectory specifically includes: acquiring the position and attitude angle motion information of the robotic arm; performing the first mode control on the normal component of the position and attitude angle motion information; and performing the second mode control on the tangential component of the position and attitude angle motion information, wherein the first mode is a robot-guided mode in which the robotic arm admittance is greater than that of the second mode, and the second mode is a human-guided mode in which the human admittance is greater than that of the first mode.
  • the error feedback quantity of the manipulator is constructed from the impedance coordinate system of the actual motion of the manipulator and the desired coordinate system of the desired motion of the manipulator, as shown in formula (1):
  • M(q) is the inertia matrix of the manipulator in Cartesian space, where the unit of the first three rows and columns is kg and the unit of the remaining elements is N·s²/rad; q is the joint angle; for the pose x, the unit of the first three rows is m and the unit of the remaining rows is rad; the coefficient of the velocity term is the viscosity matrix; g(q) is the gravity vector; f_env is the environment interaction wrench, which can be obtained by the force sensor measuring the environment-robot interaction; f is the human-machine interaction force, which can be obtained by the force sensor in the above step S11.
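The expression of formula (1) is not reproduced in this text; based on the variable definitions above, a plausible form of the Cartesian-space dynamics it refers to is the following, where the symbol D for the viscosity matrix is an assumption:

```latex
M(q)\,\ddot{x} + D\,\dot{x} + g(q) = f + f_{\mathrm{env}}
```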
  • the force controller is constructed based on the feedback linearization method to complete the inner loop performance of the manipulator with high stiffness to humans and low stiffness to the environment.
  • the input of the inner loop is the position and attitude of the impedance coordinate system.
  • after the predicted desired pose X_t+1 (0 m, 0.1 m, 0.1 m, 0.3°, 0.1°, 0.2°) is obtained according to the human-machine collaboration model, the robotic arm can be controlled according to the principle of a large admittance for the user and a small admittance for the machine.
  • the first mode is a robot guidance mode
  • the second mode is a human guidance mode.
  • the first mode and the second mode can coexist, but the movement directions of the two modes are different. That is, the first mode control is performed on the robot arm in the normal direction; the second mode control is performed on the robot arm in the tangential direction.
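A minimal sketch of the direction-dependent admittance described by these two modes, assuming the path tangent is known; the admittance gains are illustrative placeholders (large tangential admittance for human guidance, small normal admittance for robot guidance):

```python
import numpy as np

def split_normal_tangential(vector, tangent):
    """Decompose a 3-D vector into its components normal and tangential to the path."""
    tangent = tangent / np.linalg.norm(tangent)
    tangential = np.dot(vector, tangent) * tangent
    return vector - tangential, tangential

def admittance_step(force, tangent, y_tangential=5e-3, y_normal=2e-4):
    """Map the measured human force to a position increment.
    y_tangential >> y_normal: the human guides motion along the path (second mode),
    while the robot keeps the normal direction stiff (first mode)."""
    f_normal, f_tangential = split_normal_tangential(force, tangent)
    return y_normal * f_normal + y_tangential * f_tangential

# Example: a 3-D force from the sensor with the path tangent along +X.
dx = admittance_step(np.array([1.0, 0.5, 0.0]), tangent=np.array([1.0, 0.0, 0.0]))
```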
  • the desired pose corresponding to the human-machine interaction force at the current moment is determined through the human-machine collaboration model, so that the predicted displacement of the robotic arm between the current moment and the predicted moment can be determined; multiple sets of random trajectories of the predicted displacement are generated through MPC, the optimal trajectory among them is determined through the optimal trajectory control algorithm, the position and attitude angle motion information of the manipulator is obtained, and the manipulator is controlled accordingly, achieving the effect of making the robot move along the trajectory intended by humans.
  • the human-machine collaborative model training method includes the following steps S21 and S22 :
  • the human-computer interaction force can be directly obtained through the force sensor installed on the robotic arm.
  • the force sensor is a multi-dimensional force sensor.
  • the human-computer interaction force is acquired by a six-dimensional force sensor.
  • the training force group obtained by the force sensor includes three training force components and three training torque components corresponding to the X, Y, and Z axes.
  • the pose of the robotic arm can be recorded by establishing a coordinate system of the robotic arm including the X, Y, and Z axes.
  • the pose of the robotic arm includes three distance movement components and three angular movement components corresponding to the X, Y, and Z axes.
  • for example, the acquired human-computer interaction force is W_t (1 N, 0 N, 0 N, 0.1 N·m, 0.2 N·m, 0.3 N·m), where 1 N, 0 N, 0 N are the three training force components and 0.1 N·m, 0.2 N·m, 0.3 N·m are the three training torque components.
  • the corresponding robotic arm pose is X_t (0.01 m, 0.02 m, 0.01 m, 0.3°, 0.4°, 0.1°), where 0.01 m, 0.02 m, 0.01 m are the three distance movement components corresponding to the X, Y, and Z axes, and 0.3°, 0.4°, 0.1° are the three angular movement components.
  • the number of groups of human-computer interaction force is the same as that of the pose of the robotic arm.
  • the obtained groups of human-computer interaction forces may be 3-5 groups. That is, when the obtained multiple sets of human-computer interaction forces are 3 sets, the obtained poses of the manipulator are also 3 sets.
  • the human-computer interaction force may also be a human-computer interaction force including a human-computer resistance force.
  • the virtual constraint (i.e., the human-machine impedance force) of the robotic arm is obtained by solving the force obtained by the force sensor together with the corresponding pose at the current moment, so that the human-machine interaction force including the impedance force can be determined by summing the force obtained by the force sensor and the virtual constraint obtained by the solution.
  • the model input for training the human-machine collaboration model may be sampled values of the human-machine interaction force at the current moment and the pose of the robotic arm at the current moment, and the model output is the actual pose of the end of the manipulator at the next moment; the human-machine collaboration model is trained according to these sampled input-output pairs.
  • the network model trained by the human-machine collaborative model may be a Gaussian Mixture Model (GMM for short), a Bayesian network model, a neural network model, etc., which is not limited here.
  • the training method is to draw multiple trajectories by dragging the end of the robotic arm and to record the human-computer interaction force and the current pose of the robotic arm at every moment; based on the inference relationship between the current moment and the next moment, the model can be trained through supervised learning, as sketched below.
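A minimal sketch of this supervised training step using a GMM over joint (force, current pose, next pose) samples. scikit-learn's GaussianMixture (fitted by EM) stands in for the application's model; the synthetic data, the component count, and the Gaussian-mixture-regression prediction step are assumptions for illustration:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Stand-in data for one dragged demonstration: 6-D wrenches W_t and 6-D poses X_t.
rng = np.random.default_rng(0)
forces = rng.normal(size=(200, 6))                           # recorded interaction wrenches
poses = np.cumsum(0.01 * rng.normal(size=(200, 6)), axis=0)  # recorded end-effector poses

# Joint samples (W_t, X_t, X_{t+1}) for supervised training of the collaboration model.
data = np.hstack([forces[:-1], poses[:-1], poses[1:]])       # shape (T-1, 18)
gmm = GaussianMixture(n_components=5, covariance_type="full", random_state=0).fit(data)

def predict_next_pose(gmm, wrench, pose, d_in=12):
    """Gaussian mixture regression: condition the joint GMM on (W_t, X_t)
    and return the expected X_{t+1} as the desired pose."""
    x_in = np.concatenate([wrench, pose])
    cond_means, log_resp = [], []
    for k in range(gmm.n_components):
        mu, cov = gmm.means_[k], gmm.covariances_[k]
        mu_i, mu_o = mu[:d_in], mu[d_in:]
        cov_ii, cov_oi = cov[:d_in, :d_in], cov[d_in:, :d_in]
        diff = x_in - mu_i
        cond_means.append(mu_o + cov_oi @ np.linalg.solve(cov_ii, diff))
        # Unnormalized log responsibility of component k for the observed input.
        log_resp.append(np.log(gmm.weights_[k])
                        - 0.5 * diff @ np.linalg.solve(cov_ii, diff)
                        - 0.5 * np.linalg.slogdet(cov_ii)[1])
    resp = np.exp(np.array(log_resp) - np.max(log_resp))
    resp /= resp.sum()
    return np.sum([r * m for r, m in zip(resp, cond_means)], axis=0)

desired_pose = predict_next_pose(gmm, forces[-1], poses[-1])
```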
  • step S21 “obtaining multiple sets of human-computer interaction forces of the robotic arm and multiple sets of robotic arm poses corresponding to the multiple sets of human-computer interaction forces” may consist in obtaining, within the trust region, the multiple sets of human-computer interaction forces of the robotic arm and the multiple sets of robotic arm poses corresponding to them.
  • the trust region refers to the region where the sampling distribution p_s of the force sensor measurements of the acquired human-computer interaction force lies within the preset KL divergence threshold of the human-machine collaborative model, where the KL divergence refers to the KL divergence between p_s and the human-machine collaborative model; as shown in formula (2), the KL divergence can be expressed as:
  • D_KL is the KL divergence;
  • p_s is the sampling distribution of the force sensor, obtained by maximum likelihood estimation;
  • p_m is the model distribution of the human-machine collaborative model;
  • th_KL is the first preset KL divergence threshold, which can be set by the user.
  • when the prediction model (the human-machine collaborative model) conforms to the actual robot motion, the robot is in the active mode, namely the task execution mode; when the prediction model does not conform to the actual robot motion, the robot is in the free dragging mode.
  • the first preset KL divergence threshold can also be obtained through user learning under different human-machine impedance forces by means of the human-machine collaboration model and the machine learning method (for example, the first preset KL divergence threshold can be -20).
  • the method further includes: judging whether the human-machine collaboration model is an effective model.
  • whether the human-machine collaborative model is an effective model can be judged according to the KL divergence between p_s and p_m. For example, if the KL divergence between p_s and p_m is -35 and the second preset KL divergence threshold is -50, the KL divergence is greater than the second preset threshold, so the human-machine collaborative model is a valid model.
  • alternatively, the likelihood of the human-computer interaction force collected in the above step S21 can be calculated, and if the likelihood is greater than the first preset likelihood threshold, the human-machine collaboration model is an effective model.
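A minimal sketch of such a validity / trust-region check, assuming both the sampling distribution p_s and the model distribution p_m are multivariate Gaussians; the closed-form KL divergence below is standard, while the comparison direction and the threshold value are illustrative assumptions:

```python
import numpy as np

def kl_gaussian(mu_s, cov_s, mu_m, cov_m):
    """Closed-form KL divergence D_KL(p_s || p_m) between two multivariate Gaussians."""
    d = mu_s.shape[0]
    cov_m_inv = np.linalg.inv(cov_m)
    diff = mu_m - mu_s
    return 0.5 * (np.trace(cov_m_inv @ cov_s)
                  + diff @ cov_m_inv @ diff
                  - d
                  + np.linalg.slogdet(cov_m)[1] - np.linalg.slogdet(cov_s)[1])

def model_is_valid(mu_s, cov_s, mu_m, cov_m, th_kl=0.5):
    """Trust-region style check: the collaboration model is treated as valid
    (task-execution mode) when the sampling and model distributions are close."""
    return kl_gaussian(mu_s, cov_s, mu_m, cov_m) < th_kl

# Example with 6-D wrench distributions.
mu_s, cov_s = np.zeros(6), np.eye(6)
mu_m, cov_m = 0.1 * np.ones(6), 1.2 * np.eye(6)
print(model_is_valid(mu_s, cov_s, mu_m, cov_m))
```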
  • the method further includes:
  • the human-machine collaborative model is optimized according to the supervised learning method, and the optimized human-machine collaborative model is generated.
  • the optimized human-machine collaboration model is the model under the trust region.
  • optimizing the parameters of the human-machine collaborative model using the supervised learning method includes using prior information; specifically, the parameters of the human-machine collaborative model are optimized using the maximum likelihood principle of the supervised learning method, and as shown in formula (3), the corresponding optimized parameters of the training model are:
  • pm is the model distribution of the human-machine collaborative model
  • f h is the human-machine interaction force obtained by the force sensor
  • x d is the desired pose of the robotic arm
  • t is the current moment
  • t+1 is the next moment;
  • θ_C denotes the parameters of the human-machine collaboration model; when the human-machine collaboration model is a GMM, these parameters include the serial numbers of the sub-models and the dimensionless weights of the connecting nodes.
  • the human-machine collaboration model is optimized according to the optimized parameters.
  • different optimization methods are used for different modeling methods of the human-machine collaborative model.
  • when the human-machine collaborative model is a GMM, the Expectation-Maximization (hereinafter referred to as EM) algorithm is used to optimize the human-machine collaborative model; for other modeling methods, optimization methods such as stochastic gradient descent (SGD) may be used instead.
  • the embodiment of the present application also provides another control method of a robotic arm, as shown in FIG. 3, which specifically includes the following steps:
  • the human-machine collaborative model is trained based on the collected data of human-machine interaction force and torque.
  • the specific training process is as follows:
  • the human-machine collaborative model takes the human-computer interaction force and displacement as input and outputs the force and torque values at the next moment.
  • for example, the inputs are W_t (1 N, 0 N, 0 N, 0.1 N·m, 0.2 N·m, 0.3 N·m) and X_t (0.01 m, 0.02 m, 0.01 m, 0.3°, 0.4°, 0.1°), and the output is W_t+1 (0 N, 1 N, 1 N, 0.3 N·m, 0.1 N·m, 0.2 N·m).
  • the human-machine collaboration model is optimized by the related training methods of model-based reinforcement learning, such as stochastic gradient descent and variational inference.
  • the parameters are then fixed, and the procedure is executed again from the first step.
  • the human-machine collaborative model is finally obtained through multiple cycles of training in the preceding four steps.
  • the maximum likelihood estimation is performed on the measurement results to obtain the sampling distribution.
  • the maximum likelihood estimation of the trajectory is performed. For example, during the human-machine dragging process, the maximum likelihood estimation of the measurement results is performed by collecting 3-5 continuous human-machine interaction force and torque values.
  • for example, given W1 (1 N, 0.5 N, 0 N, 0.1 N·m, 0.2 N·m, 0.3 N·m), W2 (2 N, 0.5 N, 0 N, 0.1 N·m, 0.2 N·m, 0.3 N·m), and W3 (3 N, 0.5 N, 0 N, 0.1 N·m, 0.2 N·m, 0.3 N·m), the distribution obeyed by W is obtained through maximum likelihood estimation; assuming a Gaussian distribution, the expectation is (1 N, 0.5 N, 0 N, 0.1 N·m, 0.2 N·m, 0.3 N·m), the variance is diag(0.1 N, 0.05 N, 0 N, 0.1 N·m, 0.1 N·m, 0.03 N·m), and diag denotes a diagonal matrix.
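A minimal sketch of this maximum-likelihood step for a handful of collected wrench samples, assuming an axis-wise (diagonal-covariance) Gaussian; the numbers mirror the example wrenches above, and the code simply computes the sample statistics:

```python
import numpy as np

# Three consecutive interaction wrenches (Fx, Fy, Fz, Tx, Ty, Tz), as in the example above.
W = np.array([
    [1.0, 0.5, 0.0, 0.1, 0.2, 0.3],
    [2.0, 0.5, 0.0, 0.1, 0.2, 0.3],
    [3.0, 0.5, 0.0, 0.1, 0.2, 0.3],
])

# MLE for a Gaussian with diagonal covariance: per-axis sample mean and variance.
mu_hat = W.mean(axis=0)        # expectation of the sampling distribution
var_hat = W.var(axis=0)        # diagonal entries of the covariance matrix
cov_hat = np.diag(var_hat)
```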
  • if the KL divergence is within the trustworthy threshold, the human-machine collaborative model is an effective model and this mode is the task execution mode; if the KL divergence is not within the trustworthy threshold, that is, the value of the KL divergence is greater than the trustworthy threshold and the distance between the sampling distribution and the model distribution is large, the human-machine collaboration model is an invalid model and this mode is the free drag mode.
  • the trustworthy threshold is the corresponding threshold in the trustworthy region, and the specific KL divergence can be expressed as:
  • D is the KL divergence, which represents the distance between two Gaussian distributions;
  • Ps is the pose distribution (i.e., the model distribution) obtained by passing the human-computer interaction force from the force sensor through the human-machine collaborative model;
  • Pm is the actual pose distribution (i.e., the sampling distribution) obtained from the actual path by maximum likelihood estimation;
  • th_KL is the preset KL divergence threshold, which can be set by the user. When the KL divergence calculated from Ps and Pm is detected to be less than th_KL, it means that the two distributions are close, that is, the model calculation conforms to the actual robot motion, and the robot is in the active mode at this time; when the model calculation does not conform to the actual robot motion, the robot is in the free drag mode.
  • the task execution mode means that the robot arm performs specific operations on the operation object according to the plan. It can be seen that the method of the embodiment of the present application can judge which mode is currently active based on the KL divergence between the sampling distribution and the model distribution, so as to realize free switching between the two modes.
  • in the task execution mode, the human-machine collaborative model provides assistance for trajectory control and can also switch between the two modes; in the free drag mode, the human-machine collaborative model does not provide assistance for trajectory control and only comes into play for switching between the two modes.
  • the validity and invalidity of the model in this application are differentiated according to whether it can provide help for control. If the model can provide help, it is effective, and if it cannot provide help, it is invalid.
  • assuming the human-machine collaboration model is the effective model, the control flow in the task execution mode is described in detail below.
  • the desired pose corresponding to the human-computer interaction force at the current moment is obtained according to the human-machine collaboration model and is recorded as the first pose; the desired pose corresponding to the trajectory planned by the host computer is obtained and is recorded as the second pose; the pose of the robot arm in the impedance coordinate system is determined based on the difference between the first pose and the second pose, the optimal trajectory is then determined, and the robotic arm is controlled accordingly.
  • the human-computer interaction force can be obtained through a force sensor.
  • a force sensor is set at the handle that realizes the human-computer interaction, and the human-computer interaction force can be collected through the force sensor. After inputting the human-computer interaction force and the current pose, the expected pose at the next moment can be predicted, and the desired pose is recorded as the first pose
  • the impedance coordinate system refers to the coordinate system after deformation (change). A specific example: obstacles may be encountered in the actual operation process, so it is necessary to bypass them;
  • the coordinates corresponding to bypassing the obstacles are the coordinates in the impedance coordinate system.
  • the optimal trajectory is generated by the iLQR method, and the optimal trajectory is obtained to realize the update of the target point of the manipulator, which belongs to the outer loop control.
  • inner loop control is then required, that is, the manipulator is controlled according to the optimal trajectory. Specifically: the pose of the manipulator in the impedance coordinate system (the output of the outer loop control) is obtained; for motion in the attitude angles and the normal direction, the host computer plans with a large admittance and human dragging is controlled with a small admittance; for motion in the tangential direction, human dragging has a large admittance and the host computer plans with a small admittance.
  • the inner loop control is based on the dynamic model of the robot arm itself (robot dynamics model).
  • the inner loop control can get the actual torque of the robot arm.
  • the output of the outer loop is the input of the inner loop, and the output of the inner loop is the input of the outer loop.
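A schematic sketch of this nested loop; every object and method name here is a hypothetical placeholder for the corresponding component (force sensor, collaboration model, iLQR/MPC outer loop, torque-level inner loop), not an actual API:

```python
# Hypothetical skeleton of the nested control loops (all names are placeholders, not real APIs).
def control_cycle(robot, collaboration_model, outer_planner, inner_controller):
    wrench = robot.read_force_sensor()                        # human-machine interaction wrench
    pose = robot.read_pose()                                  # current pose of the arm
    desired_pose = collaboration_model.predict(wrench, pose)  # outer loop: collaboration model
    target = outer_planner.optimal_trajectory(pose, desired_pose)    # outer loop: iLQR / MPC update
    torque = inner_controller.track(target, robot.read_state())      # inner loop: torque control
    robot.apply_torque(torque)   # the resulting motion feeds back into the next outer-loop cycle
```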
  • the doctor interaction model is the human-machine collaboration model; the controller further involves the human-machine collaboration impedance model and the robot model (robot dynamics model).
  • the feedforward controller, the feedback controller, and kh as a whole correspond to the iLQR method.
  • regarding the human-machine cooperation impedance model: the human-machine impedance model is constructed by the iLQR method to control the errors of the desired coordinate system and the impedance coordinate system in different directions.
  • MPC predicts, based on the model at the current moment, the process output over a period of time in the future; it selects the objective optimization function, predicts the future output sequence, and outputs the control quantity at the current moment.
  • the expected pose in the future time can be predicted through the human-machine collaboration model, multiple sets of random trajectories can be generated, and the optimal trajectory of the multiple sets of random trajectories can be selected.
  • the control mode is the torque control mode of the robot, and its purpose is to track the trajectory generated by the impedance model.
  • a human-machine cooperative impedance model, that is, a second-order stiffness-damping model, is used.
  • this characteristic is a fixed, time-invariant differential equation, and its purpose is to keep the feel during dragging consistent.
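A standard second-order stiffness-damping relation of the kind referred to here, written for the error e between the impedance and desired coordinate systems, would read as below; the symbols M_d, D_d, K_d and the external force term f_ext are assumptions:

```latex
M_d\,\ddot{e} + D_d\,\dot{e} + K_d\,e = f_{\mathrm{ext}}
```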
  • the linear controller is updated using a value function that combines labor saving in the main direction, human labor saving in the auxiliary direction, and the smallest error.
  • based on the observation of the online doctor interaction dynamic model (doctor interaction model), the doctor interaction model is compared with the nominal dynamics (planned trajectory), so as to obtain the doctor's intention and realize the corresponding mode switching.
  • control method of the present application can be applied in the medical field, such as orthopedic surgery and puncture in the surgical field.
  • a device 10 for implementing the above-mentioned control method of a robotic arm is also provided.
  • the control device 10 of the robotic arm includes:
  • the model obtaining module 11 is used for obtaining a human-machine collaboration model, wherein the human-machine collaboration model is a model for determining the desired pose of the robotic arm according to the human-machine interaction force;
  • the pose obtaining module 12 is configured to obtain the pose at the current moment, and obtain the desired pose corresponding to the human-computer interaction force at the current moment according to the human-machine collaboration model;
  • a trajectory generation module 13, configured to generate the optimal trajectory of the motion of the robotic arm according to the pose at the current moment and the desired pose corresponding to the human-computer interaction force at the current moment;
  • the control module 14 is configured to control the robotic arm according to the optimal trajectory.
  • model acquisition module 11 includes:
  • the optimization unit is used to optimize the human-machine collaborative model according to the supervised learning method.
  • trajectory generation module 13 includes:
  • a random trajectory generation unit, configured to generate multiple sets of random trajectories through the model predictive control (MPC) algorithm according to the pose at the current moment and the desired pose corresponding to the human-computer interaction force at the current moment;
  • An optimal trajectory generation unit configured to select an optimal trajectory from the multiple groups of random trajectories.
  • the optimal trajectory generation unit further includes:
  • control module 14 includes:
  • the controller control unit is used for controlling the manipulator through the controller of the manipulator according to the optimal trajectory, wherein the controller includes an inner controller that controls the manipulator and an outer controller that controls the human-machine collaboration model.
  • the desired pose corresponding to the human-machine interaction force at the current moment is determined through the human-machine collaboration model, so that the predicted displacement of the robotic arm between the current moment and the predicted moment can be determined; multiple sets of random trajectories of the predicted displacement are generated through MPC, the optimal trajectory among them is determined through the optimal trajectory control algorithm, the position and attitude angle motion information of the manipulator is obtained, and the manipulator is controlled accordingly, achieving the effect of making the robot move along the trajectory intended by humans.
  • the above modules or steps of the present invention can be implemented by a general-purpose computing device; they can be centralized on a single computing device or distributed over a network composed of multiple computing devices. Alternatively, they can be implemented with program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device, or they can be made into individual integrated circuit modules, or multiple modules or steps among them can be fabricated into a single integrated circuit module. As such, the present invention is not limited to any particular combination of hardware and software.

Abstract

Disclosed are a robotic arm control method and device, and a human-machine cooperation model training method. The robotic arm control method comprises: obtaining a human-machine cooperation model, the human-machine cooperation model being a model for determining an expected attitude of a robotic arm according to a human-machine interaction force; obtaining the attitude at the current moment, and obtaining, according to the human-machine cooperation model, an expected attitude corresponding to the human-machine interaction force at the current moment; generating, according to the attitude at the current moment and the expected attitude corresponding to the human-machine interaction force at the current moment, an optimal trajectory where the robotic arm moves; and controlling the robotic arm according to the optimal trajectory. The present application solves the problem that a robot cannot move along a trajectory intended by the human.

Description

Control Method and Device of Robotic Arm, and Training Method of Human-Machine Collaborative Model
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese patent application No. 2020111594282, filed with the Chinese Patent Office on October 26, 2020 and entitled "Control Method and Device of Robotic Arm and Training Method of Human-Machine Collaborative Model", the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
The present application relates to the field of robotic arms, and in particular to a control method and device for a robotic arm and a training method for a human-machine collaborative model.
BACKGROUND
In the field of orthopedic and puncture robots, there is a class of robots that can be applied in the surgical field and that interact with both the doctor and the environment. Such robots can move according to the doctor's interaction force and do work on the environment. However, in the related art, when such a robot is dragged along a specific trajectory (such as an arc or a straight line), it cannot infer the human intention from the human's behavior, so it cannot move along the trajectory intended by the human. How to control the robot so that it accurately understands the doctor's intention and optimizes the robot-doctor interaction experience has therefore become an urgent problem to be solved.
For the problem that the robot cannot move along the trajectory intended by the human, no effective solution has been proposed yet.
SUMMARY OF THE INVENTION
The main purpose of the present application is to provide a control method for a robotic arm, so as to solve the problem that the robot cannot move along the trajectory intended by humans.
In order to achieve the above purpose, the present application provides a control method and device for a robotic arm and a training method for a human-machine collaborative model.
In a first aspect, the present application provides a control method for a robotic arm.
The control method of the robotic arm according to the present application includes:
obtaining a human-machine collaboration model, wherein the human-machine collaboration model is a model for determining the desired pose of the robotic arm according to the human-machine interaction force;
obtaining the pose at the current moment, and obtaining the desired pose corresponding to the human-machine interaction force at the current moment according to the human-machine collaboration model;
generating the optimal trajectory of the motion of the robotic arm according to the pose at the current moment and the desired pose corresponding to the human-machine interaction force at the current moment;
controlling the robotic arm according to the optimal trajectory.
Further, generating the optimal trajectory of the motion of the robotic arm according to the pose at the current moment and the desired pose corresponding to the human-machine interaction force at the current moment includes:
generating multiple groups of random trajectories through the model predictive control (MPC) algorithm according to the pose at the current moment and the desired pose corresponding to the human-machine interaction force at the current moment;
selecting the optimal trajectory from the multiple groups of random trajectories.
Further, selecting the optimal trajectory from the multiple groups of random trajectories includes:
selecting the optimal trajectory from the multiple groups of random trajectories through an optimal trajectory control algorithm.
Further, controlling the robotic arm according to the optimal trajectory includes:
acquiring the position and attitude angle motion information of the robotic arm;
performing the first mode control on the normal component of the position and attitude angle motion information of the robotic arm;
performing the second mode control on the tangential component of the position and attitude angle motion information of the robotic arm, wherein the first mode is a robot-guided mode in which the robotic arm admittance is greater than that of the second mode, and the second mode is a human-guided mode in which the human admittance is greater than that of the first mode.
In a second aspect, the present application provides a training method for a human-machine collaboration model, which is used to obtain the human-machine collaboration model in the control method for a robotic arm of the first aspect.
The training method of the human-machine collaborative model according to the present application includes:
obtaining multiple sets of human-machine interaction forces of the robotic arm and multiple sets of robotic arm poses corresponding to the multiple sets of human-machine interaction forces, the multiple sets of human-machine interaction forces being multiple sets of original human-machine interaction forces;
establishing a human-machine collaboration model according to the multiple sets of human-machine interaction forces and the multiple sets of robotic arm poses.
Further, after the human-machine collaboration model is established according to the multiple sets of human-machine interaction forces and the multiple sets of robotic arm poses, the method further includes:
optimizing the human-machine collaborative model according to the supervised learning method.
In a third aspect, the present application provides a control device for a robotic arm.
The control device of the robotic arm according to the present application includes:
a model acquisition module, configured to acquire a human-machine collaboration model, wherein the human-machine collaboration model is a model for determining the desired pose of the robotic arm according to the human-machine interaction force;
a pose acquisition module, configured to obtain the pose at the current moment, and obtain the desired pose corresponding to the human-machine interaction force at the current moment according to the human-machine collaboration model;
a trajectory generation module, configured to generate the optimal trajectory of the motion of the robotic arm according to the pose at the current moment and the desired pose corresponding to the human-machine interaction force at the current moment;
a control module, configured to control the robotic arm according to the optimal trajectory.
Further, the model acquisition module includes:
an optimization unit, configured to optimize the human-machine collaborative model according to the supervised learning method.
Further, the trajectory generation module includes:
a random trajectory generation unit, configured to generate multiple sets of random trajectories through the model predictive control (MPC) algorithm according to the pose at the current moment and the desired pose corresponding to the human-machine interaction force at the current moment;
an optimal trajectory generation unit, configured to select the optimal trajectory from the multiple groups of random trajectories.
Further, the optimal trajectory generation unit is further configured to:
select the optimal trajectory from the multiple groups of random trajectories through an optimal trajectory control algorithm.
Further, the control module includes:
a controller control unit, configured to control the manipulator through the controller of the manipulator according to the optimal trajectory, wherein the controller includes an inner controller that controls the manipulator and an outer controller that controls the human-machine collaboration model.
In a fourth aspect, the present application provides a non-transitory computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the control method for a robotic arm provided in the first aspect and/or the training method for the human-machine collaboration model provided in the second aspect are implemented.
In a fifth aspect, the present application provides a robot, including a robotic arm, a sensor, a controller, a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, the steps of the control method of the robotic arm provided in the first aspect and/or the training method of the human-machine collaboration model provided in the second aspect are implemented.
In the embodiments of the present application, the desired pose corresponding to the human-machine interaction force at the current moment is determined by the human-machine collaboration model, the optimal trajectory of the desired motion of the robotic arm is generated according to the current pose of the robotic arm and that desired pose, and the robotic arm is controlled according to this optimal trajectory. The robot can thus move along the trajectory intended by the human, achieving the technical effect of controlling the robot to accurately understand the doctor's intention and optimize the human-machine interaction experience, and solving the problem that the robot cannot move along the trajectory of human intention.
BRIEF DESCRIPTION OF THE DRAWINGS
In order to illustrate the specific embodiments of the present invention or the technical solutions in the prior art more clearly, the accompanying drawings required in the description of the specific embodiments or the prior art are briefly introduced below. Obviously, the accompanying drawings in the following description are only some embodiments of the present invention, and for those of ordinary skill in the art, other drawings can also be obtained from these drawings without any creative effort.
FIG. 1 is a schematic flowchart of a method for controlling a robotic arm according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of a method for training a human-machine collaborative model according to an embodiment of the present application;
FIG. 3 is a structural block diagram of a control device of a robotic arm according to an embodiment of the present application.
DETAILED DESCRIPTION
In order to make those skilled in the art better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings of the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
It should be noted that the terms "first", "second" and the like in the description, the claims and the above drawings of the present invention are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that the data so used are interchangeable under appropriate circumstances so that the embodiments of the invention described herein can be implemented. Furthermore, the terms "comprising" and "having" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device comprising a series of steps or units is not necessarily limited to those steps or units expressly listed, but may include other steps or units not expressly listed or inherent to such processes, methods, products or devices.
In the present invention, the terms "installed", "arranged", "provided with", "connected", "interconnected" and "socketed" should be construed in a broad sense. For example, a connection may be a fixed connection, a detachable connection, or a unitary structure; it may be a mechanical connection or an electrical connection; it may be a direct connection, an indirect connection through an intermediary, or internal communication between two devices, elements, or components. For those of ordinary skill in the art, the specific meanings of the above terms in the present invention can be understood according to the specific situation.
It should be noted that the embodiments of the present invention and the features of the embodiments may be combined with each other without conflict. The present invention will be described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
如图1所示,该方法包括如下的步骤S11至步骤S14:As shown in FIG. 1, the method includes the following steps S11 to S14:
S11:获取人机协同模型,其中所述人机协同模型为根据人机交互力确定机械臂期望位姿 的模型。S11: Obtain a human-machine collaboration model, wherein the human-machine collaboration model is a model for determining the desired pose of the robotic arm according to the human-machine interaction force.
人机协同模型可以是机械臂的控制系统中预存的模型,也可以通过机器学习方法进行训练得到人机协同模型,也可以是通过机器学习方法训练后进行优化的人机协同模型。在该实施例中,示例的,人机协同模型为通过机器学习方法进行训练得到的,具体的训练方法以参见后面的实施例部分图2的说明。具体的,人机协同模型是经由高斯混合模型(Gaussian Mixture Model,以下简称为GMM)作为预训练的各种神经网络模型或高斯过程模型。The human-machine collaborative model can be a pre-stored model in the control system of the robotic arm, or a human-machine collaborative model can be obtained by training with a machine learning method, or it can be a human-machine collaborative model optimized after training with a machine learning method. In this embodiment, by way of example, the human-machine collaboration model is obtained by training through a machine learning method. For the specific training method, please refer to the description of FIG. 2 in the embodiment section below. Specifically, the human-machine collaborative model is a variety of neural network models or Gaussian process models that are pre-trained via a Gaussian Mixture Model (hereinafter referred to as GMM).
S12: Obtain the pose at the current moment, and obtain, according to the human-machine collaboration model, the desired pose corresponding to the human-machine interaction force at the current moment.

The human-machine interaction force can be obtained directly through a force sensor mounted on the robotic arm. Specifically, the force sensor is a multi-dimensional force sensor; in this embodiment, by way of example, a three-dimensional or six-dimensional force sensor is used. The human-machine interaction force acquired at the current moment is input into the human-machine collaboration model to obtain the predicted desired pose of the robotic arm at the next moment. The desired pose is used to control motion along the tangential direction of the path within a limited region, and the control method is exited when the desired pose deviates significantly. The human-machine interaction force may also be a human-machine interaction force that includes a human-machine impedance force: the force measured by the force sensor mounted on the robotic arm and the corresponding pose at the current moment are first used to solve for the virtual constraint of the robotic arm (i.e., the human-machine impedance force), and the human-machine interaction force including the human-machine impedance force is then determined by summing the measured human-machine interaction force and the solved virtual constraint.
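For illustration only, the prediction step described above can be sketched as follows, assuming a pre-trained regression model that exposes a generic predict() interface (the model object and its interface are assumptions, not part of the original disclosure):

```python
import numpy as np

def predict_desired_pose(model, pose_t, wrench_t):
    """Predict the desired pose at t+1 from the current pose and the measured
    human-machine interaction wrench.

    pose_t   : (6,) array [x, y, z, roll, pitch, yaw]
    wrench_t : (6,) array [Fx, Fy, Fz, Mx, My, Mz] from the force sensor
    model    : any regressor exposing predict() (assumed interface)
    """
    features = np.concatenate([pose_t, wrench_t]).reshape(1, -1)
    return model.predict(features)[0]   # (6,) desired pose at t+1
```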
S13: Generate the optimal trajectory of the robotic arm motion according to the pose at the current moment and the desired pose corresponding to the human-machine interaction force at the current moment.

"Generating the optimal trajectory of the robotic arm motion according to the pose at the current moment and the desired pose corresponding to the human-machine interaction force at the current moment" specifically means: generating multiple groups of random trajectories through a model predictive control (MPC) algorithm according to the pose at the current moment and the desired pose corresponding to the human-machine interaction force at the current moment, and selecting the optimal trajectory from the multiple groups of random trajectories.

Specifically, MPC is an algorithm that predicts the process output over a future period based on a model at the current moment, selects an objective optimization function, predicts the future output sequence and outputs the control quantity at the current moment, and uses the latest measured data at the next moment to perform feedback correction on the output sequence of the previous moment. That is, MPC allows the human-machine interaction model at the current moment to predict the desired poses output over a future period. According to the pose at the current moment and the human-machine collaboration model, the desired poses at future times can be predicted through MPC, multiple groups of random trajectories can be generated, and the optimal trajectory can be selected from them. Optionally, the optimal trajectory generated in this step is the optimal trajectory of the robotic arm moving within a limited region; its characteristic is that the operator can control forward and backward motion in the tangential direction, while the normal direction is controlled autonomously by the robot. Since the human has strong control ability in the tangential direction while the robot has strong control ability in the normal direction, the operator transmits the desired position to the robotic arm through the human-machine collaboration model described above, and the robotic arm achieves the dragging effect by tracking the projection point of the desired position on the path.

"Selecting the optimal trajectory from the multiple groups of random trajectories" specifically means: selecting the optimal trajectory from the multiple groups of random trajectories through an optimal trajectory control algorithm.

Specifically, the optimal trajectory may be determined through a linear quadratic regulator algorithm, an iterative linear quadratic regulator (iLQR) algorithm or differential dynamic programming, which is not limited here. In this embodiment, by way of example, the optimal trajectory is determined through the iLQR algorithm among the optimal trajectory control algorithms. The iLQR algorithm can obtain the optimal control law with nonlinear state feedback, which makes it easy to construct closed-loop optimal control; that is, the optimal trajectory among the multiple groups of random trajectories can be determined by the iLQR algorithm. The optimal trajectory is updated step by step through an iterative optimization algorithm, and when the iteration converges, the iterated trajectory is regarded as the optimal trajectory. Optionally, the motion trajectory (motion position and velocity) is optimized over a horizon of 10 ms to 500 ms according to the pose at the current moment and the desired pose corresponding to the human-machine interaction force at the current moment. The robot carries a value weight on position along the normal direction so as to control the position precisely, while the human has a large admittance value in the tangential direction so as to realize human-guided dragging. In other words, the human has stronger control ability than the robot on the tangential component, whereas the robot has stronger control ability than the human on the normal component. The user transmits the desired position to the robotic arm through the human-machine collaboration model of step S11, and the robotic arm achieves the dragging effect by tracking the projection point of the desired position on the path.
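A much-simplified sketch of the sample-then-select idea described above is given below; the straight-line nominal trajectory, the noise level, the quadratic cost weights and the function names are all assumptions for illustration, and the final iLQR refinement is only indicated in a comment:

```python
import numpy as np

def rollout_random_trajectories(x0, x_des, horizon=20, n_samples=50, noise=0.01, seed=0):
    """Sample candidate pose trajectories between the current pose x0 and the
    desired pose x_des (both 6-D), mimicking the MPC sampling step."""
    rng = np.random.default_rng(seed)
    base = np.linspace(x0, x_des, horizon)                       # nominal trajectory
    return base + rng.normal(0.0, noise, (n_samples, horizon, x0.size))

def trajectory_cost(traj, x_des, Q=None, R=None):
    """Quadratic cost: tracking error to the desired pose plus effort
    approximated by pose increments (assumed cost structure)."""
    Q = np.eye(traj.shape[-1]) if Q is None else Q
    R = 0.1 * np.eye(traj.shape[-1]) if R is None else R
    err = traj - x_des
    du = np.diff(traj, axis=0)
    return np.einsum('ti,ij,tj->', err, Q, err) + np.einsum('ti,ij,tj->', du, R, du)

def select_optimal_trajectory(x0, x_des):
    candidates = rollout_random_trajectories(x0, x_des)
    costs = [trajectory_cost(traj, x_des) for traj in candidates]
    best = candidates[int(np.argmin(costs))]
    # In the method described above this candidate would be refined further
    # with iLQR until the iteration converges; here it is simply returned.
    return best
```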
S14: Control the robotic arm according to the optimal trajectory.

"Controlling the robotic arm according to the optimal trajectory" specifically means: acquiring the position and attitude-angle motion information of the robotic arm; performing first-mode control on the normal components of the position and attitude-angle motion information of the robotic arm; and performing second-mode control on the tangential components of the position and attitude-angle motion information of the robotic arm. The first mode is a robot-guided mode in which the robotic arm admittance is greater than the robotic arm admittance of the second mode; the second mode is a human-guided mode in which the human admittance is greater than the human admittance of the first mode.
Specifically, according to the robot dynamics, the error feedback quantity of the robotic arm is constructed from the impedance coordinate frame of the actual motion of the robotic arm and the desired coordinate frame of the desired motion of the robotic arm, as shown in formula (1):
M(q)ẍ + Dẋ + g(q) = f + f_env          (1)

where M(q) is the inertia matrix of the robotic arm in Cartesian space (the units of the first three columns of the matrix are kg and those of all remaining elements are N·s²/rad); q is the joint angle; the units of the first three rows of x are m and those of all remaining rows are rad; D is the viscosity matrix; g(q) is the gravity vector; f_env is the environment interaction wrench, which can be obtained by the force sensor measuring the environment-robotic arm interaction; and f is the human-machine interaction force, which can be obtained by the force sensor described in step S11.
Based on this dynamics expression, a force controller is constructed with the feedback linearization method, so that the inner loop of the robotic arm exhibits high stiffness toward the human and low stiffness toward the environment. The inner-loop input is the position and attitude of the impedance coordinate frame. Through the iLQR method of step S13, the optimal motion trajectory corresponding to each joint of the robotic arm and the variable control parameters based on the cost-matrix weights can be obtained, and the error feedback between this optimal (desired) trajectory and the actual trajectory of the robotic arm is controlled separately in each direction.
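As a rough, hedged illustration of a feedback-linearization style inner loop (a sketch under assumed interfaces and gains, not the patented controller), one could write:

```python
import numpy as np

def inner_loop_torque(M, D, g, J, x, xdot, x_ref, xdot_ref, xddot_ref,
                      Kp, Kd, f_ext):
    """Feedback-linearization style inner loop (illustrative sketch only).

    M, D, g   : Cartesian inertia matrix, viscosity matrix and gravity vector
    J         : robot Jacobian at the current joint configuration
    x, xdot   : measured pose and velocity (6-D)
    x_ref, xdot_ref, xddot_ref : references from the outer loop (iLQR/MPC)
    Kp, Kd    : diagonal gain matrices; per-direction gains can be set high
                toward the human and low toward the environment
    f_ext     : measured external wrench (human + environment)
    """
    e, edot = x_ref - x, xdot_ref - xdot
    a_cmd = xddot_ref + Kd @ edot + Kp @ e        # commanded Cartesian acceleration
    F = M @ a_cmd + D @ xdot + g - f_ext          # cancel modelled dynamics and wrench
    return J.T @ F                                # map the Cartesian wrench to joint torques
```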
During path tracking (tracking the path generated by pre-planning and human-machine interaction), when the position and attitude of the robotic arm move along the normal sub-direction (i.e., the attitude angles and the normal motion), the robotic arm is controlled with a large machine admittance (the machine does more work) and a small user admittance (the human does less work), i.e., the first mode; when the position and attitude of the robotic arm move along the tangential sub-direction (preferably, in the tangential direction), the robotic arm is controlled with a large user admittance (the human does more work) and a small machine admittance (the machine does less work), i.e., the second mode. For example, when the prediction of the desired pose X_{t+1} = (0 m, 0.1 m, 0.1 m, 0.3°, 0.1°, 0.2°) is obtained according to the human-machine collaboration model, the robotic arm can be controlled under the principle of large user admittance and small machine admittance. The first mode is the robot-guided mode and the second mode is the human-guided mode; the two modes can coexist, but the motion directions of the two modes differ. That is, first-mode control is applied to the robotic arm in the normal direction, and second-mode control is applied to the robotic arm in the tangential direction.
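The directional admittance split described above can be illustrated with the following sketch, in which the measured wrench is decomposed into its tangential and normal components along the path; the gain values and the function name are assumptions for illustration only:

```python
import numpy as np

def directional_admittance_velocity(f_h, tangent, y_human=0.05, y_robot=0.005):
    """Map the translational part of the human wrench to a commanded velocity.

    tangent : unit vector along the path (human-guided, large admittance)
    The normal components receive the small admittance, so the robot dominates
    there. The gains (in (m/s)/N) are illustrative values only.
    """
    force = np.asarray(f_h[:3], dtype=float)
    t_hat = np.asarray(tangent, dtype=float)
    t_hat = t_hat / np.linalg.norm(t_hat)
    f_tan = np.dot(force, t_hat) * t_hat      # tangential component (human-controlled)
    f_norm = force - f_tan                    # normal component (robot-controlled)
    return y_human * f_tan + y_robot * f_norm
```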
From the above description, it can be seen that the present invention achieves the following technical effects:

The desired pose corresponding to the human-machine interaction force at the current moment is determined through the human-machine collaboration model, so that the predicted displacement of the robotic arm between the current moment and the prediction moment can be determined; multiple groups of random trajectories of the predicted displacement are generated through MPC; the optimal trajectory among the multiple groups of random trajectories is then determined according to the optimal trajectory control algorithm; and the position and attitude-angle motion information of the robotic arm is acquired to control the robotic arm, thereby achieving the effect of making the robot move along the trajectory intended by the human.
According to an embodiment of the present application, a method for obtaining the human-machine collaboration model used in the above control method of the robotic arm is also provided. As shown in FIG. 2, the training method of the human-machine collaboration model includes the following steps S21 and S22.

S21: Obtain multiple groups of human-machine interaction forces of the robotic arm and multiple groups of robotic arm poses corresponding to the multiple groups of human-machine interaction forces, where the multiple groups of human-machine interaction forces are multiple groups of original human-machine interaction forces.

S22: Establish a human-machine collaboration model according to the multiple groups of human-machine interaction forces and the multiple groups of robotic arm poses.

The human-machine interaction force can be obtained directly through a force sensor mounted on the robotic arm. Specifically, the force sensor is a multi-dimensional force sensor; in this embodiment, by way of example, the human-machine interaction force is obtained through a six-dimensional force sensor. A training force group obtained by the force sensor includes three training force components and three training torque components corresponding to the X, Y and Z axes. The robotic arm pose can be recorded by establishing a coordinate frame of the robotic arm including the X, Y and Z axes; specifically, the robotic arm pose includes three translation components and three angular components corresponding to the X, Y and Z axes. For example, an acquired human-machine interaction force is W_t = (1 N, 0 N, 0 N, 0.1 N·m, 0.2 N·m, 0.3 N·m), where 1 N, 0 N, 0 N are the three training force components and 0.1 N·m, 0.2 N·m, 0.3 N·m are the three training torque components; the corresponding robotic arm pose is X_t = (0.01 m, 0.02 m, 0.01 m, 0.3°, 0.4°, 0.1°), where 0.01 m, 0.02 m, 0.01 m are the three translation components corresponding to the X, Y and Z axes, and 0.3°, 0.4°, 0.1° are the three angular components. It should be noted that the number of groups of human-machine interaction forces is the same as the number of groups of robotic arm poses. For example, assume that 3 to 5 groups of human-machine interaction forces are obtained; when 3 groups of human-machine interaction forces are obtained, 3 groups of robotic arm poses are obtained as well. The human-machine interaction force may also be a human-machine interaction force that includes a human-machine impedance force: the virtual constraint of the robotic arm (i.e., the human-machine impedance force) is solved from the force acquired by the force sensor and the corresponding pose at the current moment, so that the human-machine interaction force including the human-machine impedance force can be determined by summing the human-machine interaction force acquired by the force sensor and the solved virtual constraint.
The model input for training the human-machine collaboration model may be the sampled values of the human-machine interaction force at the current moment and the pose of the robotic arm at the current moment, and the human-machine collaboration model is trained according to these sampled inputs. The output is the actual pose of the end of the robotic arm at the next moment; through this input-output relationship, the pose relationship of the robotic arm between the current moment and the next moment can be obtained, so the model serves a predictive function. Specifically, the network model used for training the human-machine collaboration model may be a Gaussian Mixture Model (GMM), a Bayesian network model, a neural network model or the like, which is not limited here. The training procedure is that a person drags the end of the robotic arm to draw multiple trajectories, and the human-machine interaction force and the current pose of the robotic arm are recorded at every moment. Based on the inference relationship between the current moment and the next moment, the model can be trained through supervised learning.
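As a minimal sketch of this supervised pairing (the recording layout, the choice of regressor and the hyperparameters are assumptions for illustration only):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor  # any regressor could be used

def build_training_pairs(forces, poses):
    """forces, poses: arrays of shape (T, 6) recorded while a person drags the
    end of the robotic arm. Inputs are (force_t, pose_t), targets are pose_{t+1}."""
    X = np.hstack([forces[:-1], poses[:-1]])   # shape (T-1, 12)
    y = poses[1:]                              # shape (T-1, 6)
    return X, y

def train_collaboration_model(forces, poses):
    X, y = build_training_pairs(forces, poses)
    model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000)
    model.fit(X, y)
    return model
```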
Specifically, step S21, "obtaining multiple groups of human-machine interaction forces of the robotic arm and multiple groups of robotic arm poses corresponding to the multiple groups of human-machine interaction forces", may be performed within a trust region, i.e., the multiple groups of human-machine interaction forces and the corresponding multiple groups of robotic arm poses are obtained within the trust region. The trust region refers to the region in which the KL divergence between p_s, the sampling distribution of the force sensor for the acquired human-machine interaction force, and the human-machine collaboration model lies within the preset KL divergence threshold, where the KL divergence refers to the KL divergence between p_s and the human-machine collaboration model and can be expressed as in formula (2):
D_KL(p_s, p_m) ≤ th_KL          (2)
where D_KL is the KL divergence; p_s is the sampling distribution of the force sensor, obtained by maximum likelihood estimation; p_m is the model distribution of the human-machine collaboration model; and th_KL is the first preset KL divergence threshold, which may be set by the user. When the KL divergence computed between p_s and p_m is detected to be smaller than th_KL, the two distributions are close, i.e., the prediction model (the human-machine collaboration model) matches the actual robot motion and the robot is in the active mode (task execution mode); when the prediction model does not match the actual robot motion, the robot is in the free dragging mode. The first preset KL divergence threshold may also be obtained by the human-machine collaboration model through machine learning from the user's behaviour under different human-machine impedance forces (for example, the first preset KL divergence threshold may be -20).
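Assuming, for this sketch only, that both p_s and p_m are modelled as multivariate Gaussians, the trust-region check can use the standard closed-form KL divergence:

```python
import numpy as np

def kl_gaussians(mu_s, cov_s, mu_m, cov_m):
    """Closed-form D_KL( N(mu_s, cov_s) || N(mu_m, cov_m) )."""
    k = mu_s.size
    inv_m = np.linalg.inv(cov_m)
    diff = mu_m - mu_s
    return 0.5 * (np.trace(inv_m @ cov_s) + diff @ inv_m @ diff - k
                  + np.log(np.linalg.det(cov_m) / np.linalg.det(cov_s)))

def in_trust_region(mu_s, cov_s, mu_m, cov_m, th_kl):
    """Trust-region test corresponding to formula (2)."""
    return kl_gaussians(mu_s, cov_s, mu_m, cov_m) <= th_kl
```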
Further, after training the human-machine collaboration model, the method further includes: judging whether the human-machine collaboration model is a valid model.

Specifically, whether the human-machine collaboration model is a valid model can be judged by determining whether the KL divergence between p_s and p_m is greater than a second preset KL divergence; if it is greater, the human-machine collaboration model is a valid model (for example, if the KL divergence between p_s and p_m is -35 and the second preset KL divergence is -50, the KL divergence between p_s and p_m is greater than the second preset KL divergence, so the human-machine collaboration model is a valid model).

Specifically, whether the human-machine collaboration model is a valid model can also be judged by calculating the likelihood of the human-machine interaction forces collected in step S21 and determining whether the likelihood is greater than a first preset likelihood threshold; if it is greater, the human-machine collaboration model is a valid model. For example, assume that the three groups of human-machine interaction forces collected in step S21 are W_1 = (1 N, 0.5 N, 0 N, 0.1 N·m, 0.2 N·m, 0.3 N·m), W_2 = (2 N, 0.5 N, 0 N, 0.1 N·m, 0.2 N·m, 0.3 N·m) and W_3 = (3 N, 0.5 N, 0 N, 0.1 N·m, 0.2 N·m, 0.3 N·m); a model likelihood (e.g., 0.3) can be computed from W_1, W_2 and W_3, and it is then judged whether the model likelihood is greater than the first preset likelihood threshold; if the model likelihood is greater than the threshold, the human-machine collaboration model is a valid model (for example, with a model likelihood of 5 and a first preset likelihood threshold of 2.5, the model likelihood is greater than the first preset likelihood threshold, so the human-machine collaboration model is a valid model).
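A sketch of this likelihood-based validity check, assuming a Gaussian model distribution (the threshold and the interface are illustrative assumptions):

```python
import numpy as np
from scipy.stats import multivariate_normal

def model_is_valid(model_mean, model_cov, collected_wrenches, likelihood_threshold):
    """Average log-likelihood of the collected interaction wrenches W_1..W_n
    under the model distribution, compared against a preset threshold."""
    dist = multivariate_normal(mean=model_mean, cov=model_cov)
    avg_loglik = float(np.mean(dist.logpdf(np.asarray(collected_wrenches))))
    return avg_loglik > likelihood_threshold
```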
Further, after establishing the human-machine collaboration model according to the multiple groups of human-machine interaction forces and the multiple groups of robotic arm poses, the method further includes:

optimizing the human-machine collaboration model according to a supervised learning method to generate an optimized human-machine collaboration model. The optimized human-machine collaboration model is the model within the trust region.

Optimizing the parameters of the human-machine collaboration model with the supervised learning method includes using prior information. Specifically, the parameters of the human-machine collaboration model are optimized using the maximum likelihood principle of supervised learning, and the corresponding parameters of the optimized training model are given by formula (3):
θ_C = arg max_{θ_C} Σ_t log p_m(x_{d,t+1} | x_{d,t}, f_{h,t}; θ_C)          (3)
where p_m is the model distribution of the human-machine collaboration model, f_h is the human-machine interaction force obtained by the force sensor, x_d is the desired pose of the robotic arm, t is the current moment, t+1 is the next moment, and θ_C denotes the parameters of the human-machine collaboration model. Specifically, when the human-machine collaboration model is a GMM, θ_C is the index of the component model; when the human-machine collaboration model is a neural network model, θ_C is the dimensionless weights of the connecting nodes.

For example, after the parameters of the human-machine collaboration model are optimized, the human-machine collaboration model is optimized according to the optimized parameters. Specifically, different optimization methods are used for different modelling choices of the human-machine collaboration model: when the human-machine collaboration model is a GMM, the Expectation-Maximization (EM) algorithm is used to optimize it; when the human-machine collaboration model is a neural network, the stochastic gradient descent (SGD) method is used to optimize it.
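As an illustration of the GMM/EM variant (a sketch only; scikit-learn's GaussianMixture fits its parameters by EM, and conditioning the fitted joint density would then yield the pose prediction), the joint vectors could be fitted as follows:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_gmm_collaboration_model(forces, poses, n_components=5, seed=0):
    """Fit a GMM by EM on the joint vectors [force_t, pose_t, pose_{t+1}].
    Conditioning the fitted joint density on (force_t, pose_t) would then give
    a Gaussian mixture regression prediction of the next desired pose."""
    joint = np.hstack([forces[:-1], poses[:-1], poses[1:]])   # shape (T-1, 18)
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="full", random_state=seed)
    gmm.fit(joint)
    return gmm
```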
Further, an embodiment of the present application also provides another control method of the robotic arm, as shown in FIG. 3, which specifically includes the following steps.

S31: Obtain a human-machine collaboration model; each time multiple consecutive human-machine interaction force and torque values are collected during the human-machine dragging process, perform maximum likelihood estimation on the measurement results to obtain a sampling distribution.

The human-machine collaboration model is trained on the collected data of human-machine interaction force and torque. The specific training process is as follows.

First, the data of human-machine interaction force and torque is collected by "human-robotic arm" dragging, yielding the force and torque values corresponding to the three directions x, y and z, for example W = (1 N, 0 N, 0 N, 0.1 N·m, 0.2 N·m, 0.3 N·m), and the corresponding position movement X = (0.01 m, 0.02 m, 0.01 m, 0.3°, 0.4°, 0.1°); in practice a large amount of data is collected.

Second, through a linear model or a neural network, a human-machine collaboration model is obtained that takes the human-machine interaction force and the displacement (position movement) as input and outputs the force and torque values at the next moment. For example, the input is W_t = (1 N, 0 N, 0 N, 0.1 N·m, 0.2 N·m, 0.3 N·m) and X_t = (0.01 m, 0.02 m, 0.01 m, 0.3°, 0.4°, 0.1°), and the output is W_{t+1} = (0 N, 1 N, 1 N, 0.3 N·m, 0.1 N·m, 0.2 N·m).

Third, the human-machine collaboration model is optimized through training methods related to model-based reinforcement learning, such as stochastic gradient descent and variational inference.

Finally, based on the current model parameters, i.e., the weights of the neural network, the parameters are fixed and the procedure is executed again from the first step.

Through multiple training cycles of the preceding four steps, the human-machine collaboration model is finally obtained.
After the human-machine collaboration model is obtained, "each time multiple consecutive human-machine interaction force and torque values are collected during the human-machine dragging process, perform maximum likelihood estimation on the measurement results to obtain a sampling distribution" may specifically mean performing maximum likelihood estimation on the force trajectory measured over the preceding period. For example, during the human-machine dragging process, 3 to 5 consecutive human-machine interaction force and torque values are collected and maximum likelihood estimation is performed on the measurement results. Specifically, three consecutive groups of human-machine interaction force and torque values input from t_0 to t can be collected, W_1 = (1 N, 0.5 N, 0 N, 0.1 N·m, 0.2 N·m, 0.3 N·m), W_2 = (2 N, 0.5 N, 0 N, 0.1 N·m, 0.2 N·m, 0.3 N·m) and W_3 = (3 N, 0.5 N, 0 N, 0.1 N·m, 0.2 N·m, 0.3 N·m), and the distribution followed by W is obtained through maximum likelihood estimation; assuming a Gaussian distribution, the mean is (1 N, 0.5 N, 0 N, 0.1 N·m, 0.2 N·m, 0.3 N·m) and the variance is diag(0.1 N, 0.05 N, 0 N, 0.1 N·m, 0.1 N·m, 0.03 N·m), where diag denotes a diagonal matrix.
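A minimal sketch of this maximum-likelihood step, under the diagonal-covariance Gaussian assumption used in the example above:

```python
import numpy as np

def fit_sampling_distribution(wrench_window):
    """Maximum-likelihood Gaussian fit to a short window (e.g. 3-5 samples)
    of 6-D force/torque measurements collected while dragging."""
    W = np.asarray(wrench_window, dtype=float)   # shape (n, 6)
    mu = W.mean(axis=0)                          # MLE mean
    cov = np.diag(W.var(axis=0))                 # MLE diagonal covariance
    return mu, cov
```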
S32: Based on multiple human-machine interaction force and torque values generated by the human-machine collaboration model, perform maximum likelihood estimation to obtain the model distribution.

Continuing the example of the above step, the distribution of the force trajectory computed by the human-machine collaboration model at time t is assumed to be Gaussian, with mean (2 N, 0.5 N, 0 N, 0.1 N·m, 0.2 N·m, 0.3 N·m) and variance diag(0.2 N, 0.05 N, 0 N, 0.1 N·m, 0.1 N·m, 0.03 N·m), where diag denotes a diagonal matrix.
S33: Calculate the KL divergence between the sampling distribution and the model distribution.

The KL divergence between the distributions obtained in steps S31 and S32 is calculated; specifically, the distance between the two distributions is computed through the KL divergence.

S34: Compare the value of the KL divergence with the trust threshold, and judge according to the comparison result whether the robotic arm executes the free dragging mode or the task execution mode.

If the KL divergence is within the trust threshold, i.e., the value of the KL divergence is less than or equal to the trust threshold, the distance between the sampling distribution and the model distribution is small, the human-machine collaboration model is a valid model, and this case corresponds to the task execution mode; if the KL divergence is not within the trust threshold, i.e., the value of the KL divergence is greater than the trust threshold, the distance between the sampling distribution and the model distribution is large, the human-machine collaboration model is an invalid model, and this case corresponds to the free dragging mode.
The trust threshold is the threshold corresponding to the trust region, and the KL divergence can be expressed as:

D_KL(P_s, P_m) ≤ th_KL

where D_KL is the KL divergence, representing the distance between two Gaussian distributions; P_s is the sampling distribution, i.e., the actual distribution obtained by maximum likelihood estimation from the measured force trajectory in step S31; P_m is the model distribution, i.e., the distribution generated through the human-machine collaboration model in step S32; and th_KL is the preset KL divergence threshold, which can be set by the user. When the KL divergence computed between P_s and P_m is detected to be smaller than th_KL, the two distributions are close; that is, the model calculation matches the actual robot motion and the robot is in the active mode. When the model calculation does not match the actual robot motion, the robot is in the free dragging mode.
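Given the two fitted distributions, the mode switch of steps S31-S34 reduces to a single comparison; the sketch below assumes the KL value has already been computed, e.g. with the kl_gaussians sketch shown earlier:

```python
def select_mode(kl_value, th_kl):
    """Return the control mode given the KL divergence between the sampling
    distribution (S31) and the model distribution (S32)."""
    return "task_execution" if kl_value <= th_kl else "free_dragging"

# Example (hypothetical values):
# mode = select_mode(kl_gaussians(mu_s, cov_s, mu_m, cov_m), th_kl)
```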
It should also be noted that the task execution mode means that the robotic arm performs specific operations on the operation object according to the plan. It can be seen that the method of the embodiment of the present application can judge which mode is currently active according to the KL divergence between the sampling distribution and the model distribution, so as to realize free switching between the two modes. In the task execution mode, the human-machine collaboration model assists the trajectory control and can also realize the switching between the two modes; in the free dragging mode, the human-machine collaboration model does not assist the trajectory control and only plays a role in the switching between the two modes. The validity and invalidity of the model in this application are distinguished by whether it can assist the control: if it can provide assistance, it is valid; if it cannot, it is invalid.

The human-machine collaboration model is a valid model only in the task execution mode, so the control flow in the task execution mode is described in detail.
Specifically, in the task execution mode, the desired pose corresponding to the human-machine interaction force at the current moment is obtained according to the human-machine collaboration model and recorded as the first pose; the desired pose corresponding to the trajectory planned by the host computer is obtained and recorded as the second pose; the pose of the robotic arm in the impedance coordinate frame is determined based on the difference between the first pose and the second pose, and the optimal trajectory is then determined; and the robotic arm is controlled according to the optimal trajectory.

The human-machine interaction force can be obtained by a force sensor; for example, a force sensor is arranged at the handle used for human-machine interaction, and the human-machine interaction force can be collected through this force sensor. After the human-machine interaction force and the current pose are input, the desired pose at the next moment can be predicted; this desired pose is recorded as the first pose.

It should be noted that the impedance coordinate frame refers to the coordinate frame after deformation (change). As a concrete example, an obstacle may be encountered during the actual operation and therefore needs to be bypassed; the coordinates that allow the obstacle to be bypassed are the coordinates in the impedance coordinate frame.

The optimal trajectory is generated by the iLQR method; obtaining the optimal trajectory serves to update the target point of the robotic arm, which belongs to the outer-loop control. After the target point is updated, inner-loop control is required, i.e., the robotic arm is controlled according to the optimal trajectory. Specifically: the pose of the robotic arm in the impedance coordinate frame (the output of the outer-loop control) is obtained; for the attitude angles and the normal-direction motion, the host-computer plan is given a large admittance and human dragging a small admittance; for the tangential-direction motion, human dragging is given a large admittance and the host-computer plan a small admittance. The inner-loop control is realized based on the robotic arm's own dynamics model (robot dynamics model) and yields the actual torque of the robotic arm. The output of the outer loop is the input of the inner loop, and the output of the inner loop is the input of the outer loop. The control of the inner and outer loops is shown in FIG. 4, in which the doctor interaction model (i.e., the human-machine collaboration model) belongs to the outer-loop control, and the human-machine cooperation impedance model and the robot model (robot dynamics model) belong to the inner-loop control. The feedforward controller, the feedback controller and kh as a whole correspond to the iLQR method. Regarding the human-machine cooperation impedance model: the human-machine impedance model is constructed by the iLQR method to control the errors between the desired coordinate frame and the impedance coordinate frame in separate directions.

It should also be noted that MPC is an algorithm that predicts the process output over a future period based on a model at the current moment, selects an objective optimization function, predicts the future output sequence and outputs the control quantity at the current moment, and uses the latest measured data at the next moment to perform feedback correction on the output sequence of the previous moment. That is, in this embodiment, MPC can generate a series of control quantities by means of the human-machine collaboration model such that the total cost of the trajectory corresponding to the control quantities (robotic arm torques) and states (robotic arm poses) is the lowest. According to the pose at the current moment and the human-machine collaboration model, the desired poses at future times can be predicted through the human-machine collaboration model, multiple groups of random trajectories can be generated, and the optimal trajectory can be selected from them.
Finally, the effects of the embodiments of the present application are summarized as follows.

1. The position and velocity of the robot model are controlled by constructing an inverse dynamics method and a gravity compensation method (robot dynamics model); the control mode is the torque control mode of the robot, and its purpose is to track the trajectory generated by the impedance model.

2. A human-machine cooperation impedance model, i.e., a second-order stiffness-damping model, is constructed to represent the physical characteristics of the interaction between human and machine; this characteristic is a fixed and time-invariant differential equation, and its purpose is to keep the dragging feel consistent.

3. A feedforward controller and a feedback controller are designed based on linear controller theory and related methods, and the linear controller is updated using a value function in which the robot saves effort in the main direction, the human saves effort in the auxiliary direction and the error is minimized.

4. Based on online observation of the doctor interaction dynamics model (doctor interaction model), the doctor interaction model is compared with the nominal dynamics (planned trajectory), so that the doctor's intention is obtained and a certain mode switching is realized.

5. The control method of the present application can be applied in the medical field, for example to orthopedic surgery and puncture in the surgical field.
According to an embodiment of the present invention, a device 10 for implementing the above control method of the robotic arm is also provided. As shown in FIG. 5, the control device 10 of the robotic arm includes:

a model obtaining module 11, configured to obtain a human-machine collaboration model, wherein the human-machine collaboration model is a model that determines the desired pose of the robotic arm according to the human-machine interaction force;

a pose obtaining module 12, configured to obtain the pose at the current moment and obtain, according to the human-machine collaboration model, the desired pose corresponding to the human-machine interaction force at the current moment;

a trajectory generation module 13, configured to generate the optimal trajectory of the robotic arm motion according to the pose at the current moment and the desired pose corresponding to the human-machine interaction force at the current moment;

a control module 14, configured to control the robotic arm according to the optimal trajectory.
Further, the model obtaining module 11 includes:

an optimization unit, configured to optimize the human-machine collaboration model according to a supervised learning method.

Further, the trajectory generation module 13 includes:

a random trajectory generation unit, configured to generate multiple groups of random trajectories through the model predictive control (MPC) algorithm according to the pose at the current moment and the desired pose corresponding to the human-machine interaction force at the current moment;

an optimal trajectory generation unit, configured to select the optimal trajectory from the multiple groups of random trajectories.

Further, the optimal trajectory generation unit is further configured to select the optimal trajectory from the multiple groups of random trajectories through an optimal trajectory control algorithm.

Further, the control module 14 includes:

a controller control unit, configured to control the robotic arm through a controller of the robotic arm according to the optimal trajectory, wherein the controller includes an inner-loop controller that controls the robotic arm and an outer-loop controller that controls the human-machine collaboration model.

Specifically, for the implementation of each module in this embodiment, reference may be made to the corresponding implementation in the method embodiments, and details are not repeated here.
From the above description, it can be seen that the present application achieves the following technical effects:

The desired pose corresponding to the human-machine interaction force at the current moment is determined through the human-machine collaboration model, so that the predicted displacement of the robotic arm between the current moment and the prediction moment can be determined; multiple groups of random trajectories of the predicted displacement are generated through MPC; the optimal trajectory among the multiple groups of random trajectories is then determined according to the optimal trajectory control algorithm; and the position and attitude-angle motion information of the robotic arm is acquired to control the robotic arm, thereby achieving the effect of making the robot move along the trajectory intended by the human.
Obviously, those skilled in the art should understand that the above modules or steps of the present invention can be implemented by a general-purpose computing device; they can be centralized on a single computing device or distributed over a network composed of multiple computing devices; optionally, they can be implemented with program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device, or they can be made into individual integrated circuit modules, or multiple modules or steps among them can be made into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.

It should be noted that the steps shown in the flowcharts of the accompanying drawings may be executed in a computer system such as a set of computer-executable instructions, and, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from that herein. Different embodiments may also refer to or be combined with each other.

Although the embodiments of the present invention have been described with reference to the accompanying drawings, those skilled in the art can make various modifications and variations without departing from the spirit and scope of the present invention, and such modifications and variations all fall within the scope defined by the appended claims.

Claims (18)

  1. A control method of a robotic arm, characterized by comprising:
    obtaining a human-machine collaboration model, wherein the human-machine collaboration model is a model that determines a desired pose of the robotic arm according to a human-machine interaction force;
    obtaining a pose at the current moment, and obtaining, according to the human-machine collaboration model, a desired pose corresponding to the human-machine interaction force at the current moment;
    generating an optimal trajectory of the robotic arm motion according to the pose at the current moment and the desired pose corresponding to the human-machine interaction force at the current moment;
    controlling the robotic arm according to the optimal trajectory.
  2. The control method of the robotic arm according to claim 1, wherein the generating an optimal trajectory of the robotic arm motion according to the pose at the current moment and the desired pose corresponding to the human-machine interaction force at the current moment comprises:
    generating multiple groups of random trajectories through a model predictive control (MPC) algorithm according to the pose at the current moment and the desired pose corresponding to the human-machine interaction force at the current moment;
    selecting an optimal trajectory from the multiple groups of random trajectories.
  3. The control method of the robotic arm according to claim 2, wherein the selecting an optimal trajectory from the multiple groups of random trajectories comprises:
    selecting the optimal trajectory from the multiple groups of random trajectories through an optimal trajectory control algorithm.
  4. The control method of the robotic arm according to claim 1, wherein the controlling the robotic arm according to the optimal trajectory comprises:
    acquiring position and attitude-angle motion information of the robotic arm;
    performing first-mode control on normal components of the position and attitude-angle motion information of the robotic arm;
    performing second-mode control on tangential components of the position and attitude-angle motion information of the robotic arm; wherein the first mode is a robot-guided mode in which the robotic arm admittance is greater than the robotic arm admittance of the second mode, and the second mode is a human-guided mode in which the human admittance is greater than the human admittance of the first mode.
  5. The control method of the robotic arm according to claim 1, wherein after obtaining the human-machine collaboration model, the method further comprises:
    each time multiple consecutive human-machine interaction force and torque values are collected during the human-machine dragging process, performing maximum likelihood estimation on the measurement results to obtain a sampling distribution;
    performing maximum likelihood estimation based on multiple human-machine interaction force and torque values generated by the human-machine collaboration model to obtain a model distribution;
    calculating a KL divergence between the sampling distribution and the model distribution;
    comparing the value of the KL divergence with a trust threshold, and judging according to the comparison result whether the robotic arm executes a free dragging mode or a task execution mode.
  6. The control method of the robotic arm according to claim 5, wherein the comparing the value of the KL divergence with a trust threshold and judging according to the comparison result whether the robotic arm executes a free dragging mode or a task execution mode comprises:
    if the value of the KL divergence is less than or equal to the trust threshold, judging that the robotic arm executes the task execution mode;
    if the value of the KL divergence is greater than the trust threshold, judging that the robotic arm executes the free dragging mode.
  7. The control method of the robotic arm according to claim 6, wherein if the robotic arm executes the free dragging mode, the human-machine collaboration model is an invalid model;
    if the robotic arm executes the task execution mode, the human-machine collaboration model is a valid model.
  8. The control method of the robotic arm according to claim 7, wherein if the robotic arm executes the task execution mode, the step of generating the optimal trajectory of the robotic arm motion according to the pose at the current moment and the desired pose corresponding to the human-machine interaction force at the current moment is executed.
  9. The control method of the robotic arm according to claim 8, wherein the generating the optimal trajectory of the robotic arm motion according to the pose at the current moment and the desired pose corresponding to the human-machine interaction force at the current moment comprises:
    obtaining, based on the human-machine collaboration model, the desired pose corresponding to the human-machine interaction force at the current moment, recorded as a first pose;
    obtaining a desired pose corresponding to a trajectory planned by a host computer, recorded as a second pose;
    determining the pose of the robotic arm in an impedance coordinate frame based on the difference between the first pose and the second pose, and then determining the optimal trajectory.
  10. The control method of the robotic arm according to claim 9, wherein the controlling the robotic arm according to the optimal trajectory comprises:
    obtaining the pose of the robotic arm in the impedance coordinate frame;
    for the attitude-angle and normal-direction motion, performing control with a large admittance for the host-computer plan and a small admittance for human dragging;
    for the tangential-direction motion, performing control with a large admittance for human dragging and a small admittance for the host-computer plan.
  11. The control method of the robotic arm according to claim 9, wherein the generating the optimal trajectory of the robotic arm motion according to the pose at the current moment and the desired pose corresponding to the human-machine interaction force at the current moment comprises:
    generating the optimal trajectory of the robotic arm motion based on the optimal control method iLQR.
  12. A training method of a human-machine collaboration model, used to obtain the human-machine collaboration model of the control method of the robotic arm according to any one of claims 1-10, the training method of the human-machine collaboration model comprising:
    obtaining multiple groups of human-machine interaction forces of the robotic arm and multiple groups of robotic arm poses corresponding to the multiple groups of human-machine interaction forces, the multiple groups of human-machine interaction forces being multiple groups of original human-machine interaction forces;
    establishing a human-machine collaboration model according to the multiple groups of human-machine interaction forces and the multiple groups of robotic arm poses.
  13. The training method of the human-machine collaboration model according to claim 12, wherein after establishing the human-machine collaboration model according to the multiple groups of human-machine interaction forces and the multiple groups of robotic arm poses, the method further comprises:
    optimizing the human-machine collaboration model according to a supervised learning method and generating an optimized human-machine collaboration model.
  14. The training method of the human-machine collaboration model according to claim 12, wherein the obtaining multiple groups of human-machine interaction forces of the robotic arm comprises: obtaining, through a six-dimensional force sensor, multiple groups of human-machine interaction forces applied when a person operates a handle;
    the obtaining multiple groups of robotic arm poses corresponding to the multiple groups of human-machine interaction forces comprises:
    obtaining poses given by a host computer and a path planning system, in the form of desired poses of the robotic arm in an impedance coordinate frame.
  15. A control device for a robotic arm, comprising:
    a model obtaining module, configured to obtain a human-machine collaboration model, wherein the human-machine collaboration model is a model for determining the desired pose of the robotic arm according to the human-machine interaction force;
    a pose obtaining module, configured to obtain the pose at the current moment and to obtain, according to the human-machine collaboration model, the desired pose corresponding to the human-machine interaction force at the current moment;
    a trajectory generation module, configured to generate the optimal trajectory of the robotic arm motion according to the pose at the current moment and the desired pose corresponding to the human-machine interaction force at the current moment;
    a control module, configured to control the robotic arm according to the optimal trajectory.
  16. The control device for a robotic arm according to claim 15, wherein the trajectory generation module comprises:
    a random trajectory generation unit, configured to generate multiple sets of random trajectories through an MPC algorithm according to the pose at the current moment and the desired pose corresponding to the human-machine interaction force at the current moment;
    an optimal trajectory generation unit, configured to select the optimal trajectory from the multiple sets of random trajectories (see the illustrative sampling sketch after the claims).
  17. A computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions, and the computer instructions are used to cause a computer to execute the method for controlling a robotic arm according to any one of claims 1-11 and/or the training method for a human-machine collaboration model according to any one of claims 12-14.
  18. A robot, comprising: a robotic arm, a sensor, at least one processor, and a memory communicatively connected to the at least one processor, wherein the memory stores a computer program executable by the at least one processor, and the computer program is executed by the at least one processor so that the at least one processor executes the method for controlling a robotic arm according to any one of claims 1-11 and/or the training method for a human-machine collaboration model according to any one of claims 12-14.
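Note on claims 9 and 10: the claimed control splits the motion by direction and gives a large admittance to the host-computer-planned trajectory along the attitude/normal directions and a large admittance to the human drag along the tangential direction. The snippet below is a minimal illustrative sketch only, not code from the application; the function name blend_admittance, the weighting constants, and the admittance_gain and dt parameters are assumptions introduced here.

```python
import numpy as np

def blend_admittance(planned_delta, human_force, normal_dir,
                     w_plan_normal=0.9, w_human_normal=0.1,
                     w_plan_tangent=0.1, w_human_tangent=0.9,
                     admittance_gain=0.002, dt=0.01):
    """Blend planner-driven and human-driven motion with direction-dependent admittance.

    planned_delta: incremental motion (3,) from the host-computer-planned trajectory.
    human_force:   measured interaction force (3,) from the force sensor.
    normal_dir:    task normal direction expressed in the impedance frame.
    """
    n = np.asarray(normal_dir, dtype=float)
    n = n / np.linalg.norm(n)
    # Split the human force into normal and tangential components.
    f_n = np.dot(human_force, n) * n
    f_t = np.asarray(human_force, dtype=float) - f_n
    # Split the planned incremental motion the same way.
    p_n = np.dot(planned_delta, n) * n
    p_t = np.asarray(planned_delta, dtype=float) - p_n
    # Normal direction: the planned trajectory dominates (large admittance to the plan).
    delta_normal = w_plan_normal * p_n + w_human_normal * admittance_gain * f_n * dt
    # Tangential direction: the human drag dominates (large admittance to the human).
    delta_tangent = w_plan_tangent * p_t + w_human_tangent * admittance_gain * f_t * dt
    return delta_normal + delta_tangent
```

For example, blend_admittance(np.array([0.0, 0.0, 1e-3]), np.array([2.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])) keeps almost all of the planned normal step while letting the tangential force pull the tool sideways. In practice such weights would be tuned or scheduled continuously rather than fixed, and the same split would also be applied to the rotational components.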
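Note on claim 11: the claim names the iLQR optimal-control method for generating the optimal trajectory. The sketch below runs the standard iLQR backward/forward passes on a deliberately simple linear system (a 2-D point mass with quadratic tracking and control costs); the state layout, cost weights, and the function name ilqr_point_mass are assumptions for illustration and do not reflect the robot model or cost used in the application.

```python
import numpy as np

def ilqr_point_mass(x0, x_goal, horizon=30, dt=0.05, n_iters=10,
                    w_track=1.0, w_terminal=50.0, w_control=0.01):
    """iLQR-style trajectory optimization for a 2-D point mass (double integrator).

    State x = [px, py, vx, vy], control u = [ax, ay].
    """
    nx, nu = 4, 2
    A = np.eye(nx); A[0, 2] = dt; A[1, 3] = dt          # x_{t+1} = A x_t + B u_t
    B = np.zeros((nx, nu)); B[2, 0] = dt; B[3, 1] = dt
    Q = np.diag([w_track, w_track, 0.0, 0.0])           # running state cost (position only)
    Qf = w_terminal * np.eye(nx)                         # terminal cost
    R = w_control * np.eye(nu)                           # control effort cost

    xs = np.tile(np.asarray(x0, dtype=float), (horizon + 1, 1))
    us = np.zeros((horizon, nu))
    goal = np.asarray(x_goal, dtype=float)

    for _ in range(n_iters):
        # Forward rollout with the current controls.
        for t in range(horizon):
            xs[t + 1] = A @ xs[t] + B @ us[t]
        # Backward pass: quadratic value-function recursion around the rollout.
        Vx = Qf @ (xs[-1] - goal)
        Vxx = Qf.copy()
        ks = np.zeros((horizon, nu))
        Ks = np.zeros((horizon, nu, nx))
        for t in reversed(range(horizon)):
            Qx = Q @ (xs[t] - goal) + A.T @ Vx
            Qu = R @ us[t] + B.T @ Vx
            Qxx = Q + A.T @ Vxx @ A
            Quu = R + B.T @ Vxx @ B
            Qux = B.T @ Vxx @ A
            Quu_inv = np.linalg.inv(Quu)
            ks[t] = -Quu_inv @ Qu
            Ks[t] = -Quu_inv @ Qux
            Vx = Qx + Ks[t].T @ Quu @ ks[t] + Ks[t].T @ Qu + Qux.T @ ks[t]
            Vxx = Qxx + Ks[t].T @ Quu @ Ks[t] + Ks[t].T @ Qux + Qux.T @ Ks[t]
        # Forward pass: apply the feedback policy to update the trajectory.
        x_new = xs.copy(); u_new = us.copy()
        for t in range(horizon):
            du = ks[t] + Ks[t] @ (x_new[t] - xs[t])
            u_new[t] = us[t] + du
            x_new[t + 1] = A @ x_new[t] + B @ u_new[t]
        xs, us = x_new, u_new
    return xs, us
```

For example, xs, us = ilqr_point_mass(np.zeros(4), np.array([0.3, 0.2, 0.0, 0.0])) returns a state trajectory that converges toward the goal position.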
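Note on claims 12-14: the training method pairs recorded six-dimensional interaction forces with the corresponding desired poses and fits a human-machine collaboration model, optionally refined by supervised learning (claim 13). The application does not fix a model class, so the sketch below uses a ridge-regularized linear map from wrench to desired pose purely as a stand-in; the function names and the choice of a linear model are assumptions for illustration.

```python
import numpy as np

def fit_force_to_pose_model(forces, poses, ridge=1e-3):
    """Ridge-regularized least-squares fit of a linear map from wrench to desired pose.

    forces: (N, 6) recorded human-machine interaction wrenches.
    poses:  (N, 6) corresponding desired robotic-arm poses (position + orientation parameters).
    Returns a (7, 6) weight matrix including a bias row.
    """
    forces = np.asarray(forces, dtype=float)
    poses = np.asarray(poses, dtype=float)
    X = np.hstack([forces, np.ones((forces.shape[0], 1))])  # append a bias column
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ poses)

def predict_desired_pose(weights, force):
    """Map a single measured wrench to a desired pose using the fitted model."""
    x = np.append(np.asarray(force, dtype=float), 1.0)
    return x @ weights
```

With synthetic data, W = fit_force_to_pose_model(np.random.randn(200, 6), np.random.randn(200, 6)) followed by predict_desired_pose(W, np.zeros(6)) runs end to end; the supervised optimization of claim 13 would correspond to further tuning this map, or a richer nonlinear model, against held-out demonstrations.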
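Note on claim 16 (and the corresponding method step): the trajectory generation module uses an MPC-style procedure to draw multiple random candidate trajectories between the current pose and the desired pose and then keeps the best one. The sampling scheme, cost terms, and the function name sample_best_trajectory below are assumptions introduced to illustrate one way such a generate-and-select step could look.

```python
import numpy as np

def sample_best_trajectory(x_now, x_desired, horizon=20, n_samples=64,
                           step_std=0.01, w_goal=10.0, w_smooth=1.0, rng=None):
    """Draw random candidate trajectories from the current pose toward the desired pose
    and return the lowest-cost one."""
    rng = np.random.default_rng() if rng is None else rng
    x_now = np.asarray(x_now, dtype=float)
    x_desired = np.asarray(x_desired, dtype=float)
    base = np.linspace(x_now, x_desired, horizon)              # straight-line nominal trajectory
    noise = rng.normal(0.0, step_std, size=(n_samples, horizon, x_now.shape[0]))
    noise[:, 0, :] = 0.0                                       # every candidate starts at x_now
    candidates = base[None, :, :] + np.cumsum(noise, axis=1)   # random-walk perturbations
    # Cost: terminal distance to the desired pose plus path length (smoothness).
    goal_cost = w_goal * np.linalg.norm(candidates[:, -1, :] - x_desired, axis=1)
    smooth_cost = w_smooth * np.sum(
        np.linalg.norm(np.diff(candidates, axis=1), axis=2), axis=1)
    best = np.argmin(goal_cost + smooth_cost)
    return candidates[best]
```

traj = sample_best_trajectory(np.zeros(6), 0.1 * np.ones(6)) returns a (horizon, 6) array; in a receding-horizon loop only the first step of the selected trajectory would be executed before re-sampling.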
PCT/CN2021/082254 2020-10-26 2021-03-23 Robotic arm control method and device, and human-machine cooperation model training method WO2022088593A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011159428.2 2020-10-26
CN202011159428.2A CN112428278B (en) 2020-10-26 2020-10-26 Control method and device of mechanical arm and training method of man-machine cooperation model

Publications (1)

Publication Number Publication Date
WO2022088593A1 true WO2022088593A1 (en) 2022-05-05

Family

ID=74696144

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/082254 WO2022088593A1 (en) 2020-10-26 2021-03-23 Robotic arm control method and device, and human-machine cooperation model training method

Country Status (2)

Country Link
CN (1) CN112428278B (en)
WO (1) WO2022088593A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112428278B (en) * 2020-10-26 2022-11-15 北京理工大学 Control method and device of mechanical arm and training method of man-machine cooperation model
CN113084814B (en) * 2021-04-13 2022-05-10 中国科学院自动化研究所 Method for realizing motion control of musculoskeletal robot based on distribution position optimization
CN113177310B (en) * 2021-04-25 2022-05-27 哈尔滨工业大学(深圳) Mechanical arm holding method based on human body comfort
CN113858201B (en) * 2021-09-29 2023-04-25 清华大学 Self-adaptive variable impedance control method, system and equipment for flexible driving robot
CN113925607B * 2021-11-12 2024-02-27 上海微创医疗机器人(集团)股份有限公司 Surgical robot operation training method, device, system, medium and equipment
CN114147710B (en) * 2021-11-27 2023-08-11 深圳市优必选科技股份有限公司 Robot control method and device, robot and storage medium
CN114789443B (en) * 2022-04-29 2024-02-23 广东工业大学 Mechanical arm control method and system based on multi-source information deep reinforcement learning
CN114800532B (en) * 2022-06-27 2022-09-16 西南交通大学 Mechanical arm control parameter determination method, device, equipment, medium and robot
CN115309044A (en) * 2022-07-26 2022-11-08 福建工程学院 Mechanical arm angular velocity control method based on model predictive control

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8271132B2 (en) * 2008-03-13 2012-09-18 Battelle Energy Alliance, Llc System and method for seamless task-directed autonomy for robots
US8671005B2 (en) * 2006-11-01 2014-03-11 Microsoft Corporation Interactive 3D shortage tracking user interface
US20150294496A1 (en) * 2014-04-14 2015-10-15 GM Global Technology Operations LLC Probabilistic person-tracking using multi-view fusion
WO2017019860A1 (en) * 2015-07-29 2017-02-02 Illinois Tool Works Inc. System and method to facilitate welding software as a service
JP2018530441A * 2015-09-09 2018-10-18 Carbon Robotics, Inc. Robot arm system and object avoidance method
US10635758B2 (en) * 2016-07-15 2020-04-28 Fastbrick Ip Pty Ltd Brick/block laying machine incorporated in a vehicle
CN106406098B * 2016-11-22 2019-04-19 西北工业大学 Human-machine interaction control method for a robot system in an unknown environment
JP6390735B1 (en) * 2017-03-14 2018-09-19 オムロン株式会社 Control system
CN107121926A * 2017-05-08 2017-09-01 广东产品质量监督检验研究院 Industrial robot reliability modeling method based on deep learning
CN106970594B * 2017-05-09 2019-02-12 京东方科技集团股份有限公司 Trajectory planning method for a flexible robotic arm
CN107457780B (en) * 2017-06-13 2020-03-17 广州视源电子科技股份有限公司 Method and device for controlling mechanical arm movement, storage medium and terminal equipment
CN107202584B (en) * 2017-07-06 2020-02-14 北京理工大学 Planet accurate landing anti-interference guidance method
CN108153153B (en) * 2017-12-19 2020-09-11 哈尔滨工程大学 Learning variable impedance control system and control method
CN108284444B (en) * 2018-01-25 2021-05-11 南京工业大学 Multi-mode human body action prediction method based on Tc-ProMps algorithm under man-machine cooperation
CN109048890B (en) * 2018-07-13 2021-07-13 哈尔滨工业大学(深圳) Robot-based coordinated trajectory control method, system, device and storage medium
CN109048891B (en) * 2018-07-25 2021-12-07 西北工业大学 Neutral buoyancy robot posture and track control method based on self-triggering model predictive control
US20200089229A1 (en) * 2018-09-18 2020-03-19 GM Global Technology Operations LLC Systems and methods for using nonlinear model predictive control (mpc) for autonomous systems
CN110764416A (en) * 2019-11-11 2020-02-07 河海大学 Humanoid robot gait optimization control method based on deep Q network
CN111152220B (en) * 2019-12-31 2021-07-06 浙江大学 Mechanical arm control method based on man-machine fusion
CN111360839A (en) * 2020-04-24 2020-07-03 哈尔滨派拉科技有限公司 Multi-configuration mechanical arm hierarchical control method and system based on motion trail

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070151389A1 (en) * 2005-12-20 2007-07-05 Giuseppe Prisco Medical robotic system with programmably controlled constraints on error dynamics
CN109848983A * 2018-12-10 2019-06-07 华中科技大学 Method for highly compliant human-guided robot collaborative operation
CN110559082A * 2019-09-10 2019-12-13 深圳市精锋医疗科技有限公司 Surgical robot and control method and control device for mechanical arm of surgical robot
CN110524544A * 2019-10-08 2019-12-03 深圳前海达闼云端智能科技有限公司 Control method for manipulator motion, terminal, and readable storage medium
CN111660306A (en) * 2020-05-27 2020-09-15 华中科技大学 Robot variable admittance control method and system based on operator comfort
CN111546315A (en) * 2020-05-28 2020-08-18 济南大学 Robot flexible teaching and reproducing method based on human-computer cooperation
CN112428278A (en) * 2020-10-26 2021-03-02 北京理工大学 Control method and device of mechanical arm and training method of man-machine cooperation model

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114800523A (en) * 2022-05-26 2022-07-29 江西省智能产业技术创新研究院 Mechanical arm track correction method, system, computer and readable storage medium
CN114995132A (en) * 2022-05-26 2022-09-02 哈尔滨工业大学(深圳) Multi-arm spacecraft model prediction control method, equipment and medium based on Gaussian mixture process
CN114800523B (en) * 2022-05-26 2023-12-01 江西省智能产业技术创新研究院 Mechanical arm track correction method, system, computer and readable storage medium
CN115070764A (en) * 2022-06-24 2022-09-20 中国科学院空间应用工程与技术中心 Mechanical arm motion track planning method and system, storage medium and electronic equipment
CN116214527A (en) * 2023-05-09 2023-06-06 南京泛美利机器人科技有限公司 Three-body collaborative intelligent decision-making method and system for enhancing man-machine collaborative adaptability
CN116214527B (en) * 2023-05-09 2023-08-11 南京泛美利机器人科技有限公司 Three-body collaborative intelligent decision-making method and system for enhancing man-machine collaborative adaptability

Also Published As

Publication number Publication date
CN112428278B (en) 2022-11-15
CN112428278A (en) 2021-03-02

Similar Documents

Publication Publication Date Title
WO2022088593A1 (en) Robotic arm control method and device, and human-machine cooperation model training method
Xu et al. Dynamic neural networks based kinematic control for redundant manipulators with model uncertainties
Kebria et al. Adaptive type-2 fuzzy neural-network control for teleoperation systems with delay and uncertainties
Abu-Dakka et al. Adaptation of manipulation skills in physical contact with the environment to reference force profiles
Zhang et al. A dual neural network for redundancy resolution of kinematically redundant manipulators subject to joint limits and joint velocity limits
Satheeshbabu et al. Continuous control of a soft continuum arm using deep reinforcement learning
Hamedani et al. Intelligent impedance control using wavelet neural network for dynamic contact force tracking in unknown varying environments
Xu et al. Motion planning of manipulators for simultaneous obstacle avoidance and target tracking: An RNN approach with guaranteed performance
CN112140101A (en) Trajectory planning method, device and system
CN114641375A (en) Dynamic programming controller
Nicolis et al. Human intention estimation based on neural networks for enhanced collaboration with robots
Jiao et al. Adaptive hybrid impedance control for dual-arm cooperative manipulation with object uncertainties
Mitrovic et al. Optimal feedback control for anthropomorphic manipulators
Hu et al. Adaptive variable impedance control of dual-arm robots for slabstone installation
Katayama et al. Whole-body model predictive control with rigid contacts via online switching time optimization
Al-Shuka et al. Adaptive hybrid regressor and approximation control of robotic manipulators in constrained space
JP2021035714A (en) Control device, control method and control program
Mitrovic et al. Adaptive optimal control for redundantly actuated arms
Ma et al. Control of a Cable-Driven Parallel Robot via Deep Reinforcement Learning
Sun et al. A Fuzzy Cluster-based Framework for Robot-Environment Collision Reaction
Pluzhnikov et al. Behavior-based arm control for an autonomous bucket excavator
Toner et al. Probabilistically safe mobile manipulation in an unmodeled environment with automated feedback tuning
van Veldhuizen Autotuning PID control using Actor-Critic Deep Reinforcement Learning
Heyu et al. Impedance control method with reinforcement learning for dual-arm robot installing slabstone
Yu et al. Adaptive human-robot collaboration control based on optimal admittance parameters

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21884337

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21884337

Country of ref document: EP

Kind code of ref document: A1