CN113255208A

CN113255208A - Neural network model prediction control method for series elastic actuator of robot

Info

Publication number: CN113255208A
Application number: CN202110429622.6A
Authority: CN
Inventors: 单新平; 张安龙; 严作海
Original assignee: Hangzhou Seenpin Robot Technology Co ltd
Current assignee: Hangzhou Xinjian Electromechanical Transmission Co ltd
Priority date: 2021-04-21
Filing date: 2021-04-21
Publication date: 2021-08-13
Anticipated expiration: 2041-04-21
Also published as: CN113255208B

Abstract

The invention discloses a neural network model predictive control method for a series elastic actuator of a robot, comprising the following steps: establishing a dynamic model of the series elastic actuator; constructing and training a dynamic recurrent neural network; applying the trained dynamic recurrent neural network as a model of the nonlinear dynamical system, and inputting the current measurement value and historical measurement values of the nonlinear dynamical system into the dynamic recurrent neural network to obtain a predicted value of a system parameter; and inputting a desired trajectory for the nonlinear dynamical system and, in combination with the predicted value, solving the optimization by a gradient descent method to obtain the optimal control input sequence of the system at future moments. By combining the neural network with a model predictive control algorithm, the method suppresses vibration more effectively, thereby improving the performance of the robot and the comfort of human-robot cooperation.

Description

Neural network model prediction control method for series elastic actuator of robot

Technical Field

The invention relates to the technical field of intelligent robot driver control and nonlinear system control, and in particular to a neural network model predictive control method for a series elastic actuator of a robot.

Background

Model predictive control is an effective optimal control strategy that has been applied successfully in recent decades in process control, automotive systems and robotics. Its principle is to predict the state output at future moments from the current state, the control input, the dynamic equations and the like, and thereby obtain the optimal control input for the next moment. The series elastic actuator is one kind of flexible joint used in robotics, and applying model predictive control to robot flexible joints is a complex problem. On the one hand, the actuator contains an elastomer, so its dynamics form a highly coupled nonlinear system; backlash and friction in the speed reducer and the many uncertain parameters of the system make the model difficult to establish. On the other hand, the characteristics of the actuator system place high real-time demands on solving the model predictive control problem.

Traditional control of series elastic actuators mostly uses model-free methods, mainly PID-based ones. For example, a controller may be derived by combining PD control with online adaptive gravity compensation or friction compensation, or by combining a back-stepping method with an RBF neural network. PID control is simple to design, but it performs relatively poorly on nonlinear systems, and tuning its parameters takes experience and time. For the other methods, the derivation of the controller is complicated and the design is difficult, requiring the technician to have a certain theoretical background.

Therefore, designing a controller that is simple in method and excellent in performance for robot flexible joints is of great significance.

The series elastic actuator vibrates because of external disturbance and its intrinsic flexibility. The vibration degrades the performance of the robot and the comfort of human-robot cooperation, and traditional control methods have difficulty suppressing it efficiently. The two main approaches to suppressing vibration are input shaping and trajectory optimization. Existing control algorithms mainly include quadratic optimization, terminal sliding mode control (TSMC), open-loop iterative learning control (ILC), adaptive input shaping control and the like. Most of these methods rely on a dynamic model, which is not only difficult to establish but also easily affected by external interference, and some of them require parameter tuning. As a result, vibration suppression algorithms have so far received little research.

Therefore, it is necessary to design a more advanced intelligent control algorithm, in particular a neural network model predictive control method for a series elastic actuator of a robot, which achieves superior vibration suppression by combining a neural network with a model predictive control algorithm.

Disclosure of Invention

The technical problem to be solved by the invention is that existing robots using series elastic actuators vibrate because of external disturbance and intrinsic flexibility, that this vibration is difficult to suppress effectively, and that it adversely affects the performance of the robot and the comfort of human-robot cooperation. To overcome this defect, the invention provides a novel neural network model predictive control method for a series elastic actuator of a robot.

The invention solves the technical problems by adopting the following technical scheme:

the invention provides a neural network model predictive control method for a series elastic actuator of a robot, wherein the series elastic actuator is a nonlinear dynamical system, and the method is characterized by comprising the following steps:

step S1: establishing a dynamic model for the series elastic actuator system;

step S2: constructing and training a dynamic recurrent neural network for approximating the nonlinear dynamical system, wherein a training method suitable for the dynamic recurrent neural network is selected and the hyper-parameters of the neural network are set;

step S3: applying the trained dynamic recurrent neural network as a nonlinear dynamics model to the system, and inputting the current measurement value and historical measurement values of the nonlinear dynamical system into the dynamic recurrent neural network to obtain a predicted value of the state output of the nonlinear dynamical system, so as to test the multi-step prediction error of the dynamic recurrent neural network; if the multi-step prediction error exceeds a preset error threshold, returning to step S2 to continue training until the multi-step prediction error is within the preset error threshold range;

step S4: inputting a desired trajectory for the nonlinear dynamical system, predicting the system output at future moments using the dynamic recurrent neural network, and solving the optimization by a gradient descent method to obtain the optimal control input sequence of the system at future moments for realizing the desired trajectory.

According to some embodiments of the present invention, the method includes cyclically executing step S3 of obtaining the predicted value of the system parameter and step S4 of solving the optimization, using the newly obtained current measurement value and the historical measurement values.

According to some embodiments of the invention, the dynamic recurrent neural network is defined by the following equation (1):

$$y(n+1) = f\big(y(n),\, y(n-1),\, \ldots,\, y(n-d_y+1),\, u(n),\, u(n-1),\, u(n-2),\, \ldots,\, u(n-d_u+1)\big) \tag{1}$$

where $y(n+1)$ is the system output state at time $n+1$, $y(n)$ is the system output state at time $n$, $d_y$ is a time delay constant, $u(n)$ is the system control input at time $n$, and $d_u$ is the time delay constant of the system control input;

the dynamic recurrent neural network is trained using the Adam algorithm.

According to some embodiments of the invention, the Adam algorithm dynamically adjusts the learning rate of each parameter included in the dynamic recurrent neural network using first and second moment estimates of the gradient.

According to some embodiments of the invention, step S4 includes the following sub-steps:

inputting a desired trajectory, a current state and historical states of the nonlinear dynamical system to the dynamic recurrent neural network;

predicting the system output at future moments using the dynamic recurrent neural network;

determining a cost function associated with the nonlinear dynamical system, and differentiating the cost function by backpropagation through time;

and solving the optimization by a stochastic gradient descent method to obtain the optimal control input sequence of the system at future moments for realizing the desired trajectory.

According to some embodiments of the invention, the cost function is a quadratic function, the prediction horizon in step S4 is 5 steps, and the control horizon is 1 step.

According to some embodiments of the invention, the time interval of each step is 20 milliseconds.

According to some embodiments of the present invention, the nonlinear dynamical system comprises a brushless DC motor driven in an SVPWM manner, and the method employs a current protection condition as a constraint term in the predictive control solving process in steps S3 and S4.

According to some embodiments of the invention, the nonlinear dynamical system further comprises a controller saturation constraint term.

On the basis of common knowledge in the field, the above preferred conditions can be combined arbitrarily to obtain the preferred embodiments of the invention.

The positive progress effects of the invention are as follows:

the neural network model predictive control method for the series elastic actuator of the robot combines a neural network with a model predictive control algorithm and thereby suppresses vibration more effectively, which improves the performance of the robot and the comfort of human-robot cooperation.

Drawings

Fig. 1 is a schematic diagram of a neural network model predictive control method for a series elastic actuator of a robot according to a preferred embodiment of the present invention.

Fig. 2 is a flow chart illustrating a data set sampling process in a neural network model predictive control method for a series elastic actuator of a robot according to a preferred embodiment of the present invention.

Fig. 3 is a schematic diagram of a dynamic recurrent neural network training process in a neural network model predictive control method for a series elastic actuator of a robot according to a preferred embodiment of the present invention.

Fig. 4 is a partial flowchart of the step S4 in the neural network model predictive control method for the series elastic actuators of the robot according to the preferred embodiment of the present invention, in which an exemplary optimization solving process is shown.

Fig. 5 is a simulation diagram of a system step response of an application example of the neural network model predictive control method for the series elastic actuators of the robot according to the preferred embodiment of the present invention.

Fig. 6 is a simulation diagram of a system sinusoidal response of an application example of the neural network model predictive control method for the series elastic actuators of the robot according to the preferred embodiment of the present invention.

Fig. 7 is another simulation diagram of a system step response of an application example of the neural network model predictive control method for the series elastic actuators of the robot according to the preferred embodiment of the present invention.

Fig. 8 is another simulation diagram of a system sinusoidal response of an application example of the neural network model predictive control method for the series elastic actuators of the robot according to the preferred embodiment of the present invention.

Fig. 9 is a schematic diagram of a NARX neural network structure involved in a neural network model predictive control method for a series elastic actuator of a robot according to a preferred embodiment of the present invention.

Detailed Description

The following detailed description of the preferred embodiments of the present invention, taken in conjunction with the accompanying drawings, is intended to be illustrative, and not restrictive, and any other similar items may be considered within the scope of the present invention.

In the following detailed description, directional terms, such as "left", "right", "upper", "lower", "front", "rear", and the like, are used with reference to the orientation as illustrated in the drawings. The components of various embodiments of the present invention can be positioned in a number of different orientations and the directional terminology is used for purposes of illustration and is in no way limiting.

Referring to fig. 1, in a neural network model predictive control method for a series elastic actuator of a robot according to a preferred embodiment of the present invention, the series elastic actuator is a nonlinear dynamical system and the method comprises:

step S1: establishing a dynamic model for the series elastic actuator system;

step S2: constructing and training a dynamic recurrent neural network for approximating the nonlinear dynamical system, wherein a training method suitable for the dynamic recurrent neural network is selected and the hyper-parameters of the neural network are set;

step S3: applying the trained dynamic recurrent neural network as a model to the nonlinear dynamical system, and inputting the current measurement value and historical measurement values associated with the nonlinear dynamical system into the dynamic recurrent neural network to obtain a predicted value of the system output associated with the nonlinear dynamical system, so as to test the multi-step prediction error of the dynamic recurrent neural network; if the multi-step prediction error exceeds a preset error threshold, returning to step S2 to continue training until the multi-step prediction error is within the preset error threshold range;

step S4: inputting a desired trajectory for the nonlinear dynamical system, predicting the system output at future moments using the dynamic recurrent neural network, and solving the optimization by a gradient descent method to obtain the optimal control input sequence of the system at future moments for realizing the desired trajectory.

The method may further include cyclically executing step S3 of obtaining the predicted value of the system parameter and step S4 of solving the optimization, using the newly obtained current measurement value and the historical measurement values.

In other words, as shown in fig. 1, the method may further include:

step S5: at a future moment, using the new measurement value as the initial condition, predicting the future output of the system again and solving the optimization problem again, and repeating this process cyclically until the result obtained by solving the optimization problem meets the set conditions or requirements.

It should be appreciated that the method described herein is primarily directed to a series elastic actuator for a robot, which is a nonlinear dynamical system. The nonlinear dynamical system may also be referred to herein simply as the "system".

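According to some preferred embodiments, the dynamic recurrent neural network is defined by the following equation (1):

$$y(n+1) = f\big(y(n),\, y(n-1),\, \ldots,\, y(n-d_y+1),\, u(n),\, u(n-1),\, u(n-2),\, \ldots,\, u(n-d_u+1)\big) \tag{1}$$

where $y(n+1)$ is the system output state at the future time $n+1$, $y(n)$ is the system output state at time $n$, $d_y$ is a time delay constant, $u(n)$ is the system control input at time $n$, and $d_u$ is the time delay constant of the system control input;

the dynamic recurrent neural network is trained using the Adam algorithm.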

According to some preferred embodiments, the Adam algorithm dynamically adjusts the learning rate of each parameter included in the dynamic recurrent neural network using first and second moment estimates of the gradient.

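According to some preferred embodiments, step S4 includes the following sub-steps:

inputting a desired trajectory, a current state and historical states of the nonlinear dynamical system to the dynamic recurrent neural network;

predicting the system output at future moments using the dynamic recurrent neural network;

determining a cost function associated with the nonlinear dynamical system, and differentiating the cost function by backpropagation through time;

and solving the optimization by a stochastic gradient descent method to obtain the optimal control input sequence of the system at future moments for realizing the desired trajectory.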

It is further preferable that the cost function is a quadratic function, the prediction time domain in step S4 is 5 steps, the control time domain is 1 step, and the time interval of each step is 20 milliseconds.

According to some preferred embodiments of the present invention, the nonlinear dynamical system comprises a brushless DC motor driven in an SVPWM manner, and the method employs a current protection condition as a constraint term in the predictive control solving process in steps S3 and S4. SVPWM is an abbreviation of Space Vector Pulse Width Modulation.

The individual steps involved in this method according to the above preferred embodiment of the invention will be illustrated in more detail below.

1. Detailed description of method steps for the construction of a kinetic model and a neural network

(1) Establishing a dynamic model of a series elastic actuator system

In S1, the dynamic model of the series elastic actuator system is established as shown in equation (2):

where $B_l$ is the link-side moment of inertia, $B_m$ is the motor-side moment of inertia, $q$ is the position of the actuator output end, $\theta$ is the motor-side position of the actuator, $\ddot{q}$ is the link-side acceleration, $\ddot{\theta}$ is the motor-side acceleration, $\tau_{f1}$ is the link-side damping (velocity dependent), $\tau_{f2}$ is the motor-side damping (velocity dependent), $K$ is the elastomer constant, $m$ is the load mass, $g$ is the gravitational acceleration, $l$ is the link length, $f_{dist}$ is the external disturbance, and $\tau_m$ is the motor output torque.
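As an illustration only, a two-inertia flexible-joint model consistent with the symbols defined above can be written as $B_l\ddot{q} + \tau_{f1} + K(q-\theta) + m g l \sin q = f_{dist}$ and $B_m\ddot{\theta} + \tau_{f2} - K(q-\theta) = \tau_m$. The sign conventions, the $\sin q$ gravity term, the viscous form chosen for $\tau_{f1}$ and $\tau_{f2}$, and every numeric value in the sketch below are assumptions, not values taken from the patent.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
B_L, B_M = 0.05, 0.002      # link-side / motor-side inertia [kg m^2]
K = 200.0                   # elastomer stiffness [N m/rad]
D_L, D_M = 0.1, 0.01        # assumed viscous damping coefficients
M, G, L = 0.5, 9.81, 0.3    # load mass [kg], gravity [m/s^2], link length [m]


def sea_step(state, tau_m, f_dist=0.0, dt=1e-4):
    """Advance the assumed model by one semi-implicit Euler step.

    state = (q, dq, theta, dtheta)
    """
    q, dq, theta, dtheta = state
    tau_spring = K * (q - theta)
    # Link side: B_l*ddq + tau_f1 + K(q - theta) + m*g*l*sin(q) = f_dist
    ddq = (f_dist - D_L * dq - tau_spring - M * G * L * math.sin(q)) / B_L
    # Motor side: B_m*ddtheta + tau_f2 - K(q - theta) = tau_m
    ddtheta = (tau_m - D_M * dtheta + tau_spring) / B_M
    dq += ddq * dt
    dtheta += ddtheta * dt
    q += dq * dt            # positions use the already updated velocities
    theta += dtheta * dt
    return (q, dq, theta, dtheta)


def simulate(tau_m, duration=2.0, dt=1e-4):
    """Apply a constant motor torque from rest and return the final state."""
    state = (0.0, 0.0, 0.0, 0.0)
    for _ in range(int(duration / dt)):
        state = sea_step(state, tau_m, dt=dt)
    return state
```

Under these assumptions a constant motor torque settles where the gravity torque balances it, that is, where $m g l \sin q = \tau_m$.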

(2) Establishing the dynamic recurrent neural network model

The neural network used in step S2 is a dynamic recurrent neural network, with the following mathematical expression:

$$y(n+1) = f\big(y(n),\, y(n-1),\, \ldots,\, y(n-d_y+1),\, u(n),\, u(n-1),\, u(n-2),\, \ldots,\, u(n-d_u+1)\big)$$

where $y(n+1)$ is the system output state at time $n+1$, $y(n)$ is the system output state at time $n$, $d_y$ is a time delay constant, $u(n)$ is the system control input at time $n$, and $d_u$ is the time delay constant of the control input. The structure of the dynamic recurrent neural network may be as shown in fig. 9.

Preferably, the dynamic recurrent neural network has 3 layers, namely an input layer, a hidden layer and an output layer. In the invention, the hidden layer has 30 neurons and the time delay constant is set to 4. The output of each hidden node is given by the following formula:

where $\sigma$ is a nonlinear activation function.

Preferably, the tanh function is selected as the activation function, with the expression:

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

The network output is given by the following formula:
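A minimal NumPy sketch of such a three-layer network, with 30 tanh hidden units and time delay constants of 4, could look as follows. The weight names `W1`, `b1`, `W2`, `b2`, the random initialisation, the choice $d_y = d_u = 4$ and the linear output layer are assumptions made for illustration.

```python
import numpy as np

D_Y = D_U = 4        # time delay constants (both assumed equal to 4)
N_HIDDEN = 30        # hidden-layer size from the text

rng = np.random.default_rng(0)
# Hypothetical weight names and initialisation.
W1 = rng.normal(scale=0.1, size=(N_HIDDEN, D_Y + D_U))
b1 = np.zeros(N_HIDDEN)
W2 = rng.normal(scale=0.1, size=N_HIDDEN)
b2 = 0.0


def narx_predict(y_hist, u_hist):
    """One-step prediction y(n+1) from the last D_Y outputs and D_U inputs.

    y_hist = [y(n), y(n-1), ..., y(n-D_Y+1)]
    u_hist = [u(n), u(n-1), ..., u(n-D_U+1)]
    """
    x = np.concatenate([y_hist, u_hist])   # regressor of equation (1)
    h = np.tanh(W1 @ x + b1)               # hidden layer with tanh activation
    return float(W2 @ h + b2)              # assumed linear output layer
```

Feeding the returned prediction back into `y_hist` yields the multi-step predictions that the later steps rely on.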

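(3) Dynamic recurrent neural network training method

Preferably, the training of the dynamic recurrent neural network in S2 uses the mean square error (MSE) as the loss function, where the MSE is given by the following formula:

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2$$

where $N$ is the number of samples, $y_i$ is the reference output and $\hat{y}_i$ is the network prediction.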

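Preferably, the Adam algorithm used in the training process in S10 is as follows:

$$M_t = \beta_1 M_{t-1} + (1-\beta_1)\, g_t$$

$$G_t = \beta_2 G_{t-1} + (1-\beta_2)\, g_t^2$$

$$\hat{M}_t = \frac{M_t}{1-\beta_1^t}, \qquad \hat{G}_t = \frac{G_t}{1-\beta_2^t}$$

The parameters are updated as follows:

$$W_{t+1} = W_t - \frac{\alpha}{\sqrt{\hat{G}_t} + \epsilon}\, \hat{M}_t$$

where $W_t$ denotes the network parameters at iteration $t$, $g_t$ is the gradient, $\alpha = 0.001$, $\beta_1 = 0.9$, $\beta_2 = 0.999$, $\epsilon = 10^{-8}$, $M_t$ and $G_t$ are the first moment estimate and the second moment estimate of the gradient respectively, and $\hat{M}_t$ and $\hat{G}_t$ are the bias corrections of $M_t$ and $G_t$. The learning rate is thus adjusted dynamically according to the gradient, and the factor $\alpha / (\sqrt{\hat{G}_t} + \epsilon)$ forms a dynamic constraint on the learning rate.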
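The Adam update with the hyper-parameters listed above can be sketched in a few lines of NumPy. The function name `adam_step`, the variable names and the toy quadratic loss are hypothetical and serve only to show the moment estimates and the bias correction at work.

```python
import numpy as np

ALPHA, BETA1, BETA2, EPS = 0.001, 0.9, 0.999, 1e-8   # values from the text


def adam_step(w, grad, m, g2, t):
    """Return updated (w, m, g2) after Adam iteration t (t starts at 1)."""
    m = BETA1 * m + (1.0 - BETA1) * grad          # first moment estimate M_t
    g2 = BETA2 * g2 + (1.0 - BETA2) * grad ** 2   # second moment estimate G_t
    m_hat = m / (1.0 - BETA1 ** t)                # bias-corrected M_t
    g_hat = g2 / (1.0 - BETA2 ** t)               # bias-corrected G_t
    w = w - ALPHA * m_hat / (np.sqrt(g_hat) + EPS)
    return w, m, g2


# Toy usage: minimise the squared error (w - 1)^2 starting from w = 0.
w = np.array([0.0])
m = np.zeros(1)
g2 = np.zeros(1)
for t in range(1, 10001):
    grad = 2.0 * (w - 1.0)
    w, m, g2 = adam_step(w, grad, m, g2, t)
```

On the first iteration the bias correction makes the step size almost exactly $\alpha$ regardless of the gradient magnitude, which is the per-parameter normalisation the text describes.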

(4) Data set sampling and training process

The sampling of the data set directly affects the training of the dynamic recurrent neural network, and an insufficient data set causes model deviation. The data set sampling process is shown in fig. 2.

First, a random control quantity $u \in (u_{min}, u_{max})$ is generated and input into the series elastic actuator system.

Each control quantity lasts for a random duration $t \in (t_{min}, t_{max})$; in the invention, this duration is set to between 15 ms and 40 ms.

The system output is collected at fixed time intervals. Since the invention is aimed at position control, the system output is the position of the output end.

It is then judged whether the preset maximum number of collected samples has been reached.

If not, the process returns to generating a random control quantity.

If so, data collection ends and the data set is stored in time order; the data set contains the time, control input and system output data.

To ensure that the data is sufficient, multiple groups of data sets need to be collected so that they fully reflect the dynamic characteristics of the system.
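The sampling loop described above can be sketched as follows. The control bounds, the 1 ms sampling interval and the first-order `toy_plant` stand-in are assumptions; in practice the plant would be the real actuator or a simulation of it.

```python
import random

U_MIN, U_MAX = -1.0, 1.0          # hypothetical control bounds
T_MIN, T_MAX = 0.015, 0.040       # hold time range from the text [s]
SAMPLE_DT = 0.001                 # hypothetical fixed sampling interval [s]


def toy_plant(y, u, dt):
    """Stand-in first-order plant; replace with the actuator or its model."""
    return y + dt * (u - y) / 0.05


def collect_dataset(max_samples, seed=0):
    """Return time-ordered (time, control input, system output) records."""
    rng = random.Random(seed)
    data, t, y = [], 0.0, 0.0
    while len(data) < max_samples:
        u = rng.uniform(U_MIN, U_MAX)        # random control quantity
        hold = rng.uniform(T_MIN, T_MAX)     # random duration of this input
        for _ in range(max(1, round(hold / SAMPLE_DT))):
            y = toy_plant(y, u, SAMPLE_DT)
            t += SAMPLE_DT
            data.append((t, u, y))           # sample at a fixed interval
            if len(data) >= max_samples:     # preset maximum reached
                break
    return data
```

Calling `collect_dataset` with several different seeds gives the multiple groups of data sets that the text asks for.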

In the training process, the data set is fed into the neural network cyclically in time order. Training is open-loop: only single-step prediction is performed, with the data of the next time step used as the reference value. The specific process is shown in fig. 3.

The network weights are updated with the Adam algorithm, and the data set is processed in batches to speed up training. Training stops after a set number of iterations or when the gradient falls below a threshold, and the network weights are saved.

The trained neural network is then iterated to make multi-step predictions, and the multi-step prediction error of the network is tested. If the error is within the allowable range, the network weights are kept and training ends; if the multi-step prediction error exceeds the preset range, the process returns to the above procedure and training of the network continues.
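The multi-step test can be sketched as a closed-loop rollout in which each prediction is fed back as an input for the next step. The function name `multi_step_error`, the newest-first history convention and the use of the mean squared error as the error measure are assumptions.

```python
import numpy as np


def multi_step_error(predict, y, u, d=4, horizon=5):
    """Mean squared error of closed-loop predictions over `horizon` steps.

    predict(y_hist, u_hist) returns y(n+1); both histories are newest-first
    and contain d entries. y and u are recorded sequences of equal length.
    """
    errors = []
    for n in range(d - 1, len(y) - horizon):
        # Start from measured outputs y(n), y(n-1), ..., y(n-d+1).
        y_hist = list(y[n - d + 1:n + 1][::-1])
        for k in range(horizon):
            u_hist = u[n + k - d + 1:n + k + 1][::-1]   # recorded inputs
            y_next = predict(np.array(y_hist), np.array(u_hist))
            errors.append((y_next - y[n + k + 1]) ** 2)
            y_hist = [y_next] + y_hist[:-1]             # feed prediction back
    return float(np.mean(errors))
```

Training would be resumed whenever the returned value exceeds the preset error threshold, and stopped otherwise.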

2. Description of method steps for on-line solution of neural network model predictive system control

Preferably, after the prediction accuracy of the neural network meets the requirement, the trained neural network is used as the system model and model predictive control is applied to the system, with a stochastic gradient descent method used for the online solution of the model predictive control.

(1) Preferably, the cost function is a quadratic function, as shown in the following formula:

$$u_{min} \le u \le u_{max}$$

where the inequality is the constraint on the input $u$. It is worth mentioning that different cost functions can be set according to different control objectives to achieve the desired requirements.

(2) The optimization is solved by the gradient descent method, and the gradient is calculated as follows:

calculating the partial derivative:

(3) The first item of the solved sequence is taken as the control input and applied to the control system.

The specific steps may be as shown in fig. 4 and include:

The desired trajectory and the current and past states are input into the dynamic recurrent neural network.

The system output at future moments is predicted using the dynamic recurrent neural network.

The cost function is differentiated by backpropagation through time.

The optimization is solved by a stochastic gradient descent method to obtain the optimal control input sequence at future moments.

The first item of the solution is applied to the system. At the next moment, the newly obtained measurement value is used as the initial condition to predict the future output of the system again and solve the optimization problem again, the first item of the new solution is applied to the system, and this operation is repeated indefinitely.
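One solve of this receding-horizon problem can be sketched as below, with a prediction horizon of 5 steps and a control horizon of 1 step. Several choices are assumptions: the cost is taken as the sum of squared tracking errors without an input penalty, the input bounds are invented, the last control move is held over the rest of the horizon, and central finite differences stand in for the backpropagation-through-time gradient that the text describes.

```python
import numpy as np

N_P, N_C = 5, 1            # prediction / control horizon from the text
U_MIN, U_MAX = -1.0, 1.0   # hypothetical input bounds


def rollout(predict, y_hist, u_hist, u_seq):
    """Predict N_P future outputs; the last control move is held.

    y_hist = [y(n), y(n-1), ...], u_hist = [u(n-1), u(n-2), ...].
    """
    y_hist, u_hist, out = list(y_hist), list(u_hist), []
    for k in range(N_P):
        u_k = u_seq[min(k, N_C - 1)]
        u_hist = [u_k] + u_hist[:-1]
        y_next = predict(np.array(y_hist), np.array(u_hist))
        out.append(y_next)
        y_hist = [y_next] + y_hist[:-1]
    return np.array(out)


def cost(predict, y_hist, u_hist, u_seq, ref):
    """Assumed quadratic tracking cost over the prediction horizon."""
    return float(np.sum((ref - rollout(predict, y_hist, u_hist, u_seq)) ** 2))


def solve_mpc(predict, y_hist, u_hist, ref, lr=0.5, iters=200, h=1e-6):
    """Projected gradient descent with finite-difference gradients."""
    u_seq = np.full(N_C, u_hist[0], dtype=float)   # warm start at last input
    for _ in range(iters):
        grad = np.zeros(N_C)
        for i in range(N_C):
            step = np.zeros(N_C)
            step[i] = h
            grad[i] = (cost(predict, y_hist, u_hist, u_seq + step, ref)
                       - cost(predict, y_hist, u_hist, u_seq - step, ref)) / (2 * h)
        # Gradient step, then projection onto the input constraint.
        u_seq = np.clip(u_seq - lr * grad, U_MIN, U_MAX)
    return u_seq
```

Only `u_seq[0]` would be applied to the system; at the next 20 ms step the histories are updated with the new measurement and `solve_mpc` is called again, which gives a look-ahead of 5 × 20 ms = 100 ms at every step.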

3. Description of simulation experiment results for the above examples

(1) Fig. 5 shows the result of the system simulation experiment. The simulation is completed on the Matlab platform, and the system model is built according to equation (2).

The training data set, the test data set and the training process are all produced in Matlab. The figure shows the step response of the system: the dotted line is the given reference, the solid line is the tracking output of the system, the black line in the middle plot is the tracking error, and the solid line at the bottom is the control input. As can be seen from the figure, the method of the invention has excellent step response performance, with almost no overshoot and a steady-state error of about 0.003 rad.

Fig. 6 shows the sinusoidal response of the system: the dotted line is the given reference, the solid line is the tracking output of the system, the black line in the middle plot is the tracking error, and the solid line at the bottom is the control input.

(2) The system used in this example, as shown in figs. 7 and 8, includes a motor driver part and a speed reducer. The motor is driven in the SVPWM manner, the controller input is the motor output torque, and Hall signals are used to detect the position of the motor rotor. The final output of the system is the position (rad) after the speed reducer and the elastomer, that is, the joint position of the series elastic actuator, which is measured by an encoder. A current term is used as the constraint of the controller.

Fig. 7 shows the step response of the system; one solid line is the given reference and the other is the tracking output of the system. As can be seen from the figure, in practical application the method of the invention has excellent step response performance, with almost no overshoot and a steady-state error of about 0.008 rad.

Fig. 8 shows the sinusoidal response; one solid line is the given reference and the other is the tracking output of the system. As can be seen from the figure, in practical application the method of the invention has excellent sinusoidal response performance and tracks sinusoidal signals quickly, with a dynamic error within 0.008 rad during tracking.

By approximating the dynamic system model with a dynamic recurrent neural network and combining the neural network with model predictive control, the method according to the above preferred embodiment of the present invention achieves excellent system control performance. Furthermore, the method may also achieve the following technical advantages: it is simple to implement, easy to operate and does not require tuning many parameters; it can greatly suppress the vibration generated during control by the elastomer contained in the series elastic actuator; and it has high control precision and a certain robustness.

While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that these are by way of example only, and that the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the spirit and scope of the invention, and these changes and modifications are within the scope of the invention.

Claims

1. A method for neural network model predictive control of a series elastic actuator of a robot, the series elastic actuator being a nonlinear dynamical system, the method comprising:

step S1: establishing a dynamic model for the series elastic actuator;

step S3: applying the trained dynamic recurrent neural network as a model to the nonlinear dynamical system, and inputting the current measurement value and historical measurement values associated with the nonlinear dynamical system into the dynamic recurrent neural network to obtain a predicted value of the system state output associated with the nonlinear dynamical system, so as to test the multi-step prediction error of the dynamic recurrent neural network; if the multi-step prediction error exceeds a preset error threshold, returning to step S2 to continue training until the multi-step prediction error is within the preset error threshold range;

2. The neural network model predictive control method of claim 1, comprising cyclically executing step S3 of obtaining the predicted value of the system parameter and step S4 of solving the optimization, using the newly obtained current measurement value and the historical measurement values.

3. The neural network model predictive control method of claim 1, wherein the dynamic recurrent neural network is defined by the following equation (1):

4. The neural network model predictive control method of claim 3, wherein the Adam algorithm dynamically adjusts the learning rate of each parameter included in the dynamic recurrent neural network using first order moment estimation and second order moment estimation of the gradient.

5. The neural network model predictive control method of claim 4, wherein the step S4 includes the sub-steps of:

6. The neural network model predictive control method of claim 5, wherein the cost function is a quadratic function, the prediction time domain in step S4 is 5 steps, and the control time domain is 1 step.

7. The neural network model predictive control method of claim 6, wherein the time interval of each step is 20 milliseconds.

8. The neural network model predictive control method of claim 1, wherein the nonlinear dynamical system comprises a brushless DC motor driven in an SVPWM manner, and a current protection condition is adopted in the method as a constraint term in the predictive control solving process in steps S3 and S4.

9. The neural network model predictive control method of claim 8, wherein the nonlinear dynamical system further comprises a controller saturation constraint term.