CN113255208B

CN113255208B - Neural network model predictive control method for series elastic actuator of robot

Info

Publication number: CN113255208B
Application number: CN202110429622.6A
Authority: CN
Inventors: 单新平; 张安龙; 严作海
Original assignee: Hangzhou Seenpin Robot Technology Co ltd
Current assignee: Hangzhou Xinjian Electromechanical Transmission Co ltd
Priority date: 2021-04-21
Filing date: 2021-04-21
Publication date: 2023-05-12
Anticipated expiration: 2041-04-21
Also published as: CN113255208A

Abstract

The invention discloses a neural network model predictive control method for a series elastic actuator of a robot, comprising the following steps: establishing a dynamic model of the series elastic actuator; constructing and training a dynamic recurrent neural network; applying the trained dynamic recurrent neural network as a model of the nonlinear power system, and inputting the current measured value and historical measured values of the nonlinear power system into the dynamic recurrent neural network to obtain predicted values of the system parameters; and inputting a desired trajectory for the nonlinear power system and, in combination with the predicted values, solving an optimization problem by gradient descent to obtain the optimal control input sequence of the system at future moments. By combining a neural network with a model predictive control algorithm, the method suppresses vibration more effectively, thereby improving the performance of the robot and the comfort of human-robot collaboration.

Description

Neural network model predictive control method for series elastic actuator of robot

Technical Field

The invention relates to the technical field of intelligent robot driver control and nonlinear system control, and in particular to a neural network model predictive control method for a series elastic actuator of a robot.

Background

Model predictive control is an effective optimal control strategy that has achieved great success over recent decades and is widely used in process control, automotive systems and robotics. Its principle is to predict the state output at future moments from the current state, the control input and the dynamics equation, and thereby obtain the optimal control input for the next moment. The series elastic actuator is one type of flexible joint used in robotics, and applying model predictive control to robot flexible joints is a difficult problem. On the one hand, the actuator contains an elastic body, so its dynamics form a highly coupled nonlinear system; backlash and friction in the speed reducer, together with many uncertain parameters, make the system difficult to model. On the other hand, the characteristics of the actuator system place high real-time demands on solving the model predictive control problem.

Traditional series elastic actuator control mostly uses model-free methods, mainly PID-based control. Other controllers are derived, for example, from PD control combined with online adaptive gravity compensation or friction compensation, or from the backstepping method combined with RBF neural networks. PID control is simple to design but performs relatively poorly on nonlinear systems, and tuning its parameters requires experience and time. The controllers of the other methods are complex to derive and difficult to design, and demand a certain level of theoretical knowledge from technicians.

Therefore, the design of the controller with simple method and excellent performance has important significance in the field of robot flexible joints.

External disturbances and the intrinsic flexibility of the series elastic actuator cause vibration, which degrades the performance of the robot and the comfort of human-robot collaboration, and traditional control methods struggle to suppress it efficiently. The two main approaches to suppressing the influence of vibration on robot performance are input shaping and trajectory optimization. Existing control algorithms mainly include quadratic optimization, terminal sliding mode control (TSMC), open-loop iterative learning control (ILC) and adaptive input shaping control. Most of these methods depend on a dynamics model, which is difficult to establish and is easily affected by external interference, and some also require parameter tuning. Research on vibration suppression algorithms remains limited.

Therefore, there is a need to design a more advanced intelligent control algorithm, and more particularly, a neural network model predictive control method for a series elastic actuator of a robot, so that a more superior technical effect of vibration suppression can be achieved by combining a neural network with the model predictive control algorithm.

Disclosure of Invention

The invention aims to overcome the defect that the vibration of the existing robot using the serial elastic actuator is difficult to effectively inhibit due to external disturbance and intrinsic flexibility, and the performance of the robot and the comfort of man-machine cooperation are adversely affected by the vibration.

The invention solves the technical problems by adopting the following technical scheme:

the invention provides a neural network model predictive control method for a series elastic actuator of a robot, wherein the series elastic actuator is a nonlinear power system, and the method is characterized by comprising the following steps:

step S1: establishing a dynamic model for the series elastic actuator system;

step S2: constructing and training a dynamic recurrent neural network for approximating the nonlinear power system, wherein a training method suitable for the dynamic recurrent neural network is selected and the hyperparameters of the neural network are set;

step S3: applying the trained dynamic recurrent neural network to the system as a nonlinear dynamics model, and inputting the current measured value and historical measured values of the nonlinear power system into the dynamic recurrent neural network to obtain a predicted state output of the nonlinear power system; testing the multi-step prediction error of the dynamic recurrent neural network and, if the multi-step prediction error exceeds a preset error threshold, returning to step S2 to continue training until the multi-step prediction error is within the preset error threshold;

step S4: inputting a desired trajectory of the nonlinear power system, predicting the system output at future moments using the dynamic recurrent neural network, and solving the optimization by gradient descent to obtain the optimal system control input sequence for future moments that realizes the desired trajectory.

According to some embodiments of the invention, the method further comprises cyclically executing, with newly obtained current measured values and historical measured values, step S3 of obtaining the predicted values and step S4 of solving the optimization.

According to some embodiments of the invention, the dynamic recurrent neural network is defined by the following formula (1):

$$y(n+1) = f\big(y(n),\, y(n-1),\, \ldots,\, y(n-d_y+1),\, u(n),\, u(n-1),\, u(n-2),\, \ldots,\, u(n-d_u+1)\big) \tag{1}$$

where $y(n+1)$ is the system output state at time $n+1$, $y(n)$ is the system output state at time $n$, $d_y$ is the output time-delay constant, $u(n)$ is the system control input at time $n$, and $d_u$ is the control input time-delay constant;

the dynamic recurrent neural network is trained using the Adam algorithm.

According to some embodiments of the invention, the Adam algorithm dynamically adjusts the learning rate of each parameter of the dynamic recurrent neural network using first-moment and second-moment estimates of the gradient.

According to some embodiments of the invention, step S4 comprises the sub-steps of:

inputting a desired trajectory, the current state and the historical states of the nonlinear power system into the dynamic recurrent neural network;

predicting the system output at future moments using the dynamic recurrent neural network;

determining a cost function associated with the nonlinear power system, and differentiating the cost function by backpropagation through time;

solving the optimization by stochastic gradient descent to obtain the optimal system control input sequence for future moments that realizes the desired trajectory.

According to some embodiments of the invention, the cost function is a quadratic function, the prediction horizon in step S4 is 5 steps, and the control horizon is 1 step.

According to some embodiments of the invention, the time interval of each step is 20 milliseconds.

According to some embodiments of the invention, the nonlinear power system comprises a brushless DC motor driven in SVPWM mode, wherein the current protection condition is used as a constraint term in the predictive control solving process in steps S3 and S4.

According to some embodiments of the invention, the nonlinear power system further comprises a controller saturation constraint.

On the basis of conforming to the common knowledge in the field, the above preferred conditions can be arbitrarily combined to obtain the preferred examples of the invention.

The invention has the positive progress effects that:

according to the neural network model predictive control method for the serial elastic actuator of the robot, disclosed by the invention, the suppression of vibration can be more effectively realized by utilizing the neural network in combination with a model predictive control algorithm, so that the performance of the robot and the comfort level of man-machine cooperation are improved.

Drawings

Fig. 1 is a schematic view of a neural network model predictive control method for a series elastic actuator of a robot according to a preferred embodiment of the present invention.

Fig. 2 is a flowchart illustrating a data set sampling process in a neural network model predictive control method for a series elastic actuator of a robot according to a preferred embodiment of the present invention.

Fig. 3 is a schematic view of the dynamic recurrent neural network training process in a neural network model predictive control method for a series elastic actuator of a robot according to a preferred embodiment of the present invention.

Fig. 4 is a partial flowchart of step S4 in the neural network model predictive control method for a series elastic actuator of a robot according to a preferred embodiment of the present invention, in which an exemplary optimization solving process is shown.

Fig. 5 is a schematic diagram showing a simulation of a system step response of an application example of a neural network model predictive control method for a series elastic actuator of a robot according to a preferred embodiment of the present invention.

Fig. 6 is a schematic diagram showing a simulation of a system sinusoidal response of an application example of a neural network model predictive control method for a series elastic actuator of a robot according to a preferred embodiment of the present invention.

Fig. 7 is another simulation diagram of a system step response of an application example of a neural network model predictive control method for a series elastic actuator of a robot according to a preferred embodiment of the present invention.

Fig. 8 is another simulation diagram of a system sinusoidal response of an application example of a neural network model predictive control method for a series elastic actuator of a robot according to a preferred embodiment of the present invention.

Fig. 9 is a schematic view of a NARX neural network structure involved in a neural network model predictive control method for a series elastic actuator of a robot according to a preferred embodiment of the present invention.

Detailed Description

The following detailed description of the preferred embodiments of the invention, taken in conjunction with the accompanying drawings, is given by way of illustration and not limitation, and any other similar situations are intended to fall within the scope of the invention.

In the following detailed description, directional terms, such as "left", "right", "upper", "lower", "front", "rear", etc., are used with reference to the directions described in the drawings. The components of the various embodiments of the present invention can be positioned in a number of different orientations and the directional terminology is used for purposes of illustration and is in no way limiting.

Referring to fig. 1, a neural network model predictive control method for a series elastic actuator of a robot according to a preferred embodiment of the present invention, the series elastic actuator being a nonlinear power system, the method comprising:

step S1: establishing a dynamic model for the series elastic actuator system;

step S2: constructing and training a dynamic recurrent neural network for approximating the nonlinear power system, wherein a training method suitable for the dynamic recurrent neural network is selected and the hyperparameters of the neural network are set;

step S3: applying the trained dynamic recurrent neural network as a model of the nonlinear power system, and inputting the current measured value and historical measured values of the nonlinear power system into the dynamic recurrent neural network to obtain a predicted system output of the nonlinear power system; testing the multi-step prediction error of the dynamic recurrent neural network and, if the multi-step prediction error exceeds a preset error threshold, returning to step S2 to continue training until the multi-step prediction error is within the preset error threshold;

step S4: inputting a desired trajectory of the nonlinear power system, predicting the system output at future moments using the dynamic recurrent neural network, and solving the optimization by gradient descent to obtain the optimal system control input sequence for future moments that realizes the desired trajectory.

The method may further include cyclically executing, with newly obtained current measured values and historical measured values, step S3 of obtaining the predicted values and step S4 of solving the optimization.

In other words, as shown in fig. 1, the method may further include:

step S5: at each subsequent moment, the new measured value is used as the initial condition to re-predict the future output of the system and re-solve the optimization problem; this cycle repeats until the result of solving the optimization problem meets the set conditions or requirements.

It should be understood that the methods described herein are primarily directed to a series elastic actuator for a robot, which is a nonlinear power system. The nonlinear power system may also be referred to herein simply as the "system".

According to some preferred embodiments, the dynamic recurrent neural network is defined by the following equation (1):

$$y(n+1) = f\big(y(n),\, y(n-1),\, \ldots,\, y(n-d_y+1),\, u(n),\, u(n-1),\, u(n-2),\, \ldots,\, u(n-d_u+1)\big) \tag{1}$$

where $y(n+1)$ is the system output state at the future time $n+1$, $y(n)$ is the system output state at time $n$, $d_y$ is the output time-delay constant, $u(n)$ is the system control input at time $n$, and $d_u$ is the control input time-delay constant;

According to some preferred embodiments, the Adam algorithm uses first-moment and second-moment estimates of the gradient to dynamically adjust the learning rate of each parameter of the dynamic recurrent neural network.

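According to some preferred embodiments, step S4 comprises the sub-steps of:

inputting a desired trajectory, the current state and the historical states of the nonlinear power system into the dynamic recurrent neural network;

predicting the system output at future moments using the dynamic recurrent neural network;

determining a cost function associated with the nonlinear power system, and differentiating the cost function by backpropagation through time;

solving the optimization by stochastic gradient descent to obtain the optimal system control input sequence for future moments that realizes the desired trajectory.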

Further preferably, the cost function is a quadratic function, the prediction horizon in step S4 is 5 steps, the control horizon is 1 step, and the time interval of each step is 20 milliseconds.

According to some preferred embodiments of the present invention, the nonlinear power system includes a brushless DC motor driven in SVPWM mode, and the method uses the current protection condition as a constraint term in the predictive control solving process in steps S3 and S4. Here, SVPWM stands for space vector pulse width modulation.

The individual steps involved in this method according to the above-described preferred embodiment of the invention will be exemplified in more detail below.

1. Detailed description of method steps for constructing the dynamic model and the neural network

(1) Establishing a dynamic model of the series elastic actuator system

In step S1, a dynamic model of the series elastic actuator system is established as shown in formula (2):

where $B_l$ is the link-side moment of inertia, $B_m$ is the motor-side moment of inertia, $q$ is the actuator output position, $\theta$ is the motor-side position of the actuator, $\ddot{q}$ is the link-side acceleration, $\ddot{\theta}$ is the motor-side acceleration, $\tau_{f1}$ is the link-side damping (velocity dependent), $\tau_{f2}$ is the motor-side damping (velocity dependent), $K$ is the elastic constant of the elastic body, $m$ is the load mass, $g$ is the gravitational acceleration, $l$ is the link length, $f_{dist}$ is the external disturbance, and $\tau_m$ is the motor output torque.
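
The symbols defined above match the usual two-inertia model of a series elastic joint. As an illustrative sketch only, the following writes out that standard textbook form using the same symbols. It is an assumption based on the symbol list, not a reproduction of formula (2), whose exact terms and sign conventions may differ.

```latex
\[
\begin{aligned}
B_l\,\ddot{q} + \tau_{f1} + K\,(q-\theta) + m g l \sin q &= f_{dist} \\
B_m\,\ddot{\theta} + \tau_{f2} - K\,(q-\theta) &= \tau_m
\end{aligned}
\]
```

In this form the elastic torque $K(q-\theta)$ couples the link side and the motor side with opposite signs, which is the source of the vibration the method is intended to suppress.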

(2) Establishing a dynamic recurrent neural network model

The neural network used in step S2 is a dynamic recurrent neural network, whose mathematical expression is given by the following formula:

$$y(n+1) = f\big(y(n),\, y(n-1),\, \ldots,\, y(n-d_y+1),\, u(n),\, u(n-1),\, u(n-2),\, \ldots,\, u(n-d_u+1)\big)$$

where $y(n+1)$ is the system output state at time $n+1$, $y(n)$ is the system output state at time $n$, $d_y$ is the output time-delay constant, $u(n)$ is the system control input at time $n$, and $d_u$ is the control input time-delay constant. The dynamic recurrent neural network architecture may be as shown in fig. 9.

Preferably, the dynamic recurrent neural network has 3 layers, namely an input layer, a hidden layer and an output layer; the hidden layer has 30 neurons and the time-delay constant is set to 4. The output of each hidden node is given by the following formula:

where σ is a nonlinear activation function.

Preferably, the tanh function is selected as the activation function, with the expression:

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

The network output is given by the following formula
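
A minimal sketch of a one-step forward pass through such a three-layer network is given below, assuming a fully connected tanh hidden layer and a linear output layer. The function name `narx_forward`, the weight names `W1`, `b1`, `W2`, `b2` and the random initialization are hypothetical and are not taken from the patent; only the 30 hidden neurons, the delay of 4 and the tanh activation come from the text.

```python
import numpy as np

def narx_forward(y_hist, u_hist, W1, b1, W2, b2):
    """One-step prediction y(n+1) from delayed outputs and delayed inputs."""
    x = np.concatenate([y_hist, u_hist])   # regressor of length d_y + d_u
    h = np.tanh(W1 @ x + b1)               # hidden layer with tanh activation
    return float(W2 @ h + b2)              # linear output layer (assumed)

rng = np.random.default_rng(0)
d_y, d_u, n_hidden = 4, 4, 30              # delays and hidden size from the text
W1 = rng.normal(scale=0.1, size=(n_hidden, d_y + d_u))   # hypothetical init
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=n_hidden)
b2 = 0.0

y_hist = np.zeros(d_y)                     # y(n), y(n-1), ..., y(n-d_y+1)
u_hist = np.ones(d_u)                      # u(n), u(n-1), ..., u(n-d_u+1)
y_next = narx_forward(y_hist, u_hist, W1, b1, W2, b2)
```

Feeding `y_next` back into `y_hist` at the next step turns this one-step predictor into the multi-step predictor used in steps S3 and S4.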

(3) Dynamic recurrent neural network training method

Preferably, the dynamic recurrent neural network training in step S2 uses the mean square error (MSE) as the loss function, given by the following formula
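
For reference, the mean square error has the standard form below. The symbols $N$ (number of training samples), $y_i$ (measured output) and $\hat{y}_i$ (network prediction) are notation assumed here for illustration and may differ from the notation of the original formula.

```latex
\[
\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2
\]
```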

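Preferably, the Adam algorithm used in the training process in step S2 is formulated as follows:

$$M_t = \beta_1 M_{t-1} + (1-\beta_1)\, g_t$$

$$G_t = \beta_2 G_{t-1} + (1-\beta_2)\, g_t^2$$

The parameters are updated as follows:

$$\hat{M}_t = \frac{M_t}{1-\beta_1^t}, \qquad \hat{G}_t = \frac{G_t}{1-\beta_2^t}$$

$$W_t = W_{t-1} - \alpha\, \frac{\hat{M}_t}{\sqrt{\hat{G}_t} + \epsilon}$$

where $g_t$ is the gradient, $W_t$ denotes the network weights, $\alpha = 0.001$, $\beta_1 = 0.9$, $\beta_2 = 0.999$, $\epsilon = 10^{-8}$, $M_t$ and $G_t$ are the first-moment and second-moment estimates of the gradient respectively, and $\hat{M}_t$ and $\hat{G}_t$ are the bias-corrected values of $M_t$ and $G_t$. The factor $\alpha / (\sqrt{\hat{G}_t} + \epsilon)$ is adjusted dynamically according to the gradient and forms a dynamic constraint on the learning rate.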
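
The update rule can be sketched in a few lines of NumPy. The function name `adam_step` and the toy loss are hypothetical and serve only to exercise the rule; the hyperparameter defaults are the values stated above.

```python
import numpy as np

def adam_step(w, g, M, G, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update of weights w given gradient g at iteration t (t >= 1)."""
    M = beta1 * M + (1 - beta1) * g          # first-moment estimate
    G = beta2 * G + (1 - beta2) * g ** 2     # second-moment estimate
    M_hat = M / (1 - beta1 ** t)             # bias correction
    G_hat = G / (1 - beta2 ** t)
    w = w - alpha * M_hat / (np.sqrt(G_hat) + eps)
    return w, M, G

# Toy usage: minimise sum(w**2), whose gradient is 2*w.
w = np.array([1.0, -2.0])
initial_loss = float(np.sum(w ** 2))
M, G = np.zeros_like(w), np.zeros_like(w)
for t in range(1, 2001):
    w, M, G = adam_step(w, 2 * w, M, G, t)
final_loss = float(np.sum(w ** 2))
```

On the very first iteration the bias-corrected ratio equals the sign of the gradient, so each weight moves by almost exactly `alpha` regardless of the gradient magnitude.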

(4) Data set sampling and training process

Data set sampling directly affects the training of the dynamic recurrent neural network, and an insufficient data set leads to model bias. The data set sampling process is shown in fig. 2.

First, a random control quantity $u \in (u_{min}, u_{max})$ is generated and input into the series elastic actuator system.

Each control quantity is held for a random duration $t \in (t_{min}, t_{max})$; in the invention this duration is set between 15 ms and 40 ms.

The system output is collected at fixed time intervals. Because the invention addresses position control, the system output is the output-end position.

It is then judged whether the preset maximum number of samples has been reached.

If not, the process returns to generating random control quantities.

If so, the data set is stored in time order, containing the time, control input and system output data, and data collection ends.

To ensure sufficiency of the data sets, multiple sets of data sets need to be acquired so that the data sets can fully exhibit the dynamics of the system.
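
The sampling loop above can be sketched as follows. This is an illustration under stated assumptions: `toy_plant_step` is a hypothetical stand-in plant and not the model of formula (2), and the 5 ms sampling interval and the bounds of ±1 on the control quantity are assumed values. Only the 15 ms to 40 ms hold time comes from the text.

```python
import numpy as np

def toy_plant_step(state, u, dt):
    """Hypothetical stand-in plant (damped pendulum-like joint)."""
    q, dq = state
    ddq = u - 0.5 * dq - 2.0 * np.sin(q)
    dq = dq + dt * ddq
    q = q + dt * dq
    return np.array([q, dq])

def sample_dataset(max_samples, u_min=-1.0, u_max=1.0,
                   t_min=0.015, t_max=0.040, dt=0.005, seed=0):
    rng = np.random.default_rng(seed)
    state, t, rows = np.zeros(2), 0.0, []
    while len(rows) < max_samples:             # stop at the maximum sample count
        u = rng.uniform(u_min, u_max)          # random control quantity
        hold = rng.uniform(t_min, t_max)       # random hold time, 15 ms to 40 ms
        for _ in range(max(1, int(round(hold / dt)))):
            state = toy_plant_step(state, u, dt)
            t += dt
            rows.append((t, u, state[0]))      # time, control input, output position
            if len(rows) >= max_samples:
                break
    return np.array(rows)                      # stored in time order

data = sample_dataset(1000)
```

Calling `sample_dataset` with several different seeds gives the multiple data sets mentioned above.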

During training, the data set is fed into the neural network cyclically in time order. Training is open-loop: only single-step prediction is performed, with the data of the next time step used as the reference value. The specific process is shown in fig. 3.

The network weights are updated using the Adam algorithm, and the data set is processed in batches to speed up training. Training stops after a set number of iterations or when the gradient falls below a set threshold, and the network weights are saved.

The trained neural network is then iterated to perform multi-step prediction, and its multi-step prediction error is tested. If the error is within the allowable range, the network weights are saved and training ends; if the multi-step prediction error exceeds the preset range, the process returns to continue training the network.
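
The difference between open-loop single-step training and the closed-loop multi-step test can be sketched as follows. In the multi-step test the predictions are fed back in place of measurements. The function name `multi_step_error`, the first-order toy system and the threshold value are assumptions made for illustration.

```python
import numpy as np

def multi_step_error(predict, y_meas, u_seq, d=4, horizon=5):
    """Worst-case error after `horizon` closed-loop prediction steps."""
    errors = []
    for n in range(d - 1, len(y_meas) - horizon):
        y_hist = list(y_meas[n - d + 1:n + 1][::-1])        # y(n), ..., y(n-d+1)
        for k in range(horizon):
            u_hist = u_seq[n + k - d + 1:n + k + 1][::-1]   # u(n+k), ..., u(n+k-d+1)
            y_pred = predict(np.array(y_hist), np.array(u_hist))
            y_hist = [y_pred] + y_hist[:-1]                 # feed the prediction back
        errors.append(abs(y_pred - y_meas[n + horizon]))
    return float(np.max(errors))

# Toy data from a hypothetical first-order system y(n+1) = 0.9 y(n) + 0.1 u(n).
rng = np.random.default_rng(0)
u_seq = rng.uniform(-1.0, 1.0, 200)
y_meas = np.zeros(200)
for n in range(199):
    y_meas[n + 1] = 0.9 * y_meas[n] + 0.1 * u_seq[n]

err_perfect = multi_step_error(lambda yh, uh: 0.9 * yh[0] + 0.1 * uh[0], y_meas, u_seq)
err_biased = multi_step_error(lambda yh, uh: 0.8 * yh[0] + 0.1 * uh[0], y_meas, u_seq)

threshold = 1e-3                               # assumed error threshold
needs_more_training = err_biased > threshold   # if True, return to step S2
```

A model with a small single-step error can still drift over several fed-back steps, which is why the multi-step error is used as the acceptance test before the network serves as the prediction model.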

2. Description of method steps for online solution of neural network model predictive control

Preferably, once the prediction accuracy of the neural network meets the requirement, the trained neural network is used as the system model and combined with model predictive control, and stochastic gradient descent is used to solve the model predictive control problem online.

(1) Preferably, the cost function is a quadratic function, as shown in the following formula

$$u_{min} \le u \le u_{max}$$

where $u_{min}$ and $u_{max}$ are the lower and upper bounds that constrain the control input $u$. It should be noted that different cost functions may be set for different control objectives to meet the desired requirements.
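
As an illustration of what a quadratic cost of this kind typically looks like, the following is one common model predictive control form. The reference symbol $y_r$, the prediction $\hat{y}$, the weight $\lambda$ and the horizon symbols $N_p$ and $N_c$ are assumptions introduced here, so the cost actually used in the patent may be weighted or structured differently.

```latex
\[
J = \sum_{i=1}^{N_p}\bigl(y_r(n+i) - \hat{y}(n+i)\bigr)^2
  + \lambda \sum_{j=0}^{N_c-1} \Delta u(n+j)^2,
\qquad u_{min} \le u \le u_{max}
\]
```

With the preferred settings stated earlier, $N_p = 5$ and $N_c = 1$, so a single control move is optimized against five predicted outputs.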

(2) Gradient descent is used to solve the optimization, and the gradient is computed as follows:

Taking the partial derivative:
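
A sketch of the general shape of such a gradient is given below, assuming for illustration a tracking-only quadratic cost $J = \sum_{i=1}^{N_p} (y_r(n+i) - \hat{y}(n+i))^2$. The symbols $y_r$, $\hat{y}$, $N_p$ and the step size $\eta$ are assumed notation, not the patent's own expressions.

```latex
\[
\frac{\partial J}{\partial u(n)}
  = -2 \sum_{i=1}^{N_p} \bigl(y_r(n+i) - \hat{y}(n+i)\bigr)\,
    \frac{\partial \hat{y}(n+i)}{\partial u(n)},
\qquad
u(n) \leftarrow u(n) - \eta\, \frac{\partial J}{\partial u(n)}
\]
```

Each sensitivity $\partial \hat{y}(n+i) / \partial u(n)$ is obtained by the chain rule through the network $f$. Because later predictions depend on earlier ones, the derivative is propagated backward through the prediction steps, which is backpropagation through time.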

(3) The first term of the above solution sequence is used as a control input to the control system.

The specific steps may be as shown in fig. 4, and include:

The desired trajectory and the current and past states are input to the dynamic recurrent neural network.

The dynamic recurrent neural network is used to predict the system output at future moments.

The cost function is differentiated by backpropagation through time.

The optimization is solved by stochastic gradient descent to obtain the optimal control input sequence for future moments.

The first term of the solution is applied to the system. At the next moment, the newly obtained measured value is used as the initial condition to re-predict the future output of the system and re-solve the optimization problem, and the first term of the new solution is again applied to the system. This cycle repeats indefinitely.
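
The receding-horizon loop above can be sketched as follows, under several stated assumptions. `model` is a hypothetical one-step predictor standing in for the trained network, the plant is taken to be identical to the model, and the gradient is approximated by central finite differences rather than by backpropagation through time. Clipping to `U_MIN` and `U_MAX` plays the role of the input constraint. The horizon of 5 steps with a single control move follows the text.

```python
import numpy as np

U_MIN, U_MAX = -1.0, 1.0     # assumed input bounds
N_P = 5                      # prediction horizon (steps), control horizon = 1

def model(y, u):
    """Hypothetical one-step predictor standing in for the trained network."""
    return 0.9 * y + 0.1 * np.tanh(u)

def cost(u, y0, y_ref):
    """Quadratic tracking cost over the horizon with u held constant."""
    y, J = y0, 0.0
    for i in range(N_P):
        y = model(y, u)
        J += (y_ref[i] - y) ** 2
    return J

def solve_mpc(y0, y_ref, u0=0.0, lr=0.5, iters=200, h=1e-5):
    """Projected gradient descent on the control move."""
    u = u0
    for _ in range(iters):
        grad = (cost(u + h, y0, y_ref) - cost(u - h, y0, y_ref)) / (2 * h)
        u = float(np.clip(u - lr * grad, U_MIN, U_MAX))   # enforce the input bounds
    return u

y, u, target, log = 0.0, 0.0, 0.05, []
for n in range(100):                       # receding-horizon loop
    y_ref = np.full(N_P, target)           # desired trajectory over the horizon
    u = solve_mpc(y, y_ref, u0=u)          # warm start from the previous solution
    y = model(y, u)                        # apply the first (only) control move
    log.append(y)
```

Warm-starting each solve from the previous control move keeps the number of gradient iterations per step small, which matters given the 20 ms step interval stated earlier.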

3. Description of simulation experiment results of the above examples

(1) Fig. 5 shows the results of the system simulation experiment. The simulation was carried out on the Matlab platform, with the system model built according to formula (2).

The training data set and the test data set were generated, and the training process was carried out, entirely in Matlab. Fig. 5 shows the step response of the system: the dotted line is the given reference, the solid line is the tracking output of the system, the black line in the middle is the tracking error of the system, and the bottom solid line is the control input of the system.

As can be seen from the sinusoidal response (fig. 6), the sinusoidal response performance of the method is excellent: sinusoidal signals are tracked rapidly, and the dynamic error during tracking is within 0.005 rad.

(2) The system used in the example shown in figs. 7 and 8 includes a motor driver and a speed reducer. The motor is driven in SVPWM mode, the control input is the motor output torque, and Hall signals are used to detect the motor rotor position. The final output of the system is the joint position (rad) of the series elastic actuator, measured by an encoder after the speed reducer and the elastic body. The current term is used as a constraint on the controller.

Fig. 7 shows the step response of the system: one solid line is the given reference and the other is the tracking output of the system. The figure shows that, in practical application, the step response of the method has almost no overshoot and a steady-state error of about 0.008 rad.

Fig. 8 shows the sinusoidal response: one solid line is the given reference and the other is the tracking output of the system. The figure shows that, in practical application, the method tracks sinusoidal signals rapidly, with a dynamic error within 0.008 rad during tracking.

The method according to the above preferred embodiment of the present invention approximates the dynamic system with a dynamic recurrent neural network and combines that network with model predictive control, achieving excellent system control performance. In addition, the method offers the following technical advantages: it is simple to implement and easy to operate, and requires few parameters to be tuned; it greatly suppresses the vibration produced during control by the elastic body in the series elastic actuator; and it provides high control precision and a certain degree of robustness.

While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that these are by way of example only, and the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the principles and spirit of the invention, and such changes and modifications fall within the scope of the invention.

Claims

1. A neural network model predictive control method for a series elastic actuator of a robot, the series elastic actuator being a nonlinear power system, the method comprising:

step S1: establishing a dynamic model for the series elastic actuator;

step S3: applying the trained dynamic recurrent neural network as a model of the nonlinear power system, and inputting the current measured value and historical measured values of the nonlinear power system into the dynamic recurrent neural network to obtain a predicted system state output of the nonlinear power system; testing the multi-step prediction error of the dynamic recurrent neural network and, if the multi-step prediction error exceeds a preset error threshold, returning to step S2 to continue training until the multi-step prediction error is within the preset error threshold;

step S4, comprising the sub-steps of:

2. The neural network model predictive control method according to claim 1, characterized in that the method comprises cyclically executing, with newly obtained current measured values and historical measured values, step S3 of obtaining the system state output predicted value and step S4 of solving the optimization.

3. The neural network model predictive control method according to claim 1, wherein the dynamic recurrent neural network is defined by the following formula (1): $y(n+1) = f\big(y(n),\, y(n-1),\, \ldots,\, y(n-d_y+1),\, u(n),\, u(n-1),\, u(n-2),\, \ldots,\, u(n-d_u+1)\big)$ (1), wherein $y(n+1)$ is the system output state at time $n+1$, $y(n)$ is the system output state at time $n$, $d_y$ is the output time-delay constant, $u(n)$ is the system control input at time $n$, and $d_u$ is the control input time-delay constant;

4. The neural network model predictive control method according to claim 3, wherein the Adam algorithm dynamically adjusts the learning rate of each parameter of the dynamic recurrent neural network using first-moment and second-moment estimates of the gradient.

5. The neural network model predictive control method according to claim 4, wherein the cost function is a quadratic function, and in step S4 the prediction horizon over which the dynamic recurrent neural network predicts the system output at future moments is 5 steps, and the control horizon of the optimal system control input sequence is 1 step.

6. The neural network model predictive control method of claim 5, wherein each step is 20 milliseconds apart.

7. The neural network model predictive control method according to claim 1, wherein the nonlinear power system includes a dc brushless motor driven in a SVPWM manner, and wherein current protection conditions are employed as constraint terms in the predictive control solving process in steps S3 and S4.

8. The neural network model predictive control method of claim 7, wherein the nonlinear power system further comprises a controller saturation constraint term.