CN111580389A

CN111580389A - Three-degree-of-freedom helicopter explicit model prediction control method based on deep learning

Info

Publication number: CN111580389A
Application number: CN202010433327.3A
Authority: CN
Inventors: 张聚; 施超; 吴崇坚; 牛彦; 潘伟栋; 陈德臣
Original assignee: Zhejiang University of Technology ZJUT
Current assignee: Zhejiang University of Technology ZJUT
Priority date: 2020-05-21
Filing date: 2020-05-21
Publication date: 2020-08-25
Anticipated expiration: 2040-05-21
Also published as: CN111580389B

Abstract

A deep-learning-based explicit model predictive control method for a three-degree-of-freedom helicopter comprises the following steps: step 1) analyzing and dynamically modeling the helicopter system; step 2) acquiring and processing data through an explicit model predictive control algorithm; step 3) constructing and training a neural network based on explicit model predictive control (EMPC); and step 4) verifying the model and applying the deep learning network. By combining model predictive control with deep learning, the invention addresses the high computing-resource demand and long computation time of traditional model predictive control, and improves control precision, prediction accuracy, and computational efficiency.

Description

Three-degree-of-freedom helicopter explicit model prediction control method based on deep learning

Technical Field

The invention relates to a helicopter control method, in particular to a model predictive control method for a three-degree-of-freedom helicopter based on deep learning.

Background

With the rapid development of modern science and technology and of artificial intelligence, the unmanned helicopter has emerged as a distinctive creation that greatly expands the range of applications of aircraft. An unmanned helicopter can fly in any direction, take off and land vertically, hover to monitor a preset target from a fixed point, and rotate through 360 degrees at a fixed point. Its overall dimensions and radar reflection area are small, its flight safety is high, its requirements on take-off and landing sites are low, and its ground equipment is simple and quick to set up and withdraw. With the development of microelectronics and micro-sensors in particular, research on reliable, low-cost unmanned helicopters has become a main trend. Owing to these characteristics, unmanned helicopters have broad application and development prospects. In the military field they are used for ground attack, landing operations, weapon transport, logistics support and the like. In the civil field they are used for short-distance transport, medical rescue, disaster relief, emergency rescue, equipment hoisting, geological exploration, forest protection and firefighting, aerial photography and the like. Selecting a good control strategy is crucial when controlling a helicopter: the keys to system evaluation and analysis are ensuring normal operation of the system, reducing computational cost, and at the same time improving the performance indexes of the system.

For many years, experts and scholars in the flight control field have conducted extensive research on aircraft control strategies and proposed a number of typical advanced control methods. Model predictive control (MPC) is a very effective one and, as conventionally described, mainly comprises three elements: a prediction model, receding-horizon optimization, and feedback correction. However, MPC is generally applied to systems with slow dynamics; for nonlinear systems such as helicopters, MPC must optimize a nonlinear performance index at every sampling instant, which demands substantial computing resources and is computationally inefficient.

Explicit model predictive control is a fast MPC algorithm for small-scale control problems proposed by Bemporad et al. in 2002. Its main idea is to move the online computation of the optimal solution offline by means of parametric programming, thereby speeding up the online computation. When solving the multi-parametric quadratic programming (mp-QP) problem, conventional explicit model predictive control usually uses a geometric approach: the state region of the system is partitioned into convex regions, and on each state partition an explicit functional relationship between the optimal control law of the optimization problem and the state (a law that is linear in the state) is established; the resulting parametric solution is the control law corresponding to each critical region of the state. This approach has limitations, however. On the one hand, it must check whether the properties of the facets shared by adjacent critical regions are satisfied; on the other hand, the step size must be set reasonably and all constraint sets must be examined one by one, so solving the problem is inefficient and computationally expensive.

Model predictive control based on deep learning effectively combines model predictive control with a neural network. The most outstanding characteristics of a neural network are its excellent mapping-approximation capability and learning-generalization capability: by adaptively adjusting its parameters through training, using a gradient descent algorithm to continuously update the weights and biases, the model prediction part of predictive control can approximate the actual process more accurately, which improves the control precision of the whole predictive control algorithm. Neural networks are an ideal choice for controlled objects described by high-order, multivariable, time-varying, nonlinear, strongly coupled models. Exploiting the way model predictive algorithms first predict the control result and then act, together with the strengths of deep learning in multi-parameter optimization, a deep learning model predictive control algorithm is proposed that keeps computing-resource requirements low, improves computational efficiency, and relieves the computational burden of conventional model predictive control. After training is finished, the deep learning neural network replaces model prediction in controlling the three-degree-of-freedom helicopter.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides an explicit model predictive control method based on deep learning. By introducing deep learning, model predictive control and a neural network are effectively combined. The most prominent characteristics of a neural network are its excellent mapping-approximation capability and learning-generalization capability: by adaptively adjusting its parameters through training, using a gradient descent algorithm to continuously update the weights and biases, the model prediction part of predictive control can approximate the actual process more accurately, which improves the control precision of the whole predictive control algorithm.

When control is performed with deep-learning-based explicit model predictive control, explicit model predictive control is needed only at the outset to solve the optimization problem: the control laws of all state partitions are computed offline, and online the partition containing the current system state is located to determine the optimal control input for the current parameter. Once training is finished, the deep learning neural network replaces model prediction in controlling the three-degree-of-freedom helicopter. Neural networks are an ideal choice for controlled objects described by high-order, multivariable, time-varying, nonlinear, strongly coupled models. Exploiting the way model predictive algorithms first predict the control result and then act, together with the strengths of deep learning in multi-parameter optimization, a deep learning model predictive control algorithm is proposed that keeps computing-resource requirements low, improves computational efficiency, and relieves the computational burden of conventional model predictive control.

Algorithmically, the deep-learning-based explicit model predictive control method is divided into three parts: data acquisition, model training, and model prediction. During data acquisition, data are obtained through the two working phases of explicit model predictive control, namely offline computation and online computation. A model is then built and trained until an ideal model is obtained. In the last part the model is used for prediction, and once the expected target is reached, the deep learning neural network replaces the original model prediction in controlling the three-degree-of-freedom helicopter.

In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention are further described below. The three-degree-of-freedom helicopter explicit model prediction control method based on deep learning comprises the following steps:

Step 1) Analyze and dynamically model the helicopter system: analyze the forces acting on each axis and component while the helicopter system is in operation.

The state-space equation of the three-degree-of-freedom helicopter system is as follows:

$$\dot{x} = Ax + Bu, \qquad y = Cx$$

According to the modeling analysis of the helicopter, the altitude angle $\varepsilon$, the pitch angle $p$, the rotation angle $r$, and their respective derivatives, the altitude angular velocity $\dot{\varepsilon}$, the pitch angular velocity $\dot{p}$ and the rotational angular velocity $\dot{r}$, are selected as the state vector, i.e.

$$x = [\varepsilon \;\; p \;\; r \;\; \dot{\varepsilon} \;\; \dot{p} \;\; \dot{r}]^T$$

The voltages of the front and rear motors of the three-degree-of-freedom helicopter are selected as the input vector, i.e. $u = [V_f \;\; V_b]^T$, and the altitude angle $\varepsilon$, the pitch angle $p$ and the rotation angle $r$ as the output vector $y = [\varepsilon \;\; p \;\; r]^T$. Substituting the values from the relevant parameter table gives the coefficient matrices $A$, $B$ and $C$ of the state equation.

Step 2) Acquire and process data through the explicit model predictive control algorithm: the main idea of the offline computation in explicit predictive control is multi-parametric programming. Consider the constrained optimal control problem:

$$\min_{U_N} \; x_N^T P x_N + \sum_{k=0}^{N-1} \left( x_k^T Q x_k + u_k^T R u_k \right) \tag{1a}$$

$$\text{s.t.} \quad E x_k + L u_k \le M, \quad k = 0, 1, \dots, N-1 \tag{1b}$$

$$x_{k+1} = A x_k + B u_k, \quad k \ge 0 \tag{1c}$$

$$x_0 = x(0), \quad x_N \in \chi_f \tag{1d}$$

where $P, Q \ge 0$ and $R > 0$, $x_N$ is the terminal state, $u_k$ and $x_k$ are the control variable and the state variable at each sampling point, and $U_N = [u_0^T, \dots, u_{N-1}^T]^T$ is the vector of control inputs to be optimized. The problem is considered for the set of all initial states $x(0)$ that admit a feasible solution; $x_N \in \chi_f$ is the terminal state constraint, and the terminal region $\chi_f$ is a convex set.

The optimal control vector $U_N$ in optimization problem (1) depends only on the current system state $x(0)$. Since (1c) is linear, the predicted state can be written as

$$x_k = A^k x(0) + \sum_{j=0}^{k-1} A^j B u_{k-1-j} \tag{2}$$

Substituting (2) into (1) transforms the constrained optimal control problem into the following optimization problem:

$$\min_{U_N} \; \tfrac{1}{2} U_N^T H U_N + x(0)^T F U_N + \tfrac{1}{2} x(0)^T Y x(0) \tag{3a}$$

$$\text{s.t.} \quad G U_N \le W + E x(0) \tag{3b}$$

where $Y$, $H$, $F$, $G$, $W$ and $E$ can be obtained from (1). The solution of this multi-parametric quadratic programming problem is a piecewise affine function:

$$u(k) = f_i x(k) + g_i, \quad x(k) \in CR_i \tag{4}$$
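The condensing step that leads from the linear model (1c) to the closed-form prediction can be checked numerically. The following is a minimal sketch in plain Python; the matrices `A` and `B` are hypothetical two-state values chosen only for illustration (they are not the helicopter coefficients), and the helper names are likewise assumptions.

```python
# Hypothetical 2-state, 1-input discrete-time system (NOT the helicopter matrices).
A = [[1.0, 0.1],
     [0.0, 1.0]]
B = [[0.005],
     [0.1]]

def mat_vec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def vec_add(a, b):
    """Element-wise sum of two vectors."""
    return [x + y for x, y in zip(a, b)]

def step(x, u):
    """One step of the linear model (1c): x_{k+1} = A x_k + B u_k."""
    return vec_add(mat_vec(A, x), mat_vec(B, u))

def predict_recursive(x0, inputs):
    """Roll the model forward through every input in the sequence."""
    x = list(x0)
    for u in inputs:
        x = step(x, u)
    return x

def predict_condensed(x0, inputs):
    """Closed form: x_k = A^k x(0) + sum_{j=0}^{k-1} A^j B u_{k-1-j}."""
    k = len(inputs)
    total = list(x0)
    for _ in range(k):                 # A^k x(0)
        total = mat_vec(A, total)
    for j in range(k):
        v = mat_vec(B, inputs[k - 1 - j])
        for _ in range(j):             # A^j B u_{k-1-j}
            v = mat_vec(A, v)
        total = vec_add(total, v)
    return total

x0 = [1.0, 0.0]
U = [[0.5], [-0.2], [0.1]]             # u_0, u_1, u_2
x_rec = predict_recursive(x0, U)
x_con = predict_condensed(x0, U)
```

Both functions return the same predicted state up to rounding, which is what allows the states to be eliminated from problem (1) so that only $U_N$ and $x(0)$ remain.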

In explicit model predictive control, mp-QP is used to partition the system state space into convex regions and to obtain the optimal explicit control law on each state partition, and the resulting data are stored. For the online computation, the solved state partitions and control-law data are stored on a computer; determining the partition that contains the current state and the corresponding control law, which gives the control voltage of the three-degree-of-freedom helicopter, requires only a table lookup. The input data (control voltages) and output data (altitude angle, pitch angle, rotation angle) are then assembled into a training set, a test set, and a validation set.
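The online table lookup can be illustrated with a small sketch. It assumes each critical region is stored as a polyhedron {x : Hx <= k} together with the gains of its affine law; the three regions below describe a hypothetical saturated feedback on two states and are not taken from the helicopter model. The sequential search is the simplest possible point-location scheme and is only one of several ways such a table could be searched.

```python
from typing import List, Optional, Tuple

# Each entry is (H, k, f, g): region {x : H x <= k} with law u = f . x + g.
Region = Tuple[List[List[float]], List[float], List[float], float]

# Hypothetical partition for u = sat(-(x1 + x2)) with limits -1 and +1.
REGIONS: List[Region] = [
    ([[1.0, 1.0], [-1.0, -1.0]], [1.0, 1.0], [-1.0, -1.0], 0.0),  # unsaturated
    ([[-1.0, -1.0]], [-1.0], [0.0, 0.0], -1.0),                   # lower limit
    ([[1.0, 1.0]], [-1.0], [0.0, 0.0], 1.0),                      # upper limit
]

def dot(a: List[float], b: List[float]) -> float:
    """Inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def in_region(H: List[List[float]], k: List[float], x: List[float],
              tol: float = 1e-9) -> bool:
    """True when every half-space inequality of the region holds at x."""
    return all(dot(row, x) <= ki + tol for row, ki in zip(H, k))

def lookup_control(x: List[float]) -> Optional[float]:
    """Find the region containing x and evaluate its affine control law."""
    for H, k, f, g in REGIONS:
        if in_region(H, k, x):
            return dot(f, x) + g
    return None  # x lies outside the stored partition

u_inside = lookup_control([0.2, 0.3])
u_low = lookup_control([2.0, 1.0])
u_high = lookup_control([-2.0, 0.0])
```

The cost of this search grows with the number of stored regions, which is the storage and lookup burden that the neural network is intended to replace.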

Step 3) Construct and train the EMPC-based neural network: a deep-neural-network model predictive control algorithm is designed that exploits the strengths of deep learning in multi-parameter optimization. The deep learning model predictive algorithm is a deep learning network that takes explicit model predictive control as its preliminary algorithm and is trained online on the data of its control results, thereby reconstructing the control law of the three-degree-of-freedom helicopter under different flight attitudes. The deep neural network $\phi(x \mid \theta)$ is composed of $L$ affine functions $\lambda_j(x)$ with parameters $(w_j, b_j)$, where $L$ is the number of layers of the neural network:

$$\lambda_j(x) = w_j x + b_j \tag{5}$$

Every affine function except the last, $\lambda_L(x)$, is followed by a nonlinear activation function $h$, because each node in the hidden layers of the neural network is associated with an activation function. The deep neural network is therefore expressed as:

$$\phi(x \mid \theta) = \lambda_L \circ h \circ \lambda_{L-1} \circ h \circ \cdots \circ h \circ \lambda_1(x) \tag{6}$$

where $\theta = \{(w_j, b_j)\}_{j=1}^{L}$ is the set of all parameters to be optimized, $w_j$ is the connection weight of the $j$-th layer, and $b_j$ is the threshold (bias). ReLU is chosen as the activation function because deep networks with ReLU activations work well in practice and give good results, which reduces the influence of the approximation error. Moreover, a network with ReLU activations and suitable parameters can represent the mp-QP solution exactly: the solution of the mp-QP is continuous and piecewise affine on polyhedra and is unique under positive definiteness, and a neural network with ReLU activations is likewise a continuous piecewise affine function on polyhedra.

The ReLU activation function is defined as:

$$y = \max\{0, x\} \tag{7}$$

Written as a piecewise linear function, the activation function is:

$$y = \begin{cases} x, & x \ge 0 \\ 0, & x < 0 \end{cases} \tag{8}$$
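The layer-by-layer composition in (5) and (6) can be sketched in a few lines of plain Python. The parameter set `theta` below is a hypothetical two-layer example (it reproduces the absolute value function) and is not a trained helicopter controller; the function names are assumptions made for illustration.

```python
def relu(v):
    """Element-wise ReLU, y = max{0, x}."""
    return [max(0.0, x) for x in v]

def affine(w, b, x):
    """Affine map lambda_j(x) = w_j x + b_j; w is a list of rows."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

def forward(theta, x):
    """Evaluate phi(x | theta): every layer except the last is followed by ReLU."""
    out = list(x)
    for j, (w, b) in enumerate(theta):
        out = affine(w, b, out)
        if j < len(theta) - 1:
            out = relu(out)
    return out

# Hypothetical parameters: one input, two hidden units, one output.
theta = [
    ([[1.0], [-1.0]], [0.0, 0.0]),   # lambda_1: R -> R^2
    ([[1.0, 1.0]], [0.0]),           # lambda_2: R^2 -> R
]

y_neg = forward(theta, [-2.0])
y_pos = forward(theta, [3.0])
```

With these parameters the network output is ReLU(x) + ReLU(-x), a continuous piecewise affine function with two regions, which gives a small concrete instance of the piecewise affine structure discussed above.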

The deep learning network constructed on this basis can exactly represent any explicit model predictive control law $u(k)$. This rests on two facts. First, every piecewise affine function $f(x)$ can be written as the difference of two convex piecewise affine functions:

$$f(x) = \gamma(x) - \eta(x) \tag{9}$$

where $\gamma(x)$ has $r_\gamma$ regions and $\eta(x)$ has $r_\eta$ regions. Second, a convex piecewise affine function $f(x)$ can be written as the pointwise maximum of $N$ affine functions $f_i(x)$:

$$f(x) = \max_{i = 1, \dots, N} f_i(x)$$
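Both facts can be illustrated on a scalar saturation function, which is piecewise affine but not convex. In the sketch below, the choice of `gamma` and `eta` and the identity max(a, b) = b + ReLU(a - b) are illustrative assumptions introduced here; the example is hypothetical and not taken from the helicopter controller.

```python
def relu(z):
    """ReLU activation, max{0, z}."""
    return max(0.0, z)

def max_via_relu(a, b):
    """max(a, b) = b + ReLU(a - b): one ReLU unit realises a two-term maximum."""
    return b + relu(a - b)

def gamma(x):
    """Convex piecewise affine part: maximum of the affine functions x and -1."""
    return max_via_relu(x, -1.0)

def eta(x):
    """Convex piecewise affine part: maximum of the affine functions x - 1 and 0."""
    return max_via_relu(x - 1.0, 0.0)

def f(x):
    """Saturation written as a difference of convex functions, gamma - eta."""
    return gamma(x) - eta(x)

def clip(x):
    """Reference saturation to [-1, 1] for comparison."""
    return min(1.0, max(-1.0, x))

samples = [i / 10.0 for i in range(-30, 31)]
max_gap = max(abs(f(x) - clip(x)) for x in samples)
```

Here `gamma` and `eta` each have two regions, and their difference matches the saturation on the whole sampled interval up to rounding.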

After the neural network structure has been constructed, the next step is to optimize the network parameters, which depends on the training set data. A number of different points are therefore selected at random in the state space, and the corresponding optimal control input $u(k)$ is obtained for each by repeatedly solving for the explicit model predictive control law; the resulting training set is used to train the neural network.
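A minimal sketch of this sampling step is given below. The function `explicit_law` is a hypothetical stand-in (a saturated linear feedback on two states) for the explicit model predictive control law, and the box bounds and sample count are arbitrary assumptions.

```python
import random

def explicit_law(x):
    """Hypothetical stand-in for the explicit control law u(k)."""
    return max(-1.0, min(1.0, -(x[0] + x[1])))

def sample_training_set(n_samples, bounds, seed=0):
    """Draw random states inside the box `bounds` and label each with the law."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        data.append((x, explicit_law(x)))
    return data

bounds = [(-2.0, 2.0), (-2.0, 2.0)]
dataset = sample_training_set(200, bounds)
```

Each element of `dataset` pairs a sampled state with the control input the law assigns to it, which is the input-output form a supervised training loop expects.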

The training process continuously and iteratively updates the weights and biases of the neurons in each layer, using an optimization algorithm and a loss function, to improve accuracy. Stochastic gradient descent optimizers such as the Adam optimizer, together with automatic parameter tuning, are used; these methods train on a subset (mini-batch) of the data set and compute the loss on it. The whole data set is thus replaced by mini-batches, and the loss gradient is computed by backpropagation to update the neural network parameters $\theta$ until the entire data set has been sampled and an ideal model is obtained. Training ends when the loss function has converged to the required degree and the accuracy has reached the expected level. The loss function is the mean square error between the predicted value and the actual value, which shows the relation between the two, and hence the accuracy, directly. Training thus establishes how the deep learning network maps inputs to outputs.
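The mini-batch procedure can be sketched without any deep learning framework. The example below uses plain stochastic gradient descent (not Adam) on a one-hidden-layer ReLU network, with a hypothetical saturated law as the target; the network size, batch size and data grid are assumptions, while the learning rate of 0.01 and 500 epochs follow the values given in the case analysis.

```python
import random

def target(x):
    """Hypothetical stand-in for the explicit control law (saturated feedback)."""
    return max(-1.0, min(1.0, -x))

def init_params(hidden, rng):
    """Random initial weights and biases for a 1 -> hidden -> 1 network."""
    return {
        "w1": [rng.uniform(-1.0, 1.0) for _ in range(hidden)],
        "b1": [rng.uniform(-1.0, 1.0) for _ in range(hidden)],
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(hidden)],
        "b2": 0.0,
    }

def forward(p, x):
    """Return the hidden ReLU activations and the scalar network output."""
    h = [max(0.0, w * x + b) for w, b in zip(p["w1"], p["b1"])]
    return h, sum(w * hi for w, hi in zip(p["w2"], h)) + p["b2"]

def mse(p, data):
    """Mean square error between predictions and targets."""
    return sum((forward(p, x)[1] - y) ** 2 for x, y in data) / len(data)

def sgd_step(p, batch, lr):
    """One mini-batch update; gradients are obtained by backpropagation."""
    n = len(p["w1"])
    g_w1, g_b1, g_w2, g_b2 = [0.0] * n, [0.0] * n, [0.0] * n, 0.0
    for x, y in batch:
        h, out = forward(p, x)
        d_out = 2.0 * (out - y) / len(batch)   # derivative of the batch MSE
        g_b2 += d_out
        for j in range(n):
            g_w2[j] += d_out * h[j]
            if h[j] > 0.0:                     # ReLU passes gradient only when active
                g_w1[j] += d_out * p["w2"][j] * x
                g_b1[j] += d_out * p["w2"][j]
    for j in range(n):
        p["w1"][j] -= lr * g_w1[j]
        p["b1"][j] -= lr * g_b1[j]
        p["w2"][j] -= lr * g_w2[j]
    p["b2"] -= lr * g_b2

rng = random.Random(0)
data = [(i / 10.0, target(i / 10.0)) for i in range(-20, 21)]
params = init_params(8, rng)
loss_before = mse(params, data)
for epoch in range(500):
    rng.shuffle(data)                          # new mini-batch order each epoch
    for i in range(0, len(data), 8):
        sgd_step(params, data[i:i + 8], lr=0.01)
loss_after = mse(params, data)
```

Comparing `loss_before` with `loss_after` gives the convergence check described above; a framework optimizer such as Adam would replace `sgd_step` in practice.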

Step 4) Model verification and application of the deep learning network: the model used for controller design is only an approximation of the actual system. To ensure that the trained model represents the data set accurately and to prevent overfitting, a validation set is needed to validate the model while its parameters are tuned; validation techniques include data partitioning, cross-validation, and analysis of residual plots. After the first round of training, even when the output error is below the set value, the control parameters of the deep learning network are not yet optimal. Several further rounds of training are still needed, adjusting the training parameters, the optimizer and the loss function, so that the accuracy of the deep learning network improves further and reaches the expected value. A trained neural network can accurately represent the explicit model predictive control feedback law $u(k)$, which makes it possible to reconstruct the control law of the three-degree-of-freedom helicopter under different flight attitudes and to control its flight attitude quickly and accurately.
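Two of the validation techniques named above, cross-validation and residual analysis, can be sketched as follows. The helper names and the fold count are assumptions for illustration, and `predict` stands for any trained model passed in by the caller.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; every sample is validated exactly once."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size

def residuals(predict, samples):
    """Prediction error on held-out samples, e.g. for a residual plot."""
    return [predict(x) - y for x, y in samples]

folds = list(k_fold_indices(10, 3))
example_residuals = residuals(lambda x: 2.0 * x, [(1.0, 2.5), (2.0, 4.0)])
```

Training on each `train` index set and inspecting the residuals on the matching `val` set indicates whether the model generalizes or merely fits the training data.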

The invention has the following advantages:

1. By combining model predictive control with deep learning, the invention addresses the high computing-resource demand and long computation time of traditional model predictive control, and improves control precision, prediction accuracy, and computational efficiency.

2. The method is applied to the control problem of the three-degree-of-freedom helicopter and has a good prediction effect.

3. The method rests on a firm theoretical basis, and its steps are simple and clear.

Drawings

Fig. 1 is a schematic view of a three-degree-of-freedom helicopter to which the method of the present invention relates.

Fig. 2 is a model diagram of a three-degree-of-freedom helicopter to which the method of the present invention relates.

FIG. 3 is an overall flow chart of the method of the present invention.

FIG. 4 is a state partition diagram of the present invention.

FIG. 5 is a diagram of the PWA function of the present invention.

FIG. 6 is a cost function graph of the present invention.

FIG. 7 is the structure of the neuron-layer-network of the present invention.

Fig. 8 is a graph of the loss function convergence of the present invention.

Fig. 9 is a loss function visualization of the present invention.

FIG. 10 is a structural analysis diagram of a visualization module of the present invention.

Detailed Description

The invention is further described below with reference to the accompanying drawings:

The three-degree-of-freedom helicopter model predictive control method based on deep learning disclosed by the invention comprises steps 1) to 4) as set out in the Disclosure of Invention above.

Case analysis

The invention is directed at the three-degree-of-freedom helicopter. Because of the helicopter's MIMO, high-order, and nonlinear characteristics, the aircraft is first controlled by explicit model predictive control to obtain output data, which are used as training data. The trained deep learning network is then tested on the altitude axis, the rotation axis, and the pitch axis to show the performance of deep learning combined with explicit model predictive control in the specific application of the three-degree-of-freedom helicopter, and the experimental results are compared to demonstrate the performance of the invention. Under control by the deep learning network, the control signal is fed back faster than under explicit model predictive control, the response time is shorter, and the autonomous learning capability of the control system improves the stability of the system during control.

The output data of the system are obtained by the explicit model predictive control method, and each input value is paired with the corresponding selected output value to form the data. The table is then analyzed to delete unnecessary data, correct abnormal data, and fill in missing data. Finally, a data set that meets the training requirements is constructed, converted to CSV format, and divided into a training set, a validation set, and a test set. The overall procedure can be seen in the overall flow chart of fig. 2: a neural network is built in TensorFlow, the data are imported, and normalization is then performed. Training is set to 500 epochs with a learning rate of 0.01, a mean square error loss function is defined, and an optimizer is created. As shown in fig. 5 of the accompanying drawings, after 500 epochs of training an error of 0.12 remains between the predicted and actual values. Although the final result deviates from the explicit model predictive control computation, the flight stability of the helicopter is not affected, and the helicopter can still maintain stable flight within this error range. In terms of solution time, the deep learning network achieves higher control efficiency than ordinary model prediction under the same parameters, and the burden on the computer in both storage and computation is greatly reduced.
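The normalization and data-set division described here can be sketched in plain Python rather than TensorFlow. The min-max scaling, the 70/15/15 split, and the sample voltage values below are assumptions chosen for illustration; the source does not state which normalization or split ratio was used.

```python
import random

def min_max_normalize(column):
    """Scale a list of numbers to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def split_dataset(rows, train_pct=70, val_pct=15, seed=0):
    """Shuffle rows and split them into training, validation, and test sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = len(rows) * train_pct // 100
    n_val = len(rows) * val_pct // 100
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]

voltages = [0.0, 2.5, 5.0, 10.0]        # hypothetical raw control voltages
scaled = min_max_normalize(voltages)
train_set, val_set, test_set = split_dataset(range(100))
```

The same two helpers could be applied to rows read from the CSV file before the data are handed to the training loop.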

According to the experimental results and program runs, under the same conditions and on the premise of stable operation, model control based on deep learning solves faster than ordinary explicit model predictive control and performs better in the explicit computation. In terms of control effect, the altitude angle, rotation angle and pitch angle of the three-degree-of-freedom helicopter are adjusted effectively, an ideal steady state is reached quickly, and the control performance is good.

The embodiments described in this specification merely illustrate implementations of the inventive concept. The scope of protection of the invention should not be regarded as limited to the specific forms set forth in the embodiments; it also extends to equivalent technical means that those skilled in the art may conceive on the basis of the inventive concept.

Claims

1. A three-degree-of-freedom helicopter explicit model predictive control method based on deep learning, comprising the following steps:

step 1) analyzing and dynamically modeling the helicopter system: analyzing the forces acting on each axis and component while the helicopter system is in operation, specifically:

the state-space equation of the three-degree-of-freedom helicopter system is $\dot{x} = Ax + Bu$, $y = Cx$; the altitude angle $\varepsilon$, the pitch angle $p$, the rotation angle $r$, and their respective derivatives, the altitude angular velocity $\dot{\varepsilon}$, the pitch angular velocity $\dot{p}$ and the rotational angular velocity $\dot{r}$, are selected as the state vector, i.e. $x = [\varepsilon \;\; p \;\; r \;\; \dot{\varepsilon} \;\; \dot{p} \;\; \dot{r}]^T$;

the voltages of the front and rear motors of the three-degree-of-freedom helicopter are selected as the input vector, i.e. $u = [V_f \;\; V_b]^T$, and the altitude angle $\varepsilon$, the pitch angle $p$ and the rotation angle $r$ as the output vector $y = [\varepsilon \;\; p \;\; r]^T$; substituting the values from the relevant parameter table gives the coefficient matrices $A$, $B$ and $C$ of the state equation;

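step 2) acquiring and processing data through the explicit model predictive control algorithm: the main idea of the offline computation in explicit predictive control is multi-parametric programming, applied to the constrained optimal control problem:

$$\min_{U_N} \; x_N^T P x_N + \sum_{k=0}^{N-1} \left( x_k^T Q x_k + u_k^T R u_k \right) \tag{1a}$$

$$\text{s.t.} \quad E x_k + L u_k \le M, \quad k = 0, 1, \dots, N-1 \tag{1b}$$

$$x_{k+1} = A x_k + B u_k, \quad k \ge 0 \tag{1c}$$

$$x_0 = x(0), \quad x_N \in \chi_f \tag{1d}$$

wherein $P, Q \ge 0$ and $R > 0$, $x_N$ is the terminal state, $u_k$ and $x_k$ are the control variable and the state variable at each sampling point, and $U_N = [u_0^T, \dots, u_{N-1}^T]^T$ is the vector of control inputs to be optimized; the problem is considered for the set of all initial states $x(0)$ that admit a feasible solution; $x_N \in \chi_f$ is the terminal state constraint, and the terminal region $\chi_f$ is a convex set;

since (1c) is linear, the predicted state is written as

$$x_k = A^k x(0) + \sum_{j=0}^{k-1} A^j B u_{k-1-j} \tag{2}$$

and substituting (2) into (1) transforms the constrained optimal control problem into the following optimization problem:

$$\min_{U_N} \; \tfrac{1}{2} U_N^T H U_N + x(0)^T F U_N + \tfrac{1}{2} x(0)^T Y x(0) \tag{3a}$$

$$\text{s.t.} \quad G U_N \le W + E x(0) \tag{3b}$$

wherein $Y$, $H$, $F$, $G$, $W$ and $E$ are obtained from (1); the solution of the multi-parametric quadratic programming problem is a piecewise affine function:

$$u(k) = f_i x(k) + g_i, \quad x(k) \in CR_i \tag{4}$$

in explicit model predictive control, mp-QP is used to partition the system state space into convex regions and to obtain the optimal explicit control law on each state partition, and the resulting data are stored; for the online computation, the solved state partitions and control-law data are stored on a computer, and determining the partition that contains the current state and the corresponding control law, which gives the control voltage of the three-degree-of-freedom helicopter, requires only a table lookup; the input data (control voltages) and output data (altitude angle, pitch angle and rotation angle) are assembled into a training set, a test set and a validation set;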

step 3) constructing and training the EMPC-based neural network: a deep-neural-network model predictive control algorithm is designed that exploits the strengths of deep learning in multi-parameter optimization; the deep learning model predictive algorithm is a deep learning network that takes explicit model predictive control as its preliminary algorithm and is trained online on the data of its control results, thereby reconstructing the control law of the three-degree-of-freedom helicopter under different flight attitudes; the deep neural network $\phi(x \mid \theta)$ is composed of $L$ affine functions $\lambda_j(x)$ with parameters $(w_j, b_j)$, wherein $L$ is the number of layers of the neural network;

$$\lambda_j(x) = w_j x + b_j \tag{5}$$

every affine function except the last, $\lambda_L(x)$, is followed by a nonlinear activation function $h$, so that the deep neural network is expressed as:

$$\phi(x \mid \theta) = \lambda_L \circ h \circ \lambda_{L-1} \circ h \circ \cdots \circ h \circ \lambda_1(x) \tag{6}$$

wherein $\theta = \{(w_j, b_j)\}_{j=1}^{L}$ is the set of all parameters to be optimized, $w_j$ is the connection weight of the $j$-th layer, and $b_j$ is the threshold (bias); ReLU is chosen as the activation function because deep networks with ReLU activations work well in practice and give good results, which reduces the influence of the approximation error; moreover, a network with ReLU activations and suitable parameters can represent the mp-QP solution exactly: the solution of the mp-QP is continuous and piecewise affine on polyhedra and is unique under positive definiteness, and a neural network with ReLU activations is likewise a continuous piecewise affine function on polyhedra;

the ReLU activation function is defined as:

$$y = \max\{0, x\} \tag{7}$$

written as a piecewise linear function, the activation function is:

$$y = \begin{cases} x, & x \ge 0 \\ 0, & x < 0 \end{cases} \tag{8}$$

the deep learning network constructed on this basis can exactly represent any explicit model predictive control law $u(k)$, which rests on two facts; first, every piecewise affine function $f(x)$ can be written as the difference of two convex piecewise affine functions:

$$f(x) = \gamma(x) - \eta(x) \tag{9}$$

wherein $\gamma(x)$ has $r_\gamma$ regions and $\eta(x)$ has $r_\eta$ regions; second, a convex piecewise affine function $f(x)$ can be written as the pointwise maximum of $N$ affine functions $f_i(x)$:

$$f(x) = \max_{i = 1, \dots, N} f_i(x)$$

after the neural network structure has been constructed, the next step is to optimize the network parameters, which depends on the training set data; a number of different points are therefore selected at random in the state space, and the corresponding optimal control input $u(k)$ is obtained for each by repeatedly solving for the explicit model predictive control law, the resulting training set being used to train the neural network;

the training process continuously and iteratively updates the weights and biases of the neurons in each layer, using an optimization algorithm and a loss function, to improve accuracy; stochastic gradient descent optimizers such as the Adam optimizer, together with automatic parameter tuning, are used, and these methods train on a subset (mini-batch) of the data set and compute the loss on it; the whole data set is thus replaced by mini-batches, and the loss gradient is computed by backpropagation to update the neural network parameters $\theta$ until the entire data set has been sampled and an ideal model is obtained; training ends when the loss function has converged to the required degree and the accuracy has reached the expected level; the loss function is the mean square error between the predicted value and the actual value, which shows the relation between the two, and hence the accuracy, directly; training thus establishes how the deep learning network maps inputs to outputs;

step 4) model verification and application of the deep learning network: the model used for controller design is only an approximation of the actual system; to ensure that the trained model represents the data set accurately and to prevent overfitting, a validation set is needed to validate the model while its parameters are tuned, the validation techniques including data partitioning, cross-validation and analysis of residual plots; after the first round of training, even when the output error is below the set value, the control parameters of the deep learning network are not yet optimal, and several further rounds of training are still needed, adjusting the training parameters, the optimizer and the loss function, so that the accuracy of the deep learning network improves further and reaches the expected value; a trained neural network can accurately represent the explicit model predictive control feedback law $u(k)$, which makes it possible to reconstruct the control law of the three-degree-of-freedom helicopter under different flight attitudes and to control its flight attitude quickly and accurately.