CN116382093A - Optimal control method and equipment for nonlinear system with unknown model
- Publication number
- CN116382093A (CN202310559968.7A)
- Authority
- CN
- China
- Prior art keywords
- equation
- nonlinear system
- optimal
- cost function
- control
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
Abstract
An optimal control method and equipment for a nonlinear system with an unknown model are provided. For a nonlinear system with an unknown model, an optimal cost function of the system is established, and a partial differential equation for solving the optimal cost function is determined; the partial differential equation is expanded according to an empirical data set of the nonlinear system to obtain a higher-order ordinary differential equation; function approximation is introduced into the higher-order ordinary differential equation to obtain a data-driven model of the nonlinear system; and the data-driven model is processed iteratively according to the set constraints to determine the optimal control of the nonlinear system. The method effectively addresses the curse of dimensionality caused by the large amount of computation, converges quickly, and improves the efficiency of nonlinear system control.
Description
Technical Field
The invention belongs to the field of control planning, and in particular relates to an optimal control method and equipment for a nonlinear system with an unknown model.
Background
Control theory, as used in control systems engineering, is a subfield of mathematics that deals with the control of continuously operating dynamical systems in engineered processes and machines. Its aim is to develop a control strategy that drives such a system in an optimal way using control actions, without delay or overshoot, while ensuring control stability.
For example, optimization-based control and estimation techniques, such as Model Predictive Control (MPC), allow for a model-based design framework in which system dynamics and constraints can be considered directly. MPC is used in many applications to control dynamical systems of varying complexity. Examples of such systems include production lines, automotive engines, robots, numerically controlled machining, motors, satellites, and generators. However, in many cases the model of the controlled system is nonlinear and may be difficult to design, difficult to use in real time, or inaccurate. Such cases are common in robotics, building control (HVAC), smart grids, factory automation, transportation, self-tuning machines, and traffic networks. In addition, even if the nonlinear model is fully available, designing an optimal controller is inherently challenging because it requires solving a partial differential equation known as the Hamilton-Jacobi-Bellman (HJB) equation.
Finding the optimal control law of a general nonlinear system requires solving the HJB partial differential equation. Various conventional solutions exist for the optimal control problem of a dynamic system with a performance index, or so-called cost function, but they have two drawbacks. On the one hand, solving the HJB equation has an inherent computational complexity that grows exponentially with the state dimension, i.e., the "curse of dimensionality". On the other hand, conventional solutions depend on an accurate system model and cannot be applied to systems that are difficult to model. Thus, optimal control that does not depend on a mathematical model remains an active research topic.
Disclosure of Invention
In view of the foregoing problems of the prior art, it is an object of the present invention to provide an optimal control method and equipment for a nonlinear system with an unknown model, which can improve the computational efficiency of optimal control of the nonlinear system.
In order to solve the technical problems, the specific technical scheme is as follows:
In one aspect, provided herein is an optimal control method for a nonlinear system with an unknown model, the method comprising:
for a nonlinear system with an unknown model, establishing an optimal cost function of the system, and determining a partial differential equation for solving the optimal cost function;
expanding the partial differential equation according to an empirical data set of the nonlinear system to obtain a higher-order ordinary differential equation;
introducing function approximation into the higher-order ordinary differential equation to obtain a data-driven model of the nonlinear system;
and iteratively processing the data-driven model according to the set constraints to determine the optimal control of the nonlinear system.
Further, establishing, for a nonlinear system with an unknown model, an optimal cost function of the system and determining a partial differential equation for solving the optimal cost function includes:
establishing a state equation of the nonlinear system with unknown model: $\dot{x}(t) = f(x(t)) + g(x(t))u(t)$, $x(t_0) = x_0$, where $x$ is the state variable of the system, $u$ is the control input of the system, $f$ is the system dynamics function, and $g$ is the input function of the system;
determining an optimal cost function $V(x, t)$ of the nonlinear system according to the dynamic constraints of the nonlinear system and the state equation of the nonlinear system, where $t \in [t_0, t_f]$ and $u[t, t_f]$ indicates that the control input $u$ is restricted to the time interval $[t, t_f]$;
Further, expanding the partial differential equation according to the empirical data set of the nonlinear system to obtain a higher-order ordinary differential equation includes:
establishing the empirical data set from the historical input and output data of the nonlinear system;
determining, from the empirical data set and the partial differential equation, a saturated state equation and a saturated partial differential equation based on the state trajectory and the control input, as follows:
and obtaining, from the saturated state equation, the saturated partial differential equation, the state equation, and the cost function, the higher-order ordinary differential equation of the optimal cost function through a differential dynamic programming algorithm, as follows:
Further, introducing function approximation into the higher-order ordinary differential equation to obtain a data-driven model of the nonlinear system includes:
determining basis functions used to approximate the optimal cost function, as follows:
determining an estimation function of the optimal control according to the basis functions;
and substituting the estimation function into the higher-order ordinary differential equation to obtain a higher-order differential dynamic approximation.
Further, after substituting the estimation function into the higher-order ordinary differential equation to obtain the higher-order differential dynamic approximation, the method further includes:
determining, according to the higher-order differential dynamic approximation, an algebraic matrix equation satisfied by the weight approximation;
and, when the persistent excitation condition is met, optimizing the algebraic matrix equation to obtain a target algebraic matrix equation for calculating the weight approximation.
Further, iteratively processing the data-driven model according to the set constraints to determine the optimal control of the nonlinear system includes:
defining an approximately estimated Hamiltonian and a control input;
and performing differential dynamic iteration on the higher-order differential dynamic approximation according to the definitions of the approximately estimated Hamiltonian and the control input, so as to determine the optimal control of the nonlinear system.
Further, performing differential dynamic iteration on the higher-order differential dynamic approximation according to the definitions of the approximately estimated Hamiltonian and the control input, so as to determine the optimal control of the nonlinear system, includes:
step 1: setting initial parameter values and calculating an initial cost function value, wherein the initial parameter values include at least an initial control input and an initial state variable;
step 2: calculating an iteration control input of the nonlinear system according to the initial parameter values;
step 3: determining an iteration state variable of the nonlinear system according to the iteration control input and the state equation;
step 4: calculating an iteration weight approximation according to the target algebraic matrix equation and the iteration count, and judging whether a first convergence condition is met; if not, returning to step 3; if so, proceeding to step 5;
step 5: calculating an iteration cost function value according to the iteration state variable and the optimal cost function, and judging whether the iteration cost function value meets a second convergence condition; if not, feeding the iteration control input back into step 2 for a further control-input iteration; if so, determining the iteration control input as the target control input.
In another aspect, provided herein is an optimal control apparatus for a nonlinear system with an unknown model, the apparatus comprising:
a partial differential equation determining module, configured to establish, for a nonlinear system with an unknown model, an optimal cost function of the system and to determine a partial differential equation for solving the optimal cost function;
a higher-order ordinary differential equation determining module, configured to expand the partial differential equation according to the empirical data set of the nonlinear system to obtain a higher-order ordinary differential equation;
a data-driven model determining module, configured to introduce function approximation into the higher-order ordinary differential equation to obtain a data-driven model of the nonlinear system;
and an optimal control module, configured to iteratively process the data-driven model according to the set constraints to determine the optimal control of the nonlinear system.
In another aspect, provided herein is an optimal control device for a nonlinear system with an unknown model, the device comprising:
an input interface configured to receive a state trajectory of the nonlinear system;
a memory;
a processor configured to perform the method described above and generate control instructions;
an output interface configured to send the control instructions to actuators of the nonlinear system to control operation of the system.
Finally, a computer-readable storage medium is provided herein, which stores a computer program that, when executed by a processor, implements the method described above.
With the above technical scheme, the optimal control method and equipment described herein establish, for a nonlinear system with an unknown model, an optimal cost function of the system and determine a partial differential equation for solving the optimal cost function; expand the partial differential equation according to the empirical data set of the nonlinear system to obtain a higher-order ordinary differential equation; introduce function approximation into the higher-order ordinary differential equation to obtain a data-driven model of the nonlinear system; and iteratively process the data-driven model according to the set constraints to determine the optimal control of the nonlinear system. The method effectively addresses the curse of dimensionality caused by the large amount of computation, converges quickly, and improves the efficiency of nonlinear system control.
The foregoing and other objects, features and advantages will be apparent from the following more particular description of preferred embodiments, as illustrated in the accompanying drawings.
Drawings
In order to more clearly illustrate the embodiments herein or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments herein and that other drawings may be obtained according to these drawings without inventive effort to a person skilled in the art.
FIG. 1 shows a schematic overview of the principles used by some embodiments for controlling the operation of a system;
FIG. 2 shows a schematic step diagram of the optimal control method for a nonlinear system with an unknown model provided by embodiments herein;
FIG. 3 shows a flow diagram of the optimal control method for a nonlinear system with an unknown model provided by embodiments herein;
FIG. 4 shows a comparison of the state trajectories of a system under the initial control input and the optimal control input in an embodiment herein;
FIG. 5 shows a comparison of the state trajectories of the system of another embodiment herein under the initial control input and the optimal control input;
FIG. 6 shows the initial cost function $V_0$ and the optimal cost function $V_{17}$ of a system in embodiments herein;
FIG. 7 shows a schematic diagram of the optimal control apparatus for a nonlinear system with an unknown model provided by embodiments herein;
FIG. 8 shows a schematic structural diagram of a control device provided by embodiments herein.
Description of reference numerals:
100. control device; 102. system; 104. model; 106. control instruction;
701. partial differential equation determination module; 702. higher-order ordinary differential equation determination module; 703. data-driven model determination module; 704. optimal control module.
Detailed Description
The following description of the embodiments of the present disclosure will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the disclosure. All other embodiments, based on the embodiments herein, which a person of ordinary skill in the art would obtain without undue burden, are within the scope of protection herein.
It should be noted that the terms "first," "second," and the like in the description and claims herein and in the foregoing figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments described herein may be capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, apparatus, article, or device that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or device.
FIG. 1 shows a schematic overview of the principles used by some embodiments for controlling the operation of a system. Some embodiments provide a control device 100 configured to control a system 102. For example, the apparatus 100 may be configured to control a continuously operating dynamical system 102 in engineering processes and machines. Hereinafter, "control device" and "device" are used interchangeably and have the same meaning, as are "continuously operating dynamical system" and "system". Examples of the system 102 include HVAC systems, LIDAR systems, condensing units, production lines, self-tuning machines, smart grids, automotive engines, robots, numerically controlled machining, motors, satellites, generators, traffic networks, and the like. In some embodiments, the apparatus 100 develops control instructions 106 that control the system 102 in an optimal manner using control actions, without delay or overshoot, while ensuring control stability.
In some implementations, the apparatus 100 uses model-based and/or optimization-based control and estimation techniques, such as Model Predictive Control (MPC), to develop the control commands 106 for the system 102. Model-based techniques may be advantageous for control of dynamic systems. For example, MPC allows for a model-based design framework in which the dynamics and constraints of the system 102 can be directly considered. The MPC develops control commands 106 based on the model 104 of the system. The model 104 of the system refers to the dynamics of the system 102 described using differential equations. In some implementations, the model 104 is non-linear and may be difficult to design and/or difficult to use in real-time. For example, even if a nonlinear model is fully available, estimating the optimal control commands 106 is inherently a challenging task because it is computationally challenging to solve a Partial Differential Equation (PDE) (known as the Hamilton-Jacobi-Bellman (HJB) equation) that describes the dynamics of the system 102.
Some embodiments use data-driven control techniques to design the model 104. The data-driven technique utilizes operational data generated by the system 102 in order to build a feedback control strategy that stabilizes the system 102.
Further, the embodiments provide an optimal control method for a nonlinear system with an unknown model, which effectively addresses the curse of dimensionality caused by the large amount of computation and converges quickly. FIG. 2 is a schematic diagram of the steps of the method provided by embodiments herein. Although the present specification provides the method steps as described in the embodiments or flowcharts, more or fewer steps may be included based on conventional or non-inventive effort. The order of steps recited in the embodiments is only one of many possible execution orders and does not represent the only one. When an actual system or device product executes the method, the steps may be executed sequentially or in parallel according to the method shown in the embodiments or the drawings. As shown in FIG. 2, the method may include:
S201: for a nonlinear system with an unknown model, establishing an optimal cost function of the system, and determining a partial differential equation for solving the optimal cost function;
S202: expanding the partial differential equation according to the empirical data set of the nonlinear system to obtain a higher-order ordinary differential equation;
S203: introducing function approximation into the higher-order ordinary differential equation to obtain a data-driven model of the nonlinear system;
S204: iteratively processing the data-driven model according to the set constraints to determine the optimal control of the nonlinear system.
It will be appreciated that, for a nonlinear system, the Hamilton-Jacobi-Bellman (HJB) partial differential equation is expanded into higher-order ordinary differential equations, referred to as the DDP expansion, using Differential Dynamic Programming (DDP) techniques. Function approximation is then introduced into the DDP expansion to form an actor-critic structure and construct a data-driven model. Based on the data-driven model, a DDP iterative algorithm with a rigorous convergence proof is developed. The algorithm provided herein overcomes the technical obstacle posed by the time-varying behavior of the HJB partial differential equation under a finite-horizon cost function.
In the embodiment of the present specification, establishing, for a nonlinear system with an unknown model, an optimal cost function of the system and determining a partial differential equation for solving the optimal cost function includes:
establishing a state equation of the nonlinear system with unknown model: $\dot{x}(t) = f(x(t)) + g(x(t))u(t)$, $x(t_0) = x_0$, where $x$ is the state variable of the system, $u$ is the control input of the system, $f$ is the system dynamics function, and $g$ is the input function of the system;
determining an optimal cost function $V(x, t)$ of the nonlinear system according to the dynamic constraints of the nonlinear system and the state equation of the nonlinear system, where $t \in [t_0, t_f]$ and $u[t, t_f]$ indicates that the control input $u$ is restricted to the time interval $[t, t_f]$;
determining a partial differential equation for solving the optimal cost function.
Illustratively, the nonlinear system may be as follows:
$$\dot{x}(t) = f(x(t)) + g(x(t))u(t), \quad x(t_0) = x_0 \qquad (1)$$
where $x$ is the state variable of the system, $u$ is the control input of the system, $f$ is the system dynamics function, and $g$ is the input function of the system. It is assumed that $f(x) + g(x)u$ satisfies the Lipschitz continuity condition, and $U$ denotes the closed bounded set of all saturated inputs, where $\gamma$ is the constraint bound. For a fixed time interval $T = [t_0, t_f]$, the cost function associated with the system shown in equation (1) is defined as:
where $Q(x)$ is a positive definite function, $W(u)$ is a non-negative multiplicative function, and $\tau$ is the integration variable.
The objective of the optimal control problem is to design a constrained optimal control $u^*$ such that the cost function (2) satisfies:
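As a rough illustration of the system (1) and its cost, the following Python sketch rolls out a control-affine system with forward Euler and accumulates a running cost. It assumes that the cost (2) has the integral form $\int_{t_0}^{t_f} (Q(x) + W(u))\,d\tau$ with no terminal term, which matches the running-cost terms in the Hamiltonian (9) given later but is not confirmed by the text. The function name `simulate_cost`, the Euler scheme, and the step count are illustrative choices, not part of the described method.

```python
import numpy as np

def simulate_cost(f, g, Q, W, policy, x0, t0, tf, steps=2000):
    """Forward-Euler rollout of x' = f(x) + g(x) u, accumulating the running cost.

    f(x) -> (n,), g(x) -> (n, m), policy(x, t) -> (m,); all are user-supplied
    callables (hypothetical interface, chosen for this sketch only).
    """
    dt = (tf - t0) / steps
    x = np.asarray(x0, dtype=float)
    cost = 0.0
    for k in range(steps):
        u = np.asarray(policy(x, t0 + k * dt), dtype=float)
        cost += (Q(x) + W(u)) * dt       # left Riemann sum of Q(x) + W(u)
        x = x + dt * (f(x) + g(x) @ u)   # Euler step of the state equation
    return x, cost
```

For example, with `f = lambda x: -x`, a zero input, `Q(x) = x @ x`, and `W = 0` on the interval `[0, 1]`, the accumulated cost should approach the analytic value `(1 - exp(-2)) / 2` as `steps` grows.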
$$J(x_0, t_0, u) \ge J(x_0, t_0, u^*) \qquad (3)$$
under the dynamic constraints of the system (1). The following generalized non-quadratic function is employed to handle the input constraints:
where $r_i > 0$, $i = 1, 2, \ldots, m$, are positive weight factors.
Formula (4) can be rewritten in the following compact form:
where $R = \mathrm{diag}(r_1, r_2, \ldots, r_m)$, $v = (v_1, v_2, \ldots, v_m)^T$, and $\tanh^{-1}(v/\gamma) = (\tanh^{-1}(v_1/\gamma), \tanh^{-1}(v_2/\gamma), \ldots, \tanh^{-1}(v_m/\gamma))^T$.
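Formula (4) itself is not legible in this text. The sketch below assumes the form commonly used for saturated inputs in the constrained optimal control literature, $W(u) = 2\sum_{i=1}^{m} r_i \int_0^{u_i} \gamma \tanh^{-1}(v_i/\gamma)\,dv_i$, which is consistent with the symbols $r_i$, $v$, $\gamma$, and $\tanh^{-1}$ defined above. Under that assumption the integral has a closed form, which the code evaluates and cross-checks against a trapezoidal rule. The function names are hypothetical.

```python
import numpy as np

def input_penalty(u, r, gamma):
    """Closed form of W(u) = 2 * sum_i r_i * int_0^{u_i} gamma * atanh(v / gamma) dv.

    Assumed form of formula (4); valid only for |u_i| < gamma.
    """
    u = np.asarray(u, dtype=float)
    r = np.asarray(r, dtype=float)
    if np.any(np.abs(u) >= gamma):
        raise ValueError("each |u_i| must be strictly below gamma")
    s = u / gamma
    # int_0^u gamma*atanh(v/gamma) dv = gamma*u*atanh(u/gamma) + (gamma^2/2)*ln(1 - u^2/gamma^2)
    per_channel = gamma * u * np.arctanh(s) + 0.5 * gamma**2 * np.log1p(-s**2)
    return float(2.0 * np.sum(r * per_channel))

def input_penalty_quadrature(u, r, gamma, n=20001):
    """Trapezoidal evaluation of the same integral, used only to check the closed form."""
    total = 0.0
    for ui, ri in zip(np.atleast_1d(u), np.atleast_1d(r)):
        v = np.linspace(0.0, ui, n)
        y = gamma * np.arctanh(v / gamma)
        total += 2.0 * ri * float(np.sum((y[1:] + y[:-1]) * np.diff(v)) / 2.0)
    return total
```

The penalty is zero at `u = 0`, positive elsewhere, and grows steeply as any component approaches the bound `gamma`, which is what discourages saturation.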
The optimal control problem is described with the following optimal cost function:
where $t \in [t_0, t_f]$ and $u[t, t_f]$ indicates that the control input $u$ is restricted to the time interval $[t, t_f]$. Assuming that $V(x, t)$ is continuously differentiable, the optimal cost function satisfies the following HJB partial differential equation:
for all states and times in the domain considered. The optimal control strategy $u^*$ can be obtained by differentiating the HJB equation with respect to the control input $u$, as follows:
The Hamiltonian of the optimal control problem is defined as:
$$H(x, u, \lambda) = Q(x) + W(u) + \lambda^T (f(x) + g(x)u) \qquad (9)$$
where $\lambda$ is a vector parameter. The HJB equation (7) can then be rewritten as:
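To make the role of the Hamiltonian (9) concrete, the sketch below evaluates $H(x, u, \lambda)$ for a hypothetical two-state, single-input system and checks numerically that the saturated control $u = -\gamma \tanh\big(g^T\lambda / (2\gamma r)\big)$ minimizes it over the admissible inputs. That expression follows from setting $\partial H / \partial u = 0$ only under the assumed non-quadratic penalty $W(u) = 2 r \int_0^{u} \gamma \tanh^{-1}(v/\gamma)\,dv$; the drift `f`, the input column `g`, and the constants are made up for illustration.

```python
import numpy as np

GAMMA = 1.0  # hypothetical saturation bound
R = 1.0      # hypothetical scalar weight r_1

def f(x):
    return np.array([x[1], -x[0] - 0.5 * x[1]])  # hypothetical drift dynamics

def g(x):
    return np.array([0.0, 1.0])                  # hypothetical single input column

def Q(x):
    return float(x @ x)                          # positive definite state cost

def W(u):
    # Closed form of the assumed penalty 2*R*int_0^u GAMMA*atanh(v/GAMMA) dv
    s = u / GAMMA
    return 2.0 * R * (GAMMA * u * np.arctanh(s) + 0.5 * GAMMA**2 * np.log1p(-s**2))

def hamiltonian(x, u, lam):
    """H(x, u, lam) = Q(x) + W(u) + lam^T (f(x) + g(x) u), as in (9)."""
    return Q(x) + W(u) + float(lam @ (f(x) + g(x) * u))

def saturated_minimizer(x, lam):
    """Stationary point of H in u under the assumed W; always inside (-GAMMA, GAMMA)."""
    return -GAMMA * np.tanh(float(g(x) @ lam) / (2.0 * GAMMA * R))
```

Because the assumed $W$ is strictly convex in $u$ on $(-\gamma, \gamma)$, the stationary point is the unique minimizer, so no separate clipping step is needed to respect the input bound.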
In the embodiment of the present specification, expanding the partial differential equation according to the empirical data set of the nonlinear system to obtain a higher-order ordinary differential equation includes:
establishing the empirical data set from the historical input and output data of the nonlinear system;
determining, from the empirical data set and the partial differential equation, a saturated state equation and a saturated partial differential equation based on the state trajectory and the control input, as follows:
and obtaining, from the saturated state equation, the saturated partial differential equation, the state equation, and the cost function, the higher-order ordinary differential equation of the optimal cost function through a differential dynamic programming algorithm, as follows:
Illustratively, a trial control input is first selected, and the state trajectory it generates from the initial value $x_0$ is taken as the nominal trajectory. For the initial value $x_0$, $u^*$ denotes the unknown optimal control. Thus, under any saturated input $u(t) \in U$, $t \in [t_0, t_f]$, the state trajectory $x(t)$ can be parameterized as:
where the two deviation terms are the state error of the system and the error of the control input, respectively. The state equation and the HJB equation can then be written as follows:
DDP expansion: based on the state equation (1) and the cost function (2), let $d_i$ be a column vector and let $G = ((g_1)_x, (g_2)_x, \ldots, (g_m)_x)$, where $g_i$ is the $i$-th column vector of $g$, $i = 1, 2, \ldots, m$. Then the optimal cost function $V$ and its partial derivatives $V_x$ and $V_{xx}$ satisfy the following:
where $S_{12} = V_{xx} g$ and $S_{22} = W_{uu}$, together with the associated boundary condition.
In equations (14) to (16), the functions $V, V_x, V_{xx}, f, f_x, g, G, Q, Q_x, Q_{xx}, W, W_{uu}$ are all evaluated on the nominal trajectory; their arguments are omitted for simplicity.
In the embodiment of the present specification, introducing function approximation into the higher-order ordinary differential equation to obtain a data-driven model of the nonlinear system includes:
determining basis functions used to approximate the optimal cost function, as follows:
determining an estimation function of the optimal control according to the basis functions;
and substituting the estimation function into the higher-order ordinary differential equation to obtain a higher-order differential dynamic approximation.
Illustratively, to obtain the optimal cost function $V$ when the system model $f$ is unknown, the unknown functions are approximated with sets of linearly independent basis functions, defined as follows:
where the approximations use two sets of basis functions and two corresponding sets of weights; $N_a$ and $N_b$ are the numbers of basis functions in the respective sets; and $e_{ia}$ and $e_b$ are the approximation errors. As $N_a$ and $N_b$ each approach infinity, the approximation errors $e_{ia}$ and $e_b$ converge uniformly to zero. Since the exact values of the weights are unknown, estimation functions are defined in which the weights are replaced by a set of weight estimates.
For simplicity, a compact form is defined; with the symbols defined above, the following compact expressions are obtained:
Substituting the estimation function into the second-order expansions (14)-(15) or the third-order expansions (14)-(16) yields a second-order or third-order DDP approximation.
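The idea of replacing an unknown function by a weighted sum of basis functions can be sketched as follows. This is a minimal, static example and not the patent's estimator: it assumes a hypothetical quadratic polynomial basis for a two-dimensional state and fits time-invariant weights by ordinary least squares, whereas the weights in the finite-horizon setting described above are in general time-varying. All names are illustrative.

```python
import numpy as np

def basis(x):
    """Hypothetical polynomial basis phi(x) for a two-dimensional state."""
    x1, x2 = x
    return np.array([x1**2, x1 * x2, x2**2])

def fit_weights(states, targets):
    """Least-squares weight estimate w_hat such that targets ~ Phi @ w_hat."""
    Phi = np.stack([basis(x) for x in states])   # shape (samples, number of basis functions)
    w_hat, *_ = np.linalg.lstsq(Phi, np.asarray(targets, dtype=float), rcond=None)
    return w_hat

def estimate(x, w_hat):
    """Evaluate the estimation function w_hat^T phi(x)."""
    return float(w_hat @ basis(x))
```

When the target function lies in the span of the basis, the fit recovers it exactly; otherwise the residual plays the role of the approximation error, which shrinks as the basis is enlarged.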
In the embodiment of the present specification, after substituting the estimation function into the higher-order ordinary differential equation to obtain the higher-order differential dynamic approximation, the method further includes:
determining, according to the higher-order differential dynamic approximation, an algebraic matrix equation satisfied by the weight approximation;
and, when the persistent excitation condition is met, optimizing the algebraic matrix equation to obtain a target algebraic matrix equation for calculating the weight approximation.
Illustratively, without loss of generality, a second-order DDP approximation is provided, as follows:
Second-order DDP approximation: based on the second-order DDP expansions (14)-(15), the weight estimates satisfy, on any time interval, the following algebraic matrix equation:
Furthermore, if the persistent excitation (PE) condition is satisfied, i.e., there exist a constant $\rho > 0$ and a number of time intervals such that the corresponding excitation inequality holds, then the following can be obtained:
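A common way to solve a linear algebraic equation for weight estimates from data collected over several time intervals is least squares, guarded by an excitation check. The sketch below assumes that the data from each interval contribute one regressor row and one scalar target, and it interprets the PE condition as a lower bound $\rho$ on the smallest eigenvalue of the Gram matrix of the stacked regressors. Both the data layout and the function name `solve_weights` are assumptions made for illustration, since the exact matrices are not legible in this text.

```python
import numpy as np

def solve_weights(thetas, xis, rho):
    """Least-squares weights from stacked interval data, guarded by a PE-style check.

    thetas: (intervals, p) regressor rows; xis: (intervals,) targets;
    rho: required lower bound on the smallest eigenvalue of Theta^T Theta.
    """
    Theta = np.asarray(thetas, dtype=float)
    Xi = np.asarray(xis, dtype=float)
    gram = Theta.T @ Theta
    min_eig = float(np.linalg.eigvalsh(gram)[0])   # eigenvalues are returned in ascending order
    if min_eig < rho:
        raise ValueError("data are not sufficiently exciting: min eigenvalue below rho")
    return np.linalg.solve(gram, Theta.T @ Xi)      # normal equations of the least-squares problem
```

If the check fails, more intervals or a richer exploratory input would be needed before the weights can be identified reliably.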
Further, performing iterative processing on the data-driven model according to the set constraint to determine the optimal control of the nonlinear system includes:
defining an approximately estimated Hamiltonian and a control input;
and performing differential dynamic iteration on the higher-order differential dynamic approximation according to the definitions of the approximately estimated Hamiltonian and the control input, so as to determine the optimal control of the nonlinear system.
Illustratively, the following definitions are given first:
Definition 2: the approximately estimated Hamiltonian is defined as:
Furthermore, the exact weight of the basis functions of the system dynamics $f(x)$ with unknown model is defined as $A^*$, i.e., $f(x) = A^* \Psi(x)$.
where $\gamma$ is the bound on the control input.
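One way to picture a control input bounded by $\gamma$ is a smooth saturation. The sketch below uses a hyperbolic-tangent form that is common in the constrained optimal control literature; it is an assumption, since the patent's own definition of the control input is not legible here and may differ. The argument `grad_term` is a hypothetical stand-in for the gradient-dependent quantity that would drive the control, and the defaults `gamma=0.5` and `r=1.0` simply reuse the values quoted in the numerical example of this embodiment.

```python
import math


def saturated_control(grad_term, gamma=0.5, r=1.0):
    """Assumed tanh-type saturation: the output magnitude never exceeds gamma."""
    return -gamma * math.tanh(grad_term / (2.0 * gamma * r))


samples = [-100.0, -1.0, 0.0, 1.0, 100.0]
controls = [saturated_control(s) for s in samples]
print(controls)
```

For small `grad_term` the law behaves like the unconstrained linear feedback `-grad_term / (2 * r)`, and it flattens out as the bound is approached.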
Further, performing differential dynamic iteration on the higher-order differential dynamic approximation according to the definitions of the approximately estimated Hamiltonian and the control input, so as to determine the optimal control of the nonlinear system, includes:
Step 1: setting initial parameter values and calculating an initial cost function value, wherein the initial parameter values include at least an initial control input and an initial state variable;
Step 2: calculating an iterated control input of the nonlinear system according to the initial parameter values;
Step 3: determining an iterated state variable of the nonlinear system according to the iterated control input and the state equation;
Step 4: calculating an iterated weight approximation according to the target algebraic matrix equation and the iteration count, and judging whether a first convergence condition is met; if not, returning to Step 3; if so, proceeding to Step 5;
Step 5: calculating an iterated cost function value according to the iterated state variable and the optimal cost function, and judging whether the iterated cost function value meets a second convergence condition; if not, feeding the iterated control input back into Step 2 for a further control input iteration; if so, determining the iterated control input as the target control input.
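The control flow of Steps 1 to 5 is a nested loop: an inner loop that repeats Steps 3 and 4 until the weights settle, and an outer loop that repeats from Step 2 until the cost settles. The following sketch shows only that control flow. The four callables and the scalar toy functions passed to them are hypothetical stand-ins chosen so that the loops terminate; they do not implement the patent's state equation, matrix equation, or cost function.

```python
def run_iteration(u0, w0, simulate, update_weights, update_control, cost,
                  eps_w=1e-10, eps_v=1e-8, max_iter=200):
    """Nested-loop skeleton of Steps 1-5; every callable is a hypothetical stand-in."""
    u, w = u0, w0
    x = simulate(u)
    v_prev = cost(x, u)                      # Step 1: initial cost function value
    for _ in range(max_iter):
        u = update_control(u, w)             # Step 2: iterated control input
        for _ in range(max_iter):
            x = simulate(u)                  # Step 3: iterated state variable
            w_new = update_weights(x, u, w)  # Step 4: iterated weight approximation
            settled = abs(w_new - w) < eps_w
            w = w_new
            if settled:                      # first convergence condition met
                break
        v = cost(x, u)                       # Step 5: iterated cost function value
        if abs(v - v_prev) < eps_v:          # second convergence condition met
            return u, w, v
        v_prev = v
    raise RuntimeError("no convergence within max_iter outer iterations")


# Scalar toy functions, used only to exercise the loop structure.
u_star, w_star, v_star = run_iteration(
    u0=0.5,
    w0=0.0,
    simulate=lambda u: 1.0 + u,
    update_weights=lambda x, u, w: 0.5 * w + 0.5 * x,
    update_control=lambda u, w: 0.5 * u - 0.1 * w,
    cost=lambda x, u: x * x + u * u,
)
print(u_star, w_star, v_star)
```

Keeping two separate tolerances mirrors the two convergence conditions in the text: one on the weights inside the inner loop and one on the cost in the outer loop.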
Illustratively, the iterative process is as follows:
S301: select initial values and a convergence accuracy $\varepsilon > 0$. Let $x_0(t)$, $t \in [t_0, t_f]$, be the state of the system corresponding to the initially given control input, which satisfies the following equation:
S303: calculate $x_{i+1}$ by solving the following equation:
Illustratively, the nonlinear system in this embodiment is selected as follows:
defining a cost function as:
where $\gamma = 0.5$ and $r = 1$. The optimal cost function of the approximated system is:
According to step S204, the data-driven differential dynamic programming algorithm is applied, with the initial condition of the system set to $x_0 = [2, 1]^T$ and the constraint on the control input set to $|u| < 0.5$.
Starting from the initial cost function, after 17 iterations the values of the weight vector are obtained as:
Based on equation (24), the optimal control input at the 17th iteration is found to be:
Figs. 3 and 4 compare the state trajectories of the system in this embodiment under the initial control input and the optimal control input, respectively, and Fig. 5 shows the initial cost function $V_0$ and the optimal cost function $V_{17}$ of the system in this embodiment.
Compared with the prior art, the embodiment of the specification has the following advantages and effects:
1. based on a differential dynamic programming algorithm, the invention expands the HJB partial differential equation into higher-order ordinary differential equations and builds a new data-driven model;
2. the method effectively solves the problem of the curse of dimensionality caused by the large computational load, and the algorithm converges quickly;
3. the invention overcomes the technical obstacle posed by the time-varying behavior of the HJB equation that results from the finite-horizon cost function.
On the basis of the method provided above, an embodiment of the present disclosure further provides an optimal control apparatus for a nonlinear system with unknown model, as shown in Fig. 7, where the apparatus includes:
the partial differential equation determining module 701 is configured to establish an optimal cost function of a nonlinear system with unknown model, and determine a partial differential equation for solving the optimal cost function;
the higher-order ordinary differential equation determining module 702 is configured to perform expansion processing on the partial differential equation according to the empirical data set of the nonlinear system, so as to obtain a higher-order ordinary differential equation;
the data-driven model determining module 703 is configured to introduce the higher-order ordinary differential equation into function approximation to obtain a data-driven model of the nonlinear system;
and the optimal control module 704 is configured to perform iterative processing on the data driving model according to the set constraint, so as to determine optimal control of the nonlinear system.
The effects obtained by the apparatus are consistent with the beneficial effects obtained by the method, and are not repeated here.
Further, the present specification also provides an apparatus for optimal control of a nonlinear system whose model is unknown, the apparatus comprising:
an input interface configured to receive a state trajectory of the nonlinear system;
a memory;
a processor configured to perform the method described above and generate control instructions;
an output interface configured to send the control instructions to actuators of the nonlinear system to control operation of the system.
For example, the internal structure of the device may be as shown in Fig. 8. The device includes a processor, a memory, and a network interface connected by a system bus. The processor of the device provides computing and control capabilities. The memory of the device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the device is used for communicating with an external terminal through a network connection. When executed by the processor, the computer program implements the optimal control method for a nonlinear system with unknown model described above.
It will be appreciated by those skilled in the art that the structure shown in fig. 8 is merely a block diagram of a portion of the structure associated with the present application and does not constitute a limitation of the apparatus to which the present application is applied, and that a particular apparatus may include more or less components than those shown in the drawings, or may combine certain components, or have a different arrangement of components.
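The division of the device into an input interface, a memory, a processor, and an output interface can be pictured as a small object with one method per role. This is a toy sketch under stated assumptions: the class name `ControlDevice`, its method names, and the clipped proportional law `clipped_feedback` are all hypothetical and stand in for the method described above, which is not implemented here.

```python
class ControlDevice:
    """Toy stand-in for the device: input interface, memory, processor, output interface."""

    def __init__(self, control_law, send_to_actuator):
        self.memory = []                          # stored samples of the state trajectory
        self.control_law = control_law            # hypothetical: maps a state to a control value
        self.send_to_actuator = send_to_actuator  # hypothetical output channel

    def receive_state(self, state):
        """Input interface: record one sample of the state trajectory."""
        self.memory.append(list(state))

    def step(self):
        """Processor and output interface: compute a control instruction and send it."""
        u = self.control_law(self.memory[-1])
        self.send_to_actuator(u)
        return u


def clipped_feedback(state, gain=0.3, bound=0.5):
    """Illustrative control law only: proportional feedback clipped to |u| <= bound."""
    raw = -gain * sum(state)
    return max(-bound, min(bound, raw))


sent = []  # stands in for the actuator: collects every instruction sent
device = ControlDevice(clipped_feedback, sent.append)
device.receive_state([2.0, 1.0])
device.step()
device.receive_state([0.5, 0.5])
device.step()
print(sent)
```

Passing the control law and the actuator channel in as callables keeps the sketch independent of any particular hardware or network interface.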
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the steps of the method embodiments described above.
In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the various embodiments provided herein may include at least one of non-volatile and volatile memory. The nonvolatile Memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash Memory, optical Memory, high density embedded nonvolatile Memory, resistive random access Memory (ReRAM), magnetic random access Memory (Magnetoresistive Random Access Memory, MRAM), ferroelectric Memory (Ferroelectric Random Access Memory, FRAM), phase change Memory (Phase Change Memory, PCM), graphene Memory, and the like. Volatile memory can include random access memory (Random Access Memory, RAM) or external cache memory, and the like. By way of illustration, and not limitation, RAM can be in the form of a variety of forms, such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM), and the like. The databases referred to in the various embodiments provided herein may include at least one of relational databases and non-relational databases. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processors referred to in the embodiments provided herein may be general purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, quantum computing-based data processing logic units, etc., without being limited thereto.
It should also be understood that in embodiments herein, the term "and/or" is merely one relationship that describes an associated object, meaning that three relationships may exist. For example, a and/or B may represent: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the front and rear associated objects are an "or" relationship.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two, and that the elements and steps of the examples have been generally described in terms of function in the foregoing description to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided herein, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices, or elements, or may be an electrical, mechanical, or other form of connection.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the elements may be selected according to actual needs to achieve the objectives of the embodiments herein.
Specific examples are set forth herein to illustrate the principles and embodiments herein and are merely illustrative of the methods herein and their core ideas; also, as will be apparent to those of ordinary skill in the art in light of the teachings herein, many variations are possible in the specific embodiments and in the scope of use, and nothing in this specification should be construed as a limitation on the invention.
Claims (10)
1. An optimal control method for a nonlinear system with unknown model, the method comprising:
for a nonlinear system with unknown model, establishing an optimal cost function of the system, and determining a partial differential equation for solving the optimal cost function;
expanding the partial differential equation according to an empirical data set of the nonlinear system to obtain a higher-order ordinary differential equation;
introducing the higher-order ordinary differential equation into function approximation to obtain a data-driven model of the nonlinear system;
and performing iterative processing on the data-driven model according to a set constraint, so as to determine the optimal control of the nonlinear system.
2. The method of claim 1, wherein establishing an optimal cost function of the system for a nonlinear system with unknown model and determining a partial differential equation for solving the optimal cost function comprises:
establishing a state equation of the nonlinear system with unknown model, with initial condition $x(t_0) = x_0$, wherein $x$ is the state variable of the system, $u$ is the control input of the system, and the state equation is composed of a dynamic equation of the system and an input state equation of the system;
determining an optimal cost function of the nonlinear system according to the dynamic constraint of the nonlinear system and the state equation of the nonlinear system, wherein $t \in [t_0, t_f]$, and $u[t, t_f]$ indicates that the control input $u$ is restricted to the time interval $[t, t_f]$.
3. The method of claim 1, wherein expanding the partial differential equation according to the empirical data set of the nonlinear system to obtain a higher-order ordinary differential equation comprises:
establishing the empirical data set according to historical input and output data of the nonlinear system;
determining, from the empirical data set and the partial differential equation, a saturated state equation and a saturated partial differential equation based on the state trajectory and the control input, as follows:
and obtaining, through a differential dynamic programming algorithm, the higher-order ordinary differential equation of the optimal cost function according to the saturated state equation, the saturated partial differential equation, the state equation and the cost function, wherein the higher-order ordinary differential equation is as follows:
4. The method of claim 1, wherein introducing the higher-order ordinary differential equation into function approximation to obtain a data-driven model of the nonlinear system comprises:
determining basis functions used to approximate the optimal cost function, as follows:
determining an estimation function of the optimal control according to the basis functions;
and substituting the estimation function into the higher-order ordinary differential equation to obtain a higher-order differential dynamic approximation.
5. The method of claim 4, wherein, after substituting the estimation function into the higher-order ordinary differential equation to obtain the higher-order differential dynamic approximation, the method further comprises:
determining, according to the higher-order differential dynamic approximation, an algebraic matrix equation satisfied by the weight approximation;
and, when the persistent excitation condition is met, optimizing the algebraic matrix equation to obtain a target algebraic matrix equation for calculating the weight approximation.
6. The method of claim 4, wherein performing iterative processing on the data-driven model according to the set constraint to determine the optimal control of the nonlinear system comprises:
defining an approximately estimated Hamiltonian and a control input;
and performing differential dynamic iteration on the higher-order differential dynamic approximation according to the definitions of the approximately estimated Hamiltonian and the control input, so as to determine the optimal control of the nonlinear system.
7. The method of claim 6, wherein performing differential dynamic iteration on the higher-order differential dynamic approximation according to the definitions of the approximately estimated Hamiltonian and the control input, so as to determine the optimal control of the nonlinear system, comprises:
Step 1: setting initial parameter values and calculating an initial cost function value, wherein the initial parameter values include at least an initial control input and an initial state variable;
Step 2: calculating an iterated control input of the nonlinear system according to the initial parameter values;
Step 3: determining an iterated state variable of the nonlinear system according to the iterated control input and the state equation;
Step 4: calculating an iterated weight approximation according to the target algebraic matrix equation and the iteration count, and judging whether a first convergence condition is met; if not, returning to Step 3; if so, proceeding to Step 5;
Step 5: calculating an iterated cost function value according to the iterated state variable and the optimal cost function, and judging whether the iterated cost function value meets a second convergence condition; if not, feeding the iterated control input back into Step 2 for a further control input iteration; if so, determining the iterated control input as the target control input.
8. An optimal control device for a nonlinear system whose model is unknown, said device comprising:
the partial differential equation determining module is used for establishing an optimal cost function of a nonlinear system with unknown model and determining a partial differential equation for solving the optimal cost function;
the higher-order ordinary differential equation determining module is used for performing expansion processing on the partial differential equation according to the empirical data set of the nonlinear system, so as to obtain a higher-order ordinary differential equation;
the data-driven model determining module is used for introducing the higher-order ordinary differential equation into function approximation to obtain a data-driven model of the nonlinear system;
and the optimal control module is used for carrying out iterative processing on the data driving model according to the set constraint so as to determine the optimal control of the nonlinear system.
9. An optimal control device for a nonlinear system whose model is unknown, said device comprising:
an input interface configured to receive a state trajectory of the nonlinear system;
a memory;
a processor configured to perform the method of any one of claims 1 to 7 and generate control instructions;
an output interface configured to send the control instructions to actuators of the nonlinear system to control operation of the system.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310559968.7A CN116382093A (en) | 2023-05-18 | 2023-05-18 | Optimal control method and equipment for nonlinear system with unknown model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116382093A true CN116382093A (en) | 2023-07-04 |
Family
ID=86973521
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310559968.7A Pending CN116382093A (en) | 2023-05-18 | 2023-05-18 | Optimal control method and equipment for nonlinear system with unknown model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116382093A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117290965A (en) * | 2023-11-22 | 2023-12-26 | 中汽研汽车检验中心(广州)有限公司 | Vehicle model simulation test simulation method, equipment and medium of vehicle simulation software |
CN117290965B (en) * | 2023-11-22 | 2024-04-09 | 中汽研汽车检验中心(广州)有限公司 | Vehicle model simulation test simulation method, equipment and medium of vehicle simulation software |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||