CN110879531B - Data-driven self-adaptive optimization control method and medium for random disturbance system - Google Patents
- Publication number
- CN110879531B (application CN201911154069.9A)
- Authority
- CN
- China
- Prior art keywords
- data
- optimal
- state
- control
- driven
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
Abstract
The invention discloses a data-driven self-adaptive optimization control method and medium for a random disturbance system. The method comprises a problem description part, a design part of a data-driven optimal state observer, and an off-policy data-driven ADP control part of the random disturbance system; the invention is described in detail with respect to these three parts. According to the method, a data-driven optimal state observer is designed, and off-policy data-driven ADP control of the random disturbance system is performed. The data-driven ADP method is applied for the first time to a system whose state is completely unmeasurable; model-free LQG control is generalized to continuous-time systems; non-matching noise outside the control-signal channel and independent noise uncorrelated with the state and the control signal are considered in the ADP design; and a novel off-policy data-driven ADP control method and medium for the random disturbance system are provided, which avoid the burden of repeatedly reading and updating control signals and markedly reduce the computational load.
Description
Technical Field
The invention relates to systems disturbed by random noise, and in particular to model-free random optimal control. Randomly noise-disturbed systems arise in many fields, including industrial and agricultural production, power systems, chemical processes, machinery manufacturing, transportation, aerospace, and artificial intelligence.
Background
Uncertainty in a real system may come from noise in signals such as the input and the state. The optimal control problem for randomly noise-disturbed systems has therefore long attracted attention. In the existing literature, such problems are usually handled with H2 or H∞ robust control methods, whose main realization is to describe the disturbance input with some deterministic model so as to design state-feedback control. In engineering practice, however, external disturbances rarely evolve in the presumed way. On the other hand, the existing H2 and H∞ results are mostly model-based methods. Besides noise disturbance, a practical control system may also suffer from uncertainty caused by an unknown model. Research on model-free random optimal control methods therefore has important theoretical and practical significance.
The Adaptive Dynamic Programming (ADP) method provides a new idea for model-free random optimal control. In recent years, random optimal control results based on reinforcement learning or ADP have appeared, but they consider only the "matched" noise in the control-signal channel, require the control signal to be read and updated many times, and carry a large computational load. In a practical system, however, the noise sources may fall into different categories. Another concern is that the system state is sometimes not directly available, whereas existing data-driven reinforcement learning or ADP methods require the state to be at least partially known. Under the MBC framework, some researchers have addressed the completely unmeasurable state by taking the output as the state. In the presence of measurement noise, however, this approach inevitably degrades the performance of the control system.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a data-driven self-adaptive optimization control method for a random disturbance system, which solves the control problem of a system whose state is completely unmeasurable.
It is another object of the present invention to provide a storage medium for a data-driven adaptive optimization control method for a stochastic disturbance system.
The purpose of the invention is realized by the following technical scheme:
a data-driven self-adaptive optimization control method of a random disturbance system comprises a problem description part, a design part of a data-driven optimal state observer and an ADP control part of a random disturbance system driven by different strategy data;
for the problem description section:
giving a random perturbation system and obtaining an output equation associated with the random perturbation system; solving the optimal linear control, and minimizing a cost function through a designed random optimal control strategy;
for the design part of the data-driven optimal state observer:
aiming at a completely unmeasured system state, designing a data-driven optimal state observer; obtaining a state design system through a random disturbance system, an output equation and an observer;
observing the obtained optimal control strategy of the state design system;
designing a data driving algorithm on a state design system by using the idea of data driving ADP, and solving the optimal observation gain;
for the off-policy data-driven ADP control part of the random disturbance system:
the online state information of the state-design system is obtained by using the optimal observer, and a data-driven ADP algorithm is further designed to finally obtain the random optimal control.
As a preferred mode, for the problem description section:
given a randomly perturbed system described by a random differential equation
And output equation associated therewith
y=Cx+υ (2)
where x, u and y denote the system state, the control input and the output, respectively; A and B are unknown constant matrices; the process noise and the measurement noise υ are mutually uncorrelated zero-mean Wiener processes, with covariance matrices denoted W and V respectively; N1 and N2 are non-negative integers; and ξ_i, ζ_j are zero-mean Wiener processes satisfying
where ρ_ij, σ_ij > 0 are known constants.
Given the above system, the goal is to solve for the optimal linear control u* = −K*x, where K* is the random optimal control strategy to be designed, so that the cost function
is minimized, i.e.
As a preferred mode, for the design part of the data-driven optimal state observer:
aiming at the completely unmeasured system state, an observer is designed
where x̂ denotes the observed value of the state and L is the observation gain to be solved for; the observation error is defined as e = x − x̂, and from (1), (2) and (8) one obtains (9):
the evolution of the error e is independent of the state x; therefore, an optimal observer of the unknown state is designed first, and the observed state is then used to design the optimal control strategy of the system.
Preferably, the method comprises the following steps: definition of
According to LQG control theory, the optimal state observation gain L* can be expressed as
L* = S*C^T V^{-1} (12)
where S* is
the unique symmetric positive definite solution of (13);
for an optimal observer to exist, the following assumptions are made:
where L_k (k = 1, 2, 3, …) is given by
L_k = S_{k-1}C^T V^{-1} (16)
By using the idea of data-driven ADP, a data-driven algorithm is designed on system (9) to solve for the optimal observation gain L*.
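For intuition, the model-based counterpart of the gain formula (12) and the iteration (16) can be sketched numerically with NumPy: repeatedly solve the closed-loop covariance Lyapunov equation for S_k and set L_{k+1} = S_k C^T V^{-1} until convergence. The invention's algorithm recovers the same quantities from measured data without knowing A; the sketch below is an assumed model-based stand-in with illustrative matrices, not the patent's data-driven procedure.

```python
import numpy as np

def lyap_dual(a, q):
    """Solve a S + S a^T + q = 0 for S (continuous-time Lyapunov, Kronecker form)."""
    n = a.shape[0]
    m = np.kron(np.eye(n), a) + np.kron(a, np.eye(n))
    return np.linalg.solve(m, -q.flatten(order="F")).reshape((n, n), order="F")

def observer_gain(A, C, W, V, L0, kappa1=1e-12, max_iter=100):
    """Iterate the closed-loop covariance equation and L_{k+1} = S_k C^T V^{-1}
    until ||S_k - S_{k-1}|| <= kappa1; converges to the LQG gain L* = S* C^T V^{-1}."""
    L, S_prev = L0, None
    for _ in range(max_iter):
        S = lyap_dual(A - L @ C, W + L @ V @ L.T)   # evaluate the current observer gain
        L = S @ C.T @ np.linalg.inv(V)              # improve the gain
        if S_prev is not None and np.linalg.norm(S - S_prev) <= kappa1:
            break
        S_prev = S
    return S, L

# Scalar sanity check: A=-1, C=W=V=1 give S* = sqrt(2)-1 and L* = S*.
S, L = observer_gain(np.array([[-1.0]]), np.array([[1.0]]),
                     np.array([[1.0]]), np.array([[1.0]]), np.array([[0.0]]))
```

The initial gain L0 = 0 is admissible here only because A itself is Hurwitz; in general L0 must make A − L0C Hurwitz, as the text requires.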
Preferably, the method comprises the following steps: the observation gain in the data acquisition and learning stage is fixed to L0WhereinIs a Hurwitz matrix. When the independent noise is zero, in order to ensure the continuous excitation condition, the exploration noise zeta dt is added to the right side of the system (9); when independent noise exists, the continuous excitation condition is automatically met, and exploration noise does not need to be added, namely zeta is 0;
definition of Lk:=Lk-L0(k is 0,1,2 …) and
let d sigma1Can be measured if
Integrating the two sides of the system (9) along the trajectory to obtain
Definition of
Using the above expression, (19) can be transformed into a more compact form
Wherein psiekAnd ΩekAre respectively defined as
If true, then there is SkAnd Sk+1And calculating the resulting sequenceRespectively converge to S*,L*;
As long as the number of samples r1(the lower subscript of the preset sampling time in the formula (24) represents the number of the collected data) is selected to be large enough to ensure that the rank condition is met, and the rank condition can be met through
Iterative solution of Sk,Lk+1. By setting a threshold value k1As the cycle interruption condition, when | | | S is satisfiedk-Sk-1||≤κ1Interrupt the cycle at that time, LkI.e. the resulting optimal observation gain.
Preferably, for the off-policy data-driven ADP control part of the random disturbance system:
when the independent noise is not zero, the continuous excitation condition is automatically met, and the control law in the learning stage is set as
u0=-K0x (30)
Wherein K0Admission control strategy of the system. When the independent noise is zero, the exploration noise is added on the right side of (30)
By Itô's formula, one derives
and dσ2 is measurable, so
Integrating both sides of equation (33) along the trajectory of system (1) further yields
Define
Using the above expressions, (34) can be transformed into the compact form
where Ψ_xk and Ω_xk are defined respectively as
from which one further obtains
Given an initial admissible control strategy K_0, if the rank condition
holds, then by choosing r_2 (the subscript of the sampling instants in equation (38)) large enough, the unique solution (P_k, K_{k+1}) is obtained. A threshold κ2 is set as the loop-termination condition: when ||P_k − P_{k−1}|| ≤ κ2 is satisfied, the loop terminates and u = −K_k x is the resulting random optimal control input.
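The stopping rule above (iterate until ||P_k − P_{k−1}|| ≤ κ2, then apply u = −K_k x) follows the policy-iteration pattern of Kleinman's algorithm; the patent's data-driven version replaces the model-based Lyapunov solves with least-squares fits to trajectory data. Below is a minimal model-based sketch of the iteration and its threshold test with an illustrative scalar system — an assumed stand-in, not the patent's data-driven computation.

```python
import numpy as np

def lyap_ct(a, q):
    """Solve a^T P + P a + q = 0 for P (continuous-time Lyapunov, Kronecker form)."""
    n = a.shape[0]
    m = np.kron(np.eye(n), a.T) + np.kron(a.T, np.eye(n))
    return np.linalg.solve(m, -q.flatten(order="F")).reshape((n, n), order="F")

def policy_iteration_lqr(A, B, Q, R, K0, kappa2=1e-12, max_iter=100):
    """Kleinman-style policy iteration with the ||P_k - P_{k-1}|| <= kappa2 stopping rule."""
    K, P_prev = K0, None
    for _ in range(max_iter):
        P = lyap_ct(A - B @ K, Q + K.T @ R @ K)   # policy evaluation (Lyapunov solve)
        K = np.linalg.solve(R, B.T @ P)           # policy improvement
        if P_prev is not None and np.linalg.norm(P - P_prev) <= kappa2:
            break
        P_prev = P
    return P, K

# Scalar sanity check: A=-1, B=Q=R=1 give P* = sqrt(2)-1 and K* = P*.
P, K = policy_iteration_lqr(np.array([[-1.0]]), np.array([[1.0]]),
                            np.array([[1.0]]), np.array([[1.0]]), np.array([[0.0]]))
```

As in the text, K_0 must be admissible (A − BK_0 Hurwitz); here K_0 = 0 suffices because A is already stable.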
A computer-readable storage medium on which a computer program is stored, the computer program being executed by a processor to perform the above-mentioned method.
The beneficial effects of the invention are:
According to the method, a data-driven optimal state observer is designed, and off-policy data-driven ADP control of the random disturbance system is performed. The data-driven ADP method is applied for the first time to a system whose state is completely unmeasurable; model-free LQG control is generalized to continuous-time systems; non-matching noise outside the control-signal channel and independent noise uncorrelated with the state and the control signal are considered in the ADP design; and a novel off-policy data-driven ADP control method and medium for the random disturbance system are provided, which avoid the burden of repeatedly reading and updating control signals and markedly reduce the computational load.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required by the embodiments are briefly introduced below. It should be understood that the following drawings illustrate only some embodiments of the present invention and should therefore not be considered as limiting its scope; for those skilled in the art, other related drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a top view of a laboratory scene;
FIG. 2 is a flow chart of a data-driven adaptive optimization control algorithm for a random perturbation system;
FIG. 3 illustrates the meaning of (d(t), θ(t));
FIG. 4 shows the trajectories of the tip movements;
FIG. 5 shows the tip velocity and action force in the null (zero) force field;
FIG. 6 shows the tip velocity and force in the VF before learning;
FIG. 7 shows the tip velocity and force in the VF after learning;
FIG. 8 shows the tip velocity and force after the VF is removed.
Detailed Description
The technical solutions of the present invention are further described in detail below with reference to the accompanying drawings, but the scope of the present invention is not limited to the following.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention; all other embodiments obtained by a person skilled in the art without inventive effort on the basis of these embodiments fall within the scope of the present invention. The following detailed description of the embodiments is therefore not intended to limit the scope of the claimed invention, but merely represents selected embodiments.
Example one
The invention provides a data-driven self-adaptive optimization control method of a random disturbance system, which comprises a problem description part, a design part of a data-driven optimal state observer, and an off-policy data-driven ADP control part of the random disturbance system;
for the problem description section:
giving a random perturbation system and obtaining an output equation associated with the random perturbation system; solving the optimal linear control, and minimizing a cost function through a designed random optimal control strategy;
for the design part of the data-driven optimal state observer:
aiming at a completely unmeasured system state, designing a data-driven optimal state observer; obtaining a state design system through a random disturbance system, an output equation and an observer;
observing the obtained optimal control strategy of the state design system;
designing a data driving algorithm on a state design system by using the idea of data driving ADP, and solving the optimal observation gain;
for the off-policy data-driven ADP control part of the random disturbance system:
the online state information of the state-design system is obtained by using the optimal observer, and a data-driven ADP algorithm is further designed to finally obtain the random optimal control.
For the problem description section:
given a random perturbation system described by random differential equations
And the output equation associated therewith
y=Cx+υ (2)
where x, u and y denote the system state (completely unmeasurable), the control input and the output, respectively; A and B are unknown constant matrices; the process noise and the measurement noise υ are mutually uncorrelated zero-mean Wiener processes (Brownian motions), with covariance matrices denoted W and V respectively; N1 and N2 are non-negative integers; and ξ_i, ζ_j are zero-mean Wiener processes satisfying
where ρ_ij, σ_ij > 0 are known constants.
Given the above system, the goal is to solve for the optimal linear control u* = −K*x, where K* is the random optimal control strategy to be designed, so that the cost function
is minimized, i.e.
For the design part of the data-driven optimal state observer:
aiming at the completely unmeasured system state, an observer is designed
where x̂ denotes the observed value of the state and L is the observation gain to be solved for; the observation error is defined as e = x − x̂, and from (1), (2) and (8) one obtains (9):
the evolution of the error e is independent of the state x, which is consistent with the separation principle. Therefore, an optimal observer of the unknown state can be designed first, and the observed state then used to design the optimal control strategy of the system.
Definition of
According to LQG control theory, the optimal state observation gain L* can be expressed as
L* = S*C^T V^{-1} (12)
where S* is
the unique symmetric positive definite solution of (13);
for an optimal observer to exist, the following assumptions are made:
L_k = S_{k-1}C^T V^{-1} (16)
By using the idea of data-driven ADP, a data-driven algorithm is designed on system (9) to solve for the optimal observation gain L*.
In the data-acquisition and learning stage, the observation gain is fixed at L_0, chosen so that A − L_0C is a Hurwitz matrix. When the independent noise is zero, an exploration noise ζ dt is added to the right-hand side of system (9) to ensure the persistent-excitation condition; when independent noise exists, the persistent-excitation condition is satisfied automatically and no exploration noise needs to be added, i.e. ζ = 0;
Define L̃_k := L_k − L_0 (k = 0, 1, 2, …) and
Let dσ1 be measurable. If
Integrating both sides of system (9) along its trajectory yields
Define
Using the above expressions, (19) can be transformed into the more compact form
where Ψ_ek and Ω_ek are defined respectively as
If this holds, then S_k and S_{k+1} exist, and the computed sequences converge to S* and L*, respectively;
As long as the number of samples r_1 (the subscript of the preset sampling instants in formula (24), i.e. the number of collected data) is chosen large enough, the rank condition is guaranteed to hold, and S_k, L_{k+1} can then be solved iteratively from (24). A threshold κ1 is set as the loop-termination condition: when ||S_k − S_{k−1}|| ≤ κ1 is satisfied, the loop terminates and L_k is the resulting optimal observation gain.
For the off-policy data-driven ADP control part of the random disturbance system:
When the independent noise is nonzero, the persistent-excitation condition is satisfied automatically, and the control law in the learning stage is set as
u_0 = −K_0x (30)
where K_0 is an admissible control strategy of the system. When the independent noise is zero, an exploration noise is added on the right-hand side of (30).
By Itô's formula, one derives
and dσ2 is measurable, so
Integrating both sides of equation (33) along the trajectory of system (1) further yields
Define
Using the above expressions, (34) can be transformed into the compact form
where Ψ_xk and Ω_xk are defined respectively as
from which one further obtains
Given an initial admissible control strategy K_0, if the rank condition
holds, then by choosing r_2 (the subscript of the sampling instants in equation (38)) large enough, the unique solution (P_k, K_{k+1}) is obtained. A threshold κ2 is set as the loop-termination condition: when ||P_k − P_{k−1}|| ≤ κ2 is satisfied, the loop terminates and u = −K_k x is the resulting random optimal control input.
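Once L and K have been learned, deployment amounts to running the observer online and feeding back u = −K x̂, as in FIG. 2. The following Euler–Maruyama sketch of that closed loop uses a toy double integrator; all matrices and noise levels are illustrative assumptions, not the patent's system or the arm model of Example three.

```python
import numpy as np

rng = np.random.default_rng(0)
dt, steps = 1e-3, 10_000

# Toy double integrator (illustrative stand-in).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.eye(2)
K = np.array([[1.0, 2.0]])   # stabilizing feedback: eig(A - B K) = {-1, -1}
L = np.eye(2)                # observer gain: A - L C is Hurwitz

x = np.array([1.0, 0.0])     # true state (never read by the controller)
xh = np.zeros(2)             # observer estimate
for _ in range(steps):
    u = -K @ xh                                                  # control from the estimate only
    dy = C @ x * dt + 0.01 * np.sqrt(dt) * rng.normal(size=2)    # noisy output increment
    x = x + (A @ x + B @ u) * dt + 0.01 * np.sqrt(dt) * rng.normal(size=2)
    xh = xh + (A @ xh + B @ u) * dt + L @ (dy - C @ xh * dt)     # Luenberger/Kalman-type update
```

After the transient, both the state and the estimation error settle near the small-noise floor, illustrating the separation of observer and controller discussed above.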
Example two
In accordance with an embodiment, the invention provides a computer-readable storage medium, on which a computer program is stored, the computer program being executed by a processor for performing the method described above.
The present invention may employ a computer program product embodied on one or more storage media (including disk storage, CD-ROM, optical storage) having computer program code embodied therein.
The present invention has been described with reference to a method according to an embodiment of the invention. It will be understood that each flow in the flow diagrams can be implemented by computer program instructions. These computer program instructions may be stored in a computer-readable memory capable of directing a computer or other programmable data-processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means.
EXAMPLE III
In accordance with an embodiment, the present invention provides an example of an application of a learning mechanism simulation of the central nervous system.
This example demonstrates the effectiveness of the above method by simulating the arm motion control experiment of the Central Nervous System (CNS) under external force field disturbance. The human subject moves the manipulator tip forward in the horizontal plane to the target position by arm movements, as shown in fig. 1. Two torque motors are arranged in a base of the manipulator, can generate a required force field, and apply corresponding interference force to the arm through the mechanical arm and a handle at the tail end. The data-driven Adaptive Optimal Control (AOC) approach shown in fig. 2 is used here to simulate the learning mechanism of the CNS in this example.
1) Simulation setup
The dynamic behavior of the system can be described as:
where the two-dimensional vectors p = [p_x, p_y]^T and υ = [υ_x, υ_y]^T denote the position and the movement velocity of the tip, respectively; a = [a_x, a_y]^T denotes the actuator state, i.e. the action force applied to the tip by the subject; u = [u_x, u_y]^T is the control signal of the CNS; m denotes the mass of the hand; b is the viscosity constant; τ is a time constant; dη denotes the control-dependent noise, given by
where η_1 and η_2 are two Wiener processes and c_1, c_2 are positive numbers scaling the noise amplitude; f is the external force generated by a velocity-dependent force field (VF), whose value is set as
The values of the physical parameters of the system are listed in Table 1. The state vector is taken as [p^T, υ^T, a^T]^T. Since the output is noisy, the state cannot be measured directly and is instead obtained by the observer, with C = I_6. Note that N_1 = 0 and N_2 = 2. The covariance matrices of the independent noise are taken as W = 0.001 I_6 and V = 0.015 diag(1, 1, 1, 1, 10, 10). The initial observation gain and control strategy are set to L_0 = 10 I_6 and K_0 = [100I_2 10I_2 10I_2], respectively.
TABLE 1 partial physical parameters of arm motion model
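The dynamics described above (tip position and velocity plus a first-order actuator lag) can be assembled into the 6-dimensional state-space form x = [p^T, υ^T, a^T]^T. The sketch below is our reading of that model, with placeholder parameter values since Table 1 is not reproduced here; the first-order-lag form υ̇ = (a − bυ + f)/m, ȧ = (u − a)/τ is an assumption, not a quotation from the patent.

```python
import numpy as np

# Placeholder parameter values (Table 1 is not reproduced in this text).
m, b, tau = 1.3, 10.0, 0.05   # hand mass [kg], viscosity [N*s/m], actuator time constant [s]

Z, I2 = np.zeros((2, 2)), np.eye(2)
# x = [p; v; a]: dp = v dt, dv = ((a - b v)/m) dt (the force field f enters as a disturbance),
# da = ((u - a)/tau) dt.
A = np.block([
    [Z, I2,          Z],
    [Z, -b / m * I2, 1.0 / m * I2],
    [Z, Z,           -1.0 / tau * I2],
])
B = np.block([[Z], [Z], [1.0 / tau * I2]])
C = np.eye(6)   # noisy full-state output, matching C = I6 in the example
```

The block structure makes the open-loop poles explicit: a double integrator in position plus the two stable lags −b/m and −1/τ.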
2) Weight matrix selection
The weight matrix R = 0.01 I_2. Q is task-dependent, i.e. the CNS can select different Q matrices for different tasks. For example, when a force-field disturbance is found in the x-axis direction, the CNS can increase the weight of that direction to enhance the stiffness of the system (i.e. the magnitude of the restoring force per unit of trajectory deflection). Thus, data from existing trials (one attempt is called a trial) and the idea of data fitting can be used to select an appropriate Q. Note that Q contains 21 independent elements. To reduce redundancy, its form is set to Q = diag(Q_0, 0.01Q_0, 0.0005Q_0), where Q_0 is a task-dependent symmetric matrix. Let g*(t) be the ideal motion trajectory, i.e. a straight path, and g(t) the actual trajectory of the last trial. Taking g*(t) as the reference origin, the polar-coordinate pair (d(t), θ(t)) of g(t) is obtained, as shown in FIG. 3.
At the time t_m at which d(t) reaches its maximum, let d_m = d(t_m) and θ_m = θ(t_m); the CNS then determines the value of Q_0 by the following model:
where ω_0, ω_1, ω_2 are empirical constants; this example takes ω_0 = 5×10^5, ω_1 = 5×10^4, ω_2 = 10^5. Under a fixed external force field, d_m and θ_m are constant; only when the CNS observes a change in the external force field does it modulate d_m and θ_m to adapt Q to the new task requirements.
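The redundancy-reduction step Q = diag(Q_0, 0.01 Q_0, 0.0005 Q_0) stated above can be sketched directly; the particular Q_0 below is only an illustrative symmetric example, since the model producing Q_0 from (d_m, θ_m) is not reproduced in this text.

```python
import numpy as np

def build_Q(Q0):
    """Assemble Q = diag(Q0, 0.01*Q0, 0.0005*Q0) from a 2x2 symmetric block Q0."""
    Z = np.zeros_like(Q0)
    return np.block([
        [Q0, Z,         Z],
        [Z,  0.01 * Q0, Z],
        [Z,  Z,         0.0005 * Q0],
    ])

Q0 = np.array([[5e5, 0.0], [0.0, 5e5]])   # illustrative task-dependent weight block
Q = build_Q(Q0)
```

This weights position deviations most heavily, velocity 100x less, and actuator state 2000x less, matching the stated scaling.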
3) Simulation result
The trajectory of the tip movement in the simulation is shown in fig. 4.
5 trials were first performed in the null (zero) force field using the initial admissible control strategy (top-left panel); then the VF was suddenly applied to the subject and trials were performed (top-right panel); after the first trial in the VF, the weight-matrix parameters were obtained, and after the CNS had learned via the data-driven AOC algorithm, further trials were performed (bottom-left panel); when the VF was suddenly removed, i.e. the null-field condition was restored, the after-effect trajectory of the tip deviated markedly to the right (bottom-right panel), showing that the earlier learning does produce a stable compensation effect that remains valid until a new learning task is performed.
The tip velocity and action-force states of the above four stages are shown in FIGS. 5-8, respectively. It can be seen that the initial velocity profile in the y-direction (i.e. the target direction) is approximately a bell-shaped curve. After the VF is applied, the action force increases significantly in the positive x-direction and the negative y-direction, indicating that the subject is disturbed by a large external force (whose direction is related to the initial movement direction) and has to increase the action force to compensate. After learning, the subject produces a stable compensation force that effectively cancels the influence of the external force field.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention. The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, it should be noted that any modifications, equivalents and improvements made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (5)
1. A data-driven self-adaptive optimization control method of a random disturbance system, characterized by comprising a problem description part, a design part of a data-driven optimal state observer and an off-policy data-driven ADP control part of the random disturbance system;
for the problem description section:
a random disturbance system is given, and the output equation associated with it is obtained; the optimal linear control is solved for, and a cost function is minimized by the designed random optimal control strategy;
for the design part of the data-driven optimal state observer:
a data-driven optimal state observer is designed for the completely unmeasurable system state; a state-design system is obtained from the random disturbance system, the output equation and the observer;
the optimal control strategy of the state-design system is obtained from the observed state;
using the idea of data-driven ADP, a data-driven algorithm is designed on the state-design system to solve for the optimal observation gain;
for the off-policy data-driven ADP control part of the random disturbance system:
the online state information of the state-design system is obtained with the optimal observer, a data-driven ADP algorithm is further designed, and the random optimal control is finally obtained;
the observation gain in the data acquisition and learning stage is fixed to L0WhereinIs a Hurwitz matrix; when the independent noise is zero, in order to ensure the continuous excitation condition, the exploration noise zeta dt is added to the right side of the system (9); when independent noise exists, the continuous excitation condition is automatically met, and then exploration noise does not need to be added, namely zeta is equal to 0;
definition of Lk:=Lk-L0(k is 0,1,2 …) and
let d sigma1Can be measured if
Integrating the two sides of the system (9) along the trajectory to obtain
Definition of
Using the above expression, (19) can be transformed into a more compact form
Wherein psiekAnd ΩekAre respectively defined as
If true, then there is SkAnd Sk+1And calculating the resulting sequenceRespectively converge to S*,L*;
As long as the number of samples r1(the subscript of the preset sampling time in the formula (24), representing the number of the collected data) is selected to be large enough, so that the rank condition can be met, and the rank condition can be met at the moment
Iterative solution of Sk,Lk+1By setting a threshold value k1As the cycle interruption condition, when | | | S is satisfiedk-Sk-1||≤κ1Interrupt the cycle at that time, LkObtaining the optimal observation gain;
for the off-policy data-driven ADP control part of the random disturbance system:
when the independent noise is nonzero, the persistent-excitation condition is satisfied automatically, and the control law in the learning stage is set as
u_0 = −K_0x (30)
where K_0 is an admissible control strategy of the system; when the independent noise is zero, an exploration noise is added on the right-hand side of (30);
by Itô's formula, one derives
and dσ2 is measurable, so
integrating both sides of equation (33) along the trajectory of system (1) further yields
define
using the above expressions, (34) can be transformed into the compact form
where Ψ_xk and Ω_xk are defined respectively as
from which one further obtains
given an initial admissible control strategy K_0, if the rank condition
2. The data-driven adaptive optimization control method for the random disturbance system according to claim 1, wherein for the problem description part:
given a randomly perturbed system described by a random differential equation
And the output equation associated therewith
y=Cx+υ (2)
where x, u and y denote the system state, the control input and the output, respectively; A and B are unknown constant matrices; the process noise and the measurement noise υ are mutually uncorrelated zero-mean Wiener processes whose covariance matrices are denoted W and V respectively; N1 and N2 are non-negative integers; and ξ_i, ζ_j are zero-mean Wiener processes satisfying
where ρ_ij, σ_ij > 0 are known constants;
given the above system, the goal is to solve for the optimal linear control u* = −K*x, where K* is the random optimal control strategy to be designed, so that the cost function
is minimized, i.e.
3. The data-driven adaptive optimization control method of the random disturbance system according to claim 1, wherein for the design part of the data-driven optimal state observer:
aiming at the completely unmeasured system state, an observer is designed
where x̂ denotes the observed value of the state and L is the observation gain to be solved for; the observation error is defined as e = x − x̂, and from (1), (2) and (8) one obtains (9):
the evolution of the error e is independent of the state x; therefore, an optimal observer of the unknown state is designed first, and the observed state is then used to design the optimal control strategy of the system.
4. The data-driven adaptive optimization control method of the random disturbance system according to claim 3, wherein:
Define
According to LQG control theory, the optimal state observation gain L* can be expressed as
L* = S*C^T V^-1 (12)
where S* is the unique symmetric positive-definite solution of the associated filter Riccati equation;
for the optimal observer to exist, the following assumptions are made,
where Lk (k = 1, 2, 3, …) is obtained from
Lk = Sk-1 C^T V^-1 (16)
Using the idea of data-driven ADP, a data-driven algorithm is designed on system (9) to solve for the optimal observation gain L*.
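As a model-based cross-check of what the data-driven algorithm is meant to recover, the gain iteration Lk = Sk-1 C^T V^-1 can be sketched as a Kleinman-style Lyapunov iteration converging to the optimal gain of (12). A, C, W and V below are assumed example values, since the patent's point is precisely that they need not be known:

```python
import numpy as np

# Assumed example matrices; the patent recovers L* from data alone.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
C = np.array([[1.0, 0.0]])
W = 0.1 * np.eye(2)      # process-noise covariance
V = np.array([[0.05]])   # measurement-noise covariance

def lyap(F, Qc):
    """Solve F S + S F^T + Qc = 0 for S via the Kronecker method."""
    n = F.shape[0]
    M = np.kron(F, np.eye(n)) + np.kron(np.eye(n), F)
    return np.linalg.solve(M, -Qc.reshape(-1)).reshape(n, n)

# Kleinman-style iteration mirroring (16): L_k = S_{k-1} C^T V^{-1}
L = np.array([[1.0], [0.0]])          # initial gain with A - L C Hurwitz
for _ in range(30):
    F = A - L @ C
    S = lyap(F, L @ V @ L.T + W)      # error covariance under current gain
    L = S @ C.T @ np.linalg.inv(V)    # gain update, as in (16)
```

Each pass solves a Lyapunov equation for the error covariance under the current gain and then updates the gain; the fixed point satisfies the filter Riccati equation, so the converged L matches L* = S*C^T V^-1.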
5. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, performs the method according to any one of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911154069.9A CN110879531B (en) | 2019-11-22 | 2019-11-22 | Data-driven self-adaptive optimization control method and medium for random disturbance system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110879531A CN110879531A (en) | 2020-03-13 |
CN110879531B true CN110879531B (en) | 2022-06-24 |
Family
ID=69730443
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911154069.9A Active CN110879531B (en) | 2019-11-22 | 2019-11-22 | Data-driven self-adaptive optimization control method and medium for random disturbance system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110879531B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111665719B (en) * | 2020-06-11 | 2022-11-22 | 大连海事大学 | Supply ship synchronous control algorithm with timeliness and stability |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104019520A (en) * | 2014-05-20 | 2014-09-03 | 天津大学 | Data drive control method for minimum energy consumption of refrigerating system on basis of SPSA |
CN107273445A (en) * | 2017-05-26 | 2017-10-20 | 电子科技大学 | The apparatus and method that missing data mixes multiple interpolation in a kind of big data analysis |
CN107807069A (en) * | 2017-10-25 | 2018-03-16 | 中国石油大学(华东) | The adaptive tracking control method and its system of a kind of offshore spilled oil |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB0719969D0 (en) * | 2007-10-12 | 2007-11-21 | Cambridge Entpr Ltd | Substance monitoring and control in human or animal bodies |
- 2019-11-22: Application CN201911154069.9A filed; granted as patent CN110879531B (status: Active)
Non-Patent Citations (1)
Title |
---|
Li Gang et al., "Research on fault diagnosis of gyroscopes for aircraft inertial navigation" (《计算机仿真》 / Computer Simulation), 2019, Vol. 36, No. 3, pp. 32-38, 44. *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Doerr et al. | Model-based policy search for automatic tuning of multivariate PID controllers | |
Qi et al. | Stable indirect adaptive control based on discrete-time T–S fuzzy model | |
CN109375512B (en) | Prediction control method for ensuring closed loop stability of inverted pendulum system based on RBF-ARX model | |
Kersten et al. | State-space transformations of uncertain systems with purely real and conjugate-complex eigenvalues into a cooperative form | |
Yang et al. | Adaptive backstepping terminal sliding mode control method based on recurrent neural networks for autonomous underwater vehicle | |
Guan et al. | Ship steering control based on quantum neural network | |
CN110879531B (en) | Data-driven self-adaptive optimization control method and medium for random disturbance system | |
CN111273677B (en) | Autonomous underwater robot speed and heading control method based on reinforcement learning technology | |
Rahman et al. | Neural ordinary differential equations for nonlinear system identification | |
CN112571420A (en) | Dual-function model prediction control method under unknown parameters | |
Kim et al. | TOAST: Trajectory Optimization and Simultaneous Tracking Using Shared Neural Network Dynamics | |
Abadi et al. | Chattering-free adaptive finite-time sliding mode control for trajectory tracking of MEMS gyroscope | |
Takahashi | Remarks on a recurrent quaternion neural network with application to servo control systems | |
Yang et al. | Robust control of a class of under-actuated mechanical systems with model uncertainty | |
Wu et al. | Date-Driven Tracking Control via Fuzzy-State Observer for AUV under Uncertain Disturbance and Time-Delay | |
Zhu et al. | Online parameter estimation for uncertain robot manipulators with fixed-time convergence | |
Loria | Uniform global position feedback tracking control of mechanical systems without friction | |
Fan et al. | Differential Dynamic Programming for time-delayed systems | |
Zheng et al. | Identification for nonlinear singularly perturbed system using recurrent high-order multi-time scales neural network | |
Azarfar et al. | Adaptive control for nonlinear singular systems | |
Xia et al. | Three-Dimensional Trajectory Tracking for a Heterogeneous XAUV via Finite-Time Robust Nonlinear Control and Optimal Rudder Allocation | |
Żak | Neural Controlling of Remotely Operated Underwater Vehicle | |
Balakhnov et al. | Robust explicit model predictive control for hybrid linear systems with parameter uncertainties | |
Rawat et al. | Trajectory Control of Robotic Manipulator using Metaheuristic Algorithms | |
Takahashi et al. | Remarks on an Echo State Network–Based Optimal Predictive Control Using a Metaheuristics Optimisation Approach |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||