CN116208041A - Motor system H infinite reduced order output tracking control method based on reinforcement learning - Google Patents


Info

Publication number
CN116208041A
Authority
CN
China
Prior art keywords
infinite
reinforcement learning
neural network
disturbance
motor system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310067097.7A
Other languages
Chinese (zh)
Inventor
周林娜
厉功贺
杨春雨
褚众
王海
刘晓敏
Current Assignee
China University of Mining and Technology CUMT
Original Assignee
China University of Mining and Technology CUMT
Priority date
Filing date
Publication date
Application filed by China University of Mining and Technology CUMT filed Critical China University of Mining and Technology CUMT
Priority to CN202310067097.7A priority Critical patent/CN116208041A/en
Publication of CN116208041A publication Critical patent/CN116208041A/en
Pending legal-status Critical Current

Classifications

    • H: ELECTRICITY
    • H02: GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02P: CONTROL OR REGULATION OF ELECTRIC MOTORS, ELECTRIC GENERATORS OR DYNAMO-ELECTRIC CONVERTERS; CONTROLLING TRANSFORMERS, REACTORS OR CHOKE COILS
    • H02P21/00: Arrangements or methods for the control of electric machines by vector control, e.g. by control of field orientation
    • H02P21/0003: Control strategies in general, e.g. linear type, e.g. P, PI, PID, using robust control
    • H02P21/0014: Control strategies in general, e.g. linear type, e.g. P, PI, PID, using robust control using neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • H: ELECTRICITY
    • H02: GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02P: CONTROL OR REGULATION OF ELECTRIC MOTORS, ELECTRIC GENERATORS OR DYNAMO-ELECTRIC CONVERTERS; CONTROLLING TRANSFORMERS, REACTORS OR CHOKE COILS
    • H02P21/00: Arrangements or methods for the control of electric machines by vector control, e.g. by control of field orientation
    • H02P21/14: Estimation or adaptation of machine parameters, e.g. flux, current or voltage


Abstract

The invention discloses a reinforcement-learning-based H-infinity reduced-order output tracking control method for motor systems, which solves the disturbance-suppression tracking control problem for motor systems with unmodeled dynamics and imperfect data. The method comprises the following steps: decompose the H-infinity output tracking control problem of the original motor system using singular perturbation theory to obtain a reduced-order system problem; based on the output-state data of the original system, propose a state reconstruction mechanism for the virtual subsystem to handle its unmeasurable data, and derive a reinforcement learning H-infinity output tracking iterative algorithm based on the reconstructed data; and introduce execution-evaluation-disturbance neural networks to approximate the controller, performance index, and disturbance, iteratively updating the network weights by least squares to obtain the reinforcement-learning-based reduced-order H-infinity output tracking controller. The invention avoids the potential high-dimensionality and ill-conditioned numerical problems that arise when a tracking controller for a two-time-scale motor system is designed under a reinforcement learning framework.

Description

Motor system H infinite reduced order output tracking control method based on reinforcement learning
Technical Field
The invention belongs to the field of motor system drive control, and particularly relates to a motor system H infinite reduced order output tracking control method based on reinforcement learning.
Background
Nonlinear two-time-scale motor systems, widely present in power systems, the process industries, and other fields, are high-order systems with complex characteristics such as coupled fast and slow dynamics. In practice, such a system is often required to follow a preset reference trajectory while retaining a certain disturbance-rejection capability. The goal of robust tracking control is to design a controller so that the system meets these requirements, and it has therefore been studied extensively.
Existing tracking control methods for nonlinear two-time-scale motor systems are mainly based on sliding-mode control, active disturbance rejection control, and the like. However, these methods provide no quantitative analysis of disturbance suppression, which motivated H-infinity control as an effective means of disturbance rejection. If a tracking control method for general systems is applied directly to a singularly perturbed system, ill-conditioned numerical problems and the curse of dimensionality can arise. For this reason, feasible schemes based on system decomposition are used to control such systems. Although time-scale decomposition has been introduced to design combined robust controllers for nonlinear two-time-scale systems, these designs require the system model to be fully known and the virtual subsystem states to be fully measurable. At present, there is no H-infinity reduced-order output tracking control method for nonlinear two-time-scale motor systems with unknown dynamics.
In actual industrial production, an accurate model of the system is difficult to build. Reinforcement learning, through the trial-and-error interaction of an agent with its environment, has a unique advantage in model-free control: an ideal control law can be obtained from the system's input and output data, so the optimal tracking control problem can be solved. Many approaches have emerged to overcome the adverse effects of disturbance under a reinforcement learning framework, and H-infinity control based on reinforcement learning has attracted attention as a mainstream disturbance-rejection method. Converting the H-infinity control problem into a zero-sum game and solving it with optimal control concepts has proven effective. However, because a two-time-scale system is high-dimensional with coupled fast and slow dynamics, existing reinforcement learning methods are not suitable for such motor systems and can even cause ill-conditioned numerical problems during iterative learning. There is therefore an urgent need for a motor system H-infinity reduced-order output tracking control method with self-learning capability that can still achieve H-infinity reduced-order output tracking control in the presence of unknown dynamics and imperfect data.
Disclosure of Invention
In view of the above technical deficiencies, the invention aims to provide a reinforcement-learning-based motor system H-infinity reduced-order output tracking control method that solves the disturbance-suppression tracking control problem for motor systems with unmodeled dynamics and imperfect data, and that avoids the potential high-dimensionality and ill-conditioned numerical problems arising when a tracking controller for a two-time-scale motor system is designed under a reinforcement learning framework.
In order to solve the technical problems, the invention adopts the following technical scheme:
a motor system H infinite reduced order output tracking control method based on reinforcement learning is used for servo motors, flow industry and other systems, and comprises the following steps:
step one: decomposing the H infinite output tracking control problem of the original motor system by utilizing a singular perturbation theory to obtain a reduced order problem;
step two: based on the output state data of the original system, a state reconstruction mechanism of the virtual subsystem is provided to solve the problem that the data of the virtual subsystem is not measurable, and an H infinite output tracking reinforcement learning iterative algorithm based on the reconstruction data is further deduced;
step three: and introducing an execution-evaluation-disturbance neural network approximation controller, a performance index and disturbance, and iteratively updating the weight of the neural network based on a least square method to obtain the reduced order tracking controller based on reinforcement learning.
Preferably, in step one, the motor system is described by the following state space model:
dx_1/dt = f_11(x_1) + f_12(x_1)x_2 + g_1(x_1)u + k(x_1)w
ε dx_2/dt = f_21(x_1) + f_22(x_1)x_2 + g_2(x_1)u
where x_1, x_2 are the motor system state variables, u = [u_1, …, u_m] is the control input, w = [w_1, …, w_q] is the external disturbance, f_11, f_12, f_21, f_22 are the system dynamics, g_1, g_2 are the input dynamics, k is the disturbance dynamics, and 0 < ε ≪ 1 is a singular perturbation parameter; it is assumed that f_11, f_12, f_21, f_22, g_1, g_2, k are completely unknown and Lipschitz continuous, f(0) = 0, f_22 is invertible, and the fast subsystem is asymptotically stable in a short time without applying a fast controller;
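The two-time-scale structure above can be illustrated with a toy simulation. The linear dynamics and all numbers below are purely hypothetical stand-ins for the unknown f_ij, g_i, k, chosen only to show the fast state collapsing onto its quasi-steady value while the slow state evolves on the O(1) scale:

```python
# Hypothetical linear two-time-scale system for illustration only:
#   x1' = -x1 + x2,   eps * x2' = -x2 + u
eps, dt, T = 0.01, 1e-4, 2.0          # dt must resolve the fast scale ~eps
x1, x2, u = 1.0, 0.0, 0.5
for _ in range(int(T / dt)):
    dx1 = -x1 + x2                    # slow dynamics: O(1) rate
    dx2 = (-x2 + u) / eps             # fast dynamics: O(1/eps) rate
    x1 += dt * dx1
    x2 += dt * dx2
# After a short boundary layer, x2 sits at its quasi-steady value u;
# a reduced-order slow model would then replace x2 by that value.
print(round(x2, 3))   # close to u = 0.5
```

With ε = 0.01 the fast state settles within a boundary layer of a few hundredths of a second; this separation is exactly what the singular perturbation decomposition in step one exploits.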
to make the system slow state x 1 Tracking a bounded reference trajectory r (t), assuming a lipshitz continuous function exists, such that
Figure BDA0004062641280000023
Define tracking error as
ρ = Cx_1 − r(t);
The tracking error dynamics are
dρ/dt = C[f_11(x_1) + f_12(x_1)x_2 + g_1(x_1)u + k(x_1)w] − dr(t)/dt;
The original H-infinity output tracking control problem is: design a state feedback controller u = χ(ρ, r) that satisfies the L2 gain condition defined below in the presence of disturbance and makes the tracking error converge to 0 in the absence of disturbance:
∫₀^∞ e^(−αt) ‖z‖² dt ≤ γ² ∫₀^∞ e^(−αt) ‖w‖² dt
where ‖z‖² = ρᵀQρ + uᵀRu is the defined virtual control output, α > 0 is a discount factor, γ is the attenuation level from the disturbance input w(t) to the defined performance output z(t), Q = [C_1 C_2]ᵀ[C_1 C_2] > 0, and R > 0;
The original system is simplified into the following reduced-order system:
dx_1s/dt = F_s(x_1s) + G_s(x_1s)u_s + K_s(x_1s)w
y = Cx_1s
where C is the system output matrix, x_1s is the reduced-order system state, and
F_s(x_1s) = f_11(x_1s) − f_12(x_1s) f_22⁻¹(x_1s) f_21(x_1s)
G_s(x_1s) = g_1(x_1s) − f_12(x_1s) f_22⁻¹(x_1s) g_2(x_1s)
K_s(x_1s) = k(x_1s);
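As a minimal sketch of assembling the reduced-order dynamics defined above (scalar case, with made-up stand-in dynamics, since the patent assumes f_ij, g_i, k unknown):

```python
# Scalar-case sketch of the reduced-order (slow) dynamics:
#   F_s = f11 - f12 * f22^{-1} * f21
#   G_s = g1  - f12 * f22^{-1} * g2
#   K_s = k
def reduced_order_dynamics(f11, f12, f21, f22, g1, g2, k):
    """Return (F_s, G_s, K_s) as functions of the slow state x1s."""
    def F_s(x):
        return f11(x) - f12(x) / f22(x) * f21(x)
    def G_s(x):
        return g1(x) - f12(x) / f22(x) * g2(x)
    def K_s(x):
        return k(x)
    return F_s, G_s, K_s

# Example with simple, illustrative stand-in dynamics:
Fs, Gs, Ks = reduced_order_dynamics(
    f11=lambda x: -x, f12=lambda x: 1.0, f21=lambda x: 2.0 * x,
    f22=lambda x: -2.0, g1=lambda x: 0.0, g2=lambda x: 1.0, k=lambda x: 1.0)
print(Fs(1.0), Gs(1.0), Ks(1.0))   # 0.0 0.5 1.0
```

In the vector case f_22⁻¹ would be a matrix inverse; the invertibility of f_22 assumed earlier is what makes this elimination of the fast state well defined.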
the H infinite reduced order output tracking control problem is simplified into the following reduced order output tracking problem:
design controller u s So that the reduced order system outputs a state track Cx 1s Tracking a reference track r (t);
defining the output tracking error of the reduced order system as
ρ s =Cx 1s -r(t);
The tracking error dynamic is
Figure BDA0004062641280000034
Virtual control outputs are defined as follows:
||z|| 2 =ρ s Ts +u s T Ru s
the objective of the H infinity reduced order output tracking control problem is to calculate the tracking error rho s And a reference track r, find a control strategy u of a smoothing function χ s =χ(ρ s R) is set to satisfy the following conditions:
1) In the presence of disturbances, the system satisfies the following L 2 Gain conditions:
Figure BDA0004062641280000041
2) In the absence of disturbances, the output tracking error approaches 0.
Preferably, in step two, the state reconstruction mechanism of the virtual subsystem is: use the slow dynamic state x_1 of the original system to reconstruct the unmeasurable virtual subsystem state. The slow-subsystem H-infinity reinforcement learning iterative algorithm based on the reconstructed data x_1 is:
(iterative update equation omitted; rendered as an image in the original)
where i is the slow-controller iteration index.
Preferably, in step three, the reinforcement-learning-based slow controller design specifically comprises:
a. Select linearly independent activation function vectors φ_c, φ_a, φ_d for the evaluation neural network, the execution neural network, and the disturbance neural network, respectively. Design the evaluation-execution-disturbance neural networks to approximate the performance index J_rec, the controller u_rec, and the disturbance w_rec in linear-in-weights form:
J_rec = W_cᵀφ_c,  u_rec = W_aᵀφ_a,  w_rec = W_dᵀφ_d
where W_c, W_a, W_d denote the weight vectors of the evaluation, execution, and disturbance neural networks;
b. Initialize the neural network weight vectors, giving initially stabilizing execution-network and disturbance-network weights. Under different behavior policies u_s and w, collect data pairs {X_1(n), u_s(n), w, X′_1(n)} from the original system and place them in a sample set; the number of collected samples is N_s, n = 1, …, N_s;
c. Using the collected samples and W^(i), construct a database and simultaneously update the weights of the evaluation-execution-disturbance neural networks by least squares (the explicit update equation omitted; rendered as an image in the original).
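The least-squares weight update for a linear-in-weights approximator can be sketched as follows. The basis phi, the true weights, and the noise-free targets are synthetic placeholders standing in for the patent's activation-function vectors and Bellman-equation targets:

```python
import numpy as np

# Batch least-squares fit of a linear-in-weights approximator W^T * phi(x).
rng = np.random.default_rng(0)

def phi(x):                       # illustrative basis (the patent uses powers of rho, r)
    return np.array([x, x**2, x**3])

W_true = np.array([1.0, -0.5, 2.0])           # synthetic "target" weights
X = rng.uniform(-1, 1, size=200)              # collected samples
Phi = np.stack([phi(x) for x in X])           # N x 3 regression matrix
y = Phi @ W_true                              # targets (noise-free for clarity)

W_hat, *_ = np.linalg.lstsq(Phi, y, rcond=None)   # batch least squares
print(np.allclose(W_hat, W_true, atol=1e-6))      # True
```

In the patent's algorithm the targets y would come from the reconstructed-data Bellman-type equation at iteration i, and the same least-squares step updates critic, actor, and disturbance weights simultaneously.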
preferably, the motor system H infinite reduced order output tracking controller based on reinforcement learning is:
Figure BDA0004062641280000054
the beneficial effects of the invention are as follows:
1) The H-infinity output tracking control problem of the original motor system is decomposed using singular perturbation theory into a reduced-order slow-subsystem problem, avoiding ill-conditioned numerical problems;
2) Based on the output-state data of the original system, a state reconstruction mechanism for the virtual subsystem is proposed to handle its unmeasurable data, and an H-infinity output tracking reinforcement learning iterative algorithm based on the reconstructed data is derived;
3) A reinforcement learning algorithm is introduced into the motor control system: execution-evaluation-disturbance neural networks approximate the controller, performance index, and disturbance, and the network weights are updated iteratively by least squares to obtain the reinforcement-learning-based reduced-order H-infinity output tracking controller.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a diagram of a motor system H infinite reduced order output tracking control framework based on reinforcement learning provided by an embodiment of the invention;
FIG. 2 is a schematic diagram of the weight convergence process of the evaluation neural network according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the weight convergence process of the first execution neural network according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the weight convergence process of the second execution neural network according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the weight convergence process of the disturbance neural network according to an embodiment of the present invention;
fig. 6 is a trace curve of the state of the closed-loop motor system under the action of the optimal control law provided by the embodiment of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1; referring to fig. 1, the motor system H infinite reduced order output tracking control method based on reinforcement learning includes the following steps:
step 101: the singular perturbation theory is utilized to decompose the H infinite output tracking control problem of the original motor system to obtain a reduced order problem, so that the occurrence of the disease state numerical value problem is avoided;
the specific method comprises the following steps:
(1-1) Consider a nonlinear two-time-scale motor system; without loss of generality, its state space model is described as:
dx_1/dt = f_11(x_1) + f_12(x_1)x_2 + g_1(x_1)u + k(x_1)w    (1)
ε dx_2/dt = f_21(x_1) + f_22(x_1)x_2 + g_2(x_1)u
where x_1, x_2 are the motor system state variables, u = [u_1, …, u_m] is the control input, w = [w_1, …, w_q] is the external disturbance, f_11, f_12, f_21, f_22 are the system dynamics, g_1, g_2 are the input dynamics, k is the disturbance dynamics, and 0 < ε ≪ 1 is a singular perturbation parameter. It is assumed that f_11, f_12, f_21, f_22, g_1, g_2, k are completely unknown and Lipschitz continuous, f(0) = 0, f_22 is invertible, and the fast subsystem is asymptotically stable in a very short time without applying a fast controller.
To make the slow state x_1 track a bounded reference trajectory r(t), it is assumed that there exists a Lipschitz continuous function h such that
dr(t)/dt = h(r(t))    (2)
The tracking error is defined as
ρ = Cx_1 − r(t)    (3)
The tracking error dynamics are
dρ/dt = C[f_11(x_1) + f_12(x_1)x_2 + g_1(x_1)u + k(x_1)w] − dr(t)/dt    (4)
(1-2) The original H-infinity output tracking control problem is: design a state feedback controller u = χ(ρ, r) that satisfies the L2 gain condition defined below in the presence of disturbance and makes the tracking error converge to 0 in the absence of disturbance:
∫₀^∞ e^(−αt) ‖z‖² dt ≤ γ² ∫₀^∞ e^(−αt) ‖w‖² dt    (5)
where ‖z‖² = ρᵀQρ + uᵀRu is the defined virtual control output, α > 0 is a discount factor, γ is the attenuation level from the disturbance input w(t) to the defined performance output z(t), Q = [C_1 C_2]ᵀ[C_1 C_2] > 0, and R > 0.
(1-3) The decomposed slow sub-problem is: design a controller u_s such that the slow-subsystem output state trajectory Cx_1s tracks the reference trajectory r(t).
The output tracking error of the reduced-order system is defined as
ρ_s = Cx_1s − r(t)    (6)
The tracking error dynamics are
dρ_s/dt = C[F_s(x_1s) + G_s(x_1s)u_s + K_s(x_1s)w] − dr(t)/dt    (7)
The virtual control output is defined as
‖z‖² = ρ_sᵀQρ_s + u_sᵀRu_s    (8)
The objective of the H-infinity reduced-order output tracking control problem is: given the tracking error ρ_s and the reference trajectory r, find a control policy u_s = χ(ρ_s, r), with χ a smooth function, satisfying the following conditions:
1) In the presence of disturbances, the system satisfies the L2 gain condition
∫₀^∞ e^(−αt) ‖z‖² dt ≤ γ² ∫₀^∞ e^(−αt) ‖w‖² dt    (9)
2) In the absence of disturbances, the output tracking error approaches 0.
Step 102: based on the output state data of the original system, a state reconstruction mechanism of the virtual subsystem is provided to solve the problem that the data of the virtual subsystem is not measurable, and an H infinite output tracking reinforcement learning iterative algorithm based on the reconstruction data is further deduced; comprising the following steps:
(2-1) Using the Primary System Slow dynamic State x 1 Reconstructing an unmeasurable virtual subsystem state, said reconstructing data x based 1 The slow subsystem H infinite reinforcement learning iterative algorithm is as follows:
Figure BDA0004062641280000081
wherein ,
Figure BDA0004062641280000082
i is the slow controller iteration index.
Step 103: introducing a reinforcement learning algorithm into a motor control system, and iteratively updating weights of a neural network based on a least square method by utilizing an execution-evaluation-disturbance neural network approximation controller, performance indexes and disturbance to obtain a reinforcement learning-based reduced-order H infinite output tracking controller, wherein the reinforcement learning-based reduced-order H infinite output tracking controller comprises the following steps of:
(3-1) designing a slow controller based on reinforcement learning, specifically:
selecting a slow evaluation neural network, an execution neural network and a linear independent activation function vector of a disturbance neural network as follows respectively
Figure BDA0004062641280000083
Designing an evaluation-execution-perturbation neural network for approximating a slow performance index J rec Slow controller u rec Disturbance w rec
Figure BDA0004062641280000084
Figure BDA00040626412800000814
Figure BDA0004062641280000085
wherein ,
Figure BDA0004062641280000086
the weight vectors of the slow evaluation neural network, the slow execution neural network and the first disturbance neural network are respectively represented.
Initializing the weight vector of the neural network
Figure BDA0004062641280000087
Given an initially stable execution network and perturbed network weights
Figure BDA0004062641280000088
In different behavior strategies u s Under the action of w, data pair ∈A is collected from the original system>
Figure BDA0004062641280000089
And put it into sample set +.>
Figure BDA00040626412800000810
In (2), the number of collected samples is N s ,n=1,…,N s
c, utilizing
Figure BDA00040626412800000811
and W(i) Further constructing a database, and simultaneously updating the weights of the evaluation-execution-disturbance neural network based on a least square method:
Figure BDA00040626412800000812
wherein ,
Figure BDA00040626412800000813
will be
Figure BDA0004062641280000091
Acting on the original motor system;
the motor system H infinite reduced order output tracking controller based on reinforcement learning is designed as follows
u=u s (15)。
Example 2
In order to enable those skilled in the art to better understand the invention, a motor system H infinite reduced order output tracking control method based on reinforcement learning is described in detail below with reference to specific embodiments;
consider the permanent magnet synchronous motor model:
Figure BDA0004062641280000092
Figure BDA0004062641280000093
Figure BDA0004062641280000094
wherein the number of pole pairs n p =4, viscous friction coefficient B υ =0.005 n·m·s, stator resistance R s =10.7Ω, synthetic rotor flux linkage
Figure BDA0004062641280000095
Direct axis and quadrature axis inductance L d =L q =0.0098 mH, moment of inertia +.>
Figure BDA00040626412800000910
Select state variable +.>
Figure BDA00040626412800000911
For motor rotation speed, direct axis current and quadrature axis current, the control input u= [ u ] 1 u 2 ] T =[u d u q ] T External disturbance for direct and quadrature voltages>
Figure BDA0004062641280000096
Time scale parameter for load torque
Figure BDA0004062641280000097
Obtaining
x 1 =-0.238x 1 +2.0114x 2 -4.7619w
Figure BDA0004062641280000098
The control objective of this embodiment is to design a state feedback controller so that the motor system follows a given reference trajectory when w ≡ 0 and satisfies the L2 gain condition
∫₀^∞ e^(−αt) ‖z‖² dt ≤ γ² ∫₀^∞ e^(−αt) ‖w‖² dt
with Q and R chosen as the first- and second-order identity matrices, respectively, and γ = 1.
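The discounted L2-gain condition in this objective can be checked numerically for given signal trajectories. The signals z and w below are hypothetical decaying signals, not trajectories of the actual closed-loop PMSM; α = 0.2 and γ = 1 follow this embodiment:

```python
import math

alpha, gamma, dt, T = 0.2, 1.0, 1e-3, 50.0

def z(t):  # hypothetical performance output, decaying under some controller
    return 0.5 * math.exp(-0.5 * t)

def w(t):  # hypothetical disturbance
    return math.exp(-0.1 * t)

# Riemann-sum approximations of the two discounted integrals
num = sum(math.exp(-alpha * t) * z(t)**2 * dt
          for t in (i * dt for i in range(int(T / dt))))
den = sum(math.exp(-alpha * t) * w(t)**2 * dt
          for t in (i * dt for i in range(int(T / dt))))
print(num <= gamma**2 * den)   # True: the attenuation level gamma is met
```

For these particular signals the left side evaluates to about 0.21 and the right side to about 2.5, so the gain condition holds with margin; a real verification would use measured closed-loop trajectories.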
When designing the H-infinity output tracking controller, four neural networks are introduced: an evaluation neural network, two execution neural networks, and a disturbance neural network. The reference trajectory is r = 0.2cos(0.2t) with initial value 0, the initial value of x_1 is 1, and C = 1, Q = I, R = 1, α = 0.2, γ = 1. The basis functions of the evaluation network are σ = [ρ², ρ³, ρ⁴, r, r³]; the execution-network and disturbance-network basis functions and the initial weights are omitted (images in original).
Detection noise is applied, sample data are collected, and the iteration converges the weights of the neural networks. The weight iteration process of the slow-subsystem evaluation neural network is shown in fig. 2, those of the execution neural networks in figs. 3-4, and that of the disturbance neural network in fig. 5. From the converged execution-network weights, combined with the execution-network expression, the H-infinity reduced-order tracking controller u = u_s (15) is obtained.
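A sketch of evaluating this embodiment's basis functions and a linear-in-weights controller follows. The critic basis σ and the reference r = 0.2cos(0.2t) are taken from the text, while the actor basis ψ and the weights W_a are hypothetical (the originals are rendered as images):

```python
import math

def sigma(rho, r):
    """Critic basis from the embodiment: [rho^2, rho^3, rho^4, r, r^3]."""
    return [rho**2, rho**3, rho**4, r, r**3]

def reference(t):
    """Reference trajectory r(t) = 0.2 cos(0.2 t) from the embodiment."""
    return 0.2 * math.cos(0.2 * t)

def u_s(rho, r, W_a, psi):
    """Linear-in-weights controller u_s = W_a^T psi(rho, r)."""
    return sum(w * p for w, p in zip(W_a, psi(rho, r)))

psi = lambda rho, r: [rho, rho**3, r]     # hypothetical actor basis
W_a = [-1.2, -0.1, 0.4]                   # hypothetical converged weights
rho0 = 1.0 - reference(0.0)               # x1(0) = 1, so rho(0) = 1 - 0.2 = 0.8
print(round(u_s(rho0, reference(0.0), W_a, psi), 4))
```

Once least-squares iteration has converged, evaluating the controller online reduces to this single inner product at each sampling instant, which is what makes the learned reduced-order controller cheap to deploy.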
The state trajectory of the closed-loop motor system under the reduced-order tracking controller is shown in fig. 6; it can be seen that the system follows the given reference trajectory in the absence of disturbance.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (5)

1. The motor system H infinite reduced order output tracking control method based on reinforcement learning is characterized by comprising the following steps of:
step one: decomposing the H infinite output tracking control problem of the original motor system by utilizing a singular perturbation theory to obtain a reduced-order system problem;
step two: based on the output state data of the original system, a state reconstruction mechanism of the virtual subsystem is provided to solve the problem that the data of the virtual subsystem is not measurable, and an H infinite output tracking reinforcement learning iterative algorithm based on the reconstruction data is further deduced;
step three: and introducing an execution-evaluation-disturbance neural network approximation controller, performance indexes and disturbance, and iteratively updating the weight of the neural network based on a least square method to obtain a reinforcement learning-based reduced order controller.
2. The reinforcement learning-based motor system H infinite reduced order output tracking control method according to claim 1, wherein in step one, the motor system is described by the following state space model:
dx_1/dt = f_11(x_1) + f_12(x_1)x_2 + g_1(x_1)u + k(x_1)w
ε dx_2/dt = f_21(x_1) + f_22(x_1)x_2 + g_2(x_1)u
where x_1, x_2 are the motor system state variables, u = [u_1, …, u_m] is the control input, w = [w_1, …, w_q] is the external disturbance, f_11, f_12, f_21, f_22 are the system dynamics, g_1, g_2 are the input dynamics, k is the disturbance dynamics, and 0 < ε ≪ 1 is a singular perturbation parameter; it is assumed that f_11, f_12, f_21, f_22, g_1, g_2, k are completely unknown and Lipschitz continuous, f(0) = 0, f_22 is invertible, and the fast subsystem is asymptotically stable in a short time without applying a fast controller;
to make the slow state x_1 track a bounded reference trajectory r(t), it is assumed that there exists a Lipschitz continuous function h such that
dr(t)/dt = h(r(t));
the tracking error is defined as
ρ = Cx_1 − r(t);
the tracking error dynamics are
dρ/dt = C[f_11(x_1) + f_12(x_1)x_2 + g_1(x_1)u + k(x_1)w] − dr(t)/dt;
the original H-infinity output tracking control problem is: design a state feedback controller u = χ(ρ, r) that satisfies the L2 gain condition defined below in the presence of disturbance and makes the tracking error converge to 0 in the absence of disturbance:
∫₀^∞ e^(−αt) ‖z‖² dt ≤ γ² ∫₀^∞ e^(−αt) ‖w‖² dt
where ‖z‖² = ρᵀQρ + uᵀRu is the defined virtual control output, α > 0 is a discount factor, γ is the attenuation level from the disturbance input w(t) to the defined performance output z(t), Q = [C_1 C_2]ᵀ[C_1 C_2] > 0, and R > 0;
the original system is simplified into the following reduced-order system:
dx_1s/dt = F_s(x_1s) + G_s(x_1s)u_s + K_s(x_1s)w
y = Cx_1s
where C is the system output matrix, x_1s is the reduced-order system state, and
F_s(x_1s) = f_11(x_1s) − f_12(x_1s) f_22⁻¹(x_1s) f_21(x_1s)
G_s(x_1s) = g_1(x_1s) − f_12(x_1s) f_22⁻¹(x_1s) g_2(x_1s)
K_s(x_1s) = k(x_1s);
the original H-infinity output tracking control problem is simplified into the following H-infinity reduced-order output tracking problem:
design a controller u_s such that the reduced-order system output state trajectory Cx_1s tracks the reference trajectory r(t);
the output tracking error of the reduced-order system is defined as
ρ_s = Cx_1s − r(t);
the tracking error dynamics are
dρ_s/dt = C[F_s(x_1s) + G_s(x_1s)u_s + K_s(x_1s)w] − dr(t)/dt;
the virtual control output is defined as
‖z‖² = ρ_sᵀQρ_s + u_sᵀRu_s;
the objective of the H-infinity reduced-order output tracking control problem is: given the tracking error ρ_s and the reference trajectory r, find a control policy u_s = χ(ρ_s, r), with χ a smooth function, satisfying the following conditions:
1) in the presence of disturbances, the system satisfies the L2 gain condition
∫₀^∞ e^(−αt) ‖z‖² dt ≤ γ² ∫₀^∞ e^(−αt) ‖w‖² dt;
2) in the absence of disturbances, the output tracking error approaches 0.
3. The reinforcement-learning-based motor system H-infinity reduced-order output tracking control method according to claim 1, wherein in step two the state reconstruction mechanism of the virtual subsystem is: use the slow dynamic state x_1 of the original system to reconstruct the unmeasurable virtual subsystem state; the slow-subsystem H-infinity reinforcement learning iterative algorithm based on the reconstructed data x_1 is:
(iterative update equation omitted; rendered as an image in the original)
where i is the slow-controller iteration index.
4. The motor system H infinite reduced order output tracking control method based on reinforcement learning according to claim 1, wherein in the third step, the slow controller design method based on reinforcement learning specifically comprises:
selecting and evaluating the neural network, executing the neural network and perturbing the linear independent activation function vectors of the neural network to be respectively
Figure FDA0004062641270000033
Design evaluation-execution-perturbation neural network for approximating performance index J rec Controller u rec Disturbance w rec
Figure FDA0004062641270000034
Figure FDA0004062641270000035
Figure FDA0004062641270000036
wherein ,
Figure FDA0004062641270000037
respectively representing the weight vectors of the evaluation neural network, the execution neural network and the disturbance neural network;
a. initializing the neural network weight vectors
Figure FDA0004062641270000038
giving initial stabilizing execution network and disturbance network weights
Figure FDA0004062641270000039
b. under different behavior strategies u_s and disturbances w, collecting data pairs {X_1(n), u_s(n), w, X'_1(n)} from the original system and placing them in the sample set
Figure FDA00040626412700000310
where the number of collected samples is N_s, n = 1, …, N_s;
c. utilizing
Figure FDA00040626412700000311
and W^(i), further constructing the database, and simultaneously updating the weights of the evaluation, execution, and disturbance neural networks based on the least-squares method:
Figure FDA00040626412700000312
wherein
Figure FDA00040626412700000313
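Steps b and c of the claim amount to off-policy data collection followed by a batch least-squares fit of linear-in-the-weights networks. A minimal sketch with a toy discrete-time stand-in for the slow dynamics and a synthetic regression target so the fit can be verified; all matrices, the basis functions, and the target are illustrative assumptions, not the patent's actual construction:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy discrete-time stand-in for the slow dynamics (illustrative matrices)
A1 = np.array([[0.95, 0.05], [0.0, 0.90]])
B1 = np.array([[0.0], [0.10]])
D1 = np.array([[0.05], [0.0]])

def phi(x):
    """Quadratic activation vector, a common critic basis (illustrative choice)."""
    return np.array([x[0] * x[0], x[0] * x[1], x[1] * x[1]])

# step b: collect {x1, u_s, w, x1'} tuples under an exploratory behavior policy,
# restarting from random states so the data are persistently exciting
N_s = 200
samples = []
for n in range(N_s):
    x1 = rng.standard_normal(2)
    u_s = rng.uniform(-1.0, 1.0, 1)      # exploratory behavior control
    w = 0.1 * rng.standard_normal(1)     # probing disturbance
    x1_next = A1 @ x1 + B1 @ u_s + D1 @ w
    samples.append((x1, u_s, w, x1_next))

# step c: batch least-squares weight update; the regression target here is
# synthetic (a known weight vector plus small noise) so the fit can be checked
Phi = np.stack([phi(s[0]) for s in samples])   # (N_s, n_basis) activation matrix
W_true = np.array([1.0, -0.5, 2.0])            # "unknown" weights to recover
y = Phi @ W_true + 1e-3 * rng.standard_normal(N_s)
W_hat, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print(np.allclose(W_hat, W_true, atol=1e-2))  # True
```

In the patent's algorithm the target vector would instead be built from the iterative H infinite equation on the collected tuples; the batch solve itself is the same.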
5. The reinforcement learning-based motor system H infinite reduced order output tracking control method according to claim 1, wherein the reinforcement learning-based motor system H infinite reduced order output tracking controller is:
Figure FDA0004062641270000041
CN202310067097.7A 2023-01-30 2023-01-30 Motor system H infinite reduced order output tracking control method based on reinforcement learning Pending CN116208041A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310067097.7A CN116208041A (en) 2023-01-30 2023-01-30 Motor system H infinite reduced order output tracking control method based on reinforcement learning

Publications (1)

Publication Number Publication Date
CN116208041A true CN116208041A (en) 2023-06-02

Family

ID=86512321

Country Status (1)

Country Link
CN (1) CN116208041A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination