CN112180996A - Liquid level fault-tolerant control method based on reinforcement learning - Google Patents

Publication number
CN112180996A
Authority
CN
China
Prior art keywords
liquid level
fault
output
weight
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010947314.8A
Other languages
Chinese (zh)
Inventor
张大鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN202010947314.8A
Publication of CN112180996A

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D9/00 Level control, e.g. controlling quantity of material stored in vessel
    • G05D9/12 Level control, e.g. controlling quantity of material stored in vessel characterised by the use of electric means

Abstract

A liquid level fault-tolerant control system based on reinforcement learning is used for fault-tolerant control of a multi-tank system. Its only precondition is that a fault has been detected; no further diagnosis of the fault is needed. This precondition is easy to satisfy in fault detection and diagnosis, for which a number of mature methods, such as PCA and Bayesian decision making, already exist. In addition, the evaluation-action structure is implemented mainly with artificial neural networks, which are robust and can effectively suppress the influence of noise. The invention can control the system directly from acquired data without any training samples, so that the liquid level of each container reaches the same target as under fault-free conditions. The control quantity obtained by the method is the optimal control quantity when the system is faulty, and yields the best performance index attainable under the fault.

Description

Liquid level fault-tolerant control method based on reinforcement learning
Technical Field
The invention relates to a liquid level fault-tolerant control method, and in particular to a liquid level fault-tolerant control method based on reinforcement learning.
Background
In industrial and agricultural production, the liquid level of a container often needs to be controlled, for example to regulate automatically the amount of water stored in a water tank, pool or boiler. A large number of mature products can be used directly to control the water level of a single container. In industrial production (for example, crystallizer liquid level), however, several containers are often connected through valves, and the set height of each container is maintained by regulating the valve openings so that the liquid-phase reaction in the containers proceeds efficiently. The liquid level nevertheless often deviates from its set value because of detection signal deviations caused by reduced sensor precision, degraded valve controller performance, or leakage from a tank caused by seal failure, which lowers the efficiency of the liquid-phase reaction. The approach commonly adopted to deal with this is fault-tolerant control.
Traditional data-driven artificial intelligence methods require sample data for training in advance. However, because the time at which a fault occurs is uncertain and the fault type is random, it is difficult to obtain enough valid fault data to serve as training samples.
Disclosure of Invention
The invention aims to provide a liquid level fault-tolerant control method based on reinforcement learning which, by adjusting the flow, keeps the liquid level of each container at its fault-free height even when a fault is present.
The technical scheme adopted by the invention is as follows: a liquid level fault-tolerant control system based on reinforcement learning, for use with a multi-tank system, comprising: an information acquisition unit for acquiring the liquid level information of each water tank at different moments; a fault-free model for predicting the liquid level information of all the water tanks at time k+1 from the liquid level information of all the water tanks at time k output by the information acquisition unit and the control information of the frequency converter; an evaluation network for estimating the overall values V(k) and V(k+1) of the frequency-converter control variables at time k and time k+1 from the liquid level information of all the water tanks at those times; a stage value evaluation unit for evaluating the stage value R(k) from the liquid level information at time k+1 output by the information acquisition unit and the liquid level information at time k+1 predicted by the fault-free model; a deviation estimation unit for outputting the fitness function used for weight updating from the stage value output by the stage value evaluation unit and the overall values V(k) and V(k+1) output by the evaluation network; a weight updating unit for updating the weights of the evaluation network according to the fitness function output by the deviation estimation unit, the evaluation network then outputting, from all the updated weights received from the weight updating unit, the weights related to the control quantity u(k) of the frequency converter; and an action network which, from the weights related to u(k) output by the evaluation network and the liquid level information of all the water tanks at time k output by the information acquisition unit, iteratively updates itself to obtain the optimal control variable and uses it to control the frequency converter of the multi-tank system.
The liquid level fault-tolerant control method based on reinforcement learning has the following advantages:
1. The method does not need to diagnose and locate the fault type and position in advance; it applies a data-driven method directly to fault-tolerant control of the container liquid level.
2. The method resolves the conflict between the need of traditional artificial intelligence methods for sufficient training samples and the difficulty of obtaining such sample data from a real system. It can control the system directly from acquired data without any training samples, so that the liquid level of each container reaches the same target as under fault-free conditions.
3. The control quantity obtained by the method is the optimal control quantity when the system is faulty, and yields the best performance index attainable under the fault.
Drawings
FIG. 1 is a schematic diagram of a control structure of a liquid level fault-tolerant control method based on reinforcement learning according to the present invention;
FIG. 2 is a schematic diagram of an evaluation neural network in accordance with the present invention;
FIG. 3 is a schematic diagram of an acting neural network according to the present invention;
FIG. 4 is a schematic structural diagram of a three-tank system according to an embodiment of the present invention;
FIG. 5 is a liquid level diagram of T3 in an actuator output deviation fault scenario in accordance with an embodiment of the present invention;
FIG. 6 is a diagram illustrating the evolution of each state in an actuator output deviation fault scenario in accordance with an embodiment of the present invention;
FIG. 7 is a control variable diagram in an actuator output deviation fault scenario in accordance with an embodiment of the present invention;
FIG. 8 is a liquid level diagram of T3 in an actuator stuck fault scenario in accordance with an embodiment of the present invention;
FIG. 9 is a diagram illustrating the evolution of each state in an actuator stuck fault scenario in accordance with an embodiment of the present invention;
FIG. 10 is a control variable diagram in an actuator stuck fault scenario in accordance with an embodiment of the present invention;
FIG. 11 is a liquid level diagram of T3 when, in a similar stuck fault, the opening of submersible pump 1 is reduced to 30%, in accordance with an embodiment of the present invention;
FIG. 12 is a diagram illustrating the evolution of each state when, in a similar stuck fault, the opening of submersible pump 1 is reduced to 30%, in accordance with an embodiment of the present invention;
FIG. 13 is a control variable diagram when, in a similar stuck fault, the opening of submersible pump 1 is reduced to 30%, in accordance with an embodiment of the present invention;
FIG. 14 is a liquid level diagram of T3 in a leak fault scenario according to an embodiment of the present invention;
FIG. 15 is a diagram illustrating the evolution of various states in a leakage failure scenario in accordance with an embodiment of the present invention;
FIG. 16 is a graph of control variables in a leakage fault scenario in accordance with an embodiment of the present invention.
Detailed Description
The liquid level fault-tolerant control method based on reinforcement learning of the invention is described in detail below with reference to the embodiments and the accompanying drawings.
As shown in fig. 1, the liquid level fault-tolerant control system based on reinforcement learning of the present invention is a fault-tolerant control system for a multi-tank system and comprises: an information acquisition unit 1 for acquiring the liquid level information of each water tank at different moments; a fault-free model 3 for predicting the liquid level information of all the water tanks at time k+1 from the liquid level information of all the water tanks at time k output by the information acquisition unit 1 and the control information of the frequency converter; an evaluation network 2 for estimating the overall values V(k) and V(k+1) of the frequency-converter control variables at time k and time k+1 from the liquid level information of all the water tanks at time k and time k+1 output by the information acquisition unit 1; a stage value evaluation unit 4 for evaluating the stage value R(k) from the liquid level information of all the water tanks at time k+1 output by the information acquisition unit 1 and the liquid level information of all the water tanks at time k+1 predicted by the fault-free model 3; a deviation estimation unit 5 for outputting the fitness function used for weight updating from the stage value output by the stage value evaluation unit 4 and the overall values V(k) and V(k+1) output by the evaluation network 2; a weight updating unit 6 for updating the weights of the evaluation network 2 according to the fitness function output by the deviation estimation unit 5, the evaluation network 2 then outputting, from all the updated weights received from the weight updating unit 6, the weights related to the control quantity u(k) of the frequency converter; and an action network 7 which, from the weights related to u(k) output by the evaluation network 2 and the liquid level information of all the water tanks at time k output by the information acquisition unit 1, iteratively updates itself to obtain the optimal control variable and uses it to control the frequency converter of the multi-tank system. Wherein:
1) the liquid level information of all the water tanks at the moment k output by the information acquisition unit 1 is represented as x (k), and the liquid level information of all the water tanks at the moment k +1 is represented as x (k + 1).
2) The fault-free model 3 is represented as follows:
Figure BDA0002675735050000031
Figure BDA0002675735050000032
Figure BDA0002675735050000033
Figure BDA0002675735050000034
where $x_1$, $x_2$, $x_3$ and $x_n$ are the liquid levels of water tanks T1, T2, T3 and Tn, respectively; $S_1$, $S_2$, $S_3$ and $S_n$ are the cross-sectional areas of water tanks T1, T2, T3 and Tn, respectively; $g$ is the gravitational acceleration; and the parameters are as follows:
Figure BDA0002675735050000035
Parameter(s)
Figure BDA0002675735050000036
Parameter(s)
Figure BDA0002675735050000037
Parameter(s)
Figure BDA0002675735050000038
Parameter(s)
Figure BDA0002675735050000039
where $R_{12}$ is the flow resistance between tank 1 and tank 2, $R_{32}$ is the flow resistance between tank 3 and tank 2, $R_{43}$ is the flow resistance between tank 4 and tank 3, $R_{n-1,n}$ is the flow resistance between tank n-1 and tank n, $R_n$ is the drainage resistance of tank Tn, and $\rho$ is the liquid density;
Figure BDA00026757350500000310
$Q_1$ and $Q_2$ are the flow rates of submersible pump 1 and submersible pump 2.
3) The evaluation network 2 is shown in fig. 2 and includes an input layer, a hidden layer and an output layer which are all connected in sequence, wherein the input layer has n +2 neurons, the hidden layer has 2n neurons, and the output layer has 1 neuron.
4) The stage value evaluation unit 4 is composed of the following formula:
Figure BDA00026757350500000311
where $R(k)$ is the stage value; $x(k+1)$ is the liquid level information of all the water tanks at time k+1; and $x_r(k+1)$ is the liquid level information of all the water tanks at time k+1 as predicted by the fault-free model (3).
5) The deviation estimating unit 5 is composed of the following formula:
TE=V(k)-R(k)+γV(k+1)
where TE is the deviation; V(k) and V(k+1) are the overall values of the frequency-converter control variables at time k and time k+1, respectively; R(k) is the stage value; and γ is the discount factor.
6) The weight updating unit 6 includes:
(1) represent the weights $W_{c1}$ between the input layer and the hidden layer and the weights $W_{c2}$ between the hidden layer and the output layer of the evaluation network 2 by the position of a particle, and randomly select the initial particle values;
(2) the fitness function for each particle is calculated according to the following formula:
Figure BDA0002675735050000041
where $FF(z(k))$ is the fitness function of the i-th particle at the p-th iteration; $V(k)$ and $V(k+1)$ are the overall values of the frequency-converter control variables at time k and time k+1, respectively; $R(k)$ is the stage value; $\gamma$ is the discount factor; and $z(k)$ is the combination of the liquid level information $x(k)$ of all the water tanks at time k and the control information $u(k)$ of the frequency converter;
(3) from the fitness function values and the following formulas, obtain the current optimal position $p_{best}$ of the particle swarm and the optimal position $g_{best}$ experienced by the whole particle swarm, and update $p_{best}$ and $g_{best}$:
Figure BDA0002675735050000042
Figure BDA0002675735050000043
where i is the particle index, m is the number of particles, and p is the iteration number;
(4) update the particle velocity $v_i$ and the particle position $z_i$ according to the basic iterative formulas of particle swarm optimization:
Figure BDA0002675735050000044
Figure BDA0002675735050000045
where $z$ is the particle position; $v$ is the particle velocity; $\omega$ is the inertia weight; $c_1$ and $c_2$ are the acceleration constants; rand1 and rand2 are two random numbers generated independently of each other in [0, 1]; $p_{best}$ is the current optimal position of the particle swarm; $g_{best}$ is the optimal position experienced by the whole particle swarm; and $(p)$ denotes the iteration number;
(5) repeat steps (2) to (4) until convergence, and record the current optimal position of the particle swarm as $g_{best1}$;
(6) redistribute the particles using random numbers in [0, 1] and obtain new fitness function values;
(7) repeat steps (2) to (4) until convergence, and record the current optimal position of the particle swarm as $g_{best2}$;
(8) if the optimal position $g_{best2}$ is better than the optimal position $g_{best1}$, replace $g_{best1}$ with $g_{best2}$; otherwise keep $g_{best1}$ unchanged;
(9) repeat steps (2) to (8) until no better optimal position can be found, giving the final position $g_{best1}$;
(10) the particle position at $g_{best1}$ is the solution for the weights $W_{c1}$ and $W_{c2}$ of the evaluation network.
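Steps (1) to (10) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the velocity and position update formulas are not reproduced in this text, so the standard particle swarm update with a per-particle best position is assumed; the fitness is assumed to be minimized; "until convergence" is approximated by a fixed iteration count; and "until no better position can be found" is approximated by a hypothetical `max_stall` limit on consecutive unsuccessful restarts. The names `pso_run`, `pso_with_restarts`, `m`, `iters` and `max_stall` are introduced here for illustration only.

```python
import random


def pso_run(fitness, dim, rng, m=20, iters=100, omega=0.7, c1=1.5, c2=1.5):
    """One swarm run (steps 1-5). Returns (best position, best fitness)."""
    # Step (1): random initial positions in [0, 1]; velocities start at zero.
    z = [[rng.random() for _ in range(dim)] for _ in range(m)]
    v = [[0.0] * dim for _ in range(m)]
    # Step (2)-(3): evaluate fitness and initialise the best positions.
    p_best = [zi[:] for zi in z]
    p_fit = [fitness(zi) for zi in z]
    g = min(range(m), key=lambda i: p_fit[i])
    g_best, g_fit = p_best[g][:], p_fit[g]
    for _ in range(iters):  # assumption: fixed budget instead of a convergence test
        for i in range(m):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Step (4): assumed standard PSO velocity and position update.
                v[i][d] = (omega * v[i][d]
                           + c1 * r1 * (p_best[i][d] - z[i][d])
                           + c2 * r2 * (g_best[d] - z[i][d]))
                z[i][d] += v[i][d]
            f = fitness(z[i])
            if f < p_fit[i]:  # assumption: smaller fitness is better
                p_best[i], p_fit[i] = z[i][:], f
                if f < g_fit:
                    g_best, g_fit = z[i][:], f
    return g_best, g_fit


def pso_with_restarts(fitness, dim, seed=0, max_stall=3, **kw):
    """Steps (5)-(10): rerun with redistributed particles, keep the better optimum."""
    rng = random.Random(seed)
    best, best_fit = pso_run(fitness, dim, rng, **kw)      # g_best1
    stall = 0
    while stall < max_stall:
        cand, cand_fit = pso_run(fitness, dim, rng, **kw)  # g_best2 after redistribution
        if cand_fit < best_fit:                            # step (8)
            best, best_fit = cand, cand_fit
            stall = 0
        else:
            stall += 1
    return best, best_fit
```

In the setting of the text, `fitness` would evaluate the deviation-based fitness function for the evaluation network weights encoded in the particle position, and `dim` would be the total number of entries in $W_{c1}$ and $W_{c2}$.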
7) The action network 7 is shown in fig. 3 and comprises an input layer, a hidden layer and an output layer connected in sequence, wherein the input layer has n neurons, the hidden layer has n+3 neurons, and the output layer has 2 neurons; the weights between the input layer and the hidden layer are $W_{a1}$, and the weights between the hidden layer and the output layer are $W_{a2}$.
8) The weight changes of the action network 7 are
$\Delta W_{a2} = l \cdot W_{c2} \cdot [s_{out,c}(1 - s_{out,c})] \cdot W_{c1,u} \cdot s_{out,a}$
$\Delta W_{a1} = l \cdot W_{c2} \cdot [s_{out,c}(1 - s_{out,c})] \cdot W_{c1,u} \cdot W_{a2} \cdot [s_{out,a}(1 - s_{out,a})] \cdot x(k)$
where $l$ is the learning rate; $W_{c2}$ denotes the weights between the hidden layer and the output layer of the evaluation network 2; $s_{out,c}$ and $s_{out,a}$ are the outputs of the nonlinear functions in the evaluation network 2 and the action network 7, respectively; $W_{c1,u}$ denotes the weights into the hidden layer of the evaluation network 2 that are related to the control quantity $u(k)$ of the frequency converter; $W_{a2}$ denotes the weights between the hidden layer and the output layer of the action network 7; and $x(k)$ is the liquid level information of all the water tanks at time k. $W_{c1,u}$, $W_{c2}$, $s_{out,c}$, $s_{out,a}$ and $W_{a2}$ are all obtained from the evaluation network and the action network;
the weights $W_{a1}$ and $W_{a2}$ of the action network are updated according to the following formulas:
$W_{a1}' = W_{a1} + \Delta W_{a1}$
$W_{a2}' = W_{a2} + \Delta W_{a2}$
where $W_{a1}'$ and $W_{a2}'$ are the updated weights between the input layer and the hidden layer and between the hidden layer and the output layer of the action network 7, respectively.
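The evaluation network, the deviation TE and the action network update described in items 3) to 8) can be sketched together in Python. The sketch rests on several assumptions that the text does not confirm: both hidden layers use a sigmoid (suggested by the $s(1-s)$ factors), both output layers are linear, no bias terms are used, and $W_{c1,u}$ is taken to be the columns of $W_{c1}$ that multiply the control inputs. The stage value formula is not reproduced in this text, so a squared deviation from the fault-free prediction is assumed in `stage_value`. The deviation and the weight changes are coded literally as written above; the sign convention of the learning rate $l$ is not stated in the text and is left to the caller. All function names are hypothetical.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def critic_forward(w_c1, w_c2, x, u):
    """Evaluation network: input [x, u] -> sigmoid hidden layer -> linear output V."""
    z = list(x) + list(u)
    s_out_c = [sigmoid(sum(w * zi for w, zi in zip(row, z))) for row in w_c1]
    v = sum(w * s for w, s in zip(w_c2, s_out_c))
    return v, s_out_c


def actor_forward(w_a1, w_a2, x):
    """Action network: input x -> sigmoid hidden layer -> linear outputs u."""
    s_out_a = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_a1]
    u = [sum(w * s for w, s in zip(row, s_out_a)) for row in w_a2]
    return u, s_out_a


def stage_value(x_next, x_ref_next):
    """Assumed form of R(k): squared gap between measured and fault-free levels."""
    return sum((a - b) ** 2 for a, b in zip(x_next, x_ref_next))


def deviation(v_k, v_k1, r_k, gamma):
    """TE = V(k) - R(k) + gamma * V(k+1), exactly as written in the text."""
    return v_k - r_k + gamma * v_k1


def actor_update(w_a1, w_a2, x, s_out_a, w_c1, w_c2, s_out_c, l):
    """Apply W_a1' = W_a1 + dW_a1 and W_a2' = W_a2 + dW_a2 (new lists returned)."""
    n_x, n_u, n_h = len(x), len(w_a2), len(s_out_a)
    # Assumption: W_c1,u is the block of W_c1 columns fed by the controls u(k).
    w_c1_u = [row[n_x:] for row in w_c1]
    # W_c2 * [s_c (1 - s_c)] * W_c1,u, one value per control output.
    dv_du = [sum(w_c2[j] * s_out_c[j] * (1.0 - s_out_c[j]) * w_c1_u[j][m]
                 for j in range(len(w_c2))) for m in range(n_u)]
    # ... * W_a2, propagated back to each hidden neuron (old W_a2 is used).
    back = [sum(dv_du[m] * w_a2[m][h] for m in range(n_u)) for h in range(n_h)]
    new_w_a2 = [[w_a2[m][h] + l * dv_du[m] * s_out_a[h] for h in range(n_h)]
                for m in range(n_u)]
    new_w_a1 = [[w_a1[h][i] + l * back[h] * s_out_a[h] * (1.0 - s_out_a[h]) * x[i]
                 for i in range(n_x)] for h in range(n_h)]
    return new_w_a1, new_w_a2
```

For n tanks and two pumps, `w_c1` would have 2n rows of n+2 entries and `w_a1` would have n+3 rows of n entries, matching the layer sizes given above.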
Experimental validation is given below.
The proposed method was verified using a three-tank system as the experimental platform. The three-tank system consists of water tanks T1, T2 and T3; submersible pumps 1 and 2, whose flow rates Q1 and Q2 are controlled by a digital controller; connecting valves CV1, CV2 and CV3; leakage valves LV1, LV2 and LV3; and pipelines. The liquid level of each of the tanks T1, T2 and T3 is measured separately by a liquid level meter. The three tanks T1, T2 and T3 have pipe connections of the same size. The system operates with the connecting valves open and the leakage valves closed. Thus, the liquid in the reservoir flows into the tank through the connecting valve CV3 and re-enters the tank body through submersible pumps 1 and 2. The inter-tank flow resistance can be changed by manually adjusting the openings of the connecting valves CV1, CV2 and CV3 and the leakage valves LV1, LV2 and LV3. The flow rates of submersible pumps 1 and 2 are determined by their rotational speeds, and each speed is controlled by a separate frequency converter. The controller outputs a frequency converter control signal of 0-5 V. The relation between the pump flow and the frequency control signal was obtained through additional experiments. In what follows, for clarity, we omit the frequency converter and use the pump flow rate, rather than the rotational speed, as the control variable of the controlled object. The structure is shown in fig. 4.
The fault-free model of the three-tank system is given by the following formulas:
Figure BDA0002675735050000051
Figure BDA0002675735050000052
Figure BDA0002675735050000053
In the formula, each variable has the same meaning as described above.
In the fault-free case we use a PID controller for submersible pump 1 and keep submersible pump 2 at 50% opening (2.5 V, the midpoint of the 0-5 V control signal), which keeps the liquid level of T3 at its reference value. We call this steady condition the fault-free standard state. When a fault occurs, a fault-tolerant control (FTC) controller with two outputs (the flow rates of submersible pumps 1 and 2) replaces the previous controllers (the PID for submersible pump 1 and the fixed 50% opening for submersible pump 2). Our goal is to maintain the reference level in T3 by controlling the flow rates of submersible pump 1 and submersible pump 2 separately.
A. Actuator output deviation fault scenario
Starting from the fault-free case, an actuator fault of submersible pump 1 is simulated by changing the relation between the pump flow and the frequency control signal, so that the flow increases or decreases relative to the value that the controller output would originally produce. In this way the actuator output deviation fault is produced in software, and damage to the real actuator is avoided. After sample 100, the actuator of submersible pump 1 develops a fault in which the flow is greater than the initial set value, at 12 L/min (converted using the relation between pump flow and frequency control signal). The liquid level, state evolution and control variables of T3 are shown in figs. 5, 6 and 7.
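The software fault injection described above can be illustrated with a small Python sketch. It assumes that the fault is a constant additive flow offset applied from a given sample onward; the linear pump characteristic `linear_pump`, its maximum flow `q_max`, and the offset value used in the usage note are hypothetical placeholders, since the measured relation between control signal and flow is not given in the text.

```python
def linear_pump(signal_v, q_max=20.0):
    """Hypothetical pump characteristic: flow proportional to a 0-5 V signal."""
    return q_max * signal_v / 5.0


def with_output_bias(nominal_flow, fault_sample, bias):
    """Return a signal-to-flow map that adds `bias` from `fault_sample` onward."""
    def faulty_flow(signal_v, k):
        q = nominal_flow(signal_v)
        # Before the fault the nominal relation holds; afterwards it is shifted.
        return q + bias if k >= fault_sample else q
    return faulty_flow
```

For example, `with_output_bias(linear_pump, 100, 3.0)` leaves the flow unchanged for samples before 100 and adds 3.0 flow units to it from sample 100 onward, without touching the physical actuator.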
The first and second curves represent the cases without FTC and with FTC, respectively. Fig. 6 shows that the states x1, x2 and x3 remain stable in the fault-free state until the fault occurs, and the liquid level of T3 stays at the reference level (fig. 5). When the fault occurs at sample 100, state x1 rises because more flow enters T1, and, without FTC, states x2 and x3 also rise because the tanks are coupled. After a transient, the liquid level of T3 moves from 10 cm to 15 cm and settles in another steady state. The FTC controller was designed as per procedure 1, using a 3-10-2 feedforward neural network. 100 data points were selected from the training set, and training was carried out with the Levenberg-Marquardt algorithm. The trained neural network is used as the FTC. The algorithm clearly restores the liquid level of T3.
Further explanation of the control variables is given with reference to fig. 7. In fig. 7, the horizontal axis is the sample index and the vertical axis is the pump flow rate. The zero of the vertical axis represents the pump flow rate in the fault-free standard state. We use this zero-referenced scale rather than the actual flow because the fault-free standard state varies with the reference level of T3. Negative values mean less flow, and positive values more flow, than in the fault-free standard state. The first and second curves represent the flow rates without and with FTC, respectively. It can be seen that pump 1 reduces its flow to counteract the actuator output fault. Pump 2 also decreases its output to keep the T3 liquid level at the reference level.
B. Actuator stuck fault scenario
After sample 100, pump 1 experienced a stuck fault at 60% opening (3 V on the 0-5 V frequency converter control signal), indicating that pump 1 was stuck owing to loss of control. Figs. 8, 9 and 10 show the liquid level, state evolution and control variables of T3, respectively.
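A stuck fault of this kind can be emulated in the same software fashion. The Python sketch below assumes that the opening percentage maps linearly onto the 0-5 V control signal, which agrees with the values quoted in the text (50% at 2.5 V, 60% at 3 V); the function names are hypothetical.

```python
def opening_to_signal(opening_pct, v_max=5.0):
    """Map an opening percentage to the 0-5 V frequency converter signal."""
    return v_max * opening_pct / 100.0


def with_stuck_fault(fault_sample, stuck_opening_pct):
    """Return a map from commanded signal to applied signal with a stuck actuator."""
    stuck_v = opening_to_signal(stuck_opening_pct)

    def applied_signal(commanded_v, k):
        # From the fault sample onward the command is ignored and the
        # actuator stays at the stuck opening.
        return stuck_v if k >= fault_sample else commanded_v

    return applied_signal
```

With `with_stuck_fault(100, 60)`, the controller command passes through unchanged before sample 100 and is replaced by 3.0 V afterwards, so the controller loses its authority over the pump.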
As the first curve of fig. 9 shows, if the controlled object is left without FTC, the liquid levels of T1, T2 and T3 rise slowly after the stuck fault occurs (reflecting the characteristics of the equipment). Fig. 10 shows the control variables with FTC (second curve) and without FTC (first curve). Because pump 1 is stuck, it has lost its regulating function, and its first and second curves coincide. Pump 2 responds to the fault by stopping its delivery flow for a period to release the accumulated liquid, and then provides a steady flow to maintain the level of T3. Fig. 8 shows that under FTC control the liquid level of T3 can be maintained at the fault-free level (red curve).
In a similar stuck fault, the opening of pump 1 is reduced to 30% (1.5 V on the 0-5 V frequency converter control signal), and the liquid level can no longer be maintained. The state evolution, liquid level and control variables of T3 are shown in figs. 12, 11 and 13, respectively. The first curve represents the case without FTC and the second curve the case with FTC. As can be seen from fig. 13, the released flow is less intense and lasts a shorter time than with the 60% stuck opening. Because the steady states differ, a deviation between the first and second curves also appears.
C. Leakage fault scenario
We also produced a leakage fault by partially opening LV2 of tank T3. As shown in fig. 14, if the fault-free controller is retained (first curve), the liquid level in T3 falls from 9 cm to 7 cm because of the leakage. The second curve of fig. 14 shows the trend of the T3 liquid level with FTC. It can be seen that, owing to the action of the FTC, the liquid level in T3 remains at the fault-free level. The state evolution and control variables are shown in figs. 15 and 16.

Claims (9)

1. A liquid level fault-tolerant control system based on reinforcement learning, characterized in that the fault-tolerant control system is used for a multi-tank system and comprises: an information acquisition unit (1) for acquiring the liquid level information of each water tank at different moments; a fault-free model (3) for predicting the liquid level information of all the water tanks at time k+1 from the liquid level information of all the water tanks at time k output by the information acquisition unit (1) and the control information of the frequency converter; an evaluation network (2) for estimating the overall values V(k) and V(k+1) of the frequency-converter control variables at time k and time k+1 from the liquid level information of all the water tanks at time k and time k+1 output by the information acquisition unit (1); a stage value evaluation unit (4) for evaluating the stage value R(k) from the liquid level information of all the water tanks at time k+1 output by the information acquisition unit (1) and the liquid level information of all the water tanks at time k+1 predicted by the fault-free model (3); a deviation estimation unit (5) for outputting the fitness function used for weight updating from the stage value output by the stage value evaluation unit (4) and the overall values V(k) and V(k+1) output by the evaluation network (2); a weight updating unit (6) for updating the weights of the evaluation network (2) according to the fitness function output by the deviation estimation unit (5), the evaluation network (2) then outputting, from all the updated weights received from the weight updating unit (6), the weights related to the control quantity u(k) of the frequency converter; and an action network (7) which, from the weights related to u(k) output by the evaluation network (2) and the liquid level information of all the water tanks at time k output by the information acquisition unit (1), iteratively updates itself to obtain the optimal control variable and uses it to control the frequency converter of the multi-tank system.
2. The liquid level fault-tolerant control method based on reinforcement learning of claim 1, wherein the liquid level information of all water tanks at time k output by the information acquisition unit (1) is represented as x (k), and the liquid level information of all water tanks at time k +1 is represented as x (k + 1).
3. The reinforcement learning-based liquid level fault-tolerant control method according to claim 1, characterized in that the fault-free model (3) is represented as follows:
Figure FDA0002675735040000011
Figure FDA0002675735040000012
Figure FDA0002675735040000013
Figure FDA0002675735040000014
where $x_1$, $x_2$, $x_3$ and $x_n$ are the liquid levels of water tanks T1, T2, T3 and Tn, respectively; $S_1$, $S_2$, $S_3$ and $S_n$ are the cross-sectional areas of water tanks T1, T2, T3 and Tn, respectively; $g$ is the gravitational acceleration; and the parameters are as follows:
Figure FDA0002675735040000015
Parameter(s)
Figure FDA0002675735040000016
Parameter(s)
Figure FDA0002675735040000017
Parameter(s)
Figure FDA0002675735040000018
Parameter(s)
Figure FDA0002675735040000019
where $R_{12}$ is the flow resistance between tank 1 and tank 2, $R_{32}$ is the flow resistance between tank 3 and tank 2, $R_{43}$ is the flow resistance between tank 4 and tank 3, $R_{n-1,n}$ is the flow resistance between tank n-1 and tank n, $R_n$ is the drainage resistance of tank Tn, and $\rho$ is the liquid density;
Figure FDA0002675735040000021
$Q_1$ and $Q_2$ are the flow rates of submersible pump 1 and submersible pump 2.
4. The reinforcement learning-based liquid level fault-tolerant control method according to claim 1, characterized in that the evaluation network (2) comprises an input layer, a hidden layer and an output layer which are all connected in sequence, wherein the input layer has n +2 neurons, the hidden layer has 2n neurons, and the output layer has 1 neuron.
5. The liquid level fault-tolerant control method based on reinforcement learning of claim 1, characterized in that the stage value evaluation unit (4) is composed of the following formula:
Figure FDA0002675735040000022
where $R(k)$ is the stage value; $x(k+1)$ is the liquid level information of all the water tanks at time k+1; and $x_r(k+1)$ is the liquid level information of all the water tanks at time k+1 as predicted by the fault-free model (3).
6. The fault-tolerant liquid level control method based on reinforcement learning of claim 1, wherein the deviation estimation unit (5) is formed by the following formula:
TE=V(k)-R(k)+γV(k+1)
wherein TE is the deviation; V(k) and V(k+1) are the total values corresponding to the frequency converter control quantities at time k and time k+1, respectively; R(k) is the stage value; and γ is the discount factor.
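A minimal worked example of the deviation, following the formula term by term as it is printed in this claim. The function name and the numeric values are hypothetical and chosen only to show the arithmetic.

```python
def deviation(V_k, V_k1, R_k, gamma):
    """TE = V(k) - R(k) + gamma * V(k+1), exactly as written in the claim."""
    return V_k - R_k + gamma * V_k1

# Illustrative numbers only: 1.0 - 0.5 + 0.9 * 0.8
TE = deviation(V_k=1.0, V_k1=0.8, R_k=0.5, gamma=0.9)
```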
7. The liquid level fault-tolerant control method based on reinforcement learning of claim 1, wherein the weight updating unit (6) comprises:
1) the weights W_{c1} between the input layer and the hidden layer and the weights W_{c2} between the hidden layer and the output layer in the evaluation network (2) are represented by the corresponding particle positions, and the initial particle values are selected randomly;
2) the fitness function for each particle is calculated according to the following formula:
Figure FDA0002675735040000023
wherein FF(z(k)) is the fitness function of the i-th particle at the p-th iteration; V(k) and V(k+1) are the total values corresponding to the frequency converter control quantities at time k and time k+1, respectively; R(k) is the stage value; γ is the discount factor; x(k) is the combination of the liquid level information x(k) of all water tanks at time k and the control information u(k) of the frequency converter;
3) according to the fitness function values and the following formulas, obtaining the current optimal position p_{best} of the particle swarm and the optimal position g_{best} experienced by the whole particle swarm, and updating p_{best} and g_{best}:
Figure FDA0002675735040000024
Figure FDA0002675735040000025
wherein i is the particle index, m is the number of particles, and p is the iteration number;
4) updating the particle velocity v_i and the particle position z_i according to the basic iteration formulas of particle swarm optimization:
Figure FDA0002675735040000026
Figure FDA0002675735040000027
wherein z represents the particle position, v represents the particle velocity, ω is the inertia weight, c_1 and c_2 are acceleration constants, rand1 and rand2 are two random numbers in [0,1] generated independently of each other, p_{best} is the current optimal position of the particle swarm, g_{best} is the optimal position experienced by the whole particle swarm, and (p) denotes the iteration number;
5) repeating steps 2) to 4) until convergence, and recording the optimal position g_{best1} of the current particle swarm;
6) redistributing the particles using random numbers in [0,1] to obtain new fitness function values;
7) repeating steps 2) to 4) until convergence, and recording the optimal position g_{best2} of the current particle swarm;
8) if the optimal position g_{best2} is better than the optimal position g_{best1}, replacing g_{best1} with g_{best2}; otherwise, keeping g_{best1} unchanged;
9) repeating steps 2) to 8) until no better optimal position can be found, and obtaining the final position g_{best1};
10) the particle position at g_{best1} is the solution for the weights W_{c1} and W_{c2} of the evaluation network.
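Steps 1) to 10) can be sketched as a standard particle swarm optimization wrapped in a restart loop. This is a sketch under stated assumptions, not the patented implementation: the patent's fitness function and update formulas are given only as figures, so the code uses the textbook velocity and position updates, treats a lower fitness as better, tracks a personal best per particle, draws the random factors per dimension, and replaces the convergence test with a fixed iteration count. The names `pso`, `pso_with_restarts` and `toy_fitness`, and all parameter values, are hypothetical.

```python
import numpy as np

def pso(fitness, dim, rng, m=20, iters=100, omega=0.7, c1=1.5, c2=1.5):
    """Steps 1)-5): textbook PSO that minimizes `fitness`; returns (g_best, value)."""
    z = rng.uniform(0.0, 1.0, size=(m, dim))      # step 1: random initial positions
    v = np.zeros((m, dim))
    p_best = z.copy()
    p_val = np.array([fitness(zi) for zi in z])   # step 2: fitness of each particle
    g = int(np.argmin(p_val))
    g_best, g_val = p_best[g].copy(), p_val[g]
    for _ in range(iters):                        # fixed count stands in for "until convergence"
        r1 = rng.random((m, dim))                 # rand1 in [0, 1)
        r2 = rng.random((m, dim))                 # rand2 in [0, 1)
        v = omega * v + c1 * r1 * (p_best - z) + c2 * r2 * (g_best - z)  # step 4
        z = z + v
        val = np.array([fitness(zi) for zi in z])
        better = val < p_val                      # step 3: update p_best and g_best
        p_best[better] = z[better]
        p_val[better] = val[better]
        g = int(np.argmin(p_val))
        if p_val[g] < g_val:
            g_best, g_val = p_best[g].copy(), p_val[g]
    return g_best, g_val

def pso_with_restarts(fitness, dim, rng, max_restarts=10, **kw):
    """Steps 6)-9): rerun from freshly redistributed particles, keep the better optimum."""
    g_best1, val1 = pso(fitness, dim, rng, **kw)
    for _ in range(max_restarts):
        g_best2, val2 = pso(fitness, dim, rng, **kw)  # steps 6)-7)
        if val2 < val1:
            g_best1, val1 = g_best2, val2             # step 8: replace g_best1
        else:
            break                                     # step 9: no better position found
    return g_best1, val1

def toy_fitness(z):
    """Stand-in fitness with a known minimum at z = 0.3 in every coordinate."""
    return float(np.sum((z - 0.3) ** 2))

best, best_val = pso_with_restarts(toy_fitness, dim=3, rng=np.random.default_rng(0))
```

In the setting of the claim, the particle vector would hold the flattened entries of W_{c1} and W_{c2}, and `toy_fitness` would be replaced by the fitness function given in the patent figure.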
8. The liquid level fault-tolerant control method based on reinforcement learning of claim 1, characterized in that the action network (7) comprises an input layer, a hidden layer and an output layer which are connected in sequence, wherein the input layer has n neurons, the hidden layer has n+3 neurons, the output layer has 2 neurons, the weight between the input layer and the hidden layer is W_{a1}, and the weight between the hidden layer and the output layer is W_{a2}.
9. The reinforcement learning-based liquid level fault-tolerant control method according to claim 1 or 8, characterized in that the weight changes of the action network (7) are
ΔW_{a2} = l·W_{c2}·[s_{out,c}(1-s_{out,c})]·W_{c1,u}·s_{out,a}
ΔW_{a1} = l·W_{c2}·[s_{out,c}(1-s_{out,c})]·W_{c1,u}·W_{a2}·[s_{out,a}(1-s_{out,a})]·x(k)
wherein l is the learning rate; W_{c2} is the weight between the hidden layer and the output layer in the evaluation network (2); s_{out,c} and s_{out,a} are the outputs of the nonlinear functions in the evaluation network (2) and the action network (7), respectively; W_{c1,u} is the weight of the hidden layer of the evaluation network (2) with respect to the frequency converter control quantity u(k); W_{a2} is the weight between the hidden layer and the output layer in the action network (7); x(k) is the liquid level information of all water tanks at time k; and W_{c1,u}, W_{c2}, s_{out,c}, s_{out,a} and W_{a2} are all obtained from the evaluation network and the action network;
the weights W_{a1} and W_{a2} of the action network are updated according to the following formulas:
W_{a1}' = W_{a1} + ΔW_{a1}
W_{a2}' = W_{a2} + ΔW_{a2}
wherein W_{a1}' and W_{a2}' are the updated weights between the input layer and the hidden layer and between the hidden layer and the output layer in the action network (7), respectively.
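The weight changes above read as a chain rule through the evaluation network into the action network. The following sketch is one way to give the products concrete shapes. It assumes sigmoid hidden layers (consistent with the s(1-s) factors), linear output layers and no bias terms; the transposes and outer products are an interpretation of the schematic formula rather than something the claim spells out, and all function names are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def action_forward(W_a1, W_a2, x):
    """Action network n -> n+3 -> 2: returns control u(k) and hidden output s_out_a."""
    s_out_a = sigmoid(W_a1 @ x)
    return W_a2 @ s_out_a, s_out_a

def critic_value(W_c1, W_c2, x, u):
    """Evaluation network (n+2) -> 2n -> 1: returns its output and hidden output s_out_c."""
    s_out_c = sigmoid(W_c1 @ np.concatenate([x, u]))
    return float(W_c2 @ s_out_c), s_out_c

def action_weight_changes(W_a1, W_a2, W_c1, W_c2, x, lr):
    """Return (dW_a1, dW_a2) following the two weight-change formulas."""
    n = x.size
    u, s_out_a = action_forward(W_a1, W_a2, x)
    _, s_out_c = critic_value(W_c1, W_c2, x, u)
    W_c1_u = W_c1[:, n:]                                    # (2n, 2): columns acting on u(k)
    dV_du = W_c1_u.T @ (W_c2 * s_out_c * (1.0 - s_out_c))   # (2,)
    dW_a2 = lr * np.outer(dV_du, s_out_a)                   # (2, n+3)
    back = (W_a2.T @ dV_du) * s_out_a * (1.0 - s_out_a)     # (n+3,)
    dW_a1 = lr * np.outer(back, x)                          # (n+3, n)
    return dW_a1, dW_a2

rng = np.random.default_rng(1)
n = 3  # example number of tanks
W_a1 = rng.normal(size=(n + 3, n))
W_a2 = rng.normal(size=(2, n + 3))
W_c1 = rng.normal(size=(2 * n, n + 2))
W_c2 = rng.normal(size=2 * n)
x = rng.random(n)

dW_a1, dW_a2 = action_weight_changes(W_a1, W_a2, W_c1, W_c2, x, lr=0.1)
W_a1_new = W_a1 + dW_a1  # W_a1' = W_a1 + dW_a1
W_a2_new = W_a2 + dW_a2  # W_a2' = W_a2 + dW_a2
```

With these shapes, each weight change equals the learning rate times the derivative of the evaluation network output with respect to the corresponding action network weight, so the update moves the weights in the direction that increases that output.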
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010947314.8A CN112180996A (en) 2020-09-10 2020-09-10 Liquid level fault-tolerant control method based on reinforcement learning

Publications (1)

Publication Number Publication Date
CN112180996A true CN112180996A (en) 2021-01-05

Family

ID=73921803

Country Status (1)

Country Link
CN (1) CN112180996A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020046359A1 (en) * 2000-03-16 2002-04-18 Boden Scott T. Method and apparatus for secure and fault tolerant data storage
CN1471627A (en) * 2000-10-26 2004-01-28 A fault tolerant liquid measurement system using multiple-model state estimators
CN1737423A (en) * 2005-08-10 2006-02-22 东北大学 Method and apparatus for realizing integration of fault-diagnosis and fault-tolerance for boiler sensor based on Internet
CN109635864A (en) * 2018-12-06 2019-04-16 佛山科学技术学院 A kind of fault tolerant control method and device based on data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张大鹏: "Fault Tolerant Control Using Reinforcement Learning and Particle Swarm Optimization", 《IEEE ACCESS》, 9 September 2020 (2020-09-09), pages 2 - 5 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210105