Temperature control method based on DDPG-fuzzy PID
Technical Field
The invention relates to the field of temperature control, in particular to a temperature control method based on DDPG-fuzzy PID.
Background
With the development of science and technology, temperature control is widely applied in many technical fields. Central heating makes rational use of resources, improves energy efficiency and reduces environmental pollution, and is an effective measure and an important route for accelerating urban modernization. Central heating has become the main form of winter heating in northern China and is increasingly widely used; it connects a large number of heat consumers through a heating network and supplies the required heat from a unified heat source. In a heating system, the temperature regulation system is a complex system with large time lag, time variation and nonlinearity, and traditional PID control cannot meet the control requirements.
To better solve the control problem of such a system, fuzzy PID combines fuzzy control with a classical PID regulator, so that it retains the flexibility and adaptability of fuzzy control while improving the control precision of the system. However, a fuzzy PID system has difficulty controlling temperature accurately when the control system is subject to severe and frequent disturbances and a large time lag. To address this problem, the invention designs a temperature control method based on DDPG-fuzzy PID, where DDPG denotes the deep deterministic policy gradient algorithm. The method adds a DDPG-based auxiliary controller to the fuzzy PID control system, uses the online weight-learning capability of the DDPG algorithm to apply auxiliary control to the system actuator, automatically compensates for uncertain disturbances of the temperature control system, and obtains a robust temperature regulation system through linkage control of the main controller and the auxiliary controller.
Disclosure of Invention
In order to solve the above problems, the invention provides a temperature control method based on DDPG-fuzzy PID. The control precision of the temperature system is improved by a fuzzy PID control algorithm; the membership functions and fuzzy rules of the fuzzy PID, which are difficult to determine by hand, are optimized by a genetic algorithm; and finally a control method is provided in which the fuzzy PID serves as the main controller and the DDPG serves as the auxiliary controller. To this end, the temperature control method based on DDPG-fuzzy PID comprises the following steps:
step 1, acquiring experimental data of a temperature control system: detecting the temperature of the system through a temperature sensor, and recording the output and command signals of an actuator;
step 2, designing a fuzzy PID main controller of the temperature system: the fuzzy PID controller is used as a main controller to control the action of an actuator, and the tracking of the set value of the system temperature is completed;
step 3, optimizing fuzzy PID parameters by a genetic algorithm: the membership function and the fuzzy rule of the fuzzy PID are optimized by adopting a genetic algorithm, so that the control performance of the temperature system is improved;
step 4, designing a DDPG auxiliary controller of the temperature system: the DDPG auxiliary controller realizes auxiliary control on the temperature system by detecting the temperature value of the temperature control system, the reward value for executing the action and the system temperature value after executing the action;
step 5, performing linkage control of the main controller and the auxiliary controller to obtain a robust temperature regulation system, and embedding the temperature regulation system in a host computer for practical application.
Further, the process of designing the fuzzy PID main controller of the temperature system in the step 2 is represented as follows:
the temperature control system forms, from the temperature set value and the measured feedback value, a two-dimensional input consisting of the deviation $e(t)$ between the system temperature set value and the measured feedback value and the rate of change of the deviation $ec(t)$, and a three-dimensional output consisting of the proportional coefficient $k_p$, the integral coefficient $k_i$ and the derivative coefficient $k_d$. The system modifies the parameters of the fuzzy PID controller in real time by monitoring the values of $e(t)$ and $ec(t)$, thereby optimizing the control performance of the fuzzy PID controller. In the temperature control system, the deviation signal $e(t)$ is computed from the current temperature value fed back by the sensor and the target set value; $e(t)$ and $ec(t)$ are fuzzified to obtain the corresponding fuzzy quantities, the fuzzy control quantity is then obtained by inference according to the fuzzy rules, and finally the fuzzy control quantity is defuzzified to obtain $k_p$, $k_i$ and $k_d$, with which the controlled object is accurately controlled.
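As an illustration of how fuzzy outputs can retune a PID controller in real time, the following minimal sketch computes $e(t)$ and $ec(t)$ and applies gains of the form base value plus correction. This is an assumption: the text does not state whether the fuzzy outputs are absolute gains or increments, and the class name `SelfTuningPID`, the base gains and the sample time are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SelfTuningPID:
    """Positional PID whose gains are base values plus external corrections.

    Assumption: the fuzzy stage supplies corrections (dkp, dki, dkd) that are
    added to fixed base gains; the source does not specify this form.
    """
    kp0: float
    ki0: float
    kd0: float
    dt: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, setpoint, measured, dkp=0.0, dki=0.0, dkd=0.0):
        e = setpoint - measured                # deviation e(t)
        ec = (e - self.prev_error) / self.dt   # deviation change rate ec(t)
        kp = self.kp0 + dkp
        ki = self.ki0 + dki
        kd = self.kd0 + dkd
        self.integral += e * self.dt
        self.prev_error = e
        u = kp * e + ki * self.integral + kd * ec
        return u, e, ec
```

In use, `e` and `ec` returned by one call would be fed to the fuzzy stage, whose outputs become `dkp`, `dki` and `dkd` for the next call.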
Further, the process of optimizing the fuzzy PID parameter by the genetic algorithm in step 3 can be expressed as follows:
the genetic algorithm is an iterative, adaptive optimization algorithm based on the principle of natural selection and genetic mechanisms; it is used here to improve the control performance of the temperature system and comprises the following steps:
step 3.1, fuzzy domain coding; the actual range of the temperature deviation $e(t)$ may reach -100 ℃ to 100 ℃, so its universe of discourse is [-100, 100]; since the temperature changes slowly, the basic range of the rate of change of the temperature deviation $ec(t)$ is approximately [-2, 2]; based on parameter-tuning analysis, the actual universes of discourse of the outputs $k_p$, $k_i$ and $k_d$ are set to [-1, 1], [-0.02, 0.02] and [-0.3, 0.3], respectively; the fuzzy universe of the inputs and outputs is set to {-6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6}; the system uses 7 linguistic values, negative big NB, negative medium NM, negative small NS, zero ZO, positive small PS, positive medium PM and positive big PB, which are encoded as 0, 1, 2, 3, 4, 5 and 6;
step 3.2, selecting an initial population; the parameter range of the initial population is set according to the coding above, namely [0, 6], and an initial population of size 100 is generated randomly within this range;
step 3.3, selecting a fitness function; the fitness function evaluates the fitness of each individual from its characteristics, and the fitness of an individual serves as the quality criterion for evaluating the PID parameters; the dynamic deviation, overshoot and settling time of the system are used as the objective, so that the fitness function is described as follows:
where $u(t)$ is the controller output, $t_u$ is the system response time, and $\omega_1$, $\omega_2$ and $\omega_3$ are weighting constants;
step 3.4, selecting a genetic operator; the roulette-wheel selection method determines the selection probability of an individual as the ratio of its fitness value to the total fitness value of the whole population, and the formula is as follows:
$P_i = f_i / \sum_j f_j$ (2)
where $P_i$ is the probability that individual $i$ is selected, $\sum_j f_j$ is the total fitness value summed over all individuals, and $f_i$ is the fitness value of individual $i$;
step 3.5, crossover and mutation operations; through the crossover operation, the genetic algorithm exchanges part of the genes of two paired individuals to form two new individuals; the crossover probability is set to 0.86 and two-point crossover is used; in addition, to improve the local search capability, the mutation probability is set to 0.04, and the mutation operation generates a new individual by changing some of the gene values of an individual.
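The performance indices named in step 3.3 (dynamic deviation, overshoot and settling time) can be computed from a sampled step response as sketched below. Equation (1) is not reproduced in this text, so the weighted cost here is only a hypothetical stand-in: it uses the integral of absolute error for the dynamic deviation, a 2 % settling band, unit weights, and it does not include the controller output $u(t)$ that the original formula refers to.

```python
import numpy as np


def step_metrics(t, y, setpoint, band=0.02):
    """Return (IAE, overshoot fraction, settling time) of a sampled response."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    err = np.abs(setpoint - y)
    # Integral of absolute error via the trapezoidal rule.
    iae = float(np.sum(0.5 * (err[1:] + err[:-1]) * np.diff(t)))
    overshoot = max(0.0, float(y.max() - setpoint)) / abs(setpoint)
    # Settling time: first sample after the last excursion outside the band.
    outside = np.nonzero(err > band * abs(setpoint))[0]
    if outside.size == 0:
        settling = float(t[0])
    elif outside[-1] == len(t) - 1:
        settling = float("inf")   # never settles within the record
    else:
        settling = float(t[outside[-1] + 1])
    return iae, overshoot, settling


def fitness(t, y, setpoint, w=(1.0, 1.0, 1.0)):
    """Hypothetical fitness: larger is better, built from a weighted cost."""
    iae, overshoot, settling = step_metrics(t, y, setpoint)
    cost = w[0] * iae + w[1] * overshoot + w[2] * settling
    return 1.0 / (1.0 + cost)
```

Mapping the cost to `1 / (1 + cost)` keeps the fitness positive, which roulette-wheel selection requires.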
Further, in step 4, the process of designing the temperature system DDPG auxiliary controller can be expressed as follows:
the DDPG algorithm obtains sample data $(s_t, a_t, r_{t+1}, s_{t+1})$ from the temperature control system, where $s_t$ is the temperature value of the temperature control system at time $t$, $a_t$ is the action performed by the system in state $s_t$, $r_{t+1}$ is the reward value for executing action $a_t$ in state $s_t$, and $s_{t+1}$ is the system temperature value after the temperature control system has executed action $a_t$. The sample data are stored in an experience pool, mini-batches are sampled randomly from the experience pool for learning and updating, and finally the DDPG auxiliary controller executes the following action:
$a_t = \mu(s_t \mid \theta^{\mu}) + N_t$ (3)
where $N_t$ is random noise, the function $\mu(\cdot)$ is the optimal behavior policy, and $\theta^{\mu}$ denotes the policy network parameters. The DDPG auxiliary controller is mainly used to compensate for the temperature tracking error of the temperature control system and to improve the control performance of the fuzzy PID main controller.
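The experience pool and the noisy action of equation (3) can be sketched as follows. The buffer capacity, the Gaussian form of the noise $N_t$, the noise scale and the action bounds are all assumptions (the text only says "random noise"), and `actor` is a hypothetical callable standing in for the policy network $\mu(s \mid \theta^{\mu})$.

```python
import random
from collections import deque

import numpy as np


class ReplayBuffer:
    """Experience pool holding transitions (s_t, a_t, r_{t+1}, s_{t+1})."""

    def __init__(self, capacity=10000):   # capacity is an assumed value
        self.buffer = deque(maxlen=capacity)

    def push(self, s, a, r, s_next):
        self.buffer.append((s, a, r, s_next))

    def sample(self, batch_size):
        # Uniform random mini-batch, returned as four parallel arrays.
        batch = random.sample(list(self.buffer), batch_size)
        s, a, r, s_next = map(np.array, zip(*batch))
        return s, a, r, s_next

    def __len__(self):
        return len(self.buffer)


def noisy_action(actor, state, sigma=0.1, low=-1.0, high=1.0, rng=None):
    """a_t = mu(s_t) + N_t with Gaussian N_t (assumed), clipped to bounds."""
    if rng is None:
        rng = np.random.default_rng()
    a = actor(state) + rng.normal(0.0, sigma)
    return float(np.clip(a, low, high))
```

Sampling uniformly from the pool breaks the temporal correlation between consecutive temperature readings, which is the usual reason for using experience replay.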
The temperature control method based on DDPG-fuzzy PID of the invention has the following beneficial technical effects:
1. on the basis of a fuzzy PID controller, a temperature regulation system with the fuzzy PID as the main controller is established, and the optimal membership functions and fuzzy rules of the fuzzy PID are selected by a genetic algorithm, which improves the precision of the system;
2. to increase the adaptivity and disturbance rejection of the temperature regulation system, the invention designs a new DDPG-based auxiliary controller, which uses the online weight-learning capability of the DDPG algorithm to apply auxiliary control to the system actuator and can automatically compensate for the uncertainty and disturbances of the temperature control system;
3. a robust temperature regulation system is obtained through linkage control of the main controller and the auxiliary controller; its regulation robustness is superior to that of a traditional temperature control system, and the settling time, the delay time and similar indices are markedly shortened.
Drawings
FIG. 1 is a control block diagram of the present invention;
FIG. 2 is a membership function optimized by the genetic algorithm of the present invention;
FIG. 3 shows the dynamic response curves of the three control algorithms at a target temperature of 30 ℃;
FIG. 4 shows the dynamic response curves of the three control algorithms at a target temperature of 30 ℃ under white-noise disturbance.
Detailed Description
The invention is described in further detail below with reference to the following detailed description and accompanying drawings:
the invention provides a temperature control method based on DDPG-fuzzy PID, aiming at obtaining a robust temperature regulation and control system and reducing the regulation time and delay time of the system. Fig. 1 is a control structure diagram of the present invention. The steps of the present invention will be described in detail below with reference to the control structure diagram.
Step 1, acquiring experimental data of a temperature control system: detecting the temperature of the system through a temperature sensor, and recording the output and command signals of an actuator;
step 2, designing a fuzzy PID main controller of the temperature system: the fuzzy PID controller is used as a main controller to control the action of an actuator, and the tracking of the set value of the system temperature is completed;
the process of designing the fuzzy PID main controller of the temperature system in the step 2 can be expressed as follows:
the temperature control system forms, from the temperature set value and the measured feedback value, a two-dimensional input (the deviation $e(t)$ between the system temperature set value and the measured feedback value, and the rate of change of the deviation $ec(t)$) and a three-dimensional output (the proportional coefficient $k_p$, the integral coefficient $k_i$ and the derivative coefficient $k_d$). The system modifies the parameters of the fuzzy PID controller in real time by monitoring the values of $e(t)$ and $ec(t)$, thereby optimizing the control performance of the fuzzy PID controller. In the temperature control system, the deviation signal $e(t)$ is computed from the current temperature value fed back by the sensor and the target set value; $e(t)$ and $ec(t)$ are fuzzified to obtain the corresponding fuzzy quantities, the fuzzy control quantity is then obtained by inference according to the fuzzy rules, and finally the fuzzy control quantity is defuzzified to obtain $k_p$, $k_i$ and $k_d$, with which the controlled object is accurately controlled.
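The fuzzification, rule inference and defuzzification chain described above can be sketched as follows. Several choices here are assumptions rather than the method of the invention: triangular membership functions centred at -6, -4, ..., 6 (the optimized functions of FIG. 2 are not reproduced in this text), min as the AND operator, weighted-average defuzzification, and a caller-supplied 7x7 `rule_table` of linguistic codes 0 to 6 standing in for Tables 1 to 3. The input ranges [-100, 100] and [-2, 2] are those given in step 3.1.

```python
import numpy as np

# Centres of NB, NM, NS, ZO, PS, PM, PB on the fuzzy universe [-6, 6].
CENTERS = np.arange(-6, 7, 2, dtype=float)


def scale(x, actual_max, fuzzy_max=6.0):
    """Map a physical value onto [-6, 6], saturating at the ends."""
    return float(np.clip(x * fuzzy_max / actual_max, -fuzzy_max, fuzzy_max))


def memberships(x):
    """Membership degrees of x in the 7 triangular sets (half-width 2)."""
    return np.clip(1.0 - np.abs(x - CENTERS) / 2.0, 0.0, 1.0)


def infer(e, ec, rule_table, out_max):
    """Return one crisp output (e.g. for k_p) scaled to [-out_max, out_max]."""
    mu_e = memberships(scale(e, 100.0))    # fuzzify e(t)
    mu_ec = memberships(scale(ec, 2.0))    # fuzzify ec(t)
    strength = np.minimum.outer(mu_e, mu_ec)   # 7x7 rule firing strengths
    consequents = CENTERS[np.asarray(rule_table)]  # centre of each rule output
    z = np.sum(strength * consequents) / np.sum(strength)
    return float(z * out_max / 6.0)
```

For example, with a hypothetical table whose row index equals the output code, `infer(50.0, 0.0, table, 1.0)` splits the input evenly between PS and PM and returns the midpoint of their outputs.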
Step 3, optimizing fuzzy PID parameters by a genetic algorithm: the membership function and the fuzzy rule of the fuzzy PID are optimized by adopting a genetic algorithm, so that the control performance of the temperature system is improved;
the process of genetic algorithm optimization of fuzzy PID parameters in step 3 can be represented as follows:
the genetic algorithm is an iterative, adaptive optimization algorithm based on the principle of natural selection and genetic mechanisms; it is used here to improve the control performance of the temperature system and comprises the following steps:
step 3.1, fuzzy domain coding; the actual range of the temperature deviation $e(t)$ may reach -100 ℃ to 100 ℃, so its universe of discourse is [-100, 100]; since the temperature changes slowly, the basic range of the rate of change of the temperature deviation $ec(t)$ is approximately [-2, 2]; based on parameter-tuning analysis, the actual universes of discourse of the outputs $k_p$, $k_i$ and $k_d$ are set to [-1, 1], [-0.02, 0.02] and [-0.3, 0.3], respectively; the fuzzy universe of the inputs and outputs is set to {-6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6}; the system uses 7 linguistic values, negative big NB, negative medium NM, negative small NS, zero ZO, positive small PS, positive medium PM and positive big PB, which are encoded as 0, 1, 2, 3, 4, 5 and 6;
step 3.2, selecting an initial population; the parameter range of the initial population is set according to the coding above, namely [0, 6], and an initial population of size 100 is generated randomly within this range;
step 3.3, selecting a fitness function; the fitness function evaluates the fitness of each individual from its characteristics, and the fitness of an individual serves as the quality criterion for evaluating the PID parameters; the dynamic deviation, overshoot and settling time of the system are used as the objective, so that the fitness function is described as follows:
where $u(t)$ is the controller output, $t_u$ is the system response time, and $\omega_1$, $\omega_2$ and $\omega_3$ are weighting constants;
step 3.4, selecting a genetic operator; the roulette-wheel selection method determines the selection probability of an individual as the ratio of its fitness value to the total fitness value of the whole population, and the formula is as follows:
$P_i = f_i / \sum_j f_j$ (2)
where $P_i$ is the probability that individual $i$ is selected, $\sum_j f_j$ is the total fitness value summed over all individuals, and $f_i$ is the fitness value of individual $i$;
step 3.5, crossover and mutation operations; through the crossover operation, the genetic algorithm exchanges part of the genes of two paired individuals to form two new individuals; the crossover probability is set to 0.86 and two-point crossover is used; in addition, to improve the local search capability, the mutation probability is set to 0.04, and the mutation operation generates a new individual by changing some of the gene values of an individual.
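Steps 3.2, 3.4 and 3.5 can be sketched together as a small genetic algorithm loop. The population size (100), gene range (codes 0 to 6), crossover probability (0.86), mutation probability (0.04) and 100 generations come from the text. The chromosome length of 49 genes (one 7x7 rule table), the per-gene application of the mutation probability, and the caller-supplied `fitness_fn` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
POP_SIZE, N_GENES = 100, 49           # N_GENES: one 7x7 rule table (assumed)
P_CROSS, P_MUT, N_GEN = 0.86, 0.04, 100


def init_population():
    """Random initial population with integer genes in [0, 6]."""
    return rng.integers(0, 7, size=(POP_SIZE, N_GENES))


def selection_probabilities(fit):
    """Roulette-wheel probabilities P_i = f_i / sum_j f_j."""
    fit = np.asarray(fit, dtype=float)
    return fit / fit.sum()


def roulette_select(pop, fit):
    idx = rng.choice(len(pop), size=len(pop), p=selection_probabilities(fit))
    return pop[idx]


def two_point_crossover(a, b):
    """With probability P_CROSS, swap the gene segment between two cut points."""
    if rng.random() < P_CROSS:
        i, j = sorted(rng.choice(np.arange(1, N_GENES), size=2, replace=False))
        a, b = a.copy(), b.copy()
        a[i:j], b[i:j] = b[i:j].copy(), a[i:j].copy()
    return a, b


def mutate(ind):
    """Resample each gene with probability P_MUT (per-gene rate is assumed)."""
    ind = ind.copy()
    mask = rng.random(ind.shape) < P_MUT
    ind[mask] = rng.integers(0, 7, size=int(mask.sum()))
    return ind


def evolve(fitness_fn):
    """Run N_GEN generations; fitness_fn must return positive values."""
    pop = init_population()
    for _ in range(N_GEN):
        fit = np.array([fitness_fn(ind) for ind in pop], dtype=float)
        parents = roulette_select(pop, fit)
        children = []
        for k in range(0, POP_SIZE, 2):
            c1, c2 = two_point_crossover(parents[k], parents[k + 1])
            children += [mutate(c1), mutate(c2)]
        pop = np.array(children)
    fit = np.array([fitness_fn(ind) for ind in pop], dtype=float)
    return pop[fit.argmax()], float(fit.max())
```

In a real run, `fitness_fn` would decode the chromosome into a rule table, simulate the closed loop and score the response; that step is outside this sketch.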
The number of iterations of the genetic algorithm is set to 100. After 100 iterations, the fuzzy control rules for $k_p$, $k_i$ and $k_d$ and their membership functions are obtained, as shown in Tables 1 to 3 and FIG. 2, respectively.
TABLE 1 Rule base for $k_p$ optimized by the genetic algorithm
TABLE 2 Rule base for $k_i$ optimized by the genetic algorithm
TABLE 3 Rule base for $k_d$ optimized by the genetic algorithm
Step 4, designing a DDPG auxiliary controller of the temperature system: the DDPG auxiliary controller realizes auxiliary control on the temperature system by detecting the temperature value of the temperature control system, the reward value for executing the action and the system temperature value after executing the action;
in step 4, the process of designing the temperature system DDPG auxiliary controller can be represented as follows:
the DDPG algorithm obtains sample data $(s_t, a_t, r_{t+1}, s_{t+1})$ from the temperature control system, where $s_t$ is the temperature value of the temperature control system at time $t$, $a_t$ is the action performed by the system in state $s_t$, $r_{t+1}$ is the reward value for executing action $a_t$ in state $s_t$, and $s_{t+1}$ is the system temperature value after the temperature control system has executed action $a_t$. The sample data are stored in an experience pool, mini-batches are sampled randomly from the experience pool for learning and updating, and finally the DDPG auxiliary controller executes the following action:
$a_t = \mu(s_t \mid \theta^{\mu}) + N_t$ (3)
where $N_t$ is random noise, the function $\mu(\cdot)$ is the optimal behavior policy, and $\theta^{\mu}$ denotes the policy network parameters. The DDPG auxiliary controller is mainly used to compensate for the temperature tracking error of the temperature control system and to improve the control performance of the fuzzy PID main controller. The parameters of the DDPG network are shown in Table 4.
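The text does not spell out how the mini-batches are used for "learning and updating". For orientation, the update rules of the standard DDPG algorithm are given below; they are stated from the general formulation of DDPG, not from this document. The critic $Q(s, a \mid \theta^{Q})$, the target networks $Q'$ and $\mu'$, the discount factor $\gamma$, the soft-update rate $\tau$ and the mini-batch size $N$ are symbols introduced here for that purpose.

```latex
% Critic target and loss over a mini-batch of N transitions
y_i = r_{i+1} + \gamma \, Q'\!\left(s_{i+1},\, \mu'(s_{i+1} \mid \theta^{\mu'}) \,\middle|\, \theta^{Q'}\right)
\qquad
L(\theta^{Q}) = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - Q(s_i, a_i \mid \theta^{Q}) \right)^2

% Deterministic policy gradient for the actor
\nabla_{\theta^{\mu}} J \approx \frac{1}{N} \sum_{i=1}^{N}
  \nabla_{a} Q(s, a \mid \theta^{Q}) \Big|_{s = s_i,\, a = \mu(s_i)}
  \, \nabla_{\theta^{\mu}} \mu(s \mid \theta^{\mu}) \Big|_{s = s_i}

% Soft update of the target networks
\theta^{Q'} \leftarrow \tau \theta^{Q} + (1 - \tau)\, \theta^{Q'}
\qquad
\theta^{\mu'} \leftarrow \tau \theta^{\mu} + (1 - \tau)\, \theta^{\mu'}
```

The reward index $r_{i+1}$ follows the sample convention $(s_t, a_t, r_{t+1}, s_{t+1})$ used above.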
TABLE 4 parameter settings for DDPG auxiliary controller
Step 5, performing linkage control of the main controller and the auxiliary controller to obtain a robust temperature regulation system, and embedding the temperature regulation system in a host computer for practical application.
The response curves of a traditional PID control system, a fuzzy PID control system and the DDPG-fuzzy PID control system based on the method of the invention are compared under the same conditions, and the results are shown in FIG. 3 and FIG. 4.
The above description is only a preferred embodiment of the present invention and is not intended to limit the invention in any way; any modifications or equivalent variations made according to the technical spirit of the present invention fall within the scope of protection claimed.
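The linkage of the two controllers can be pictured as summing their commands at the actuator, as in the closed-loop sketch below. Everything numeric here is hypothetical: the first-order plant with transport delay, its gain and time constant, the actuator limits, and the fixed-gain PI law that stands in for the tuned fuzzy PID. `stand_in_aux` is a placeholder for a trained DDPG actor, not a learned policy, and the additive combination itself is an assumption about how the linkage is realized.

```python
import math
from collections import deque


def simulate(aux_policy, setpoint=30.0, ambient=20.0, steps=600, dt=1.0,
             gain=1.0, tau=60.0, delay_steps=5, kp=2.0, ki=0.05):
    """Closed loop with u = u_main + u_aux on an assumed delayed thermal plant."""
    temp, integral = ambient, 0.0
    pipeline = deque([0.0] * delay_steps)   # models the transport delay
    history = []
    for _ in range(steps):
        e = setpoint - temp
        integral += e * dt
        u_main = kp * e + ki * integral     # main controller (PI stand-in)
        u_aux = aux_policy(temp)            # auxiliary compensation from s_t
        u = min(max(u_main + u_aux, 0.0), 100.0)   # actuator limits (assumed)
        pipeline.append(u)
        u_delayed = pipeline.popleft()
        # First-order thermal plant, explicit Euler step.
        temp += dt / tau * (-(temp - ambient) + gain * u_delayed)
        history.append(temp)
    return history


def stand_in_aux(temp, setpoint=30.0):
    """Placeholder for a trained actor: bounded correction in [-0.5, 0.5]."""
    return 0.5 * math.tanh(setpoint - temp)
```

Calling `simulate(stand_in_aux)` and `simulate(lambda s: 0.0)` gives two temperature traces that can be compared in the same way as the curves of FIG. 3.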