CN110320796A - Electrical control method, device and equipment based on PID controller
- Publication number
- CN110320796A (application number CN201910722233.5A)
- Authority
- CN
- China
- Prior art keywords
- function
- pid controller
- value
- electrical control
- parameter setting
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B11/00—Automatic controllers
- G05B11/01—Automatic controllers electric
- G05B11/36—Automatic controllers electric with provision for obtaining particular characteristics, e.g. proportional, integral, differential
- G05B11/42—Automatic controllers electric with provision for obtaining particular characteristics, e.g. proportional, integral, differential for obtaining a characteristic which is both proportional and time-dependent, e.g. P. I., P. I. D.
Abstract
The invention discloses an electrical control method, device, equipment and computer readable storage medium based on a PID controller, comprising the following steps: constructing a target function of a PID controller parameter setting problem, wherein the undetermined parameters of the target function comprise N single-dimensional variables; discretizing the N single-dimensional variables, and learning them with N agents according to a reinforcement learning algorithm to determine their target values; determining the optimal value of the target function according to the target values of the N single-dimensional variables, completing parameter setting of the PID controller; and controlling a control object in the electrical control system by using the PID controller after parameter setting is finished. The method, device, equipment and computer readable storage medium provided by the invention improve the parameter optimization efficiency and convergence speed of the PID controller as well as its control performance.
Description
Technical Field
The invention relates to the technical field of process control, in particular to an electrical control method, device and equipment based on a PID controller and a computer readable storage medium.
Background
Process control technology in the electrical field has advanced considerably in recent decades. Researchers around the world have developed various control methods, including adaptive control, artificial neural network control and fuzzy control. Among them, the most basic and most widely used is the single-loop PID controller. Composed of proportional (P), integral (I) and derivative (D) units, the PID controller is simple in structure and maintains good robustness even when the operating conditions vary over a wide range. Therefore, how to optimize and tune the proportional, integral and derivative parameters of the PID controller is one of the key topics of control research.
In the prior art, parameter optimization methods fall into two categories: traditional tuning and intelligent tuning. Traditional tuning methods include the Ziegler-Nichols algorithm and optimal PID parameter tuning based on the integral of squared time-weighted error (ISTE) criterion. Their adjustment process is complex, oscillation and large overshoot are difficult to avoid, and optimal PID parameters are hard to obtain. Researchers have therefore devoted themselves to developing intelligent PID parameter tuning methods based on various heuristic algorithms. Artificial intelligence techniques such as the genetic algorithm (GA), particle swarm optimization (PSO), fuzzy inference algorithms and artificial neural networks have been applied to the tuning of PID parameters. These techniques can effectively overcome the above disadvantages of traditional tuning and enhance the control performance of the PID controller. However, they have drawbacks of their own. For example, the GA requires a cumbersome encoding process, and both GA and PSO depend on the concept of a population, which leads to long convergence times and slow convergence; for fuzzy inference, it is difficult to find a systematic method for selecting the algorithm parameters; and a neural network comprises multiple layers of neurons, with no explicit method for determining the number of hidden-layer neurons or their initial weights.
From the above, it can be seen that improving the optimization efficiency of PID controller parameters is a problem that remains to be solved.
Disclosure of Invention
The invention aims to provide an electrical control method, device, equipment and computer readable storage medium based on a PID controller, so as to solve the problems that the PID controller parameter tuning methods of the prior art involve complex processes, long convergence times and slow convergence rates.
In order to solve the above technical problem, the present invention provides an electrical control method based on a PID controller, comprising: constructing a target function of a PID controller parameter setting problem, wherein undetermined parameters of the target function comprise N single-dimensional variables; discretizing the N single-dimensional variables, and learning the N single-dimensional variables by adopting N agents according to a reinforcement learning algorithm to determine target values of the N single-dimensional variables; according to the target values of the N single-dimensional variables, determining the optimal value of the target function, and completing parameter setting of the PID controller; and controlling a control object in the electrical control system by using the PID controller after parameter setting is finished.
Preferably, constructing the target function of the PID controller parameter setting problem, wherein the undetermined parameters of the target function comprise N single-dimensional variables, includes:
constructing a target function of the PID controller parameter setting problem:

J = ∫₀^∞ ( ω₁|e(t)| + ω₂u²(t) ) dt + ω₃·t_u, with the overshoot penalty term ω₄|ey(t)| added to the integrand when ey(t) < 0;

wherein e(t) is the tracking error of the PID controller; u(t) is the output of the PID controller; t_u is the rise time for the output signal y(t) of the electrical control system to rise from 10% to 90% of its steady-state value; ey(t) = y(t) − y(t−1) is the overshoot penalty: when ey(t) ≥ 0, ω₄ = 0; when ey(t) < 0, ω₄ ≠ 0 and ω₄ >> ω₁. The undetermined parameters of the target function comprise a first weight ω₁, a second weight ω₂, a third weight ω₃ and a fourth weight ω₄.
Preferably, the step of learning each single-dimensional variable by each agent comprises:
S1: determining the current solution of the objective function after the i-th agent (i = 1, 2, …, N) selects a current action from the set of actions that can be taken for the i-th single-dimensional variable;
S2: determining the reward function value corresponding to the current behavior according to the calculation rule of a preset reward function and the current solution of the objective function;
S3: updating the value function corresponding to the current behavior according to the reward function value, so that the i-th agent selects the next behavior according to the updated value function;
S4: adding different disturbances to all dimensions of the current solution;
S5: repeating steps S1 to S4 until the number of cycles reaches a preset number, completing the learning of the i-th single-dimensional variable.
Preferably, the determining, according to the calculation rule of the preset reward function and the current solution of the objective function, the reward function value corresponding to the current behavior includes:
determining, according to a preset formula, the value R_k of the reward function for the k-th step of the current behavior of the i-th agent; wherein J_k is the current solution of the objective function, and J_best is the initial optimal solution of the objective function.
Preferably, the updating the value function corresponding to the current behavior according to the reward function value includes:
according to V_{k+1}(i,j) = (1 − α)V_k(i,j) + α[R_k + (1 − λ₂)L_max(i,j) + λ₂L_min(i,j)], updating the value function corresponding to the current behavior;
wherein V_k(i,j) is the value function corresponding to the current behavior; L_l(i,j) is a path value, with l = 1 indicating the path to the left and l = 2 the path to the right; λ₁ is the weight of the value function V_k(i,j); α is the learning rate; L_max(i,j) and L_min(i,j) are the maximum and minimum of the two path values, respectively; λ₂ and (1 − λ₂) are the weights of the minimum and maximum path values, with (1 − λ₂) > λ₂.
The invention also provides an electrical control device based on the PID controller, which comprises:
the construction module, used for constructing a target function of the PID controller parameter setting problem, wherein the undetermined parameters of the target function comprise N single-dimensional variables;
the reinforcement learning module is used for discretizing the N single-dimensional variables, learning the N single-dimensional variables by adopting N agents according to a reinforcement learning algorithm, and determining target values of the N single-dimensional variables;
the setting module is used for determining the optimal value of the objective function according to the target values of the N single-dimensional variables to complete parameter setting of the PID controller;
and the electrical control module is used for controlling a control object in the electrical control system by utilizing the PID controller after parameter setting is finished.
Preferably, the construction module is specifically configured to:
constructing a target function of the PID controller parameter setting problem:

J = ∫₀^∞ ( ω₁|e(t)| + ω₂u²(t) ) dt + ω₃·t_u, with the overshoot penalty term ω₄|ey(t)| added to the integrand when ey(t) < 0;

wherein e(t) is the tracking error of the PID controller; u(t) is the output of the PID controller; t_u is the rise time for the output signal y(t) of the electrical control system to rise from 10% to 90% of its steady-state value; ey(t) = y(t) − y(t−1) is the overshoot penalty: when ey(t) ≥ 0, ω₄ = 0; when ey(t) < 0, ω₄ ≠ 0 and ω₄ >> ω₁. The undetermined parameters of the target function comprise a first weight ω₁, a second weight ω₂, a third weight ω₃ and a fourth weight ω₄.
Preferably, the reinforcement learning module includes:
a selecting unit, configured to determine the current solution of the objective function after the i-th agent (i = 1, 2, …, N) selects a current behavior from the set of behaviors that can be taken for the i-th single-dimensional variable;
the determining unit is used for determining a reward function value corresponding to the current behavior according to a calculation rule of a preset reward function and the current solution of the target function;
the updating unit is used for updating the value function corresponding to the current behavior according to the reward function value so that the ith agent can select the next behavior according to the updated value function;
the disturbance unit is used for adding different disturbances to all dimensions of the current solution;
a loop unit, configured to repeat the following until the number of cycles reaches a preset number, thereby completing the learning of the i-th single-dimensional variable: determining the current solution of the objective function after the i-th agent (i = 1, 2, …, N) selects a current behavior from the set of behaviors that can be taken for the i-th single-dimensional variable; determining the reward function value corresponding to the current behavior according to the calculation rule of the preset reward function and the current solution of the objective function; updating the value function corresponding to the current behavior according to the reward function value so that the i-th agent selects the next behavior according to the updated value function; and adding different disturbances to all dimensions of the current solution.
The invention also provides an electrical control device based on the PID controller, which comprises:
a memory for storing a computer program; and a processor for implementing the steps of the electrical control method based on the PID controller when executing the computer program.
The present invention also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of a PID controller-based electrical control method as described above.
According to the electrical control method based on the PID controller, after the parameters of the PID controller are set with a reinforcement learning algorithm, the tuned PID controller is used to control the control object in the electrical control system. The electrical control system consists of a PID controller and a controlled electrical system. When setting the parameters of the PID controller with the reinforcement learning algorithm, the N single-dimensional variables in the target function of the parameter setting problem are first discretized. Then, according to the reinforcement learning algorithm, N agents are adopted to learn the discretized N single-dimensional variables respectively and determine their target values, from which the optimal value of the target function is determined and the parameter setting of the PID controller is completed. The method provided by the invention sets the PID controller parameters based on the reinforcement learning algorithm; it does not depend on a population, adopts the idea of repeated trial and error, and completes parameter setting through the interaction of agents with an unknown environment. When the unknown environment changes, i.e., the controlled system is dynamically time-varying, the PID controller parameters can be optimized online and the system tracked and controlled. The invention improves the optimization efficiency and convergence speed of the PID controller parameters as well as the control performance of the PID controller, and is convenient to implement and practical; moreover, the reinforcement learning algorithm has a certain randomness and can escape local optima.
Drawings
In order to more clearly illustrate the embodiments or technical solutions of the present invention, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained based on these drawings without creative efforts.
FIG. 1 is a flow chart of a first embodiment of a PID controller based electrical control method provided by the present invention;
FIG. 2 is a schematic diagram of an electrical control system;
FIG. 3 is a step response comparison graph of the system corresponding to the GA algorithm, the PSO algorithm, and the RL algorithm;
FIG. 4 is a comparison graph of the average objective function convergence results of the GA algorithm, the PSO algorithm and the RL algorithm which are respectively optimized for 10 times;
FIG. 5 is a flow chart of a method for each agent to learn each single-dimensional control variable;
FIG. 6 is a block diagram of an electrical control device based on a PID controller according to an embodiment of the present invention.
Detailed Description
The core of the invention is to provide an electrical control method, a device, equipment and a computer readable storage medium based on a PID controller, which improve the parameter optimization efficiency, the convergence rate and the control performance of the PID controller.
In order that those skilled in the art will better understand the disclosure, the invention will be described in further detail with reference to the accompanying drawings and specific embodiments. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a flowchart illustrating a PID controller-based electrical control method according to a first embodiment of the present invention; the specific operation steps are as follows:
step S101: constructing a target function of a PID controller parameter setting problem, wherein undetermined parameters of the target function comprise N single-dimensional variables;
An electrical control system composed of a PID controller and a control object is shown in fig. 2, wherein C(s) is the transfer function of the PID controller and G(s) is the transfer function of the control object. The input and output of the whole electrical control system are r(t) and y(t), respectively. The input signal of the electrical control system serves as the reference for the output signal of the control object; the difference between the input signal and the output signal is the tracking error e(t) of the PID controller, and the output u(t) of the PID controller is the input of the control object. In the control process, given the input signal, i.e., the reference signal, and the control object, the PID controller makes the output of the control object approach the input signal by processing the tracking error. The specific processing is represented by the following Laplace transfer function:

C(s) = K_p + K_i/s + K_d·s

wherein K_p, K_i and K_d are the proportional, integral and derivative parameters to be determined, respectively.
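To make the control law concrete, the following is a minimal discrete-time sketch of one PID step computation; the class name, gains and sample period are illustrative placeholders, not values from the patent.

```python
class PID:
    """Minimal discrete-time sketch of u(t) = Kp*e(t) + Ki*integral(e) + Kd*de/dt."""

    def __init__(self, kp: float, ki: float, kd: float, dt: float):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error: float) -> float:
        # Accumulate the integral term and approximate the derivative
        # with a backward difference over one sample period.
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```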
The environmental state is quantitatively represented by the objective function value. The target function of the PID controller parameter setting problem is:

J = ∫₀^∞ ( ω₁|e(t)| + ω₂u²(t) ) dt + ω₃·t_u

wherein e(t) is the tracking error of the PID controller; u(t) is the output of the PID controller; t_u is the rise time for the output signal y(t) of the electrical control system to rise from 10% to 90% of its steady-state value. In order to avoid large overshoot, an overshoot penalty term is set in the objective function: ey(t) = y(t) − y(t−1) is the overshoot penalty, and the term ω₄|ey(t)| is added to the integrand when ey(t) < 0; when ey(t) ≥ 0, ω₄ = 0; when ey(t) < 0, ω₄ ≠ 0 and ω₄ >> ω₁. The undetermined parameters of the target function comprise a first weight ω₁, a second weight ω₂, a third weight ω₃ and a fourth weight ω₄.
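A discrete approximation of this objective can be sketched as follows, assuming sampled signals e, u, y, a sample period dt and a separately measured 10%-90% rise time t_u; the default weights are the ones used later in this embodiment.

```python
import numpy as np

def objective(e, u, y, dt, t_u, w1=0.999, w2=0.001, w3=2.0, w4=100.0):
    """Discrete sketch of J: sum of w1*|e| + w2*u^2 (plus the overshoot
    penalty w4*|ey| whenever ey = y(t) - y(t-1) < 0), times the sample
    period, plus w3 times the 10%-90% rise time t_u."""
    e, u, y = map(np.asarray, (e, u, y))
    ey = np.diff(y, prepend=y[0])                      # ey(t) = y(t) - y(t-1)
    penalty = np.where(ey < 0, w4 * np.abs(ey), 0.0)   # active only on overshoot
    return float(np.sum(w1 * np.abs(e) + w2 * u**2 + penalty) * dt + w3 * t_u)
```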
Step S102: discretizing the N single-dimensional variables, and learning the N single-dimensional variables by adopting N agents according to a reinforcement learning algorithm to determine target values of the N single-dimensional variables;
Assuming the dimension of the parameter X to be determined is N, it can be expressed as X = [x₁, x₂, …, x_N]. The reinforcement learning algorithm adopts N agents; each agent is responsible for optimizing one single-dimensional variable, and the N agents perform learning steps on their respective variables in turn. The feasible domain of the i-th single-dimensional variable is discretized into D_i (i = 1, 2, …, N) cells, and the set of actions that the i-th agent can take is A_i = {1, 2, …, D_i}.
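A sketch of this discretization, with illustrative (assumed) bounds for each variable; agent i's action set is then just the index set of its D_i cells.

```python
import numpy as np

def make_grid(lower: float, upper: float, D: int) -> np.ndarray:
    """Split the feasible interval of one single-dimensional variable
    into D cells and return the cell centers."""
    edges = np.linspace(lower, upper, D + 1)
    return (edges[:-1] + edges[1:]) / 2.0

# Example: N = 4 pending parameters, each with an assumed feasible
# interval [0, 10] discretized into D_i = 50 cells; the i-th agent's
# action set A_i is then {0, ..., D_i - 1}.
grids = [make_grid(0.0, 10.0, 50) for _ in range(4)]
action_sets = [range(len(g)) for g in grids]
```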
Step S103: according to the target values of the N single-dimensional variables, determining the optimal value of the target function, and completing parameter setting of the PID controller;
The control object in the electrical control system is set to a given plant transfer function; a PID controller based on the reinforcement learning algorithm is adopted for control, and the input signal is a unit step signal.
In this embodiment, the PSO algorithm and the GA algorithm are selected for comparison with the PID parameter setting method based on the reinforcement learning algorithm; the three algorithms optimize the PID controller parameters of the same system. The parameters of the reinforcement learning algorithm are set as follows: N = 4, λ₁ = 0.5, λ₂ = 0.25, and the remaining two parameters are set to 1 and 10. The parameters of the PSO algorithm are set as: acceleration factors c₁ = c₂ = 2, population size 100. The parameters of the GA algorithm are set as: crossover and mutation rates 0.9 and 0.01, respectively, population size 100. The weights in the objective function are set as: ω₁ = 0.999, ω₂ = 0.001, ω₃ = 2, ω₄ = 100.
Fig. 3 shows the step response of the system under the three algorithms; the RL algorithm is the reinforcement learning algorithm of this embodiment. At time 0 the input signal steps from 0 to 1. The results show that all three algorithms eliminate oscillation and overshoot before the output signal reaches the steady-state value of 1; their performance is close, and the step response completes within 0.1 second. The system response of the RL algorithm almost coincides with that of PSO; the response of the GA algorithm is slightly faster in the rise phase but settles to the steady value slightly more slowly than the other two. Fig. 4 shows the average objective-function convergence over 10 optimization runs for the three algorithms. The PSO and RL algorithms converge to smaller objective function values than the GA algorithm, and the RL algorithm converges about twice as fast as PSO.
Step S104: and controlling a control object in the electrical control system by using the PID controller after parameter setting is finished.
The method provided by this embodiment optimizes the parameters of the PID controller with a reinforcement learning algorithm. The reinforcement learning algorithm avoids the population introduced by genetic and particle swarm algorithms, instead introducing agents to optimize the target function, which improves the convergence rate of the optimization process. Meanwhile, the reinforcement learning algorithm has a certain randomness and can escape local optima; it is also convenient to implement and practical.
Based on step S102 of the above embodiment, this embodiment provides the steps by which each agent learns each single-dimensional variable. Referring to fig. 5, fig. 5 is a flowchart of the method for each agent to learn each single-dimensional control variable; the specific optimization steps include:
S501: determining the current solution of the objective function after the i-th agent (i = 1, 2, …, N) selects a current action from the set of actions that can be taken for the i-th single-dimensional variable;
the environmental state is quantitatively represented by the objective function value.
S502: determining a reward function value corresponding to the current behavior according to a calculation rule of a preset reward function and the current solution of the objective function;
The environment is fed back to the agent as a reward function, characterizing whether the agent has taken a favorable action that shifts the environment to a better state. The value R_k of the reward function for the k-th step of the current behavior of the i-th agent is determined according to a preset formula, wherein J_k is the current solution of the objective function and J_best is the initial optimal solution of the objective function.
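The reward formula itself is given as an equation in the original that is not reproduced in this text; the relative-improvement form below is only an assumed stand-in consistent with the stated inputs J_k and J_best.

```python
def reward(J_k: float, J_best: float) -> float:
    # Assumed stand-in for the patent's reward formula: positive when the
    # current solution J_k improves on the best solution J_best, negative
    # when it is worse. Only the inputs are taken from the source.
    return (J_best - J_k) / J_best
```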
S503: updating the value function corresponding to the current behavior according to the reward function value so that the ith agent selects the next behavior according to the updated value function;
the value function corresponding to the jth action of the ith agent is V (i, j). And the agent updates the value function corresponding to the currently taken action according to the reward function and the path value. The path value refers to the value of the ith agent in the ith dimension of the variable, selecting a path search continuing to the left or right from the current jth grid, and is expressed as Ll(i, j), l 1 indicates a path to the left, and l 2 indicates a path to the right. The left and right path values are calculated by the value function corresponding to n lattices adjacent to the left and right sides of the jth lattice, as follows:
wherein,for the m-th element after descending the function of adjacent n values, λ1Is the weight of a value function and satisfies
In summary, the update rule of the value function is:

V_{k+1}(i,j) = (1 − α)V_k(i,j) + α[R_k + (1 − λ₂)L_max(i,j) + λ₂L_min(i,j)]

wherein V_k(i,j) is the value function corresponding to the current behavior; L_l(i,j) is a path value, with l = 1 indicating the path to the left and l = 2 the path to the right; λ₁ is the weight of the value function V_k(i,j); α is the learning rate, characterizing the impact of the new information [R_k + (1 − λ₂)L_max(i,j) + λ₂L_min(i,j)] on the value function; L_max(i,j) and L_min(i,j) are the maximum and minimum of the two path values, respectively; λ₂ and (1 − λ₂) are the weights of the minimum and maximum path values, with (1 − λ₂) > λ₂.
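This update rule translates directly into code; the defaults for α and λ₂ below are illustrative (λ₂ = 0.25 matches the embodiment's setting).

```python
def update_value(V, i, j, R_k, L_max, L_min, alpha=0.5, lam2=0.25):
    """Implements V_{k+1}(i,j) = (1-alpha)*V_k(i,j)
       + alpha*[R_k + (1-lam2)*L_max(i,j) + lam2*L_min(i,j)].
    V is a per-agent table of value functions (list of lists)."""
    V[i][j] = (1 - alpha) * V[i][j] + alpha * (
        R_k + (1 - lam2) * L_max + lam2 * L_min
    )
    return V[i][j]
```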
The agent selects the next action according to the updated value function. Before that, the agent first selects a path; the selection probability is governed by a temperature τ_k, with 0 < τ_k ≤ 1. When τ_k is large, the selection probabilities of the remaining, less favorable behaviors are close to one another; when τ_k is close to 0, the probability of each behavior being selected varies with the magnitude of its value function. The value of τ_k decreases gradually with the number of learning steps.
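The patent's selection formula is not reproduced in this text; a Boltzmann (softmax) rule over the two path values reproduces the described behavior, near-uniform choice for large τ_k and near-greedy choice as τ_k approaches 0, and is offered here only as an assumed sketch.

```python
import numpy as np

def select_path(L_left: float, L_right: float, tau_k: float) -> int:
    # Assumed Boltzmann selection over the two path values: large tau_k
    # gives near-uniform probabilities, while tau_k -> 0 strongly favours
    # the larger path value. Returns 1 for the left path, 2 for the right.
    prefs = np.array([L_left, L_right]) / tau_k
    p = np.exp(prefs - prefs.max())       # subtract max for numerical stability
    p /= p.sum()
    return int(np.random.choice([1, 2], p=p))
```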
Then, starting from the j-th cell, the agent selects one behavior from the n adjacent cells on the selected path, and the value of the next single-dimensional variable is determined randomly within the selected cell.
S504: adding different disturbances to all dimensions of the current solution;
In order to increase the diversity of solutions and prevent the algorithm from falling into a local optimum, after the N-th agent completes a learning step, the algorithm adds different disturbances to all dimensions of the current solution:

X ← X + Δ, Δ = [Δ₁, Δ₂, …, Δ_N]

wherein the disturbance quantity Δ is generated according to a covariance evolution algorithm.
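The covariance evolution algorithm that generates Δ is not detailed here; as a simplified stand-in, the sketch below draws each Δ_i from a zero-mean Gaussian with an illustrative scale.

```python
import numpy as np

def perturb(X: np.ndarray, sigma: float = 0.01) -> np.ndarray:
    # Simplified stand-in for the covariance-evolution perturbation:
    # X <- X + Delta, with Delta drawn i.i.d. from N(0, sigma^2) per
    # dimension. sigma is an illustrative scale, not a source value.
    return X + np.random.normal(0.0, sigma, size=len(X))
```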
S505: repeating steps S501 to S504 until the number of cycles reaches a preset number, completing the learning of the i-th single-dimensional variable.
Steps S501 to S504 are repeated; each time the i-th agent completes a learning step, the counter k is incremented by 1. When k reaches the preset threshold k_max, the algorithm terminates.
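Putting steps S501 to S504 together, a hypothetical outer loop might look as follows; `select_action`, `grid` and `update` are assumed per-agent helpers wrapping the path selection and value-function update sketched above, and `evaluate` maps a full parameter vector to the objective value J.

```python
def tune(agents, X, k_max, evaluate):
    """Hypothetical outer loop over steps S501-S504; reward() and
    perturb() are the sketches defined earlier."""
    J_best = evaluate(X)
    for k in range(k_max):                     # S505: preset number of cycles
        for i, agent in enumerate(agents):     # agents learn in turn
            j = agent.select_action()          # S501: pick a cell index
            X[i] = agent.grid[j]               # candidate value for x_i
            J_k = evaluate(X)                  # current solution
            R_k = reward(J_k, J_best)          # S502: reward value
            agent.update(j, R_k)               # S503: value-function update
            J_best = min(J_best, J_k)
        X = perturb(X)                         # S504: diversify the solution
    return X, J_best
```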
This embodiment provides a PID controller parameter setting method based on a reinforcement learning algorithm. The method does not depend on a population; it adopts the idea of repeated trial and error and completes parameter setting through the interaction of agents with an unknown environment. When the unknown environment changes, i.e., the controlled system is dynamically time-varying, the reinforcement learning algorithm optimizes the PID parameters online to perform tracking control of the system.
Referring to fig. 6, fig. 6 is a block diagram of an electrical control apparatus based on a PID controller according to an embodiment of the present invention; the specific device may include:
the building module 100 is configured to build a target function of a PID controller parameter tuning problem, where a to-be-determined parameter of the target function includes N single-dimensional variables;
the reinforcement learning module 200 is configured to perform discretization on the N single-dimensional variables, and learn the N single-dimensional variables by using N agents according to a reinforcement learning algorithm, so as to determine target values of the N single-dimensional variables;
a setting module 300, configured to determine an optimal value of the objective function according to the target values of the N single-dimensional variables, and complete parameter setting of the PID controller;
and the electrical control module 400 is used for controlling a control object in the electrical control system by using the PID controller after parameter setting is finished.
The electrical control apparatus based on the PID controller of this embodiment is used for implementing the aforementioned electrical control method based on the PID controller; specific implementations in the apparatus can therefore be found in the preceding example portions of the method. For example, the construction module 100, the reinforcement learning module 200, the setting module 300 and the electrical control module 400 are respectively used for implementing steps S101, S102, S103 and S104 of the method, so their specific implementations can refer to the descriptions of the corresponding embodiments and are not repeated here.
The specific embodiment of the present invention further provides an electrical control device based on a PID controller, including: a memory for storing a computer program; and the processor is used for realizing the steps of the electrical control method based on the PID controller when executing the computer program.
The present invention further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the electrical control method based on the PID controller.
The embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The electrical control method, device, equipment and computer readable storage medium based on the PID controller provided by the invention are described in detail above. The principles and embodiments of the present invention are explained herein using specific examples, which are presented only to assist in understanding the method and its core concepts. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.
Claims (10)
1. An electrical control method based on a PID controller is characterized by comprising the following steps:
constructing a target function of a PID controller parameter setting problem, wherein undetermined parameters of the target function comprise N single-dimensional variables;
discretizing the N single-dimensional variables, and learning the N single-dimensional variables by adopting N agents according to a reinforcement learning algorithm to determine target values of the N single-dimensional variables;
according to the target values of the N single-dimensional variables, determining the optimal value of the target function, and completing parameter setting of the PID controller;
and controlling a control object in the electrical control system by using the PID controller after parameter setting is finished.
2. The method of claim 1, wherein the constructing an objective function of a PID controller parameter tuning problem, wherein the pending parameters of the objective function include N single-dimensional variables comprises:
constructing a target function of the PID controller parameter setting problem:

J = ∫₀^∞ ( ω₁|e(t)| + ω₂u²(t) ) dt + ω₃·t_u, with the overshoot penalty term ω₄|ey(t)| added to the integrand when ey(t) < 0;

wherein e(t) is the tracking error of the PID controller; u(t) is the output of the PID controller; t_u is the rise time for the output signal y(t) of the electrical control system to rise from 10% to 90% of its steady-state value; ey(t) = y(t) − y(t−1) is the overshoot penalty: when ey(t) ≥ 0, ω₄ = 0; when ey(t) < 0, ω₄ ≠ 0 and ω₄ >> ω₁. The undetermined parameters of the target function comprise a first weight ω₁, a second weight ω₂, a third weight ω₃ and a fourth weight ω₄.
3. The method of claim 2, wherein the step of each agent learning each single-dimensional variable comprises:
S1: determining the current solution of the objective function after the i-th agent (i = 1, 2, …, N) selects a current action from the set of actions that can be taken for the i-th single-dimensional variable;
S2: determining the reward function value corresponding to the current behavior according to the calculation rule of a preset reward function and the current solution of the objective function;
S3: updating the value function corresponding to the current behavior according to the reward function value, so that the i-th agent selects the next behavior according to the updated value function;
S4: adding different disturbances to all dimensions of the current solution;
S5: repeating steps S1 to S4 until the number of cycles reaches a preset number, completing the learning of the i-th single-dimensional variable.
4. The method of claim 3, wherein determining the reward function value corresponding to the current behavior according to the preset reward function calculation rule and the current solution of the objective function comprises:
determining, according to a preset formula, the value R_k of the reward function for the k-th step of the current behavior of the i-th agent; wherein J_k is the current solution of the objective function, and J_best is the initial optimal solution of the objective function.
5. The method of claim 4, wherein said updating the value function corresponding to the current behavior according to the reward function value comprises:
according to V_{k+1}(i,j) = (1 − α)V_k(i,j) + α[R_k + (1 − λ₂)L_max(i,j) + λ₂L_min(i,j)], updating the value function corresponding to the current behavior;
wherein V_k(i,j) is the value function corresponding to the current behavior; L_l(i,j) is a path value, with l = 1 indicating the path to the left and l = 2 the path to the right; λ₁ is the weight of the value function V_k(i,j); α is the learning rate; L_max(i,j) and L_min(i,j) are the maximum and minimum of the two path values, respectively; λ₂ and (1 − λ₂) are the weights of the minimum and maximum path values, with (1 − λ₂) > λ₂.
6. An electrical control apparatus based on a PID controller, comprising:
the construction module, used for constructing a target function of the PID controller parameter setting problem, wherein the undetermined parameters of the target function comprise N single-dimensional variables;
the reinforcement learning module is used for discretizing the N single-dimensional variables, learning the N single-dimensional variables by adopting N agents according to a reinforcement learning algorithm, and determining target values of the N single-dimensional variables;
the setting module is used for determining the optimal value of the objective function according to the target values of the N single-dimensional variables to complete parameter setting of the PID controller;
and the electrical control module is used for controlling a control object in the electrical control system by utilizing the PID controller after parameter setting is finished.
7. The apparatus of claim 6, wherein the construction module is specifically configured to:
constructing a target function of the PID controller parameter setting problem:

J = ∫₀^∞ ( ω₁|e(t)| + ω₂u²(t) ) dt + ω₃·t_u, with the overshoot penalty term ω₄|ey(t)| added to the integrand when ey(t) < 0;

wherein e(t) is the tracking error of the PID controller; u(t) is the output of the PID controller; t_u is the rise time for the output signal y(t) of the electrical control system to rise from 10% to 90% of its steady-state value; ey(t) = y(t) − y(t−1) is the overshoot penalty: when ey(t) ≥ 0, ω₄ = 0; when ey(t) < 0, ω₄ ≠ 0 and ω₄ >> ω₁. The undetermined parameters of the target function comprise a first weight ω₁, a second weight ω₂, a third weight ω₃ and a fourth weight ω₄.
8. The apparatus of claim 7, wherein the reinforcement learning module comprises:
a selecting unit, configured to determine the current solution of the objective function after the i-th agent (i = 1, 2, …, N) selects a current behavior from the set of behaviors that can be taken for the i-th single-dimensional variable;
the determining unit is used for determining a reward function value corresponding to the current behavior according to a calculation rule of a preset reward function and the current solution of the target function;
the updating unit is used for updating the value function corresponding to the current behavior according to the reward function value so that the ith agent can select the next behavior according to the updated value function;
the disturbance unit is used for adding different disturbances to all dimensions of the current solution;
a loop unit, configured to repeat the following until the number of cycles reaches a preset number, thereby completing the learning of the i-th single-dimensional variable: determining the current solution of the objective function after the i-th agent (i = 1, 2, …, N) selects a current behavior from the set of behaviors that can be taken for the i-th single-dimensional variable; determining the reward function value corresponding to the current behavior according to the calculation rule of the preset reward function and the current solution of the objective function; updating the value function corresponding to the current behavior according to the reward function value so that the i-th agent selects the next behavior according to the updated value function; and adding different disturbances to all dimensions of the current solution.
9. An electrical control apparatus based on a PID controller, comprising:
a memory for storing a computer program;
a processor for implementing the steps of a PID controller based electrical control method according to any of claims 1 to 5 when executing said computer program.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when being executed by a processor, carries out the steps of a PID controller-based electrical control method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910722233.5A CN110320796A (en) | 2019-08-06 | 2019-08-06 | Electrical control method, device and equipment based on PID controller |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110320796A true CN110320796A (en) | 2019-10-11 |
Family
ID=68125626
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910722233.5A Pending CN110320796A (en) | 2019-08-06 | 2019-08-06 | Electrical control method, device and equipment based on PID controller |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110320796A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104200096A (en) * | 2014-08-29 | 2014-12-10 | 中国南方电网有限责任公司超高压输电公司昆明局 | Lightning arrester grading ring optimization method based on differential evolutionary algorithm and BP neural network |
CN105911868A (en) * | 2016-06-15 | 2016-08-31 | 南京工业大学 | Multi-batch intermittent reactor two-dimensional iterative learning feedback control method |
EP3357651A2 (en) * | 2017-02-06 | 2018-08-08 | Seiko Epson Corporation | Control device, robot, and robot system |
CN106896716A (en) * | 2017-04-17 | 2017-06-27 | 华北电力大学(保定) | Micro-capacitance sensor alternating current-direct current section transverter pid parameter optimization method based on grey wolf algorithm |
Non-Patent Citations (1)
Title |
---|
X. Y. Shang et al., "Parameter Optimization of PID Controllers by Reinforcement Learning", 5th Computer Science and Electronic Engineering Conference (CEEC) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118011783A (en) * | 2024-04-09 | 2024-05-10 | 天津仁爱学院 | Construction environment PID control method based on improved barrel jellyfish algorithm |
CN118011783B (en) * | 2024-04-09 | 2024-06-04 | 天津仁爱学院 | Construction environment PID control method based on improved barrel jellyfish algorithm |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20191011 |