CN101667012A - Method for controlling reinforcement learning adaptive proportion integration differentiation-based distribution static synchronous compensator - Google Patents
- Publication number
- CN101667012A (application CN 200810051135 / CN200810051135A)
- Authority
- CN
- China
- Prior art keywords
- voltage
- value
- reinforcement learning
- proportion integration
- integration differentiation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Feedback Control In General (AREA)
Abstract
The invention discloses a method for controlling a distribution static synchronous compensator based on reinforcement learning adaptive proportion integration differentiation (PID), which comprises the following steps: deducing the current-voltage conversion equation in the d-q coordinate system from the instantaneous power balance formula; in the control process, regulating the error between the given voltage value and the measured value, and the error between the DC-side capacitor voltage command value and the measured value, with the reinforcement learning adaptive PID control algorithm so as to obtain the active and reactive command signals; performing the voltage-current conversion in the mathematical model to obtain the voltage signals; and carrying out the coordinate transformation to obtain the required voltage modulation signals. By using the reinforcement learning adaptive PID control algorithm, the method avoids the unstable controller performance caused by equivalent-parameter variation in conventional PID control, realizes the adaptive capability of the controller, and improves the control accuracy.
Description
Technical field
The present invention relates to a control method for a distribution static synchronous compensator, and in particular to a control method for a distribution static synchronous compensator (DSTATCOM) based on reinforcement learning adaptive PID.
Background technology
With the development of science and technology, China's power system has developed rapidly. Inductive loads occupy a large proportion of industrial and household electricity loads, and inductive loads must absorb reactive power to operate normally; at the same time, the large harmonic currents these loads produce also consume reactive power. The consumption of large amounts of reactive power causes a series of power quality problems in the distribution network, such as voltage fluctuation, flicker and three-phase imbalance. The distribution system also contains many rapidly varying impact loads, such as electric arc furnaces, which can cause voltage flicker and unbalance the system power and voltage. On the other hand, with the rapid development of the national economy and of science and technology, all industries place ever higher requirements on power quality; in particular, with the widespread application of electronic devices and precision equipment, users expect power supply enterprises to provide efficient, high-quality electric energy. Once power quality problems occur, they cause equipment failure in mild cases and damage to the whole system in severe cases, and the resulting losses are difficult to estimate. Power quality therefore concerns the safe, stable, economical and reliable operation of the entire power system and its equipment, as well as the overall benefit and development strategy of the national economy. Just as power systems urgently needed advanced transmission and distribution technology to improve power quality and system stability, with the rapid development of power electronics and modern control technology, a new technology for transforming transmission and distribution capability, the Flexible AC Transmission System (FACTS), quietly arose. The distribution static synchronous compensator (DSTATCOM) represents the development trend of reactive power compensation devices in future power systems; by combining power electronics with modern control technology, it can comprehensively solve multiple power quality problems in the distribution network. Conventional controller design for a distribution network STATCOM is based on a local linearization model; because of the nonlinearity of the DSTATCOM model and the uncertainty of its equivalent parameters, its control is difficult and complex. At present traditional PID control is most widely used; with PID control, when the equivalent parameters are measured inaccurately or change, controller performance degrades or even becomes unstable, and in serious cases control misoperation can burn out the DSTATCOM device. Realizing the adaptive ability of the DSTATCOM controller is therefore of great importance.
Summary of the invention
The technical problem to be solved by the invention is to provide a distribution static synchronous compensator control method based on reinforcement learning adaptive PID with good dynamic and static properties, so as to realize the adaptive ability of the DSTATCOM controller.
To solve the above technical problem, the invention provides a distribution static synchronous compensator control method based on reinforcement learning adaptive proportion integration differentiation. According to the instantaneous power balance principle, the mathematical model of the distribution static synchronous compensator is established and transformed from the stationary frame into the dq0 coordinate system by a transformation matrix, showing that the distribution static synchronous compensator system is a typical two-input, two-output coupled nonlinear system. The method is characterized in that: the error between the voltage command value and the actual measured value is regulated by reinforcement learning adaptive proportion integration differentiation to form the reactive command current signal; the error between the DC capacitor voltage command value and the actual measured value is regulated by reinforcement learning adaptive proportion integration differentiation to form the active command current signal. The reactive and active command current signals are transformed through the voltage-current relation in the mathematical model to form the reactive and active command voltage signals, which, after the dq/abc coordinate transform, serve as modulation signals; after triangular-carrier modulation, pulse-width modulation (PWM) drive signals are produced to control the action of the Intelligent Power Module and generate the voltage to be compensated, thereby keeping the DC capacitor voltage and the point-of-common-coupling (PCC) voltage constant.
The present invention introduces the reinforcement learning adaptive PID control algorithm into the control method of the DSTATCOM. In the control process, the reinforcement learning algorithm trains and learns K_P, K_I and K_D, and a decoupling requirement is added to the learning, so that the control system can automatically adjust the K_P, K_I and K_D values according to variations of the model parameters and reach a satisfactory control result.
The DSTATCOM control method based on reinforcement learning adaptive PID of the present invention keeps the DC capacitor voltage and the system node voltage constant and realizes effective reactive power compensation. The control algorithm derives the current-voltage model in the dq0 coordinate system from the instantaneous power balance principle, and proposes a reinforcement learning adaptive PID control algorithm suited to this control system. The method avoids the unstable controller performance that occurs in traditional PID control when the equivalent parameters change, realizes the adaptive capability of the controller, and improves the control accuracy.
Description of drawings
Fig. 1 is the DSTATCOM main circuit structure diagram of the present invention;
Fig. 2 is the control system schematic diagram of the present invention;
Fig. 3 is the control algorithm flow chart of the reinforcement learning adaptive PID.
Embodiment
With reference to Fig. 1, u denotes the three-phase voltage of the grid; e and i denote the three-phase output voltage and current of the DSTATCOM, respectively; the resistance R and inductance L represent the device losses, the line reactance and the leakage reactance of the coupling transformer. The system three-phase voltage is assumed to be given by formula (1), and the DSTATCOM output voltage by formula (2), where K is the voltage-ratio coefficient and δ, the controlled quantity, is the phase angle between the DSTATCOM output voltage u_c and the system voltage u_s.
The current-voltage conversion formulas in the dq0 coordinate system are obtained from the instantaneous power balance as follows. According to the power balance principle, the DSTATCOM output power equals the power injected into the system plus the power consumed by the equivalent resistance and reactance, that is:

P_e = P_o + P_f    (3)

Q_e = Q_o + Q_f    (4)

Selecting the d-axis of the synchronous rotating frame to coincide with the voltage vector at the PCC access point gives:

u_d = u,  u_q = 0    (5)

Substituting formula (5) into the power balance formula yields:

e_d = i_d·R - i_q·ωL + u
e_q = i_q·R + i_d·ωL    (6)

The formulas above realize the conversion from the currents i_d, i_q to the voltages e_d, e_q in the d-q coordinate system.
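Equation (6) can be checked numerically. In the sketch below the parameter values (R, L, ω, u) and the current operating point are illustrative assumptions, not values taken from the patent:

```python
import math

# Numeric check of equation (6): converting the d-q axis currents
# (i_d, i_q) into the DSTATCOM command voltages (e_d, e_q).
# All numeric values below are illustrative assumptions.

def dq_current_to_voltage(i_d, i_q, R, L, omega, u):
    """e_d = i_d*R - i_q*omega*L + u ;  e_q = i_q*R + i_d*omega*L"""
    e_d = i_d * R - i_q * omega * L + u
    e_q = i_q * R + i_d * omega * L
    return e_d, e_q

e_d, e_q = dq_current_to_voltage(i_d=10.0, i_q=-5.0, R=0.1, L=2e-3,
                                 omega=2 * math.pi * 50, u=311.0)
```

Note how the cross-coupling terms i_q·ωL and i_d·ωL tie the two channels together; this is the coupling that the decoupling requirement in the learning process later addresses.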
As can be seen from the two formulas above, the formation of the DSTATCOM voltage control commands e_d, e_q is closely related to the equivalent parameters R and L. If conventional PID control is used to form i_q* and i_d*, controller performance is excellent when the equivalent parameters are measured accurately and remain unchanged; however, the equivalent parameters are difficult to measure accurately and may vary with operating conditions and with device aging over long-term operation, which makes them uncertain. To improve the robustness of the controller against disturbances of the system's major parameters and to realize the adaptive ability of the system, the present invention proposes a DSTATCOM control method based on reinforcement learning adaptive PID.
With reference to Fig. 2, the error between the voltage command value U_pcc* and the actual measured value U_abc is regulated by reinforcement learning PID to form the reactive command current signal i_q*; the error between the DC capacitor voltage command value U_dc* and the actual measured value U_dc is regulated by reinforcement learning PID to form the active command current signal i_d*. i_q* and i_d* are transformed through the voltage-current relation of the mathematical model to form the reactive command voltage signal e_q* and the active command voltage signal e_d*. After the dq/abc coordinate transform, e_q* and e_d* serve as modulation signals; after triangular-carrier modulation, pulse-width modulation (PWM) drive signals are produced to control the action of the Intelligent Power Module, generating the voltage to be compensated and thereby keeping the DC capacitor voltage and the point-of-common-coupling (PCC) voltage constant.
Introduction to the reinforcement learning adaptive PID control algorithm:
Reinforcement learning is a machine learning method that adapts to its environment using environmental feedback as input; trial-and-error search and delayed reward are its two key characteristics. Among reinforcement learning algorithms, Q-learning performs comparatively well. In Q-learning, the action-value function Q(s, a) being learned is defined as the maximum discounted cumulative reward obtainable when starting from state s and taking a as the first action: the Q value equals the immediate reward for executing the action, plus the value (discounted by γ) of following the optimal policy from the resulting state. After the reinforcement learning process is finished, the system obtains the relatively optimal action for each state through the mapping of the Q matrix.
Traditional PID parameter tuning uses fixed formulas; the tuning effect is not ideal and adapts poorly to a changing power grid environment. Applying the reinforcement learning algorithm to PID parameter tuning gives the PID parameters better adaptability. The Q-learning algorithm adjusts the PID parameters K_P, K_I and K_D as follows:
1. Initialize the Q value matrix of the parameters; the Q value matrix records each state s and the cumulative reward expected from selecting action a in that state. Here a state is the current K_P, K_I and K_D values, and an action a adjusts these parameter values and outputs the corresponding K_P, K_I and K_D values. At the same time, initialize the learning factor α and the discount factor γ;
2. Select and execute an action a according to the current state s and the action selection policy π;
3. Calculate the reward r from the input K_P, K_I and K_D values, and enter the new state s_1;
4. Update the Q value using the formula Q(s, a) = (1 - α)·Q(s, a) + α·(r + γ·max Q(s', a')), where α is the learning factor; the learning factor decreases gradually as the number of learning iterations increases and finally reaches zero, which marks the end of the learning process, since the Q value is then no longer updated. As can be seen from the formula, the Q-learning iteration is off-policy: it always takes the maximum Q value as the iteration input. After repeated iteration, Q(s_t, a_t) gradually approaches the ideal Q(s, a). Adjust the learning factor and return to step 2 until the learning factor is 0;
5. Output the resulting K_P, K_I and K_D values;
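Steps 1-5 can be sketched as a toy Q-learning loop. Everything below is an illustrative assumption rather than the patent's implementation: the discrete candidate gain values, the state encoded as an index triple into those value lists, actions that nudge one gain up or down, and a reward computed by running an incremental PID loop on a simple first-order plant as a stand-in for the real DSTATCOM dynamics:

```python
import random

# Toy Q-learning loop for tuning (K_P, K_I, K_D); all numeric choices
# (gain grids, plant, episode count, epsilon-greedy policy) are assumptions.

KP_VALS = [0.2, 0.5, 1.0]
KI_VALS = [0.05, 0.1, 0.2]
KD_VALS = [0.0, 0.01, 0.05]
ACTIONS = [(axis, d) for axis in range(3) for d in (-1, 1)]  # (which gain, +/-1)

def step_state(state, action):
    """Apply an action: move one gain index up or down, clamped to the grid."""
    axis, d = action
    sizes = (len(KP_VALS), len(KI_VALS), len(KD_VALS))
    s = list(state)
    s[axis] = min(max(s[axis] + d, 0), sizes[axis] - 1)
    return tuple(s)

def reward(state):
    """Negative accumulated |error| of an incremental PID run on y' = -y + u."""
    kp, ki, kd = KP_VALS[state[0]], KI_VALS[state[1]], KD_VALS[state[2]]
    y, u, e1, e2, cost = 0.0, 0.0, 0.0, 0.0, 0.0
    for _ in range(50):                     # setpoint 1.0, time step 0.1
        e = 1.0 - y
        u += kp * e + ki * (e - e1) + kd * (e - 2 * e1 + e2)  # step-6 pairing
        e2, e1 = e1, e
        y += 0.1 * (-y + u)
        cost += abs(e)
    return -cost

Q = {}                              # the Q "matrix" as a dict keyed by (state, action)
alpha, gamma, eps = 0.5, 0.9, 0.2   # learning factor, discount factor, exploration
state = (0, 0, 0)
random.seed(0)
for _ in range(400):
    # step 2: choose an action with an epsilon-greedy selection policy pi
    if random.random() < eps:
        a = random.choice(ACTIONS)
    else:
        a = max(ACTIONS, key=lambda act: Q.get((state, act), 0.0))
    s1 = step_state(state, a)       # step 3: act, enter the new state...
    r = reward(s1)                  # ...and compute its reward
    # step 4: Q(s,a) = (1-alpha)*Q(s,a) + alpha*(r + gamma*max Q(s',a'))
    best_next = max(Q.get((s1, act), 0.0) for act in ACTIONS)
    Q[(state, a)] = (1 - alpha) * Q.get((state, a), 0.0) + alpha * (r + gamma * best_next)
    state = s1
    alpha *= 0.99                   # the learning factor decays toward zero

best = max(Q, key=Q.get)[0]         # step 5: read off the preferred gain triple
```

The dict-based Q table stands in for the Q value matrix of step 1; in the patent's setting the reward would be computed from the closed-loop DSTATCOM response rather than from a toy plant.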
6. The PID controller adopts the incremental PID control algorithm. The control increment is:

Δu(k) = K_P·X(1) + K_I·X(2) + K_D·X(3)

where: X(1) = e(k); X(2) = e(k) - e(k-1); X(3) = e(k) - 2e(k-1) + e(k-2); and

u(k) = u(k-1) + Δu(k)
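The incremental PID law of step 6 can be transcribed directly; the class name and interface below are illustrative, while the X(1..3) terms and the update u(k) = u(k-1) + Δu(k) follow the formulas as stated:

```python
# Incremental PID controller implementing the step-6 formulas verbatim,
# including the stated pairing of X(1..3) with K_P, K_I and K_D.

class IncrementalPID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.e1 = 0.0   # e(k-1)
        self.e2 = 0.0   # e(k-2)
        self.u = 0.0    # u(k-1)

    def update(self, e):
        x1 = e                              # X(1) = e(k)
        x2 = e - self.e1                    # X(2) = e(k) - e(k-1)
        x3 = e - 2 * self.e1 + self.e2      # X(3) = e(k) - 2e(k-1) + e(k-2)
        du = self.kp * x1 + self.ki * x2 + self.kd * x3
        self.u += du                        # u(k) = u(k-1) + du(k)
        self.e2, self.e1 = self.e1, e
        return self.u

pid = IncrementalPID(kp=1.0, ki=0.5, kd=0.1)   # assumed gain values
u1 = pid.update(2.0)   # first call: e(k-1) = e(k-2) = 0
u2 = pid.update(1.0)
```

Because only the increment Δu(k) is computed each step, the controller keeps just two past errors as state, which is what makes it convenient to re-parameterize on the fly with the gains the Q-learning loop outputs.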
In the training and learning process of the reinforcement learning adaptive PID control algorithm, a decoupling-control requirement is added, so that the result of the training and learning also satisfies the decoupling-control function and the control accuracy of the system is improved.
Claims (2)
1. A distribution static synchronous compensator control method based on reinforcement learning adaptive proportion integration differentiation, in which the mathematical model of the distribution static synchronous compensator is first established according to the instantaneous power balance principle and transformed from the stationary frame into the dq0 coordinate system by a transformation matrix, characterized in that: the error between the voltage command value and the actual measured value is regulated by the reinforcement learning adaptive proportion integration differentiation control algorithm to form the reactive command current signal; the error between the DC capacitor voltage command value and the actual measured value is regulated by reinforcement learning adaptive proportion integration differentiation to form the active command current signal; after the reactive and active command current signals are obtained, they are transformed through the voltage-current relation in the mathematical model to form the reactive and active command voltage signals, which, after the dq/abc coordinate transform, serve as modulation signals; after triangular-carrier modulation, pulse-width modulation (PWM) drive signals are produced to control the action of the Intelligent Power Module and generate the voltage to be compensated.
2. The distribution static synchronous compensator control method based on reinforcement learning adaptive proportion integration differentiation according to claim 1, characterized in that the reinforcement learning adaptive proportion integration differentiation control algorithm is implemented as follows:
(1) initialize the Q value matrix of the parameters; the Q value matrix records each state s and the cumulative reward expected from selecting action a in that state;
(2) select and execute an action a according to the current state s and the action selection policy π;
(3) calculate the reward r from the input K_P, K_I and K_D values, and enter the new state s_1;
(4) update the Q value using the formula Q(s, a) = (1 - α)·Q(s, a) + α·(r + γ·max Q(s', a')); after repeated iteration, Q(s_t, a_t) gradually approaches Q(s, a); adjust the learning factor and return to step (2) until the learning factor is 0;
(5) output the resulting K_P, K_I and K_D values;
(6) feed the output K_P, K_I and K_D values into the adaptive proportion integration differentiation controller, and control with the incremental adaptive proportion integration differentiation control algorithm.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 200810051135 CN101667012A (en) | 2008-09-03 | 2008-09-03 | Method for controlling reinforcement learning adaptive proportion integration differentiation-based distribution static synchronous compensator |
Publications (1)
Publication Number | Publication Date |
---|---|
CN101667012A true CN101667012A (en) | 2010-03-10 |
Family
ID=41803657
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN 200810051135 Pending CN101667012A (en) | 2008-09-03 | 2008-09-03 | Method for controlling reinforcement learning adaptive proportion integration differentiation-based distribution static synchronous compensator |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101667012A (en) |
- 2008-09-03: CN application CN 200810051135 filed (patent CN101667012A/en), status Pending
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101917010A (en) * | 2010-07-27 | 2010-12-15 | 荣信电力电子股份有限公司 | Compound control structure for balanced output of multiple sets of automatically controlled power equipment |
CN102787915A (en) * | 2012-06-06 | 2012-11-21 | 哈尔滨工程大学 | Diesel engine electronic speed adjusting method based on reinforced study of proportion integration differentiation (PID) controller |
CN106707752A (en) * | 2016-12-21 | 2017-05-24 | 大连理工大学 | Improved algorithm for solving state feedback gain matrix of current source STATCOM (static synchronous compensator) |
CN107943022A (en) * | 2017-10-23 | 2018-04-20 | 清华大学 | A kind of PID locomotive automatic Pilot optimal control methods based on intensified learning |
CN108014926A (en) * | 2018-02-05 | 2018-05-11 | 吉林建筑大学 | The adjustable electrostatic precipitator of voltage and method |
CN108014926B (en) * | 2018-02-05 | 2024-05-03 | 吉林建筑大学 | Electrostatic dust collection device and method with adjustable voltage |
CN110095654A (en) * | 2019-05-09 | 2019-08-06 | 东北电力大学 | A kind of power grid inductance detection method |
CN110488759A (en) * | 2019-08-09 | 2019-11-22 | 西安交通大学 | A kind of numerically-controlled machine tool feeding control compensation methods based on Actor-Critic algorithm |
CN112542161A (en) * | 2020-12-10 | 2021-03-23 | 长春工程学院 | BP neural network voice recognition method based on double-layer PID optimization |
CN112542161B (en) * | 2020-12-10 | 2022-08-12 | 长春工程学院 | BP neural network voice recognition method based on double-layer PID optimization |
CN116581770A (en) * | 2022-11-24 | 2023-08-11 | 长春工程学院 | Micro-grid system VSG double-droop control method based on self-adaptive neural network |
CN116581770B (en) * | 2022-11-24 | 2024-02-20 | 长春工程学院 | Micro-grid system VSG double-droop control method based on self-adaptive neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C02 | Deemed withdrawal of patent application after publication (patent law 2001) | ||
WD01 | Invention patent application deemed withdrawn after publication |
Open date: 20100310 |