CN101667012A - Method for controlling a distribution static synchronous compensator based on reinforcement-learning adaptive proportion integration differentiation


Info

Publication number
CN101667012A
CN101667012A (application CN 200810051135)
Authority
CN
China
Prior art keywords
voltage
value
reinforcement learning
proportion integration differentiation
Prior art date
2008-09-03
Legal status
Pending
Application number
CN 200810051135
Other languages
Chinese (zh)
Inventor
孟祥萍 (Meng Xiangping)
谭万禹 (Tan Wanyu)
孙贵新 (Sun Guixin)
纪秀 (Ji Xiu)
Current Assignee
Electric Power Research Institute Of Jilin Electric Power Co
Changchun Institute of Applied Chemistry of CAS
Changchun Institute Technology
Original Assignee
Electric Power Research Institute Of Jilin Electric Power Co
Changchun Institute Technology
Priority date
2008-09-03
Filing date
2008-09-03
Publication date
2010-03-10
Application filed by Electric Power Research Institute Of Jilin Electric Power Co and Changchun Institute Technology
Priority to CN 200810051135
Publication of CN101667012A

Landscapes

  • Feedback Control In General (AREA)

Abstract

The invention discloses a method for controlling a distribution static synchronous compensator based on reinforcement-learning adaptive proportion integration differentiation (PID) control, which comprises the following steps: deriving the current-voltage conversion equation in the d-q coordinate system from the instantaneous power balance formula; in the control process, regulating the error between the given voltage value and its measured value, and the error between the DC-side capacitor voltage command value and its measured value, with the reinforcement-learning adaptive PID control algorithm so as to obtain the active and reactive command signals; performing the voltage-current conversion in the mathematical model to obtain the voltage signals; and carrying out the coordinate transformation to obtain the required voltage modulation signals. By using the reinforcement-learning adaptive PID control algorithm, the method avoids the unstable controller performance caused by variation of the equivalent parameters in conventional PID control, realizes the adaptive capability of the controller, and improves the control accuracy.

Description

Control method for a distribution static synchronous compensator based on reinforcement-learning adaptive proportion integration differentiation
Technical field
The present invention relates to a control method for a distribution static synchronous compensator (DSTATCOM), and in particular to a DSTATCOM control method based on reinforcement-learning adaptive PID control.
Background technology
With the development of science and technology, China's power system has grown rapidly. Inductive loads occupy a very large proportion of industrial and household electricity consumption, and such loads must absorb reactive power to operate normally; at the same time, the large harmonic currents they produce also consume reactive power. The consumption of large amounts of reactive power causes a series of power quality problems in the distribution network, such as voltage fluctuation, flicker, and three-phase imbalance. The distribution system also contains many rapidly fluctuating impact loads, such as arc furnace loads, which cause voltage flicker and unbalance the system's power and voltage. On the other hand, with the rapid development of the national economy and of science and technology, all industries place ever higher demands on power quality; in particular, the widespread use of electronic devices and precision equipment makes users expect power suppliers to deliver efficient, high-quality electric energy. Once power quality problems occur, the consequences range from equipment failure in mild cases to damage of the entire system in severe ones, and the resulting losses are difficult to estimate. Power quality therefore bears on the safe, stable, economical, and reliable operation of the whole power system and its equipment, and on the overall benefit and development strategy of the national economy. Just as power systems urgently need advanced transmission and distribution technology to improve power quality and system stability, a new technology for improving power transmission and distribution capability, the flexible AC transmission system (FACTS), has quietly risen with the rapid development of power electronics and modern control technology. The distribution static synchronous compensator (DSTATCOM) represents the development trend of reactive power compensation in future power systems; by combining power electronics with modern control technology, it can comprehensively solve a variety of power quality problems in the distribution network. The conventional controller design for a distribution-network STATCOM is based on a local linearization of its model; because the DSTATCOM model is nonlinear and its equivalent parameters are uncertain, its control is difficult and complex. Traditional PID control is currently the most common approach, but with PID control, when the equivalent parameters are measured inaccurately or change, the performance of the controller degrades or even becomes unstable, and in serious cases control misoperation can burn out the DSTATCOM device. Realizing the adaptive capability of the DSTATCOM controller is therefore of great importance.
Summary of the invention
The technical problem to be solved by this invention is to provide a distribution static synchronous compensator control method based on reinforcement-learning adaptive PID with good dynamic and static performance, so as to realize the adaptive capability of the DSTATCOM controller.
To solve the above technical problem, the invention provides a distribution static synchronous compensator control method based on reinforcement-learning adaptive proportion integration differentiation. According to the instantaneous power balance principle, the mathematical model of the distribution static synchronous compensator is written out and transformed from the stationary frame into the dq0 coordinate system by a transformation matrix, showing that the distribution static synchronous compensator is a typical two-input, two-output coupled nonlinear system. The method is characterized in that: the error between the voltage command value and the measured value is regulated by reinforcement-learning adaptive proportion integration differentiation to form the reactive command current signal; the error between the DC capacitor voltage command value and the measured value is regulated by reinforcement-learning adaptive proportion integration differentiation to form the active command current signal; the reactive and active command current signals are transformed through the voltage-current relation of the mathematical model to form the reactive and active command voltage signals; after the dq/abc coordinate transformation, the command voltage signals serve as modulation signals and, after triangular-carrier modulation, produce the pulse-width modulation (PWM) drive signals that control the switching of the Intelligent Power Module, generating the required compensation voltage and thereby keeping the DC capacitor voltage and the point-of-common-coupling (PCC) voltage constant.
The invention introduces the reinforcement-learning adaptive PID control algorithm into the control method of the DSTATCOM. During control, the reinforcement-learning algorithm trains and learns K_P, K_I and K_D, and a decoupling requirement is added to the learning, so that the control system automatically adjusts the values of K_P, K_I and K_D as the model parameters vary and achieves a satisfactory control result.
The DSTATCOM control method based on reinforcement-learning adaptive PID of the invention keeps the DC capacitor voltage and the system node voltage constant and realizes effective reactive power compensation. The control algorithm derives the current-voltage model in the dq0 frame from the instantaneous power balance principle and proposes a reinforcement-learning adaptive PID control algorithm suited to this control system. The method avoids the unstable controller performance that arises in traditional PID control when the equivalent parameters change, realizes the adaptive capability of the controller well, and improves the control accuracy.
Description of drawings
Fig. 1 is the main circuit structure diagram of the DSTATCOM of the invention;
Fig. 2 is the schematic diagram of the control system of the invention;
Fig. 3 is the flow chart of the reinforcement-learning adaptive PID control algorithm.
Embodiment
With reference to Fig. 1, u denotes the three-phase voltage of the grid; e and i denote the three-phase output voltage and current of the DSTATCOM, respectively; the resistance R and the inductance L denote the device losses together with the line reactance and the leakage reactance of the coupling transformer. Suppose the system three-phase voltage is:
$$u = \begin{bmatrix} u_a \\ u_b \\ u_c \end{bmatrix} = \sqrt{2}\,V_s \begin{bmatrix} \sin\omega t \\ \sin(\omega t - 2\pi/3) \\ \sin(\omega t + 2\pi/3) \end{bmatrix} \qquad (1)$$
Suppose that the DSTATCOM output voltage is:
$$e = \begin{bmatrix} e_a \\ e_b \\ e_c \end{bmatrix} = K u_{dc} \begin{bmatrix} \sin(\omega t - \delta) \\ \sin(\omega t - 2\pi/3 - \delta) \\ \sin(\omega t + 2\pi/3 - \delta) \end{bmatrix} \qquad (2)$$
In formula (2), K is the voltage-ratio (modulation) coefficient and δ is the phase angle between the DSTATCOM output voltage u_c and the system voltage u_s; both are controlled quantities.
The current-voltage conversion formula in the dq0 coordinate system is obtained from the instantaneous power balance as follows. According to the power balance principle, the DSTATCOM output power equals the power injected into the system plus the power consumed by the equivalent resistance and reactance, that is:
$$P_e = P_o + P_f \qquad (3)$$
$$Q_e = Q_o + Q_f \qquad (4)$$
Aligning the d-axis of the synchronous rotating frame with the voltage vector at the PCC gives:
$$u_d = u, \qquad u_q = 0 \qquad (5)$$
Substituting (5) into the power balance formulas yields:
$$e_d = i_d R - i_q \omega L + u, \qquad e_q = i_q R + i_d \omega L \qquad (6)$$
Formula (6) realizes the conversion from the currents i_d, i_q to the voltages e_d, e_q in the d-q coordinate system.
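As a quick numerical illustration of formula (6), consider the following sketch (Python). The values of R, L, ω, u and the current commands are assumed example values added here for illustration; they are not taken from the patent.

```python
import math

# Assumed example values for the equivalent parameters of Fig. 1.
R, L = 0.1, 5e-3                  # ohms, henries
w = 2 * math.pi * 50              # rad/s for a 50 Hz grid
u = 311.0                         # d-axis PCC voltage (V), since u_d = u, u_q = 0 by (5)

i_d, i_q = 10.0, -20.0            # example current commands (A)
e_d = i_d * R - i_q * w * L + u   # formula (6), d-axis
e_q = i_q * R + i_d * w * L       # formula (6), q-axis
print(e_d, e_q)                   # -> 343.4159..., 13.7079...
```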
As the two formulas above show, the formation of the DSTATCOM voltage control commands e_d, e_q is closely related to the equivalent parameters R and L. If conventional PID control is used to form i_q* and i_d*, the controller performs well as long as the equivalent parameters are measured accurately and remain unchanged; however, the equivalent parameters are difficult to measure accurately and become uncertain as operating conditions change and devices age under long-term operation. To improve the robustness of the controller against disturbances of the main system parameters and realize the adaptive capability of the system, the invention proposes a DSTATCOM control method based on reinforcement-learning adaptive PID.
With reference to Fig. 2, the error between the voltage command value U*_pcc and the measured value U_abc is regulated by reinforcement-learning PID to form the reactive command current signal i_q*; the error between the DC capacitor voltage command value U_dc* and the measured value U_dc is regulated by reinforcement-learning PID to form the active command current signal i_d*. i_q* and i_d* are then transformed through the voltage-current relation of the mathematical model into the reactive command voltage signal e_q* and the active command voltage signal e_d*. After the dq/abc coordinate transformation, e_q* and e_d* serve as modulation signals and, after triangular-carrier modulation, produce the PWM drive signals that control the switching of the Intelligent Power Module, generating the required compensation voltage and thereby keeping the DC capacitor voltage and the PCC voltage constant.
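The two-loop structure of Fig. 2 can be sketched in code. The following Python fragment is a minimal illustration only: the class and function names, gain values, and parameter handling are assumptions, and the gains would in practice be retuned online by the Q-learning procedure described below.

```python
class IncrementalPID:
    """Incremental PID law of step 6 below: u(k) = u(k-1) + Kp*X(1) + Ki*X(2) + Kd*X(3)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.e1 = self.e2 = 0.0   # e(k-1), e(k-2)
        self.u = 0.0              # u(k-1)

    def step(self, e):
        x1 = e                                  # X(1) = e(k)
        x2 = e - self.e1                        # X(2) = e(k) - e(k-1)
        x3 = e - 2 * self.e1 + self.e2          # X(3) = e(k) - 2e(k-1) + e(k-2)
        self.u += self.kp * x1 + self.ki * x2 + self.kd * x3
        self.e2, self.e1 = self.e1, e
        return self.u


def control_step(u_pcc_ref, u_pcc_meas, u_dc_ref, u_dc_meas, pid_q, pid_d, R, L, w, u):
    """One control cycle of Fig. 2: two RL-tuned PID loops plus the conversion of formula (6)."""
    iq_ref = pid_q.step(u_pcc_ref - u_pcc_meas)  # PCC voltage error -> reactive command iq*
    id_ref = pid_d.step(u_dc_ref - u_dc_meas)    # DC capacitor voltage error -> active command id*
    e_d = id_ref * R - iq_ref * w * L + u        # formula (6)
    e_q = iq_ref * R + id_ref * w * L
    return e_d, e_q   # e_d*, e_q* then pass through the dq/abc transform and carrier PWM
```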
Introduction to the reinforcement-learning adaptive PID control algorithm:
Reinforcement learning is a machine-learning method that adapts to its environment by taking environmental feedback as input; trial-and-error search and delayed reward are its two key features. Among reinforcement-learning algorithms, Q-learning is one of the more effective. In Q-learning, the agent learns an action-value function Q(s, a), defined as the maximum discounted cumulative reward obtainable by starting from state s and taking a as the first action: the value of Q equals the immediate reward of the action plus the (γ-discounted) value of following the optimal policy from the resulting state. After the reinforcement-learning process is complete, the system obtains the relatively optimal action for each state by looking up the Q matrix.
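In standard reinforcement-learning notation (a restatement for clarity, not text from the patent), the optimal action-value function described above satisfies

$$Q^*(s,a) = r(s,a) + \gamma \max_{a'} Q^*(s',a'),$$

and after learning the controller acts greedily, $\pi(s) = \arg\max_a Q(s,a)$.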
Traditional PID tuning relies on fixed formulas; the tuning effect is not ideal, and it is difficult to adapt to a changing grid environment. Applying a reinforcement-learning algorithm to PID tuning gives the PID parameters better adaptability. The Q-learning algorithm adjusts the PID parameters K_P, K_I and K_D respectively, in the following steps (a code sketch follows step 6):
1. Initialize the parameter Q-value matrix. The Q matrix records, for each state s, the cumulative reward expected from selecting action a in that state. Here the state is the current values of K_P, K_I and K_D, and the action a adjusts these parameter values and outputs the corresponding K_P, K_I and K_D. Also initialize the learning factor α and the discount factor γ;
2. Select and execute an action a according to the current state s and the action-selection policy π;
3. Calculate the reward r from the input K_P, K_I and K_D values, and enter the new state s′;
4. Update the Q value with the formula Q(s, a) = (1 − α)·Q(s, a) + α·(r + γ·max Q(s′, a′)). Here α is the learning factor; it decreases gradually as the number of learning iterations grows and finally reaches zero, which marks the end of the learning process, since the Q values are then no longer updated. As the formula shows, the Q-learning iteration is policy-independent: it always takes the maximum Q value as the iteration input. After repeated iterations, Q(s_t, a_t) progressively approaches the ideal Q(s, a). Adjust the learning factor and return to step 2 until the learning factor is 0;
5. Output the resulting K_P, K_I and K_D values;
6. The PID controller adopts the incremental PID control algorithm. The control deviation is
$$e(k) = U^*_{PCC}(k) - U_{abc}(k),$$
and the control increment is
$$\Delta u(k) = K_P X(1) + K_I X(2) + K_D X(3)$$
where
$$X(1) = e(k); \quad X(2) = e(k) - e(k-1); \quad X(3) = e(k) - 2e(k-1) + e(k-2);$$
$$u(k) = u(k-1) + \Delta u(k)$$
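Steps 1 to 6 can be read as the following sketch (Python). The discretization of the gain space, the ε-greedy selection policy, the reward function and all numeric constants are illustrative assumptions; the patent fixes none of them.

```python
import random

# Assumed discrete action set: raise/hold/lower each gain by a fixed step.
ACTIONS = [(dp, di, dd)
           for dp in (-0.1, 0.0, 0.1)
           for di in (-0.01, 0.0, 0.01)
           for dd in (-0.01, 0.0, 0.01)]


def q_learning_pid_tuning(reward_fn, episodes=200, alpha0=0.5, gamma=0.9, eps=0.1):
    """Steps 1-5: tune (Kp, Ki, Kd) by Q-learning; reward_fn scores a gain triple."""
    Q = {}                                    # step 1: Q matrix, entries default to 0
    state = (1.0, 0.1, 0.05)                  # assumed initial (Kp, Ki, Kd)
    alpha = alpha0                            # learning factor, decays toward 0
    for k in range(episodes):
        # Step 2: select action a from state s with an epsilon-greedy policy pi.
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q.get((state, x), 0.0))
        # Step 3: execute the action, compute the reward r, enter new state s'.
        s1 = tuple(max(g + d, 0.0) for g, d in zip(state, a))
        r = reward_fn(s1)
        # Step 4: Q(s,a) = (1-alpha)*Q(s,a) + alpha*(r + gamma*max_a' Q(s',a')).
        best_next = max(Q.get((s1, a2), 0.0) for a2 in ACTIONS)
        Q[(state, a)] = (1 - alpha) * Q.get((state, a), 0.0) + alpha * (r + gamma * best_next)
        state = s1
        alpha = alpha0 * (1.0 - (k + 1) / episodes)  # decreases to zero, ending learning
    return state                              # step 5: tuned (Kp, Ki, Kd)
```

The tuned gains would then drive the incremental PID controller of step 6 (see the `IncrementalPID` sketch above).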
A decoupling-control requirement is added to the training and learning process of the reinforcement-learning adaptive PID control algorithm, so that the result of training and learning also fulfils the decoupling-control function. This improves the control accuracy of the system.
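One plausible way to encode this decoupling requirement is to add a cross-coupling penalty to the reward r of step 3. The fragment below is an assumption about how such a reward could look; the patent does not give its reward formula.

```python
def reward_with_decoupling(tracking_error, cross_response, w1=1.0, w2=0.5):
    """Assumed reward for step 3: favor small tracking error on the controlled
    axis and penalize the response induced on the other, coupled axis."""
    return -(w1 * abs(tracking_error) + w2 * abs(cross_response))
```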

Claims (2)

1. A distribution static synchronous compensator control method based on reinforcement-learning adaptive proportion integration differentiation, in which the mathematical model of the distribution static synchronous compensator is first written out according to the instantaneous power balance principle and transformed from the stationary frame into the dq0 coordinate system by a transformation matrix, characterized in that: the error between the voltage command value and the measured value is regulated by the reinforcement-learning adaptive proportion integration differentiation control algorithm to form the reactive command current signal; the error between the DC capacitor voltage command value and the measured value is regulated by reinforcement-learning adaptive proportion integration differentiation to form the active command current signal; after the reactive and active command current signals are obtained, they are transformed through the voltage-current relation of the mathematical model to form the reactive and active command voltage signals; after the dq/abc coordinate transformation, the command voltage signals serve as modulation signals and, after triangular-carrier modulation, produce the pulse-width modulation (PWM) drive signals that control the switching of the Intelligent Power Module, generating the voltage that needs to be compensated.
2. The distribution static synchronous compensator control method based on reinforcement-learning adaptive proportion integration differentiation according to claim 1, characterized in that the reinforcement-learning adaptive proportion integration differentiation control algorithm is implemented as follows:
(1) Initialize the parameter Q-value matrix; the Q matrix records, for each state s, the cumulative reward expected from selecting action a in that state;
(2) Select and execute an action a according to the current state s and the action-selection policy π;
(3) Calculate the reward r from the input K_P, K_I and K_D values, and enter the new state s′;
(4) Update the Q value with the formula Q(s, a) = (1 − α)·Q(s, a) + α·(r + γ·max Q(s′, a′)); after repeated iterations, Q(s_t, a_t) progressively approaches Q(s, a); adjust the learning factor and return to step (2) until the learning factor is 0;
(5) Output the resulting K_P, K_I and K_D values;
(6) Feed the output K_P, K_I and K_D values into the adaptive proportion integration differentiation controller, which performs control with the incremental adaptive proportion integration differentiation control algorithm.
CN 200810051135 2008-09-03 2008-09-03 Method for controlling reinforcement learning adaptive proportion integration differentiation-based distribution static synchronous compensator Pending CN101667012A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200810051135 CN101667012A (en) 2008-09-03 2008-09-03 Method for controlling reinforcement learning adaptive proportion integration differentiation-based distribution static synchronous compensator

Publications (1)

Publication Number Publication Date
CN101667012A true CN101667012A (en) 2010-03-10

Family

ID=41803657

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200810051135 Pending CN101667012A (en) 2008-09-03 2008-09-03 Method for controlling reinforcement learning adaptive proportion integration differentiation-based distribution static synchronous compensator

Country Status (1)

Country Link
CN (1) CN101667012A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101917010A (en) * 2010-07-27 2010-12-15 荣信电力电子股份有限公司 Compound control structure for balanced output of multiple sets of automatically controlled power equipment
CN102787915A (en) * 2012-06-06 2012-11-21 哈尔滨工程大学 Diesel engine electronic speed adjusting method based on reinforced study of proportion integration differentiation (PID) controller
CN106707752A (en) * 2016-12-21 2017-05-24 大连理工大学 Improved algorithm for solving state feedback gain matrix of current source STATCOM (static synchronous compensator)
CN107943022A (en) * 2017-10-23 2018-04-20 清华大学 A kind of PID locomotive automatic Pilot optimal control methods based on intensified learning
CN108014926A (en) * 2018-02-05 2018-05-11 吉林建筑大学 The adjustable electrostatic precipitator of voltage and method
CN108014926B (en) * 2018-02-05 2024-05-03 吉林建筑大学 Electrostatic dust collection device and method with adjustable voltage
CN110095654A (en) * 2019-05-09 2019-08-06 东北电力大学 A kind of power grid inductance detection method
CN110488759A (en) * 2019-08-09 2019-11-22 西安交通大学 A kind of numerically-controlled machine tool feeding control compensation methods based on Actor-Critic algorithm
CN112542161A (en) * 2020-12-10 2021-03-23 长春工程学院 BP neural network voice recognition method based on double-layer PID optimization
CN112542161B (en) * 2020-12-10 2022-08-12 长春工程学院 BP neural network voice recognition method based on double-layer PID optimization
CN116581770A (en) * 2022-11-24 2023-08-11 长春工程学院 Micro-grid system VSG double-droop control method based on self-adaptive neural network
CN116581770B (en) * 2022-11-24 2024-02-20 长春工程学院 Micro-grid system VSG double-droop control method based on self-adaptive neural network


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20100310