CN115395532A - Time-lag wind power system wide-area damper control method based on reinforcement learning - Google Patents
Time-lag wind power system wide-area damper control method based on reinforcement learning
- Publication number
- CN115395532A (application number CN202210994492.5A)
- Authority
- CN
- China
- Prior art keywords
- reinforcement learning
- control
- network
- output
- power
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J3/00—Circuit arrangements for ac mains or ac distribution networks
- H02J3/24—Arrangements for preventing or reducing oscillations of power in networks
- H02J3/241—The oscillation concerning frequency
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/10—Geometric CAD
- G06F30/18—Network design, e.g. design based on topological or interconnect aspects of utility systems, piping, heating ventilation air conditioning [HVAC] or cabling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/28—Design optimisation, verification or simulation using fluid dynamics, e.g. using Navier-Stokes equations or computational fluid dynamics [CFD]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J3/00—Circuit arrangements for ac mains or ac distribution networks
- H02J3/24—Arrangements for preventing or reducing oscillations of power in networks
- H02J3/242—Arrangements for preventing or reducing oscillations of power in networks using phasor measuring units [PMU]
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J3/00—Circuit arrangements for ac mains or ac distribution networks
- H02J3/38—Arrangements for parallely feeding a single network by two or more generators, converters or transformers
- H02J3/381—Dispersed generators
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2111/00—Details relating to CAD techniques
- G06F2111/02—CAD in a network environment, e.g. collaborative CAD or distributed simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2111/00—Details relating to CAD techniques
- G06F2111/10—Numerical modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2113/00—Details relating to the application field
- G06F2113/04—Power grid distribution networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2113/00—Details relating to the application field
- G06F2113/06—Wind turbines or wind farms
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J2203/00—Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
- H02J2203/20—Simulating, e g planning, reliability check, modelling or computer assisted design [CAD]
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J2300/00—Systems for supplying or distributing electric power characterised by decentralized, dispersed, or local generation
- H02J2300/20—The dispersed energy generation being of renewable origin
- H02J2300/28—The renewable source being wind energy
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02E—REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
- Y02E10/00—Energy generation through renewable energy sources
- Y02E10/70—Wind energy
- Y02E10/76—Power conversion electric or electronic aspects
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02E—REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
- Y02E40/00—Technologies for an efficient electrical power generation, transmission or distribution
- Y02E40/70—Smart grids as climate change mitigation technology in the energy generation sector
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02E—REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
- Y02E60/00—Enabling technologies; Technologies with a potential or indirect contribution to GHG emissions mitigation
Abstract
The invention discloses a reinforcement-learning-based wide-area damper control method for a time-lag wind power system, comprising the following steps: constructing a reinforcement-learning-based time-lag wind power wide-area damping control (TDWADC) system, which comprises a doubly-fed wind turbine generator and a reinforcement-learning-based TDWADC controller; analyzing the geometric controllability/observability of the system and selecting the one or more feedback signals with the highest geometric observability for the inter-area oscillation mode as input signals of the TDWADC; and controlling the doubly-fed wind turbine generator with the reinforcement-learning-based TDWADC controller. The beneficial effects of the invention are that the TDWADC controller suppresses low-frequency oscillation of the power system effectively and in time, which improves the safety and stability of the power grid, enables the grid to absorb the electric energy generated by wind farms on a large scale and in time, and improves the economic and social benefits of wind power generation enterprises.
Description
Technical Field
The invention relates to the technical field of large-scale wind power grid-connected power generation system control, in particular to a time-lag wind power system wide-area damper control method based on reinforcement learning.
Background
The trend toward interconnected power systems and the transmission and exchange of electric energy between regions pose a major test for power system stability. Stability problems such as transient stability and small-disturbance stability are closely related to the interconnected operation of large power grids. As interconnected power systems keep growing in scale and complexity, it becomes difficult for control devices that rely only on local unit signals to ensure stable operation of the power system.
The Wide Area Measurement System (WAMS) is a dynamic power network monitoring system built on networked control technology and the Global Positioning System (GPS). Phasor measurement units (PMUs) distributed at different geographic locations measure physical quantities such as the internal potential, power angle, angular velocity and bus voltage of each generator in real time, making the whole power system network observable. The time-synchronized data measured by the PMUs are transmitted to the phasor data concentrator (PDC) through a communication network. This provides the data basis for screening out wide-area signals with good controllability over inter-area oscillation and for constructing a wide-area damping controller (WADC) that coordinates the control of the wide-area power system and thereby effectively suppresses inter-area low-frequency oscillation in the interconnected grid.
A WAMS based on PMU technology provides reliable support for analyzing the dynamic behavior and control strategies of large-scale interconnected power systems, offers a new technical approach and operating platform for wide-area stability analysis and control, and provides a new means for comprehensive real-time monitoring of large interconnected power grids.
The most obvious difference between the PMU/WAMS-based controller and the traditional local control is that a communication network is added to form a networked wide-area time-lag system.
When each PMU substation in the WAMS transmits measured phasor data to a wide-area controller through the network, the data pass through the sensors (voltage and current transformers), synchronous sampling, phasor calculation and data packaging, the substation communication module, the communication link, data synchronization and processing in the phasor data concentrator, and data delivery to the controller. Each of these links can introduce a different time lag. Time lag is therefore unavoidable when wide-area signals are used, and its magnitude depends on factors such as the distance between measurement stations, the communication medium, the communication protocol, and the load on the communication line. Test results show that a time lag exists when wide-area signals are transmitted over communication networks built on different media: the minimum time lag for optical fiber and digital microwave is 100-150 ms, and with satellite communication the transmission time lag can be as high as 700 ms.
The delay characteristics of the communication affect the performance of the controller and can cause it to malfunction or even act in the opposite direction. Because the large time lag in transmitting wide-area measurement information through the communication network is one of the main causes of controller malfunction, deterioration of the operating state, and system instability, the influence of time lag must be taken into account when wide-area measurement information is used for closed-loop control of the power system.
Disclosure of Invention
The invention provides a reinforcement-learning-based wide-area damper control method for a time-lag wind power system. It aims to solve the problem that conventional PSS control has difficulty adapting to the continuously increasing penetration of renewable energy represented by wind power generation, which causes large fluctuations of voltage and power at the point of common coupling between the wind power and the grid.
The application provides a time-lag wind power system wide area damper control method based on reinforcement learning, which comprises the following steps:
S101: constructing a reinforcement-learning-based time-lag wind power wide-area damping (TDWADC) control system; the TDWADC control system comprises a plurality of doubly-fed wind turbine generators and a reinforcement-learning-based TDWADC controller;
wherein each doubly-fed wind turbine generator comprises: a wind turbine, a gearbox, a doubly-fed induction generator (DFIG), a transformer, a rotor-side frequency converter, a grid-side frequency converter and a crowbar overvoltage protection circuit;
the wind turbine is connected to the gearbox through a mechanical transmission; the gearbox is connected to the DFIG through a transmission bearing; the DFIG is connected to the transformer through electromagnetic coupling and to the AC grid through the transformer; the DFIG is electrically connected to the output of the crowbar overvoltage protection circuit and to the input of the rotor-side frequency converter; the output of the rotor-side frequency converter is electrically connected to the input of the grid-side frequency converter; the output of the grid-side frequency converter is electrically connected to one end of the transformer;
S102: controlling the doubly-fed wind turbine generators with the reinforcement-learning-based TDWADC controller;
the reinforcement-learning-based TDWADC controller comprises three control parts: reinforcement learning control, voltage outer-loop PI control and current inner-loop PI control;
the input signal of the reinforcement learning control is a wide-area feedback power signal generated after the plurality of doubly-fed wind turbine generators are connected through a communication network; the output signal of the reinforcement learning control is fed to the voltage outer-loop PI control;
the input signal of the reinforcement learning control is selected as follows: the wide-area feedback power signals are screened with the modal geometric controllability/observability method, and the one or more signals with the highest geometric observability for the inter-area oscillation mode are selected as input signals of the reinforcement learning control;
the voltage outer-loop PI control is used to control the grid-side frequency converter;
and the current inner-loop PI control is used to control the rotor-side frequency converter so that it outputs the specified active and reactive power, thereby suppressing the power oscillation at the grid connection point of the wind turbine.
Compared with the prior art, the invention has the beneficial effects that: the damping controller can effectively suppress the low-frequency oscillation of the power system in time, so that the safety and the stability of a power grid are improved, the power grid can absorb electric energy generated by a wind power plant in time on a large scale, and the economic and social benefits of a wind power generation enterprise are improved.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention;
FIG. 2 is a schematic structural diagram of a time-lag wind power wide area damping TDWADC control system based on reinforcement learning;
FIG. 3 is a reinforcement learning control schematic;
FIG. 4 is a schematic diagram of the structure of the Critic network;
FIG. 5 is a schematic diagram of the architecture of the Actor network;
FIG. 6 is a schematic diagram of a reinforcement learning process;
FIG. 7 is a basic control block diagram of the rotor-side frequency converter;
FIG. 8 is a diagram of a grid side converter architecture;
FIG. 9 is a grid side converter control block diagram;
FIG. 10 is a schematic diagram of the control principle of a reinforcement learning based TDWADC controller;
FIG. 11 is a schematic diagram of a 16-machine time-lag wind power generation system;
FIG. 12 is a power angle deviation response curve between generators 1 and 3 with a time lag t = 300 ms;
FIG. 13 is a power angle deviation response curve between generators 1 and 3 with a time lag t = 600 ms.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.
First, the related terms and their abbreviations are explained in a unified manner as follows.
Time-lag Wide-Area Damping Controller (TDWADC);
Power System Stabilizer (PSS);
Reinforcement Learning (RL);
Time lag (TD);
Wide Area Measurement System (WAMS);
Phasor Measurement Unit (PMU);
Phasor Data Concentrator (PDC);
Low-Frequency Oscillation (LFO).
Referring to FIG. 1, FIG. 1 is a schematic flow chart of the method of the present invention.
The invention provides a method for designing an oscillation damping controller of a wind storage power generation system based on reinforcement learning. The method comprises the following steps:
S101: constructing a reinforcement-learning-based time-lag wind power wide-area damping (TDWADC) control system; the TDWADC control system comprises a plurality of doubly-fed wind turbine generators and a reinforcement-learning-based TDWADC controller;
referring to FIG. 2, FIG. 2 is a schematic structural diagram of the reinforcement-learning-based time-lag wind power wide-area damping TDWADC control system;
wherein each doubly-fed wind turbine generator comprises: a wind turbine, a gearbox, a doubly-fed induction generator (DFIG), a transformer, a rotor-side frequency converter, a grid-side frequency converter and a crowbar overvoltage protection circuit;
the wind turbine is connected to the gearbox through a mechanical transmission; the gearbox is connected to the DFIG through a transmission bearing; the DFIG is connected to the transformer through electromagnetic coupling and to the AC grid through the transformer; the DFIG is electrically connected to the output of the crowbar overvoltage protection circuit and to the input of the rotor-side frequency converter; the output of the rotor-side frequency converter is electrically connected to the input of the grid-side frequency converter; the output of the grid-side frequency converter is electrically connected to one end of the transformer.
The doubly-fed wind turbine generator is described mainly by two models: a wind turbine model and a doubly-fed induction generator model.
It should be noted that the wind turbine model mainly performs the conversion from wind energy to mechanical energy: it takes the wind speed, the pitch angle and the per-unit generator mechanical speed as inputs and outputs the mechanical torque acting on the rotor of the induction machine.
Based on aerodynamics, the wind-energy capture characteristic of a wind turbine can be expressed through its simplified output mechanical power:
$P_w = \frac{1}{2}\rho\pi R^2 v_w^3 C_p(\lambda, \beta)$ (1.1)
where:
1) $P_w$ is the mechanical power output of the wind turbine (W), $R$ is the blade radius (m), $\rho$ is the air density (kg/m³), and $v_w$ is the equivalent wind speed (m/s);
2) $C_p$ is the wind energy utilization coefficient, which depends on $\lambda$ and $\beta$, where $\lambda$ is the tip speed ratio and $\beta$ is the blade pitch angle (°).
(1) Tip speed ratio
The tip speed ratio is defined as the ratio of the linear velocity of the blade tip to the wind speed:
$\lambda = \omega_m R / v_w$ (1.2)
where $\omega_m$ is the mechanical rotating speed of the wind turbine.
In the formula, $\omega_r$ is the generator mechanical speed in per unit (p.u.), $p$ is the number of generator pole pairs, and $GR$ is the gear ratio.
(2) Blade pitch angle
The blade pitch angle is the angle between the blade and the plane of the rotor. The smaller the pitch angle, the larger the windward area of the blade and thus the more wind energy is captured.
Considering the mechanical efficiency of the wind turbine gearbox, the actual output power of the wind turbine is
$P_{w\_out} = \eta \cdot P_w$ (1.4)
where $\eta$ is the gearbox efficiency.
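As a hedged illustration of the wind turbine relations above, the following Python sketch evaluates the tip speed ratio, the mechanical power and the gearbox-corrected output of equation (1.4). It assumes the standard aerodynamic form P_w = 0.5·ρ·π·R²·v_w³·C_p, treats C_p as a given number because the text does not state the C_p(λ, β) curve, and uses hypothetical function names and sample values that are not taken from the patent.

```python
import math


def tip_speed_ratio(omega_m: float, radius: float, wind_speed: float) -> float:
    """Ratio of blade-tip linear speed (omega_m * R) to wind speed."""
    return omega_m * radius / wind_speed


def mechanical_power(rho: float, radius: float, wind_speed: float, cp: float) -> float:
    """Assumed standard form: P_w = 0.5 * rho * pi * R^2 * v_w^3 * C_p, in watts."""
    return 0.5 * rho * math.pi * radius**2 * wind_speed**3 * cp


def output_power(p_w: float, eta: float) -> float:
    """Equation (1.4): P_w_out = eta * P_w."""
    return eta * p_w


if __name__ == "__main__":
    # Illustrative values only (assumptions, not data from the patent).
    rho, radius, v_w, omega_m = 1.225, 40.0, 10.0, 2.0
    cp, eta = 0.4, 0.95
    lam = tip_speed_ratio(omega_m, radius, v_w)
    p_w = mechanical_power(rho, radius, v_w, cp)
    print(f"lambda = {lam:.2f}")
    print(f"P_w = {p_w / 1e6:.3f} MW, P_w_out = {output_power(p_w, eta) / 1e6:.3f} MW")
```

Because the captured power scales with the cube of the wind speed, small wind fluctuations translate into large power fluctuations, which is one reason the damping problem becomes harder as wind penetration grows.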
From this, the per unit value of the output mechanical torque can be obtained as
It should be noted that the doubly-fed induction generator model is as follows.
The mathematical model of the doubly-fed generator consists of a voltage equation, a flux linkage equation and a torque/power equation. Using the motor convention in a synchronously rotating coordinate system, the stator and rotor voltage equations are:
where $p$ is the differential operator d/dt, $\psi$ denotes flux linkage, subscripts s and r denote stator and rotor variables, respectively, and subscripts d and q denote the d-axis and q-axis variables of the synchronous coordinate system, respectively; the same applies below. All variables in formula (2.1), including time, are per-unit values. The base value of time t is $1/\omega_b$, i.e., the time needed to sweep 1 rad at the angular frequency $\omega_b$. $\omega_b$ is the base angular frequency; for a 50 Hz system, $\omega_b = 2\pi f$.
The per-unit stator active and reactive power equations are as follows:
Orienting the d axis of the synchronously rotating coordinate system along the stator flux linkage, the stator and rotor flux linkage equations are:
Neglecting the stator resistance ($R_s = 0$) and the stator flux linkage transients ($p\psi_{sd,q} = 0$), the stator voltage in equation (2.3) simplifies to:
where $U_s$ is the stator phase voltage amplitude and $\omega_1$ is the angular frequency of the stator electrical quantities. Substituting $\psi_{sd} = U_s/\omega_1$ and formula (2.4) into the power equation, and denoting the stator output power by $P_{sout}$ and $Q_{sout}$, i.e., $P_{sout} = -P_s$ and $Q_{sout} = -Q_s$, gives:
Formula (2.5) shows that $P_{sout}$ and $Q_{sout}$ can be controlled in a decoupled manner through $i_{rq}$ and $i_{rd}$, and that $i_{rq}$ and $i_{rd}$ are in turn controlled through $u_{rd}$ and $u_{rq}$. The relationship between $i_{rq}$, $i_{rd}$ and $u_{rd}$, $u_{rq}$ is derived as follows.
From the stator flux linkage equation:
Substituting the stator current equations, expressed in terms of $i_{rq}$ and $i_{rd}$, into the rotor flux linkage equation gives:
Substituting equation (2.7) into the rotor voltage equation in equation (2.1) yields the relationship between $i_{rq}$, $i_{rd}$ and $u_{rd}$, $u_{rq}$:
where $\omega_{slip} = \omega_1 - \omega_r$ is the per-unit value of the angular velocity deviation.
S102: controlling the doubly-fed wind turbine generators with the reinforcement-learning-based TDWADC controller;
the reinforcement-learning-based TDWADC controller comprises three control parts: reinforcement learning control, voltage outer-loop PI control and current inner-loop PI control;
the input signal of the reinforcement learning control is a wide-area feedback power signal generated after the plurality of doubly-fed wind turbine generators are connected through a communication network;
it should be noted that the communication network added among the doubly-fed wind turbine generators turns the system into a networked wide-area time-lag system; in wide-area damping control, the time-lag link is approximated by a Padé rational polynomial and treated equivalently, after which the wide-area damping controller is designed.
The time-lag link can generally be represented by $e^{-as}$; the closed-loop transfer function of the entire time-lag system is as follows:
where $G_c(s)$ is the controller, $G(s)$ is the controlled object, $e^{-as}$ is the time-lag link, and $a$ is the delay time. Equation (3.1) shows that the denominator of the closed-loop transfer function contains the time-lag term $e^{-as}$, so the closed-loop transfer function has infinitely many poles and the closed-loop system may be unstable. Network communication time lag can cause the system controller to fail and the operating state to deteriorate, and may ultimately make the system unstable. Therefore, as power systems become increasingly information-driven, the time lag from remote communication over the communication network, the time lag of data acquisition and preprocessing, and the response delay of controllers and actuators to input signals have an increasingly large influence on the stable operation of the system.
In this application, the time-lag link $e^{-as}$ is replaced by its first-order Padé approximation:
$e^{-as} \approx \dfrac{1 - as/2}{1 + as/2}$ (2.10)
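To give a feel for how well a first-order Padé approximation tracks a pure delay, the following Python sketch compares the two frequency responses at s = jω. It is a minimal illustration under stated assumptions: the first-order form (1 - as/2)/(1 + as/2) is the textbook one, and the function names, the 0.5 Hz test frequency and the delay values (borrowed from the 300 ms and 600 ms cases named in the figure list) are chosen here for illustration only.

```python
import cmath
import math


def delay_exact(a: float, w: float) -> complex:
    """Frequency response of the pure delay e^{-a s} at s = j*w."""
    return cmath.exp(-1j * a * w)


def delay_pade1(a: float, w: float) -> complex:
    """First-order Pade approximation (1 - a s / 2) / (1 + a s / 2) at s = j*w."""
    s = 1j * w
    return (1 - a * s / 2) / (1 + a * s / 2)


def phase_error_deg(a: float, w: float) -> float:
    """Pade phase minus exact phase, in degrees (both responses have unit gain)."""
    exact_phase = -a * w                      # rad, unwrapped
    pade_phase = -2.0 * math.atan(a * w / 2)  # rad
    return math.degrees(pade_phase - exact_phase)


if __name__ == "__main__":
    w = 2 * math.pi * 0.5  # assumed 0.5 Hz oscillation, for illustration
    for a in (0.3, 0.6):
        gap = abs(delay_pade1(a, w) - delay_exact(a, w))
        print(f"a = {a:.1f} s: |Pade - exact| = {gap:.4f}, "
              f"phase error = {phase_error_deg(a, w):.2f} deg")
```

The approximation keeps the unit gain of the delay exactly and only distorts the phase, and the distortion grows with the product a·ω. This is why a low-order approximation is usually considered acceptable for low-frequency modes but becomes less accurate as the time lag increases.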
To counter the influence of time lag, the method selects the wide-area feedback power signals with the modal geometric controllability/observability method: a geometric controllability/observability analysis is performed on the wide-area feedback power signals, and the one or more signals with the highest geometric observability for the inter-area oscillation mode are selected as input signals of the reinforcement learning control.
The modal geometric controllability/observability method is explained as follows.
Assume the linearized model of the entire system is:
$\dot{x} = Ax + Bu, \quad y = Cx$ (2.11)
where A, B and C are the state matrix, input matrix and output matrix, respectively.
Let matrix A have n distinct eigenvalues $\lambda_k$ (k = 1, 2, …, n), together with their corresponding left and right eigenvectors. The geometric controllability $gm_{ci}(k)$ and the geometric observability $gm_{oj}(k)$ of mode k are then, respectively:
where $b_i$ is the ith column of matrix B, corresponding to the ith input; $c_j$ is the jth row of matrix C, corresponding to the jth output; $|z|$ and $\|z\|$ are the modulus and the Euclidean norm of z, respectively; $\alpha(\psi_k, b_i)$ is the geometric angle between the ith input vector and the kth left eigenvector; and the second angle is the geometric angle between the jth output vector and the kth right eigenvector.
From this, the joint geometric controllability/observability of the kth mode is defined as:
$gm_{cok}(i,j) = gm_{ci}(k)\,gm_{oj}(k)$ (2.13)
Through the geometric controllability/observability analysis, the one or more feedback signals with the highest geometric observability for the inter-area oscillation mode are selected as input signals of the wide-area damping controller TDWADC.
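The signal selection described above can be sketched numerically. The Python example below is a minimal sketch, not the patent's implementation: it assumes the commonly used definitions in which the geometric controllability of mode k for input i is the cosine of the angle between b_i and the kth left eigenvector, and the geometric observability for output j is the cosine of the angle between c_j and the kth right eigenvector. The function names and the small test system are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def geometric_measures(A, B, C):
    """Return (eigvals, gm_c, gm_o) with gm_c[k, i] and gm_o[k, j] in [0, 1]."""
    A, B, C = (np.asarray(M, dtype=float) for M in (A, B, C))
    eigvals, right = np.linalg.eig(A)      # columns are right eigenvectors
    left = np.linalg.inv(right).conj().T   # columns are left eigenvectors
    n_modes, n_in, n_out = len(eigvals), B.shape[1], C.shape[0]
    gm_c = np.zeros((n_modes, n_in))
    gm_o = np.zeros((n_modes, n_out))
    for k in range(n_modes):
        lk, rk = left[:, k], right[:, k]
        for i in range(n_in):
            b = B[:, i]
            # |cos| of the angle between input column b_i and left eigenvector k
            gm_c[k, i] = abs(np.vdot(lk, b)) / (np.linalg.norm(lk) * np.linalg.norm(b))
        for j in range(n_out):
            c = C[j, :]
            # |cos| of the angle between output row c_j and right eigenvector k
            gm_o[k, j] = abs(c @ rk) / (np.linalg.norm(c) * np.linalg.norm(rk))
    return eigvals, gm_c, gm_o


def best_input_output(gm_c, gm_o, k):
    """Pick the (input, output) pair maximizing gm_ci(k) * gm_oj(k), as in (2.13)."""
    joint = np.outer(gm_c[k], gm_o[k])
    i, j = np.unravel_index(np.argmax(joint), joint.shape)
    return int(i), int(j), float(joint[i, j])


if __name__ == "__main__":
    # Hypothetical lightly damped oscillator with one input and two candidate outputs.
    A = [[0.0, 1.0], [-9.0, -0.2]]
    B = [[0.0], [1.0]]
    C = [[1.0, 0.0], [0.0, 1.0]]
    eigvals, gm_c, gm_o = geometric_measures(A, B, C)
    k = int(np.argmax(eigvals.imag))  # the oscillatory mode with positive frequency
    print("mode:", eigvals[k])
    print("best (input, output, joint measure):", best_input_output(gm_c, gm_o, k))
```

Because both measures are normalized cosines, they are insensitive to the scaling of the signals, which makes candidate signals of different physical types directly comparable when ranking them for a given mode.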
The output signal of the reinforcement learning control is fed to the voltage outer-loop PI control;
the voltage outer-loop PI control is used to control the grid-side frequency converter;
and the current inner-loop PI control is used to control the rotor-side frequency converter so that it outputs the specified active and reactive power, thereby suppressing the power oscillation at the grid connection point of the wind turbine.
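The outer-loop/inner-loop arrangement can be illustrated with a generic discrete-time PI cascade. The Python sketch below is an assumption-laden illustration rather than the patent's control block diagrams (FIG. 7, FIG. 9 and FIG. 10 are not reproduced here): it assumes the reinforcement learning output u_rl enters as an additive correction to the outer-loop reference, and the class name, gains and signal names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PI:
    """Discrete-time PI controller with rectangular integration."""
    kp: float
    ki: float
    dt: float
    integral: float = 0.0

    def step(self, error: float) -> float:
        self.integral += error * self.dt
        return self.kp * error + self.ki * self.integral


def cascade_step(outer: PI, inner: PI, v_ref: float, v_meas: float,
                 i_meas: float, u_rl: float = 0.0) -> float:
    """One sample of a generic cascade: the outer (voltage) loop produces the
    current reference, and the inner (current) loop produces the converter command.
    Adding u_rl to the outer-loop reference is an assumption made for illustration."""
    i_ref = outer.step(v_ref + u_rl - v_meas)
    return inner.step(i_ref - i_meas)


if __name__ == "__main__":
    outer = PI(kp=1.0, ki=5.0, dt=0.001)   # hypothetical gains
    inner = PI(kp=2.0, ki=50.0, dt=0.001)  # hypothetical gains
    u_cmd = cascade_step(outer, inner, v_ref=1.0, v_meas=0.98, i_meas=0.0, u_rl=0.01)
    print(f"converter command for this sample: {u_cmd:.4f}")
```

In a cascade of this kind the inner loop is normally tuned to be much faster than the outer loop, so that the outer loop can treat the inner loop as an almost ideal actuator.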
Referring to FIG. 3, FIG. 3 is a schematic diagram of the reinforcement learning control. The reinforcement learning control comprises a state converter, an Actor network and a Critic network, and works as follows:
a reference signal w(t), preset according to the actual situation, is subtracted from the output y(t) of the controlled photovoltaic power generation system to generate an error signal e(t); the state converter converts e(t) into the input state signal x(t) of the reinforcement learning network; x(t) is input to the Actor network to obtain the output signal $u_n(t)$; x(t) and the error reinforcement learning signal r(t) are input together to the Critic network to obtain the output signal n(t); $u_n(t)$ is combined with n(t) to obtain the control input signal u(t) of the controlled photovoltaic power generation system; u(t) acts on the controlled photovoltaic power generation system to produce the output signal y(t), forming closed-loop control. The weight coefficients of the Actor network and the Critic network are updated online through the temporal-difference signal $\delta_{TD}(t)$.
Two BP neural networks implement the policy function of the Actor network and the value function of the Critic network, respectively.
Fig. 4 is a schematic structural diagram of the Critic network.
The input of the Critic network is the state signal
$$x_c(t) = [x_1(t), x_2(t), \ldots, x_n(t), r(t)]^T \qquad (1)$$
The Critic network error function is given by formula (2):
where λ is the discount coefficient, 0 < λ < 1;
r(t) is defined as:
where ε is a constant with ε > 0;
the transfer function of the hidden-layer neurons of the Critic network is a bipolar sigmoid function, given by formula (4):
The output of the Critic network is the performance index function J(t); the hidden layer uses a sigmoid activation function and the output layer uses a linear activation function. The inputs and outputs of the hidden-layer and output-layer neurons of the Critic network are given by formula (5):
where $N_c$ is the number of hidden-layer neurons of the Critic (evaluation) network, $q_i$ and $p_i$ are the input and output of the ith hidden-layer neuron, respectively, and $\omega_c^{(1)}$ and $\omega_c^{(2)}$ are the weights from the input layer to the hidden layer and from the hidden layer to the output layer, respectively;
the Critic network weights are updated according to formula (6):
where $\eta_c(t)$ is the learning rate of the Critic network. The gradient from the hidden layer to the output layer, obtained by the back-propagation gradient descent method, is given by formula (7):
The gradient from the input layer to the hidden layer is given by formula (8):
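Since formulas (2) to (8) are not reproduced in this text, the following is only a sketch of how a Critic of this kind is commonly implemented, not the application's exact equations. It assumes a temporal-difference error of the form δ = r(t) + λJ(t) − J(t−1), a squared-error loss, and a bipolar sigmoid written as tanh(x/2); the class name `Critic`, the weight initialization range and the default learning rate are all hypothetical.

```python
import numpy as np

def bipolar_sigmoid(x):
    # (1 - exp(-x)) / (1 + exp(-x)), written as tanh(x / 2) for numerical stability
    return np.tanh(0.5 * x)

class Critic:
    """One-hidden-layer critic: bipolar-sigmoid hidden layer, linear output."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.uniform(-0.5, 0.5, size=(n_hidden, n_inputs))  # input -> hidden
        self.w2 = rng.uniform(-0.5, 0.5, size=n_hidden)              # hidden -> output

    def forward(self, x_c):
        q = self.w1 @ x_c            # hidden-layer inputs q_i
        p = bipolar_sigmoid(q)       # hidden-layer outputs p_i
        J = float(self.w2 @ p)       # linear output layer: performance index J(t)
        return J, p

    def update(self, x_c, J_prev, r, lam=0.95, lr=0.05):
        """One gradient-descent step on E = 0.5 * delta**2; returns delta before the step."""
        J, p = self.forward(x_c)
        delta = r + lam * J - J_prev                 # assumed TD error
        grad_w2 = lam * delta * p                    # dE/dw2
        # d tanh(q/2) / dq = 0.5 * (1 - p**2)
        grad_w1 = np.outer(lam * delta * self.w2 * 0.5 * (1.0 - p ** 2), x_c)
        self.w2 -= lr * grad_w2
        self.w1 -= lr * grad_w1
        return delta


critic = Critic(n_inputs=3, n_hidden=6)
x_c = np.array([0.1, -0.2, 0.05])     # hypothetical [x_1, x_2, r] input vector
delta = critic.update(x_c, J_prev=0.3, r=-0.1)
```

Both gradients are computed before either weight matrix is changed, so the hidden-layer gradient uses the same output weights as the forward pass.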
Fig. 5 is a schematic diagram of the architecture of the Actor network.
The input of the Actor network is:
$$x_a(t) = [x_1(t), x_2(t), \ldots, x_n(t)]^T \qquad (9)$$
The inputs and outputs of the hidden-layer and output-layer neurons of the Actor network are given by formula (10):
where $N_a$ is the number of hidden-layer neurons of the Actor network, $h_i$ and $g_i$ are the input and output of the ith hidden-layer neuron, respectively, and $\omega_a^{(1)}$ and $\omega_a^{(2)}$ are the weights from the input layer to the hidden layer and from the hidden layer to the output layer, respectively;
the Actor network weights are updated according to formula (11):
where $\eta_a(t)$ is the learning rate of the Actor network. The gradients from the hidden layer to the output layer and from the input layer to the hidden layer are given by formulas (12) and (13):
where $\omega_{nj}$ and $\omega_j$ are the weight coefficients of the Actor network and the Critic network, respectively.
The control input signal u(t) is given by formula (14):
$$u(t) = u_1(t) + \eta_m(0, \rho(t)) \qquad (14)$$
where $\eta_m$ depends on the output J(t) of the Critic network, with $\rho(t) = [1 + \exp(2J(t))]^{-1}$.
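Formula (14) adds a zero-mean random term to the Actor output, with a spread that shrinks as the Critic output J(t) grows. A minimal sketch follows, assuming that $\eta_m(0, \rho)$ denotes a Gaussian sample with mean 0 and that ρ(t) is used as its standard deviation (the text does not say whether ρ is a standard deviation or a variance); the function names are hypothetical.

```python
import math
import random

def exploration_scale(J):
    """rho(t) = 1 / (1 + exp(2 J(t))), evaluated without overflow for large |J|."""
    if J >= 0.0:
        e = math.exp(-2.0 * J)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(2.0 * J))

def control_signal(u_actor, J, rng=random):
    """u(t) = u_1(t) + eta_m(0, rho(t)); rho is treated as a standard deviation (assumption)."""
    rho = exploration_scale(J)
    return u_actor + rng.gauss(0.0, rho)


rng = random.Random(42)                     # seeded generator for repeatable runs
u = control_signal(u_actor=0.2, J=0.0, rng=rng)
```

With J = 0 the scale is 0.5, and for large positive J it tends to 0, so the applied control converges to the Actor output as the performance index rises.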
Fig. 6 is a schematic diagram of the reinforcement learning process, which proceeds as follows:
1. Initialize the learning rates, weights, iteration count and error threshold of the Actor network and the Critic network.
2. Calculate the values of J(t) and $u_1(t)$ according to formulas (5) and (10).
3. Initialize the iteration count i = 1 and check whether i is smaller than the maximum iteration number Nc; if yes, go to step 4; otherwise, go to step 7.
4. Calculate the value of the temporal-difference function $\delta_{TD}$ according to formula (2).
5. Check whether the error threshold satisfies $\delta_0 > \delta_{TD}$; if yes, update the weights of the Critic network according to formulas (7) and (8); otherwise, go to step 7.
6. Update the weights of the Actor network according to formulas (12) and (13).
7. Set t = t + 1 and enter the calculation process with the updated weights.
8. Calculate the control signal u(t) according to formula (14) and apply it to the controlled system.
9. Update the system state vector.
10. Check whether t is greater than the simulation time T; if yes, end the reinforcement learning process; otherwise, set i = i + 1 and return to the iteration-count check in step 3.
The voltage outer-loop PI control and the current inner-loop PI control are used to control the rotor-side frequency converter so that it outputs the specified active and reactive power, thereby suppressing oscillation of the output power at the grid connection point of the wind turbine.
The control of the rotor-side frequency converter is detailed as follows:
The DFIG rotor winding is connected to the power grid through a power electronic frequency converter that can control the voltage at the rotor slip rings, and the active and reactive power generated by the DFIG can be controlled separately through a suitable decoupling control method.
From the previously described model of the doubly-fed induction generator (DFIG), the output powers $P_{sout}$ and $Q_{sout}$ can be controlled separately by changing $i_{qr}$ and $i_{dr}$, and the control of $i_{qr}$ and $i_{dr}$ is ultimately realized through $u_{qr}$ and $u_{dr}$, which yields a power outer-loop PI and current inner-loop PI control structure. The basic control block diagram of the rotor-side frequency converter is shown in fig. 7. It should be noted that the current inner-loop PI control is a necessary control means in the present application, whereas the power outer-loop PI control can be implemented by common PI control means and is not described in detail in the present application.
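The cascaded structure described above (a power outer loop producing a current reference, and a current inner loop producing a rotor voltage command) can be sketched in discrete time. This is an illustration only: the class `PI`, the helper `rotor_side_step`, the output limit and every gain value are hypothetical and are not parameters from the application.

```python
from dataclasses import dataclass

@dataclass
class PI:
    """Discrete PI controller with a symmetric output limit and clamped integrator."""
    kp: float
    ki: float
    dt: float
    limit: float = float("inf")
    integral: float = 0.0

    def step(self, error):
        # Integrate the error, clamping the integral state to limit wind-up.
        self.integral += self.ki * error * self.dt
        self.integral = max(-self.limit, min(self.limit, self.integral))
        out = self.kp * error + self.integral
        return max(-self.limit, min(self.limit, out))


def rotor_side_step(outer, inner, p_ref, p_meas, i_meas):
    """One sample of the cascade for a single axis (e.g. active power via i_qr, u_qr)."""
    i_ref = outer.step(p_ref - p_meas)    # power outer loop -> current reference
    return inner.step(i_ref - i_meas)     # current inner loop -> rotor voltage command


# Hypothetical gains and measurements, in per-unit.
outer = PI(kp=1.0, ki=5.0, dt=0.001, limit=1.2)
inner = PI(kp=2.0, ki=50.0, dt=0.001, limit=1.0)
u_qr = rotor_side_step(outer, inner, p_ref=1.0, p_meas=0.9, i_meas=0.05)
```

The same helper would be called a second time with a separate pair of controllers for the reactive-power axis, which is what makes the two power channels independently adjustable.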
The control of the grid-side frequency converter is detailed as follows:
The grid-side frequency converter is controlled to keep the DC-link capacitor voltage at a preset constant value, independent of the direction and magnitude of the rotor power, and to regulate the generated reactive power to a set reference value according to the reactive power requirement of the whole wind turbine generator set.
The positive directions of the electrical quantities of the grid-side frequency converter are defined as shown in fig. 8. $u_{dc}$ is the DC bus voltage, $i_{dcr}$ is the direct current flowing to the rotor-side frequency converter, $i_{dcg}$ is the direct current output from the grid-side frequency converter, $L_g$ is the filter inductance, and $R_g$ is the resistance of the filter inductor.
The voltage equation in an arbitrary dq rotating coordinate frame is:
where $\omega_c$ and the other variables in the formula, including time t, are per-unit values.
When the d axis is oriented along the grid voltage vector, $u_{gq} = 0$, and the active power injected by the power grid into the grid-side frequency converter is:
Thus, by controlling $i_{gd}$ and $i_{gq}$ separately, decoupled control of the powers $P_g$ and $Q_g$ output to the power grid can be achieved.
According to instantaneous power theory, reactive power is only exchanged among the three phases, and only active power affects the DC voltage; the DC voltage can therefore be controlled through the active current $i_{gd}$.
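The decoupling argument above can be checked numerically. The sketch below assumes an amplitude-invariant abc-to-dq transform with the d axis aligned to the grid voltage vector and works in actual (non-per-unit) values, which is where the factor 3/2 comes from. The sign of the reactive power expression depends on the chosen convention, and the operating point is hypothetical.

```python
import math

TWO_PI_3 = 2.0 * math.pi / 3.0

def abc_to_dq(a, b, c, theta):
    """Amplitude-invariant abc -> dq transform; theta is the d-axis angle."""
    d = (2.0 / 3.0) * (a * math.cos(theta)
                       + b * math.cos(theta - TWO_PI_3)
                       + c * math.cos(theta + TWO_PI_3))
    q = -(2.0 / 3.0) * (a * math.sin(theta)
                        + b * math.sin(theta - TWO_PI_3)
                        + c * math.sin(theta + TWO_PI_3))
    return d, q

def three_phase(amplitude, angle):
    """Balanced three-phase quantities with the given amplitude and phase-a angle."""
    return tuple(amplitude * math.cos(angle - k * TWO_PI_3) for k in range(3))


# Hypothetical operating point: voltage amplitude 1.0, current amplitude 0.8,
# current lagging the voltage by 30 degrees.
theta = 0.7                      # instantaneous grid-voltage angle (rad)
phi = math.radians(30.0)
u_abc = three_phase(1.0, theta)
i_abc = three_phase(0.8, theta - phi)

u_gd, u_gq = abc_to_dq(*u_abc, theta)    # d axis aligned with the voltage vector
i_gd, i_gq = abc_to_dq(*i_abc, theta)

p_abc = sum(u * i for u, i in zip(u_abc, i_abc))   # instantaneous three-phase power
p_dq = 1.5 * u_gd * i_gd                            # active power from the d-axis current
q_dq = -1.5 * u_gd * i_gq                           # reactive power (sign is a convention)
```

With the d axis on the voltage vector, `u_gq` vanishes, the active power depends only on `i_gd` and the reactive power only on `i_gq`, which is the decoupling the text relies on.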
Using the equal-amplitude ABC-to-dq transformation and neglecting losses, the DC-side power in actual (non-per-unit) values is $P_g = u_{dc} i_{dcg} = \tfrac{3}{2} u_{gd} i_{gd}$. Substituting the relation between the AC-side voltage and the DC voltage (m is the modulation ratio) gives:
Treating $i_{dcr}$ as a disturbance, the transfer function from the output active current to the DC voltage can be expressed as:
Similar to the control of the rotor-side frequency converter, the control of the grid-side frequency converter is ultimately realized by controlling its voltage, so the relationship between the AC voltage and the current of the frequency converter needs to be established. From formula (3.4), the AC-side voltages $u_{gcd}$ and $u_{gcq}$ of the frequency converter are:
The control of the grid-side frequency converter is shown in fig. 9.
Fig. 10 is a schematic diagram of the control principle of the reinforcement-learning-based TDWADC controller.
Here, $I_{dref}$ is the d-axis reference value of the inner-loop current generated by the active power outer loop, which controls the active power output of the wind turbine.
The q-axis reference value $I_{qref}$ of the inner-loop current, output by the reactive power PI control, adjusts the reactive power output of the wind turbine.
As shown in FIG. 9, the wide-area damping controller of the wind power generation system is designed in the voltage outer loop; that is, the wide-area damping control signal is added to the voltage outer loop to adjust the inner-loop current q-axis reference value $I_{qref}$ and hence the reactive power output of the wind turbine, thereby suppressing power oscillation at the grid connection point of the wind turbine.
Finally, fig. 11 is a schematic diagram of a 16-machine time-lag wind power generation system used as an embodiment. The relevant parameter settings for this system are shown in tables 1-3.
TABLE 1 reinforcement learning network parameter settings
TABLE 2 Fan parameter settings
TABLE 3 results of modal analysis of the machine system
As can be seen from table 3, the 16-machine system has 4 inter-area oscillation modes; modes 1 and 3 are sufficiently damped, whereas modes 2 and 4 are insufficiently damped. Mode 4 has a higher frequency and a larger damping ratio than mode 2, so the TDWADC feedback signal selection primarily targets mode 2.
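The frequency and damping ratio used to compare modes follow directly from each eigenvalue σ ± jω of the state matrix: f = |ω| / 2π and ζ = −σ / √(σ² + ω²). A small sketch of that calculation is given below; the function name and the sample eigenvalue are hypothetical and are not values from table 3.

```python
import math

def mode_frequency_and_damping(eigenvalue):
    """Return (frequency in Hz, damping ratio) of a mode.

    eigenvalue = sigma + j*omega and must be non-zero.
    """
    sigma, omega = eigenvalue.real, eigenvalue.imag
    frequency_hz = abs(omega) / (2.0 * math.pi)
    damping_ratio = -sigma / math.hypot(sigma, omega)
    return frequency_hz, damping_ratio


# Hypothetical eigenvalue of a lightly damped 0.5 Hz mode.
freq, zeta = mode_frequency_and_damping(complex(-0.3, 2.0 * math.pi * 0.5))
```

A mode with a small positive ζ decays slowly after a disturbance, which is why the least-damped inter-area mode is the natural target for the feedback signal.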
For these modes, the geometric observability/controllability results of the present application are given in table 4.
TABLE 4 results of geometric observability/controllability in different modes
As can be seen from table 4, among the six candidate signals, the integrated geometric controllability/observability of signal 5 for mode 2 is smaller than that of signal 1, but for the other three modes the integrated geometric controllability/observability of signal 5 is the smallest. The power signal P68-52 transmitted on tie line 68-52 is therefore selected as the feedback signal of the TDWADC controller: it has a relatively large integrated geometric controllability/observability for mode 2 and a small influence on the other three modes, so the TDWADC controller can improve the damping of mode 2 while having the least effect on the damping of the other three modes.
Two simulation cases were also computed in the present application, comparing conventional PSS control with the reinforcement-learning-based TDWADC control for time lags of t = 200 ms and t = 600 ms, respectively; the results are shown in figs. 12 and 13.
As can be seen from fig. 12 and 13, the RL-TDWADC (RL stands for reinforcement learning, and TDWADC stands for time-lag wide-area damping controller) adopted by the present application can better suppress low-frequency oscillation of the power system.
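How the communication time lag is modelled in these simulations is not stated in the text. One common way to represent a fixed lag in a discrete-time simulation is a first-in, first-out buffer whose length equals the lag divided by the sample time; the sketch below takes that approach as an assumption, and the class name `DelayLine` and the sample time are hypothetical.

```python
from collections import deque

class DelayLine:
    """Fixed transport delay of round(delay_s / dt) samples."""

    def __init__(self, delay_s, dt, initial=0.0):
        steps = round(delay_s / dt)
        if steps < 0:
            raise ValueError("delay must be non-negative")
        # Pre-fill so that the first `steps` outputs equal `initial`.
        self._buffer = deque([initial] * steps)

    def step(self, value):
        """Push the newest sample and return the one from `steps` samples ago."""
        self._buffer.append(value)
        return self._buffer.popleft()


# A 200 ms lag sampled every 100 ms delays the signal by two samples.
line = DelayLine(delay_s=0.2, dt=0.1)
delayed = [line.step(v) for v in [1.0, 2.0, 3.0, 4.0]]
```

Placing such a buffer between the wide-area measurement and the controller input reproduces the effect of the feedback signal arriving late.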
The beneficial effects of the invention are as follows: low-frequency oscillation of the power system can be suppressed effectively and promptly, which on the one hand improves the security and stability of the power grid and on the other hand allows the grid to absorb the electric energy generated by wind farms promptly and on a large scale, improving the economic and social benefits of wind power generation enterprises.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
The above-described embodiments of the present invention should not be construed as limiting the scope of the present invention. Any other corresponding changes and modifications made according to the technical idea of the present invention should be included in the protection scope of the claims of the present invention.
Claims (5)
1. A time-lag wind power system wide-area damper control method based on reinforcement learning is characterized by comprising the following steps:
s101: constructing a time-lag wind power wide area damping TDWADC control system based on reinforcement learning; the TDWADC control system comprises: the system comprises a plurality of groups of double-fed wind turbine generators and a TDWADC controller based on reinforcement learning;
wherein doubly-fed wind turbine generator system includes: the system comprises a wind turbine, a gear box, a double-fed induction generator DFIG, a transformer, a rotor side frequency converter, a power grid side frequency converter and an overvoltage protection circuit crowBar;
the wind turbine is connected with the gear box through mechanical transmission; the gear box is connected with the doubly-fed induction generator through a transmission bearing; the DFIG is connected with the transformer through electromagnetic coupling and is connected to an alternating current power grid through the transformer; the DFIG of the doubly-fed induction generator is electrically connected with the output end of the overvoltage protection circuit crowBar and the input end of the rotor side frequency converter; the output end of the rotor side frequency converter is electrically connected with the input end of the power grid side frequency converter; the output end of the power grid side frequency converter is electrically connected with one end of the transformer;
s102: controlling the doubly-fed wind turbine generator set by adopting a TDWADC controller based on reinforcement learning;
the TDWADC controller based on reinforcement learning comprises three parts of control: reinforcement learning control, voltage outer loop PI control and current inner loop PI control;
the input signal of the reinforcement learning control is a wide-area feedback power signal generated after a plurality of groups of double-fed wind turbine generators are connected through a communication network; the output signal of the reinforcement learning control is accessed to the voltage outer loop PI control;
the input signal of the reinforcement learning control is selected as follows: the wide-area feedback power signals are screened using the modal geometric controllability/observability method; geometric controllability/observability analysis is performed on the wide-area feedback power signals, and the one or more wide-area feedback power signals with the highest geometric observability for the inter-area oscillation mode are selected as input signals of the reinforcement learning controller;
the voltage outer loop PI control is used for controlling a power grid side frequency converter;
and the current inner loop PI control is used for controlling the rotor side frequency converter to output specified active power and reactive power and finishing the suppression of the power oscillation of the grid-connected access point of the wind turbine.
2. The reinforcement learning-based time-lag wind power system wide-area damper control method as claimed in claim 1, wherein: the reinforcement learning control includes a state converter, an Actor network and a Critic network; its control principle is as follows:
the output y(t) is subtracted from a preset signal w(t), according to the actual situation, to generate an error signal e(t); the state converter converts the error signal e(t) into the input state signal x(t) of the reinforcement learning network; the state signal x(t) is input to the Actor network to obtain the output signal $u_n(t)$; the state signal x(t) and the error reinforcement learning signal r(t) are input together to the Critic network to obtain the output signal n(t); the output signal $u_n(t)$ is combined with n(t) to obtain the control input signal u(t) of the controlled photovoltaic power generation system; u(t) acts on the controlled photovoltaic power generation system to produce the output signal y(t), forming closed-loop control; the weight coefficients of the Actor network and the Critic network are also updated online through the temporal-difference signal $\delta_{TD}(t)$.
3. The reinforcement learning-based time lag wind power system wide area damper control method of claim 2, characterized in that: two BP networks are used to implement the policy function of the Actor network and the value function of the Critic network, respectively.
4. The reinforcement learning-based time lag wind power system wide area damper control method of claim 3, characterized in that:
the input of the Critic network is the state signal
$$x_c(t) = [x_1(t), x_2(t), \ldots, x_n(t), r(t)]^T \qquad (1),$$
the Critic network error function is given by formula (2):
where λ is the discount coefficient, 0 < λ < 1;
r(t) is defined as:
where ε is a constant with ε > 0;
the transfer function of the hidden-layer neurons of the Critic network is a bipolar sigmoid function, given by formula (4):
the output of the Critic network is the performance index function J(t); the hidden layer uses a sigmoid activation function and the output layer uses a linear activation function; the inputs and outputs of the hidden-layer and output-layer neurons of the Critic network are given by formula (5):
where $N_c$ is the number of hidden-layer neurons of the Critic (evaluation) network, $q_i$ and $p_i$ are the input and output of the ith hidden-layer neuron, respectively, and $\omega_c^{(1)}$ and $\omega_c^{(2)}$ are the weights from the input layer to the hidden layer and from the hidden layer to the output layer, respectively;
the Critic network weights are updated according to formula (6):
where $\eta_c(t)$ is the learning rate of the Critic network;
the gradient from the hidden layer to the output layer, obtained by the back-propagation gradient descent method, is given by formula (7):
the gradient from the input layer to the hidden layer is given by formula (8):
5. The reinforcement learning-based time-lag wind power system wide-area damper control method as claimed in claim 4, wherein: the input of the Actor network is:
$$x_a(t) = [x_1(t), x_2(t), \ldots, x_n(t)]^T \qquad (9)$$
the inputs and outputs of the hidden-layer and output-layer neurons of the Actor network are given by formula (10):
where $N_a$ is the number of hidden-layer neurons of the Actor network, $h_i$ and $g_i$ are the input and output of the ith hidden-layer neuron, respectively, and $\omega_a^{(1)}$ and $\omega_a^{(2)}$ are the weights from the input layer to the hidden layer and from the hidden layer to the output layer, respectively;
the Actor network weights are updated according to formula (11):
where $\eta_a(t)$ is the learning rate of the Actor network; the gradients from the hidden layer to the output layer and from the input layer to the hidden layer are given by formulas (12) and (13):
where $\omega_{nj}$ and $\omega_j$ are the weight coefficients of the Actor network and the Critic network, respectively.
The control input signal u(t) is given by formula (14):
$$u(t) = u_1(t) + \eta_m(0, \rho(t)) \qquad (14)$$
where $\eta_m$ depends on the output J(t) of the Critic network, with $\rho(t) = [1 + \exp(2J(t))]^{-1}$.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210994492.5A CN115395532A (en) | 2022-08-18 | 2022-08-18 | Time-lag wind power system wide-area damper control method based on reinforcement learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210994492.5A CN115395532A (en) | 2022-08-18 | 2022-08-18 | Time-lag wind power system wide-area damper control method based on reinforcement learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115395532A true CN115395532A (en) | 2022-11-25 |
Family
ID=84120547
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210994492.5A Withdrawn CN115395532A (en) | 2022-08-18 | 2022-08-18 | Time-lag wind power system wide-area damper control method based on reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115395532A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116613784A (en) * | 2023-07-20 | 2023-08-18 | 武昌理工学院 | Wind-light power generation system subsynchronous oscillation coordination control method based on PID-DHDP |
CN116613784B (en) * | 2023-07-20 | 2023-12-12 | 武昌理工学院 | Wind-light power generation system subsynchronous oscillation coordination control method based on PID-DHDP |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | Application publication date: 20221125 |