CN115395532A - Time-lag wind power system wide-area damper control method based on reinforcement learning - Google Patents


Info

Publication number
CN115395532A
Authority
CN
China
Prior art keywords
reinforcement learning
control
network
output
power
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202210994492.5A
Other languages
Chinese (zh)
Inventor
谢兴旺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuchang University of Technology
Original Assignee
Wuchang University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuchang University of Technology
Priority to CN202210994492.5A
Publication of CN115395532A
Legal status: Withdrawn

Classifications

    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00 Circuit arrangements for ac mains or ac distribution networks
    • H02J3/24 Arrangements for preventing or reducing oscillations of power in networks
    • H02J3/241 The oscillation concerning frequency
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/10 Geometric CAD
    • G06F30/18 Network design, e.g. design based on topological or interconnect aspects of utility systems, piping, heating ventilation air conditioning [HVAC] or cabling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/28 Design optimisation, verification or simulation using fluid dynamics, e.g. using Navier-Stokes equations or computational fluid dynamics [CFD]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00 Circuit arrangements for ac mains or ac distribution networks
    • H02J3/24 Arrangements for preventing or reducing oscillations of power in networks
    • H02J3/242 Arrangements for preventing or reducing oscillations of power in networks using phasor measuring units [PMU]
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00 Circuit arrangements for ac mains or ac distribution networks
    • H02J3/38 Arrangements for parallely feeding a single network by two or more generators, converters or transformers
    • H02J3/381 Dispersed generators
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2111/00 Details relating to CAD techniques
    • G06F2111/02 CAD in a network environment, e.g. collaborative CAD or distributed simulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2111/00 Details relating to CAD techniques
    • G06F2111/10 Numerical modelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2113/00 Details relating to the application field
    • G06F2113/04 Power grid distribution networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2113/00 Details relating to the application field
    • G06F2113/06 Wind turbines or wind farms
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2203/00 Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
    • H02J2203/20 Simulating, e.g. planning, reliability check, modelling or computer assisted design [CAD]
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2300/00 Systems for supplying or distributing electric power characterised by decentralized, dispersed, or local generation
    • H02J2300/20 The dispersed energy generation being of renewable origin
    • H02J2300/28 The renewable source being wind energy
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E10/00 Energy generation through renewable energy sources
    • Y02E10/70 Wind energy
    • Y02E10/76 Power conversion electric or electronic aspects
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E40/00 Technologies for an efficient electrical power generation, transmission or distribution
    • Y02E40/70 Smart grids as climate change mitigation technology in the energy generation sector
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E60/00 Enabling technologies; Technologies with a potential or indirect contribution to GHG emissions mitigation

Abstract

The invention discloses a reinforcement-learning-based wide-area damper control method for a time-lag wind power system, comprising the following steps: constructing a reinforcement-learning-based time-lag wind power wide-area damping control (TDWADC) system, which comprises doubly-fed wind turbine generators and a reinforcement-learning-based TDWADC controller; analyzing the geometric controllability/observability of the system and selecting the one or more feedback signals with the highest geometric observability for the inter-area oscillation mode as input signals of the TDWADC; and controlling the doubly-fed wind turbine generators with the reinforcement-learning-based TDWADC controller. The beneficial effects of the invention are as follows: the TDWADC controller can suppress low-frequency oscillation of the power system effectively and in time, which improves the safety and stability of the power grid, enables the grid to absorb the electric energy generated by wind farms on a large scale and in time, and improves the economic and social benefits of wind power generation enterprises.

Description

Time-lag wind power system wide-area damper control method based on reinforcement learning
Technical Field
The invention relates to the technical field of large-scale wind power grid-connected power generation system control, in particular to a time-lag wind power system wide-area damper control method based on reinforcement learning.
Background
The trend toward interconnected power systems and the transmission and exchange of electric energy between regions pose a major test for power system stability. Various stability problems of the power system, such as transient stability and small-disturbance stability, are closely related to the interconnected operation of large power grids. As interconnected power systems continue to grow in scale and complexity, control devices that rely only on local unit signals find it increasingly difficult to ensure stable operation of the power system.
The Wide Area Measurement System (WAMS) is a dynamic monitoring system for power networks built on networked control technology and the Global Positioning System (GPS). Phasor measurement units (PMUs) distributed at different geographic locations measure in real time physical quantities such as the internal potential, power angle, angular velocity, and bus voltage of each generator, making the whole power system network observable. The time-stamped data measured by the PMUs is transmitted over a communication network to the phasor data concentrator (PDC). This provides the data basis for screening out wide-area signals with good controllability over inter-area oscillation and for constructing a wide-area damping controller (WADC) that coordinates control of the wide-area power system and thereby effectively suppresses inter-area low-frequency oscillation in an interconnected power grid.
A WAMS based on PMU technology provides reliable support for analyzing the dynamic behavior and control strategies of large-scale interconnected power systems, offers a new technical approach and operating platform for stability analysis and control of wide-area power systems, and provides strong support and a new means for comprehensive real-time monitoring of large interconnected power grids.
The most obvious difference between the PMU/WAMS-based controller and the traditional local control is that a communication network is added to form a networked wide-area time-lag system.
When each PMU substation in the WAMS transmits measured phasor data over the network to the wide-area controller, the data passes through several stages: the sensors (voltage and current transformers), synchronous sampling, phasor calculation and data packaging, the substation communication module, the communication link, data synchronization and processing at the phasor data concentrator, and delivery of the data to the controller. Each stage can introduce a different time lag. Time lag is therefore unavoidable when wide-area signals are used, and its magnitude depends on factors such as the distance between measurement stations, the communication medium, the communication protocol, and the load on the communication line. Test results show that wide-area signals experience a certain time lag in communication networks built on different media: the minimum time lag for optical fiber and digital microwave is 100-150 ms, and with satellite communication the transmission time lag can reach 700 ms.
The delay characteristics of the communication affect the performance of the controller and can cause it to fail or even have the opposite of the intended effect. Because the large time lag in transmitting wide-area measurement information over the communication network is one of the important causes of controller failure, deterioration of the operating state, and system instability, the influence of time lag must be taken into account when wide-area measurement information is used for closed-loop control of the power system.
Disclosure of Invention
The invention provides a reinforcement-learning-based wide-area damper control method for a time-lag wind power system. It aims to solve the problem that conventional PSS control struggles to adapt to the continuously increasing penetration of renewable energy, represented by wind power generation, which causes large fluctuations of voltage and power at the point of common coupling between the wind farm and the power grid.
The application provides a time-lag wind power system wide area damper control method based on reinforcement learning, which comprises the following steps:
S101: constructing a reinforcement-learning-based time-lag wind power wide-area damping (TDWADC) control system; the TDWADC control system comprises a plurality of groups of doubly-fed wind turbine generators and a reinforcement-learning-based TDWADC controller;
wherein each doubly-fed wind turbine generator comprises: a wind turbine, a gearbox, a doubly-fed induction generator (DFIG), a transformer, a rotor-side frequency converter, a grid-side frequency converter, and an overvoltage protection circuit (crowbar);
the wind turbine is connected to the gearbox through a mechanical transmission; the gearbox is connected to the DFIG through a transmission bearing; the DFIG is connected to the transformer through electromagnetic coupling and is connected to the AC power grid through the transformer; the DFIG is electrically connected to the output end of the crowbar overvoltage protection circuit and to the input end of the rotor-side frequency converter; the output end of the rotor-side frequency converter is electrically connected to the input end of the grid-side frequency converter; the output end of the grid-side frequency converter is electrically connected to one end of the transformer;
S102: controlling the doubly-fed wind turbine generators with the reinforcement-learning-based TDWADC controller;
the reinforcement-learning-based TDWADC controller comprises three control parts: reinforcement learning control, voltage outer-loop PI control, and current inner-loop PI control;
the input signal of the reinforcement learning control is a wide-area feedback power signal generated after the groups of doubly-fed wind turbine generators are connected through a communication network; the output signal of the reinforcement learning control is fed to the voltage outer-loop PI control;
the input signal of the reinforcement learning control is selected as follows: the wide-area feedback power signals are analyzed with the modal geometric controllability/observability method, and the one or more wide-area feedback power signals with the highest geometric observability for the inter-area oscillation mode are selected as input signals of the reinforcement learning control;
the voltage outer-loop PI control is used to control the grid-side frequency converter;
and the current inner-loop PI control is used to control the rotor-side frequency converter to output the specified active and reactive power, thereby suppressing power oscillation at the grid connection point of the wind turbine.
Compared with the prior art, the invention has the beneficial effects that: the damping controller can effectively suppress the low-frequency oscillation of the power system in time, so that the safety and the stability of a power grid are improved, the power grid can absorb electric energy generated by a wind power plant in time on a large scale, and the economic and social benefits of a wind power generation enterprise are improved.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention;
FIG. 2 is a schematic structural diagram of a time-lag wind power wide area damping TDWADC control system based on reinforcement learning;
FIG. 3 is a reinforcement learning control schematic;
FIG. 4 is a schematic diagram of the structure of the Critic network;
FIG. 5 is a schematic diagram of the architecture of the Actor network;
FIG. 6 is a schematic diagram of a reinforcement learning process;
FIG. 7 is a basic control block diagram of the rotor-side frequency converter;
FIG. 8 is a diagram of a grid side converter architecture;
FIG. 9 is a grid side converter control block diagram;
FIG. 10 is a schematic diagram of the control principle of a reinforcement learning based TDWADC controller;
FIG. 11 is a schematic diagram of a 16-machine time-lag wind power generation system;
FIG. 12 is the power angle deviation response curve between generators 1 and 3 with time lag t = 300 ms;
FIG. 13 is the power angle deviation response curve between generators 1 and 3 with time lag t = 600 ms.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.
First, the related terms and their abbreviations are explained in a unified manner as follows.
time-lag wide-area damping controller (TDWADC);
power system stabilizer (PSS);
reinforcement learning (RL);
time lag (TD);
wide-area measurement system (WAMS);
synchronized phasor measurement unit (PMU);
phasor data concentrator (PDC);
low-frequency oscillation (LFO);
Referring to FIG. 1, FIG. 1 is a schematic flow chart of the method of the invention;
the invention provides a reinforcement-learning-based method for designing an oscillation damping controller for a wind storage power generation system. The method comprises the following steps:
S101: constructing a reinforcement-learning-based time-lag wind power wide-area damping (TDWADC) control system; the TDWADC control system comprises a plurality of groups of doubly-fed wind turbine generators and a reinforcement-learning-based TDWADC controller;
referring to FIG. 2, FIG. 2 is a schematic structural diagram of the reinforcement-learning-based time-lag wind power wide-area damping (TDWADC) control system;
wherein each doubly-fed wind turbine generator comprises: a wind turbine, a gearbox, a doubly-fed induction generator (DFIG), a transformer, a rotor-side frequency converter, a grid-side frequency converter, and an overvoltage protection circuit (crowbar);
the wind turbine is connected to the gearbox through a mechanical transmission; the gearbox is connected to the DFIG through a transmission bearing; the DFIG is connected to the transformer through electromagnetic coupling and is connected to the AC power grid through the transformer; the DFIG is electrically connected to the output end of the crowbar overvoltage protection circuit and to the input end of the rotor-side frequency converter; the output end of the rotor-side frequency converter is electrically connected to the input end of the grid-side frequency converter; the output end of the grid-side frequency converter is electrically connected to one end of the transformer;
the doubly-fed wind turbine generator is described mainly by two models: a wind turbine model and a doubly-fed induction generator model;
it should be noted that the wind turbine model converts wind energy into mechanical energy: it takes the wind speed, the pitch angle, and the per-unit mechanical speed of the generator as inputs and outputs the mechanical torque acting on the rotor of the induction machine.
Based on aerodynamics, the wind-energy capture characteristic of a wind turbine can be represented by the simplified output mechanical power of the wind turbine:
$$P_w = \frac{1}{2}\rho \pi R^2 C_p(\lambda, \beta)\, v_w^3 \tag{1.1}$$
where
1) $P_w$ is the output mechanical power of the wind turbine (W), $R$ is the blade radius (m), $\rho$ is the air density (kg/m³), and $v_w$ is the equivalent wind speed (m/s);
2) $C_p$ is the wind energy utilization coefficient, which depends on $\lambda$ and $\beta$, where $\lambda$ is the tip speed ratio and $\beta$ is the blade pitch angle (°).
(1) Tip speed ratio
Defined as the ratio of the linear velocity of the blade tip of the wind turbine to the wind speed:
$$\lambda = \frac{\omega_m R}{v_w} \tag{1.2}$$
where $\omega_m$ is the mechanical rotational speed of the wind turbine.
Figure BDA0003805013670000053
where $\omega_r$ is the per-unit mechanical speed of the generator (p.u.), $p$ is the number of generator pole pairs, and GR is the gear ratio.
(2) Blade pitch angle
The angle between the blade and the plane of the rotor. The smaller the pitch angle, the larger the windward area of the blade and thus the more wind energy is captured.
Taking the mechanical efficiency of the wind turbine gearbox into account, the actual output power of the wind turbine is
$$P_{w\_out} = \eta \cdot P_w \tag{1.4}$$
where $\eta$ is the gearbox efficiency.
From this, the per-unit value of the output mechanical torque is obtained as
Figure BDA0003805013670000061
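The wind turbine relations above can be checked numerically. The following is a minimal Python sketch, assuming the standard aerodynamic form P_w = 0.5·ρ·π·R²·C_p·v_w³ for the mechanical power and λ = ω_m·R / v_w for the tip speed ratio. The function names and all numeric values (air density, blade radius, C_p, efficiency) are illustrative assumptions rather than values from the patent, and C_p is passed in as a plain number because the C_p(λ, β) curve is not given in the text.

```python
import math


def tip_speed_ratio(omega_m: float, radius: float, wind_speed: float) -> float:
    """Blade-tip linear speed divided by wind speed (dimensionless)."""
    return omega_m * radius / wind_speed


def mechanical_power(rho: float, radius: float, cp: float, wind_speed: float) -> float:
    """Captured mechanical power in W: 0.5 * rho * pi * R^2 * C_p * v_w^3."""
    return 0.5 * rho * math.pi * radius ** 2 * cp * wind_speed ** 3


def output_power(p_w: float, eta: float) -> float:
    """Power after the gearbox, P_w_out = eta * P_w (equation 1.4)."""
    return eta * p_w


if __name__ == "__main__":
    # Hypothetical operating point, chosen only for illustration.
    rho, radius, cp, v_w, eta = 1.225, 40.0, 0.4, 10.0, 0.95
    p_w = mechanical_power(rho, radius, cp, v_w)
    print(f"tip speed ratio: {tip_speed_ratio(2.0, radius, v_w):.2f}")
    print(f"P_w     = {p_w / 1e6:.3f} MW")
    print(f"P_w_out = {output_power(p_w, eta) / 1e6:.3f} MW")
```

Because the power scales with the cube of the wind speed, doubling the wind speed at a fixed C_p multiplies the captured power by eight, which is one reason wind power injections fluctuate so strongly.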
It should be noted that the doubly-fed induction generator model is as follows:
the mathematical model of the doubly-fed generator consists of the voltage equations, the flux linkage equations, and the torque and power equations. Using the motor convention, the stator and rotor voltage equations in the synchronous rotating coordinate system are:
Figure BDA0003805013670000062
where $p$ is the differential operator $d/dt$, $\psi$ denotes flux linkage, the subscripts $s$ and $r$ denote stator and rotor variables, and the subscripts $d$ and $q$ denote the d-axis and q-axis variables of the synchronous coordinate system; the same notation applies below. All variables in equation (2.1), including time, are per-unit values. The base value of time $t$ is $1/\omega_b$, i.e., the time taken to traverse 1 rad at the angular frequency $\omega_b$. $\omega_b$ is the base angular frequency; for a 50 Hz system, $\omega_b = 2\pi f$.
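As a quick numerical check of the per-unit time base, the following worked example assumes the 50 Hz system mentioned in the text and, purely for illustration, a 300 ms time lag (the value used for FIG. 12):

```latex
\omega_b = 2\pi f = 2\pi \times 50 \approx 314.16\ \text{rad/s}, \qquad
t_b = \frac{1}{\omega_b} \approx 3.183\ \text{ms}, \qquad
\frac{0.3\ \text{s}}{t_b} = 0.3 \times 314.16 \approx 94.25\ \text{p.u.}
```

A communication delay of a few hundred milliseconds therefore spans on the order of a hundred per-unit time intervals.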
The per-unit stator active and reactive power equations are:
Figure BDA0003805013670000063
Orienting the d axis of the synchronous rotating coordinate system along the stator flux linkage axis, the stator and rotor flux linkage equations are:
Figure BDA0003805013670000071
Neglecting the stator resistance ($R_s = 0$) and the stator flux linkage transients ($p\psi_{sd,q} = 0$), the stator voltage in equation (2.3) simplifies to:
Figure BDA0003805013670000072
where $U_s$ is the stator phase voltage amplitude and $\omega_1$ is the angular velocity of the stator electrical quantities. Substituting $\psi_{sd} = U_s/\omega_1$ and equation (2.4) into the power equation, and denoting the stator output power by $P_{sout}$ and $Q_{sout}$, i.e., $P_{sout} = -P_s$ and $Q_{sout} = -Q_s$, gives:
Figure BDA0003805013670000073
Equation (2.5) thus shows that $P_{sout}$ and $Q_{sout}$ can be controlled in a decoupled manner through $i_{rq}$ and $i_{rd}$, while $i_{rq}$ and $i_{rd}$ are in turn controlled through $u_{rd}$ and $u_{rq}$. The relationship between $i_{rq}$, $i_{rd}$ and $u_{rd}$, $u_{rq}$ is derived as follows:
from the stator flux linkage equation:
Figure BDA0003805013670000074
Substituting the stator current equations, expressed in terms of $i_{rq}$ and $i_{rd}$, into the rotor flux linkage equation gives
Figure BDA0003805013670000075
In the formula:
Figure BDA0003805013670000076
Substituting equation (2.7) into the rotor voltage equation in equation (2.1) yields the relationship between $i_{rq}$, $i_{rd}$ and $u_{rd}$, $u_{rq}$:
Figure BDA0003805013670000081
where $\omega_{slip} = \omega_1 - \omega_r$ is the per-unit value of the angular velocity deviation signal.
S102: controlling the doubly-fed wind turbine generators with the reinforcement-learning-based TDWADC controller;
the reinforcement-learning-based TDWADC controller comprises three control parts: reinforcement learning control, voltage outer-loop PI control, and current inner-loop PI control;
the input signal of the reinforcement learning control is a wide-area feedback power signal generated after the groups of doubly-fed wind turbine generators are connected through a communication network;
it should be noted that the communication network added among the groups of doubly-fed wind turbine generators forms a networked wide-area time-lag system; in wide-area damping control, the time-lag element is approximated by a Padé rational polynomial and treated as an equivalent element, after which the wide-area damping controller is designed.
The time-lag element can generally be represented by $e^{-as}$; the closed-loop transfer function of the whole time-lag system is as follows:
Figure BDA0003805013670000082
where $G_c(s)$ is the controller, $G(s)$ is the controlled object, $e^{-as}$ is the time-lag element, and $a$ is the delay time. Equation (3.1) shows that the denominator of the closed-loop transfer function contains the time-lag element $e^{-as}$, so the closed-loop transfer function has infinitely many poles and the closed-loop system may be unstable. Communication time lag in the network can cause the system controller to fail and the operating state to deteriorate, and can eventually make the system unstable. Therefore, as power systems become increasingly information-driven, the time lag of long-distance communication over the communication network, the time lag of data acquisition and preprocessing, and the response delay of controllers and actuators to input signals have a growing influence on the stable operation of the system.
In this application, the time-lag element $e^{-as}$ can be approximated by a first-order Padé approximation (equation 2.10):
$$e^{-as} \approx \frac{1 - \frac{a}{2}s}{1 + \frac{a}{2}s} \tag{2.10}$$
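To get a feel for how well a first-order Padé approximation tracks a pure delay, the sketch below compares the two on the imaginary axis (s = jω). It assumes the standard first-order form (1 − a·s/2) / (1 + a·s/2); the function names and the chosen delays and frequencies (300 ms and 600 ms, as in FIG. 12 and FIG. 13, at typical low-frequency-oscillation frequencies) are illustrative assumptions.

```python
import cmath
import math


def delay_response(a: float, omega: float) -> complex:
    """Exact frequency response of e^{-as} at s = j*omega."""
    return cmath.exp(-1j * a * omega)


def pade1_response(a: float, omega: float) -> complex:
    """First-order Pade approximation (1 - a*s/2) / (1 + a*s/2) at s = j*omega."""
    s = 1j * omega
    return (1 - a * s / 2) / (1 + a * s / 2)


def delay_phase(a: float, omega: float) -> float:
    """Unwrapped phase of the exact delay, in rad."""
    return -a * omega


def pade1_phase(a: float, omega: float) -> float:
    """Unwrapped phase of the first-order Pade approximation, in rad."""
    return -2.0 * math.atan(a * omega / 2.0)


def phase_error(a: float, omega: float) -> float:
    """Absolute phase mismatch between the delay and its approximation, in rad."""
    return abs(delay_phase(a, omega) - pade1_phase(a, omega))


if __name__ == "__main__":
    for delay_s in (0.3, 0.6):
        for f_hz in (0.2, 0.5, 1.0):
            omega = 2 * math.pi * f_hz
            print(f"a={delay_s:.1f} s, f={f_hz:.1f} Hz: "
                  f"|Pade|={abs(pade1_response(delay_s, omega)):.3f}, "
                  f"phase error={phase_error(delay_s, omega):.4f} rad")
```

Both the delay and its approximation have unit magnitude at every frequency, so the approximation error is purely a phase error, and it grows with the product of delay and frequency.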
To address the influence of time lag, the method selects the wide-area feedback power signals with the modal geometric controllability/observability method: a geometric controllability/observability analysis is performed on the wide-area feedback power signals, and the one or more signals with the highest geometric observability for the inter-area oscillation mode are selected as input signals of the reinforcement learning control;
the modal geometric controllability/observability method is explained in detail as follows:
Assume the linearized model of the whole system is:
$$\dot{x} = Ax + Bu, \qquad y = Cx$$
where $A$, $B$, and $C$ are the state matrix, input matrix, and output matrix, respectively.
Let matrix $A$ have $n$ distinct eigenvalues $\lambda_k$ ($k = 1, 2, \ldots, n$), with corresponding right and left eigenvectors $\varphi_k$ and $\psi_k$, respectively. The geometric controllability $gm_{ci}(k)$ and the geometric observability $gm_{oj}(k)$ associated with mode $k$ are then:
$$gm_{ci}(k) = \cos\big(\alpha(\psi_k, b_i)\big) = \frac{|b_i^T \psi_k|}{\|\psi_k\|\,\|b_i\|}$$
$$gm_{oj}(k) = \cos\big(\theta(\varphi_k, c_j^T)\big) = \frac{|c_j \varphi_k|}{\|\varphi_k\|\,\|c_j\|}$$
where $b_i$ is the $i$-th column of matrix $B$, corresponding to the $i$-th input; $c_j$ is the $j$-th row of matrix $C$, corresponding to the $j$-th output; $|z|$ and $\|z\|$ are the modulus and the Euclidean norm of $z$, respectively; $\alpha(\psi_k, b_i)$ is the geometric angle between the $i$-th input and the $k$-th left eigenvector; and $\theta(\varphi_k, c_j^T)$ is the geometric angle between the $j$-th output and the $k$-th right eigenvector.
The joint geometric controllability/observability of the $k$-th mode is therefore defined as:
$$gm_{cok}(i, j) = gm_{ci}(k)\, gm_{oj}(k) \tag{2.13}$$
Through the geometric controllability/observability analysis, the one or more feedback signals with the highest geometric observability for the inter-area oscillation mode are selected as input signals of the wide-area damping controller TDWADC.
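The signal selection step can be sketched as follows. This is a minimal Python illustration, assuming that each geometric measure is the cosine of the angle between an input column (or output row) and the corresponding eigenvector, i.e. |v·e| / (‖v‖‖e‖), and that the eigenvectors of the inter-area mode have already been obtained from an eigen-solver. The function names and the toy vectors are hypothetical.

```python
import math


def _norm(v):
    """Euclidean norm of a (possibly complex) vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))


def geometric_measure(vec, eigvec):
    """|vec . eigvec| / (||vec|| * ||eigvec||), a value between 0 and 1."""
    inner = sum(a * b for a, b in zip(vec, eigvec))
    return abs(inner) / (_norm(vec) * _norm(eigvec))


def rank_feedback_signals(c_rows, phi_k):
    """Rank candidate outputs (rows of C) by observability of mode k.

    phi_k is the right eigenvector of the mode. Returns the output indices,
    best first, together with the individual scores.
    """
    scores = [geometric_measure(c_j, phi_k) for c_j in c_rows]
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order, scores


def joint_measure(b_i, psi_k, c_j, phi_k):
    """Joint measure gm_cok(i, j) = gm_ci(k) * gm_oj(k), as in equation (2.13)."""
    return geometric_measure(b_i, psi_k) * geometric_measure(c_j, phi_k)


if __name__ == "__main__":
    # Toy two-state example with hand-picked eigenvectors (illustrative only).
    phi_k, psi_k = [1.0, 0.0], [1.0, 0.0]
    candidates = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
    order, scores = rank_feedback_signals(candidates, phi_k)
    print("ranking (best first):", order)
    print("scores:", [round(s, 4) for s in scores])
    print("joint measure, input 0 / output 1:",
          round(joint_measure([1.0, 0.0], psi_k, candidates[1], phi_k), 4))
```

Because each measure is normalized by the vector norms, it does not depend on how the candidate signals are scaled, which makes signals of different physical types comparable.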
The output signal of the reinforcement learning control is fed to the voltage outer-loop PI control;
the voltage outer-loop PI control is used to control the grid-side frequency converter;
and the current inner-loop PI control is used to control the rotor-side frequency converter to output the specified active and reactive power, thereby suppressing power oscillation at the grid connection point of the wind turbine.
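For readers unfamiliar with outer/inner PI loops, the following is a generic discrete-time sketch, not the patent's converter control. It assumes a textbook cascade in which the outer (voltage) PI produces the reference for the inner (current) PI, and it assumes that the reinforcement learning output enters as an additive offset on the outer-loop reference; the text only states that this output is fed to the voltage outer loop. The class name, function name, and gains are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PIController:
    """Discrete PI controller: u = kp * e + ki * integral(e)."""
    kp: float
    ki: float
    dt: float
    integral: float = 0.0

    def step(self, error: float) -> float:
        # Rectangular (forward Euler) integration of the error.
        self.integral += error * self.dt
        return self.kp * error + self.ki * self.integral


def cascade_step(outer: PIController, inner: PIController,
                 v_ref: float, v_meas: float, i_meas: float,
                 u_rl: float = 0.0) -> float:
    """One control step of an outer voltage loop driving an inner current loop.

    u_rl is the supplementary signal from the reinforcement learning control,
    ASSUMED here to be added to the voltage reference.
    """
    i_ref = outer.step(v_ref + u_rl - v_meas)  # outer loop sets the current reference
    return inner.step(i_ref - i_meas)          # inner loop outputs the converter command


if __name__ == "__main__":
    outer = PIController(kp=1.0, ki=5.0, dt=0.001)   # hypothetical gains
    inner = PIController(kp=2.0, ki=50.0, dt=0.001)  # hypothetical gains
    command = cascade_step(outer, inner, v_ref=1.0, v_meas=0.98, i_meas=0.0, u_rl=0.01)
    print(f"converter command: {command:.4f}")
```

In such a cascade the inner loop is normally tuned to be much faster than the outer loop, so that the outer loop effectively sees an ideal current source.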
Referring to FIG. 3, FIG. 3 is a schematic diagram of the reinforcement learning control; the reinforcement learning control comprises a state converter, an Actor network, and a Critic network; its control principle is as follows:
the preset signal $w(t)$ is subtracted from the output $y(t)$ of the controlled photovoltaic power generation system, according to the actual situation, to generate the error signal $e(t)$; the state converter converts the error signal $e(t)$ into the input state signal $x(t)$ of the reinforcement learning network; the state signal $x(t)$ is input to the Actor network to obtain the output signal $u_n(t)$; the state signal $x(t)$ and the error reinforcement learning signal $r(t)$ are input together to the Critic network to obtain the output signal $n(t)$; the output signals $u_n(t)$ and $n(t)$ are combined to obtain the control input signal $u(t)$ of the controlled photovoltaic power generation system; $u(t)$ acts on the controlled photovoltaic power generation system to produce the output signal $y(t)$, forming closed-loop control; the Actor network and the Critic network also update their weight coefficients online through the temporal-difference signal $\delta_{TD}(t)$.
Two BP neural networks are used to implement the policy function of the Actor network and the value function of the Critic network, respectively.
Referring to fig. 4, fig. 4 is a schematic structural diagram of the Critic network;
the input of the Critic network is the state signal
$$x_c(t) = [x_1(t), x_2(t), \ldots, x_n(t), r(t)]^T \tag{1}$$
The error function of the Critic network is given by formula (2):
Figure BDA0003805013670000101
where λ is the discount coefficient, 0< λ <1;
r (t) is defined as:
Figure BDA0003805013670000111
where ε is a constant with ε > 0;
the transfer function of the hidden-layer neurons of the Critic network is a bipolar sigmoid function, as shown in formula (4):
Figure BDA0003805013670000112
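Formula (4) survives here only as an image reference. As a hedged illustration, the sketch below uses the bipolar sigmoid form that is common in actor-critic designs, f(x) = (1 − e^(−x)) / (1 + e^(−x)), which equals tanh(x/2). Whether formula (4) uses exactly this scaling is an assumption, and the function names are hypothetical.

```python
import math


def bipolar_sigmoid(x: float) -> float:
    """Bipolar sigmoid with range (-1, 1).

    Assumed form (1 - exp(-x)) / (1 + exp(-x)), written as tanh(x / 2)
    so that large |x| does not overflow.
    """
    return math.tanh(0.5 * x)


def bipolar_sigmoid_derivative(x: float) -> float:
    """Derivative f'(x) = 0.5 * (1 - f(x)^2), used by gradient-descent updates."""
    f = bipolar_sigmoid(x)
    return 0.5 * (1.0 - f * f)
```

The derivative is expressed through the function value itself, which is why backpropagation only needs the stored hidden-layer outputs.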
the output of the Critic network is the performance index function J(t); the hidden layer uses a sigmoid activation function and the output layer uses a linear activation function. The inputs and outputs of the hidden-layer and output-layer neurons of the Critic network are given by formula (5):
Figure BDA0003805013670000113
where N_c is the number of hidden-layer neurons of the Critic (evaluation) network, q_i and p_i are the input and output of the i-th hidden-layer neuron, respectively, and ω_c^(1) and ω_c^(2) represent the weights from the input layer to the hidden layer and from the hidden layer to the output layer, respectively;
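The Critic structure described above (n + 1 inputs, N_c hidden neurons with a bipolar sigmoid, one linear output J(t)) can be sketched as a forward pass. This is a minimal illustration, not the patent's formula (5), which is only available as an image: the absence of bias terms, the tanh(q/2) activation and the name `critic_forward` are assumptions.

```python
import math


def critic_forward(x_c, w1, w2):
    """Return (J, q, p) for one Critic evaluation.

    x_c: list of n + 1 inputs [x_1, ..., x_n, r]
    w1:  N_c rows of n + 1 input-to-hidden weights (omega_c^(1))
    w2:  N_c hidden-to-output weights (omega_c^(2))
    """
    # Hidden-layer inputs q_i: weighted sum of the Critic inputs.
    q = [sum(w * x for w, x in zip(row, x_c)) for row in w1]
    # Hidden-layer outputs p_i: bipolar sigmoid (assumed form tanh(q / 2)).
    p = [math.tanh(0.5 * qi) for qi in q]
    # Linear output layer gives the performance index J.
    J = sum(w * pi for w, pi in zip(w2, p))
    return J, q, p
```

Returning q and p alongside J keeps the intermediate values available for the gradient calculation.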
the Critic network weights are updated according to formula (6):
Figure BDA0003805013670000116
where η_c(t) is the learning rate of the Critic network; the gradient from the hidden layer to the output layer is obtained by the backpropagation gradient descent method, as shown in formula (7):
Figure BDA0003805013670000117
the gradient calculation from the input layer to the hidden layer is shown as equation (8):
Figure BDA0003805013670000118
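Formulas (6) to (8) are not reproduced in the text, so the following is only a generic gradient-descent sketch for a network with one bipolar-sigmoid hidden layer and a linear output. It takes the derivative of the error with respect to J(t) as an input (`dE_dJ`) so that no particular form of formula (2) is assumed; the function name, the argument names and the tanh(q/2) activation are hypothetical.

```python
import math


def critic_update(x_c, w1, w2, dE_dJ, lr):
    """One gradient-descent step; returns new (w1, w2) without mutating inputs.

    dE_dJ is the derivative of the Critic error with respect to J(t),
    computed by the caller from whatever error definition is in use.
    """
    q = [sum(w * x for w, x in zip(row, x_c)) for row in w1]
    p = [math.tanh(0.5 * qi) for qi in q]
    # Hidden-to-output layer: dJ/dw2_i = p_i
    new_w2 = [w - lr * dE_dJ * pi for w, pi in zip(w2, p)]
    # Input-to-hidden layer: dJ/dw1_ij = w2_i * f'(q_i) * x_j,
    # with f'(q_i) = 0.5 * (1 - p_i^2) for the assumed activation.
    new_w1 = [
        [w - lr * dE_dJ * w2_i * 0.5 * (1.0 - pi * pi) * xj
         for w, xj in zip(row, x_c)]
        for row, w2_i, pi in zip(w1, w2, p)
    ]
    return new_w1, new_w2
```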
referring to fig. 5, fig. 5 is a schematic diagram of the architecture of an Actor network;
the input of the Actor network is:
$$x_a(t) = [x_1(t), x_2(t), \dots, x_n(t)]^{T} \tag{9}$$
the inputs and outputs of the hidden-layer and output-layer neurons of the Actor network are given by formula (10):
Figure BDA0003805013670000121
where N_a is the number of hidden-layer neurons of the Actor network, h_i and g_i are the input and output of the i-th hidden-layer neuron, respectively, and ω_a^(1) and ω_a^(2) represent the weights from the input layer to the hidden layer and from the hidden layer to the output layer, respectively;
the Actor network weight updating formula is shown in formula (11):
Figure BDA0003805013670000124
where η_a(t) is the learning rate of the Actor network; the gradients from the hidden layer to the output layer and from the input layer to the hidden layer are shown in formulas (12) and (13):
Figure BDA0003805013670000125
Figure BDA0003805013670000126
where ω_nj and ω_j are the weight coefficients of the Actor network and the Critic network, respectively.
The control input signal u(t) is given by formula (14):
$$u(t) = u_1(t) + \eta_m(0, \rho(t)) \tag{14}$$
where η_m depends on the Critic network output J(t) through ρ(t) = [1 + exp(2J(t))]^(-1).
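A small sketch of formula (14) follows. The expression for ρ(t) is taken from the text; treating η_m(0, ρ(t)) as a zero-mean Gaussian sample whose standard deviation is ρ(t) is an assumption, as are the function names.

```python
import math
import random


def exploration_scale(J: float) -> float:
    """rho(t) = [1 + exp(2 J(t))]^(-1).

    Written as 0.5 * (1 - tanh(J)), which is algebraically identical
    and does not overflow for large J.
    """
    return 0.5 * (1.0 - math.tanh(J))


def control_input(u1: float, J: float, rng: random.Random) -> float:
    """u(t) = u_1(t) + eta_m(0, rho(t)); the Gaussian form of eta_m is assumed."""
    return u1 + rng.gauss(0.0, exploration_scale(J))
```

With this expression ρ(t) lies in (0, 1) and decreases as J(t) increases, so the perturbation added to the Actor output shrinks for large J(t).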
Referring to fig. 6, fig. 6 is a schematic diagram of a reinforcement learning process; the reinforcement learning process is as follows:
1. initializing the learning rates, weights, number of iterations and error threshold of the Actor network and the Critic network;
2. calculating the values of J(t) and u_1(t) according to formulas (5) and (10);
3. initializing the iteration count i = 1; judging whether i is smaller than the maximum number of iterations N_c; if yes, going to step 4; otherwise, going to step 7;
4. calculating the value of the temporal-difference function δ_TD according to formula (2);
5. judging whether the error threshold δ_0 < δ_TD holds; if yes, updating the weights of the Critic network according to formulas (7) and (8); otherwise, going to step 7;
6. updating the weights of the Actor network according to formulas (12) and (13);
7. setting t = t + 1 and entering the calculation process with the updated weights;
8. calculating the control signal u(t) according to formula (14) and applying it to the controlled system;
9. updating the system state vector;
10. judging whether t is greater than the simulation time T; if yes, ending the reinforcement learning process; otherwise, setting i = i + 1 and returning to the iteration-count check in step 3.
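The control flow of the ten steps above can be sketched as a loop skeleton. This is only one reading of the flowchart. The callables `evaluate`, `td_error`, `update_critic`, `update_actor` and `apply_control` are hypothetical placeholders for formulas (5)/(10), (2), (7)/(8), (12)/(13) and (14), and re-evaluating J(t) and u_1(t) on every pass is an assumption.

```python
def run_learning(T, max_iter, delta_0, evaluate, td_error,
                 update_critic, update_actor, apply_control):
    """Run the learning loop; returns (final t, number of weight updates)."""
    t, i, updates = 0, 1, 0                  # step 1 (weights live in the callables)
    while True:
        J, u1 = evaluate(t)                  # step 2: Critic and Actor outputs
        if i < max_iter:                     # step 3: iteration-count check
            delta = td_error(t, J)           # step 4: temporal-difference value
            if delta_0 < delta:              # step 5: threshold test
                update_critic(delta)         # step 5: Critic weight update
                update_actor(delta)          # step 6: Actor weight update
                updates += 1
        t += 1                               # step 7: advance time
        apply_control(t, u1, J)              # steps 8 and 9: apply u(t), update state
        if t > T:                            # step 10: stop at end of simulation
            return t, updates
        i += 1
```

Passing the networks in as callables keeps the skeleton independent of the concrete update formulas, which are not reproduced in the text.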
And the voltage outer loop PI control and the current inner loop PI control are used for controlling the rotor side frequency converter to output specified active power and reactive power, and finishing the suppression of the output power oscillation of the wind turbine grid-connected access point.
Regarding the control of the rotor-side inverter, the following is detailed:
the DFIG rotor winding is fed into a power grid through a power electronic frequency converter capable of controlling the voltage of a rotor slip ring, and active power and reactive power generated by the DFIG can be respectively controlled through a proper decoupling control method.
From the previously described model of the doubly-fed induction generator (DFIG), the output powers P_sout and Q_sout of the generator can be controlled separately by changing i_qr and i_dr, and the control of i_qr and i_dr is ultimately realized through u_qr and u_dr, giving a structure of power outer loop PI control and current inner loop PI control. The basic control block diagram of the rotor-side frequency converter is shown in Fig. 7. It should be noted that the current inner loop PI control is a necessary control means in the present application, while the power outer loop PI control can be implemented by common PI control means and is not described in detail in the present application;
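The outer-loop/inner-loop arrangement can be illustrated for a single axis with two discrete PI regulators in cascade. This is a hedged sketch only: the gains, the sample time, the output limit and the names `PI` and `cascade_step` are hypothetical, and the decoupling and feed-forward terms of a real rotor-side controller are omitted.

```python
class PI:
    """Discrete PI regulator with output clamping (simple anti-windup)."""

    def __init__(self, kp, ki, dt, limit):
        self.kp, self.ki, self.dt, self.limit = kp, ki, dt, limit
        self.integral = 0.0

    def step(self, error):
        candidate = self.integral + self.ki * error * self.dt
        out = self.kp * error + candidate
        if abs(out) <= self.limit:
            self.integral = candidate   # integrate only while unsaturated
            return out
        return max(-self.limit, min(self.limit, out))


def cascade_step(ref, meas, current_meas, outer, inner):
    """Outer loop turns the power error into a current reference;
    inner loop turns the current error into a voltage command."""
    i_ref = outer.step(ref - meas)
    u_cmd = inner.step(i_ref - current_meas)
    return i_ref, u_cmd
```

In use, one such cascade would be run per axis, for example `cascade_step(p_ref, p_meas, i_meas, outer_pi, inner_pi)` at every control sample.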
with regard to the grid-side frequency converter control, the present application is elaborated as follows:
the grid side frequency converter is controlled to maintain the voltage of the direct current link capacitor at a preset constant value, and is irrelevant to the direction and the magnitude of the rotor power; and the reactive power generated by the whole wind generation set is controlled to be a set reference value according to the requirement of the whole wind generation set on the reactive power.
The positive direction of each electrical quantity of the grid-side frequency converter is defined as shown in Fig. 8, where u_dc is the DC bus voltage, i_dcr is the direct current flowing to the rotor-side frequency converter, i_dcg is the direct current output from the grid-side frequency converter, L_g is the filter inductance, and R_g is the resistance of the filter inductor.
The voltage equation in an arbitrary dq rotating coordinate frame is
Figure BDA0003805013670000141
where ω_c and the other variables in the formula, including time t, are per-unit values.
When the d-axis is oriented along the grid voltage vector, u_gq = 0, and the active power injected into the grid-side frequency converter by the power grid is:
Figure BDA0003805013670000142
Thus, by controlling i_gd and i_gq separately, decoupled control of the powers P_g and Q_g output to the power grid can be achieved.
According to instantaneous power theory, reactive power is only exchanged among the three phases and only active power affects the DC voltage, so the DC voltage can be controlled through the active current i_gd.
Using the amplitude-invariant abc-to-dq transformation and neglecting the DC-side losses, in actual (non-per-unit) values P_g = u_dc i_dcg = (3/2) u_gd i_gd. Substituting the relation between the AC-side voltage and the DC voltage (m is the modulation ratio):
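As a hedged numerical illustration of the decoupling, the sketch below evaluates instantaneous active and reactive power from dq quantities under an amplitude-invariant transformation. The factor 3/2 applies to actual values (use `scale=1.0` for per-unit quantities); the sign convention chosen for Q and the function name `dq_power` are assumptions.

```python
def dq_power(u_d, u_q, i_d, i_q, scale=1.5):
    """Return (P, Q) from dq voltages and currents.

    P = scale * (u_d * i_d + u_q * i_q)
    Q = scale * (u_q * i_d - u_d * i_q)   # sign convention is an assumption
    """
    p = scale * (u_d * i_d + u_q * i_q)
    q = scale * (u_q * i_d - u_d * i_q)
    return p, q
```

With the d-axis aligned to the grid voltage vector (u_q = 0), P depends only on i_d and Q only on i_q, which is the decoupling the text relies on.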
Figure BDA0003805013670000151
to obtain
Figure BDA0003805013670000152
and, from the DC circuit,
Figure BDA0003805013670000153
To obtain
Figure BDA0003805013670000154
Treating i_dcr as a disturbance, the transfer function from the output active current to the DC voltage can be expressed as:
Figure BDA0003805013670000155
Similar to the control of the rotor-side frequency converter, the control of the grid-side frequency converter is ultimately realized by controlling its voltage, so the relation between the AC voltage and the current of the frequency converter needs to be established. According to formula (3.4), the AC-side voltages u_gcd and u_gcq of the frequency converter are expressed as:
Figure BDA0003805013670000156
the control of the grid-side frequency converter is shown in fig. 9.
Referring to fig. 10, fig. 10 is a schematic diagram illustrating a control principle of the TDWADC controller based on reinforcement learning.
Here, I_dref is the d-axis reference value of the inner loop current generated by the active power outer loop, and is used to control the active power output of the wind power system.
The reactive power PI control outputs the q-axis inner loop current reference value I_qref, which adjusts the reactive power output of the wind power system.
As shown in Fig. 9, the wide-area damping controller of the wind power generation system is designed in the voltage outer loop; that is, the wide-area damping control signal is added to the voltage outer loop to adjust the q-axis inner loop current reference value I_qref and hence the reactive power output of the wind power system, thereby suppressing power oscillation at the grid-connected access point of the wind turbine.
Finally, referring to fig. 11, fig. 11 is a schematic diagram of a 16-machine time-lag wind power generation system as an embodiment. For the unit, the relevant parameter settings in the application are shown in tables 1-3;
TABLE 1 reinforcement learning network parameter settings
Figure BDA0003805013670000161
TABLE 2 Fan parameter settings
Figure BDA0003805013670000162
TABLE 3 results of modal analysis of the machine system
Figure BDA0003805013670000163
As can be seen from Table 3, the 16-machine system has 4 inter-area oscillation modes, of which modes 1 and 3 are sufficiently damped and modes 2 and 4 are insufficiently damped. Mode 4 has a higher frequency and a larger damping ratio than mode 2, so the TDWADC feedback signal is selected primarily for mode 2.
For these modes, the geometric observability/controllability results of the present application are given in Table 4;
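Modal analysis results of this kind are usually reported as a frequency and a damping ratio per mode. As a minimal sketch, both can be computed from an eigenvalue σ ± jω of the linearized system using the standard definitions f = |ω| / (2π) and ζ = −σ / sqrt(σ² + ω²). The function name is hypothetical, and no damping threshold is assumed because the text does not state one.

```python
import math


def mode_frequency_and_damping(sigma: float, omega: float):
    """Return (frequency in Hz, damping ratio) for the eigenvalue sigma + j*omega."""
    frequency_hz = abs(omega) / (2.0 * math.pi)
    damping_ratio = -sigma / math.hypot(sigma, omega)
    return frequency_hz, damping_ratio
```

A stable oscillatory mode (σ < 0) gives a positive damping ratio; the closer σ is to zero relative to ω, the weaker the damping.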
TABLE 4 results of geometric observability/controllability in different modes
Figure BDA0003805013670000171
As can be seen from Table 4, among the six candidate signals, the comprehensive geometric controllability/observability of signal 5 for mode 2 is smaller than that of signal 1, but for the other three modes signal 5 has the smallest comprehensive geometric controllability/observability. Considering this, the transmitted power signal P68-52 on tie line 68-52 is selected as the feedback signal of the TDWADC controller: it has a relatively large comprehensive geometric controllability/observability for mode 2 and only a small influence on the other three modes, so the TDWADC controller can effectively improve the damping of mode 2 while having minimal effect on the damping of the other three modes.
Two cases were also simulated in the present application, comparing conventional PSS control with the reinforcement-learning-based TDWADC control at time lags of 200 ms and 600 ms, respectively; the results are shown in Figs. 12 and 13.
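The text does not give the expression behind the comprehensive geometric controllability/observability values. A commonly used definition, assumed here, measures the alignment between the left eigenvector ψ_i of a mode and an input column b_k (controllability), and between an output row c_l and the right eigenvector φ_i (observability), and multiplies the two. The function names below are hypothetical.

```python
import math


def _norm(v):
    return math.sqrt(sum(abs(z) ** 2 for z in v))


def _alignment(a, b):
    """|a . b| / (||a|| * ||b||) for real or complex vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = _norm(a) * _norm(b)
    return abs(dot) / denom if denom > 0.0 else 0.0


def joint_index(psi_i, b_k, c_l, phi_i):
    """Joint geometric controllability/observability of mode i
    for input column b_k and output row c_l (assumed definition)."""
    return _alignment(psi_i, b_k) * _alignment(c_l, phi_i)
```

Under this definition each index lies between 0 and 1, so candidate feedback signals can be compared mode by mode on a common scale.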
As can be seen from fig. 12 and 13, the RL-TDWADC (RL stands for reinforcement learning, and TDWADC stands for time-lag wide-area damping controller) adopted by the present application can better suppress low-frequency oscillation of the power system.
The invention has the beneficial effects that: the low-frequency oscillation of the power system can be effectively inhibited in time, on one hand, the safety and the stability of the power grid are improved, on the other hand, the power grid can absorb the electric energy generated by the wind power plant in time and on a large scale, and the economic and social benefits of a wind power generation enterprise are improved.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
The above-described embodiments of the present invention should not be construed as limiting the scope of the present invention. Any other corresponding changes and modifications made according to the technical idea of the present invention should be included in the protection scope of the claims of the present invention.

Claims (5)

1. A time-lag wind power system wide-area damper control method based on reinforcement learning is characterized by comprising the following steps:
s101: constructing a time-lag wind power wide area damping TDWADC control system based on reinforcement learning; the TDWADC control system comprises: the system comprises a plurality of groups of double-fed wind turbine generators and a TDWADC controller based on reinforcement learning;
wherein doubly-fed wind turbine generator system includes: the system comprises a wind turbine, a gear box, a double-fed induction generator DFIG, a transformer, a rotor side frequency converter, a power grid side frequency converter and an overvoltage protection circuit crowBar;
the wind turbine is connected with the gear box through mechanical transmission; the gear box is connected with the doubly-fed induction generator through a transmission bearing; the DFIG is connected with the transformer through electromagnetic coupling and is connected to an alternating current power grid through the transformer; the DFIG of the doubly-fed induction generator is electrically connected with the output end of the overvoltage protection circuit crowBar and the input end of the rotor side frequency converter; the output end of the rotor side frequency converter is electrically connected with the input end of the power grid side frequency converter; the output end of the power grid side frequency converter is electrically connected with one end of the transformer;
s102: controlling the doubly-fed wind turbine generator set by adopting a TDWADC controller based on reinforcement learning;
the TDWADC controller based on reinforcement learning comprises three parts of control: reinforcement learning control, voltage outer loop PI control and current inner loop PI control;
the input signal of the reinforcement learning control is a wide-area feedback power signal generated after a plurality of groups of double-fed wind turbine generators are connected through a communication network; the output signal of the reinforcement learning control is accessed to the voltage outer loop PI control;
the input signal selection process of the reinforcement learning control is specifically as follows: the wide-area feedback power signals are selected by a modal geometric controllability/observability method, in which geometric controllability/observability analysis is performed on the wide-area feedback power signals, and the one or more wide-area feedback power signals with the highest geometric observability for the inter-area oscillation mode are selected as input signals of the reinforcement learning controller;
the voltage outer loop PI control is used for controlling a power grid side frequency converter;
and the current inner loop PI control is used for controlling the rotor side frequency converter to output specified active power and reactive power and finishing the suppression of the power oscillation of the grid-connected access point of the wind turbine.
2. The reinforcement learning-based time-lag wind power system wide-area damper control method as claimed in claim 1, wherein: the reinforcement learning control includes: a state converter, an Actor network and a Critic network; the principle of independent control is as follows:
subtracting the output quantity y(t) from a preset signal w(t), set according to the actual situation, to generate an error signal e(t); converting the error signal e(t) into an input state signal x(t) of the reinforcement learning network through the state converter; inputting the state signal x(t) into the Actor network to obtain the output signal u_n(t); inputting the state signal x(t) and the error reinforcement learning signal r(t) together into the Critic network to obtain an output signal n(t); combining the output signal u_n(t) with n(t) to obtain the control input signal u(t) of the controlled photovoltaic power generation system; u(t) acts on the controlled photovoltaic power generation system to obtain the output signal y(t), forming closed-loop control; the Actor network and the Critic network also update their weight coefficients online through the temporal-difference signal δ_TD(t).
3. The reinforcement learning-based time-lag wind power system wide-area damper control method as claimed in claim 2, wherein: two BP networks are used to implement the policy function of the Actor network and the value function of the Critic network, respectively.
4. The reinforcement learning-based time lag wind power system wide area damper control method of claim 3, characterized in that:
the input of the Critic network is the state signal
$$x_c(t) = [x_1(t), x_2(t), \dots, x_n(t), r(t)]^{T} \tag{1}$$
the Critic network error function is shown in formula (2),
Figure FDA0003805013660000021
where λ is the discount coefficient, 0< λ <1;
r (t) is defined as:
Figure FDA0003805013660000022
where ε is a constant with ε > 0;
the transfer function of the hidden layer neuron of the Critic network adopts a bipolar sigmoid function, and is shown as the formula (4):
Figure FDA0003805013660000023
the output of the Critic network is a performance index function J (t), a hidden layer adopts a sigmoid activation function, and an output layer adopts a linear activation function; inputs and outputs for the hidden and output layer neurons of the Critic network are as follows (5):
Figure FDA0003805013660000031
where N_c is the number of hidden-layer neurons of the Critic network, q_i and p_i are the input and output of the i-th hidden-layer neuron, respectively, and ω_c^(1) and ω_c^(2) represent the weights from the input layer to the hidden layer and from the hidden layer to the output layer, respectively;
the Critic network weights are updated according to formula (6):
Figure FDA0003805013660000032
η_c(t) is the learning rate of the Critic network;
the gradient from the hidden layer to the output layer is obtained by the backpropagation gradient descent method, as shown in formula (7):
Figure FDA0003805013660000033
the gradient calculation from the input layer to the hidden layer is shown as equation (8):
Figure FDA0003805013660000034
5. The reinforcement learning-based time-lag wind power system wide-area damper control method as claimed in claim 4, wherein: the input of the Actor network is:
$$x_a(t) = [x_1(t), x_2(t), \dots, x_n(t)]^{T} \tag{9}$$
the inputs and outputs of the hidden-layer and output-layer neurons of the Actor network are given by formula (10):
Figure FDA0003805013660000041
where N_a is the number of hidden-layer neurons of the Actor network, h_i and g_i are the input and output of the i-th hidden-layer neuron, respectively, and ω_a^(1) and ω_a^(2) represent the weights from the input layer to the hidden layer and from the hidden layer to the output layer, respectively;
the formula for updating the weights of the Actor network is shown in equation (11):
Figure FDA0003805013660000042
η_a(t) is the learning rate of the Actor network; the gradients from the hidden layer to the output layer and from the input layer to the hidden layer are shown in formulas (12) and (13):
Figure FDA0003805013660000043
Figure FDA0003805013660000044
where ω_nj and ω_j are the weight coefficients of the Actor network and the Critic network, respectively.
The control input signal u(t) is given by formula (14):
$$u(t) = u_1(t) + \eta_m(0, \rho(t)) \tag{14}$$
where η_m depends on the Critic network output J(t) through ρ(t) = [1 + exp(2J(t))]^(-1).
CN202210994492.5A 2022-08-18 2022-08-18 Time-lag wind power system wide-area damper control method based on reinforcement learning Withdrawn CN115395532A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210994492.5A CN115395532A (en) 2022-08-18 2022-08-18 Time-lag wind power system wide-area damper control method based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210994492.5A CN115395532A (en) 2022-08-18 2022-08-18 Time-lag wind power system wide-area damper control method based on reinforcement learning

Publications (1)

Publication Number Publication Date
CN115395532A true CN115395532A (en) 2022-11-25

Family

ID=84120547

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210994492.5A Withdrawn CN115395532A (en) 2022-08-18 2022-08-18 Time-lag wind power system wide-area damper control method based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN115395532A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116613784A (en) * 2023-07-20 2023-08-18 武昌理工学院 Wind-light power generation system subsynchronous oscillation coordination control method based on PID-DHDP
CN116613784B (en) * 2023-07-20 2023-12-12 武昌理工学院 Wind-light power generation system subsynchronous oscillation coordination control method based on PID-DHDP

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20221125