CN110880774B - Self-adaptive adjustment inverter controller - Google Patents

Self-adaptive adjustment inverter controller

Info

Publication number
CN110880774B
Authority
CN
China
Prior art keywords
inverter
formula
module
learning
action
Prior art date
Legal status
Active
Application number
CN201911165356.XA
Other languages
Chinese (zh)
Other versions
CN110880774A (en)
Inventor
魏俊
叶圣永
张玉鸿
韩宇奇
刘旭娜
张文涛
赵达维
李达
吕学海
陈博
Current Assignee
Economic and Technological Research Institute of State Grid Sichuan Electric Power Co Ltd
Original Assignee
Economic and Technological Research Institute of State Grid Sichuan Electric Power Co Ltd
Priority date
Filing date
Publication date
Application filed by Economic and Technological Research Institute of State Grid Sichuan Electric Power Co Ltd filed Critical Economic and Technological Research Institute of State Grid Sichuan Electric Power Co Ltd
Priority to CN201911165356.XA priority Critical patent/CN110880774B/en
Publication of CN110880774A publication Critical patent/CN110880774A/en
Application granted granted Critical
Publication of CN110880774B publication Critical patent/CN110880774B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00Circuit arrangements for ac mains or ac distribution networks
    • H02J3/24Arrangements for preventing or reducing oscillations of power in networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Power Engineering (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention discloses a self-adaptive adjustment inverter controller which comprises a dq conversion module, an output active reactive power and end voltage effective value calculation module, a modulation wave signal amplitude calculation module, a simulation rotor motion equation module, a reinforcement learning control module and a dq inverse conversion and PWM (pulse width modulation) module.

Description

Self-adaptive adjustment inverter controller
Technical Field
The invention relates to the technical field of power electronic inverters, in particular to a self-adaptive adjusting inverter controller.
Background
In view of energy and environmental pressures, more and more renewable energy sources are connected to the power grid through power electronic generation equipment. As a device that converts direct current into alternating current, the inverter is widely used in wind power, energy storage, photovoltaics and other fields. The earliest inverter control strategies adopted a two-layer structure, with an inner current loop and an outer power or voltage loop. However, such a control strategy responds so quickly that it is unfavorable to the frequency stability of the power system, and the output power of the inverter cannot be adjusted adaptively according to the system voltage and frequency. Droop control strategies for inverters were subsequently proposed. Although droop control can automatically adjust the output active and reactive power according to the system frequency and voltage, which benefits system frequency and voltage control, it still suffers from an excessively fast response. In view of the beneficial effect of the rotor inertia of synchronous generators on power system frequency stability, virtual synchronous generator (VSG) control technology was proposed; its core idea is to emulate the rotor motion equation of a synchronous generator in the inverter control strategy. However, while reducing the response speed and increasing the inertia of the power system, this technology also inherits the tendency of synchronous generator rotors to develop low-frequency oscillations. To address this problem, a series of control strategies starting from the moment of inertia J and the damping coefficient D have been proposed. In one approach, a small-signal model of a single grid-connected VSG inverter is built, and suitable values of J and D are selected by means of the root locus so that stability is guaranteed and good dynamic characteristics are obtained. However, a controller designed by such a linearization method has difficulty adapting to the complex operating conditions of the inverter, and once the operating condition differs from the initial design condition there is a risk of instability or degraded dynamic performance. A series of measures from the viewpoint of adaptive control have also been proposed, such as VSG control algorithms that adjust J according to the rate of change dω/dt of the VSG virtual angular frequency, or that adjust the parameters J and D according to the output power P and its rate of change dP/dt. However, such adaptive control strategies still face the problem of selecting the controller parameters. In practice these parameters are often chosen by simulation trial and error, which can hardly cover all possible operating states of the inverter and cannot guarantee that the designed controller remains stable in a complex operating environment.
In recent years, artificial intelligence technologies have been applied ever more widely in power systems. In the field of power equipment control, reinforcement learning (RL) technology has attracted attention. RL can be regarded as an important machine learning method within artificial intelligence, and also as an independent branch of Markov decision process (MDP) and dynamic optimization methods. In a reinforcement learning method, the controller is regarded as an agent that needs no prior knowledge of the environment; by exploring its own control actions, observing the rewards obtained and continuously updating and iterating, it gradually obtains an optimized mapping strategy from states to actions. In other words, reinforcement learning makes behavioral decisions through the actions, states and rewards produced by the continuous interaction between the agent and the environment. At present, power equipment controllers designed with RL algorithms include DC supplementary damping controllers, dynamic quadrature boosters, static var compensators (SVC), power system stabilizers (PSS), and the like. Research shows that controllers designed with RL algorithms can effectively balance the mixed objective of stability and dynamic performance, exhibit good environmental adaptability, and are well suited to systems with many uncertain factors and large disturbances, such as power systems. If a reinforcement learning module is introduced into the inverter controller to evaluate the oscillation states excited by system disturbances under different J and D, the controller parameters can be adjusted adaptively through online learning and training, so that the inversion function is realized while a better oscillation suppression effect is obtained. Therefore, as the penetration of renewable energy increases further and power systems contain a large proportion of power electronic devices with diverse control strategies and complex operating conditions, how to realize inverter control with an RL algorithm has become a technical problem to be solved urgently.
Disclosure of Invention
The invention aims to provide a self-adaptive adjustment inverter controller that uses a reinforcement learning algorithm and introduces an online learning and optimization mechanism to realize adaptive adjustment of the virtual moment of inertia and virtual damping coefficient of the inverter, so as to suit inverter operation and control in the oscillation-prone environment of a power electronics-based power system with a high proportion of grid-connected renewable energy.
The invention is realized by the following technical scheme:
a self-adaptive adjustment inverter controller comprises a dq transformation module, an output active reactive power and end voltage effective value calculation module, a modulation wave signal amplitude calculation module, a simulation rotor motion equation module, a reinforcement learning control module and a dq inverse transformation and PWM (pulse Width modulation) modulation module, wherein the inverter controller simulates a synchronous generator rotor motion equation and adjusts virtual inertia and a damping coefficient on line through the reinforcement learning control module so as to obtain a good low-frequency oscillation suppression effect of an electric power system.
In this scheme for adjusting the two virtual coefficients J and D of an inverter under the virtual synchronous generator control strategy, the adjustment follows the idea of optimizing the strategy through online learning based on a reinforcement learning algorithm, rather than adaptively adjusting J and D from the inverter output power P and the virtual angular frequency ω and their derivatives as in existing methods. Because the reinforcement learning algorithm imitates human intelligence to a certain extent, this control strategy helps the inverter adapt to more complex power system operating conditions and oscillation environments.
The invention is particularly suitable for inverter controllers operating in oscillation-prone environments. On the basis of virtual synchronous generator control technology, the controller adds the function of online reinforcement-learning-based optimization and adjustment of its own parameters.
Furthermore, the dq conversion module is used for decomposing the three-phase voltage ea, eb, ec across the capacitor of the inverter LC filter and the three-phase current ia, ib, ic flowing through the inductor into the synchronous rotating coordinate system of the inverter, obtaining their dq-axis components ud, uq and id, iq.
The dq inverse transformation and PWM modulation module is used for performing dq inverse transformation on the calculation results of the modulation wave signal amplitude calculation module and the simulated rotor motion equation module to obtain the modulation wave signals ua, ub, uc, and for generating PWM control signals according to a PWM algorithm to drive the three-phase inverter bridge, thereby realizing the inversion function.
Further, the calculation formula of the output active and reactive power and terminal voltage effective value calculation module is shown in formula (1):
[Formula (1), rendered as an image in the original: calculation of the active power P, reactive power Q and output voltage amplitude U from the dq-axis components ud, uq, id and iq; s denotes the Laplace operator.]
in the formula, P is active power, Q is reactive power, U is the output voltage amplitude, and s is the Laplace operator.
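As an illustration of this module, a minimal Python sketch follows. Since formula (1) is only available as an image, the sketch assumes the conventional instantaneous-power expressions in the dq frame together with a first-order low-pass filter (cut-off omega_c) standing in for the Laplace-operator term; all function and parameter names are illustrative, not taken from the patent.

import numpy as np

def abc_to_dq(xa, xb, xc, theta):
    """Amplitude-invariant Park transformation of three-phase quantities into
    the dq frame rotating at the modulation-wave phase theta."""
    d = (2.0 / 3.0) * (xa * np.cos(theta)
                       + xb * np.cos(theta - 2.0 * np.pi / 3.0)
                       + xc * np.cos(theta + 2.0 * np.pi / 3.0))
    q = -(2.0 / 3.0) * (xa * np.sin(theta)
                        + xb * np.sin(theta - 2.0 * np.pi / 3.0)
                        + xc * np.sin(theta + 2.0 * np.pi / 3.0))
    return d, q

def power_and_voltage(ud, uq, id_, iq, p_filt, q_filt, omega_c, dt):
    """P, Q and terminal-voltage amplitude U from dq components; a discrete
    first-order lag replaces the Laplace-operator filtering of formula (1)."""
    p_inst = 1.5 * (ud * id_ + uq * iq)        # instantaneous active power
    q_inst = 1.5 * (uq * id_ - ud * iq)        # instantaneous reactive power
    p_filt += omega_c * dt * (p_inst - p_filt)
    q_filt += omega_c * dt * (q_inst - q_filt)
    U = np.hypot(ud, uq)                       # output voltage amplitude
    return p_filt, q_filt, U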
Further, the modulation wave signal amplitude calculation module is used for calculating the modulation wave amplitude Eq by using the formula:
Eq=∫Ke[(Uref-U)-n(Q-Qref)]d(t) (2)
in the formula, Uref is the set value of the inverter terminal voltage; Qref is the set value of the inverter reactive power; Ke is the amplification gain, and n is the droop coefficient of the reactive-voltage link; Eq is the q-axis component of the inverter modulation wave signal, and the 0-axis and d-axis components of the modulation wave signal are both 0.
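Formula (2) is a plain integrator, so its discrete-time form is straightforward; the sketch below is illustrative only and the names are assumptions.

def update_eq(eq, U, Q, Uref, Qref, Ke, n, dt):
    """One discrete-time integration step of formula (2): the q-axis modulation
    amplitude Eq accumulates Ke*[(Uref - U) - n*(Q - Qref)] over each interval dt."""
    eq += Ke * ((Uref - U) - n * (Q - Qref)) * dt
    return eq

In a digital controller this update would run once per sampling period together with the other module calculations.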
Further, the simulated rotor motion equation module is used for calculating the virtual angular frequency ω of the inverter and the phase angle θ of the modulation wave, with the calculation formulas shown in formulas (3) and (4):
[Formula (3), rendered as an image in the original: the simulated rotor motion (swing) equation relating J, D, m, Pref, P, ω and ω0.]
θ=∫ω d(t) (4)
in the formula, J is the virtual moment of inertia of the inverter; m is the active droop coefficient; Pref is the set value of the active power output by the inverter; ω0 is the rated angular frequency of the power system in which the inverter is located; D is the virtual damping coefficient of the inverter; θ is the phase of the inverter modulation wave.
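Formula (3) itself is only available as an image; as a hedged illustration, the sketch below integrates one commonly used VSG form of the swing equation, J·dω/dt = (Pref − P)/ω0 − (m + D)·(ω − ω0), together with formula (4) giving θ as the integral of ω. The assumed form of the equation and all names are illustrative, not the patent's exact expression.

def swing_step(omega, theta, P, Pref, J, D, m, omega0, dt):
    """One forward-Euler step of a simulated rotor motion equation (assumed form,
    see above) plus the phase integration of formula (4)."""
    domega_dt = ((Pref - P) / omega0 - (m + D) * (omega - omega0)) / J
    omega += domega_dt * dt
    theta += omega * dt            # formula (4): theta = integral of omega
    return omega, theta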
Further, the reinforcement learning control module is used for adjusting the virtual moment of inertia and the damping coefficient on line, and comprises the following steps:
step 1: determining a control action set a, a state set s, a reward function R, a learning time step and a Q value updating strategy;
step 2: establishing a mathematical model of the inverter, and obtaining a controller parameter after preliminary optimization by utilizing offline pre-learning of a Q learning algorithm;
step 3: putting the inverter into online operation, continuing to learn online with the Q learning algorithm during operation, and further updating the action strategy to adapt to the complex operating environment of the power grid.
Further, the expression of the control action set a in step 1 is as follows:
a∈{(J,D) | J∈[Jmin,Jmax], D∈[Dmin,Dmax]} (5)
in the formula, Jmin and Jmax are the bounds of the preset range of the virtual moment of inertia; Dmin and Dmax are the bounds of the preset range of the virtual damping coefficient;
the expression of the state set s in step 1 is as follows:
[Formula (6), rendered as an image in the original: the state set s, made up of the deviation combinations (ΔP, Δω).]
in the formula, ΔP is the difference between the inverter output active power at a given time and its set value; Δω is the rotational speed deviation of the virtual synchronous generator at that time, and (ΔP, Δω) represents the deviation combination of the inverter output power P and the virtual angular frequency ω at that time.
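For illustration, the discretized action set of formula (5) and a state mapping consistent with formula (6) can be sketched as follows; the bin edges and helper names are assumptions (the patent only states that ΔP and Δω are each split into discrete ranges, 6 in the embodiment).

import numpy as np
from itertools import product

def build_action_set(j_min, j_max, d_min, d_max, dj, dd):
    """All candidate (J, D) pairs of formula (5), obtained by discretizing the
    two preset ranges with steps dj and dd."""
    j_vals = np.round(np.arange(j_min, j_max + 1e-9, dj), 4)
    d_vals = np.round(np.arange(d_min, d_max + 1e-9, dd), 4)
    return [(float(j), float(d)) for j, d in product(j_vals, d_vals)]

def discretize_state(dP, domega, p_edges, w_edges):
    """Map the deviation pair (dP, domega) of formula (6) onto a grid cell; the
    bin edges p_edges / w_edges are illustrative assumptions."""
    return (int(np.digitize(dP, p_edges)), int(np.digitize(domega, w_edges)))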
Further, the expression of the reward function R in step 1 is as follows:
Ri=-∫_t^(t+step)(aP|ΔP|+aω|Δω|)dt (7)
in the formula, Ri is the reward value obtained in the ith iteration; t is a given time; t + step is the time one learning time step later; aP and aω are the weight coefficients of the active power deviation and the rotational speed deviation in the evaluation index, respectively;
the expression of the Q value updating strategy in the step 1 is as follows:
Q(si,ai) ← Q(si,ai) + α[Ri + γ·max_a Q(si+1,a) − Q(si,ai)] (8)
in the formula, si and ai denote the state and action of the ith iteration; Q(si,ai) is the Q value corresponding to si and ai in the ith iteration; α is the learning factor, indicating the degree of confidence given to the newly learned update; γ is the discount factor, determining how strongly returns distant in time affect the current value.
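Formula (8) is rendered as an image in the original; the definitions of α and γ above match the standard tabular Q-learning update, which can be sketched as follows (Q stored as a dict; names illustrative).

def q_update(Q, actions, s, a, r, s_next, alpha, gamma):
    """Standard tabular Q-learning update consistent with the definitions above:
    move Q(s, a) towards r + gamma * max over a' of Q(s_next, a')."""
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return Q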
Further, the Q learning algorithm in step 2 is a cyclic iterative process, and obtains the preliminarily optimized controller parameters through offline learning, including the following steps:
S11: initialize all reinforcement learning parameters, i.e. set the discount factor γ, the learning factor α, the probability ε of the ε-greedy strategy and the Q-table matrix convergence set value q, and initialize the Q table as a zero matrix;
S12: observe the state si at the current moment;
S13: according to the current state si, select one action ai from the action set represented by formula (5) according to the policy π;
S14: after the action is performed, observe the state si+1 at the next moment;
S15: calculate the reward value Ri under the current action from the reward function, formula (7);
S16: update the Q-table matrix by equation (8);
S17: set i = i + 1; judge whether the algorithm convergence condition is met; if not, return to S12, otherwise end the offline learning.
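A minimal sketch of the offline pre-learning loop S11–S17 follows. The environment object env (a simulation model of the inverter returning a discretized state and a reward) and the convergence test on successive Q-table snapshots are assumptions made for illustration, not part of the original disclosure.

import numpy as np

def offline_pretrain(env, actions, alpha, gamma, epsilon, q_tol,
                     check_every=500, max_iter=100000):
    """Offline Q-learning following steps S11-S17 on a hypothetical simulation
    model env with observe() -> state and apply(action) -> (next_state, reward)."""
    Q, rng, snapshot = {}, np.random.default_rng(), {}      # S11: empty (zero) Q table
    s = env.observe()                                        # S12: observe current state
    for i in range(1, max_iter + 1):
        if rng.random() < epsilon:                           # S13: epsilon-greedy policy pi
            a = actions[rng.integers(len(actions))]
        else:
            a = max(actions, key=lambda x: Q.get((s, x), 0.0))
        s_next, r = env.apply(a)                             # S14 + S15: next state and reward
        best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
        old = Q.get((s, a), 0.0)
        Q[(s, a)] = old + alpha * (r + gamma * best_next - old)   # S16: Q-table update
        s = s_next
        if i % check_every == 0:                             # S17: convergence check
            delta = np.sqrt(sum((Q[k] - snapshot.get(k, 0.0)) ** 2 for k in Q))
            if delta < q_tol:
                break
            snapshot = dict(Q)
    return Q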
Further, in step 3, the Q learning algorithm continues to be used for online learning and further updating the action strategy during operation; the Q learning algorithm comprises the following steps:
S21: observe the current state si;
S22: according to the current state si, select one action ai from the action set represented by formula (5) according to the policy π;
S23: after the action is performed, observe the state si+1 at the next moment;
S24: obtain the reward value Ri under the current action from the reward function;
S25: update the Q-table matrix by equation (8);
S26: set i = i + 1 and return to step S21.
Further, in the iteration steps of the Q learning algorithm, the strategy π is an ε-greedy strategy: when selecting an action ai, with probability ε one of the actions in the action set represented by formula (5) is chosen at random as ai, and with probability 1−ε the action with the maximum Q value in the current state si is chosen as ai.
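The policy π described above reduces to a single ε-greedy choice; a minimal sketch (names illustrative):

import random

def epsilon_greedy(Q, actions, state, epsilon):
    """Policy pi of steps S13/S22: with probability epsilon pick a random action
    from the action set of formula (5); otherwise pick the action with the
    largest Q value in the current state."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q.get((state, a), 0.0))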
Compared with the prior art, the invention has the following advantages and beneficial effects:
In this scheme for adjusting the two virtual coefficients J and D of an inverter under the virtual synchronous generator control strategy, the adjustment follows the idea of optimizing the strategy through online learning based on a reinforcement learning algorithm, rather than adaptively adjusting J and D from the inverter output power P and the virtual angular frequency ω and their derivatives as in existing methods. Because the reinforcement learning algorithm imitates human intelligence to a certain extent, this control strategy helps the inverter adapt to more complex power system operating conditions and oscillation environments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention. In the drawings:
FIG. 1 is a main circuit topology of an inverter controller of the present invention and a control block diagram thereof;
fig. 2 is a control block diagram of the inverter controller of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to examples and accompanying drawings, and the exemplary embodiments and descriptions thereof are only used for explaining the present invention and are not meant to limit the present invention.
Example:
as shown in fig. 2, the adaptive regulation inverter controller of the present invention includes a dq conversion module, an output active and reactive and terminal voltage effective value calculation module, a modulated wave signal amplitude calculation module, an analog rotor motion equation module, a reinforcement learning control module, and a dq inverse conversion and PWM modulation module:
A dq transformation module: used for decomposing the three-phase voltage ea, eb, ec across the capacitor of the inverter LC filter and the three-phase current ia, ib, ic flowing through the inductor into the synchronous rotating coordinate system of the inverter, obtaining their dq-axis components ud, uq and id, iq.
The output active and reactive power and terminal voltage effective value calculation module: its calculation formula is shown in formula (1):
[Formula (1), rendered as an image in the original: calculation of the active power P, reactive power Q and output voltage amplitude U from the dq-axis components ud, uq, id and iq; s denotes the Laplace operator.]
in the formula, P is active power, Q is reactive power, U is the output voltage amplitude, and s is the Laplace operator.
Modulation wave signal amplitude calculation module: the formula used for calculating the amplitude Eq of the modulated wave is as follows:
Eq=∫Ke[(Uref-U)-n(Q-Qref)]d(t) (2)
in the formula, Uref is the set value of the inverter terminal voltage; Qref is the set value of the inverter reactive power; Ke is the amplification gain, and n is the droop coefficient of the reactive-voltage link; Eq is the q-axis component of the inverter modulation wave signal, and the 0-axis and d-axis components of the modulation wave signal are both 0.
A simulated rotor motion equation module: obtains the virtual angular frequency ω of the inverter and the phase angle θ of the modulation wave according to formulas (3) and (4):
[Formula (3), rendered as an image in the original: the simulated rotor motion (swing) equation relating J, D, m, Pref, P, ω and ω0.]
θ=∫ω d(t) (4)
in the formula, J is the virtual moment of inertia of the inverter; m is the active droop coefficient; Pref is the set value of the active power output by the inverter; ω0 is the rated angular frequency of the power system in which the inverter is located; D is the virtual damping coefficient of the inverter; θ is the phase of the inverter modulation wave.
A reinforcement learning control module: the method comprises the following steps:
step 1: and determining a control action set a, a state set s, a reward function R, a step of learning time each time and a Q value updating strategy. In conjunction with the inverter control strategy, the action set a here is:
a∈{(J,D) | J∈[Jmin,Jmax], D∈[Dmin,Dmax]} (5)
in the formula, Jmin and Jmax are the bounds of the preset range of the virtual moment of inertia; Dmin and Dmax are the bounds of the preset range of the virtual damping coefficient. By means of discretization, {(J, D)} represents the set of combinations within the two parameter adjustment ranges, i.e. the candidate combinations of J and D.
The state set s is:
[Formula (6), rendered as an image in the original: the state set s, made up of the deviation combinations (ΔP, Δω).]
in the formula, ΔP is the difference between the inverter output active power at a given time and its set value; Δω is the rotational speed deviation of the virtual synchronous generator at that time; (ΔP, Δω) represents the deviation combination of the inverter output power P and the virtual angular frequency ω at that time, and the state spaces of ΔP and Δω are each divided into 6 different states to reflect the oscillation condition of the inverter during the dynamic process.
The reward function R is:
Ri=-∫_t^(t+step)(aP|ΔP|+aω|Δω|)dt (7)
in the formula, Ri is the reward value obtained in the ith iteration; t is a given time; t + step is the time one learning time step later; aP and aω are the weight coefficients of the active power deviation and the rotational speed deviation in the evaluation index, respectively. Because the inverter is controlled in discrete time in the implementation above, the reward value over each step period is calculated in the discrete form:
Ri = -Σj=1..N (aP|ΔPj| + aω|Δωj|)·Δt
in the formula, j = 1 corresponds to the start of the window at time t, and N = step/Δt (rounded) is the number of intervals of length Δt within one learning time step, so j = N corresponds to time t + step. The interval Δt is equal to the sampling and control period of the inverter controller; for medium- and low-power inverters Δt is typically 1/5000 second.
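The discrete reward accumulation described above can be sketched as follows; with step = 0.1 s and Δt = 1/5000 s this sums roughly N = 500 samples per learning step (function and parameter names are illustrative).

def step_reward(dP_samples, domega_samples, aP, aomega, dt):
    """Discrete form of the reward over one learning time step: accumulate
    aP*|dP| + aomega*|domega| at every controller sampling instant, scale by dt
    and negate, so larger oscillations yield a more negative reward."""
    total = sum(aP * abs(dp) + aomega * abs(dw)
                for dp, dw in zip(dP_samples, domega_samples))
    return -total * dt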
Step 2: and establishing a mathematical model of the inverter, and obtaining the preliminarily optimized controller parameters by utilizing offline pre-learning of a Q learning algorithm. The Q learning algorithm is a cyclic iterative process, and the controller parameters after preliminary optimization are obtained through off-line learning. The whole process is divided into 7 steps from S11 to S17:
S11: initialize all reinforcement learning parameters, i.e. set the discount factor γ, the learning factor α, the probability ε of the ε-greedy strategy and the Q-table matrix convergence set value q, and initialize the Q table as a zero matrix;
S12: observe the state si at the current moment;
S13: according to the current state si, select one action ai from the action set represented by formula (5) according to the policy π;
S14: after the action is performed, observe the state si+1 at the next moment;
S15: calculate the reward value Ri under the current action from the reward function, formula (7);
S16: update the Q-table matrix by equation (8);
S17: set i = i + 1; judge whether the algorithm convergence condition is met; if not, return to S12, otherwise end the offline learning;
The iterative convergence condition of the algorithm is that the 2-norm of the Q-table matrix is smaller than the set value q.
And step 3: and putting the inverter into online operation, continuously utilizing a Q learning algorithm to learn online in the operation process, and further updating an action strategy to adapt to the complex operation environment of the power grid. The steps of the Q learning algorithm are a loop iteration process, and the whole process is divided into 6 steps from S21 to S26:
S21: observe the current state si;
S22: according to the current state si, select one action ai from the action set represented by formula (5) according to the policy π;
S23: after the action is performed, observe the state si+1 at the next moment;
S24: obtain the reward value Ri under the current action from the reward function;
S25: update the Q-table matrix by equation (8);
S26: set i = i + 1 and return to step S21.
The strategy π involved in the iteration steps of the Q learning algorithm in reinforcement learning steps 2 and 3 above is an ε-greedy strategy: when selecting an action ai, with probability ε one of the actions in the action set represented by formula (5) is chosen at random as ai, and with probability 1−ε the action with the maximum Q value in the current state si is chosen as ai.
A dq inverse transformation and PWM modulation module: places Eq calculated by formula (2) on the q axis, sets the d-axis and 0-axis signals to 0, performs the dq inverse transformation using ω and θ obtained from formulas (3) and (4) to get the modulation wave signals ua, ub, uc, and generates PWM control signals according to a PWM algorithm to drive the three-phase inverter bridge, thereby realizing the inversion function.
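A sketch of the inverse transformation stage is given below; it assumes the amplitude-invariant inverse Park transform with the d-axis and 0-axis components set to zero, which is one common way to realize the operation described above (names illustrative).

import numpy as np

def modulation_waves(Eq, theta):
    """Inverse dq transformation with only the q-axis amplitude Eq non-zero,
    giving the three-phase modulation waves ua, ub, uc that the PWM stage
    compares with its carrier."""
    ua = -Eq * np.sin(theta)
    ub = -Eq * np.sin(theta - 2.0 * np.pi / 3.0)
    uc = -Eq * np.sin(theta + 2.0 * np.pi / 3.0)
    return ua, ub, uc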
Specific parameters of an inverter controller participating in the suppression of low-frequency oscillations of the power system are given below as an example. The rated voltage of the three-phase inverter is 380 V, the rated frequency is 50 Hz, and the rated power is 100 kW. Correspondingly, the rated DC-side voltage of the inverter is Udc = 750 V. The power electronic devices of the three-phase full-bridge inverter circuit are IGBTs (Insulated Gate Bipolar Transistors), Infineon model F150R12RT4. The PWM carrier frequency is 5 kHz. The inductance L of the LC filter is 2 mH and the capacitance is 13 μF, model C67S1136-002700. The energy-storage capacitors on the DC bus are Hitachi capacitors with a capacitance of 2200 μF and a withstand voltage of 450 V; six capacitors are used, every two connected in series and the three series pairs connected in parallel, giving a total capacitance of 3300 μF. The current sensors are LEM HAS150-S.
The parameters in the inverter controller are set to Uref = 1.00, ω0 = 1.00, Pref = 0.85 and Qref = 0.00, all in per unit. The droop coefficients of the virtual synchronous generator are selected as m = n = 0.1, Ke = 100, with initial values J0 = 1.0 and D0 = 1.00. The relevant reinforcement learning parameters are selected as follows: Jmin = 0.01, Jmax = 5.01, Dmin = 0.01, Dmax = 5.01. When discretization is used to determine the action set {J, D}, ΔJ = 0.2 and ΔD = 0.2, i.e. the action set {J, D} = {(0.01, 0.01), (0.01, 0.21), (0.21, 0.01), ..., (5.01, 5.01)}, 676 combinations in total. step is taken as 0.1 s. The weight coefficients aP and aω are 1 and 314, respectively. The discount factor γ is 0.7. The learning factor α is 0.5. The ε-greedy probability ε is 0.1. The set value q for the 2-norm convergence condition of the Q-table matrix iteration is 0.1.
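The size of the action set quoted above can be checked directly: with both ranges running from 0.01 to 5.01 in steps of 0.2 there are 26 values per parameter and 26 × 26 = 676 combinations. A short illustrative check:

import numpy as np
from itertools import product

j_vals = np.round(np.arange(0.01, 5.01 + 1e-9, 0.2), 2)   # 0.01, 0.21, ..., 5.01
d_vals = np.round(np.arange(0.01, 5.01 + 1e-9, 0.2), 2)
action_set = [(float(j), float(d)) for j, d in product(j_vals, d_vals)]
assert len(j_vals) == 26 and len(action_set) == 676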
Fig. 1 is a main circuit topology and a control block diagram of an inverter controller according to the present invention, which includes modules such as a power circuit, a current and voltage measuring unit, a controller, and a driving circuit. The following describes the structure and function of each part in detail with reference to fig. 1:
1) The power circuit part: this part comprises the inverter DC side, the three-phase full-bridge inverter circuit, the LC filter, the closing switch KM and so on, and is mainly used for power transmission. In Fig. 1, Lf and rf are the inductance and the equivalent resistance of the inductor L of the inverter LC filter; the resistance is usually so small compared with the inductance that it can be neglected. Cf is the capacitor C of the inverter LC filter. r and L are the equivalent resistance and inductance of the transmission line from the inverter to the connected grid. Udc is the DC side, such as a battery. The three-phase full-bridge inverter circuit comprises 6 fully controlled power electronic devices, which are switched on or off under the control of the voltage signals output by the drive circuit, thereby realizing the inversion function.
2) Current and voltage measuring unit part: this part mainly consists of the voltage and current sensors that measure the inverter port voltage and output current. The three-phase voltage measured across the capacitor C of the LC filter is denoted ea, eb, ec, and the current measured through the inductor L is denoted ia, ib, ic.
3) The controller section: this part mainly implements the control function. The measured three-phase voltages ea, eb, ec and current signals ia, ib, ic are sent to the control module, which generates and outputs control signals according to the control strategy; these signals are used by the driving circuit to switch the power electronic devices on and off. The specific control strategy of the controller of the present invention is shown in Fig. 2.
4) A drive circuit section: this part is mainly to control the on/off of the power electronics according to the PWM signal.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (9)

1. A self-adaptive adjustment inverter controller is characterized by comprising a dq conversion module, an output active reactive power and end voltage effective value calculation module, a modulation wave signal amplitude calculation module, a simulation rotor motion equation module, a reinforcement learning control module and a dq inverse conversion and PWM modulation module, wherein the inverter controller simulates a synchronous generator rotor motion equation and adjusts virtual rotational inertia and a damping coefficient on line through the reinforcement learning control module so as to obtain a better low-frequency oscillation suppression effect of an electric power system;
the reinforcement learning control module is used for adjusting the virtual moment of inertia and the damping coefficient on line, and comprises the following steps:
step 1: determining a control action set a, a state set s, a reward function R, a learning time step and a Q value updating strategy;
step 2: establishing a mathematical model of the inverter, and obtaining a controller parameter after preliminary optimization by utilizing offline pre-learning of a Q learning algorithm;
and step 3: and putting the inverter into online operation, continuously utilizing a Q learning algorithm to learn online in the operation process, and further updating an action strategy to adapt to the complex operation environment of the power grid.
2. The adaptive-adjustment inverter controller of claim 1, wherein the dq conversion module is configured to decompose the three-phase voltage ea, eb, ec across the capacitor of the inverter LC filter and the three-phase current ia, ib, ic flowing through the inductor into the synchronous rotating coordinate system of the inverter to obtain their dq-axis components ud, uq and id, iq;
the dq inverse transformation and PWM module is configured to perform dq inverse transformation on the calculation results of the modulation wave signal amplitude calculation module and the simulated rotor motion equation module to obtain the modulation wave signals ua, ub, uc, and to generate PWM control signals according to a PWM algorithm to drive the three-phase inverter bridge, thereby realizing the inversion function.
3. The adaptive-adjustment inverter controller according to claim 1, wherein a calculation formula of the output active reactive power and terminal voltage effective value calculation module is as shown in formula (1):
[Formula (1), rendered as an image in the original: calculation of the active power P, reactive power Q and output voltage amplitude U from the dq-axis components ud, uq, id and iq; s denotes the Laplace operator.]
in the formula, P is active power, Q is reactive power, U is an output voltage amplitude, and s is a Laplace operator;
the modulation wave signal amplitude calculation module is used for calculating a modulation wave amplitude Eq, and the formula is as follows:
Eq=∫Ke[(Uref-U)-n(Q-Qref)]d(t) (2)
in the formula, Uref is the set value of the inverter terminal voltage; Qref is the set value of the inverter reactive power; Ke is the amplification gain, and n is the droop coefficient of the reactive-voltage link; Eq is the q-axis component of the inverter modulation wave signal, and the 0-axis and d-axis components of the modulation wave signal are both 0.
4. The adaptive-adjustment inverter controller according to claim 1, wherein the simulated rotor motion equation module is configured to calculate the virtual angular frequency ω of the inverter and the phase angle θ of the modulation wave, with the calculation formulas shown in formulas (3) and (4):
[Formula (3), rendered as an image in the original: the simulated rotor motion (swing) equation relating J, D, m, Pref, P, ω and ω0.]
θ=∫ω d(t) (4)
in the formula, J is the virtual moment of inertia of the inverter; m is the active droop coefficient; Pref is the set value of the active power output by the inverter; ω0 is the rated angular frequency of the power system in which the inverter is located; D is the virtual damping coefficient of the inverter; θ is the phase of the inverter modulation wave.
5. The adaptive-tuning inverter controller according to claim 1, wherein the expression of the control action set a in step 1 is as follows:
a∈{(J,D) | J∈[Jmin,Jmax], D∈[Dmin,Dmax]} (5)
in the formula, Jmin and Jmax are the bounds of the preset range of the virtual moment of inertia; Dmin and Dmax are the bounds of the preset range of the virtual damping coefficient;
the expression of the state set s in step 1 is as follows:
[Formula (6), rendered as an image in the original: the state set s, made up of the deviation combinations (ΔP, Δω).]
in the formula, ΔP is the difference between the inverter output active power at a given time and its set value; Δω is the rotational speed deviation of the virtual synchronous generator at that time, and (ΔP, Δω) represents the deviation combination of the inverter output power P and the virtual angular frequency ω at that time.
6. The adaptive-tuning inverter controller according to claim 1, wherein the reward function R in step 1 is expressed as follows:
Ri=-∫_t^(t+step)(aP|ΔP|+aω|Δω|)dt (7)
in the formula, Ri is the reward value obtained in the ith iteration; t is a given time; t + step is the time one learning time step later; aP and aω are the weight coefficients of the active power deviation and the rotational speed deviation in the evaluation index, respectively;
the expression of the Q value updating strategy in the step 1 is as follows:
Q(si,ai) ← Q(si,ai) + α[Ri + γ·max_a Q(si+1,a) − Q(si,ai)] (8)
in the formula, si and ai denote the state and action of the ith iteration; Q(si,ai) is the Q value corresponding to si and ai in the ith iteration; α is the learning factor, indicating the degree of confidence given to the newly learned update; γ is the discount factor, determining how strongly returns distant in time affect the current value.
7. The adaptive-adjustment inverter controller according to claim 6, wherein the Q learning algorithm in step 2 is a cyclic iterative process in which the preliminarily optimized controller parameters are obtained through offline learning, comprising the following steps:
S11: initialize all reinforcement learning parameters, i.e. set the discount factor γ, the learning factor α, the probability ε of the ε-greedy strategy and the Q-table matrix convergence set value q, and initialize the Q table as a zero matrix;
S12: observe the state si at the current moment;
S13: according to the current state si, select one action ai from the action set represented by formula (5) according to the policy π;
S14: after the action is performed, observe the state si+1 at the next moment;
S15: calculate the reward value Ri under the current action from the reward function, formula (7);
S16: update the Q-table matrix by equation (8);
S17: if the algorithm convergence condition is not met, return to S12; otherwise, end the offline learning.
8. The adaptive-adjustment inverter controller according to claim 6, wherein in step 3 the Q learning algorithm continues to be used for online learning and further updating of the action strategy during operation, the Q learning algorithm comprising the following steps:
S21: observe the current state si;
S22: according to the current state si, select one action ai from the action set represented by formula (5) according to the policy π;
S23: after the action is performed, observe the state si+1 at the next moment;
S24: obtain the reward value Ri under the current action from the reward function;
S25: update the Q-table matrix by equation (8);
S26: set i = i + 1 and return to step S21.
9. The adaptive-tuning inverter controller according to claim 7 or 8, wherein the strategy pi is a greedy strategy.
CN201911165356.XA 2019-11-25 2019-11-25 Self-adaptive adjustment inverter controller Active CN110880774B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911165356.XA CN110880774B (en) 2019-11-25 2019-11-25 Self-adaptive adjustment inverter controller

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911165356.XA CN110880774B (en) 2019-11-25 2019-11-25 Self-adaptive adjustment inverter controller

Publications (2)

Publication Number Publication Date
CN110880774A CN110880774A (en) 2020-03-13
CN110880774B true CN110880774B (en) 2021-01-05

Family

ID=69730201

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911165356.XA Active CN110880774B (en) 2019-11-25 2019-11-25 Self-adaptive adjustment inverter controller

Country Status (1)

Country Link
CN (1) CN110880774B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111355234A (en) * 2020-03-18 2020-06-30 国网浙江嘉善县供电有限公司 Micro-grid frequency control method based on reinforcement learning
CN112187074B (en) * 2020-09-15 2022-04-19 电子科技大学 Inverter controller based on deep reinforcement learning
CN112187073B (en) * 2020-09-15 2021-11-09 电子科技大学 Inverter controller with additional damping control
CN113098058B (en) * 2021-04-06 2023-05-02 广东电网有限责任公司电力科学研究院 Self-adaptive optimization control method, device, equipment and medium for moment of inertia
CN113131771B (en) * 2021-04-25 2022-09-27 合肥工业大学 Inverter optimization control method based on reinforcement learning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105006834A (en) * 2015-06-10 2015-10-28 合肥工业大学 Optimal virtual inertia control method based on virtual synchronous generator
CN107482939A (en) * 2017-09-08 2017-12-15 中南大学 A kind of inverter control method
CN109067220A (en) * 2018-07-16 2018-12-21 电子科技大学 A kind of circuit control device with damping Real Time Control Function

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105006834A (en) * 2015-06-10 2015-10-28 合肥工业大学 Optimal virtual inertia control method based on virtual synchronous generator
CN107482939A (en) * 2017-09-08 2017-12-15 中南大学 A kind of inverter control method
CN109067220A (en) * 2018-07-16 2018-12-21 电子科技大学 A kind of circuit control device with damping Real Time Control Function

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Cheng Chong et al., "Self-adaptive control method of rotor inertia of a virtual synchronous generator", Automation of Electric Power Systems, 2015-10-10, Vol. 39, No. 19, pp. 82-89 *

Also Published As

Publication number Publication date
CN110880774A (en) 2020-03-13

Similar Documents

Publication Publication Date Title
CN110880774B (en) Self-adaptive adjustment inverter controller
CN112187074B (en) Inverter controller based on deep reinforcement learning
CN110829461B (en) Inverter controller with function of participating in system low-frequency oscillation suppression
CN108418256B (en) Virtual synchronous machine self-adaptive control method based on output differential feedback
CN108493984B (en) Virtual synchronous generator control method suitable for photovoltaic grid-connected system
WO2021110171A1 (en) P-u droop characteristic-based virtual direct current motor control method
CN110212513B (en) Flexible virtual capacitor control method for stabilizing voltage fluctuation of direct-current micro-grid bus
CN106786733A (en) A kind of control method, the apparatus and system of virtual synchronous generator
WO2024021206A1 (en) Method and system for energy storage system control based on grid-forming converter, storage medium, and device
CN116014748A (en) Active support-based low-voltage ride through control method and device for energy storage converter
CN110266044B (en) Microgrid grid-connected control system and method based on energy storage converter
Hua et al. Research on power point tracking algorithm considered spinning reserve capacity in gird-connected photovoltaic system based on VSG control strategy
CN114759575A (en) Virtual synchronous double-fed fan subsynchronous oscillation suppression method and system
CN111193262B (en) Fuzzy self-adaptive VSG control method considering energy storage capacity and SOC constraint
CN110518625B (en) Grid-connected inverter direct-current component suppression method with variable learning rate BP-PID control
CN115903457B (en) Control method of low-wind-speed permanent magnet synchronous wind driven generator based on deep reinforcement learning
Pavković et al. Modeling, parameterization and damping optimum-based control system design for an airborne wind energy ground station power plant
CN116865331A (en) Virtual DC motor low voltage ride through method based on dynamic matrix predictive control
CN108736517B (en) VSG-based inverter type distributed power supply adaptive damping control method
CN110854903B (en) Island microgrid reactive power distribution control method based on self-adaptive virtual impedance
Dehghani et al. Dynamic behavior control of induction motor with STATCOM
CN110247427A (en) A kind of gird-connected inverter resonance intelligence suppressing method of electrical network parameter online recognition
CN112187073B (en) Inverter controller with additional damping control
Ge et al. Adaptive Virtual Synchronous Generator Modulation Strategy Based on Moment of Inertia, Damping Coefficient and Virtual Impedance
CN114696320B (en) New energy power generation equipment self-synchronous voltage source control and low voltage ride through control dual-mode switching control method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant