CN116581766A

CN116581766A - Enhanced online voltage control method for virtual power plants considering droop characteristics

Info

Publication number: CN116581766A
Application number: CN202310843169.2A
Authority: CN
Inventors: 李培帅; 沈嘉伟; 陈敏强; 韩静; 董彦昊; 王艺涵
Original assignee: Nanjing University of Science and Technology
Current assignee: Nanjing University of Science and Technology
Priority date: 2023-07-11
Filing date: 2023-07-11
Publication date: 2023-08-11
Anticipated expiration: 2043-07-11
Also published as: CN116581766B

Abstract

The invention discloses an enhanced online voltage control method for virtual power plants that considers droop characteristics. The method is directed at coupled, interconnected sub-networks. First, a Q-V droop control model with adjustable parameters is provided, taking into account the reactive power regulation capability of the inverters. Next, to optimize the overall operation of the system, a voltage control model is built with minimum network loss and the voltage constraint as objectives, combined with the system power flow constraint, the safety constraint and the like. The voltage control model is then formulated as a partially observable Markov decision process and solved with an improved MADDPG algorithm, finally yielding a control decision that automatically adjusts the intercept of the droop curve and thereby realizes real-time voltage control of the target power grid system by the virtual power plants. The method addresses both voltage regulation and network power loss reduction in virtual power plants, combines the traditional physical model of the power grid with a data-driven approach, improves computational efficiency, makes the control strategy fit actual conditions more closely, and yields an effective control strategy.

Description

Enhanced online voltage control method for virtual power plants considering droop characteristics

Technical Field

The invention belongs to the field of data-driven methods applied to virtual power plants, and particularly relates to an enhanced online voltage control method for virtual power plants that considers droop characteristics.

Background

Virtual power plants (VPPs) are one of the key technologies for realizing smart distribution networks. Clean energy sources, controllable loads and energy storage systems installed at scattered locations in the power grid are aggregated through a distributed energy management system and participate in grid operation as a single special power plant. This coordinates the conflict between the smart grid and distributed energy resources, and fully exploits the value and benefit that distributed energy brings to the grid and to users.

VPPs can broadly be classified into commercial VPPs (CVPPs) and technical VPPs (TVPPs), which focus on economic profit and on power system performance, respectively. Unlike a CVPP, a TVPP mainly focuses on optimal scheduling and real-time operation in order to balance power and ensure system security.

To address voltage regulation and reduce network power losses, volt/VAR control (VVC) methods have been proposed for TVPPs. Among them, inverter-based VVC has been widely studied because of the increasing availability of inverters, their flexibility in reactive power compensation, and the fast response of advanced power electronics. The inverter-based VVC optimization problem can be solved at a central level, and centralized optimization methods have been developed for it. However, the centralized approach requires a complex communication network and often suffers from privacy, complexity and scalability issues. VVC schemes based on distributed control frameworks are increasingly used because they do not require powerful communication and computing capabilities and can effectively protect privacy.

At present, the VVC optimization problem is mostly formulated with an accurate mathematical model and then solved by mathematical methods. This approach is computationally expensive and has high communication requirements, so it cannot meet the need for real-time control when a large number of distributed resources are connected. Many new approaches have been proposed to improve computational efficiency, including data-driven methods for solving the optimization problem. Data-driven methods, represented by the MADDPG algorithm, have many advantages, the most prominent being that a trained neural network can make a decision within milliseconds regardless of the scale of the grid system and the complexity of the operating conditions. Moreover, because the MADDPG algorithm is based on a partially observable Markov decision process, it remains applicable when no accurate model is available, and it can solve the VVC problem of VPPs over a wide area, so the method is scalable. However, conventional data-driven methods only train a strategy toward the global optimum from a large amount of historical data and ignore the conventional physical model of the power grid, so the trained strategy sometimes does not match actual conditions. The conventional physical model contains a great deal of valuable physical information and physical laws; combining it with a data-driven approach can effectively overcome the shortcomings of purely data-driven or purely model-driven methods.

Disclosure of Invention

The invention provides an enhanced online voltage control method for virtual power plants that considers droop characteristics. It addresses voltage regulation and network power loss reduction in a virtual power plant (VPP), combines the traditional physical model of the power grid with a data-driven approach, improves computational efficiency, makes the control strategy fit actual conditions more closely, and yields an effective control strategy.

The invention adopts the following technical scheme:

An enhanced online voltage control method for virtual power plants considering droop characteristics is directed at the coupled, interconnected sub-networks into which a target power grid system is divided by region, each sub-network corresponding to one virtual power plant. Real-time voltage control of the target power grid system by the virtual power plants is realized through the following steps:

Step 1: for each inverter contained in each sub-network, construct an inverter Q-V droop control model with adjustable droop parameters;

Step 2: for each virtual power plant, with active power loss minimization and the voltage constraint as objectives, and combining the constraint of the inverter Q-V droop control model with adjustable droop parameters, the safety constraint and the system power flow constraint, establish the voltage control model corresponding to that virtual power plant;

Step 3: for the voltage control model corresponding to each virtual power plant, formulate the voltage control model as a partially observable Markov decision model and solve it to obtain the voltage control strategy model corresponding to that virtual power plant;

Step 4: based on the voltage control strategy model corresponding to each virtual power plant, each virtual power plant controls the droop parameters of its inverters in real time according to the state of its sub-network; real-time voltage control of the target power grid system by the virtual power plants is then realized through the inverter Q-V droop control model with adjustable droop parameters.

As a preferred technical scheme of the invention, the inverter Q-V droop control model with adjustable droop parameters is expressed in terms of the following quantities: the active power emitted at time t by the inverter connected at node b; the reactive power emitted at time t by the inverter connected at node b; the complex power capacity of the inverter connected at node b, where PV denotes an inverter; the index set of network nodes in sub-network m at which a PV inverter is installed; the node set of sub-network m; the slope of the droop curve; n, the total number of network nodes in the target power grid system; the sensitivity of the voltage amplitude at node b to the reactive power injected at that node; the maximum reactive power that the inverter connected at node b can output at time t; the voltage amplitude at node b at time t; and the droop parameter at time t of the inverter connected at node b, which is the intersection of that inverter's droop curve with the x-axis representing voltage.

As a preferred technical scheme of the invention, the voltage control model consists of an objective function and constraint conditions expressed in terms of the following quantities: the active power loss of sub-network m at time t; the voltage constraint of sub-network m at time t; m, a sub-network; M, the total number of sub-networks; T, the preset total duration; a preset weight coefficient; the active power emitted at time t by the inverter connected at node b; the reactive power emitted at time t by the inverter connected at node b; the complex power capacity of the inverter connected at node b, where PV denotes an inverter; the index set of network nodes in sub-network m at which a PV inverter is installed; the node set of sub-network m; the slope of the droop curve; n, the total number of network nodes in the target power grid system; the sensitivity of the voltage amplitude at node b to the reactive power injected at that node; the maximum reactive power that the inverter connected at node b can output at time t; the voltage amplitude at node b at time t; the droop parameter at time t of the inverter connected at node b, which is the intersection of that inverter's droop curve with the x-axis representing voltage; the network voltage reference value; the voltage amplitude of node c at time t; the active power loss at time t of the branch between nodes c and e; the branch index set of sub-network m; the resistance of the branch between nodes c and e; the reactance of the branch between nodes c and e; the voltage amplitude of node e at time t; the active power injected at node i at time t; the reactive power injected at node i at time t; B, the branch index set of the target power grid system; the voltage amplitude of node i at time t; the voltage amplitude of node j at time t; the conductance of the bus between node i and node j; the susceptance of the bus between node i and node j; the voltage phase angle difference between node i and node j at time t; and N, the set of nodes in the target power grid system.

In step 3, the voltage control model corresponding to each virtual power plant is formulated as a partially observable Markov decision model and solved with the MADDPG algorithm to obtain the voltage control strategy model corresponding to that virtual power plant.

As a preferred technical scheme of the invention, during the solution process of the MADDPG algorithm, the reward function of any training episode in which the network solution fails is adjusted by a formula over the following quantities: the reward function of sub-network m; a preset discount coefficient; the reward function of sub-network m at time t; the duration for which training succeeded in the episode of sub-network m in which the solution failure occurs; f_m, a preset penalty constant with f_m > 6n, where n is the total number of nodes of the target power grid system; a constant coefficient of sub-network m; the time point of the last successful solution of sub-network m before the solution failure; and the longest training time point of each training episode of sub-network m.

As a preferred technical scheme of the invention, the partially observable Markov decision model in step 3 takes each virtual power plant as an agent, with the following elements:

the action space A comprises the droop parameter at time t of the inverter connected at node b, for every node b in the index set of network nodes of sub-network m at which an inverter is installed;

the state space O comprises the active power emitted at time t by the inverter connected at node b (PV denotes an inverter), the active power at time t of load d connected at node g, the reactive power at time t of load d connected at node g, and the voltage amplitude at node c at time t, where the nodes belong to the node set of sub-network m;

the reward function gives the reward of sub-network m at time t, and is formed from a preset weight coefficient, the voltage constraint of sub-network m at time t, and the total power loss of sub-network m at time t.

The beneficial effects of the invention are as follows. The invention provides an enhanced online voltage control method for virtual power plants that considers droop characteristics. It is directed at coupled, interconnected sub-networks, and provides a Q-V droop control model with adjustable parameters that takes into account the reactive power regulation capability of photovoltaic inverters. To optimize the overall operation of the system, a virtual power plant voltage control model is then constructed with minimum network loss and the voltage constraint as objectives, combined with factors such as the system power flow constraint and the safety constraint. The voltage control model is further formulated as a partially observable Markov decision process and solved with an improved MADDPG algorithm, which improves training efficiency. Finally, a globally optimized control decision that automatically adjusts the intercept of the droop curve is obtained, realizing real-time voltage control of the target power grid system by the virtual power plants. The method addresses both voltage regulation and network power loss reduction in virtual power plants, combines the traditional physical model of the power grid with a data-driven approach, improves computational efficiency, makes the control strategy fit actual conditions more closely, and yields an effective control strategy.

Drawings

FIG. 1 is a schematic diagram of a distributed control architecture according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of information interaction between VPPs according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of the inverter Q-V droop control according to an embodiment of the present invention;

FIG. 4 is a block diagram of an algorithm flow in an embodiment of the invention.

Detailed Description

The invention is further described below with reference to the accompanying drawings. The following examples will provide those skilled in the art with a more complete understanding of the invention, but are not intended to limit the invention in any way.

In this embodiment, as shown in FIG. 4, the method is directed at the coupled, interconnected sub-networks into which the target power grid system is divided by region, each sub-network corresponding to one virtual power plant, and real-time voltage control of the target power grid system by the virtual power plants is realized through the following steps.

In this embodiment, the coupled, interconnected sub-networks obtained by regional division of the target power grid system form a multi-agent distributed control architecture of coupled, interconnected VPPs. Each VPP is controlled by one agent, and the agents exchange information to ensure global optimality of the decisions; the distributed architecture is shown in FIG. 1. In this architecture, each agent makes its voltage control decision from information internal to its VPP, such as network topology, line parameters, equipment capacity and location, and renewable generation and load forecasts, combined with the information exchanged with other agents. Information exchange between different agents takes boundary conditions as its carrier and is realized through consistency interaction rules. Take VPP m1 and VPP m2 in FIG. 2 as an example, where the photovoltaic units in the figure correspond to PV. VPP m1 requires a set of boundary information from VPP m2, comprising: the boundary quantities of node i and node j; the active power, reactive power and current amplitude of branch ij; and the virtual active and reactive loads corresponding to VPP m2. Here the nodes belong to the set of nodes (buses) of the target power grid system, with node 0 being the reference node, the branches belong to the branch index set of the target power grid system, and m1 and m2 belong to the index set of VPPs. Likewise, VPP m2 requires the corresponding set of boundary information from sub-network m1.

In this embodiment, the agents corresponding to the VPPs combine offline centralized training with online decentralized execution. Training yields the control strategy of each agent, so that output decisions can be made online in real time according to the real-time operating conditions of the system, realizing globally optimal voltage control.

Step 1: for each inverter contained in each sub-network, construct an inverter Q-V droop control model with adjustable droop parameters.

The inverter Q-V droop control model with adjustable droop parameters is expressed in terms of the following quantities: the active power emitted at time t by the inverter connected at node b; the reactive power emitted at time t by the inverter connected at node b; the complex power capacity of the inverter connected at node b, where PV denotes an inverter; the index set of network nodes in VPP sub-network m at which a PV inverter is installed; the node set of VPP sub-network m; the slope of the droop curve; n, the total number of network nodes in the target power grid system; the sensitivity of the voltage amplitude at node b to the reactive power injected at that node; the maximum reactive power that the inverter connected at node b can output at time t; the voltage amplitude at node b at time t; and the droop parameter at time t of the inverter connected at node b. The droop parameter is the intersection of that inverter's droop curve with the x-axis representing voltage, that is, the intercept of the droop curve. It is set as a controllable parameter so that the droop curve can be shifted along the x-axis representing voltage magnitude as the situation requires.

The reactive power output by each inverter is also limited by its complex power capacity; positive values indicate injection of reactive power into the grid and negative values indicate absorption of reactive power from the grid. A schematic diagram of the inverter Q-V droop curve is shown in FIG. 3, which marks the optimal droop intercept at time t of the inverter connected at node i and the candidate droop curve intercepts at time t of that inverter. In actual use, the agent obtains the optimal droop curve intercept from the current state and the trained strategy, which fixes the position of the inverter's Q-V droop curve; the inverter then produces the reactive power output given by the current Q-V droop curve at the actual voltage of its installation node. Because the inverters change the reactive power state of the system, and reactive power strongly influences the voltage level at system nodes, real-time voltage control of the target power grid system by the virtual power plants is realized.
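The behaviour described above can be illustrated with a minimal sketch. The formula itself is not reproduced in this text, so the linear droop shape, the positive gain `slope`, and the capacity limit computed as the square root of capacity squared minus active power squared are assumptions made for illustration; the function and parameter names are hypothetical.

```python
import math


def droop_reactive_power(v, v0, slope, p_pv, s_pv):
    """Reactive power set-point from a linear Q-V droop curve (illustrative).

    v      : measured voltage magnitude at the inverter node (p.u.)
    v0     : droop intercept, the voltage at which Q = 0 (p.u.)
    slope  : assumed positive droop gain (p.u. reactive power per p.u. voltage)
    p_pv   : active power currently produced by the inverter (p.u.)
    s_pv   : apparent power capacity of the inverter (p.u.)
    """
    # Assumed capacity limit: reactive headroom left after active power.
    q_max = math.sqrt(max(s_pv ** 2 - p_pv ** 2, 0.0))
    # Below the intercept the inverter injects (Q > 0); above it absorbs (Q < 0).
    q = -slope * (v - v0)
    # Clamp the droop output to the inverter's reactive capability.
    return max(-q_max, min(q_max, q))
```

Moving `v0` shifts the whole curve along the voltage axis, which is the single degree of freedom the agent controls in this scheme: with the same measured voltage, a higher intercept yields more injection and a lower one more absorption.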

Step 2: for each virtual power plant, with active power loss minimization and the voltage constraint as objectives, and combining the constraint of the inverter Q-V droop control model with adjustable droop parameters, the safety constraint and the system power flow constraint, establish the voltage control model corresponding to that virtual power plant.

In this embodiment, active power loss minimization and the voltage constraint are the two objectives, and they apply to all VPP sub-networks within the system. The multi-objective function is converted into an equivalent single-objective function with weighting factors by the classical weighted-sum method. For sub-network m, the objective is expressed as the weighted sum of the normalized objectives, using a weight coefficient. Because the voltage constraint penalty is 0 when the voltages of all nodes in the network are within the specified range, a larger weight is given to the voltage constraint penalty; a typical value here is 0.8.
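A minimal sketch of such a weighted-sum objective follows. The exact expression is not reproduced in this text, so the form `weight * voltage_term + (1 - weight) * loss_term`, the voltage band of 0.95 to 1.05 p.u., and the omission of normalization are all assumptions; the function names are hypothetical.

```python
def voltage_deviation_penalty(voltages, v_min=0.95, v_max=1.05):
    """Sum of violations outside [v_min, v_max]; zero when all nodes comply."""
    return sum(max(v_min - v, 0.0) + max(v - v_max, 0.0) for v in voltages)


def weighted_objective(p_loss, voltages, weight=0.8):
    """Weighted sum with `weight` on the voltage term and the rest on losses."""
    penalty = voltage_deviation_penalty(voltages)
    return weight * penalty + (1.0 - weight) * p_loss
```

When every node voltage lies inside the band the penalty term vanishes and only the loss term remains, which is why the larger weight on the voltage term does not distort the objective during normal operation.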

Further, integrating the above model with the basic power flow network model, the voltage control model consists of an objective function and constraint conditions expressed in terms of the following quantities: the active power loss of sub-network m at time t; the voltage constraint of sub-network m at time t, that is, the voltage deviation; m, a sub-network; M, the total number of sub-networks; T, the preset total duration; a preset weight coefficient; the active power emitted at time t by the inverter connected at node b; the reactive power emitted at time t by the inverter connected at node b; the complex power capacity of the inverter connected at node b, where PV denotes an inverter; the index set of network nodes in VPP sub-network m at which a PV inverter is installed; the node set of VPP sub-network m; the slope of the droop curve; n, the total number of network nodes in the target power grid system; the sensitivity of the voltage amplitude at node b to the reactive power injected at that node; the maximum reactive power that the inverter connected at node b can output at time t; the voltage amplitude at node b at time t; the droop parameter at time t of the inverter connected at node b, which is the intersection of that inverter's droop curve with the x-axis representing voltage; the network voltage reference value; the voltage amplitude of node c at time t; the active power loss at time t of the branch between nodes c and e; the branch index set of sub-network m; the resistance of the branch between nodes c and e; the reactance of the branch between nodes c and e; the voltage amplitude of node e at time t; the active power injected at node i at time t; the reactive power injected at node i at time t; B, the branch index set of the target power grid system; the voltage amplitude of node i at time t; the voltage amplitude of node j at time t; the conductance of the bus between node i and node j; the susceptance of the bus between node i and node j; the voltage phase angle difference between node i and node j at time t; and N, the set of nodes in the target power grid system.

In the above model, the system power flow constraint applies to the global network; the constraint on the instantaneous maximum reactive power output of the photovoltaic inverter is a safety constraint; and the remaining constraints are the voltage constraints.
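The power flow constraint is defined over bus voltage amplitudes, conductances, susceptances and phase angle differences, which matches the standard polar form of the AC power flow equations. The sketch below evaluates that standard form with NumPy; it is offered as an illustration of the textbook equations rather than as the patent's exact constraint, and the function name is hypothetical.

```python
import numpy as np


def power_injections(v, theta, g, b):
    """Net active/reactive injections from the polar AC power flow equations.

    v, theta : bus voltage magnitudes (p.u.) and angles (rad)
    g, b     : real and imaginary parts of the bus admittance matrix
    """
    v = np.asarray(v, dtype=float)
    theta = np.asarray(theta, dtype=float)
    dtheta = theta[:, None] - theta[None, :]  # theta_i - theta_j for all pairs
    # P_i = V_i * sum_j V_j (G_ij cos(theta_ij) + B_ij sin(theta_ij))
    p = v * ((g * np.cos(dtheta) + b * np.sin(dtheta)) @ v)
    # Q_i = V_i * sum_j V_j (G_ij sin(theta_ij) - B_ij cos(theta_ij))
    q = v * ((g * np.sin(dtheta) - b * np.cos(dtheta)) @ v)
    return p, q
```

A power flow solver drives the mismatch between these computed injections and the specified generation and load at each bus to zero; summing the active injections over all buses gives the total active power loss of the network.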

Step 3: for the voltage control model corresponding to each virtual power plant, formulate the voltage control model as a partially observable Markov decision model and solve it to obtain the voltage control strategy model corresponding to that virtual power plant.

The Markov decision process is the mathematical basis for solving an optimization problem with reinforcement learning. Before a specific reinforcement learning algorithm is applied, the optimization problem must be formulated as a Markov decision process, which is a mathematical model of sequential decision making consisting of three basic elements: states, actions and rewards. Specifically, the partially observable Markov decision model in step 3 takes each virtual power plant as an agent, with the following elements:

the action space A comprises the droop parameter at time t of the inverter connected at node b, for every node b in the index set of network nodes of VPP sub-network m at which an inverter is installed. That is, the action of each agent is the droop parameter at time t of each inverter connected to a node in the agent's own sub-network. From this droop parameter, the reactive power the inverter must output can be calculated from the actual voltage using the inverter Q-V droop control model. The inverters thereby change the reactive power state of the system, and because reactive power strongly influences the voltage level at system nodes, real-time voltage control of the target power grid system by the virtual power plants is realized;

the state space O comprises the active power emitted at time t by the inverter connected at node b (PV denotes an inverter), the active power at time t of load d connected at node g, the reactive power at time t of load d connected at node g, and the voltage amplitude at node c at time t, where the nodes belong to the node set of VPP sub-network m. That is, the state consists of the active and reactive power at time t of all nodes with connected loads in the agent's sub-network, the voltage amplitude at time t of all nodes in the sub-network, and the active power emitted at time t by each inverter connected to a node in the sub-network. All data in the state space are observed only by the agent corresponding to the VPP sub-network in which the node is located;

the reward function gives the reward of sub-network m at time t, and is formed from a preset weight coefficient, the voltage constraint of sub-network m at time t, and the total power loss of sub-network m at time t.
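The three elements can be sketched as follows for a single agent. The ordering of the observation vector, the sign convention of the reward (negative weighted cost), and the default weight of 0.8 are assumptions chosen for illustration, since the reward formula is not reproduced in this text; the function names are hypothetical.

```python
import numpy as np


def build_observation(p_pv, p_load, q_load, v):
    """Stack one agent's local measurements into a flat observation vector."""
    parts = [np.ravel(p_pv), np.ravel(p_load), np.ravel(q_load), np.ravel(v)]
    return np.concatenate(parts).astype(float)


def step_reward(v_penalty, p_loss, weight=0.8):
    """Negative weighted cost, so maximising reward minimises the cost."""
    return -(weight * v_penalty + (1.0 - weight) * p_loss)


# Example: one inverter, two loads and three nodes in the sub-network.
obs = build_observation([0.5], [0.2, 0.3], [0.05, 0.1], [1.0, 0.99, 1.01])
```

Each agent sees only the measurements of its own sub-network, which is what makes the decision process partially observable; the action it returns would be one droop intercept per installed inverter.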

Furthermore, the goal of each agent is to maximize its expected return over the preset total duration T. This return is computed from a preset discount coefficient and the reward of the m-th VPP at time t, where T is the preset duration. The expected return R of the target power grid system over the time range T is computed over M, the index set of VPPs, and the resulting R can be observed by all agents.
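As an illustration of these two quantities, the sketch below computes a discounted return per agent and a system-level return. The expressions are not reproduced in this text, so the exponent starting at zero and the system return being a plain sum over VPPs are assumptions; the function names are hypothetical.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over one agent's rewards (t starts at 0)."""
    total = 0.0
    # Accumulate backwards so each step adds its reward to the discounted tail.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def system_return(rewards_by_agent, gamma):
    """System-level return: the sum of every agent's discounted return."""
    return sum(discounted_return(r, gamma) for r in rewards_by_agent.values())


example = system_return({"m1": [1.0, 1.0, 1.0], "m2": [2.0]}, gamma=0.5)
```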

Next, the function giving the expected cumulative return for the current state or state-action pair, that is, the state-action value function, is defined in terms of the policy, the expectation under that policy, the set of states at time t, and the set of actions at time t.

Because the policy is deterministic, the state-action value function is decomposed with the Bellman equation to obtain the function under the optimal policy, which is expressed in terms of the optimal policy, the set of next states, and the set of next actions.

In step 3, the voltage control model corresponding to each virtual power plant is formulated as a partially observable Markov decision model and solved with the MADDPG algorithm, that is, the multi-agent deep deterministic policy gradient algorithm, to obtain the voltage control strategy model corresponding to that virtual power plant.
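Two building blocks of a generic MADDPG update, the Bellman target used to train each critic and the soft update of the target networks, can be sketched with NumPy as below. This reflects the standard published algorithm, not any detail specific to the improved variant described here; the function names and the default `tau` are assumptions.

```python
import numpy as np


def td_target(reward, next_q, gamma, done):
    """Bellman target y = r + gamma * Q'(o', a') for non-terminal transitions."""
    reward = np.asarray(reward, dtype=float)
    next_q = np.asarray(next_q, dtype=float)
    done = np.asarray(done, dtype=float)
    # Terminal transitions (done = 1) keep only the immediate reward.
    return reward + gamma * (1.0 - done) * next_q


def soft_update(target_params, online_params, tau=0.01):
    """Polyak averaging of target-network parameters towards the online ones."""
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]
```

In MADDPG the critic of each agent is trained centrally on the observations and actions of all agents, while each actor uses only its own observation at execution time, which matches the offline centralized training and online decentralized execution used in this embodiment.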

In the solution process of the MADDPG algorithm, a network solution failure penalty is added for training episodes in which the network solution fails, in order to raise the success rate of the network solution and achieve fast convergence of training. When a network solution failure occurs during training because of the carrying capacity of the power network, training of that episode is terminated. To reduce the occurrence of such situations and increase the convergence speed of training, a network solution failure penalty function is added, in which f_m is the penalty constant, a large positive number whose value is greater than 6n, where n is the total number of nodes of the network. The reward value obtained through the network solution failure penalty function increases as the training duration of the episode increases. When no termination occurs during training, the network solution failure penalty equals 0 and does not affect the final converged reward value. The reward of each training episode interrupted by a network solution failure is then adjusted to include this penalty.
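The penalty formula is not reproduced in this text. The sketch below shows one hypothetical shape that satisfies the properties stated above: the penalty is zero when no failure occurs, it uses a constant larger than 6n, and it becomes less severe the longer the episode ran before failing. The linear dependence on elapsed time, the choice `6 * n_nodes + 1`, and all names are assumptions.

```python
def solve_failure_penalty(t_last_success, t_max, n_nodes, failed, scale=1.0):
    """Illustrative penalty: 0 without failure, milder for longer episodes."""
    if not failed:
        return 0.0
    f_m = 6 * n_nodes + 1  # hypothetical constant satisfying f_m > 6n
    # Fraction of the episode that was not completed scales the penalty.
    return -scale * f_m * (1.0 - t_last_success / t_max)


def adjusted_episode_reward(step_rewards, gamma, t_max, n_nodes, failed):
    """Discounted reward of the completed steps plus the failure penalty."""
    ret = sum(gamma ** t * r for t, r in enumerate(step_rewards))
    penalty = solve_failure_penalty(len(step_rewards), t_max, n_nodes, failed)
    return ret + penalty
```

Choosing a constant that dominates the per-step rewards ensures that an episode ending in failure is always scored below one that completes, while the time-dependent factor still rewards the agent for keeping the network solvable for longer.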

In this embodiment, the resulting MADDPG algorithm is planned as a whole, forming a strategy method that combines a data-driven approach with the physical network model and is applied to the VPP network. Offline training on a large amount of historical data yields an effective control strategy, so that output decisions can be made online in real time according to the real-time operating conditions of the virtual power plant.

The invention designs an enhanced online voltage control method for virtual power plants that considers droop characteristics. It is directed at coupled, interconnected sub-networks, and provides a Q-V droop control model with adjustable parameters that takes into account the reactive power regulation capability of photovoltaic inverters. To optimize the overall operation of the system, a virtual power plant voltage control model is then constructed with minimum network loss and the voltage constraint as objectives, combined with factors such as the system power flow constraint and the safety constraint. The voltage control model is further formulated as a partially observable Markov decision process and solved with an improved MADDPG algorithm, which improves training efficiency. Finally, a globally optimized control decision that automatically adjusts the intercept of the droop curve is obtained, realizing real-time voltage control of the target power grid system by the virtual power plants. The method addresses both voltage regulation and network power loss reduction in virtual power plants, combines the traditional physical model of the power grid with a data-driven approach, improves computational efficiency, makes the control strategy fit actual conditions more closely, and yields an effective control strategy.

Although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that the foregoing embodiments may be modified, or that some of their features may be replaced by equivalents. All equivalent structures derived from the specification and drawings of the invention, whether applied directly or indirectly in other related technical fields, likewise fall within the scope of the invention.

Claims

1. A virtual power plant strengthening online voltage control method considering droop characteristics, characterized in that, for the coupled and interconnected sub-networks of a target power grid system obtained by regional division, each sub-network corresponding to one virtual power plant, real-time voltage control of the target power grid system by the virtual power plants is realized through the following steps:

2. The virtual power plant enhanced on-line voltage control method considering droop characteristics according to claim 1, wherein the droop parameter-adjustable inverter Q-V droop control model is as follows:

，

in the formula, the quantities are defined as follows: the active power emitted at time t by the inverter connected at node b; the reactive power emitted at time t by the inverter connected at node b; the complex power capacity of the inverter connected at node b; PV represents an inverter; the network node index set of sub-network m in which PV is installed; the set of nodes in sub-network m; the slope of the droop curve; n, the total number of network nodes in the target power grid system; the sensitivity of the voltage amplitude at node b to the reactive power injected at that node; the maximum reactive power that the inverter connected at node b can output at time t; the voltage amplitude at node b at time t; and the droop parameter of the inverter connected at node b at time t, which is the intersection of the droop curve of the inverter connected at node b at time t with the x-axis representing voltage.

3. The virtual power plant enhanced on-line voltage control method considering droop characteristics according to claim 1, wherein the voltage control model is as follows:

，

constraint conditions:，

，

4. The virtual power plant strengthening online voltage control method considering droop characteristics according to claim 1, wherein in step 3, the voltage control model corresponding to each virtual power plant is formulated as a partially observable Markov decision model and solved with the MADDPG algorithm, so as to obtain the voltage control strategy model corresponding to each virtual power plant.

5. The virtual power plant strengthening online voltage control method considering droop characteristics according to claim 4, wherein, in the process of solving with the MADDPG algorithm, for a training episode in which the network solution fails, the reward function of that training episode is adjusted by the following formula:

，

wherein,

in the formulas, the quantities are defined as follows: the reward function of sub-network m; the preset discount coefficient; the reward function of sub-network m at time t; the duration of successful training of sub-network m in the training episode in which the solution failure occurs; f_m, the preset penalty occurrence constant, with f_m > 6n, where n represents the total number of nodes of the target power grid system; the constant coefficient of sub-network m; the time point of the last successful solution of sub-network m before the solution failure; and the longest training time point of each training episode of sub-network m.

6. The virtual power plant enhanced on-line voltage control method considering droop characteristics according to claim 1, wherein the partially observable Markov decision model in step 3 takes each virtual power plant as an agent,

the action space A includes: the droop parameter of the inverter connected at node b at time t, for the nodes b in the network node index set of sub-network m in which an inverter is installed;

the state space O includes: the active power emitted at time t by the inverter connected at node b, where PV represents an inverter; the active power at time t of load d connected at node g; the reactive power at time t of load d connected at node g; and the voltage magnitude at node c at time t; the associated index sets include the set of nodes in sub-network m;

reward function: the reward function of sub-network m at time t is defined in terms of a preset weight coefficient, the voltage constraint of sub-network m at time t, and the total power loss of sub-network m at time t.