CN115765035A - Flexible power distribution network disturbance recovery method suitable for full-time dynamic reconstruction


Info

Publication number
CN115765035A
Authority
CN
China
Prior art keywords
network
node
power distribution
dgi
distribution network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211308934.2A
Other languages
Chinese (zh)
Inventor
霍现旭
董雷
张磐
郑悦
梁海深
李占一
吴怡
张涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Tianjin Electric Power Co Ltd
North China Electric Power University
Electric Power Research Institute of State Grid Tianjin Electric Power Co Ltd
Baodi Power Supply Co of State Grid Tianjin Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Tianjin Electric Power Co Ltd
North China Electric Power University
Electric Power Research Institute of State Grid Tianjin Electric Power Co Ltd
Baodi Power Supply Co of State Grid Tianjin Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, State Grid Tianjin Electric Power Co Ltd, North China Electric Power University, Electric Power Research Institute of State Grid Tianjin Electric Power Co Ltd, Baodi Power Supply Co of State Grid Tianjin Electric Power Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN202211308934.2A priority Critical patent/CN115765035A/en
Publication of CN115765035A publication Critical patent/CN115765035A/en
Pending legal-status Critical Current

Landscapes

  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The invention relates to a flexible power distribution network disturbance recovery method suitable for full-time dynamic reconstruction, comprising the following steps: step 1, initializing to generate an initial network structure, and importing photovoltaic, wind power and load data; step 2, based on the initial network structure generated in step 1, generating the set of network topologies that can be adopted after a line of the flexible power distribution network fails; step 3, based on the feasible network topologies generated in step 2, establishing a mathematical model of optimized operation after fault recovery of the flexible power distribution network; step 4, based on the mathematical model established in step 3, establishing a multi-type-variable dual-agent reinforcement learning collaborative optimization model; and step 5, outputting the dynamic reconstruction and the regulation and control strategy of the controllable active equipment by online decision. The invention addresses the problems that the agent is extremely difficult to optimize, inefficient, and hard to converge.

Description

Flexible power distribution network disturbance recovery method suitable for full-time dynamic reconstruction
Technical Field
The invention belongs to the technical field of optimized operation of a power distribution network containing a renewable distributed power supply, and relates to a flexible power distribution network disturbance recovery method, in particular to a flexible power distribution network disturbance recovery method suitable for full-time dynamic reconstruction.
Background
With economic development and advances in science and technology, users place correspondingly higher demands on the reliability of power supply. The power distribution network is the terminal of the power system and directly determines whether users can use electricity safely and reliably. Because the power distribution network is designed as a closed loop but operated as an open loop, it can meet the N-1 security criterion: when a distribution line fails and is disconnected, the system can reach stable operation without load shedding by adjusting its topology. However, this adjustment only restores supply to all loads and does not optimize the topology; power supply reliability is ensured while the operating economy of the system is ignored. Meanwhile, the integration of volatile and uncertain renewable distributed power supplies inevitably disturbs the operation of the power distribution network and degrades the quality of the electric energy supplied to users. Therefore, during fault recovery of the power distribution network, a reasonable choice of topology can effectively mitigate the voltage problems caused by connecting renewable distributed power supplies to the network.
With the development of power electronics, the intelligent soft switch (soft open point, SOP) has attracted increasing attention because of its flexible regulation and control. The SOP is a power electronic device installed on a tie line. Compared with a traditional tie switch, it provides continuous and smooth reactive power compensation and fast, accurate active power control, and is one of the effective means available during system fault recovery. However, because the SOP is expensive, it can only partially replace traditional tie switches in the short term, and research on the cooperative optimization of SOPs and traditional tie switches is still scarce. In addition, this cooperative optimization problem contains continuous variables and a large number of discrete variables and is large in scale, so it is difficult to solve and the solving efficiency is low. Existing solution methods, such as mathematical programming and heuristic algorithms, must simplify the physical model to reduce the problem size, for example by merging time periods with a specific clustering method or with manually set indices before and after network reconstruction. Such measures make the procedure cumbersome, and the subjectivity of the clustering process prevents the best result from being reached. Moreover, the simplified model is still slow to solve, easily trapped in local optima, and difficult to converge.
Generally speaking, for the technical field of optimized operation of a power distribution network containing renewable distributed power sources, a method for flexible power distribution network disturbance recovery suitable for full-time dynamic reconstruction is still lacking at present.
A search found no prior-art patent documents that are identical or similar to the invention.
Disclosure of Invention
The invention aims to overcome the defects of a theoretical support system based on experience in the prior art, and provides a flexible power distribution network disturbance recovery method suitable for full-time dynamic reconstruction.
The invention solves the practical problem by adopting the following technical scheme:
a flexible power distribution network disturbance recovery method suitable for full-time dynamic reconstruction comprises the following steps:
step 1, initializing to generate an initial network structure, and importing photovoltaic, wind power and load data;
step 2, based on the initial network structure generated in step 1, generating the set of network topologies that can be adopted after a line of the flexible power distribution network fails;
step 3, based on the feasible network topologies generated in step 2, establishing a mathematical model of optimized operation after fault recovery of the flexible power distribution network;
step 4, based on the mathematical model of optimized operation after fault recovery established in step 3, establishing a multi-type-variable dual-agent reinforcement learning collaborative optimization model;
step 5, outputting the dynamic reconstruction and the regulation and control strategy of the controllable active equipment by online decision.
Further, the specific steps of step 2 include:
step 2.1, deleting the fault line and simplifying the network: deleting the fault line from the initial network structure G0, then merging the branches connected to each node of degree 2 and treating them as a single branch, to obtain a simplified network G1;
step 2.2, generating all spanning trees of the simplified network by polynomial multiplication: based on the node-branch information of G1, choosing a root node and forming one basic term of the polynomial for each non-root node, each basic term being the sum of the labels of all branches connected to that node;
step 2.3, mapping the spanning trees of the simplified network to spanning trees of the original network: each spanning tree of the simplified network corresponds to several spanning trees of the original network; for each tree branch of a spanning tree of the simplified network, all corresponding branches of the initial network structure are kept; for each link branch of the simplified network (a branch outside the spanning tree), any one of the corresponding branches of the initial network structure may be opened, which generates the set of network topologies that can be adopted after a line of the flexible power distribution network fails.
Further, the specific steps of step 3 include:
step 3.1, establishing an optimized operation objective function after the fault of the flexible power distribution network is recovered;
the objective function considers the accumulated active power loss value of the system in the optimization period after the disturbance occurs, the switching action times in the network reconstruction process and the renewable energy consumption level, and is shown as formula (1):
Figure BDA0003907137130000031
in the formula: t is simulation duration, the method is set to be one day, optimization is carried out by taking hours as units, and n is the number of nodes of the system. P is i (t) is the active power injected at node i during time t. T is a unit of j (t) is the state of the switch j in the time period t, when the switch is closed, the state is 1, and when the switch is opened, the state is 0; p is DGi (t) is the active power injected by the distributed power supply at node i during time t,
Figure BDA0003907137130000041
the active power injected by the distributed power supply at the node i in the t period is predicted. n is DG Is the total number of distributed power supplies. The network topology and the injected power of each node in a unit time interval are assumed to be unchanged. w is a loss 、w t 、w r The impact coefficients on the importance of the objective function are respectively consumed for network losses, switching action times and renewable energy.
Step 3.2, establishing constraint conditions for optimizing operation after fault recovery of the flexible power distribution network
1) Radial constraint
G(t)∈G (2)
In the formula: G(t) is the network structure adopted in period t; G is the set of network topologies that disregards the branches where SOPs are located, strictly satisfies the radial constraint, and does not contain the fault line.
2) Switch action times constraint
Figure BDA0003907137130000042
Figure BDA0003907137130000043
In the formula: T_max is the maximum total number of switching actions allowed for network reconstruction within the optimization period, and T_j,max is the maximum number of actions allowed for switch j within the optimization period. In the invention, the optimization period is 24 hours, the maximum total number of switching actions is 15, and the maximum number of actions of a single switch is 3. Ω_node is the set of system nodes.
3) SOP constraint
The SOP involves the transmission active power constraint and the two-side capacity constraint as follows:
Figure BDA0003907137130000044
in the formula: S_k1max and S_k2max are the maximum capacities of the converters on the two sides of the SOP; Ω_SOP is the set of branches where SOPs are located.
4) Flow balance constraints
Figure BDA0003907137130000051
Figure BDA0003907137130000052
In the formula: P_i(t) and Q_i(t) are the active and reactive power injected at node i in period t; V_i(t) is the voltage amplitude of node i in period t; G_ij and B_ij are the conductance and susceptance entries of the node admittance matrix; θ_ij is the phase angle difference between node i and node j; P_DGi(t) and Q_DGi(t) are the active and reactive power injected by the distributed power supply at node i in period t; P_SOPi(t) and Q_SOPi(t) are the active and reactive power injected by the SOP at node i in period t; P_LDi(t) and Q_LDi(t) are the active and reactive power consumed by the load at node i in period t.
5) Node voltage constraint
V_min ≤ V_i(t) ≤ V_max, i ∈ Ω_node (8)
In the formula: V_min and V_max are the lower and upper limits of the node voltage amplitude allowed for system operation.
6) Branch current constraint
Figure BDA0003907137130000053
In the formula: I_ij(t) is the amplitude of the current flowing in branch ij in period t, and I_ijmax is the maximum current amplitude allowed on branch ij.
7) Distributed power supply output constraints
P_DGi,min ≤ P_DGi(t) ≤ P_DGi,max, i ∈ Ω_DG (10)
Q_DGi,min ≤ Q_DGi(t) ≤ Q_DGi,max, i ∈ Ω_DG (11)
In the formula: P_DGi,min and P_DGi,max are the minimum and maximum active power output of the distributed power supply connected to node i; Q_DGi,min and Q_DGi,max are the minimum and maximum reactive power output of the distributed power supply connected to node i;
further, the specific steps of step 4 include:
step 4.1: constructing the action space of the discrete agent for network topology optimization, namely the set of network topologies that can be adopted after a line of the flexible power distribution network fails. The action space of the discrete agent is: {G(t)}
step 4.2: constructing the action space of the continuous agent for optimizing the controllable active devices. The action space of the continuous agent is: {P_k1(t), P_k2(t), Q_k1(t), Q_k2(t), P_DG(t)}.
step 4.3: constructing the state spaces of the discrete agent and the continuous agent. The operating state of the system is described by the source-network-load state, so the state space is: {P_k1(t), P_k2(t), Q_k1(t), Q_k2(t), P_DG(t), G(t), P_L(t)}
Step 4.4: constructing a reward function of the discrete agent and the continuous agent:
Figure BDA0003907137130000061
Figure BDA0003907137130000062
in the formula: w_p and w_g are the weight coefficients of the network loss and the number of switching actions, respectively; M is a very large positive number, so that a large penalty is applied when the number of switching actions exceeds its limit.
The reward function of the continuous agent is calculated as follows: based on the network topology selected by the discrete agent, the continuous agent optimizes controllable devices such as the distributed power supplies and the SOP to reduce network loss and wind and solar curtailment; its reward function is shown in formula (15):
Figure BDA0003907137130000063
step 4.5: setting intelligent agent hyper-parameters;
step 4.6: training offline the dual-agent reinforcement learning model for optimized operation after a fault of the flexible power distribution network;
Further, the specific method of step 5 is as follows:
once the load and the wind power and photovoltaic prediction curves are obtained, the trained multi-type-variable dual-agent reinforcement learning collaborative optimization model directly produces the dynamic reconstruction and the regulation and control strategy of the controllable active devices, without running an optimization process.
The invention has the advantages and beneficial effects that:
1. The invention provides a flexible power distribution network disturbance recovery method suitable for full-time dynamic reconstruction. Its main purpose is to address the insufficient regulation capability of the traditional power distribution network when a large number of fluctuating distributed power supplies are connected and cause disturbances, which leads to large network loss, a low renewable energy consumption level, and frequent relay protection switching.
2. To further accelerate convergence, the invention adopts a DQN algorithm with prioritized sampling. In this algorithm, the larger the TD error of a sample, the greater its influence on the backpropagated gradient and the higher its probability of being sampled. This prevents uniform sampling from drowning out the valuable but scarce information in the experience pool, which would otherwise make the algorithm converge to an undesired suboptimal solution.
3. The invention trains two agents on multi-type variables to generate the optimal strategy, with different agents trained in coordination on the integer variables and the continuous variables, respectively. The two agents each output actions at specific time points and act on the same environment, and the updated environment state influences the agents' next actions (cooperative interaction). Assigning the actions to different agents also reduces the action dimension, so the agents converge more reliably.
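The sampling rule described in advantage 2 can be illustrated with a minimal Python sketch. It assumes the common proportional form of prioritized replay, in which a sample is drawn with probability proportional to (|TD error| + eps)^alpha. The exponent `alpha`, the offset `eps` and the function names are illustrative assumptions; this document does not specify them.

```python
import numpy as np

def sampling_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Turn TD errors into sampling probabilities (assumed proportional rule)."""
    # eps keeps samples with zero TD error from never being drawn.
    priorities = (np.abs(np.asarray(td_errors, dtype=float)) + eps) ** alpha
    return priorities / priorities.sum()

def sample_batch(td_errors, batch_size, rng=None, alpha=0.6, eps=1e-6):
    """Draw indices into the experience pool, favouring large TD errors."""
    rng = np.random.default_rng() if rng is None else rng
    probs = sampling_probabilities(td_errors, alpha=alpha, eps=eps)
    return rng.choice(len(probs), size=batch_size, p=probs)
```

With `alpha=0` the rule reduces to uniform sampling, which is the case the text warns against; larger values weight rare, high-error samples more strongly.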
Drawings
FIG. 1 is a schematic diagram of a simplified network structure based on polynomial multiplication of the present invention;
FIG. 2 is a diagram of the location and topology of the SOP of the present invention in a power distribution network;
FIG. 3 is a diagram of a multi-type variable dual agent reinforcement learning collaborative optimization model of the present invention;
fig. 4 is a topology diagram of an improved IEEE33 node power distribution network of the present invention.
Detailed Description
The embodiments of the invention are further described in the following with reference to the drawings:
a flexible power distribution network disturbance recovery method suitable for full-time dynamic reconstruction comprises the following steps:
step 1, initializing to generate an initial network structure, and importing photovoltaic, wind power and load data;
An initial network structure is generated by initialization, and photovoltaic data, wind power historical data and load data are imported. The fluctuation and uncertainty of the active power actually output by wind power and photovoltaics are described by superimposing a prediction error on the predicted power.
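A minimal Python sketch of "predicted power plus prediction error" is given below. The error distribution is not stated in this document; the sketch assumes a zero-mean Gaussian error whose standard deviation is a fixed fraction `rel_sigma` of the forecast, and clips the result to a physical range. These choices and all names are illustrative assumptions.

```python
import numpy as np

def sample_actual_output(forecast, rel_sigma=0.1, p_max=None, rng=None):
    """Return forecast + random error, clipped to [0, p_max]."""
    rng = np.random.default_rng() if rng is None else rng
    forecast = np.asarray(forecast, dtype=float)
    # Assumed error model: zero-mean Gaussian, std proportional to the forecast.
    error = rng.normal(0.0, rel_sigma * np.abs(forecast))
    upper = np.inf if p_max is None else p_max
    return np.clip(forecast + error, 0.0, upper)
```

Calling the function once per training episode yields a different realization of the wind or photovoltaic curve around the same forecast.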
Step 2, based on the initial network structure generated in step 1, generating the set of network topologies that can be adopted after a line of the flexible power distribution network fails;
As shown in fig. 1, based on the initial network structure generated in step 1, the initial network structure is reconstructed by polynomial multiplication to generate radial network topologies that satisfy network connectivity and the relay protection setting requirements. The feasible network topologies after a line fault are formed as follows:
the specific steps of the step 2 comprise:
step 2.1, deleting the fault line and simplifying the network: deleting the fault line from the initial network structure G0, then merging the branches connected to each node of degree 2 and treating them as a single branch, to obtain a simplified network G1;
step 2.2, generating all spanning trees of the simplified network by polynomial multiplication: based on the node-branch information of G1, choosing a root node and forming one basic term of the polynomial for each non-root node, each basic term being the sum of the labels of all branches connected to that node;
multiplying the basic terms, expanding the product and combining like terms, then deleting the high-power terms and the repeated like terms, which ensures that each remaining term is a connected, radial spanning tree;
step 2.3, mapping the spanning trees of the simplified network to spanning trees of the original network: each spanning tree of the simplified network corresponds to several spanning trees of the original network; for each tree branch of a spanning tree of the simplified network, all corresponding branches of the initial network structure are kept; for each link branch of the simplified network (a branch outside the spanning tree), any one of the corresponding branches of the initial network structure may be opened, which generates the set of network topologies that can be adopted after a line of the flexible power distribution network fails.
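Step 2.2 can be sketched in Python as follows. The sketch reads "deleting high-power terms" as dropping any product in which a branch label appears more than once, and "deleting like terms" as dropping any product that occurs more than once after expansion; the second reading is an interpretation of the text, not a statement from it. The function name and the edge-dictionary input format are illustrative assumptions, and the graph is assumed to have no self-loops.

```python
from collections import Counter
from itertools import product

def spanning_trees(edges, root):
    """edges: dict mapping branch label -> (node_u, node_v).

    Returns a set of frozensets of branch labels, one per spanning tree.
    """
    # Basic term of each node: the labels of all branches connected to it.
    incident = {}
    for label, (u, v) in edges.items():
        incident.setdefault(u, []).append(label)
        incident.setdefault(v, []).append(label)
    non_root = [node for node in incident if node != root]

    counts = Counter()
    # Expanding the product = picking one branch label per non-root node.
    for choice in product(*(incident[node] for node in non_root)):
        term = frozenset(choice)
        if len(term) < len(choice):
            continue  # high-power term: some branch was picked twice
        counts[term] += 1
    # A term that occurs more than once contains a loop and is discarded.
    return {term for term, count in counts.items() if count == 1}
```

For a triangle with branches a, b, c the product (a + c)(b + c) expands to ab + ac + bc + c², and dropping c² leaves the three spanning trees. The enumeration grows exponentially with the number of nodes, which is one reason to merge degree-2 nodes first as in step 2.1.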
Step 3, based on the feasible network topology structure generated in the step 2, establishing an optimized operation mathematical model after the fault of the flexible power distribution network is recovered;
the specific steps of the step 3 comprise:
step 3.1, establishing an optimized operation objective function after the fault of the flexible power distribution network is recovered;
The main purpose of studying optimized operation after fault recovery of the flexible power distribution network is, after a fault occurs, to open the circuit breakers on both sides of the fault point, to keep users' power consumption unaffected as far as possible by network reconstruction, and to make the operating economy of the power distribution network optimal after the network structure changes. The objective function considers the accumulated active power loss of the system over the optimization period after the disturbance, the number of switching actions during network reconstruction, and the renewable energy consumption level, as shown in formula (1):
Figure BDA0003907137130000091
in the formula: t is simulation duration, the method is set to be one day, optimization is carried out by taking hours as a unit, and n is the number of nodes of the system. P i (t) is the active power injected at node i during time t. T is a unit of j (t) is the state of the switch j in the period of t, when the switch is closed, the state is 1, and when the switch is opened, the state is 0; p DGi (t) is the active power injected by the distributed power supply at node i during time t,
Figure BDA0003907137130000092
and the active power predicted value injected by the distributed power supply at the node i in the t period. n is a radical of an alkyl radical DG Is the total number of distributed power sources. The network topology and the injected power of each node in a unit time period are assumed to be unchanged. w is a loss 、w t 、w r The impact coefficients on the importance of the objective function are consumed for network losses, switching action times and renewable energy, respectively.
The invention uses linear weighting to convert the multi-objective optimization into a single-objective optimization. The weight coefficients can be adjusted according to the dispatching requirements of the power distribution system, and their values can be determined by the analytic hierarchy process, subject to w_loss + w_t + w_r = 1.0.
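The weighted objective can be sketched in Python as below. Formula (1) is available only as an image in this text, so the way the three terms are built here is an assumption drawn from the symbol definitions: the loss is the sum of nodal active power injections, a switching action is a change of T_j between consecutive periods, and the renewable term is the gap between predicted and actual distributed generation. The function name and array layout are illustrative.

```python
import numpy as np

def objective(p_inj, switch_state, p_dg, p_dg_forecast, w_loss, w_t, w_r):
    """Weighted single objective (assumed form, to be minimized).

    p_inj:          (T, n) nodal active power injections P_i(t)
    switch_state:   (T + 1, m) 0/1 switch states, row 0 is the initial state
    p_dg:           (T, n_dg) actual distributed generation P_DGi(t)
    p_dg_forecast:  (T, n_dg) predicted distributed generation
    """
    # Summing the injections over all nodes gives the network loss per period.
    loss = np.asarray(p_inj, dtype=float).sum()
    # One action each time a switch changes state between two periods.
    actions = np.abs(np.diff(np.asarray(switch_state, dtype=int), axis=0)).sum()
    # Unused renewable output: predicted minus actually injected power.
    curtailed = (np.asarray(p_dg_forecast, dtype=float)
                 - np.asarray(p_dg, dtype=float)).sum()
    return w_loss * loss + w_t * actions + w_r * curtailed
```

In practice the three terms have different units and magnitudes, so they would normally be normalized before the weights are applied; the sketch leaves that step out.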
Step 3.2, establishing constraint conditions for optimizing operation after fault recovery of the flexible power distribution network
1) Radial constraint
G(t)∈G (2)
In the formula: G(t) is the network structure adopted in period t; G is the set of network topologies that disregards the branches where SOPs are located, strictly satisfies the radial constraint, and does not contain the fault line.
2) Switch action times constraint
During dynamic reconstruction of the power distribution network, frequent changes of the network topology degrade circuit breaker performance, shorten service life, and harm the transient stability of the system, so the number of circuit breaker actions within an optimization period must be reasonably limited.
Figure BDA0003907137130000101
Figure BDA0003907137130000102
In the formula: T_max is the maximum total number of switching actions allowed for network reconstruction within the optimization period, and T_j,max is the maximum number of actions allowed for switch j within the optimization period. In the invention, the optimization period is 24 hours, the maximum total number of switching actions is 15, and the maximum number of actions of a single switch is 3. Ω_node is the set of system nodes.
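The two limits can be checked with a short Python sketch. It assumes that one switching action is counted each time T_j changes between consecutive periods; formulas (3) and (4) are images in this text, so that counting rule and the function name are assumptions. The default limits 15 and 3 are the values stated above.

```python
import numpy as np

def switch_actions_ok(switch_state, total_max=15, per_switch_max=3):
    """switch_state: (T + 1, m) array of 0/1 states, row 0 = initial state."""
    states = np.asarray(switch_state, dtype=int)
    # Number of state changes of each switch over the optimization period.
    per_switch = np.abs(np.diff(states, axis=0)).sum(axis=0)
    within_total = per_switch.sum() <= total_max
    within_each = np.all(per_switch <= per_switch_max)
    return bool(within_total and within_each)
```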
3) SOP constraint
As shown in fig. 2, the SOP is composed of fully controlled power electronic devices. Because the intelligent soft switch is expensive to manufacture, it generally replaces only some of the tie switches in the distribution network.
The SOP is assumed to be in normal operation in all scenarios. Its control variables are the active powers P_k1 and P_k2 transmitted by the SOP and the reactive power support Q_k1 and Q_k2 provided at the connection nodes on the two sides of the SOP; the power loss generated during SOP operation is ignored. The SOP is subject to a transmitted active power constraint and capacity constraints on both sides:
Figure BDA0003907137130000103
in the formula: S_k1max and S_k2max are the maximum capacities of the converters on the two sides of the SOP; Ω_SOP is the set of branches where SOPs are located.
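A feasibility check for one SOP can be sketched in Python. Formula (5) is an image in this text; the sketch assumes the usual lossless form, in which the active powers on the two sides sum to zero and the apparent power of each converter stays within its capacity. The function name and the tolerance `tol` are illustrative assumptions.

```python
import math

def sop_feasible(p_k1, q_k1, p_k2, q_k2, s_k1max, s_k2max, tol=1e-9):
    """Check the assumed SOP constraints for one period."""
    # Losses are ignored, so the transmitted active powers must balance.
    balanced = abs(p_k1 + p_k2) <= tol
    # Apparent power of each converter must not exceed its capacity.
    side1_ok = math.hypot(p_k1, q_k1) <= s_k1max + tol
    side2_ok = math.hypot(p_k2, q_k2) <= s_k2max + tol
    return balanced and side1_ok and side2_ok
```

The reactive powers on the two sides are independent of each other because the DC link decouples them, which is why only the active powers appear in the balance condition.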
4) Flow balance constraints
Figure BDA0003907137130000104
Figure BDA0003907137130000111
In the formula: P_i(t) and Q_i(t) are the active and reactive power injected at node i in period t; V_i(t) is the voltage amplitude of node i in period t; G_ij and B_ij are the conductance and susceptance entries of the node admittance matrix; θ_ij is the phase angle difference between node i and node j; P_DGi(t) and Q_DGi(t) are the active and reactive power injected by the distributed power supply at node i in period t; P_SOPi(t) and Q_SOPi(t) are the active and reactive power injected by the SOP at node i in period t; P_LDi(t) and Q_LDi(t) are the active and reactive power consumed by the load at node i in period t.
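The symbols above match the standard polar-form power flow equations, which the following Python sketch evaluates. Formulas (6) and (7) are images in this text, so the exact form used in the patent is assumed to be the standard one; the function names are illustrative.

```python
import numpy as np

def injected_power(v, theta, g, b):
    """Standard polar-form nodal injections for one period.

    v, theta: (n,) voltage amplitudes and phase angles (radians)
    g, b:     (n, n) conductance and susceptance parts of the admittance matrix
    """
    v = np.asarray(v, dtype=float)
    theta = np.asarray(theta, dtype=float)
    g = np.asarray(g, dtype=float)
    b = np.asarray(b, dtype=float)
    dtheta = theta[:, None] - theta[None, :]  # theta_ij = theta_i - theta_j
    p = v * ((g * np.cos(dtheta) + b * np.sin(dtheta)) @ v)
    q = v * ((g * np.sin(dtheta) - b * np.cos(dtheta)) @ v)
    return p, q

def balance_residual(p_inj, p_dg, p_sop, p_ld):
    """Assumed nodal balance: injection = generation + SOP - load."""
    return np.asarray(p_inj) - (np.asarray(p_dg) + np.asarray(p_sop) - np.asarray(p_ld))
```

A topology and dispatch are consistent with the power flow constraint when `balance_residual` is zero at every node; the sum of `p` over all nodes equals the network loss.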
5) Node voltage constraint
V_min ≤ V_i(t) ≤ V_max, i ∈ Ω_node (8)
In the formula: V_min and V_max are the lower and upper limits of the node voltage amplitude allowed for system operation.
6) Branch current constraint
Figure BDA0003907137130000112
In the formula: I_ij(t) is the amplitude of the current flowing in branch ij in period t, and I_ijmax is the maximum current amplitude allowed on branch ij.
7) Distributed power supply output constraints
P_DGi,min ≤ P_DGi(t) ≤ P_DGi,max, i ∈ Ω_DG (10)
Q_DGi,min ≤ Q_DGi(t) ≤ Q_DGi,max, i ∈ Ω_DG (11)
In the formula: P_DGi,min and P_DGi,max are the minimum and maximum active power output of the distributed power supply connected to node i; Q_DGi,min and Q_DGi,max are the minimum and maximum reactive power output of the distributed power supply connected to node i. In the invention, the renewable wind generating sets run at a constant power factor and prediction data are used for training; the power provided by the controllable distributed power supplies satisfies the above constraints.
It follows from the model above that the decision variables of network reconstruction are 0-1 variables, the regulating quantities of the active controllable equipment, including the SOP, are continuous variables, and the power flow constraints, SOP operation constraints and similar constraints are nonlinear. The multi-period collaborative optimization problem is therefore a large-scale mixed-integer nonlinear programming problem, which places great demands on the solving capability and efficiency of the algorithm.
Step 4, based on the mathematical model of optimized operation after fault recovery established in step 3, establishing a multi-type-variable dual-agent reinforcement learning collaborative optimization model;
According to the model established in step 3, the mathematical model of optimized operation after fault recovery of the flexible power distribution network is a large-scale mixed-integer nonlinear programming problem. Its variables include discrete decision variables (the 0-1 variables of network reconstruction) and continuous variables such as the distributed power supply output and the regulating quantities of the active controllable equipment, including the SOP; the power flow constraints, SOP operation constraints and similar constraints are nonlinear. Solving it directly requires a large amount of computation, is slow, and cannot guarantee a globally optimal solution.
Existing research shows that reinforcement learning algorithms, thanks to their strong capability for autonomous learning and exploration, do not depend on accurate prediction data or a physical model, adapt to changes in the environment, and handle complex sequential decision problems effectively, so reinforcement learning can be used to handle mixed-integer programming problems. It does so as follows: as shown in fig. 3, two agents are trained to generate the optimal strategy, with different agents trained in coordination on the integer variables and the continuous variables, respectively. The invention therefore constructs two agents with different time scales, which optimize the network topology and the output of the controllable active equipment, including the intelligent soft switch, respectively.
The specific steps of the step 4 comprise:
step 4.1: constructing the action space of the discrete agent for network topology optimization, namely the set of network topologies that can be adopted after a line of the flexible power distribution network fails. The action space of the discrete agent is: {G(t)}
Step 4.2: constructing the action space of the continuous agent for optimizing the controllable active devices. The continuous agent optimizes system operation by optimizing the reactive power provided on the two sides of the intelligent soft switch and the active power it transmits, and by regulating the controllable distributed power supplies (wind power, photovoltaic and micro gas turbine). The action space of the continuous agent is therefore: {P_k1(t), P_k2(t), Q_k1(t), Q_k2(t), P_DG(t)}.
Step 4.3: constructing the state spaces of the discrete agent and the continuous agent. The two agents share the distribution network state information and therefore use the same state quantities. The operating state of the system is described by the source-network-load state, so the state space is: {P_k1(t), P_k2(t), Q_k1(t), Q_k2(t), P_DG(t), G(t), P_L(t)}
Step 4.4: constructing the reward functions of the discrete agent and the continuous agent. The immediate reward of the discrete agent consists of two parts, the network loss and the number of switching actions. Because the algorithm converges toward the maximum reward value, the reward is set to the negative of the network loss and the number of switching actions:
Figure BDA0003907137130000131
Figure BDA0003907137130000132
In the formula: $w_p$ and $w_g$ are the weighting coefficients of the network loss and the number of switching actions, respectively; $M$ is a very large positive number, so that a large penalty is applied when the number of switching actions exceeds its limit.
The reward function of the continuous agent is calculated as follows: based on the network topology selected by the discrete agent, the continuous agent optimizes controllable devices such as the distributed power supplies and the SOP (soft open point, i.e. the intelligent soft switch) to reduce network loss and wind and solar curtailment. Its reward function is given by formula (15):
Figure BDA0003907137130000133
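The reward formulas (13) to (15) are not reproduced in this text, so the following Python fragment is only a minimal sketch of the reward structure that step 4.4 describes in words. The function names, the default weights, the value of the penalty constant, and the choice to return the penalty alone when the switching limit is exceeded are all assumptions made for illustration, not the formulas of the invention.

```python
def discrete_reward(p_loss, n_switch_actions, n_switch_max,
                    w_p=1.0, w_g=0.1, big_m=1e4):
    """Reward of the topology agent: negative weighted cost.

    Assumption: when the switching budget is exceeded, the reward is simply
    -big_m (the text only says that a large penalty is applied).
    """
    if n_switch_actions > n_switch_max:
        return -big_m
    return -(w_p * p_loss + w_g * n_switch_actions)


def continuous_reward(p_loss, p_dg_forecast, p_dg_actual,
                      w_loss=1.0, w_curt=1.0):
    """Reward of the device agent: negative of loss plus curtailed renewable power.

    Curtailment is taken as forecast minus dispatched output, floored at zero;
    w_loss and w_curt are hypothetical weights.
    """
    curtailed = sum(max(f - a, 0.0) for f, a in zip(p_dg_forecast, p_dg_actual))
    return -(w_loss * p_loss + w_curt * curtailed)
```

Both functions return larger (less negative) values for lower loss, fewer switching actions, and less curtailment, which matches the statement that the algorithm converges toward the maximum reward.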
Step 4.5: set the agent hyperparameters.
The discrete agent uses the DQN algorithm with the following hyperparameters: the experience pool size is 20000, the discount factor γ is 0.9, the batch size is 32, the learning rate β is 0.001, and the target network is updated once every 200 steps. The continuous agent uses the AC (actor-critic) algorithm: the learning rates α and β of the actor network and the critic network are 0.001 and 0.01, respectively, and the discount factor γ is 0.9. The neural networks are fully connected, with two hidden layers of 128 and 256 neurons respectively, and use the ReLU activation function.
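For reference, the hyperparameters listed in step 4.5 can be collected in one place. The sketch below uses plain Python dataclasses; the class and field names are hypothetical, and sharing the same hidden-layer sizes between both agents is an assumption, since the text does not say which network the 128/256 layout applies to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DQNConfig:
    """Discrete (topology) agent, values as stated in step 4.5."""
    replay_capacity: int = 20000
    gamma: float = 0.9
    batch_size: int = 32
    learning_rate: float = 0.001
    target_update_every: int = 200   # steps between target-network updates


@dataclass(frozen=True)
class ActorCriticConfig:
    """Continuous (device) agent, values as stated in step 4.5."""
    actor_lr: float = 0.001
    critic_lr: float = 0.01
    gamma: float = 0.9


def layer_shapes(in_dim, out_dim, hidden=(128, 256)):
    """Weight-matrix shapes of a fully connected net with two hidden layers."""
    sizes = [in_dim, *hidden, out_dim]
    return list(zip(sizes[:-1], sizes[1:]))
```

For example, `layer_shapes(7, 3)` lists the shapes for a network with 7 inputs and 3 outputs; the input and output sizes depend on the state and action spaces of the network under study.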
Step 4.6: train offline the dual-agent reinforcement learning model for optimized operation after a fault in the flexible power distribution network.
To further accelerate convergence and improve training efficiency, the discrete agent for network topology selection uses the DQN algorithm with prioritized samples: each sample is assigned a priority δ proportional to the absolute value of its temporal-difference error (TD error) and is stored in the experience pool. Because the topology of a power network cannot be changed frequently, this agent acts on a longer time scale: its action interval is set to d1, and d1 is measured in hours.
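The priority rule described above can be sketched as a small proportional replay buffer. This is an illustrative fragment using only the Python standard library, not the implementation of the invention; the class name, the small constant `eps` that keeps zero-error samples selectable, and the overwrite-oldest policy are assumptions, and refinements such as priority exponents or importance-sampling weights are omitted.

```python
import random


class PrioritizedReplay:
    """Experience pool that samples transitions in proportion to |TD error|."""

    def __init__(self, capacity=20000, eps=1e-6, seed=None):
        self.capacity = capacity
        self.eps = eps               # assumed offset so no priority is exactly zero
        self.samples = []
        self.priorities = []
        self.next_slot = 0           # position overwritten once the pool is full
        self.rng = random.Random(seed)

    def add(self, transition, td_error):
        priority = abs(td_error) + self.eps
        if len(self.samples) < self.capacity:
            self.samples.append(transition)
            self.priorities.append(priority)
        else:
            self.samples[self.next_slot] = transition
            self.priorities[self.next_slot] = priority
        self.next_slot = (self.next_slot + 1) % self.capacity

    def sample(self, batch_size=32):
        # Draw with replacement, weighting each stored transition by its priority.
        return self.rng.choices(self.samples, weights=self.priorities, k=batch_size)
```

Transitions with a large TD error are replayed more often, which is the mechanism the text relies on to speed up training of the topology agent.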
The continuous agent, which regulates the controllable active devices, can adjust quickly, so its action interval is set to d2, and d2 is generally measured in minutes.
During training, the two agents are trained collaboratively and share the state information of the power distribution network. At their respective time points, the discrete agent and the continuous agent each select an action based on the current state held by the distribution network dispatching center; each action changes the operating state of the distribution network and thereby influences the action selection of the other agent.
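The interplay of the two time scales can be illustrated with a short rollout loop. The sketch below assumes that d1 is an integer multiple of d2 and that both are expressed in minutes; `topology_policy`, `setpoint_policy`, and `env_step` are hypothetical callables standing in for the discrete agent, the continuous agent, and the distribution network simulation.

```python
def rollout(state, horizon_min, d1_min, d2_min,
            topology_policy, setpoint_policy, env_step):
    """Run one episode in which the topology agent acts every d1 minutes
    and the device agent acts every d2 minutes on the shared state."""
    if d1_min % d2_min != 0:
        raise ValueError("this sketch assumes d1 is a multiple of d2")
    topology = None
    log = []
    for minute in range(0, horizon_min, d2_min):
        if minute % d1_min == 0:
            # Slow time scale: choose a network topology from the feasible set.
            topology = topology_policy(state)
        # Fast time scale: choose SOP and distributed-generation set-points
        # for the topology currently in force.
        setpoints = setpoint_policy(state, topology)
        state = env_step(state, topology, setpoints)
        log.append((minute, topology, setpoints))
    return state, log
```

With d1 = 60 and d2 = 15, the topology is held for four consecutive device decisions, so every set-point is chosen under the topology most recently selected by the discrete agent.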
Step 5: make online decisions to output the dynamic reconstruction and the regulation strategy of the controllable active devices.
Step 5 specifically comprises:
Once the load and wind power/photovoltaic prediction curves are obtained, the trained multi-type-variable dual-agent reinforcement learning collaborative optimization model directly outputs the dynamic reconstruction and the regulation strategy of the controllable active devices, without running an optimization process.
The working principle of the invention is as follows:
The invention provides a dynamic-reconstruction-based disturbance recovery optimization method for flexible power distribution networks, addressing the situation in which a disturbance inside a distribution network containing an SOP changes the network structure and degrades the economy of system operation. The method specifically comprises the following steps:
Step 1: initialize and generate an initial network structure; import photovoltaic and wind power historical data and load data; import the branch impedance data of the improved IEEE 33-node distribution network (as shown in fig. 4), convert them to per-unit values, and define the active and reactive power of the nodes in the network; import the day-ahead load curve data and calculate the total load; set the capacity of the converters on the two sides of the SOP, and add controllable distributed power supplies to the improved IEEE 33-node network.
Step 2: based on the initial network structure generated in step 1, reconstruct it using a polynomial multiplication method to generate radial network topologies that satisfy the network connectivity and relay protection setting requirements. If each switch were treated as a separate variable and the radiality constraint were added directly to the reinforcement learning reward function to constrain the actions selected by the agent, the variables would be numerous, the action space formed by their combinations would be huge and would contain many infeasible solutions, and the limit on the number of switching actions would also have to be considered; these factors would make the agent's optimization extremely difficult, inefficient, and hard to converge. The invention guarantees a radial network topology through the polynomial multiplication method combined with network reduction, which reduces the size of the action space while retaining high solution accuracy.
Step 3: based on the feasible network topologies generated in step 2, take the distributed power supply output, the SOP power, and the load recovery state as decision variables, and take as the objective the minimization of the weighted sum of the accumulated active power loss of the system over the post-disturbance optimization period, the number of switching actions during network reconstruction, and the renewable energy consumption level, so as to establish the optimized-operation mathematical model after fault recovery of the flexible power distribution network, subject to the radiality constraint, the switching action count constraint, the SOP constraints, the power flow balance constraint, the node voltage constraint, the branch power flow constraint, and the distributed power supply output constraint.
Step 4: according to the model established in step 3, the optimized-operation mathematical model after fault recovery of the flexible power distribution network is a large-scale mixed-integer nonlinear programming problem: the decision variables of network reconstruction are 0-1 variables and hence discrete, the regulated quantities of the active controllable devices including the SOP are continuous, and the power flow constraints, SOP operation constraints, and the like are nonlinear. Two agents of different types, operating on different time scales, are therefore constructed to carry out network topology optimization and output optimization of the controllable active devices including the intelligent soft switch, respectively.
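The spanning-tree enumeration behind step 2 can be sketched as follows. Each non-root node contributes a basic term equal to the sum of the labels of its incident branches, and the product of these terms is expanded. This section does not spell out the reduction rule, so the fragment assumes the usual one for this technique: terms in which a branch appears more than once are dropped, and of the remaining terms only those with coefficient 1 are kept, because branch sets containing a loop are generated an even number of times. The function name and the input format are hypothetical, and the graph is assumed to be connected and free of self-loops.

```python
from collections import Counter
from itertools import product


def spanning_trees(edges, root):
    """Enumerate spanning trees by expanding the product of node terms.

    edges: dict mapping a branch label to its (node, node) end points.
    Returns a set of frozensets of branch labels, one per spanning tree.
    """
    # Basic term of each non-root node: the labels of its incident branches.
    incident = {}
    for label, (u, v) in edges.items():
        for node in (u, v):
            if node != root:
                incident.setdefault(node, []).append(label)

    # Expand the product: pick one incident branch per non-root node.
    coefficient = Counter()
    for choice in product(*incident.values()):
        term = frozenset(choice)
        if len(term) == len(choice):   # drop terms with a repeated branch
            coefficient[term] += 1

    # Branch sets containing a loop appear an even number of times; trees appear once.
    return {term for term, count in coefficient.items() if count == 1}
```

The expansion grows exponentially with the number of nodes, which is why the method first merges series branches into a simplified network before enumerating.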
It should be emphasized that the embodiments described herein are illustrative and not restrictive, and thus the present invention includes, but is not limited to, the embodiments described in this detailed description, as well as other embodiments that can be derived by one skilled in the art from the teachings herein, and are within the scope of the present invention.

Claims (5)

1. A flexible power distribution network disturbance recovery method suitable for full-time dynamic reconstruction, characterized in that the method comprises the following steps:
step 1, initializing to generate an initial network structure, and importing photovoltaic, wind power, and load data;
step 2, based on the network structure generated by the initialization in step 1, generating the set of network topology structures that can be adopted after a line of the flexible power distribution network fails;
step 3, based on the feasible network topology structures generated in step 2, establishing an optimized-operation mathematical model for the flexible power distribution network after fault recovery;
step 4, based on the optimized-operation mathematical model established in step 3, establishing a dual-agent reinforcement learning collaborative optimization model based on multi-type variables;
step 5, making online decisions to output the dynamic reconstruction and the regulation strategy of the controllable active devices.
2. The flexible power distribution network disturbance recovery method suitable for full-time dynamic reconstruction according to claim 1, wherein step 2 specifically comprises:
step 2.1, deleting the fault line and simplifying the network: deleting the fault line from the initial network structure G0, and merging the branches connected to each node of degree 2 in the network so that they are treated as a single branch, to obtain a simplified network G1;
step 2.2, generating all spanning trees of the simplified network based on a polynomial multiplication method: based on the node and branch information of G1, designating a root node, and forming the basic terms of the polynomial with each non-root node as a unit, wherein each basic term is obtained by adding the labels of all branches connected to the corresponding node;
step 2.3, mapping the spanning trees of the simplified network to spanning trees of the original network: each spanning tree of the simplified network corresponds to a plurality of spanning trees of the original network; for each tree branch of a spanning tree of the simplified network, the initial network structure retains all corresponding branches; for each link branch of the simplified network outside the spanning tree, any one of the corresponding branches of the initial network structure may be opened, thereby generating the set of network topology structures that can be adopted after a line of the flexible power distribution network fails.
3. The flexible power distribution network disturbance recovery method suitable for full-time dynamic reconstruction according to claim 1, wherein step 3 specifically comprises:
step 3.1, establishing the objective function for optimized operation of the flexible power distribution network after fault recovery;
the objective function considers the accumulated active power loss of the system over the optimization period after the disturbance occurs, the number of switching actions during network reconstruction, and the renewable energy consumption level, as shown in formula (1):
Figure FDA0003907137120000021
in the formula: $T$ is the simulation duration, set to one day and optimized in hourly steps; $n$ is the number of nodes of the system; $P_i(t)$ is the active power injected at node $i$ in period $t$; $T_j(t)$ is the state of switch $j$ in period $t$, equal to 1 when the switch is closed and 0 when it is open; $P_{DGi}(t)$ is the active power injected by the distributed power supply at node $i$ in period $t$;
Figure FDA0003907137120000022
is the predicted active power injected by the distributed power supply at node $i$ in period $t$; $n_{DG}$ is the total number of distributed power supplies; the network topology and the injected power of each node are assumed to remain unchanged within each unit time interval; $w_{loss}$, $w_t$, and $w_r$ are the weighting coefficients that set the importance of the network loss, the number of switching actions, and the renewable energy consumption in the objective function, respectively;
step 3.2, establishing the constraint conditions for optimized operation of the flexible power distribution network after fault recovery:
1) Radiality constraint
$G(t) \in G$ (2)
In the formula: $G(t)$ is the network structure adopted in period $t$; $G$ is the set of network topologies that strictly satisfy the radiality constraint and do not contain the fault line, determined without considering the branches where the SOPs are located;
2) Switching action count constraint
Figure FDA0003907137120000023
Figure FDA0003907137120000024
In the formula: $T_{\max}$ is the maximum total number of switching actions allowed for network reconstruction within the optimization period, and $T_{j,\max}$ is the maximum number of actions allowed for switch $j$ within the optimization period; in the invention, the optimization period is 24 hours, the maximum total number of switching actions allowed is 15, and the maximum number of actions allowed for a single switch is 3; $\Omega_{node}$ is the set of system nodes;
3) SOP constraints
The SOP constraints comprise the transmitted active power constraint and the capacity constraints of the two sides, as follows:
Figure FDA0003907137120000031
in the formula: $S_{k1,\max}$ and $S_{k2,\max}$ are the maximum capacities of the converters on the two sides of the SOP, respectively; $\Omega_{SOP}$ is the set of branches where the SOPs are located;
4) Power flow balance constraint
Figure FDA0003907137120000032
Figure FDA0003907137120000033
In the formula: $P_i(t)$ and $Q_i(t)$ are the active power and reactive power injected at node $i$ in period $t$, respectively; $V_i(t)$ is the voltage magnitude of node $i$ in period $t$; $G_{ij}$ and $B_{ij}$ are the conductance and susceptance entries of the node admittance matrix, respectively; $\theta_{ij}$ is the phase angle difference between node $i$ and node $j$; $P_{DGi}(t)$ and $Q_{DGi}(t)$ are the active power and reactive power injected by the distributed power supply at node $i$ in period $t$, respectively; $P_{SOPi}(t)$ and $Q_{SOPi}(t)$ are the active power and reactive power injected by the SOP at node $i$ in period $t$, respectively; $P_{LDi}(t)$ and $Q_{LDi}(t)$ are the active power and reactive power consumed by the load at node $i$ in period $t$, respectively;
5) Node voltage constraint
$V_{\min} \le V_i(t) \le V_{\max},\quad i \in \Omega_{node}$ (8)
In the formula: $V_{\min}$ and $V_{\max}$ are the lower and upper limits of the node voltage magnitude required for system operation, respectively;
6) Branch power flow constraint
Figure FDA0003907137120000034
In the formula: $I_{ij}(t)$ is the magnitude of the current flowing in branch $ij$ in period $t$, and $I_{ij,\max}$ is the maximum current magnitude allowed in branch $ij$;
7) Distributed power supply output constraints
$P_{DGi,\min} \le P_{DGi}(t) \le P_{DGi,\max},\quad i \in \Omega_{DG}$ (10)
$Q_{DGi,\min} \le Q_{DGi}(t) \le Q_{DGi,\max},\quad i \in \Omega_{DG}$ (11)
In the formula: $P_{DGi,\min}$ and $P_{DGi,\max}$ are the minimum and maximum active power output of the distributed power supply connected to node $i$, respectively; $Q_{DGi,\min}$ and $Q_{DGi,\max}$ are the minimum and maximum reactive power output of the distributed power supply connected to node $i$, respectively.
4. The flexible power distribution network disturbance recovery method suitable for full-time dynamic reconstruction according to claim 1, wherein step 4 specifically comprises:
step 4.1: constructing the action space of the discrete agent for network topology optimization, namely the set of network topologies that can be adopted after a line of the flexible power distribution network fails, wherein the action space of the discrete agent is: $\{G(t)\}$;
step 4.2: constructing the action space of the continuous agent for optimizing the controllable active devices, wherein the action space of the continuous agent is: $\{P_{k1}(t), P_{k2}(t), Q_{k1}(t), Q_{k2}(t), P_{DG}(t)\}$;
step 4.3: constructing the state space of the discrete agent and the continuous agent, and describing the operating state of the system through the source-network-load state, namely the state space is: $\{P_{k1}(t), P_{k2}(t), Q_{k1}(t), Q_{k2}(t), P_{DG}(t), G(t), P_{L}(t)\}$;
step 4.4: constructing the reward functions of the discrete agent and the continuous agent:
Figure FDA0003907137120000041
Figure FDA0003907137120000042
in the formula: $w_p$ and $w_g$ are the weighting coefficients of the network loss and the number of switching actions, respectively; $M$ is a very large positive number, so that a large penalty is applied when the number of switching actions exceeds its limit;
the reward function of the continuous agent is calculated as follows: based on the network topology selected by the discrete agent, the continuous agent optimizes controllable devices such as the distributed power supplies and the SOP (soft open point, i.e. the intelligent soft switch) to reduce network loss and wind and solar curtailment, and its reward function is given by formula (15):
Figure FDA0003907137120000051
step 4.5: setting the agent hyperparameters;
step 4.6: training offline the dual-agent reinforcement learning model for optimized operation after a fault in the flexible power distribution network.
5. The flexible power distribution network disturbance recovery method suitable for full-time dynamic reconstruction according to claim 1, wherein step 5 specifically comprises:
once the load and wind power/photovoltaic prediction curves are obtained, the trained multi-type-variable dual-agent reinforcement learning collaborative optimization model directly outputs the dynamic reconstruction and the regulation strategy of the controllable active devices, without running an optimization process.
CN202211308934.2A 2022-10-25 2022-10-25 Flexible power distribution network disturbance recovery method suitable for full-time dynamic reconstruction Pending CN115765035A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211308934.2A CN115765035A (en) 2022-10-25 2022-10-25 Flexible power distribution network disturbance recovery method suitable for full-time dynamic reconstruction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211308934.2A CN115765035A (en) 2022-10-25 2022-10-25 Flexible power distribution network disturbance recovery method suitable for full-time dynamic reconstruction

Publications (1)

Publication Number Publication Date
CN115765035A true CN115765035A (en) 2023-03-07

Family

ID=85353088

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211308934.2A Pending CN115765035A (en) 2022-10-25 2022-10-25 Flexible power distribution network disturbance recovery method suitable for full-time dynamic reconstruction

Country Status (1)

Country Link
CN (1) CN115765035A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117996703A (en) * 2024-04-07 2024-05-07 国网江苏省电力有限公司常州供电分公司 Power distribution network fault distinguishing and protecting device and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination