CN114662982B - Multistage dynamic reconstruction method for urban power distribution network based on machine learning - Google Patents
- Publication number
- CN114662982B (application CN202210399965.7A)
- Authority
- CN
- China
- Prior art keywords
- distribution network
- power distribution
- representing
- load
- reconstruction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0637—Strategic management or analysis, e.g. setting a goal or target of an organisation; Planning actions based on goals; Analysis or evaluation of effectiveness of goals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/10—Geometric CAD
- G06F30/18—Network design, e.g. design based on topological or interconnect aspects of utility systems, piping, heating ventilation air conditioning [HVAC] or cabling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J3/00—Circuit arrangements for ac mains or ac distribution networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2113/00—Details relating to the application field
- G06F2113/04—Power grid distribution networks
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J2203/00—Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
- H02J2203/20—Simulating, e g planning, reliability check, modelling or computer assisted design [CAD]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Human Resources & Organizations (AREA)
- Economics (AREA)
- Strategic Management (AREA)
- Evolutionary Computation (AREA)
- Geometry (AREA)
- Entrepreneurship & Innovation (AREA)
- Tourism & Hospitality (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- General Engineering & Computer Science (AREA)
- Educational Administration (AREA)
- Health & Medical Sciences (AREA)
- Computer Hardware Design (AREA)
- Primary Health Care (AREA)
- Medical Informatics (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Computational Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Mathematical Analysis (AREA)
- Software Systems (AREA)
- Power Engineering (AREA)
- Water Supply & Treatment (AREA)
- Development Economics (AREA)
- Public Health (AREA)
- Game Theory and Decision Science (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Supply And Distribution Of Alternating Current (AREA)
Abstract
The invention relates to a machine-learning-based multistage dynamic reconstruction method for an urban power distribution network and belongs to the technical field of dynamic reconstruction of urban power distribution networks. First, the historical operation data of the distribution network and the reconstruction-level decisions are fitted by a neural-network multi-label classification method, so that the reconstruction level can be decided online. Second, a multi-agent reinforcement learning model is established to jointly optimize the different reconstruction subjects in each period: a deep Q-network with parameter freezing and experience replay learns environmental information such as the predicted load and the photovoltaic output power, and the learned policy set is used to dynamically reconstruct and optimize the distribution network with the objectives of operating cost, voltage deviation and load balance. The invention enables efficient, safe and economical operation of the urban power distribution network.
Description
Technical Field
The invention belongs to the technical field of dynamic reconstruction of urban power distribution networks, and particularly relates to a multistage dynamic reconstruction method of an urban power distribution network based on machine learning.
Background
The integration of high-penetration distributed generation and the rapid growth of unbalanced loads in urban areas make the spatio-temporal distribution of net load in urban power distribution networks extremely uneven, which poses new challenges to the safe and economical operation of the distribution network. Distribution network reconstruction is one of the active management measures for the distribution network: the network structure is adjusted by changing the on-off states of tie switches and sectionalizing switches in order to reduce network losses, balance loads, eliminate line overloads and improve the accommodation of clean energy. However, traditional optimization methods rely on an explicit model, forecasting techniques and an optimization solver, so solving is time-consuming and online decision-making is difficult; the uncertainty introduced by large-scale integration of distributed sources such as wind and photovoltaic power further increases the solving difficulty. Therefore, in an increasingly complex grid environment, how to select a reconstruction strategy for the urban distribution network, how to decide the reconstruction level online, and how to handle the uncertainty of distributed generation have become important problems in the context of the new-type power system.
Disclosure of Invention
The invention aims to provide a machine-learning-based multistage dynamic reconstruction method for an urban power distribution network that solves the technical problems of the prior art described above: traditional optimization methods rely on an explicit model, forecasting techniques and an optimization solver, which makes solving time-consuming and online decision-making difficult, while the uncertainty introduced by large-scale integration of distributed sources such as wind and photovoltaic power further increases the solving difficulty.
In order to achieve the above purpose, the technical scheme of the invention is as follows:
A multistage dynamic reconstruction method for an urban power distribution network based on machine learning comprises the following steps:
S1: fitting the historical operation data of the distribution network to the reconstruction-level decisions with a neural-network multi-label classification method, so as to decide the reconstruction level of the distribution network online;
S2: constructing a reinforcement-learning-based reconstruction model of the urban distribution network, in which the power and current features of each node at each time step form the agent's state space, the branch switch states and the photovoltaic curtailment and load shedding rates form the action space, the reward function jointly considers the operating cost, the voltage deviation index and the load balance degree of the distribution network, and the state transition accounts for the uncertainty of photovoltaic output;
S3: jointly training the different agents distributed over the whole optimization horizon to construct a multi-agent reinforcement learning joint optimization model.
Further, the step S1 specifically includes:
the node loads, the power exchanged with the upper-level grid and the photovoltaic output measured across the whole distribution system are used as the input features of a neural network, and the reconstruction levels of all subjects are used as its output; the network has 4 layers: an input layer, two fully connected hidden layers and an output layer;
the neural network structure is as follows:
$$o = f_o\left(W^{(3)} f_{h2}\left(W^{(2)} f_{h1}\left(W^{(1)} x + b^{(1)}\right) + b^{(2)}\right) + b^{(3)}\right)$$
wherein: $W = \{W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)}, W^{(3)}, b^{(3)}\}$ is the parameter set; $x \in R^{D \times 1}$ is the input feature vector; $f_o$ represents the sigmoid activation function, and $f_{h1}$ and $f_{h2}$ represent linear rectification (ReLU) activation functions; each of the two hidden layers has $F$ neurons; the input-layer neurons are connected to the first hidden layer $h \in R^{F \times 1}$ through the weight $W^{(1)} \in R^{F \times D}$ and the bias term $b^{(1)} \in R^{F \times 1}$; the first hidden layer $h \in R^{F \times 1}$ is connected to the second hidden layer through the weight $W^{(2)} \in R^{F \times F}$ and the bias term $b^{(2)} \in R^{F \times 1}$; finally, the second hidden layer is connected to the output-layer neurons $o \in R^{L \times 1}$ through a sigmoid activation function, which limits the output range to $[0, 1]$.
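The forward pass of the four-layer classifier described above can be sketched as follows. This is a minimal illustration in plain Python, not the patent's implementation: the layer sizes `D`, `F`, `L`, the random initialization, the sample input and the 0.5 decision threshold are all assumptions chosen for the example.

```python
import math
import random

def relu(v):
    """Linear rectification applied element-wise."""
    return [max(0.0, x) for x in v]

def sigmoid(v):
    """Sigmoid applied element-wise; maps each value into (0, 1)."""
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def affine(W, b, x):
    """Return W x + b for a row-major weight matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def forward(params, x):
    """Input -> two ReLU hidden layers -> sigmoid output (one score per label)."""
    W1, b1, W2, b2, W3, b3 = params
    h1 = relu(affine(W1, b1, x))
    h2 = relu(affine(W2, b2, h1))
    return sigmoid(affine(W3, b3, h2))

def init_params(D, F, L, seed=0):
    """Random weights and zero biases (illustrative initialization only)."""
    rng = random.Random(seed)
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return (mat(F, D), [0.0] * F, mat(F, F), [0.0] * F, mat(L, F), [0.0] * L)

D, F, L = 6, 8, 3                      # hypothetical sizes: features, hidden units, labels
params = init_params(D, F, L)
x = [0.4, 0.7, 0.1, 0.9, 0.3, 0.5]     # hypothetical normalized measurements
o = forward(params, x)
levels = [int(p >= 0.5) for p in o]    # assumed threshold for a 0-1 level decision
print(o, levels)
```

Because each output passes through its own sigmoid, several reconstruction levels can be flagged at once, which is what distinguishes multi-label from single-label classification.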
Further, in step S2,
state space: the state $s_t$ of an agent comprises, for period $t$, the photovoltaic generation of each photovoltaic node, the load reduction of each load-reducible node, the power purchased from the upper-level grid, and the branch currents $I_{ij,t}$, where $I_{ij,t}$ is the current flowing from node $i$ to node $j$ at time $t$;
action space: the action $a_t$ comprises the switch state $w_{ij}$ of each branch $ij$, which is a 0-1 variable, together with the photovoltaic generation of each photovoltaic node and the load reduction of each load-reducible node in period $t$, both discretized with granularity $D$;
state transfer function:
$$s_{t+1} = f(s_t, a_t, \rho)$$
wherein: $\rho$ represents a random quantity; the state transition therefore depends not only on the action $a_t$ but also on randomness: during each training run, normally distributed noise is added to the photovoltaic output forecast baseline to simulate the uncertainty of photovoltaic output.
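A stochastic state transition of this kind can be sketched as below. The noise model (zero-mean Gaussian with standard deviation proportional to the forecast, clipped at zero), the parameter `sigma_frac` and the dictionary layout of the state are assumptions made for illustration; the source only states that normal noise is added to the forecast baseline.

```python
import random

def sample_pv_output(pv_forecast, sigma_frac, rng):
    """Add zero-mean Gaussian noise to each forecast value and clip at zero."""
    return [max(0.0, p + rng.gauss(0.0, sigma_frac * p)) for p in pv_forecast]

def transition(state, action, pv_forecast_next, rng, sigma_frac=0.1):
    """Sketch of s_{t+1} = f(s_t, a_t, rho).

    The topology chosen by the action carries over to the next period,
    while the photovoltaic output is redrawn around its forecast baseline.
    """
    return {
        "switches": dict(action["switches"]),
        "pv": sample_pv_output(pv_forecast_next, sigma_frac, rng),
        "t": state["t"] + 1,
    }

rng = random.Random(42)
s0 = {"switches": {"1-2": 1}, "pv": [0.5], "t": 0}
a0 = {"switches": {"1-2": 0, "2-3": 1}}
s1 = transition(s0, a0, [0.8, 0.0, 1.2], rng)
print(s1)
```

Redrawing the noise on every call means each training episode sees a slightly different photovoltaic profile, so the learned policy is not fitted to a single forecast.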
Further, in step S2, the reward function combines the economic operating cost of the distribution network, the voltage deviation index and the load balance degree, wherein:
$\lambda_1, \lambda_2$ represent the reward weight coefficients;
a corresponding penalty is imposed on the agent if a branch power exceeds its allowable upper limit;
the economic operating cost of the distribution network in period $t$ consists of the line loss cost in period $t$, the switching cost, the load curtailment cost and the photovoltaic curtailment cost;
$\Delta t$ represents the time interval; $c_{loss}$ represents the electricity price per unit of network loss; $r_{ij}$ represents the resistance of branch $ij$; $\Omega_l$ represents the set of all branches in the optimization area;
the switching cost is defined from the cost of a single operation when changing the on-off state of a feeder tie switch, a substation tie switch and a branch sectionalizer, and from the state flag of each kind of switch in the reconstruction area, where 0 represents open and 1 represents closed;
$\Omega_{LR}$ represents the set of reducible loads; $c_{LR}$ represents the unit load curtailment cost, which applies to the load power shed at node $i$ in period $t$;
$\Omega_{PV}$ represents the set of photovoltaic nodes; the photovoltaic curtailment cost is defined from the unit curtailment penalty cost at time $t$, the photovoltaic output power of node $i$ at time $t$, and the photovoltaic power of node $i$ actually fed into the grid in period $t$;
$V_i^N$ and $V_{i,t}$ are the rated voltage of node $i$ and its actual voltage in period $t$, respectively; $\Omega$ denotes the set of all nodes;
$R_{i,t}$ represents the load rate of node $i$ in period $t$, and the average load rate of the distribution network in period $t$ is taken over the nodes; $P_{i,t}$ is the active power injected at node $i$ in period $t$; $P_i^{max}$ is the maximum allowable active power injection at node $i$; $N$ represents the number of nodes of the distribution network.
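One way such a reward could be assembled is sketched below. The exact formulas are not recoverable from the text, so everything here is an assumption: line loss cost as $c_{loss} \cdot I^2 r \cdot \Delta t$, voltage deviation as the summed relative deviation from the rated value, load balance as the standard deviation of node load rates $P_{i,t}/P_i^{max}$, and the reward as the negative weighted sum minus an overload penalty. All function names are hypothetical.

```python
import math

def line_loss_cost(currents, resistances, c_loss, dt):
    """Assumed form: c_loss * dt * sum(I_ij^2 * r_ij) over all branches."""
    return c_loss * dt * sum(i * i * r for i, r in zip(currents, resistances))

def voltage_deviation(v_actual, v_rated):
    """Assumed form: sum of relative deviations |V_i - V_i^N| / V_i^N."""
    return sum(abs(v - n) / n for v, n in zip(v_actual, v_rated))

def load_balance(p_inj, p_max):
    """Assumed form: standard deviation of the load rates R_i = P_i / P_i^max."""
    rates = [p / m for p, m in zip(p_inj, p_max)]
    mean = sum(rates) / len(rates)
    return math.sqrt(sum((r - mean) ** 2 for r in rates) / len(rates))

def reward(cost, v_dev, balance, lam1, lam2, overload_penalty=0.0):
    """Negative weighted sum of the three objectives, minus any overload penalty."""
    return -(cost + lam1 * v_dev + lam2 * balance) - overload_penalty

cost = line_loss_cost([10.0, 5.0], [0.5, 0.2], c_loss=0.1, dt=1.0)
v_dev = voltage_deviation([1.00, 0.95], [1.0, 1.0])
bal = load_balance([0.4, 0.8], [1.0, 1.0])
print(reward(cost, v_dev, bal, lam1=10.0, lam2=5.0))
```

The weights `lam1` and `lam2` play the role of $\lambda_1$ and $\lambda_2$: they trade the cost term against voltage quality and load balance, and their values here are arbitrary.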
Further, in the multi-agent reinforcement learning joint optimization model of step S3, the optimization subjects of each period in a 24-hour horizon are first determined through step S1; different agents are assigned to different optimization subjects, and the same agent is assigned to the same optimization subject across different periods; the distribution network structure produced by the agents' action decisions in the current period, combined with the state transfer function, serves as the agents' state space in the next period.
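The assignment rule (one agent per optimization subject, reused whenever that subject reappears in a later period) can be sketched as a simple lookup. The subject names and the `agent_k` identifiers are hypothetical, and a three-period horizon stands in for the 24 hourly periods.

```python
def assign_agents(subjects_by_period):
    """Map each optimization subject to one agent and build the per-period schedule.

    subjects_by_period: list of lists, the subjects selected by step S1 per period.
    Returns (agents, schedule), where agents maps subject -> agent id and
    schedule lists the active agent ids for each period.
    """
    agents = {}
    schedule = []
    for subjects in subjects_by_period:
        active = []
        for subject in subjects:
            if subject not in agents:          # first appearance: create a new agent
                agents[subject] = f"agent_{len(agents)}"
            active.append(agents[subject])     # later appearances reuse the same agent
        schedule.append(active)
    return agents, schedule

periods = [["substation_A"], ["substation_A", "feeder_B"], ["feeder_B"]]
agents, schedule = assign_agents(periods)
print(agents)
print(schedule)
```

Reusing an agent across periods lets experience gathered for a subject in one period inform its decisions in another, while keeping each agent's action space limited to a single subject.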
A machine-learning-based multistage dynamic reconstruction system for an urban power distribution network comprises a reconstruction-level fast decision model and a reinforcement-learning-based distribution network optimization operation model;
the reconstruction-level fast decision model comprises a distribution network state fast sensing module, a reconstruction-level decision module and a first information interaction module;
the distribution network state fast sensing module is used for monitoring in real time the photovoltaic generation of the photovoltaic nodes, the load reduction of the load-reducible nodes, the power exchanged with the upper-level grid and the load demand of the distribution network;
the reconstruction-level decision module is used for deciding the reconstruction level of the distribution network according to the operating state of the urban distribution network and for limiting the range of optimization subjects of the reinforcement learning agents;
the first information interaction module is used for transmitting the reconstruction-level decision result to the reinforcement-learning-based distribution network optimization operation model;
the reinforcement-learning-based distribution network optimization operation model comprises a second information interaction module, a distribution network state accurate sensing module, an experience pool module, a tie switch action module, a photovoltaic output decision module, a load reduction decision module, an agent joint operation module and a reconstruction decision module;
the second information interaction module is used for receiving the reconstruction-level decision result of the reconstruction-level fast decision model;
the distribution network state accurate sensing module is used for accurately sensing the real-time photovoltaic generation, the load reduction of the load-reducible nodes, the power exchanged with the upper-level grid, the load demand and the branch currents of the distribution network;
the experience pool module is used for storing the historical operating environments of the distribution network together with the decisions made by the corresponding agents in those environments and the reward values obtained;
the tie switch action module is used for remotely controlling the opening and closing of the tie switch of each branch according to the corresponding reconstruction scheme;
the photovoltaic output decision module is used for deciding the photovoltaic curtailment of the photovoltaic nodes under the corresponding reconstruction scheme;
the load reduction decision module is used for deciding the load shedding of the load-reducible nodes under the corresponding reconstruction scheme;
the agent joint operation module is used for combining all agents to make the distribution network optimization operation decision;
and the reconstruction decision module is used for controlling each module according to the machine-learning-based multistage dynamic reconstruction method for the urban power distribution network.
Compared with the prior art, the invention has the following beneficial effects:
The scheme provides a machine-learning-based multistage dynamic reconstruction method for urban power distribution networks, which establishes a reconstruction-level fast decision model and a multi-agent deep reinforcement learning model to make real-time decisions on the reconstruction level and on optimized operation. First, the neural-network-based reconstruction-level fast decision model decides the reconstruction level online and provides a real-time reference for dispatchers; by dividing the optimization subjects, it also avoids the exponential growth of the action space that occurs when a single traditional deep reinforcement learning agent optimizes several subjects at once. Second, the uncertainty of photovoltaic output is simulated through the state transfer function, and extensive training lets the training accuracy converge, which addresses the difficulty of solving problems that contain uncertainty. Finally, a multi-agent joint solution model solves the multistage dynamic reconstruction problem of the distribution network under uncertainty; the trained model does not need to be re-solved for similar operating states of the distribution network, which makes it practical.
Drawings
Fig. 1 is a schematic diagram of a multistage dynamic reconstruction method of an urban power distribution network based on machine learning.
FIG. 2 is a schematic diagram of the operation of the reconstruction level fast decision model of the present invention.
Fig. 3 is a working principle diagram of a reconstruction optimization operation model of the power distribution network based on reinforcement learning.
FIG. 4 is a block diagram of a multi-agent reinforcement learning architecture of the present invention.
FIG. 5 is a graph of the reconstruction level fast decision model fitting results of the present invention.
FIG. 6 is a graph of the results of the multi-agent reinforcement learning model optimization of the present invention.
Detailed Description
For the purpose of making the technical solution and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings and examples. It should be understood that the particular embodiments described herein are illustrative only and are not intended to limit the invention, i.e., the embodiments described are merely some, but not all, of the embodiments of the invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present invention. It is noted that relational terms such as "first" and "second", and the like, are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.
The features and capabilities of the present invention are described in further detail below in connection with the examples.
Examples:
As shown in fig. 1, the invention builds on the measurement characteristics of spatio-temporal big data in the power system. The node loads, the power exchanged with the upper-level grid and the photovoltaic output measured across the whole system are used as input features, so that the sample data carry spatio-temporal characteristics. Labels are provided to a neural-network multi-label classification model, binary cross entropy is used as the loss function, and a reconstruction-level fast decision model is established; this model decides the reconstruction level of the distribution network quickly and reduces the dimension of the subsequent reinforcement learning action space. For the distribution network whose reconstruction level has been decided, a single-agent reinforcement learning model with parameter freezing and experience replay is established, with the objectives of optimal economic operating cost, voltage deviation and load balance, subject to constraints such as the radial structure of the distribution network and power flow. According to the 24-hour reconstruction-level decision results, the reconstruction optimization subjects of the distribution network in each period are divided, a multi-agent reinforcement learning model is established and jointly optimized, and the validity of the method is verified on an example system.
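The two mechanisms named here, parameter freezing and experience replay, can be illustrated with a deliberately small sketch. To stay self-contained it uses a Q-table in place of a deep network, which is a simplification and not the patent's model; the class names, learning rate, discount factor and synchronization interval are all hypothetical.

```python
import random
from collections import deque

class ReplayBuffer:
    """Experience pool: stores transitions and returns random mini-batches."""
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)   # oldest transitions are dropped first

    def push(self, s, a, r, s_next, done):
        self.buffer.append((s, a, r, s_next, done))

    def sample(self, batch_size, rng):
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))

class FrozenTargetQ:
    """Q-learning with a frozen copy of the parameters used for the targets."""
    def __init__(self, n_states, n_actions, lr=0.5, gamma=0.9, sync_every=10):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.q_target = [row[:] for row in self.q]   # frozen parameters
        self.lr, self.gamma, self.sync_every = lr, gamma, sync_every
        self.updates = 0

    def update(self, batch):
        for s, a, r, s_next, done in batch:
            # Targets come from the frozen copy, not from the table being updated.
            target = r if done else r + self.gamma * max(self.q_target[s_next])
            self.q[s][a] += self.lr * (target - self.q[s][a])
        self.updates += 1
        if self.updates % self.sync_every == 0:
            self.q_target = [row[:] for row in self.q]   # periodic synchronization

rng = random.Random(0)
pool = ReplayBuffer(capacity=100)
pool.push(0, 1, 1.0, 1, True)
learner = FrozenTargetQ(n_states=2, n_actions=2, sync_every=2)
learner.update(pool.sample(1, rng))
print(learner.q, learner.q_target)
```

Sampling from the pool breaks the correlation between consecutive transitions, and computing targets from a periodically synchronized copy keeps the regression target from shifting on every update.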
As shown in fig. 2, the reconstruction-level fast decision model consists of a distribution network state fast sensing module, a reconstruction-level decision module and an information interaction module. The distribution network state fast sensing module monitors in real time the photovoltaic generation of the photovoltaic nodes, the load reduction of the load-reducible nodes, the power exchanged with the upper-level grid and the load demand of the urban distribution network. The reconstruction-level decision module decides the reconstruction level of the distribution network according to the operating state of the urban distribution network and limits the range of optimization subjects of the reinforcement learning agents. The information interaction module transmits the reconstruction-level decision result to the reinforcement-learning-based distribution network optimization operation model.
As shown in fig. 3, the reinforcement-learning-based distribution network optimization operation model consists of an information interaction module, a distribution network state accurate sensing module, an experience pool module, a tie switch action module, a photovoltaic output decision module, a load reduction decision module, an agent joint operation module and a reconstruction decision module. The information interaction module receives the reconstruction-level decision result of the reconstruction-level fast decision model. The distribution network state accurate sensing module accurately senses the real-time photovoltaic generation, the load reduction of the load-reducible nodes, the power exchanged with the upper-level grid, the load demand and the branch currents of the distribution network. The experience pool module stores the historical operating environments of the distribution network together with the decisions made by the corresponding agents in those environments and the reward values obtained. The tie switch action module remotely controls the opening and closing of the tie switch of each branch according to the corresponding reconstruction scheme. The photovoltaic output decision module decides the photovoltaic curtailment of the photovoltaic nodes under the corresponding reconstruction scheme. The load reduction decision module decides the load shedding of the load-reducible nodes under the corresponding reconstruction scheme. The agent joint operation module combines all agents to make the distribution network optimization operation decision.
The reconstruction decision module controls each module according to the machine-learning-based multistage dynamic reconstruction method for the distribution network established by the invention.
In the reconstruction-level fast decision model,
the neural network structure:
$$o = f_o\left(W^{(3)} f_{h2}\left(W^{(2)} f_{h1}\left(W^{(1)} x + b^{(1)}\right) + b^{(2)}\right) + b^{(3)}\right)$$
wherein: $W = \{W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)}, W^{(3)}, b^{(3)}\}$ is the parameter set; $x \in R^{D \times 1}$ is the input feature vector; $f_o$ represents the sigmoid activation function, and $f_{h1}$ and $f_{h2}$ represent linear rectification (ReLU) activation functions. Each of the two hidden layers has $F$ neurons. The input-layer neurons are connected to the first hidden layer $h \in R^{F \times 1}$ through the weight $W^{(1)} \in R^{F \times D}$ and the bias term $b^{(1)} \in R^{F \times 1}$. The first hidden layer $h \in R^{F \times 1}$ is connected to the second hidden layer through the weight $W^{(2)} \in R^{F \times F}$ and the bias term $b^{(2)} \in R^{F \times 1}$. Finally, the second hidden layer is connected to the output-layer neurons $o \in R^{L \times 1}$ through a sigmoid activation function, which limits the output range to $[0, 1]$.
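The binary cross entropy:
$$\mathrm{Loss} = -\frac{1}{N}\sum_{n=1}^{N}\sum_{l=1}^{L}\left[y_{nl}\log o_{nl} + (1 - y_{nl})\log(1 - o_{nl})\right]$$
wherein: Loss represents the binary cross entropy loss function, which measures the error between the given real labels and the predicted results in the multi-label classification task; $o_{nl}$ and $y_{nl}$ represent the predicted result and the real label for label $l$ of sample $n$, respectively; $N$ here is the number of samples and $L$ the number of labels.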
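A multi-label binary cross entropy can be computed as sketched below. The function name is hypothetical, and two details are assumptions rather than facts from the source: the loss is averaged over samples (not over samples and labels), and predictions are clipped by a small `eps` to keep the logarithm finite.

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Sum the per-label cross entropy over labels, then average over samples.

    y_true: list of rows of 0/1 labels (one row per sample).
    y_pred: list of rows of sigmoid outputs in [0, 1], same shape as y_true.
    """
    total = 0.0
    for y_row, o_row in zip(y_true, y_pred):
        for y, o in zip(y_row, o_row):
            o = min(max(o, eps), 1.0 - eps)   # avoid log(0)
            total -= y * math.log(o) + (1 - y) * math.log(1.0 - o)
    return total / len(y_true)

labels = [[1, 0, 0], [0, 1, 1]]
scores = [[0.9, 0.2, 0.1], [0.3, 0.8, 0.6]]
print(binary_cross_entropy(labels, scores))
```

Each label contributes its own independent term, so a sample may carry several positive labels without the terms competing, unlike a softmax cross entropy.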
Meanwhile, the reconstruction-level decisions of the distribution network are coupled and must satisfy constraints (3) to (5), wherein: the 0-1 decision variables indicate whether substation-level reconstruction ($\beta_i^{SR}$), transformer-level reconstruction ($\beta_i^{TR}$) or feeder-level reconstruction is required for each subject in period $t$; $\Omega_{Sub}$ represents the set of substation nodes; $N_i^{Trans}$ represents the total number of transformers contained in substation $i$; $f \in \mu(i)$ indicates that transformer $f$ belongs to substation $i$. Equation (3) states that a substation can execute only one of transformer-level reconstruction and substation-level reconstruction at the same time; equation (4) states that two mutually associated substations $m$ and $n$ must take the same decision on whether to execute substation-level reconstruction; equation (5) states that when the upper-level substation $i$ executes transformer-level or substation-level reconstruction, feeder-level reconstruction cannot be executed inside that substation at the same time.
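The three coupling rules can be checked programmatically as sketched below. The dictionary-based encoding, the name `beta_fr` for the feeder-level variable and the function name are hypothetical; the checks follow the verbal statements of equations (3) to (5), not their exact algebraic form.

```python
def check_level_coupling(beta_sr, beta_tr, beta_fr, transformers_of, linked_pairs):
    """Return a list of violated coupling rules for one period.

    beta_sr, beta_tr: 0-1 decisions per substation (substation / transformer level).
    beta_fr: 0-1 feeder-level decision per transformer (hypothetical symbol).
    transformers_of: substation -> list of its transformers, i.e. f in mu(i).
    linked_pairs: pairs (m, n) of mutually associated substations.
    """
    violations = []
    for i in beta_sr:
        upper = beta_sr[i] + beta_tr[i]
        if upper > 1:    # rule (3): at most one of the two upper levels
            violations.append(f"(3) substation {i}: both upper levels selected")
        if upper >= 1 and any(beta_fr[f] for f in transformers_of.get(i, [])):
            # rule (5): no feeder-level action under an active upper level
            violations.append(f"(5) substation {i}: feeder level with upper level")
    for m, n in linked_pairs:
        if beta_sr[m] != beta_sr[n]:    # rule (4): linked substations act together
            violations.append(f"(4) substations {m},{n}: inconsistent decision")
    return violations

print(check_level_coupling(
    {"A": 1, "B": 1}, {"A": 0, "B": 0}, {"T1": 0, "T2": 0},
    {"A": ["T1"], "B": ["T2"]}, [("A", "B")]))
```

A check like this could be applied to the thresholded classifier outputs before they are passed on, so that infeasible level combinations are caught early.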
In the reinforcement learning-based power distribution network optimization operation model,
the state space:
wherein: the state variables respectively represent the photovoltaic generation of the photovoltaic nodes in period t, the load reduction of the reducible-load nodes, and the power purchased from the upper-level grid; $I_{ij,t}$ represents the current flowing from node i to node j at time t.
The action space is as follows:
wherein: $w_{ij}$ represents the connection state of the switch on branch ij and is a 0-1 variable; the remaining action variables respectively represent the photovoltaic generation of the photovoltaic nodes in period t and the load reduction of the reducible-load nodes; $d$ represents the discretization granularity.
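One plausible reading of the discretization granularity is that each continuous quantity (photovoltaic output or load reduction) is restricted to evenly spaced levels. The sketch below assumes d + 1 levels between zero and a maximum value; this mapping is an assumption and is not stated in the patent.

```python
def discretize(max_value, d):
    """Return d + 1 evenly spaced levels between 0 and max_value."""
    return [max_value * k / d for k in range(d + 1)]

# Example: up to 2.0 units of adjustable power, granularity d = 4
levels = discretize(2.0, 4)
```

A coarser granularity shrinks the action space an agent has to explore, at the price of a less precise photovoltaic or load-reduction setting.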
Meanwhile, the action space must satisfy the switch-set selection constraint, as follows:
wherein: $w_{ij}$ represents the connection state of the switch on branch ij and is a 0-1 variable; $\Omega_i^{SW,Sub}$, $N_i^{SW,Sub}$, $\Omega_i^{SW,Trans}$ and $N_i^{SW,Trans}$ respectively represent the sets and the numbers of substation tie switches and transformer tie switches of substation i; the remaining set and count respectively represent the set and the number of feeder tie switches and branch sectionalizing switches of the transformers belonging to substation i. From the above formula, it can be seen that:
1) When $\beta_i^{SR}=1$ and $\beta_i^{TR}=0$, with feeder-level reconstruction not selected, substation i performs substation-level reconstruction, and the on/off states of the substation tie switches, together with the transformer tie switches, feeder tie switches and branch sectionalizing switches of the substation and its associated substations, can be freely adjusted;
2) When $\beta_i^{SR}=0$ and $\beta_i^{TR}=1$, with feeder-level reconstruction not selected, substation i performs transformer-level reconstruction, and the on/off states of the transformer tie switches, feeder tie switches and branch sectionalizing switches of the substation can be freely adjusted;
3) When $\beta_i^{SR}=0$ and $\beta_i^{TR}=0$, and feeder-level reconstruction is performed for transformer f belonging to substation i, the on/off states of the feeder tie switches and branch sectionalizing switches belonging to that transformer can be adjusted.
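The three cases above amount to a lookup from reconstruction level to the switch groups an agent may operate. The sketch below illustrates this under stated assumptions: the level labels, the group names and the dictionary layout are hypothetical, and the switches of associated substations are assumed to be already merged into the groups passed in.

```python
def adjustable_switches(level, sw):
    """Return the switch ids an agent may operate at a given level.

    level: "substation", "transformer" or "feeder" (hypothetical labels).
    sw: group name -> set of switch ids (hypothetical layout).
    """
    groups = {
        "substation": ["substation_tie", "transformer_tie",
                       "feeder_tie", "sectionalizer"],
        "transformer": ["transformer_tie", "feeder_tie", "sectionalizer"],
        "feeder": ["feeder_tie", "sectionalizer"],
    }
    allowed = set()
    for group in groups[level]:
        allowed |= sw[group]
    return allowed

sw = {
    "substation_tie": {"S1"},
    "transformer_tie": {"T1"},
    "feeder_tie": {"F1", "F2"},
    "sectionalizer": {"B1"},
}
feeder_set = adjustable_switches("feeder", sw)
```

Restricting each agent to the set returned for its level is what reduces the dimension of the action space before reinforcement learning starts.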
The state transfer function:
$s_{t+1}=f(s_t,a_t,\rho)$
where $\rho$ represents a random quantity. This indicates that the state transition is affected not only by the action $a_t$ but also by randomness: in each training episode, normally distributed noise is added to the photovoltaic output forecast baseline to simulate photovoltaic output uncertainty.
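A minimal sketch of the noise injection is given below. The standard deviation (10% of each forecast value), the clipping at zero and the function name are assumptions; the patent only states that normally distributed noise is added to the forecast baseline.

```python
import random

def sample_pv_output(forecast, sigma_ratio=0.1, rng=random):
    """Add zero-mean normal noise to each forecast value, clipped at zero."""
    return [max(0.0, p + rng.gauss(0.0, sigma_ratio * p)) for p in forecast]

# A seeded generator makes one training episode reproducible
rng = random.Random(0)
episode_pv = sample_pv_output([0.0, 1.2, 2.5, 1.0], rng=rng)
```

Drawing a fresh sample at the start of each episode exposes the agents to a different photovoltaic profile every time, which is the role the random quantity plays in the state transfer function.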
The reward function:
wherein: $\lambda_1$ and $\lambda_2$ represent the reward weight coefficients; the penalty term imposes a large penalty on the agent if the branch power exceeds the allowable upper limit. The cost terms respectively represent the economic operation cost of the power distribution network in period t, the line loss cost in period t, the switching operation cost, the load reduction cost, and the photovoltaic curtailment cost. $\Delta t$ represents the time interval; $c^{loss}$ represents the unit price of network losses; $r_{ij}$ represents the resistance of branch ij; $\Omega^{l}$ represents the set of all branches in the optimization area. The switching cost coefficients respectively represent the cost of a single operation that changes the on/off state of a feeder tie switch, a substation tie switch and a branch sectionalizing switch, and the switch state flags represent the state of each type of switch in the reconstruction area, where 0 denotes open and 1 denotes closed. $\Omega^{LR}$ represents the set of reducible loads; $c^{LR}$ represents the unit load-shedding cost; the load-shedding variable represents the load power shed at node i in period t. $\Omega^{PV}$ represents the set of photovoltaic nodes; the curtailment coefficient represents the unit photovoltaic curtailment penalty cost at time t; the photovoltaic output variable represents the photovoltaic output power of node i at time t, and the grid-connected variable represents the photovoltaic power of node i actually fed into the grid in period t. $V_i^N$ and $V_{i,t}$ are the rated voltage of node i and its actual voltage in period t, respectively; $\Omega$ denotes the set of all nodes; $R_{i,t}$ represents the load rate of node i in period t, and the average load rate of the power distribution network in period t is defined over all nodes; $P_{i,t}$ is the active power injected at node i in period t; $P_i^{max}$ is the maximum allowable injected active power of node i; $N$ represents the number of nodes of the power distribution network.
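The quantities defined above can be combined as in the following sketch. Because the reward formula is not reproduced in this text, every expression here is an assumption: line loss cost as unit price times I²r times the interval, voltage offset as summed relative deviation from the rated voltage, load balance as the variance of the node load rates, and the reward as the negative weighted sum minus a penalty.

```python
def line_loss_cost(currents, resistances, c_loss, dt):
    """Loss cost for one period: unit price * sum(I^2 * r) * interval."""
    return c_loss * dt * sum(i * i * r for i, r in zip(currents, resistances))

def voltage_offset(v, v_rated):
    """Sum of relative deviations of node voltages from their rated values."""
    return sum(abs(a - b) / b for a, b in zip(v, v_rated))

def load_imbalance(p, p_max):
    """Variance of the node load rates R_i = P_i / P_i_max."""
    rates = [a / b for a, b in zip(p, p_max)]
    mean = sum(rates) / len(rates)
    return sum((r - mean) ** 2 for r in rates) / len(rates)

def reward(cost, v_off, imbalance, lam1, lam2, penalty=0.0):
    """Assumed combination: lower cost, offset and imbalance give a higher reward."""
    return -(cost + lam1 * v_off + lam2 * imbalance) - penalty

cost = line_loss_cost([2.0], [0.5], c_loss=1.0, dt=1.0)
r = reward(cost, voltage_offset([1.0], [1.0]),
           load_imbalance([1.0, 1.0], [2.0, 2.0]), lam1=1.0, lam2=1.0)
```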
The multi-agent reinforcement learning model:
As shown in fig. 4, the optimization subjects over the 24-hour horizon are first determined by the reconstruction-level rapid decision model. Different agents are assigned to different optimization subjects in different periods, and the same agent is assigned to the same optimization subject across different periods. The distribution network topology changed by an agent's action decision in the current period, combined with the state transfer function, serves as the agent state space for the next period.
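The assignment rule can be sketched as a simple bookkeeping step. The subject labels and the integer agent ids below are hypothetical; the sketch only shows that a subject seen again in a later period reuses the agent it was given earlier.

```python
def assign_agents(subjects_by_period):
    """Give each distinct optimization subject one agent id, reused across periods."""
    agent_of = {}
    schedule = []
    for subjects in subjects_by_period:
        row = []
        for subject in subjects:
            if subject not in agent_of:
                agent_of[subject] = len(agent_of)  # create a new agent
            row.append(agent_of[subject])
        schedule.append(row)
    return agent_of, schedule

# Three periods with hypothetical subject labels
periods = [["sub_A", "feeder_B1"], ["sub_A"], ["trans_B"]]
agent_of, schedule = assign_agents(periods)
```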
The constraint conditions are as follows:
(1) Power flow constraint
wherein: $P_{i,t}$ and $Q_{i,t}$ respectively represent the active power and the reactive power of node i at time t; $V_{i,t}$ represents the voltage of node i at time t; $G_{ij}$ and $B_{ij}$ are the conductance and susceptance between adjacent nodes i and j; $\theta_{ij}$ is the voltage phase angle difference.
(2) Safe operation constraint
wherein: $w_{ij}$ indicates the switch state of branch ij, where $w_{ij}=1$ indicates that the switch on branch ij is closed; $\Omega^{SW}$ represents the set of branches with switches.
(3) Radial topology constraint of the power distribution network
wherein: $E^{Always}$ represents the total number of branches in the network that are always closed and cannot be adjusted; $L_s(i)$ represents the set of terminal nodes of the branches whose initial node is i; $L_e(i)$ represents the set of initial nodes of the branches whose terminal node is i; $N^{Sub}$ represents the number of substations of the optimization subject: $N^{Sub}=1$ if feeder-level or transformer-level reconstruction is performed, and $N^{Sub}$ equals the total number of the substation and its associated substations if substation-level reconstruction is performed. However, a power distribution network containing distributed generation (DG) may still operate in islands under the constraint of formula (32) alone, so the supplementary formulas (33)-(35) are needed: a small power $\varepsilon$ is injected at each non-substation node, and the simplified power flow constraint keeps all nodes connected. The auxiliary flow variable represents the auxiliary active power flow on branch ij at time t.
Case study verification and analysis:
The method is verified on a modified practical 145-node system. The reconstruction-level rapid decision model is trained on an existing data set, and the effectiveness of the multi-agent reinforcement learning model is analyzed using forecast data for the initial load and photovoltaic output.
As shown in fig. 5, the loss of the neural network on the validation set decreases continuously, approaches its minimum and gradually converges after 15 epochs, with no overfitting, and the prediction accuracy is 99%-100%. This indicates that the neural network has fitted the mathematically based reconstruction-level rapid decision model, so that the state of the power distribution network is sensed accurately, the decision time is shortened, and the reconstruction level of the power distribution network can be determined quickly.
As shown in fig. 6, the multi-agent reinforcement learning model for dynamic multistage reconstruction of the urban distribution network reaches the vicinity of its maximum reward after 15,000 training episodes. The reward value oscillates because the exploration setting makes the joint agents keep trying new actions, which avoids falling into a local optimum. The reward trend shows that the optimization effect improves steadily and then stabilizes, which verifies the effectiveness of the model.
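The oscillation described above is typical of exploration during training. The patent does not name its exploration scheme, so the sketch below uses the common epsilon-greedy rule purely as an assumed example of how an exploration setting makes an agent occasionally try a non-greedy action.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise the best one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    return max(range(len(q_values)), key=q_values.__getitem__)  # exploit

greedy_action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
explored_action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=1.0,
                                 rng=random.Random(1))
```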
The invention considers the spatio-temporal flexibility requirements of the net load of the urban power distribution network and the differing regulation capabilities of the multiple types of tie switches, and mainly studies a machine-learning-based strategy for rapid reconstruction-level decision and operation optimization. First, the reconstruction level of the power distribution network is decided rapidly by the reconstruction-level rapid decision model, which reduces the dimension of the subsequent reinforcement learning action space. Second, a reinforcement-learning-based optimal operation model of the power distribution network is established, which targets the economic operation cost, the voltage offset and the load balance, and considers constraints such as the radial topology of the network and the power flow. Finally, the reconstruction optimization subjects of the power distribution network in each period are divided according to the 24-hour reconstruction-level decision results, and a multi-agent reinforcement learning model is established and jointly optimized.
The above is a preferred embodiment of the present invention. Any change made according to the technical solution of the present invention falls within the protection scope of the present invention, provided that the resulting functional effects do not go beyond the scope of the technical solution.
Claims (3)
1. A machine-learning-based multistage dynamic reconfiguration method for an urban power distribution network, characterized by comprising the following steps:
S1: fitting the historical operation data of the power distribution network to the reconstruction level decisions using a neural-network-based multi-label classification method, so as to realize on-line decision of the reconstruction level of the power distribution network;
S2: taking the power and current characteristics of each node in the power distribution network at each moment as the agent state space, and taking the on/off states of the branches and the photovoltaic curtailment and load shedding rates as the action space, wherein the reward function comprehensively considers the operation cost, the voltage offset index and the load balance degree of the power distribution network, and the state transition takes the uncertainty of photovoltaic output into account, thereby constructing an urban power distribution network reconstruction model based on reinforcement learning;
S3: performing joint optimization training on the different agents assigned over the whole optimization horizon, and constructing a multi-agent reinforcement learning joint optimization model;
the step S1 is specifically as follows:
node load, upper power grid interaction quantity and photovoltaic output measured by the whole system of the power distribution network are used as input characteristics of a neural network, and reconstruction levels of all the main bodies are used as output of the neural network; the neural network structure has 4 layers, namely an input layer, a hidden layer formed by two full-connection layers and an output layer;
the neural network structure is as follows:
wherein: $w=\{W^{(1)},b^{(1)},W^{(2)},b^{(2)},W^{(3)},b^{(3)}\}$; $f_o$ represents the sigmoid activation function, and the two hidden-layer activation functions are linear rectification (ReLU) functions; each of the two hidden layers has $F$ neurons; the input-layer neurons are connected to the first hidden layer $h\in R^{F\times 1}$ through the weight $W^{(1)}\in R^{F\times D}$ and the bias term $b^{(1)}\in R^{F\times 1}$; the first hidden layer $h\in R^{F\times 1}$ is connected to the second hidden layer through the weight $W^{(2)}\in R^{F\times F}$ and the bias term $b^{(2)}\in R^{F\times 1}$, and the second hidden layer is finally connected to the output-layer neurons $o\in R^{L\times 1}$ through a sigmoid activation function, which limits the output range to $[0,1]$;
in step S2,
state space:
wherein: the state variables respectively represent the photovoltaic generation of the photovoltaic nodes in period t, the load reduction of the reducible-load nodes, and the power purchased from the upper-level grid; $I_{ij,t}$ represents the current flowing from node i to node j at time t;
action space:
wherein: $w_{ij}$ represents the connection state of the switch on branch ij and is a 0-1 variable; the remaining action variables respectively represent the photovoltaic generation of the photovoltaic nodes in period t and the load reduction of the reducible-load nodes; $d$ represents the discretization granularity;
state transfer function:
$s_{t+1}=f(s_t,a_t,\rho)$
wherein: $\rho$ represents a random quantity; the state transition is affected not only by the action $a_t$ but also by randomness: on the basis of the photovoltaic output forecast baseline, normally distributed noise is added in each training episode to simulate the uncertainty of the photovoltaic output;
in step S2, the bonus function is:
wherein: $\lambda_1$ and $\lambda_2$ represent the reward weight coefficients; the penalty term imposes a corresponding penalty on the agent if the branch power exceeds the allowable upper limit; the cost terms respectively represent the economic operation cost of the power distribution network in period t, the line loss cost in period t, the switching operation cost, the load reduction cost, and the photovoltaic curtailment cost; $\Delta t$ represents the time interval; $c^{loss}$ represents the unit price of network losses; $r_{ij}$ represents the resistance of branch ij; $\Omega^{l}$ represents the set of all branches in the optimization area; the switching cost coefficients respectively represent the cost of a single operation that changes the on/off state of a feeder tie switch, a substation tie switch and a branch sectionalizing switch, and the switch state flags represent the state of each type of switch in the reconstruction area, where 0 denotes open and 1 denotes closed; $\Omega^{LR}$ represents the set of reducible loads; $c^{LR}$ represents the unit load-shedding cost; the load-shedding variable represents the load power shed at node i in period t; $\Omega^{PV}$ represents the set of photovoltaic nodes; the curtailment coefficient represents the unit photovoltaic curtailment penalty cost at time t; the photovoltaic output variable represents the photovoltaic output power of node i at time t, and the grid-connected variable represents the photovoltaic power of node i actually fed into the grid in period t; $V_i^N$ and $V_{i,t}$ are the rated voltage of node i and its actual voltage in period t, respectively; $\Omega$ denotes the set of all nodes; $R_{i,t}$ represents the load rate of node i in period t, and the average load rate of the power distribution network in period t is defined over all nodes; $P_{i,t}$ is the active power injected at node i in period t; $P_i^{max}$ is the maximum allowable injected active power of node i; $N$ represents the number of nodes of the power distribution network.
2. The method for multistage dynamic reconfiguration of an urban power distribution network based on machine learning according to claim 1, wherein the multi-agent reinforcement learning joint optimization model of step S3 firstly determines an optimization subject of 24-hour period through step S1, and distributes different agents to different optimization subjects of different periods; distributing the same agent to the same optimizing subject in different time periods; and the distribution network structure changed by the execution action decision of the agent in the current period is matched with a state transfer function to serve as an agent state space in the next period.
3. A machine-learning-based multistage dynamic reconfiguration system for an urban power distribution network, characterized by comprising: a reconstruction level fast decision model and a power distribution network optimization operation model based on reinforcement learning;
the reconstruction level fast decision model comprises: the system comprises a power distribution network state rapid sensing module, a reconstruction level decision module and a first information interaction module;
the power distribution network state rapid sensing module is used for monitoring, in real time, the photovoltaic generation of the photovoltaic nodes of the power distribution network, the load reduction of the reducible-load nodes, the power exchanged with the upper-level grid, and the load demand;
the reconstruction level decision module is used for deciding the reconstruction level of the power distribution network according to the running state of the urban power distribution network and limiting the optimization main body range of the reinforcement learning agent;
the first information interaction module is used for transmitting the reconstruction level decision result to the power distribution network optimization operation model based on reinforcement learning;
the power distribution network optimization operation model based on reinforcement learning comprises a second information interaction module, a power distribution network state accurate sensing module, an experience pool module, a tie switch action module, a photovoltaic output decision module, a load reduction decision module, an agent joint operation module and a reconstruction decision module;
the second information interaction module is used for receiving a reconstruction level decision result of the reconstruction level quick decision model;
the power distribution network state accurate sensing module is used for accurately sensing, in real time, the photovoltaic generation of the power distribution network, the load reduction of the reducible-load nodes, the power exchanged with the upper-level grid, the load demand and the branch currents;
the experience pool module is used for storing historical operation environments of the power distribution network and rewarding values obtained after decision making is carried out on the historical operation environments and the corresponding agents;
the tie switch action module is used for remotely controlling the opening and closing of the tie switches of all branches according to the corresponding reconstruction scheme;
the photovoltaic output decision module is used for deciding the photovoltaic curtailment of the photovoltaic nodes under the corresponding reconstruction scheme;
the load reduction decision module is used for deciding the load shedding of the reducible-load nodes under the corresponding reconstruction scheme;
the intelligent agent joint operation module is used for carrying out power distribution network optimization operation decision by combining all intelligent agents;
a reconstruction decision module for controlling each module according to the machine learning-based multistage dynamic reconstruction method for the urban power distribution network according to any one of claims 1-2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210399965.7A CN114662982B (en) | 2022-04-15 | 2022-04-15 | Multistage dynamic reconstruction method for urban power distribution network based on machine learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114662982A CN114662982A (en) | 2022-06-24 |
CN114662982B true CN114662982B (en) | 2023-07-14 |
Family
ID=82035547
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210399965.7A Active CN114662982B (en) | 2022-04-15 | 2022-04-15 | Multistage dynamic reconstruction method for urban power distribution network based on machine learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114662982B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110716550A (en) * | 2019-11-06 | 2020-01-21 | 南京理工大学 | Gear shifting strategy dynamic optimization method based on deep reinforcement learning |
CN112491037A (en) * | 2020-11-09 | 2021-03-12 | 四川大学 | Multi-target multi-stage dynamic reconstruction method and system for urban power distribution network |
CN114123178A (en) * | 2021-11-17 | 2022-03-01 | 哈尔滨工程大学 | Intelligent power grid partition network reconstruction method based on multi-agent reinforcement learning |
CN114282330A (en) * | 2021-12-28 | 2022-04-05 | 山东科技大学 | Distribution network real-time dynamic reconstruction method and system based on branch dual-depth Q network |
CN114298429A (en) * | 2021-12-30 | 2022-04-08 | 国网北京市电力公司 | Power distribution network scheme aided decision-making method, system, device and storage medium |
Non-Patent Citations (1)
Title |
---|
基于蚁群最优算法的配电网重构;蔡国伟;张言滨;孙铭泽;辛鹏;王继松;;东北电力大学学报(自然科学版)(04);9-14 * |
Also Published As
Publication number | Publication date |
---|---|
CN114662982A (en) | 2022-06-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||