CN114662982B - Multistage dynamic reconstruction method for urban power distribution network based on machine learning


Publication number
CN114662982B
Authority
CN
China
Prior art keywords
distribution network
power distribution
representing
load
reconstruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210399965.7A
Other languages
Chinese (zh)
Other versions
CN114662982A (en
Inventor
高红均
王子晗
贺帅佳
马望
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University
Priority to CN202210399965.7A
Publication of CN114662982A
Application granted
Publication of CN114662982B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0637 Strategic management or analysis, e.g. setting a goal or target of an organisation; Planning actions based on goals; Analysis or evaluation of effectiveness of goals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/10 Geometric CAD
    • G06F30/18 Network design, e.g. design based on topological or interconnect aspects of utility systems, piping, heating ventilation air conditioning [HVAC] or cabling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00 Circuit arrangements for ac mains or ac distribution networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2113/00 Details relating to the application field
    • G06F2113/04 Power grid distribution networks
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2203/00 Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
    • H02J2203/20 Simulating, e.g. planning, reliability check, modelling or computer assisted design [CAD]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Evolutionary Computation (AREA)
  • Geometry (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Tourism & Hospitality (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Educational Administration (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Primary Health Care (AREA)
  • Medical Informatics (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Analysis (AREA)
  • Software Systems (AREA)
  • Power Engineering (AREA)
  • Water Supply & Treatment (AREA)
  • Development Economics (AREA)
  • Public Health (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The invention relates to a machine learning-based multistage dynamic reconstruction method for an urban power distribution network and belongs to the technical field of dynamic reconstruction of urban power distribution networks. The method establishes a multi-agent reinforcement learning model and jointly optimizes the different reconstruction subjects in each period. A deep Q network with parameter freezing and experience replay mechanisms learns environmental information such as the predicted load and the photovoltaic output power, and the learned strategy set is then used to dynamically reconstruct and optimize the power distribution network with the objectives of optimal operating cost, voltage offset and load balance. The invention enables efficient, safe and economic operation of the urban power distribution network.

Description

Multistage dynamic reconstruction method for urban power distribution network based on machine learning
Technical Field
The invention belongs to the technical field of dynamic reconstruction of urban power distribution networks, and particularly relates to a multistage dynamic reconstruction method of an urban power distribution network based on machine learning.
Background
The integration of high-penetration distributed generation and the rapid growth of unbalanced loads in urban areas have made the spatio-temporal distribution of the net load of urban power distribution networks extremely uneven, which poses new challenges to the safe and economic operation of the distribution network. Distribution network reconstruction is one of the active management measures for distribution networks: the network structure is adjusted by changing the on/off states of tie switches and sectionalizing switches in order to reduce network losses, balance loads, eliminate line overloads and improve the accommodation of clean energy. However, traditional optimization methods rely on an explicit model, forecasting techniques and an optimization solver, so solving is time-consuming and online decision-making is difficult to achieve; the uncertainty introduced by large-scale integration of distributed sources such as wind and photovoltaic generation further increases the difficulty of solving. Therefore, in the face of an increasingly complex grid environment, how to select a reconstruction strategy for the urban distribution network, how to make online decisions on the reconstruction level, and how to handle the uncertainty of distributed generation have become important problems to be studied in the context of the new-type power system.
Disclosure of Invention
The invention aims to provide a machine learning-based multistage dynamic reconstruction method for an urban power distribution network in order to solve the following technical problems in the prior art: traditional optimization methods rely on an explicit model, forecasting techniques and an optimization solver, so solving is time-consuming and online decision-making is difficult to achieve; the uncertainty introduced by large-scale integration of distributed sources such as wind and photovoltaic generation further increases the difficulty of solving. Therefore, in the face of an increasingly complex grid environment, how to select a reconstruction strategy for the urban distribution network, how to make online decisions on the reconstruction level, and how to handle the uncertainty of distributed generation have become important problems to be studied in the context of the new-type power system.
In order to achieve the above purpose, the technical scheme of the invention is as follows:
a machine learning-based multistage dynamic reconstruction method for an urban power distribution network comprises the following steps:
S1: fitting the historical operation data of the distribution network to the reconstruction level decisions with a neural network multi-label classification method, so as to realize online decision of the reconstruction level of the distribution network;
S2: taking the power and current characteristics of each node of the distribution network at each moment as the agent state space, and the on/off states of the branches together with the photovoltaic curtailment and load shedding rates as the action space; letting the reward function jointly consider the operating cost, the voltage offset index and the load balance degree of the distribution network; accounting for the uncertainty of the photovoltaic output in the state transition; and thereby constructing a reinforcement learning-based reconstruction model of the urban distribution network;
S3: jointly training the different agents assigned over the whole optimization horizon, and constructing a multi-agent reinforcement learning joint optimization model.
Further, the step S1 specifically includes:
the node loads, the power exchanged with the upper-level grid and the photovoltaic output measured across the whole distribution system are used as the input features of a neural network, and the reconstruction levels of all subjects are used as its output; the neural network has 4 layers, namely an input layer, two fully connected hidden layers and an output layer;
the neural network structure is as follows:
Figure BDA0003599432400000021
wherein: $W = \{W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)}, W^{(3)}, b^{(3)}\}$; $f_o$ denotes the sigmoid activation function; the functions shown in Figure BDA0003599432400000022 and Figure BDA0003599432400000023 denote rectified linear unit (ReLU) activation functions; each of the two hidden layers has $F$ neurons; the input-layer neurons are connected to the first hidden layer $h \in \mathbb{R}^{F \times 1}$ through the weight $W^{(1)} \in \mathbb{R}^{F \times D}$ and the bias term $b^{(1)} \in \mathbb{R}^{F \times 1}$; the first hidden layer $h \in \mathbb{R}^{F \times 1}$ is connected to the second hidden layer through the weight $W^{(2)} \in \mathbb{R}^{F \times F}$ and the bias term $b^{(2)} \in \mathbb{R}^{F \times 1}$; finally, the second hidden layer is connected to the output-layer neurons $o \in \mathbb{R}^{L \times 1}$ through a sigmoid activation function, which limits the output to the range $[0, 1]$.
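The layer structure described above can be sketched as a plain forward pass. This is a minimal illustration, not the patented implementation: the shape of $W^{(3)}$ (taken here as $L \times F$), the random initialization, the sizes D, F and L, and the 0.5 threshold used to turn outputs into 0-1 labels are all assumptions introduced for the example.

```python
import math
import random

def relu(v):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, x) for x in v]

def sigmoid(v):
    """Sigmoid applied element-wise; maps every value into (0, 1)."""
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def affine(W, b, x):
    """Return W x + b for a row-major weight matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def init_layer(rows, cols, rng):
    """Hypothetical initialization: small uniform weights, zero biases."""
    W = [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return W, [0.0] * rows

def forward(params, x):
    """Input -> ReLU hidden layer -> ReLU hidden layer -> sigmoid output."""
    (W1, b1), (W2, b2), (W3, b3) = params
    h1 = relu(affine(W1, b1, x))
    h2 = relu(affine(W2, b2, h1))
    return sigmoid(affine(W3, b3, h2))

rng = random.Random(0)
D, F, L = 6, 8, 3            # assumed sizes: D input features, F hidden units, L labels
params = [init_layer(F, D, rng), init_layer(F, F, rng), init_layer(L, F, rng)]
x = [rng.random() for _ in range(D)]      # stand-in for measured load / PV features
o = forward(params, x)                    # one score per reconstruction-level label
decisions = [1 if p >= 0.5 else 0 for p in o]   # assumed decision threshold
```

Because each output passes through its own sigmoid, the labels are scored independently, which is what makes the model a multi-label classifier rather than a single-choice one.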
Further, in step S2,
state space:
Figure BDA0003599432400000024
wherein: the symbols shown in Figure BDA0003599432400000025 respectively denote the photovoltaic power generation of the photovoltaic nodes in period t, the load reduction of the reducible-load nodes and the power purchased from the upper-level grid; $I_{ij,t}$ denotes the current flowing from node i to node j at time t;
action space:
Figure BDA0003599432400000026
wherein: $w_{ij}$ denotes the connection state of the switch on branch ij and is a 0-1 variable; the symbols shown in Figure BDA0003599432400000027 respectively denote the photovoltaic power generation of the photovoltaic nodes in period t and the load reduction of the reducible-load nodes; $d$ denotes the discretization granularity;
state transition function:
$s_{t+1} = f(s_t, a_t, \rho)$
wherein: $\rho$ denotes a random quantity; that is, the state transition is affected not only by the action $a_t$ but also by randomness: on top of the photovoltaic output forecast baseline, normally distributed noise is added in each training run to simulate the uncertainty of the photovoltaic output.
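The noise injection in the state transition can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `sample_pv_output`, the choice of a standard deviation proportional to the forecast (`sigma_ratio`), and the clipping at zero are not taken from the patent text.

```python
import random

def sample_pv_output(forecast, sigma_ratio, rng):
    """Perturb a PV forecast baseline with Gaussian noise.

    forecast    -- forecast PV output per period (any consistent power unit)
    sigma_ratio -- assumed noise level: standard deviation as a fraction of the forecast
    rng         -- random.Random instance, so that each training run can be reseeded
    """
    # Negative samples are clipped to zero because PV output cannot be negative.
    return [max(0.0, rng.gauss(p, sigma_ratio * p)) for p in forecast]

baseline = [0.0, 0.0, 1.2, 3.5, 4.8, 3.1, 0.9, 0.0]   # illustrative values only
episode_pv = sample_pv_output(baseline, 0.1, random.Random(42))
```

Drawing a fresh sample for every training episode exposes the agent to many plausible photovoltaic profiles around the same forecast, which is one way to realize the random quantity ρ.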
Further, in step S2, the reward function is:
Figure BDA0003599432400000031
Figure BDA0003599432400000032
Figure BDA0003599432400000033
Figure BDA0003599432400000034
Figure BDA0003599432400000035
Figure BDA0003599432400000036
Figure BDA0003599432400000037
Figure BDA0003599432400000038
Figure BDA0003599432400000039
Figure BDA00035994324000000310
Figure BDA00035994324000000311
wherein: $\lambda_1$ and $\lambda_2$ denote the reward weight coefficients;
the symbol shown in Figure BDA00035994324000000312 denotes the penalty applied to the agent when a branch power exceeds its allowable upper limit;
the symbol shown in Figure BDA00035994324000000313 denotes the economic operating cost of the distribution network in period t;
the symbol shown in Figure BDA00035994324000000314 denotes the line loss cost in period t;
the symbol shown in Figure BDA00035994324000000315 denotes the switching cost;
a further symbol, whose image is missing from the source, denotes the load-shedding cost;
the symbol shown in Figure BDA00035994324000000317 denotes the photovoltaic curtailment cost;
$\Delta t$ denotes the time interval; $c_{loss}$ denotes the unit price of network loss; $r_{ij}$ denotes the resistance of branch ij; $\Omega_l$ denotes the set of all branches in the optimization area;
the symbols shown in Figure BDA00035994324000000318 respectively denote the cost of a single operation when changing the on/off state of a feeder tie switch, a substation tie switch and a branch sectionalizing switch;
the symbols shown in Figure BDA00035994324000000319 respectively denote the state flags of each kind of tie switch in the reconstruction area, where 0 means open and 1 means closed;
$\Omega_{LR}$ denotes the set of reducible loads; $c_{LR}$ denotes the unit load-shedding cost;
the symbol shown in Figure BDA0003599432400000041 denotes the load power shed at node i in period t;
$\Omega_{PV}$ denotes the set of photovoltaic nodes;
the symbol shown in Figure BDA0003599432400000042 denotes the unit curtailment penalty cost at time t;
the symbol shown in Figure BDA0003599432400000043 denotes the photovoltaic output power of node i at time t;
the symbol shown in Figure BDA0003599432400000044 denotes the photovoltaic power of node i actually fed into the grid in period t;
$V_i^N$ and $V_{i,t}$ are respectively the rated voltage of node i and its actual voltage in period t; $\Omega$ denotes the set of all nodes; $R_{i,t}$ denotes the load rate of node i in period t;
the symbol shown in Figure BDA0003599432400000045 denotes the average load rate of the distribution network in period t;
$P_{i,t}$ is the active power injected at node i in period t; $P_i^{\max}$ is the maximum allowable active power injection at node i; $N$ denotes the number of nodes of the distribution network.
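Since the reward equations survive only as image placeholders, the following sketch shows one plausible way the listed quantities could combine. Every functional form here is an assumption and not the patent's formula: the quadratic $I^2 r$ line loss, the relative voltage deviation, the load rate $P_{i,t} / P_i^{\max}$, the mean squared deviation used as the balance index, and the sign convention of `reward` are all illustrative.

```python
def line_loss_cost(currents, resistances, c_loss, dt):
    """Assumed form: c_loss * dt * sum(I_ij^2 * r_ij) over all branches (consistent units)."""
    return c_loss * dt * sum(i * i * r for i, r in zip(currents, resistances))

def voltage_offset(voltages, rated):
    """Assumed form: sum over nodes of |V_i,t - V_i^N| / V_i^N."""
    return sum(abs(v - vn) / vn for v, vn in zip(voltages, rated))

def load_balance(p_inject, p_max):
    """Assumed form: mean squared deviation of node load rates from their average."""
    rates = [p / pm for p, pm in zip(p_inject, p_max)]   # assumed R_i,t = P_i,t / P_i^max
    mean_rate = sum(rates) / len(rates)
    return sum((r - mean_rate) ** 2 for r in rates) / len(rates)

def reward(cost, dv, lb, penalty, lam1, lam2):
    """Assumed sign convention: lower cost, offset, imbalance and penalty give higher reward."""
    return -(cost + lam1 * dv + lam2 * lb + penalty)

# Illustrative numbers only.
cost = line_loss_cost([10.0, 20.0], [0.1, 0.2], c_loss=0.5, dt=1.0)
dv = voltage_offset([0.95, 1.0, 1.05], [1.0, 1.0, 1.0])
lb = load_balance([1.0, 3.0], [2.0, 6.0])
r = reward(cost, dv, lb, penalty=0.0, lam1=1.0, lam2=1.0)
```

The switching, load-shedding and curtailment costs would be added to `cost` in the same way; they are left out here to keep the sketch short.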
Further, in the multi-agent reinforcement learning joint optimization model of step S3, the optimization subjects of each period in the 24-hour horizon are first determined through step S1; different agents are assigned to different optimization subjects in different periods, and the same agent is assigned to the same optimization subject across different periods; the distribution network structure resulting from the action decision executed by an agent in the current period, combined with the state transition function, serves as the agent state space of the next period.
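The assignment rule above (one agent per optimization subject, reused whenever that subject reappears in a later period) can be sketched as a simple lookup. The subject identifiers such as `("feeder", 3)` and the function name `assign_agents` are hypothetical; in the method itself the subjects of each period come from the step S1 decision.

```python
def assign_agents(subjects_per_period):
    """Map each optimization subject to an agent id, reusing ids across periods.

    subjects_per_period -- list with one entry per period, each a list of
                           hashable subject identifiers
    Returns (schedule, agent_of): schedule[t] maps the subjects active in
    period t to agent ids; agent_of is the global subject -> agent id table.
    """
    agent_of = {}
    schedule = []
    for subjects in subjects_per_period:
        period_agents = {}
        for subject in subjects:
            if subject not in agent_of:          # first appearance: create a new agent
                agent_of[subject] = len(agent_of)
            period_agents[subject] = agent_of[subject]
        schedule.append(period_agents)
    return schedule, agent_of

# Three illustrative periods instead of 24.
periods = [
    [("feeder", 3)],
    [("substation", 1), ("feeder", 3)],
    [("substation", 1)],
]
schedule, agent_of = assign_agents(periods)
```

Reusing one agent per subject keeps each agent's action space limited to the switches of its own subject, which is the dimensionality reduction the reconstruction level decision is meant to provide.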
A machine learning-based multistage dynamic reconstruction system for an urban power distribution network, comprising: a reconstruction level fast decision model and a reinforcement learning-based distribution network optimization operation model;
the reconstruction level fast decision model comprises: a distribution network state fast sensing module, a reconstruction level decision module and a first information interaction module;
the distribution network state fast sensing module is used for monitoring, in real time, the photovoltaic power generation of the photovoltaic nodes, the load reduction of the reducible-load nodes, the power exchanged with the upper-level grid and the load demand of the distribution network;
the reconstruction level decision module is used for deciding the reconstruction level of the distribution network according to the operating state of the urban distribution network, thereby limiting the range of optimization subjects of the reinforcement learning agents;
the first information interaction module is used for transmitting the reconstruction level decision result to the reinforcement learning-based distribution network optimization operation model;
the reinforcement learning-based distribution network optimization operation model comprises a second information interaction module, a distribution network state accurate sensing module, an experience pool module, a tie switch action module, a photovoltaic output decision module, a load reduction decision module, an agent joint operation module and a reconstruction decision module;
the second information interaction module is used for receiving the reconstruction level decision result of the reconstruction level fast decision model;
the distribution network state accurate sensing module is used for accurately sensing, in real time, the photovoltaic power generation of the distribution network, the load reduction of the reducible-load nodes, the power exchanged with the upper-level grid, the load demand and the branch currents;
the experience pool module is used for storing the historical operating environments of the distribution network and the reward values obtained after the corresponding agents make decisions in those environments;
the tie switch action module is used for remotely controlling the opening and closing of the tie switches of all branches according to the corresponding reconstruction scheme;
the photovoltaic output decision module is used for deciding the photovoltaic curtailment of the photovoltaic nodes under the corresponding reconstruction scheme;
the load reduction decision module is used for deciding the load shedding of the reducible-load nodes under the corresponding reconstruction scheme;
the agent joint operation module is used for making the distribution network optimization operation decision by combining all agents;
and the reconstruction decision module is used for controlling each module accordingly, following the machine learning-based multistage dynamic reconstruction method for the urban power distribution network.
Compared with the prior art, the invention has the following beneficial effects:
the scheme provides a machine learning-based multistage dynamic reconstruction method for an urban power distribution network, in which a reconstruction level fast decision model and a multi-agent deep reinforcement learning model are established to realize real-time decisions on both the reconstruction level and the optimized operation. First, the neural network-based reconstruction level fast decision model realizes online decision of the reconstruction level and provides a real-time reference for dispatchers; at the same time, dividing the optimization subjects avoids the exponential growth of the action space that occurs when a single traditional deep reinforcement learning agent optimizes multiple subjects. Second, the uncertainty of the photovoltaic output is simulated through the state transition function, and training accuracy converges through a large amount of training, which addresses the difficulty of solving problems that contain uncertainty. Finally, a multi-agent joint solution model is established to solve the multistage dynamic reconstruction problem of the distribution network under uncertainty; the model does not need to be solved again for similar operating states of the distribution network, which makes it practical.
Drawings
Fig. 1 is a schematic diagram of a multistage dynamic reconstruction method of an urban power distribution network based on machine learning.
FIG. 2 is a schematic diagram of the operation of the reconstruction level fast decision model of the present invention.
Fig. 3 is a working principle diagram of a reconstruction optimization operation model of the power distribution network based on reinforcement learning.
FIG. 4 is a block diagram of a multi-agent reinforcement learning architecture of the present invention.
FIG. 5 is a graph of the reconstruction level fast decision model fitting results of the present invention.
FIG. 6 is a graph of the results of the multi-agent reinforcement learning model optimization of the present invention.
Detailed Description
For the purpose of making the technical solution and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings and examples. It should be understood that the particular embodiments described herein are illustrative only and are not intended to limit the invention, i.e., the embodiments described are merely some, but not all, of the embodiments of the invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present invention. It is noted that relational terms such as "first" and "second", and the like, are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.
The features and capabilities of the present invention are described in further detail below in connection with the examples.
Examples:
Traditional optimization methods rely on an explicit model, forecasting techniques and an optimization solver, so solving is time-consuming and online decision-making is difficult to achieve; the uncertainty introduced by large-scale integration of distributed sources such as wind and photovoltaic generation further increases the difficulty of solving. Therefore, in the face of an increasingly complex grid environment, how to select a reconstruction strategy for the urban distribution network, how to make online decisions on the reconstruction level, and how to handle the uncertainty of distributed generation have become important problems to be studied in the context of the new-type power system.
As shown in fig. 1, based on the measurement characteristics of the spatio-temporal big data of the power system, the invention takes the node loads, the power exchanged with the upper-level grid and the photovoltaic output measured across the whole system as input features, so that the sample data carry spatio-temporal characteristics; labels are provided for the neural network multi-label classification fitting model, binary cross entropy is used as the loss function, and a reconstruction level fast decision model is established, which realizes fast decision of the reconstruction level of the distribution network and reduces the dimension of the subsequent reinforcement learning action space. The distribution network whose reconstruction level has been decided is then optimized by a single-agent reinforcement learning model with parameter freezing and experience replay mechanisms, with the objectives of optimal economic operating cost, voltage offset and load balance, subject to constraints such as the radial structure of the distribution network and power flow. Finally, the reconstruction optimization subjects of the distribution network in each period are divided according to the 24-hour reconstruction level decision results, a multi-agent reinforcement learning model is established and jointly optimized, and the validity of the method is verified on an example system.
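The two mechanisms named above, experience replay and parameter freezing, can be illustrated with a deliberately simplified sketch. It swaps the deep Q network for a tabular Q function so that it runs with the standard library alone; the class and function names, the learning rate, the discount factor and the synchronization interval are all assumptions made for the example.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size experience pool; the oldest transitions are dropped first."""
    def __init__(self, capacity):
        self.data = deque(maxlen=capacity)

    def push(self, s, a, r, s_next, done):
        self.data.append((s, a, r, s_next, done))

    def sample(self, batch_size, rng):
        """Draw a random mini-batch, which breaks the correlation of consecutive steps."""
        return rng.sample(list(self.data), min(batch_size, len(self.data)))

def train(transitions, n_actions, gamma=0.9, lr=0.5, sync_every=5, batch_size=4, seed=0):
    """Tabular stand-in for DQN training with a frozen target table."""
    rng = random.Random(seed)
    q_online, q_target = {}, {}          # q_target plays the role of the frozen parameters
    buffer = ReplayBuffer(100)
    for step, transition in enumerate(transitions, start=1):
        buffer.push(*transition)
        for s, a, r, s_next, done in buffer.sample(batch_size, rng):
            # The bootstrap value is read from the frozen copy, not the table being updated.
            best_next = 0.0 if done else max(
                q_target.get((s_next, b), 0.0) for b in range(n_actions))
            old = q_online.get((s, a), 0.0)
            q_online[(s, a)] = old + lr * (r + gamma * best_next - old)
        if step % sync_every == 0:       # periodic synchronization of the frozen copy
            q_target = dict(q_online)
    return q_online, q_target

# Ten copies of one terminal transition: state 0, action 1, reward 1.0.
q_online, q_target = train([(0, 1, 1.0, 0, True)] * 10, n_actions=2)
```

In a deep Q network the two tables become two networks with identical architecture, and the synchronization step copies the online weights into the target network.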
As shown in fig. 2, the reconstruction level fast decision model consists of a distribution network state fast sensing module, a reconstruction level decision module and an information interaction module. The distribution network state fast sensing module is used for monitoring, in real time, the photovoltaic power generation of the photovoltaic nodes, the load reduction of the reducible-load nodes, the power exchanged with the upper-level grid and the load demand of the urban distribution network; the reconstruction level decision module is used for deciding the reconstruction level of the distribution network according to the operating state of the urban distribution network, thereby limiting the range of optimization subjects of the reinforcement learning agents; the information interaction module is used for transmitting the reconstruction level decision result to the reinforcement learning-based distribution network optimization operation model.
As shown in fig. 3, the reinforcement learning-based distribution network optimization operation model consists of an information interaction module, a distribution network state accurate sensing module, an experience pool module, a tie switch action module, a photovoltaic output decision module, a load reduction decision module, an agent joint operation module and a reconstruction decision module. The information interaction module is used for receiving the reconstruction level decision result of the reconstruction level fast decision model; the distribution network state accurate sensing module is used for accurately sensing, in real time, the photovoltaic power generation of the distribution network, the load reduction of the reducible-load nodes, the power exchanged with the upper-level grid, the load demand and the branch currents; the experience pool module is used for storing the historical operating environments of the distribution network and the reward values obtained after the corresponding agents make decisions in those environments; the tie switch action module is used for remotely controlling the opening and closing of the tie switch of each branch according to the corresponding reconstruction scheme; the photovoltaic output decision module is used for deciding the photovoltaic curtailment of the photovoltaic nodes under the corresponding reconstruction scheme; the load reduction decision module is used for deciding the load shedding of the reducible-load nodes under the corresponding reconstruction scheme; the agent joint operation module is used for making the distribution network optimization operation decision by combining all agents;
the reconstruction decision module is used for controlling each module accordingly, following the machine learning-based multistage dynamic reconstruction method for the distribution network established by the invention.
In the reconstruction level fast decision model,
the neural network structure:
Figure BDA0003599432400000071
wherein: $W = \{W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)}, W^{(3)}, b^{(3)}\}$; $f_o$ denotes the sigmoid activation function; the functions shown in Figure BDA0003599432400000072 and Figure BDA0003599432400000073 denote rectified linear unit (ReLU) activation functions. Each of the two hidden layers has $F$ neurons. The input-layer neurons are connected to the first hidden layer $h \in \mathbb{R}^{F \times 1}$ through the weight $W^{(1)} \in \mathbb{R}^{F \times D}$ and the bias term $b^{(1)} \in \mathbb{R}^{F \times 1}$. The first hidden layer $h \in \mathbb{R}^{F \times 1}$ is connected to the second hidden layer through the weight $W^{(2)} \in \mathbb{R}^{F \times F}$ and the bias term $b^{(2)} \in \mathbb{R}^{F \times 1}$; finally, the second hidden layer is connected to the output-layer neurons $o \in \mathbb{R}^{L \times 1}$ through a sigmoid activation function, which limits the output to the range $[0, 1]$.
The binary cross entropy:
Figure BDA0003599432400000081
wherein: Loss denotes the binary cross entropy loss function, which measures the error between the given true labels and the predicted results in a multi-label classification task; $o_{nl}$ and $y_{nl}$ respectively denote the predicted result and the true label for label l.
Meanwhile, the reconstruction level decisions of the distribution network are subject to the following coupling constraints:
Figure BDA0003599432400000082
Figure BDA0003599432400000083
(for associated substations m and n)
Figure BDA0003599432400000084
wherein: the symbols shown in Figure BDA0003599432400000085 respectively denote whether the different subjects need to perform substation-level, transformer-level or feeder-level reconstruction in period t, and are 0-1 variables; $\Omega_{Sub}$ denotes the set of substation nodes; $N_i^{Trans}$ denotes the total number of transformers in substation i; $f \in \mu(i)$ denotes a transformer f belonging to substation i. Equation (3) states that a substation can perform only transformer-level reconstruction or substation-level reconstruction at any one time; equation (4) states that associated substations m and n make the same decision on whether to perform substation-level reconstruction; equation (5) states that when the upper-level substation i performs transformer-level or substation-level reconstruction, feeder-level reconstruction cannot be performed inside that substation at the same time.
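The three coupling rules described in words above can be checked with a small validator. This is a simplified sketch: it keeps one 0-1 flag per substation for each level, whereas the source indexes the transformer-level decision per transformer f ∈ μ(i); the argument names `sr`, `tr`, `fr` and `associated` are hypothetical.

```python
def check_level_coupling(sr, tr, fr, associated):
    """Return a list of violated coupling rules for one period.

    sr, tr, fr -- dicts: substation id -> 0/1 flag for substation-level,
                  transformer-level and feeder-level reconstruction
    associated -- list of (m, n) pairs of mutually associated substations
    """
    violations = []
    for i in sr:
        if sr[i] + tr[i] > 1:                 # rule (3): not both levels at once
            violations.append(("exclusive", i))
        if (sr[i] or tr[i]) and fr[i]:        # rule (5): no feeder level underneath
            violations.append(("feeder", i))
    for m, n in associated:
        if sr[m] != sr[n]:                    # rule (4): associated substations agree
            violations.append(("pair", m, n))
    return violations

ok = check_level_coupling(
    sr={1: 1, 2: 1, 3: 0}, tr={1: 0, 2: 0, 3: 0}, fr={1: 0, 2: 0, 3: 1},
    associated=[(1, 2)])
bad = check_level_coupling(
    sr={1: 1, 2: 0}, tr={1: 1, 2: 0}, fr={1: 1, 2: 0},
    associated=[(1, 2)])
```

A check of this kind could be applied to the multi-label network output before the decision is handed to the reinforcement learning agents, so that infeasible label combinations are caught early.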
In the reinforcement learning-based power distribution network optimization operation model,
the state space:
Figure BDA0003599432400000086
wherein:
Figure BDA0003599432400000087
denote, respectively, the photovoltaic generation of the photovoltaic nodes in period t, the load reduction of the reducible-load nodes, and the power purchased from the upper-level grid; I_{ij,t} denotes the current flowing from node i to node j at time t.
The action space is as follows:
Figure BDA0003599432400000088
wherein: w_{ij} denotes the connection state of the switch on branch ij and is a 0-1 variable;
Figure BDA0003599432400000089
denote, respectively, the photovoltaic generation of the photovoltaic nodes in period t and the load reduction of the reducible-load nodes; d denotes the discretization granularity.
Meanwhile, the action space must satisfy the switch set selection constraints, as follows:
Figure BDA0003599432400000091
Figure BDA0003599432400000092
Figure BDA0003599432400000093
Figure BDA0003599432400000094
wherein: w_{ij} denotes the connection state of the switch on branch ij and is a 0-1 variable; Ω_i^{SW,Sub}, N_i^{SW,Sub}, Ω_i^{SW,Trans} and N_i^{SW,Trans} denote, respectively, the sets and numbers of substation tie switches and transformer tie switches of substation i;
Figure BDA0003599432400000095
Figure BDA0003599432400000096
and
Figure BDA0003599432400000097
denote, respectively, the sets and numbers of feeder tie switches and branch sectionalizing switches of the transformers belonging to substation i. The formulas above show that: 1) when β_i^{SR} = 1, β_i^{TR} = 0 and
Figure BDA0003599432400000098
hold, substation i performs substation-level reconstruction, and the on-off states of its tie switches with the associated substation, as well as of the transformer tie switches, feeder tie switches and branch sectionalizing switches, can be freely adjusted; 2) when β_i^{SR} = 0, β_i^{TR} = 1 and
Figure BDA0003599432400000099
hold, substation i performs transformer-level reconstruction, and the on-off states of the tie switches, feeder tie switches and branch sectionalizing switches of the substation can be freely adjusted; 3) when β_i^{SR} = 0, β_i^{TR} = 0 and
Figure BDA00035994324000000910
hold, feeder-level reconstruction is performed for transformer f belonging to substation i, and the on-off states of the feeder tie switches and branch sectionalizing switches belonging to that transformer can be adjusted.
The state transfer function:
s_{t+1} = f(s_t, a_t, ρ)
where ρ denotes a random quantity. The state transition is affected not only by the action a_t but also by randomness: in each training round, normally distributed noise is added to the photovoltaic output forecast baseline to simulate the uncertainty of photovoltaic output.
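The noise injection described for the state transition can be sketched as below. The text states only that normal noise is added to the photovoltaic forecast baseline in each training round; the proportional standard deviation `sigma_ratio`, its default value, and the clipping at zero are assumptions made for this illustration.

```python
import numpy as np

def sample_pv_output(baseline, sigma_ratio=0.1, rng=None):
    # baseline: non-negative forecast PV output per period (e.g. 24 values).
    # sigma_ratio: assumed noise level, as a fraction of the baseline.
    rng = np.random.default_rng() if rng is None else rng
    baseline = np.asarray(baseline, dtype=float)
    noise = rng.normal(0.0, sigma_ratio * baseline)
    # PV output cannot be negative, so the sample is clipped at zero.
    return np.clip(baseline + noise, 0.0, None)
```

Calling the function once per training episode gives each episode its own photovoltaic scenario around the same forecast, while periods with zero forecast output (night hours) stay at zero.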
The reward function:
Figure BDA00035994324000000911
Figure BDA00035994324000000912
Figure BDA00035994324000000913
Figure BDA0003599432400000101
Figure BDA0003599432400000102
Figure BDA0003599432400000103
Figure BDA0003599432400000104
Figure BDA0003599432400000105
Figure BDA0003599432400000106
Figure BDA0003599432400000107
Figure BDA0003599432400000108
wherein: λ_1 and λ_2 denote the reward weight coefficients;
Figure BDA0003599432400000109
denotes a large penalty given to the agent if the branch power exceeds its allowable upper limit;
Figure BDA00035994324000001010
denotes the economic operation cost of the power distribution network in period t;
Figure BDA00035994324000001011
denotes the line loss cost in period t;
Figure BDA00035994324000001012
denotes the switching loss cost;
Figure BDA00035994324000001013
denotes the load reduction cost;
Figure BDA00035994324000001014
denotes the photovoltaic curtailment cost; Δt denotes the time interval; c_{loss} denotes the unit price of network loss; r_{ij} denotes the resistance of branch ij; Ω_l denotes the set of all branches in the optimization area;
Figure BDA00035994324000001015
denote, respectively, the cost of a single operation that changes the on-off state of a feeder tie switch, a substation tie switch and a branch sectionalizing switch, and
Figure BDA00035994324000001016
denote the state flags of each type of tie switch in the reconstruction area, where 0 denotes open and 1 denotes closed; Ω_{LR} denotes the set of reducible loads; c_{LR} denotes the unit load shedding cost;
Figure BDA00035994324000001017
denotes the load power shed at node i in period t; Ω_{PV} denotes the set of photovoltaic nodes;
Figure BDA00035994324000001018
denotes the unit curtailment penalty cost at time t;
Figure BDA00035994324000001019
denotes the photovoltaic output power of node i at time t;
Figure BDA00035994324000001020
denotes the actual photovoltaic power fed into the grid at node i in period t; V_i^N and V_{i,t} are the rated voltage of node i and its actual voltage in period t, respectively; Ω denotes the set of all nodes; R_{i,t} denotes the load rate of node i in period t;
Figure BDA00035994324000001021
denotes the average load rate of the power distribution network in period t; P_{i,t} is the active power injected at node i in period t; P_i^{max} is the maximum allowable injected active power of node i; N denotes the number of nodes of the power distribution network.
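Three of the quantities defined above can be illustrated numerically. The reward equations are only available as images, so the exact forms below are assumptions chosen to match the symbol definitions: line loss cost as c_{loss}·Δt·Σ I_{ij,t}² r_{ij}, voltage offset as the mean relative deviation from the rated voltage, and load balance as the standard deviation of the load rates R_{i,t} = P_{i,t} / P_i^{max} around their average.

```python
import numpy as np

def line_loss_cost(current, resistance, c_loss, dt):
    # Assumed form: price * interval * sum of I^2 * r over all branches.
    current = np.asarray(current, dtype=float)
    resistance = np.asarray(resistance, dtype=float)
    return c_loss * dt * float(np.sum(current ** 2 * resistance))

def voltage_offset(v, v_rated):
    # Assumed form: mean relative deviation |V_i,t - V_i^N| / V_i^N.
    v = np.asarray(v, dtype=float)
    v_rated = np.asarray(v_rated, dtype=float)
    return float(np.mean(np.abs(v - v_rated) / v_rated))

def load_balance(p, p_max):
    # Load rate R_i,t = P_i,t / P_i^max; the balance index is taken here
    # as the standard deviation of R_i,t around the network average.
    rate = np.asarray(p, dtype=float) / np.asarray(p_max, dtype=float)
    return float(np.sqrt(np.mean((rate - rate.mean()) ** 2)))
```

A reward could then combine the negative operation cost with the two indices weighted by λ_1 and λ_2, but how the terms are actually combined is defined by the reward equations in the figures and is not reproduced here.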
The multi-agent reinforcement learning model:
As shown in Fig. 4, the optimization subjects over the 24-hour horizon are first determined by the reconstruction-level fast decision model. Different agents are assigned to different optimization subjects in different periods, and the same agent is assigned to the same optimization subject across periods. The distribution network structure resulting from an agent's action decision in the current period, together with the state transfer function, forms the agent's state space for the next period.
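The assignment rule (one agent per distinct optimization subject, reused whenever that subject reappears) can be sketched as a simple lookup. The subject identifiers and the integer agent ids are hypothetical; the text does not specify how subjects or agents are represented.

```python
def assign_agents(subjects_by_period):
    """Map each optimization subject to an agent id, period by period.

    subjects_by_period: list with one entry per period, each an iterable
    of hashable subject identifiers produced by the level decision model.
    Returns a list of dicts (one per period): subject -> agent id.
    """
    agent_of = {}   # subject -> agent id, shared across all periods
    schedule = []
    for subjects in subjects_by_period:
        period = {}
        for s in subjects:
            if s not in agent_of:
                # A subject seen for the first time gets a new agent.
                agent_of[s] = len(agent_of)
            period[s] = agent_of[s]
        schedule.append(period)
    return schedule
```

For example, if substation "sub1" is an optimization subject in periods 1 and 2, the same agent id is returned for it in both periods, so its learned policy is reused.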
The constraint conditions are as follows:
(1) Power flow constraints
Figure BDA0003599432400000111
Figure BDA0003599432400000112
wherein: P_{i,t} and Q_{i,t} denote the active and reactive power of node i at time t, respectively; V_{i,t} denotes the voltage of node i at time t; G_{ij} and B_{ij} are the conductance and susceptance between adjacent nodes; θ_{ij} is the voltage phase angle difference.
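With these symbols, the conventional polar-form AC power flow equations are P_i = V_i Σ_j V_j (G_ij cos θ_ij + B_ij sin θ_ij) and Q_i = V_i Σ_j V_j (G_ij sin θ_ij − B_ij cos θ_ij). The patent's own equations are images, so treating them as this standard form is an assumption; the sketch below evaluates the standard form for a given admittance matrix.

```python
import numpy as np

def power_injections(v, theta, G, B):
    # v, theta: voltage magnitudes and phase angles (rad), shape (n,)
    # G, B: real and imaginary parts of the bus admittance matrix, (n, n)
    v = np.asarray(v, dtype=float)
    theta = np.asarray(theta, dtype=float)
    dtheta = theta[:, None] - theta[None, :]   # theta_ij = theta_i - theta_j
    p = v * ((G * np.cos(dtheta) + B * np.sin(dtheta)) @ v)
    q = v * ((G * np.sin(dtheta) - B * np.cos(dtheta)) @ v)
    return p, q
```

On a two-node line with a flat voltage profile the injections vanish, and a small angle difference makes power flow from the leading node to the lagging one, with the sum of the active injections equal to the line loss.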
(2) Safe operation constraints
Figure BDA0003599432400000113
Figure BDA0003599432400000114
Figure BDA0003599432400000115
wherein: w_{ij} denotes the switching state of branch ij, and w_{ij} = 1 indicates that the switch on branch ij is closed; Ω_{SW} denotes the set of switchable branches.
(3) Radiality constraints of the power distribution network
Figure BDA0003599432400000116
Figure BDA0003599432400000117
Figure BDA0003599432400000118
Figure BDA0003599432400000119
wherein: E_{Always} denotes the total number of branches in the network that are always closed and cannot be adjusted; L_s(i) denotes the set of end nodes of branches whose start node is i; L_e(i) denotes the set of start nodes of branches whose end node is i; N_{Sub} denotes the number of substations of the optimization subject: N_{Sub} = 1 if feeder-level or transformer-level reconstruction is performed, and N_{Sub} equals the total number of the substation and its associated substations if substation-level reconstruction is performed. However, a power distribution network containing distributed generation (DG) may still operate in islands under the constraint of formula (32) alone, so formulas (33)-(35) are added: a small power ε is injected at each non-substation node, and a simplified power flow constraint keeps all nodes connected.
Figure BDA0003599432400000121
denotes the auxiliary active power flow on branch ij at time t.
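The reason a branch-count condition alone is not enough can be seen with a small graph check. The sketch below is a graph-theoretic stand-in for the radiality constraints, not the patent's formulation: it tests that the number of closed branches equals the node count minus the number of substations, and replaces the auxiliary ε-flow with a direct reachability search from the substations. All names are hypothetical.

```python
def is_radial(nodes, closed_branches, substations):
    """Return True if the closed branches form one tree per substation.

    nodes:           iterable of node ids (substations included)
    closed_branches: list of (i, j) pairs for branches with w_ij = 1
    substations:     list of substation node ids (the roots)
    """
    nodes = set(nodes)
    # Counting condition: a forest with N_sub trees has N - N_sub branches.
    if len(closed_branches) != len(nodes) - len(substations):
        return False
    adjacency = {n: [] for n in nodes}
    for i, j in closed_branches:
        adjacency[i].append(j)
        adjacency[j].append(i)
    # Connectivity condition: every node must be reachable from a
    # substation; this rules out islands that the count cannot detect.
    seen = set(substations)
    stack = list(substations)
    while stack:
        u = stack.pop()
        for w in adjacency[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes
```

A configuration with a loop among three nodes and a fourth node left unsupplied has the right number of closed branches but fails the reachability test, which is the island case that the supplementary formulas are meant to exclude.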
Example verification and analysis:
The method is verified on a modified practical 145-node system. The reconstruction-level fast decision model is trained on an existing data set, and the effectiveness of the multi-agent reinforcement learning model is analyzed using forecast data of the initial load and photovoltaic output.
As shown in Fig. 5, the loss of the neural network on the validation set decreases continuously, approaches its minimum and gradually converges after 15 rounds, with no overfitting, and the prediction accuracy is 99%-100%. This indicates that the neural network has fitted the mathematically based reconstruction-level fast decision model, so that it can accurately perceive the power distribution network environment, shorten the reconstruction-level decision time, and quickly determine the reconstruction level of the power distribution network.
As shown in Fig. 6, the multi-agent reinforcement learning model for dynamic multistage reconstruction of the urban power distribution network reaches the vicinity of its maximum reward after 15 000 training rounds. The reward value still oscillates because the exploration setting makes the joint agents keep trying new choices, which avoids becoming trapped in a local optimum. The reward trend shows that the optimization effect improves continuously and then stabilizes, which verifies the effectiveness of the model.
The invention considers the spatio-temporal flexibility requirements of the net load of the urban power distribution network and the differences in regulation capability among the multiple types of tie switches, and focuses on a machine-learning-based strategy for fast reconstruction-level decision and operation optimization. First, the reconstruction-level fast decision model quickly decides the reconstruction level of the power distribution network, which reduces the dimension of the action space for the subsequent reinforcement learning. Second, a reinforcement-learning-based optimal operation model of the power distribution network is established, with optimal economic operation cost, voltage offset and load balance as objectives and with constraints such as the radiality of the network structure and power flow. Finally, the reconstruction optimization subjects in each period are divided according to the 24-hour reconstruction-level decision results, and a multi-agent reinforcement learning model is established and jointly optimized.
The above are preferred embodiments of the invention. All changes made according to the technical solution of the invention whose functional effects do not go beyond the scope of that technical solution fall within the protection scope of the invention.

Claims (3)

1. A machine-learning-based multistage dynamic reconstruction method for an urban power distribution network, characterized by comprising the following steps:
S1: fitting the historical operation data of the power distribution network to the reconstruction-level decisions by a neural-network-based multi-label classification method, so as to realize online decision of the reconstruction level of the power distribution network;
S2: taking the power and current characteristics of each node in the power distribution network at each moment as the agent state space, and taking the on-off states of the branches and the photovoltaic curtailment and load shedding rates as the action space, wherein the reward function comprehensively considers the operation cost, the voltage offset index and the load balance degree of the power distribution network, and the state transition takes the uncertainty of photovoltaic output into account, so as to construct a reinforcement-learning-based reconstruction model of the urban power distribution network;
S3: performing joint optimization training on the different agents assigned over the whole optimization horizon, so as to construct a multi-agent reinforcement learning joint optimization model;
step S1 is specifically as follows:
the node loads, the power exchanged with the upper-level grid and the photovoltaic output measured over the whole power distribution network are used as the input features of the neural network, and the reconstruction levels of all subjects are used as the output of the neural network; the neural network has 4 layers, namely an input layer, two fully connected hidden layers and an output layer;
the neural network structure is as follows:
Figure FDA0004170097520000011
wherein: W = {W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)}, W^{(3)}, b^{(3)}}; f_o denotes the sigmoid activation function, and
Figure FDA0004170097520000012
and
Figure FDA0004170097520000013
denote the rectified linear unit (ReLU) activation function; each of the two hidden layers has F neurons; the input-layer neurons are connected to the first hidden layer h ∈ R^{F×1} through the weight W^{(1)} ∈ R^{F×D} and the bias term b^{(1)} ∈ R^{F×1}; the first hidden layer h ∈ R^{F×1} is connected to the second hidden layer through the weight W^{(2)} ∈ R^{F×F} and the bias term b^{(2)} ∈ R^{F×1}, and the second hidden layer is finally connected to the output-layer neurons o ∈ R^{L×1} through a sigmoid activation function, which limits the output range to [0, 1];
in step S2,
state space:
Figure FDA0004170097520000014
wherein:
Figure FDA0004170097520000015
denote, respectively, the photovoltaic generation of the photovoltaic nodes in period t, the load reduction of the reducible-load nodes, and the power purchased from the upper-level grid; I_{ij,t} denotes the current flowing from node i to node j at time t;
action space:
Figure FDA0004170097520000021
wherein: w_{ij} denotes the connection state of the switch on branch ij and is a 0-1 variable;
Figure FDA0004170097520000022
denote, respectively, the photovoltaic generation of the photovoltaic nodes in period t and the load reduction of the reducible-load nodes; d denotes the discretization granularity;
state transfer function:
s_{t+1} = f(s_t, a_t, ρ)
wherein: ρ denotes a random quantity; the state transition is affected not only by the action a_t but also by randomness, and on the basis of the photovoltaic output forecast baseline, normally distributed noise is added in each training round to simulate the uncertainty of photovoltaic output;
in step S2, the reward function is:
Figure FDA0004170097520000023
Figure FDA0004170097520000024
Figure FDA0004170097520000025
Figure FDA0004170097520000026
Figure FDA0004170097520000027
Figure FDA0004170097520000028
Figure FDA0004170097520000029
Figure FDA00041700975200000210
wherein: λ_1 and λ_2 denote the reward weight coefficients;
Figure FDA0004170097520000031
denotes a corresponding penalty given to the agent if the branch power exceeds its allowable upper limit;
Figure FDA0004170097520000032
denotes the economic operation cost of the power distribution network in period t;
Figure FDA0004170097520000033
denotes the line loss cost in period t;
Figure FDA0004170097520000034
denotes the switching loss cost;
Figure FDA0004170097520000035
denotes the load reduction cost;
Figure FDA0004170097520000036
denotes the photovoltaic curtailment cost; Δt denotes the time interval; c_{loss} denotes the unit price of network loss; r_{ij} denotes the resistance of branch ij; Ω_l denotes the set of all branches in the optimization area;
Figure FDA0004170097520000037
denote, respectively, the cost of a single operation that changes the on-off state of a feeder tie switch, a substation tie switch and a branch sectionalizing switch, and
Figure FDA0004170097520000038
denote the state flags of each type of tie switch in the reconstruction area, where 0 denotes open and 1 denotes closed; Ω_{LR} denotes the set of reducible loads; c_{LR} denotes the unit load shedding cost;
Figure FDA0004170097520000039
denotes the load power shed at node i in period t; Ω_{PV} denotes the set of photovoltaic nodes;
Figure FDA00041700975200000310
denotes the unit curtailment penalty cost at time t;
Figure FDA00041700975200000311
denotes the photovoltaic output power of node i at time t;
Figure FDA00041700975200000312
denotes the actual photovoltaic power fed into the grid at node i in period t; V_i^N and V_{i,t} are the rated voltage of node i and its actual voltage in period t, respectively; Ω denotes the set of all nodes; R_{i,t} denotes the load rate of node i in period t;
Figure FDA00041700975200000313
denotes the average load rate of the power distribution network in period t; P_{i,t} is the active power injected at node i in period t; P_i^{max} is the maximum allowable injected active power of node i; N denotes the number of nodes of the power distribution network.
2. The machine-learning-based multistage dynamic reconstruction method for an urban power distribution network according to claim 1, characterized in that the multi-agent reinforcement learning joint optimization model of step S3 first determines the optimization subjects over the 24-hour horizon through step S1, assigns different agents to different optimization subjects in different periods, and assigns the same agent to the same optimization subject across periods; and the distribution network structure resulting from an agent's action decision in the current period, together with the state transfer function, forms the agent's state space for the next period.
3. A machine-learning-based multistage dynamic reconstruction system for an urban power distribution network, characterized by comprising: a reconstruction-level fast decision model and a reinforcement-learning-based power distribution network optimal operation model;
the reconstruction-level fast decision model comprises: a power distribution network state fast sensing module, a reconstruction-level decision module and a first information interaction module;
the power distribution network state fast sensing module is used for monitoring, in real time, the photovoltaic generation of the photovoltaic nodes, the load reduction of the reducible-load nodes, the power exchanged with the upper-level grid and the load demand of the power distribution network;
the reconstruction-level decision module is used for deciding the reconstruction level of the power distribution network according to the operating state of the urban power distribution network and for limiting the range of optimization subjects of the reinforcement learning agents;
the first information interaction module is used for transmitting the reconstruction-level decision result to the reinforcement-learning-based power distribution network optimal operation model;
the reinforcement-learning-based power distribution network optimal operation model comprises a second information interaction module, a power distribution network state accurate sensing module, an experience pool module, a tie switch action module, a photovoltaic output decision module, a load reduction decision module, an agent joint operation module and a reconstruction decision module;
the second information interaction module is used for receiving the reconstruction-level decision result of the reconstruction-level fast decision model;
the power distribution network state accurate sensing module is used for accurately sensing, in real time, the photovoltaic generation of the power distribution network, the load reduction of the reducible-load nodes, the power exchanged with the upper-level grid, the load demand and the branch currents;
the experience pool module is used for storing the historical operating environments of the power distribution network and the reward values obtained after the corresponding agents make decisions in those environments;
the tie switch action module is used for remotely controlling the opening and closing of the tie switches of all branches according to the corresponding reconstruction scheme;
the photovoltaic output decision module is used for deciding the photovoltaic curtailment of the photovoltaic nodes under the corresponding reconstruction scheme;
the load reduction decision module is used for deciding the load shedding of the reducible-load nodes under the corresponding reconstruction scheme;
the agent joint operation module is used for making power distribution network optimal operation decisions jointly across all agents;
the reconstruction decision module is used for controlling each module according to the machine-learning-based multistage dynamic reconstruction method for an urban power distribution network according to any one of claims 1-2.
CN202210399965.7A 2022-04-15 2022-04-15 Multistage dynamic reconstruction method for urban power distribution network based on machine learning Active CN114662982B (en)


Publications (2)

Publication Number Publication Date
CN114662982A CN114662982A (en) 2022-06-24
CN114662982B true CN114662982B (en) 2023-07-14

Family

ID=82035547





Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant