CN116151562A - Mobile emergency vehicle scheduling and power distribution network toughness improving method based on graphic neural network reinforcement learning - Google Patents
- Publication number: CN116151562A
- Application number: CN202310064061.3A
- Authority: CN (China)
- Prior art keywords: power distribution, distribution network, neural network, energy storage, reinforcement learning
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J3/00—Circuit arrangements for ac mains or ac distribution networks
- H02J3/38—Arrangements for parallely feeding a single network by two or more generators, converters or transformers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0631—Resource planning, allocation, distributing or scheduling for enterprises or organisations
- G06Q10/06312—Adjustment or analysis of established resource schedule, e.g. resource or task levelling, or dynamic rescheduling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0631—Resource planning, allocation, distributing or scheduling for enterprises or organisations
- G06Q10/06313—Resource planning in a project environment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/20—Administration of product repair or maintenance
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J3/00—Circuit arrangements for ac mains or ac distribution networks
- H02J3/04—Circuit arrangements for ac mains or ac distribution networks for connecting networks of the same frequency but supplied from different sources
- H02J3/06—Controlling transfer of power between connected networks; Controlling sharing of load between connected networks
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J2203/00—Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
- H02J2203/20—Simulating, e.g. planning, reliability check, modelling or computer assisted design [CAD]
Abstract
The invention provides a mobile emergency vehicle scheduling and power distribution network resilience improvement method based on graph neural network reinforcement learning, comprising the following steps. Step S1: initialize the power distribution network-traffic network model. Step S2: randomly generate the recovery of traffic network and power system line faults. Step S3: construct the state of the graph neural network reinforcement learning algorithm. Step S4: generate the scheduling behavior policy of the mobile emergency energy storage vehicles. Step S5: execute the scheduling policy of the mobile emergency energy storage vehicles, and judge and update the state of each mobile energy storage vehicle. Step S6: compute the distribution network reconfiguration strategy, and calculate the reward function of the mobile emergency energy storage vehicles from the reconfiguration and optimization results. Step S7: update the state of the graph neural network reinforcement learning algorithm. Step S8: store the information of the current step in the memory unit. Step S9: judge whether the predetermined time has been reached; if not, repeat steps S2-S8; if so, output the parameters of the graph neural network reinforcement learning algorithm and the corresponding optimized scheduling results.
Description
Technical Field
The invention belongs to the technical field of power distribution network resilience improvement, and particularly relates to a mobile emergency vehicle scheduling and power distribution network resilience improvement method based on graph neural network reinforcement learning.
Background
In recent years, with global climate change, extreme weather disasters have occurred with increasing frequency, causing huge economic losses and political and social impacts. In mid-February 2021, Texas was struck by a winter storm and temperatures dropped across the state. At the peak of the event the power system lost up to 52,277 MW of generation, the maximum load shedding reached 20,000 MW, load shedding lasted 70.5 h, and the direct and indirect economic losses of the blackout reached 80 to 130 billion US dollars, seriously threatening social production and daily life. As the component of the power system directly connected to users, the distribution network's ability to withstand disaster events directly determines whether power supply for production and daily life can be maintained and how much economic and social loss a supply interruption causes; it has therefore attracted wide attention. The damage caused by extreme weather is often an N-k fault, and a grid operated on reliability criteria alone cannot function under such severe contingencies, so it is necessary to study resilience improvement strategies that allow urban distribution networks to cope with extreme disasters such as typhoons.
At present, with the continuing spread of electrified transportation, the mobile energy storage capacity flowing in the traffic network keeps growing, forming a discrete energy network of huge scale, diverse forms and flexible topology, which provides a material and energy basis for enhancing distribution network resilience. Dedicated mobile emergency energy storage vehicles are the most widely used grid emergency power supply resource. The power supply capability of an emergency energy storage vehicle depends on its on-board battery, whose capacity is typically 200-500 kWh. Flexible deployment of emergency power supply vehicles can optimize disaster prevention measures, reduce the blackout risk caused by extreme disasters, accelerate critical load restoration after a disaster, and significantly improve distribution network resilience. However, most existing studies assume that the repair time of potential traffic network and power system line faults is fixed, and some even assume that these faults persist for the entire study horizon. In practice, because of changing extreme weather, the varying severity of line faults in the traffic network and the power system, and the differing skill levels of repair crews, the repair time of a fault is not deterministic but follows a normal distribution with a certain mean and variance. Therefore, how to formulate a distribution network restoration strategy through the scheduling of mobile emergency energy storage vehicles, while accounting for the uncertainty of the fault recovery time of potential traffic network and power system lines, is a problem to be solved urgently.
Disclosure of Invention
In view of the above, the invention aims to provide a mobile emergency vehicle scheduling and power distribution network resilience improvement method based on graph neural network reinforcement learning, which formulates a distribution network restoration strategy through the scheduling of mobile emergency energy storage vehicles while accounting for the uncertainty of the fault recovery time of potential traffic network and power system lines, so as to improve distribution network resilience.
The method mainly comprises the following steps. Step S1: initialize the power distribution network-traffic network model. Step S2: randomly generate the recovery of traffic network and power system line faults. Step S3: construct the state of the graph neural network reinforcement learning algorithm from the information of the distribution network and the traffic system. Step S4: generate the scheduling behavior policy of the mobile emergency energy storage vehicles according to the epsilon-greedy rule and the graph neural network reinforcement learning algorithm. Step S5: execute the scheduling policy of the mobile emergency energy storage vehicles, and judge and update the state of each mobile energy storage vehicle. Step S6: compute the distribution network reconfiguration strategy, and calculate the reward function of the mobile emergency energy storage vehicles from the reconfiguration and optimization results. Step S7: update the state of the graph neural network reinforcement learning algorithm. Step S8: store the information of the current step in the memory unit. Step S9: judge whether the predetermined time has been reached; if not, repeat steps S2-S8; if so, output the parameters of the graph neural network reinforcement learning algorithm and the corresponding optimized scheduling results.
The technical solution adopted to solve the above technical problem is as follows:
a mobile emergency vehicle scheduling and power distribution network resilience improvement method based on graph neural network reinforcement learning, characterized by comprising the following steps:
step S1: initializing the power distribution network-traffic network model;
step S2: randomly generating the recovery of traffic network and power system line faults;
step S3: constructing the state x_{i,t} of the graph neural network reinforcement learning algorithm from the information of the distribution network and the traffic system;
step S4: generating the scheduling behavior policy a_{i,t} of the mobile emergency energy storage vehicle according to the epsilon-greedy rule and the graph neural network reinforcement learning algorithm;
step S5: executing the scheduling policy a_{i,t} of the mobile emergency energy storage vehicle, and judging and updating the state of the mobile energy storage vehicle;
step S6: computing the distribution network reconfiguration strategy, and calculating the reward function r_{i,t} of the mobile emergency energy storage vehicle from the reconfiguration and optimization results;
step S7: updating the state x'_{i,t} of the graph neural network reinforcement learning algorithm;
step S8: storing the information (x_{i,t}, a_{i,t}, r_{i,t}, x'_{i,t}) of the current step in the memory unit D, and updating the weights of the graph neural network reinforcement learning algorithm by stochastic gradient descent;
step S9: judging whether the predetermined time T_end has been reached; if not, repeating steps S2 to S8; if so, outputting the parameters of the graph neural network reinforcement learning algorithm and the corresponding optimized scheduling results.
Further, step S1 specifically comprises the following steps:
step S11: distribution network model initialization, at least including: the upper and lower node voltage limits of the distribution system, line and transformer parameters, the voltage and output limits of distributed generators, renewable energy output, the distribution network load rate, charging station capacities and locations, and the locations and expected recovery times of faulted lines;
step S12: traffic network model initialization, at least including: traffic network nodes, road lengths, road capacities, free-flow speeds, the locations of charging stations in the traffic network, and the locations and expected recovery times of faulted traffic nodes and roads;
step S13: neural network parameter initialization, at least including: initializing the network weights W and bias terms b, and initializing the hyper-parameters: learning rate alpha, discount factor gamma, batch size B and the capacity of the memory unit D;
step S14: initializing the time t = 0.
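As a minimal illustration of the hyper-parameter initialization of steps S13-S14, the sketch below uses hypothetical placeholder values, not values prescribed by the invention:

```python
# Hypothetical initialization values, for illustration only.
config = {
    "learning_rate_alpha": 1e-3,
    "discount_factor_gamma": 0.95,   # must satisfy 0 <= gamma <= 1
    "batch_size_B": 32,
    "memory_capacity_D": 10_000,
    "t": 0,                          # step S14: time t = 0
}

def validate(cfg):
    """Basic sanity checks on the reinforcement-learning hyper-parameters."""
    assert 0.0 < cfg["learning_rate_alpha"] < 1.0
    assert 0.0 <= cfg["discount_factor_gamma"] <= 1.0
    assert cfg["batch_size_B"] <= cfg["memory_capacity_D"]
    return True

print(validate(config))  # True
```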
Further, in step S2, it is considered that an extreme event disconnects weak lines of the distribution network and simultaneously damages traffic roads. As the weather improves and repair crews carry out emergency repairs, the faulted distribution lines and damaged roads of the traffic network are gradually repaired, but the repair time shows a certain randomness owing to differences in damage severity and in the skill levels of the repair crews. The fault recovery time of traffic network and distribution network lines is therefore assumed to follow a normal distribution T_{L_i} ~ N(mu_{L_i}, sigma^2_{L_i}), where mu_{L_i} denotes the mean repair time of faulted element L_i and sigma^2_{L_i} denotes the variance of its repair time.
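The normally distributed repair times described above can be sampled as in the following sketch; the means and standard deviations are hypothetical, and samples are truncated at zero since a negative repair time is meaningless:

```python
import numpy as np

def sample_repair_times(mu, sigma, rng=None):
    """Sample repair times T_{L_i} ~ N(mu_i, sigma_i^2) for each faulted
    element, truncated at zero so no repair finishes in negative time."""
    rng = rng or np.random.default_rng(0)
    t = rng.normal(mu, sigma)
    return np.maximum(t, 0.0)

mu = np.array([4.0, 6.0, 2.5])      # hypothetical mean repair times (hours)
sigma = np.array([1.0, 1.5, 0.5])   # hypothetical standard deviations
print(sample_repair_times(mu, sigma))
```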
Further, in step S3, the state x_{i,t} of the graph neural network reinforcement learning algorithm comprises the own state EV_{i,t} of the i-th electric vehicle at time t, the neighbor electric vehicle states Ne_{i,t}, the line statuses (i.e. whether each line is open or closed), the power load demand and the renewable energy output, as expressed in equation (1).
Equation (2) defines the own state EV_{i,t} of the i-th electric vehicle, which includes the next node on the vehicle's route to a charging station, the number of the road it is travelling on, its speed v_{i,t} and the remaining state of charge SOC_{i,t} of the mobile energy storage vehicle.
Equation (3) defines the neighbor state Ne_{i,t}, which collects the state of each neighboring electric vehicle k: the next node of the k-th vehicle adjacent to the i-th vehicle, the number of the road it is on, its speed v_{i,k,t} and its remaining state of charge SOC_{i,k,t}.
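One simple encoding of x_{i,t} is a flat concatenation of the components named above; the sketch below is illustrative (the field names and sizes are assumptions), whereas the invention feeds these components to a graph attention network rather than a flat vector model:

```python
import numpy as np

def build_state(ev_self, neighbors, line_status, load_demand, renewable):
    """Concatenate the components of x_{i,t}: the vehicle's own state
    (next node, road id, speed, SOC), its neighbors' states, the 0/1
    line statuses, the load demand and the renewable output."""
    parts = [np.asarray(ev_self, float)]
    for nb in neighbors:
        parts.append(np.asarray(nb, float))
    parts += [np.asarray(line_status, float),
              np.asarray(load_demand, float),
              np.asarray(renewable, float)]
    return np.concatenate(parts)

x = build_state(ev_self=[3, 7, 40.0, 0.8],       # next node, road, v, SOC
                neighbors=[[2, 5, 35.0, 0.6]],   # one neighboring vehicle
                line_status=[1, 0, 1],
                load_demand=[1.2, 0.9, 0.0],
                renewable=[0.3, 0.0, 0.1])
print(x.shape)  # (17,)
```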
Further, step S4 comprises the following steps:
step S41: with probability epsilon, generating the scheduling behavior of the mobile energy storage vehicle at random, i.e.
a_{i,t} = np.random.randint(||A_action||)    (4)
where ||A_action|| denotes the number of available behaviors of the mobile emergency energy storage vehicle. When a_{i,t} = 0, the mobile emergency energy storage vehicle charges or discharges at its current location; when a_{i,t} ≠ 0, the vehicle travels to the next destination indicated by the scheduling behavior a_{i,t} and then charges or discharges there, so as to reduce load shedding in the distribution network and improve its resilience;
step S42: with probability 1 − epsilon, generating the scheduling behavior policy a_{i,t} from the experience of the graph neural network reinforcement learning algorithm:
a_{i,t} = argmax_a Q(x_{i,t}, a; θ_t)    (5)
where argmax(·) returns the argument of the maximum value; Q(x_{i,t}, a; θ_t) is the action value of the i-th mobile emergency energy storage vehicle in state x_{i,t} under adjacency matrix A and scheduling behavior a; and θ_t denotes the neural network parameters of the graph neural network reinforcement learning algorithm at time t.
The network of the graph neural network reinforcement learning algorithm consists of: an input layer that receives the state set x_t = {x_{1,t}, x_{2,t}, ..., x_{N,t}} of all mobile emergency energy storage vehicles and the relationship (adjacency) matrix A formed by the vehicles; a fully connected layer that extracts features x'_t from the input states x_t; two graph attention layers that process the extracted features x'_t together with the adjacency matrix A; and finally a fully connected output layer that outputs the action values of the mobile emergency energy storage vehicle for all actions under state x_t and adjacency matrix A. Each mobile emergency energy storage vehicle agent then selects its scheduling behavior according to these action values.
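The epsilon-greedy selection of equations (4)-(5) reduces to the sketch below, here over a plain Q-value vector; in the invention the Q values would come from the graph attention network rather than a fixed array:

```python
import numpy as np

def select_action(q_values, epsilon, rng):
    """Epsilon-greedy policy over the action values Q(x_{i,t}, .):
    explore uniformly with probability epsilon, otherwise exploit argmax."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))  # eq. (4): random action
    return int(np.argmax(q_values))              # eq. (5): a = argmax_a Q

rng = np.random.default_rng(42)
q = np.array([0.1, 0.7, 0.3])
greedy = select_action(q, epsilon=0.0, rng=rng)  # epsilon = 0 always exploits
print(greedy)  # 1
```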
Further, in step S6, the reward function r_{i,t} of the mobile emergency energy storage vehicle is calculated from the distribution network reconfiguration and optimal power flow, specifically comprising:
step S61: updating the load rate of the distribution network, and calculating the maximum charging power and maximum discharging power of the mobile energy storage vehicles;
step S62: establishing the distribution network reconfiguration and optimal power flow models, including the big-M constraint on the virtual power flow:
-M·α_{mn,t} ≤ V_{mn,t} ≤ M·α_{mn,t}    (9)
where S_b denotes the set of all distribution network nodes; S_r denotes the set of potential root nodes, composed of the substation nodes, the distributed generator nodes and the terminal nodes of faulted lines; N_b is the number of distribution network nodes; α_{mn,t} is a binary variable describing the line state at time t (1 if the line is closed, 0 if open); γ_{n,t} is a binary variable equal to 1 if potential root node n serves as a root node of the distribution network in period t and 0 otherwise; M is a sufficiently large constant; and V_{mn,t} is the virtual power flow through branch m-n at time t.
Equation (6) states that the distribution network formed during reconfiguration must keep a radial topology, i.e. the number of closed lines equals the number of network nodes minus the number of subgraphs.
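The radiality condition of equation (6) — closed lines equal to nodes minus connected subgraphs, which rules out cycles — can be checked with a small union-find sketch; this is illustrative only, not the mixed-integer formulation the solver actually enforces:

```python
def count_components(n_nodes, closed_lines):
    """Union-find count of connected subgraphs given closed lines (m, n)."""
    parent = list(range(n_nodes))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a
    for m, n in closed_lines:
        parent[find(m)] = find(n)
    return len({find(v) for v in range(n_nodes)})

def is_radial(n_nodes, closed_lines):
    """Eq. (6): #closed lines == #nodes - #subgraphs  (i.e. no cycles)."""
    return len(closed_lines) == n_nodes - count_components(n_nodes, closed_lines)

print(is_radial(4, [(0, 1), (1, 2), (2, 3)]))  # True  (a tree)
print(is_radial(4, [(0, 1), (1, 2), (2, 0)]))  # False (contains a loop)
```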
Equations (7) to (9) are the connectivity constraints of the distribution network reconfiguration, requiring the interior of every subgraph to be connected. Here the set of faulted lines at time t appears in the constraints; pf_{mn,t} and qf_{mn,t} denote the active and reactive power flows of distribution branch m-n at time t; pd_{m,t} and qd_{m,t} denote the restored active and reactive loads at node m at time t; pg_{m,t} and qg_{m,t} denote the active and reactive outputs of distributed generator m. The remaining quantities are the active and reactive discharging powers of the mobile emergency energy storage vehicle at node m; the active and reactive outputs of the wind turbine connected to node m; the original active and reactive load demands of node m; the maximum charging and discharging powers the mobile emergency energy storage vehicle can provide at node m; and the rated active and reactive capacities of branch m-n.
Equations (11) to (20) constitute the optimal power flow model of the distribution network.
According to the distribution network reconfiguration and optimal power flow requirements, the objective function minimizes the value of unserved load plus the distributed generation cost:

min Σ_t Σ_{i∈S_b} c^load_i · pd^loss_{i,t} · Δt + Σ_t Σ_{i∈S_g} c_i · pg_{i,t} · Δt    (21)

where c^load_i denotes the unit value of the load at node i; pd^loss_{i,t} denotes the loss of load at node i at time t, i.e. the original load demand minus the restored load pd_{i,t}; Δt denotes the computation time scale; c_i denotes the unit generation cost of a distributed generator; S_g denotes the set of distributed generators; and pg_{i,t} denotes the active power output of the distributed generator;
step S63: obtaining a power distribution network reconstruction scheme, optimal power flow distribution and power of an emergency mobile energy storage vehicle by adopting a Gurobi solver solution model
Step S64: calculating a reward function r of the mobile emergency energy storage vehicle according to the obtained optimization i,t :
In the method, in the process of the invention,representing load unit value recovered by the mobile energy storage vehicle; equation (22) shows that the rewarding function of the mobile emergency energy storage vehicle is mainly determined by the recovered load quantity, so that the load of the power distribution network is supplied as soon as possible, and the purpose of improving the toughness of the power distribution network is achieved.
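A sketch of the load-recovery-weighted reward of step S64; the load and unit-value numbers are hypothetical, and in the invention the restored loads come from the Gurobi solution of the reconfiguration and power flow model:

```python
def reward(recovered_load_kw, unit_value, dt_h=1.0):
    """r_{i,t}: restored load at each node weighted by its unit value,
    accumulated over the computation time scale dt_h."""
    return sum(c * p * dt_h for c, p in zip(unit_value, recovered_load_kw))

# hypothetical restored loads (kW) and load unit values at three nodes
r = reward([120.0, 0.0, 80.0], [2.0, 5.0, 1.0])
print(r)  # 320.0
```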
Further, in step S7, the weights of the graph neural network reinforcement learning algorithm are updated by stochastic gradient descent, specifically:
step S71: randomly drawing a batch of samples from the memory unit D;
step S72: constructing the loss function of equation (23) and, on the drawn samples, updating the weights of the graph neural network reinforcement learning algorithm by stochastic gradient descent as in equation (24):

L(θ_t) = E[(r + γ·max_{a'} Q(x', a'; θ'_t) − Q(x, a; θ_t))²]    (23)

θ_{t+1} = θ_t + α·(r + γ·max_{a'} Q(x', a'; θ'_t) − Q(x, a; θ_t))·∇_{θ_t} Q(x, a; θ_t)    (24)

where x, a and x', a' are the current state and action and the state and action at the next moment, respectively; r denotes the immediate reward of the graph neural network reinforcement learning; θ_t denotes the parameters of the graph neural network reinforcement learning algorithm at the current time t; 0 ≤ γ ≤ 1 denotes the discount factor, reflecting the effect of future Q values on the current action; Q(x', a'; θ'_t) denotes the state-action value under the target network parameters θ'_t; ∇_{θ_t} denotes differentiation with respect to θ_t; and α denotes the learning rate;
step S73: every given number of steps, copying the current graph neural network reinforcement learning parameters θ_t to the target network parameters θ'_t.
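Equations (23)-(24) follow the usual DQN temporal-difference form; the numeric sketch below uses scalar placeholders standing in for the network's Q values and gradient, purely to make the target and update arithmetic concrete:

```python
import numpy as np

def td_target(r, q_next, gamma):
    """y = r + gamma * max_a' Q(x', a'; theta'_t), from the target network."""
    return r + gamma * np.max(q_next)

def sgd_step(theta, grad_q, q_sa, y, alpha):
    """Eq. (24): theta <- theta + alpha * (y - Q(x, a; theta)) * grad_theta Q."""
    return theta + alpha * (y - q_sa) * grad_q

y = td_target(r=1.0, q_next=np.array([0.2, 0.5]), gamma=0.9)
theta = sgd_step(theta=np.array([0.1, -0.2]),
                 grad_q=np.array([1.0, 2.0]),   # placeholder gradient dQ/dtheta
                 q_sa=0.45, y=y, alpha=0.1)
print(y)  # 1.45
```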
Compared with the prior art, the invention and its preferred embodiments have the following beneficial effects:
The method takes the mobile energy storage vehicles in the study area as agents, abstracts their dynamic relationships into edges, and converts their coordination and cooperation relationships into a dynamic graph structure. Reinforcement learning based on a graph neural network combines the strong dynamic-relationship processing capability of graph neural networks with the strong sequential stochastic decision-making capability of reinforcement learning, and can effectively handle multi-agent dynamic interaction in an uncertain environment. The proposed graph neural network reinforcement learning method formulates the scheduling strategy of the mobile emergency vehicles and effectively solves the resilience improvement problem under uncertain repair times of potential traffic network and power system line faults. On the basis of an active distribution network with renewable generation, a distribution network reconfiguration and optimal power flow model is built to formulate the distribution network resilience improvement strategy, and the reward function of the mobile emergency energy storage vehicles is computed from it, realizing their optimal scheduling. In summary, the proposed method effectively accounts for the uncertainty of traffic network and power system line fault repair times, and improves distribution network resilience by formulating a distribution network restoration strategy through the scheduling of mobile emergency energy storage vehicles.
Drawings
The invention is described in further detail below with reference to the accompanying drawings and the detailed description:
FIG. 1 is a flow chart of a method of a preferred embodiment of the present invention.
Detailed Description
In order to make the features and advantages of the present patent more comprehensible, embodiments accompanied with figures are described in detail below:
it should be noted that the following detailed description is illustrative and is intended to provide further explanation of the present application. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the example embodiments according to the present application. As used herein, the singular forms are intended to include the plural forms as well unless the context clearly indicates otherwise; furthermore, it is to be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
As shown in fig. 1, the mobile emergency vehicle dispatching and power distribution network toughness improving method based on graph neural network reinforcement learning provided by this embodiment comprises the following steps:
S11: initializing the power distribution network-traffic network model;
S12: randomly recovering the faults of the traffic network and the power system lines;
S13: constructing the state x_{i,t} of the graph neural network reinforcement learning algorithm according to the information of the power distribution network and the traffic system;
S14: generating the scheduling behavior strategy a_{i,t} of the mobile emergency energy storage vehicle according to the ε-greedy algorithm and the graph neural network reinforcement learning algorithm;
S15: executing the scheduling strategy a_{i,t} of the mobile emergency energy storage vehicle, and judging and updating the state of the mobile energy storage vehicle;
S16: computing the reconstruction strategy of the power distribution network, and calculating the reward function r_{i,t} of the mobile emergency energy storage vehicle according to the power distribution network reconstruction and optimization;
S17: updating the state x'_{i,t} of the graph neural network reinforcement learning algorithm;
S18: storing the information (x_{i,t}, a_{i,t}, r_{i,t}, x'_{i,t}) of the current step in the memory unit D, and updating the weights of the graph neural network reinforcement learning algorithm based on the stochastic gradient descent method;
S19: judging whether the predetermined time T_end is reached; if not, executing S12 to S18; if yes, outputting the parameters of the graph neural network reinforcement learning algorithm and the corresponding optimized scheduling results.
The following description will be given for specific development of this embodiment:
1. Initializing the power distribution network-traffic network model. The method comprises the following steps: 1) initializing the power distribution network model, including the upper and lower node voltage limits of the power distribution network system, the line and transformer parameters, the upper and lower voltage and output constraints of the distributed generators, the renewable energy output, the power distribution network load rate, the capacity and position of the charging stations, the positions of faulted lines, the expected recovery times, and the like; 2) initializing the traffic network model, including the traffic network nodes, road lengths, road capacities, free-flow speeds, the positions of charging stations in the traffic network, the positions of faulted traffic nodes and lines, the expected recovery times, and the like; 3) initializing the neural network parameters, including the neural network weights W and bias terms b, and the hyperparameters such as the learning rate α, the discount factor γ, the batch size B, and the capacity of the memory unit D; 4) initializing the time t = 0.
The mobile energy storage vehicles in the study area are regarded as agents, each regarded as a node n ∈ N; the connections between mobile energy storage vehicles are regarded as edges e ∈ E, forming the graph structure G = (N, E). The current state x_{i,t} of each electric vehicle i and the adjacency matrix A are initialized.
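The graph construction described above can be sketched as follows. The distance-based connection rule and the names `positions` and `comm_radius` are illustrative assumptions, since the embodiment does not specify how the edges between mobile energy storage vehicles are determined.

```python
import numpy as np

def build_adjacency(positions, comm_radius):
    """Build the time-varying adjacency matrix A: vehicles i and j are
    connected (A[i, j] = 1) when they lie within an assumed interaction
    range of each other; the diagonal stays zero (no self-edges)."""
    n = len(positions)
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            if i != j and abs(positions[i] - positions[j]) <= comm_radius:
                A[i, j] = 1
    return A

# three vehicles on a 1-D road coordinate: 0 and 1 are neighbours, 2 is isolated
A = build_adjacency([0.0, 1.0, 5.0], comm_radius=2.0)
```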
2. The faults of the traffic network and the power system lines are randomly recovered.
An extreme event breaks the weak lines of the distribution network and, at the same time, damages traffic roads. As the weather improves and emergency crews carry out rush repairs, the faulted power distribution network lines and the damaged roads of the traffic network are gradually restored, but the repair time exhibits a certain randomness owing to differences in damage degree and in the professional level of the repair crews. This embodiment therefore assumes that the fault recovery time of traffic network and power distribution network lines obeys a normal distribution T_{L_i} ~ N(μ_{L_i}, σ²_{L_i}), where μ_{L_i} denotes the mean repair time of faulted element L_i and σ²_{L_i} denotes the variance of the repair time.
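The assumed repair-time distribution can be sampled as in the following sketch; the truncation at zero is an added assumption (a raw normal sample can be negative), and the concrete mean and standard deviation are illustrative.

```python
import random

def sample_repair_time(mu, sigma, rng):
    """Sample a fault-repair time T ~ N(mu, sigma^2) for one faulted
    element, truncated at zero so the repair time is never negative."""
    return max(0.0, rng.gauss(mu, sigma))

rng = random.Random(0)
repair_time = sample_repair_time(mu=4.0, sigma=1.0, rng=rng)
```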
3. Constructing the state x_{i,t} of the graph neural network reinforcement learning algorithm according to the information of the power distribution network and the traffic system.
The state x_{i,t} of the graph neural network reinforcement learning algorithm is composed of the own state EV_{i,t} of the i-th electric vehicle at time t, the neighbor electric vehicle states Ne_{i,t}, the line status (whether each line is disconnected), the power load demand, and the renewable energy output.
Equation (48) gives the i-th electric vehicle state EV_{i,t}, which includes the next node on the electric vehicle's route to the charging station, the number of the road on which it travels, the driving speed v_{i,t}, and the remaining state of charge SOC_{i,t} of the mobile energy storage vehicle; equation (49) gives the neighbor electric vehicle state Ne_{i,t}, which includes the state of each neighboring electric vehicle k, namely the next node of the k-th electric vehicle adjacent to the i-th, the number of the road on which it travels, its driving speed v_{i,k,t}, and its remaining state of charge SOC_{i,k,t}.
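A minimal sketch of assembling the state x_{i,t} from the components listed above; all field names, orderings, and example values are illustrative.

```python
def build_state(ev, neighbours, line_status, load_demand, re_output):
    """Flatten the components of x_{i,t} into one feature vector:
    own state EV_{i,t} = (next node, road number, speed, SOC),
    neighbour states Ne_{i,t}, line statuses, load demands and
    renewable outputs."""
    state = list(ev)
    for nb in neighbours:
        state.extend(nb)
    state.extend(line_status)
    state.extend(load_demand)
    state.extend(re_output)
    return state

x_it = build_state(ev=(3, 7, 40.0, 0.8),            # node, road, v, SOC
                   neighbours=[(5, 2, 35.0, 0.6)],  # one neighbour k
                   line_status=[1, 0, 1],           # 1 = closed, 0 = open
                   load_demand=[1.2, 0.9],
                   re_output=[0.4])
```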
4. Generating the scheduling behavior strategy a_{i,t} of the mobile emergency energy storage vehicle according to the ε-greedy algorithm and the graph neural network reinforcement learning algorithm, comprising the following steps:
Step 41: with probability ε, generating the scheduling behavior strategy of the mobile energy storage vehicle in a random manner, i.e.
a_{i,t} = np.random.randint(||A_action||) (50)
where ||A_action|| denotes the number of behaviors of the mobile emergency energy storage vehicle; when a_{i,t} = 0, the mobile emergency energy storage vehicle stays at its location in a charging or discharging state; when a_{i,t} ≠ 0, the mobile emergency energy storage vehicle departs for the next destination for charging or discharging according to the scheduling behavior strategy a_{i,t}, so as to reduce load shedding of the power distribution network and improve the toughness of the power distribution network;
Step 42: with probability 1−ε, generating the scheduling behavior strategy a_{i,t} of the mobile emergency energy storage vehicle from the experience of the graph neural network reinforcement learning algorithm, i.e.
a_{i,t} = argmax_a Q(x_{i,t}, A, a; θ_t) (51)
where argmax(·) returns the argument corresponding to the maximum value; Q(x_{i,t}, A, a; θ_t) denotes the action value function of the i-th mobile emergency energy storage vehicle in state x_{i,t} under the adjacency matrix A and the scheduling behavior strategy a; θ_t denotes the neural network parameters of the graph neural network reinforcement learning algorithm at time t.
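Steps 41 and 42 together form an ε-greedy rule, which can be sketched as follows; `q_values` stands in for the action value function Q(x_{i,t}, A, ·; θ_t) evaluated for every behavior.

```python
import numpy as np

def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a random behavior index, mirroring
    a_{i,t} = np.random.randint(n_actions) of eq. (50); otherwise pick
    the greedy behavior argmax_a Q(x_{i,t}, A, a; theta_t)."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

rng = np.random.default_rng(0)
a_greedy = epsilon_greedy(np.array([0.1, 0.9, 0.3]), epsilon=0.0, rng=rng)  # always 1
```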
The neural network structure of the graph neural network reinforcement learning algorithm of this embodiment comprises an input layer consisting of the state set x_t = {x_{1,t}, x_{2,t}, ..., x_{N,t}} of all mobile emergency energy storage vehicles and the relationship matrix formed by them, namely the adjacency matrix A. A fully connected layer first extracts features x'_t from the input states x_t; two graph attention neural network layers then process the extracted features x'_t together with the adjacency matrix A; finally, a fully connected output layer produces all action value functions of the mobile emergency energy storage vehicles in state x_t under the adjacency matrix A. The mobile emergency energy storage vehicle agent selects its scheduling behavior strategy according to these action value functions.
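The described forward pass (fully connected layer, two graph attention layers, fully connected output) can be sketched in plain numpy; the attention scoring, activations, and layer sizes below are simplified assumptions rather than the embodiment's exact network.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def gat_layer(H, A, W):
    """One simplified graph-attention layer: project node features with W,
    score neighbour pairs by feature similarity, mask by the adjacency
    matrix A (self-loops included), and aggregate with softmax weights."""
    Z = H @ W
    scores = Z @ Z.T                        # pairwise similarity
    mask = A + np.eye(len(A))               # include self-loops
    scores = np.where(mask > 0, scores, -1e9)
    return softmax(scores) @ Z

def q_network(x, A, W_in, W_g1, W_g2, W_out):
    """Input FC layer -> two graph-attention layers -> output FC layer;
    returns one Q-value per action for every vehicle (weights illustrative)."""
    h = np.tanh(x @ W_in)
    h = np.tanh(gat_layer(h, A, W_g1))
    h = np.tanh(gat_layer(h, A, W_g2))
    return h @ W_out

rng = np.random.default_rng(1)
x = rng.normal(size=(3, 5))                 # 3 vehicles, 5 state features each
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
q = q_network(x, A,
              rng.normal(size=(5, 8)), rng.normal(size=(8, 8)),
              rng.normal(size=(8, 8)), rng.normal(size=(8, 4)))
```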
5. Executing the scheduling strategy a_{i,t} of the mobile emergency energy storage vehicle, and judging and updating the state of the mobile energy storage vehicle.
6. Computing the reconstruction strategy of the power distribution network, and calculating the reward function r_{i,t} of the mobile emergency energy storage vehicle according to the power distribution network reconstruction and optimization, comprising the following steps:
Step S61: updating the load rate of the power distribution network, and calculating the maximum charging power and maximum discharging power of the mobile energy storage vehicle.
Step S62: establishing a power distribution network reconstruction and a power distribution network optimal power flow model:
-Mα mn,t ≤V mn,t ≤Mα mn,t (55)
where S_b represents the set of all power distribution network nodes; S_r represents the potential root node set formed by the substation connection nodes, the distributed power supply (generator) nodes, and the head and tail nodes of faulted lines; N_b is the number of power distribution network nodes; α_{mn,t} is a binary variable representing the line state at time t, equal to 1 if the line is closed and 0 if open; γ_{n,t} is a binary variable equal to 1 if potential root node n serves as a root node of the power distribution network in period t and 0 otherwise; M is a sufficiently large constant; V_{mn,t} is the virtual power flow through branch m-n at time t.
Equation (52) indicates that the distribution network established during network reconfiguration must satisfy a radial topology, i.e., the number of closed lines equals the number of network nodes minus the number of subgraphs.
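The radiality condition can be verified with a union-find pass over the closed lines, as in this sketch (node numbering and the helper name are illustrative):

```python
def is_radial(n_nodes, closed_lines):
    """Check the radiality condition of eq. (52): the number of closed
    lines must equal the number of nodes minus the number of connected
    subgraphs, i.e. every subgraph is a tree (no loops)."""
    parent = list(range(n_nodes))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]   # path halving
            u = parent[u]
        return u

    for m, n in closed_lines:
        rm, rn = find(m), find(n)
        if rm == rn:
            return False                    # closing this line creates a loop
        parent[rm] = rn
    n_subgraphs = sum(1 for u in range(n_nodes) if find(u) == u)
    return len(closed_lines) == n_nodes - n_subgraphs

radial = is_radial(4, [(0, 1), (2, 3)])          # two trees -> radial
looped = is_radial(3, [(0, 1), (1, 2), (2, 0)])  # cycle -> not radial
```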
Equations (53) to (55) are the constraints of power distribution network reconstruction, which require the interior of each subgraph to be connected. These constraints involve the set of faulted lines at time t; pf_{mn,t} and qf_{mn,t}, the active and reactive power flows of distribution branch m-n at time t; pd_{m,t} and qd_{m,t}, the recovered active and reactive loads of the distribution branch at node m at time t; and pg_{m,t} and qg_{m,t}, the active and reactive outputs of distributed power supply m. The remaining quantities in the model are the active and reactive discharging power of the mobile emergency energy storage vehicle at the m-th node, the active and reactive outputs of the wind turbine connected to the m-th node, the original active and reactive load demands of the m-th node, the maximum charging and discharging power the mobile emergency energy storage vehicle can provide at the m-th node, and the rated active and reactive power of branch m-n.
Equations (57) to (66) represent the optimal power flow model of the power distribution network.
According to the power distribution network reconstruction and the optimal power flow requirements of the power distribution network, the objective function is constructed as follows:
where the first term weights the load loss at each node i at time t by the unit value of that node's load; Δt denotes the calculation time scale; c_i denotes the cost for the distributed generator to produce a unit of load; S_g denotes the set of distributed generators; pg_{i,t} denotes the output active power of the distributed generator.
Step S63: solving the above model with the Gurobi solver to obtain the power distribution network reconstruction scheme, the optimal power flow distribution, and the power of the emergency mobile energy storage vehicles.
Step S64: calculating the reward function r_{i,t} of the mobile emergency energy storage vehicle according to the obtained optimization results:
where the load recovered by the mobile energy storage vehicle is weighted by its unit value; equation (68) shows that the reward function of the mobile emergency energy storage vehicle is determined mainly by the amount of recovered load, so that the load of the power distribution network is restored as soon as possible, achieving the purpose of improving the toughness of the power distribution network.
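The recovered-load reward described for equation (68) can be sketched as a weighted sum; the parameter names and the time-scale handling are illustrative.

```python
def reward(recovered_loads, unit_values, dt=1.0):
    """Reward r_{i,t}: the load recovered thanks to the mobile energy
    storage vehicle over the step, weighted by the unit value of each
    load, so larger restored loads yield larger rewards."""
    return dt * sum(c * p for c, p in zip(unit_values, recovered_loads))

r_it = reward(recovered_loads=[0.5, 1.0], unit_values=[2.0, 3.0])  # 0.5*2 + 1.0*3
```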
7. Updating the state x'_{i,t} of the graph neural network reinforcement learning algorithm, including updating the electric vehicle state EV_{i,t} and the neighbor electric vehicle states Ne_{i,t}.
8. Storing the information (x_{i,t}, a_{i,t}, r_{i,t}, x'_{i,t}) of the current step in the memory unit D, and updating the weights of the graph neural network reinforcement learning algorithm based on the stochastic gradient descent method, mainly comprising the following steps:
Step 71: randomly drawing a batch of samples from the memory unit D;
Step 72: constructing the loss function shown in equation (69), and updating the weights of the graph neural network reinforcement learning algorithm on the drawn samples according to the stochastic gradient descent method, as shown in equation (70);
L(θ_t) = E[(r + γ max_{a'} Q(x', A, a'; θ'_t) − Q(x, A, a; θ_t))²] (69)
θ_{t+1} = θ_t − α ∇_{θ_t} L(θ_t) (70)
where x, a and x', a' are the current state and action and the state and action at the next time, respectively; r denotes the immediate reward of the graph neural network reinforcement learning; θ_t denotes the parameters of the graph neural network reinforcement learning algorithm at the current time t; 0 ≤ γ ≤ 1 is the discount factor reflecting the effect of future Q values on the current action; max_{a'} Q(x', A, a'; θ'_t) denotes the state-action value under the target network parameters θ'_t; ∇_{θ_t} denotes differentiation with respect to θ_t; and α denotes the learning rate.
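The replay sampling of step 71 and the update of equations (69)-(70) can be sketched as follows; `q_target` forms the Bellman target from the target-network values, and `sgd_step` applies one gradient-descent step with the gradient assumed given.

```python
import random

def sample_batch(memory, batch_size, rng):
    """Step 71: randomly draw a mini-batch of (x, a, r, x') transitions
    from the memory unit D."""
    return rng.sample(memory, min(batch_size, len(memory)))

def q_target(r, next_q_values, gamma):
    """Target inside the loss of eq. (69):
    y = r + gamma * max_a' Q(x', A, a'; theta')."""
    return r + gamma * max(next_q_values)

def sgd_step(theta, grad, alpha):
    """Eq. (70): theta <- theta - alpha * grad, one parameter per entry."""
    return [t - alpha * g for t, g in zip(theta, grad)]

y = q_target(r=1.0, next_q_values=[0.2, 0.5], gamma=0.9)   # 1.0 + 0.9 * 0.5
theta_next = sgd_step([1.0, -0.5], [2.0, 1.0], alpha=0.1)
```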
Step 73: every certain number of steps, updating the target network parameters θ'_t with the current graph neural network reinforcement learning parameters θ_t.
9. Judging whether the predetermined time T_end is reached; if not, executing the second to eighth steps; if yes, outputting the parameters of the graph neural network reinforcement learning algorithm and the corresponding output results.
The invention provides a mobile emergency vehicle dispatching and power distribution network toughness improving method based on graph neural network reinforcement learning. The method takes the mobile energy storage vehicles in the study area as agents, abstracts their dynamic relationships into edges, and converts their cooperation and coordination relationships into a dynamic graph structure. The graph-neural-network-based reinforcement learning method combines the strong dynamic-relation processing capability of the graph neural network with the strong sequential stochastic optimization and decision capability of reinforcement learning, and can effectively solve the multi-agent dynamic interaction problem in an uncertain environment. Therefore, this embodiment formulates the scheduling strategy of the mobile emergency vehicles with the graph neural network reinforcement learning method, so as to effectively solve the toughness improvement problem that accounts for the uncertainty of fault repair times of the traffic network and power system lines. On the basis of an active power distribution network considering renewable energy output, a power distribution network reconstruction and optimal power flow model is constructed to formulate a power distribution network toughness improvement strategy, and the reward function of the mobile emergency energy storage vehicle is calculated from it, so that optimal scheduling of the mobile emergency energy storage vehicles is realized.
In summary, the method provided by this embodiment can effectively account for the uncertainty of fault repair times of traffic network and power system lines, and improves power distribution network toughness by formulating a power distribution network recovery strategy through the dispatch of mobile emergency energy storage vehicles.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the invention in any way, and any person skilled in the art may make modifications or alterations to the disclosed technical content to the equivalent embodiments. However, any simple modification, equivalent variation and variation of the above embodiments according to the technical substance of the present invention still fall within the protection scope of the technical solution of the present invention.
The present patent is not limited to the above-mentioned best embodiment, any person can obtain other various forms of mobile emergency vehicle dispatching and power distribution network toughness improving methods based on graphic neural network reinforcement learning under the teaching of the present patent, and all equivalent changes and modifications made according to the scope of the present application should be covered by the present patent.
Claims (7)
1. The mobile emergency vehicle dispatching and power distribution network toughness improving method based on graph neural network reinforcement learning, characterized by comprising the following steps:
step S1: initializing a power distribution network-traffic network model;
step S2: the fault of the traffic network and the power system line is randomly recovered;
step S3: constructing the state x_{i,t} of the graph neural network reinforcement learning algorithm according to the information of the power distribution network and the traffic system;
step S4: generating the scheduling behavior strategy a_{i,t} of the mobile emergency energy storage vehicle according to the ε-greedy algorithm and the graph neural network reinforcement learning algorithm;
step S5: executing the scheduling strategy a_{i,t} of the mobile emergency energy storage vehicle, and judging and updating the state of the mobile energy storage vehicle;
step S6: computing the reconstruction strategy of the power distribution network, and calculating the reward function r_{i,t} of the mobile emergency energy storage vehicle according to the power distribution network reconstruction and optimization;
step S7: updating the state x'_{i,t} of the graph neural network reinforcement learning algorithm;
step S8: storing the information (x_{i,t}, a_{i,t}, r_{i,t}, x'_{i,t}) of the current step in the memory unit D, and updating the weights of the graph neural network reinforcement learning algorithm based on the stochastic gradient descent method;
step S9: judging whether the predetermined time T_end is reached; if not, executing steps S2 to S8; if yes, outputting the parameters of the graph neural network reinforcement learning algorithm and the corresponding optimized scheduling results.
2. The mobile emergency vehicle dispatching and power distribution network toughness improving method based on graph neural network reinforcement learning according to claim 1, wherein step S1 specifically comprises the following steps:
step S11: the power distribution network model initialization at least comprises: initializing upper and lower limit voltages of nodes of a power distribution network system, parameters of lines and transformers, upper and lower constraints of voltage and output of a distributed generator, output of renewable energy sources, load rate of the power distribution network, capacity and position of a charging station, position of a fault line and expected recovery time;
step S12: the traffic network model initialization at least comprises: initializing the traffic network nodes, road lengths, road capacities, free-flow speeds, the positions of charging stations in the traffic network, the positions of faulted traffic nodes and lines, and the expected recovery times;
step S13: the neural network parameter initialization at least comprises: initializing the neural network weights W and bias terms b, and initializing the hyperparameters: learning rate α, discount factor γ, batch size B, and the capacity of the memory unit D;
step S14: time t=0 is initialized.
3. The mobile emergency vehicle dispatching and power distribution network toughness improving method based on graph neural network reinforcement learning according to claim 1, wherein in step S2, an extreme event is considered to break the weak lines of the power distribution network and simultaneously damage traffic roads; as the weather improves and emergency crews carry out rush repairs, the faulted power distribution network lines and the damaged roads of the traffic network are gradually restored, but the repair time exhibits a certain randomness owing to differences in damage degree and in the professional level of the repair crews; the fault recovery time of traffic network and power distribution network lines is therefore assumed to obey a normal distribution T_{L_i} ~ N(μ_{L_i}, σ²_{L_i}), where μ_{L_i} denotes the mean repair time of faulted element L_i and σ²_{L_i} denotes the variance of the repair time.
4. The method for scheduling mobile emergency vehicles and improving toughness of the power distribution network based on graph neural network reinforcement learning according to claim 1, wherein in step S3, the state x_{i,t} of the graph neural network reinforcement learning algorithm is composed, as in equation (1), of the own state EV_{i,t} of the i-th electric vehicle at time t, the neighbor electric vehicle states Ne_{i,t}, the line status (i.e. whether each line is disconnected), the power load demand, and the renewable energy output;
wherein equation (2) gives the i-th electric vehicle state EV_{i,t}, comprising the next node on the electric vehicle's route to the charging station, the number of the road on which it travels, the driving speed v_{i,t}, and the remaining state of charge SOC_{i,t} of the mobile energy storage vehicle;
equation (3) gives the neighbor electric vehicle state Ne_{i,t}, comprising the state of each neighboring electric vehicle k, namely the next node of the k-th electric vehicle adjacent to the i-th, the number of the road on which it travels, its driving speed v_{i,k,t}, and its remaining state of charge SOC_{i,k,t}.
5. The mobile emergency vehicle dispatching and power distribution network toughness improving method based on graph neural network reinforcement learning according to claim 1, wherein step S4 comprises the following steps:
step S41: with probability ε, generating the scheduling behavior strategy of the mobile energy storage vehicle in a random manner, namely
a_{i,t} = np.random.randint(||A_action||) (4)
where ||A_action|| denotes the number of behaviors of the mobile emergency energy storage vehicle; when a_{i,t} = 0, the mobile emergency energy storage vehicle stays at its location in a charging or discharging state; when a_{i,t} ≠ 0, the mobile emergency energy storage vehicle departs for the next destination for charging or discharging according to the scheduling behavior strategy a_{i,t}, so as to reduce load shedding of the power distribution network and improve its toughness;
step S42: with probability 1−ε, generating the scheduling behavior strategy a_{i,t} of the mobile emergency energy storage vehicle from the experience of the graph neural network reinforcement learning algorithm, namely:
a_{i,t} = argmax_a Q(x_{i,t}, A, a; θ_t) (5)
where argmax(·) returns the argument corresponding to the maximum value; Q(x_{i,t}, A, a; θ_t) denotes the action value function of the i-th mobile emergency energy storage vehicle in state x_{i,t} under the adjacency matrix A and the scheduling behavior strategy a; θ_t denotes the neural network parameters of the graph neural network reinforcement learning algorithm at time t;
the neural network structure of the graph neural network reinforcement learning algorithm comprises an input layer consisting of the state set x_t = {x_{1,t}, x_{2,t}, ..., x_{N,t}} of all mobile emergency energy storage vehicles and the relationship matrix formed by them, namely the adjacency matrix A; a fully connected layer extracts features x'_t from the input states x_t; two graph attention neural network layers then process the extracted features x'_t together with the adjacency matrix A; finally, a fully connected output layer produces all action value functions of the mobile emergency energy storage vehicle in state x_t under the adjacency matrix A; the mobile emergency energy storage vehicle agent selects its scheduling behavior strategy according to these action value functions.
6. The method for scheduling mobile emergency vehicles and improving toughness of the power distribution network based on graph neural network reinforcement learning according to claim 1, wherein in step S6, the reward function r_{i,t} of the mobile emergency energy storage vehicle is calculated according to the power distribution network reconstruction and optimal power flow optimization, specifically comprising the following steps:
step S61: updating the load rate of the power distribution network, and calculating the maximum charging power and maximum discharging power of the mobile energy storage vehicle;
Step S62: establishing a power distribution network reconstruction and a power distribution network optimal power flow model:
-Mα mn,t ≤V mn,t ≤Mα mn,t (9)
where S_b represents the set of all power distribution network nodes; S_r represents the potential root node set formed by the substation connection nodes, the distributed power supply nodes, and the head and tail nodes of faulted lines; N_b is the number of power distribution network nodes; α_{mn,t} is a binary variable representing the line state at time t, equal to 1 if the line is closed and 0 if open; γ_{n,t} is a binary variable equal to 1 if potential root node n serves as a root node of the power distribution network in period t and 0 otherwise; M is a sufficiently large constant; V_{mn,t} is the virtual power flow through branch m-n at time t;
the formula (6) shows that the power distribution network established in the network reconstruction process needs to meet the radial topological structure, namely the number of closed circuits is equal to the number of network nodes minus the number of subgraphs;
equations (7) to (9) are the constraint conditions of power distribution network reconstruction, which require the interior of each subgraph to be connected; these constraints involve the set of faulted lines at time t; pf_{mn,t} and qf_{mn,t}, the active and reactive power flows of distribution branch m-n at time t; pd_{m,t} and qd_{m,t}, the recovered active and reactive loads of the distribution branch at node m at time t; and pg_{m,t} and qg_{m,t}, the active and reactive outputs of distributed power supply m; the remaining quantities in the model are the active and reactive discharging power of the mobile emergency energy storage vehicle at the m-th node, the active and reactive outputs of the wind turbine connected to the m-th node, the original active and reactive load demands of the m-th node, the maximum charging and discharging power the mobile emergency energy storage vehicle can provide at the m-th node, and the rated active and reactive power of branch m-n, respectively;
the formulas (11) to (20) represent optimal power flow models of the power distribution network;
according to the reconstruction of the power distribution network and the optimal power flow demand of the power distribution network, an objective function is constructed as follows:
where the first term weights the load loss at each node i at time t by the unit value of that node's load; Δt denotes the calculation time scale; c_i denotes the cost for the distributed generator to produce a unit of load; S_g denotes the set of distributed generators; pg_{i,t} denotes the output active power of the distributed generator;
step S63: obtaining the power distribution network reconstruction scheme, the optimal power flow distribution, and the power of the emergency mobile energy storage vehicles by solving the model with the Gurobi solver;
Step S64: calculating a reward function r of the mobile emergency energy storage vehicle according to the obtained optimization i,t :
In the formula, v^{mess}_i denotes the unit value of the load recovered by the mobile energy storage vehicle. Equation (22) shows that the reward function of the mobile emergency energy storage vehicle is determined mainly by the amount of recovered load, so that the load of the power distribution network is restored as soon as possible, achieving the purpose of improving the toughness of the power distribution network.
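As a hedged illustration of this reward (the exact equation (22) is an image in the source), the sketch below computes a value-weighted sum of the load one vehicle recovers over a time step; all names and numbers are illustrative:

```python
def mess_reward(recovered_load_kw, unit_value, dt_h=1.0):
    """Reward of one mobile emergency energy storage vehicle over one time
    step: value-weighted load it recovered, in the spirit of equation (22).
    recovered_load_kw: {node: load (kW) this vehicle restores at that node}
    unit_value:        {node: unit value of the load at that node}
    dt_h:              time-step length (the Delta-t of the objective)
    """
    return sum(unit_value[n] * p * dt_h for n, p in recovered_load_kw.items())

r = mess_reward({4: 25.0, 7: 10.0}, {4: 2.0, 7: 1.0})
print(r)  # 25*2.0 + 10*1.0 = 60.0
```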
7. The method for dispatching mobile emergency vehicles and improving toughness of the power distribution network based on graphic neural network reinforcement learning according to claim 6, wherein in step S7 the weights of the graph neural network reinforcement learning algorithm are updated based on a stochastic gradient descent method, specifically comprising:
step S71: randomly extract a number of Samples (stored transitions) from the memory unit D;
step S72: construct the loss function shown in formula (23), and update the weights of the graph neural network reinforcement learning algorithm on the extracted Samples by stochastic gradient descent, as shown in formula (24);
wherein x, a and x', a' are the current state and action and the state and action at the next moment, respectively; r denotes the immediate reward of graph neural network reinforcement learning; θ_t denotes the parameters of the graph neural network reinforcement learning algorithm at the current time t; 0 ≤ γ ≤ 1 denotes the discount factor reflecting the effect of future Q values on the current action; and Q(x', a'; θ'_t) denotes the state-action value under the target graph neural network reinforcement learning parameters θ'_t;
where θ_t denotes the parameters of the graph neural network reinforcement learning algorithm at the current time t; ∇_{θ_t} denotes differentiation with respect to θ_t; and α denotes the learning rate;
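Formulas (23) and (24) appear only as images in the source; from the symbol definitions above, the standard DQN form they describe is, as a plausible reconstruction rather than the filing's exact rendering:

```latex
% Hedged reconstruction of (23)-(24): temporal-difference loss against
% the target network, followed by the stochastic gradient descent update.
\begin{aligned}
L(\theta_t) &= \Bigl( r + \gamma \max_{a'} Q(x', a'; \theta'_t)
               - Q(x, a; \theta_t) \Bigr)^{2} &&\text{(23)}\\
\theta_{t+1} &= \theta_t - \alpha \, \nabla_{\theta_t} L(\theta_t) &&\text{(24)}
\end{aligned}
```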
step S73: every given number of steps, update the target graph neural network reinforcement learning parameters θ'_t from the current graph neural network reinforcement learning parameters θ_t.
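Steps S71 to S73 follow the standard DQN recipe: replay sampling, the TD loss (23), the SGD update (24), and periodic target-network synchronization. The minimal sketch below uses a linear Q-function as a stand-in for the graph neural network; all dimensions, transitions, and hyperparameters are illustrative, not the patent's:

```python
import random

random.seed(0)
N_FEAT, N_ACT, GAMMA, ALPHA = 3, 2, 0.9, 0.05

def q_value(theta, x, a):
    """Linear stand-in for the graph-neural-network Q: Q(x, a) = theta[a] . x."""
    return sum(w * xi for w, xi in zip(theta[a], x))

theta = [[0.0] * N_FEAT for _ in range(N_ACT)]   # current parameters theta_t
theta_target = [row[:] for row in theta]         # target parameters theta'_t

# Step S71: sample transitions (x, a, r, x') from the memory unit D.
memory = [([1.0, 0.0, 0.5], 0, 1.0, [0.0, 1.0, 0.5]),
          ([0.5, 0.5, 0.0], 1, 0.0, [1.0, 0.0, 1.0])]
batch = random.sample(memory, 2)

for step in range(200):
    for x, a, r, x_next in batch:
        # Step S72, loss (23): squared TD error against the target network,
        # with target y = r + gamma * max_a' Q(x', a'; theta'_t).
        y = r + GAMMA * max(q_value(theta_target, x_next, ap) for ap in range(N_ACT))
        td_err = y - q_value(theta, x, a)
        # Update (24): SGD step; for a linear Q, dQ/dtheta[a] = x.
        for j in range(N_FEAT):
            theta[a][j] += ALPHA * td_err * x[j]
    # Step S73: every fixed number of steps, copy theta_t into theta'_t.
    if step % 20 == 19:
        theta_target = [row[:] for row in theta]

print(round(q_value(theta, [1.0, 0.0, 0.5], 0), 2))
```

Freezing θ'_t between synchronizations is what stabilizes the bootstrap target; with γ > 0 the learned Q for the rewarded transition grows well above its immediate reward as value propagates.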
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310064061.3A CN116151562A (en) | 2023-01-13 | 2023-01-13 | Mobile emergency vehicle scheduling and power distribution network toughness improving method based on graphic neural network reinforcement learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310064061.3A CN116151562A (en) | 2023-01-13 | 2023-01-13 | Mobile emergency vehicle scheduling and power distribution network toughness improving method based on graphic neural network reinforcement learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116151562A true CN116151562A (en) | 2023-05-23 |
Family
ID=86338494
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310064061.3A Pending CN116151562A (en) | 2023-01-13 | 2023-01-13 | Mobile emergency vehicle scheduling and power distribution network toughness improving method based on graphic neural network reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116151562A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116526477A (en) * | 2023-06-30 | 2023-08-01 | 南方电网数字电网研究院有限公司 | Method and device for determining power grid reconstruction strategy, computer equipment and storage medium |
CN116526477B (en) * | 2023-06-30 | 2024-03-26 | 南方电网数字电网研究院有限公司 | Method and device for determining power grid reconstruction strategy, computer equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110852627B (en) | Decision method and device for post-disaster first-aid repair of power distribution network | |
CN111628501B (en) | AC/DC large power grid transient voltage stability assessment method and system | |
CN112310980B (en) | Safety and stability evaluation method and system for direct-current blocking frequency of alternating-current and direct-current series-parallel power grid | |
CN112215434A (en) | LSTM model generation method, charging duration prediction method and medium | |
CN116151562A (en) | Mobile emergency vehicle scheduling and power distribution network toughness improving method based on graphic neural network reinforcement learning | |
CN106199450A (en) | A kind of battery health evaluation system and method | |
CN114142530A (en) | Risk scheduling method considering N-1 security constraint based on near-end policy optimization algorithm | |
CN113872198B (en) | Active power distribution network fault recovery method based on reinforcement learning method | |
CN114696351A (en) | Dynamic optimization method and device for battery energy storage system, electronic equipment and storage medium | |
CN114444802B (en) | Electric vehicle charging guide optimization method based on graph neural network reinforcement learning | |
CN107394772A (en) | Consider that the power system blackstart of integration node weight recovers Multipurpose Optimal Method | |
CN104156774B (en) | A kind of electric power support method for considering adjacent system | |
CN110221167A (en) | A kind of electric system short circuit on transmission line diagnostic method based on determining study | |
CN114123178B (en) | Multi-agent reinforcement learning-based intelligent power grid partition network reconstruction method | |
CN113761791A (en) | Power system automatic operation method and device based on physical information and deep reinforcement learning | |
CN118157173A (en) | Power distribution network rush-repair method and system | |
CN114239206A (en) | Post-disaster circuit-circuit cooperative repair method, system and equipment based on reinforcement learning | |
CN116961057A (en) | Multi-period power distribution network fault recovery method considering electric automobile | |
CN117057623A (en) | Comprehensive power grid safety optimization scheduling method, device and storage medium | |
CN116316607A (en) | Combined optimization scheduling method, system, equipment and medium | |
CN109932617A (en) | A kind of adaptive electric network failure diagnosis method based on deep learning | |
CN111105025A (en) | Urban high-voltage distribution network blocking management method based on data-driven heuristic optimization | |
CN115859627A (en) | Dynamic simulation model, screening method, equipment and medium for cascading failure of power system | |
CN115169595A (en) | Power distribution network fault first-aid repair method and device and electronic equipment | |
CN114357677A (en) | Method for evaluating out-of-limit risk of node voltage of charging and discharging distribution network of electric automobile |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||