CN111478331B - Method and system for adjusting power flow convergence of power system

Method and system for adjusting power flow convergence of power system

Info

Publication number
CN111478331B
CN111478331B (application CN202010187181.9A)
Authority
CN
China
Prior art keywords
state
power system
network
adjusting
action
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010187181.9A
Other languages
Chinese (zh)
Other versions
CN111478331A (en)
Inventor
徐华廷
于之虹
侯金秀
戴红阳
贾育培
王兵
张璐路
魏亚威
解梅
史东宇
吕颖
鲁广明
田芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
Original Assignee
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC and China Electric Power Research Institute Co Ltd CEPRI
Priority to CN202010187181.9A
Publication of CN111478331A
Application granted
Publication of CN111478331B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00Circuit arrangements for ac mains or ac distribution networks
    • H02J3/04Circuit arrangements for ac mains or ac distribution networks for connecting networks of the same frequency but supplied from different sources
    • H02J3/06Controlling transfer of power between connected networks; Controlling sharing of load between connected networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Power Engineering (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The invention discloses a method and a system for adjusting the power flow convergence of a power system, belonging to the technical field of power flow convergence adjustment for large power grids. The method comprises the following steps: determining the input and output dimensions of a Q neural network model according to a state space and an action space; determining a mapping relation between the action space and the start-stop states of the generators of the power system, and adjusting the operating states of the generators according to the adjustment actions output by the model during training; training the Q neural network model according to the input and output dimensions, taking the target load level and the generator start-stop states as inputs and the adjustment actions as outputs; and adjusting the power flow of the power system to a converged state according to the adjustment actions. The method does not depend on expert experience, greatly saves manpower, helps to improve the automation level of power system operation mode calculation, and has great engineering application value and promotion prospects.

Description

Method and system for adjusting power flow convergence of power system
Technical Field
The present invention relates to the field of large power grid power flow convergence adjustment technologies, and more particularly, to a method and a system for adjusting power flow convergence of a power system.
Background
Operation mode calculation is the basis for ensuring the safe and stable operation of a power system, and power flow calculation is a core task within it: static stability analysis, transient stability analysis and the like under various operation modes are all based on power flow calculation results. In recent years, the construction scale of the power grid in China has expanded remarkably; in particular, an extra-high-voltage AC/DC hybrid large power grid pattern has gradually formed. The power flow convergence problem in large power grid power flow calculation has become increasingly prominent, and solving it is one of the most time-consuming links in operation mode calculation. At present, this work is still mainly completed manually: the large power grid is divided into several small sub-areas, boundary conditions are set between the sub-areas, a converged power flow is obtained for each sub-area separately, and the sub-areas are then spliced together step by step until a converged power flow for the complete large power grid is finally obtained.
Disclosure of Invention
In view of the above problems, the present invention provides a method for adjusting power flow convergence of a power system, comprising:
acquiring historical base-state load states of the power system, determining a state space and an action space according to the load levels of the historical base-state load states and the adjustable electrical elements in the system, and determining the input and output dimensions of a Q neural network model according to the state space and the action space;
determining a mapping relation between the action space and the generator start-stop states of the power system, and adjusting the operating states of the generators according to the adjustment actions output by the model during training;
training the Q neural network model according to the input and output dimensions, taking the target load level and the generator start-stop states of the power system as inputs and the adjustment actions as outputs;
and acquiring the target load level and the current generator start-stop states, determining an adjustment action according to the Q neural network model, and adjusting the power flow of the power system to a converged state according to the adjustment action.
Optionally, the base-state load state includes a load active power value and a load reactive power value.
Optionally, the state space includes the generator start-stop states of the power system and the load states of the power system.
Optionally, the action space includes the start-stop states of all adjustable generators in the power system.
Optionally, training the Q neural network model specifically comprises:
initializing a current Q network and randomly initializing its weight parameters;
initializing a target Q network and assigning the weight parameters of the current Q network to it; the current Q network and the target Q network have the same structure;
initializing a playback memory unit for storing the experience tuples generated in the training process, wherein each experience tuple comprises a system state transition, an adjustment action and an instant reward;
after the playback memory unit is full, the earliest stored experience tuples are eliminated;
inputting a target load level and the generator start-stop states of the power system to the current Q network;
the current Q network outputs an adjustment action according to the target load level and the generator start-stop states of the power system;
executing the adjustment action with a preset probability, wherein the adjustment action inverts the operating state of the corresponding adjustable generator;
determining the reward value of the adjustment action, acquiring the next operating state of the power system, and storing the state transition to the next operating state, the instant reward and the adjustment action as an experience tuple in the playback memory unit;
randomly extracting a plurality of experiences from the playback memory unit, calculating the prediction deviation of the cumulative reward for each experience, updating the weight parameters of the current Q network along the gradient descent direction according to the sum of the prediction deviations, and copying the weight parameters of the current Q network into the target Q network at a preset frequency;
and performing a preset number of training rounds on the target Q network until the average cumulative reward reaches a high level and remains stable, thereby generating the Q neural network model.
The invention also provides a system for adjusting power flow convergence of a power system, comprising:
the acquisition module is used for acquiring the historical base state load state of the power system, determining a state space and an action space according to the load level of the historical base state load state and adjustable electrical elements in the system, and determining the input dimension and the output dimension of the Q neural network model according to the state space and the action space;
the processing module is used for determining the mapping relation between the action space and the start-stop state of the power system generator and adjusting the running state of the power system generator according to the adjustment action output by the training model;
the training module trains the Q neural network model according to the input and output dimensions, taking the target load level and the generator start-stop states of the power system as inputs and the adjustment actions as outputs;
and the adjusting module is used for acquiring the target load level and the current start-stop state of the generator, determining an adjusting action according to the Q neural network model, and adjusting the power flow of the power system to a convergence state according to the adjusting action.
Optionally, the base-state load state includes a load active power value and a load reactive power value.
Optionally, the state space includes the generator start-stop states of the power system and the load states of the power system.
Optionally, the action space includes the start-stop states of all adjustable generators in the power system.
Optionally, training the Q neural network model specifically comprises:
initializing a current Q network and randomly initializing its weight parameters;
initializing a target Q network and assigning the weight parameters of the current Q network to it; the current Q network and the target Q network have the same structure;
initializing a playback memory unit for storing the experience tuples generated in the training process, wherein each experience tuple comprises a system state transition, an adjustment action and an instant reward;
after the playback memory unit is full, the earliest stored experience tuples are eliminated;
inputting a target load level and the generator start-stop states of the power system to the current Q network;
the current Q network outputs an adjustment action according to the target load level and the generator start-stop states of the power system;
executing the adjustment action with a preset probability, wherein the adjustment action inverts the operating state of the corresponding adjustable generator;
determining the reward value of the adjustment action, acquiring the next operating state of the power system, and storing the state transition to the next operating state, the instant reward and the adjustment action as an experience tuple in the playback memory unit;
randomly extracting a plurality of experiences from the playback memory unit, calculating the prediction deviation of the cumulative reward for each experience, updating the weight parameters of the current Q network along the gradient descent direction according to the sum of the prediction deviations, and copying the weight parameters of the current Q network into the target Q network at a preset frequency;
and performing a preset number of training rounds on the target Q network until the average cumulative reward reaches a high level and remains stable, thereby generating the Q neural network model.
The method does not depend on expert experience, greatly saves manpower, helps to improve the automation level of power system operation mode calculation, and has great engineering application value and promotion prospects.
Drawings
FIG. 1 is a flow chart of a method for adjusting power flow convergence in an electrical power system according to the present invention;
FIG. 2 is a flow chart of an embodiment of a method for adjusting power flow convergence in a power system according to the invention;
FIG. 3 is a block diagram of a system for adjusting power flow convergence in a power system according to the present invention.
Detailed Description
The exemplary embodiments of the present invention will now be described with reference to the accompanying drawings; however, the present invention may be embodied in many different forms and is not limited to the embodiments described herein, which are provided so that the disclosure of the present invention will be thorough and complete and will fully convey the scope of the invention to those skilled in the art. The terms used in the exemplary embodiments shown in the drawings are not intended to limit the present invention. In the drawings, the same units/elements are denoted by the same reference numerals.
Unless otherwise defined, terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. In addition, it will be understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense.
The invention provides a method for adjusting power flow convergence of a power system, as shown in FIG. 1, comprising:
acquiring historical base-state load states of the power system, determining a state space and an action space according to the load levels of the historical base-state load states and the adjustable electrical elements in the system, and determining the input and output dimensions of a Q neural network model according to the state space and the action space;
determining a mapping relation between the action space and the generator start-stop states of the power system, and adjusting the operating states of the generators according to the adjustment actions output by the model during training;
training the Q neural network model according to the input and output dimensions, taking the target load level and the generator start-stop states of the power system as inputs and the adjustment actions as outputs;
and acquiring the target load level and the current generator start-stop states, determining an adjustment action according to the Q neural network model, and adjusting the power flow of the power system to a converged state according to the adjustment action.
The base-state load state comprises a load active power value and a load reactive power value.
The state space comprises the generator start-stop states of the power system and the load states of the power system.
The action space comprises the start-stop states of all adjustable generators in the power system.
Training the Q neural network model specifically comprises:
initializing a current Q network and randomly initializing its weight parameters;
initializing a target Q network and assigning the weight parameters of the current Q network to it; the current Q network and the target Q network have the same structure;
initializing a playback memory unit for storing the experience tuples generated in the training process, wherein each experience tuple comprises a system state transition, an adjustment action and an instant reward;
after the playback memory unit is full, the earliest stored experience tuples are eliminated;
inputting a target load level and the generator start-stop states of the power system to the current Q network;
the current Q network outputs an adjustment action according to the target load level and the generator start-stop states of the power system;
executing the adjustment action with a preset probability, wherein the adjustment action inverts the operating state of the corresponding adjustable generator;
determining the reward value of the adjustment action, acquiring the next operating state of the power system, and storing the state transition to the next operating state, the instant reward and the adjustment action as an experience tuple in the playback memory unit;
randomly extracting a plurality of experiences from the playback memory unit, calculating the prediction deviation of the cumulative reward for each experience, updating the weight parameters of the current Q network along the gradient descent direction according to the sum of the prediction deviations, and copying the weight parameters of the current Q network into the target Q network at a preset frequency;
and performing a preset number of training rounds on the target Q network until the average cumulative reward reaches a high level and remains stable, thereby generating the Q neural network model.
The invention is further illustrated by the following examples:
the method of the present invention, as shown in fig. 2, comprises:
s1, inputting a basic state load state, wherein the specific content comprises a load active power value and a load reactive power value;
s2, determining a state space and an action space, wherein the content of the state space comprises the starting and stopping state and the current load state of the generators in the power system, and the action space corresponds to the number G of the generators which can be started and stopped in the power system;
s3, randomly generating a load state and providing a random training target for each training round;
s4, determining a mapping relation between an action space and an actual starting and stopping state of the generator, and adjusting the running state of the generator of the power system according to a specific action given by a proposed algorithm;
and S5, constructing a value-function-based deep reinforcement learning structure, taking the generator start-stop states and the load state of the power system as inputs and the generator adjustment strategy as output, training the current Q network and the target Q network to obtain a trained model, and adjusting the target power system according to the adjustment actions of the trained model.
In step S2, the state space and the action space are determined: the state space comprises the start-stop states of the generators in the power system and the current load state, and the action space corresponds to the number G of generators that can be started and stopped in the power system. The specific steps include:
S21, representing the start-stop state of each generator in the power system by a 0-1 variable, where 0 indicates that the generator is shut down and 1 indicates that the generator is running;
S22, numbering the N load states that may occur in the power system and representing them with binary codes;
S23, combining the generator states represented by the 0-1 variables with the binary-coded load state to jointly form the input state of step S5;
and S24, taking the start-stop actions of the G adjustable generators of the power system as the action space of step S5.
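For illustration, the following Python sketch shows one way to implement the state encoding of steps S21 to S24; all identifiers and dimension values are illustrative assumptions, not taken from the patent.

    import numpy as np

    def encode_state(gen_on, load_id, num_load_bits):
        # S21: gen_on is a list of 0-1 generator start-stop flags.
        # S22: load_id is the number of the load state, binary-coded below.
        load_bits = [(load_id >> k) & 1 for k in range(num_load_bits)]
        # S23: the joint input state is the concatenation of both parts.
        return np.array(list(gen_on) + load_bits, dtype=np.float32)

    # Example: 5 generators (G = 5), load state number 6 out of N = 16 (4 bits).
    state = encode_state([1, 0, 1, 1, 0], load_id=6, num_load_bits=4)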
In step S4, a mapping relation between the action space and the actual generator start-stop states is determined, and the operating states of the generators of the power system are adjusted according to the specific action given by the proposed algorithm. The specific steps include:
S41, numbering all adjustable generators of the power system, each number corresponding to one output action of step S5;
and S42, executing an action inverts the start-stop state of the corresponding generator; that is, when action i is output in step S5, the operating state of the generator numbered i is inverted.
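A minimal sketch of the S41-S42 mapping under the same illustrative naming: output action i simply inverts the start-stop state of the generator numbered i.

    def apply_action(gen_on, i):
        # S42: invert the start-stop state of generator i (0 -> 1, 1 -> 0).
        gen_on = list(gen_on)
        gen_on[i] = 1 - gen_on[i]
        return gen_on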
In step S5, a value-function-based deep reinforcement learning structure is constructed, taking the generator start-stop states and the load state of the power system as inputs and the generator adjustment strategy as output, and the current Q network and the target Q network are trained. The specific steps include:
s51, the deep reinforcement learning algorithm structure based on the value function comprises a current Q network and a target Q network, the structures of the two networks are completely the same, the two networks are both n layers of fully-connected neural networks, the input of the two networks is the system state quantity formed by the generator and the load in the step S2, the output of the two networks is a G-dimensional column vector, and the column vector element G is i Corresponding to all adjustable generators i one by one, and the action a of the output t Adjusting the starting and stopping state of the generator corresponding to the maximum value;
s52, initializing a playback memory unit D with the capacity of M, storing the training samples, and continuously removing the oldest stored old samples after the samples are fully stored;
s53, initializing the current Q network, randomly initializing the weight parameter theta of the current Q network, initializing a target Q network, wherein the structure and the initialization weight theta' of the target Q network are the same as those of the current Q network
S54, the current Q network outputs an action a_t based on the current state s_t and the load state information randomly generated in step S3;
S55, executing action a_t with probability 1−ε, i.e., inverting the operating state of the corresponding generator (if the generator is running it is shut down; if it is stopped it is put into operation), and with a small probability ε randomly inverting the operating state of any one generator;
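Steps S54 and S55 together form an ε-greedy policy; a sketch follows, with the ε value chosen only for illustration.

    import random
    import torch

    def select_action(q_net, state, num_gens, epsilon=0.05):
        # S55: with small probability ε, randomly invert one generator.
        if random.random() < epsilon:
            return random.randrange(num_gens)
        # S54: otherwise act greedily on the current Q network's G outputs.
        with torch.no_grad():
            q_values = q_net(torch.as_tensor(state))
        return int(q_values.argmax())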
s56, calculating the reward value r under the corresponding action according to the formula (1) t The system state is from s t Is transferred to s t+1 A 1 is to t ,a t ,r t ,s t+1 Combining into one experience(s) t ,a t ,r t ,s t+1 ) And stores it in a playback memory unit D
Figure BDA0002414606790000081
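Since formula (1) is published only as an image, the shape below is an assumption inferred from the success criterion of step S59 (a converged power flow without balancing-machine limit violations), not the patent's actual reward function.

    def reward(converged, slack_ok):
        # ASSUMED reward shape, not formula (1) itself.
        if converged and slack_ok:
            return 1.0    # assumed reward for a successful adjustment
        return -0.1       # assumed per-step penalty while unconverged/over limit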
S57, randomly extracting m experience tuples from the playback memory unit D, and calculating y_j according to formula (2):

    y_j = r_j,                                      if s_{j+1} is a terminal state
    y_j = r_j + γ · max_{a′} Q′(s_{j+1}, a′; θ′),   otherwise        (2)
S58, calculating the total loss over the m experiences:

    L = (1/m) · Σ_{j=1}^{m} [ y_j − Q(s_j, a_j; θ) ]²
updating the Q network parameters θ along the gradient descent direction using the Adam algorithm, and copying the Q network parameters completely to the target Q network every C training steps;
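Steps S57 and S58 are the standard deep Q-learning update. The sketch below reuses q_net, target_net and D from the sketches above; γ, C, m and the learning rate are illustrative, and the done flag marking episode termination is an added assumption.

    import numpy as np
    import torch

    optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
    gamma, C = 0.99, 100   # discount factor and target-copy interval (illustrative)

    def update(step, m=32):
        s, a, r, s2, done = zip(*D.sample(m))
        s = torch.as_tensor(np.stack(s))
        s2 = torch.as_tensor(np.stack(s2))
        a = torch.as_tensor(a)
        r = torch.as_tensor(r, dtype=torch.float32)
        done = torch.as_tensor(done, dtype=torch.float32)
        with torch.no_grad():   # formula (2): targets come from the target network
            y = r + gamma * target_net(s2).max(dim=1).values * (1.0 - done)
        q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
        loss = ((y - q) ** 2).mean()   # L = (1/m) * Σ_j (y_j - Q(s_j, a_j; θ))²
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if step % C == 0:              # S58: copy θ into θ' every C training steps
            target_net.load_state_dict(q_net.state_dict())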
s59, repeating the steps S54-S58 until a convergent power flow state without exceeding the limit of the balancing machine is obtained or the maximum iteration step number T of the current training round is reached by adjusting the running state of the generator based on the given load state;
and S510, re-executing step S3 to randomly generate a new training target, and repeating step S59 until the Q network can adjust the power flow to a converged state without balancing-machine limit violations for all load states.
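An illustrative outer loop for S59 and S510 is sketched below, reusing the pieces above; run_power_flow is a hypothetical stand-in for the power flow solver that acts as the environment, stubbed here so the sketch is self-contained.

    import random

    N_LOADS, NUM_LOAD_BITS, T_MAX, NUM_EPISODES = 16, 4, 50, 1000  # illustrative

    def run_power_flow(gen_on, load_id):
        # Hypothetical solver hook: returns (converged, slack_within_limits).
        return True, True  # stub; a real version would run e.g. a Newton-Raphson solve

    for episode in range(NUM_EPISODES):
        load_id = random.randrange(N_LOADS)   # S3: random training target
        gen_on = [1, 0, 1, 1, 0]              # illustrative initial start-stop states
        for t in range(T_MAX):                # S59: at most T steps per round
            s = encode_state(gen_on, load_id, NUM_LOAD_BITS)
            a = select_action(q_net, s, num_gens=len(gen_on))
            gen_on = apply_action(gen_on, a)
            converged, slack_ok = run_power_flow(gen_on, load_id)
            if converged and slack_ok:
                break                         # success criterion of S59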
The present invention also proposes a system 200 for adjusting power flow convergence in a power system, as shown in FIG. 3, comprising:
the acquisition module 201 is used for acquiring the historical base state load state of the power system, determining a state space and an action space according to the load level of the historical base state load state and adjustable electrical elements in the system, and determining the input dimension and the output dimension of the Q neural network model according to the state space and the action space;
the processing module 202 determines a mapping relation between an action space and a start-stop state of the power system generator, and adjusts the running state of the power system generator according to an adjusting action output by the training model;
the training module 203 trains the Q neural network model according to the input and output dimensions, taking the target load level and the generator start-stop states of the power system as inputs and the adjustment actions as outputs;
the adjusting module 204 obtains the target load level and the current start-stop state of the generator, determines an adjusting action according to the Q neural network model, and adjusts the power flow of the power system to a convergence state according to the adjusting action.
The base-state load state comprises a load active power value and a load reactive power value.
The state space comprises the generator start-stop states of the power system and the load states of the power system.
The action space comprises the start-stop states of all adjustable generators in the power system.
Training the Q neural network model specifically comprises:
initializing a current Q network and randomly initializing its weight parameters;
initializing a target Q network and assigning the weight parameters of the current Q network to it; the current Q network and the target Q network have the same structure;
initializing a playback memory unit for storing the experience tuples generated in the training process, wherein each experience tuple comprises a system state transition, an adjustment action and an instant reward;
after the playback memory unit is full, the earliest stored experience tuples are eliminated;
inputting a target load level and the generator start-stop states of the power system to the current Q network;
the current Q network outputs an adjustment action according to the target load level and the generator start-stop states of the power system;
executing the adjustment action with a preset probability, wherein the adjustment action inverts the operating state of the corresponding adjustable generator;
determining the reward value of the adjustment action, acquiring the next operating state of the power system, and storing the state transition to the next operating state, the instant reward and the adjustment action as an experience tuple in the playback memory unit;
randomly extracting a plurality of experiences from the playback memory unit, calculating the prediction deviation of the cumulative reward for each experience, updating the weight parameters of the current Q network along the gradient descent direction according to the sum of the prediction deviations, and copying the weight parameters of the current Q network into the target Q network at a preset frequency;
and performing a preset number of training rounds on the target Q network until the average cumulative reward reaches a high level and remains stable, thereby generating the Q neural network model.
The method does not depend on expert experience, greatly saves manpower, helps to improve the automation level of power system operation mode calculation, and has great engineering application value and promotion prospects.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The scheme in the embodiment of the application can be implemented in various computer languages, such as the object-oriented programming language Java and the interpreted scripting language JavaScript.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (6)

1. A method for adjusting power system power flow convergence, the method comprising:
acquiring historical base-state load states of a power system, determining a state space and an action space according to the load levels of the historical base-state load states and the adjustable electrical elements in the system, and determining the input and output dimensions of a Q neural network model according to the state space and the action space, wherein
the action space comprises the start-stop states of all adjustable generators in the power system, the start-stop state of each generator being represented by a 0-1 variable, where 0 indicates that the generator is shut down and 1 indicates that the generator is running;
determining a mapping relation between the action space and the generator start-stop states of the power system, and adjusting the operating states of the generators according to the adjustment actions output by the model during training;
training the Q neural network model according to the input and output dimensions, taking the target load level and the generator start-stop states of the power system as inputs and the adjustment actions as outputs;
obtaining the target load level and the current generator start-stop states, determining an adjustment action according to the Q neural network model, and adjusting the power flow of the power system to a converged state according to the adjustment action, wherein
The training of the Q neural network model specifically comprises the following steps:
initializing a current Q network and randomly initializing its weight parameters;
initializing a target Q network and assigning the weight parameters of the current Q network to it; the current Q network and the target Q network have the same structure;
initializing a playback memory unit for storing the experience tuples generated in the training process, wherein each experience tuple comprises a system state transition, an adjustment action and an instant reward;
after the playback memory unit is full, the earliest stored experience tuples are eliminated;
inputting a target load level and the generator start-stop states of the power system to the current Q network;
the current Q network outputs an adjustment action according to the target load level and the generator start-stop states of the power system;
executing the adjustment action with a preset probability, wherein the adjustment action inverts the operating state of the corresponding adjustable generator;
determining the reward value of the adjustment action, acquiring the next operating state of the power system, and storing the state transition to the next operating state, the instant reward and the adjustment action as an experience tuple in the playback memory unit;
randomly extracting a plurality of experiences from the playback memory unit, calculating the prediction deviation of the cumulative reward for each experience, updating the weight parameters of the current Q network along the gradient descent direction according to the sum of the prediction deviations, and copying the weight parameters of the current Q network into the target Q network at a preset frequency;
and performing a preset number of training rounds on the target Q network until the average cumulative reward reaches a high level and remains stable, thereby generating the Q neural network model, wherein
The reward function for the reward value is as follows:
[formula (1): reward function, published as an image in the original document]
2. The method of claim 1, wherein the base-state load state comprises a load active power value and a load reactive power value.
3. The method of claim 1, wherein the state space comprises the generator start-stop states of the power system and the load states of the power system.
4. A system for adjusting power system power flow convergence, the system comprising:
the acquisition module acquires the historical base state load state of the power system, determines a state space and an action space according to the load level of the historical base state load state and adjustable electrical elements in the system, and determines the input dimension and the output dimension of the Q neural network model according to the state space and the action space, wherein
the action space comprises the start-stop states of all adjustable generators in the power system, the start-stop state of each generator being represented by a 0-1 variable, where 0 indicates that the generator is shut down and 1 indicates that the generator is running;
the processing module is used for determining the mapping relation between the action space and the start-stop state of the power system generator and adjusting the running state of the power system generator according to the adjustment action output by the training model;
the training module trains the Q neural network model according to the input and output dimensions, taking the target load level and the generator start-stop states of the power system as inputs and the adjustment actions as outputs;
and the adjusting module acquires the target load level and the current generator start-stop states, determines an adjustment action according to the Q neural network model, and adjusts the power flow of the power system to a converged state according to the adjustment action, wherein
The training of the Q neural network model specifically comprises the following steps:
initializing a current Q network and randomly initializing its weight parameters;
initializing a target Q network and assigning the weight parameters of the current Q network to it; the current Q network and the target Q network have the same structure;
initializing a playback memory unit for storing the experience tuples generated in the training process, wherein each experience tuple comprises a system state transition, an adjustment action and an instant reward;
after the playback memory unit is full, the earliest stored experience tuples are eliminated;
inputting a target load level and the generator start-stop states of the power system to the current Q network;
the current Q network outputs an adjustment action according to the target load level and the generator start-stop states of the power system;
executing the adjustment action with a preset probability, wherein the adjustment action inverts the operating state of the corresponding adjustable generator;
determining the reward value of the adjustment action, acquiring the next operating state of the power system, and storing the state transition to the next operating state, the instant reward and the adjustment action as an experience tuple in the playback memory unit;
randomly extracting a plurality of experiences from the playback memory unit, calculating the prediction deviation of the cumulative reward for each experience, updating the weight parameters of the current Q network along the gradient descent direction according to the sum of the prediction deviations, and copying the weight parameters of the current Q network into the target Q network at a preset frequency;
and performing a preset number of training rounds on the target Q network until the average cumulative reward reaches a high level and remains stable, thereby generating the Q neural network model, wherein
The reward function for the reward value is as follows:
[formula (1): reward function, published as an image in the original document]
5. The system of claim 4, wherein the base-state load state comprises a load active power value and a load reactive power value.
6. The system of claim 4, wherein the state space comprises the generator start-stop states of the power system and the load states of the power system.
CN202010187181.9A 2020-03-17 2020-03-17 Method and system for adjusting power flow convergence of power system Active CN111478331B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010187181.9A CN111478331B (en) 2020-03-17 2020-03-17 Method and system for adjusting power flow convergence of power system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010187181.9A CN111478331B (en) 2020-03-17 2020-03-17 Method and system for adjusting power flow convergence of power system

Publications (2)

Publication Number Publication Date
CN111478331A CN111478331A (en) 2020-07-31
CN111478331B (en) 2023-01-06

Family

ID=71747519

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010187181.9A Active CN111478331B (en) 2020-03-17 2020-03-17 Method and system for adjusting power flow convergence of power system

Country Status (1)

Country Link
CN (1) CN111478331B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114362151B (en) * 2021-12-23 2023-12-12 浙江大学 Power flow convergence adjustment method based on deep reinforcement learning and cascade graph neural network

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110443447A (en) * 2019-07-01 2019-11-12 中国电力科学研究院有限公司 A kind of method and system learning adjustment electric power system tide based on deeply
CN110535146A (en) * 2019-08-27 2019-12-03 哈尔滨工业大学 The Method for Reactive Power Optimization in Power of Policy-Gradient Reinforcement Learning is determined based on depth

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08317559A (en) * 1995-05-12 1996-11-29 Toshiba Corp Training simulator for power system operation
US8756047B2 (en) * 2010-09-27 2014-06-17 Sureshchandra B Patel Method of artificial nueral network loadflow computation for electrical power system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110443447A (en) * 2019-07-01 2019-11-12 中国电力科学研究院有限公司 A kind of method and system learning adjustment electric power system tide based on deeply
CN110535146A (en) * 2019-08-27 2019-12-03 哈尔滨工业大学 The Method for Reactive Power Optimization in Power of Policy-Gradient Reinforcement Learning is determined based on depth

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Deep reinforcement learning algorithm for voltage regulation of distribution networks with energy storage systems; Shi Jingjian et al.; Electric Power Construction; 2020-03-01 (No. 03); 73-80 *
Automatic adjustment method for power flow calculation convergence of large power grids based on knowledge, experience and deep reinforcement learning; Wang Tianjing et al.; Proceedings of the CSEE; 2020-03-04; Vol. 40 (No. 8); 2396-2405, S2 *
Wang Tianjing et al. Automatic adjustment method for power flow calculation convergence of large power grids based on knowledge, experience and deep reinforcement learning. Proceedings of the CSEE. 2020, Vol. 40 (No. 8), 2396-2405, S2. *

Also Published As

Publication number Publication date
CN111478331A (en) 2020-07-31

Similar Documents

Publication Publication Date Title
Ghadimi A new hybrid algorithm based on optimal fuzzy controller in multimachine power system
US20210133536A1 (en) Load prediction method and apparatus based on neural network
CN108365608B (en) Uncertain optimization scheduling method and system for regional energy Internet
CN115085202A (en) Power grid multi-region intelligent power collaborative optimization method, device, equipment and medium
CN115293052A (en) Power system active power flow online optimization control method, storage medium and device
CN115940294B (en) Multi-stage power grid real-time scheduling strategy adjustment method, system, equipment and storage medium
CN108323797A (en) Cigarette Weight Control System based on GPR models starts position predicting method and system
CN111478331B (en) Method and system for adjusting power flow convergence of power system
CN116169698A (en) Distributed energy storage optimal configuration method and system for stable new energy consumption
CN104915788B (en) A method of considering the Electrical Power System Dynamic economic load dispatching of windy field correlation
CN113315131A (en) Intelligent power grid operation mode adjusting method and system
Lopez-Garcia et al. Power flow analysis via typed graph neural networks
CN115765050A (en) Power system safety correction control method, system, equipment and storage medium
CN110362378A (en) A kind of method for scheduling task and equipment
CN115995847B (en) Micro-grid black start method, device, system and storage medium
CN106054665B (en) A kind of large-scale photovoltaic inverter system divides group's equivalent modeling method
CN111695623A (en) Large-scale battery energy storage system group modeling method, system and equipment based on fuzzy clustering and readable storage medium
CN114880932B (en) Power grid operating environment simulation method, system, equipment and medium
Jasmin et al. A Reinforcement Learning algorithm to Economic Dispatch considering transmission losses
CN113991752A (en) Power grid quasi-real-time intelligent control method and system
CN113221390A (en) Training method and device for scheduling model
Yasin et al. Optimal least squares support vector machines parameter selection in predicting the output of distributed generation
Zhu et al. Mitigating multi-stage cascading failure by reinforcement learning
CN111598294A (en) Active power distribution network reconstruction algorithm and device based on improved teaching optimization
CN115169754B (en) Energy scheduling method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant