CN111478331B - Method and system for adjusting power flow convergence of power system

Method and system for adjusting power flow convergence of power system

Info

Publication number
CN111478331B
CN111478331B (application CN202010187181.9A)
Authority
CN
China
Prior art keywords
state
power system
network
adjusting
action
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010187181.9A
Other languages
Chinese (zh)
Other versions
CN111478331A (en)
Inventor
徐华廷
于之虹
侯金秀
戴红阳
贾育培
王兵
张璐路
魏亚威
解梅
史东宇
吕颖
鲁广明
田芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
Original Assignee
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC and China Electric Power Research Institute Co Ltd CEPRI
Priority to CN202010187181.9A
Publication of CN111478331A
Application granted
Publication of CN111478331B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00Circuit arrangements for ac mains or ac distribution networks
    • H02J3/04Circuit arrangements for ac mains or ac distribution networks for connecting networks of the same frequency but supplied from different sources
    • H02J3/06Controlling transfer of power between connected networks; Controlling sharing of load between connected networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Power Engineering (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The invention discloses a method and a system for adjusting the power flow convergence of a power system, belonging to the technical field of power flow convergence adjustment for large power grids. The method comprises the following steps: determining the input and output dimensions of a Q neural network model according to a state space and an action space; determining a mapping relation between the action space and the start-stop states of the generators of the power system, and adjusting the operating states of the generators according to the adjustment actions output by the model during training; training the Q neural network model according to the input and output dimensions, taking the target load level and the generator start-stop states as inputs and the adjustment actions as outputs; and adjusting the power flow of the power system to a converged state according to the adjustment actions. The method does not depend on expert experience, greatly saves manpower, helps to improve the automation level of power system operation mode calculation, and has great engineering application value and promotion prospects.

Description

Method and system for adjusting power flow convergence of power system
Technical Field
The present invention relates to the field of large power grid power flow convergence adjustment technologies, and more particularly, to a method and a system for adjusting power flow convergence of a power system.
Background
Operation mode calculation is the basis for ensuring the safe and stable operation of a power system, and power flow calculation is a core task within it: static stability analysis, transient stability analysis and the like under various operation modes are all based on power flow calculation results. In recent years, the construction scale of the power grid in China has expanded remarkably; in particular, an extra-high-voltage AC/DC hybrid large power grid pattern has gradually formed. The power flow convergence problem in large power grid power flow calculation has become increasingly prominent, and solving it is one of the most time-consuming links in operation mode calculation. At present, this work is still mainly completed manually: the large power grid is divided into several small sub-areas, boundary conditions are set between the sub-areas, a converged power flow is obtained for each sub-area separately, and the sub-areas are then spliced together step by step until a converged power flow for the complete large power grid is finally obtained.
Disclosure of Invention
In view of the above problems, the present invention provides a method for adjusting power flow convergence of a power system, comprising:
acquiring historical base-state load states of the power system, determining a state space and an action space according to the load levels of the historical base-state load states and the adjustable electrical elements in the system, and determining the input and output dimensions of a Q neural network model according to the state space and the action space;
determining a mapping relation between the action space and the generator start-stop states of the power system, and adjusting the operating states of the generators according to the adjustment actions output by the model during training;
training the Q neural network model according to the input and output dimensions, taking the target load level and the generator start-stop states of the power system as inputs and the adjustment actions as outputs;
and acquiring the target load level and the current generator start-stop states, determining an adjustment action according to the Q neural network model, and adjusting the power flow of the power system to a converged state according to the adjustment action.
Optionally, the base-state load state includes a load active power value and a load reactive power value.
Optionally, the state space includes the generator start-stop states of the power system and the load states of the power system.
Optionally, the action space includes the start-stop states of all adjustable generators in the power system.
Optionally, training the Q neural network model specifically comprises:
initializing a current Q network and randomly initializing its weight parameters;
initializing a target Q network and assigning the weight parameters of the current Q network to it; the current Q network and the target Q network have the same structure;
initializing a playback memory unit for storing the experience tuples generated in the training process, wherein each experience tuple comprises a system state transition, an adjustment action and an instant reward;
after the playback memory unit is full, the earliest stored experience tuples are eliminated;
inputting a target load level and the generator start-stop states of the power system to the current Q network;
the current Q network outputs an adjustment action according to the target load level and the generator start-stop states of the power system;
executing the adjustment action with a preset probability, wherein the adjustment action inverts the operating state of the corresponding adjustable generator;
determining the reward value of the adjustment action, acquiring the next operating state of the power system, and storing the state transition to the next operating state, the instant reward and the adjustment action as an experience tuple in the playback memory unit;
randomly extracting a plurality of experiences from the playback memory unit, calculating the prediction deviation of the cumulative reward for each experience, updating the weight parameters of the current Q network along the gradient descent direction according to the sum of the prediction deviations, and copying the weight parameters of the current Q network into the target Q network at a preset frequency;
and performing a preset number of training rounds on the target Q network until the average cumulative reward reaches a high level and remains stable, thereby generating the Q neural network model.
The invention also provides a system for adjusting power flow convergence of a power system, comprising:
the acquisition module is used for acquiring the historical base state load state of the power system, determining a state space and an action space according to the load level of the historical base state load state and adjustable electrical elements in the system, and determining the input dimension and the output dimension of the Q neural network model according to the state space and the action space;
the processing module is used for determining the mapping relation between the action space and the start-stop state of the power system generator and adjusting the running state of the power system generator according to the adjustment action output by the training model;
the training module trains the Q neural network model according to the input and output dimensions, taking the target load level and the generator start-stop states of the power system as inputs and the adjustment actions as outputs;
and the adjusting module is used for acquiring the target load level and the current start-stop state of the generator, determining an adjusting action according to the Q neural network model, and adjusting the power flow of the power system to a convergence state according to the adjusting action.
Optionally, the base-state load state includes a load active power value and a load reactive power value.
Optionally, the state space includes the generator start-stop states of the power system and the load states of the power system.
Optionally, the action space includes the start-stop states of all adjustable generators in the power system.
Optionally, training the Q neural network model specifically comprises:
initializing a current Q network and randomly initializing its weight parameters;
initializing a target Q network and assigning the weight parameters of the current Q network to it; the current Q network and the target Q network have the same structure;
initializing a playback memory unit for storing the experience tuples generated in the training process, wherein each experience tuple comprises a system state transition, an adjustment action and an instant reward;
after the playback memory unit is full, the earliest stored experience tuples are eliminated;
inputting a target load level and the generator start-stop states of the power system to the current Q network;
the current Q network outputs an adjustment action according to the target load level and the generator start-stop states of the power system;
executing the adjustment action with a preset probability, wherein the adjustment action inverts the operating state of the corresponding adjustable generator;
determining the reward value of the adjustment action, acquiring the next operating state of the power system, and storing the state transition to the next operating state, the instant reward and the adjustment action as an experience tuple in the playback memory unit;
randomly extracting a plurality of experiences from the playback memory unit, calculating the prediction deviation of the cumulative reward for each experience, updating the weight parameters of the current Q network along the gradient descent direction according to the sum of the prediction deviations, and copying the weight parameters of the current Q network into the target Q network at a preset frequency;
and performing a preset number of training rounds on the target Q network until the average cumulative reward reaches a high level and remains stable, thereby generating the Q neural network model.
The method does not depend on expert experience, greatly saves manpower, helps to improve the automation level of power system operation mode calculation, and has great engineering application value and promotion prospects.
Drawings
FIG. 1 is a flow chart of a method for adjusting power flow convergence in an electrical power system according to the present invention;
FIG. 2 is a flow chart of an embodiment of a method for adjusting power flow convergence in a power system according to the invention;
FIG. 3 is a block diagram of a system for adjusting power flow convergence in a power system according to the present invention.
Detailed Description
The exemplary embodiments of the present invention will now be described with reference to the accompanying drawings; however, the present invention may be embodied in many different forms and is not limited to the embodiments described herein, which are provided so that the disclosure of the present invention will be thorough and complete and will fully convey the scope of the invention to those skilled in the art. The terms used in the exemplary embodiments shown in the drawings are not intended to limit the present invention. In the drawings, the same units/elements are denoted by the same reference numerals.
Unless otherwise defined, terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. In addition, it will be understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense.
The invention provides a method for adjusting power flow convergence of a power system, as shown in FIG. 1, comprising:
acquiring historical base-state load states of the power system, determining a state space and an action space according to the load levels of the historical base-state load states and the adjustable electrical elements in the system, and determining the input and output dimensions of a Q neural network model according to the state space and the action space;
determining a mapping relation between the action space and the generator start-stop states of the power system, and adjusting the operating states of the generators according to the adjustment actions output by the model during training;
training the Q neural network model according to the input and output dimensions, taking the target load level and the generator start-stop states of the power system as inputs and the adjustment actions as outputs;
and acquiring the target load level and the current generator start-stop states, determining an adjustment action according to the Q neural network model, and adjusting the power flow of the power system to a converged state according to the adjustment action.
The base-state load state comprises a load active power value and a load reactive power value.
The state space comprises the generator start-stop states of the power system and the load states of the power system.
The action space comprises the start-stop states of all adjustable generators in the power system.
Training the Q neural network model specifically comprises:
initializing a current Q network and randomly initializing its weight parameters;
initializing a target Q network and assigning the weight parameters of the current Q network to it; the current Q network and the target Q network have the same structure;
initializing a playback memory unit for storing the experience tuples generated in the training process, wherein each experience tuple comprises a system state transition, an adjustment action and an instant reward;
after the playback memory unit is full, the earliest stored experience tuples are eliminated;
inputting a target load level and the generator start-stop states of the power system to the current Q network;
the current Q network outputs an adjustment action according to the target load level and the generator start-stop states of the power system;
executing the adjustment action with a preset probability, wherein the adjustment action inverts the operating state of the corresponding adjustable generator;
determining the reward value of the adjustment action, acquiring the next operating state of the power system, and storing the state transition to the next operating state, the instant reward and the adjustment action as an experience tuple in the playback memory unit;
randomly extracting a plurality of experiences from the playback memory unit, calculating the prediction deviation of the cumulative reward for each experience, updating the weight parameters of the current Q network along the gradient descent direction according to the sum of the prediction deviations, and copying the weight parameters of the current Q network into the target Q network at a preset frequency;
and performing a preset number of training rounds on the target Q network until the average cumulative reward reaches a high level and remains stable, thereby generating the Q neural network model.
The invention is further illustrated by the following examples:
the method of the present invention, as shown in fig. 2, comprises:
s1, inputting a basic state load state, wherein the specific content comprises a load active power value and a load reactive power value;
s2, determining a state space and an action space, wherein the content of the state space comprises the starting and stopping state and the current load state of the generators in the power system, and the action space corresponds to the number G of the generators which can be started and stopped in the power system;
s3, randomly generating a load state and providing a random training target for each training round;
s4, determining a mapping relation between an action space and an actual starting and stopping state of the generator, and adjusting the running state of the generator of the power system according to a specific action given by a proposed algorithm;
and S5, constructing a value-function-based deep reinforcement learning structure, taking the generator start-stop states and the load state of the power system as inputs and the generator adjustment strategy as output, training the current Q network and the target Q network to obtain a trained model, and adjusting the target power system according to the adjustment actions of the trained model.
In step S2, the state space and the action space are determined: the state space comprises the start-stop states of the generators in the power system and the current load state, and the action space corresponds to the number G of generators that can be started and stopped in the power system. The specific steps include:
S21, representing the start-stop state of each generator in the power system by a 0-1 variable, where 0 indicates that the generator is shut down and 1 indicates that the generator is running;
S22, numbering the N load states that may occur in the power system and representing them with binary codes;
S23, combining the generator states represented by the 0-1 variables with the binary-coded load state to jointly form the input state of step S5;
and S24, taking the start-stop actions of the G adjustable generators of the power system as the action space of step S5.
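For illustration, the following Python sketch shows one way to implement the state encoding of steps S21 to S24; all identifiers and dimension values are illustrative assumptions, not taken from the patent.

    import numpy as np

    def encode_state(gen_on, load_id, num_load_bits):
        # S21: gen_on is a list of 0-1 generator start-stop flags.
        # S22: load_id is the number of the load state, binary-coded below.
        load_bits = [(load_id >> k) & 1 for k in range(num_load_bits)]
        # S23: the joint input state is the concatenation of both parts.
        return np.array(list(gen_on) + load_bits, dtype=np.float32)

    # Example: 5 generators (G = 5), load state number 6 out of N = 16 (4 bits).
    state = encode_state([1, 0, 1, 1, 0], load_id=6, num_load_bits=4)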
In step S4, a mapping relation between the action space and the actual generator start-stop states is determined, and the operating states of the generators of the power system are adjusted according to the specific action given by the proposed algorithm. The specific steps include:
S41, numbering all adjustable generators of the power system, each number corresponding to one output action of step S5;
and S42, executing an action inverts the start-stop state of the corresponding generator; that is, when action i is output in step S5, the operating state of the generator numbered i is inverted.
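A minimal sketch of the S41-S42 mapping under the same illustrative naming: output action i simply inverts the start-stop state of the generator numbered i.

    def apply_action(gen_on, i):
        # S42: invert the start-stop state of generator i (0 -> 1, 1 -> 0).
        gen_on = list(gen_on)
        gen_on[i] = 1 - gen_on[i]
        return gen_on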
In step S5, a value-function-based deep reinforcement learning structure is constructed, taking the generator start-stop states and the load state of the power system as inputs and the generator adjustment strategy as output, and the current Q network and the target Q network are trained. The specific steps include:
s51, the deep reinforcement learning algorithm structure based on the value function comprises a current Q network and a target Q network, the structures of the two networks are completely the same, the two networks are both n layers of fully-connected neural networks, the input of the two networks is the system state quantity formed by the generator and the load in the step S2, the output of the two networks is a G-dimensional column vector, and the column vector element G is i Corresponding to all adjustable generators i one by one, and the action a of the output t Adjusting the starting and stopping state of the generator corresponding to the maximum value;
s52, initializing a playback memory unit D with the capacity of M, storing the training samples, and continuously removing the oldest stored old samples after the samples are fully stored;
s53, initializing the current Q network, randomly initializing the weight parameter theta of the current Q network, initializing a target Q network, wherein the structure and the initialization weight theta' of the target Q network are the same as those of the current Q network
S54, the current Q network outputs an action a_t based on the current state s_t and the load state information randomly generated in step S3;
S55, executing action a_t with probability 1−ε, i.e., inverting the operating state of the corresponding generator (if the generator is running it is shut down; if it is stopped it is put into operation), and with a small probability ε randomly inverting the operating state of any one generator;
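Steps S54 and S55 together form an ε-greedy policy; a sketch follows, with the ε value chosen only for illustration.

    import random
    import torch

    def select_action(q_net, state, num_gens, epsilon=0.05):
        # S55: with small probability ε, randomly invert one generator.
        if random.random() < epsilon:
            return random.randrange(num_gens)
        # S54: otherwise act greedily on the current Q network's G outputs.
        with torch.no_grad():
            q_values = q_net(torch.as_tensor(state))
        return int(q_values.argmax())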
s56, calculating the reward value r under the corresponding action according to the formula (1) t The system state is from s t Is transferred to s t+1 A 1 is to t ,a t ,r t ,s t+1 Combining into one experience(s) t ,a t ,r t ,s t+1 ) And stores it in a playback memory unit D
Figure BDA0002414606790000081
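Since formula (1) is published only as an image, the shape below is an assumption inferred from the success criterion of step S59 (a converged power flow without balancing-machine limit violations), not the patent's actual reward function.

    def reward(converged, slack_ok):
        # ASSUMED reward shape, not formula (1) itself.
        if converged and slack_ok:
            return 1.0    # assumed reward for a successful adjustment
        return -0.1       # assumed per-step penalty while unconverged/over limit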
S57, randomly extracting m experience tuples from the playback memory unit D, and calculating y_j according to formula (2):

    y_j = r_j,                                      if s_{j+1} is a terminal state
    y_j = r_j + γ · max_{a′} Q′(s_{j+1}, a′; θ′),   otherwise        (2)
S58, calculating the total loss over the m experiences:

    L = (1/m) · Σ_{j=1}^{m} [ y_j − Q(s_j, a_j; θ) ]²
updating the Q network parameters θ along the gradient descent direction using the Adam algorithm, and copying the Q network parameters completely to the target Q network every C training steps;
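Steps S57 and S58 are the standard deep Q-learning update. The sketch below reuses q_net, target_net and D from the sketches above; γ, C, m and the learning rate are illustrative, and the done flag marking episode termination is an added assumption.

    import numpy as np
    import torch

    optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
    gamma, C = 0.99, 100   # discount factor and target-copy interval (illustrative)

    def update(step, m=32):
        s, a, r, s2, done = zip(*D.sample(m))
        s = torch.as_tensor(np.stack(s))
        s2 = torch.as_tensor(np.stack(s2))
        a = torch.as_tensor(a)
        r = torch.as_tensor(r, dtype=torch.float32)
        done = torch.as_tensor(done, dtype=torch.float32)
        with torch.no_grad():   # formula (2): targets come from the target network
            y = r + gamma * target_net(s2).max(dim=1).values * (1.0 - done)
        q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
        loss = ((y - q) ** 2).mean()   # L = (1/m) * Σ_j (y_j - Q(s_j, a_j; θ))²
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if step % C == 0:              # S58: copy θ into θ' every C training steps
            target_net.load_state_dict(q_net.state_dict())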
s59, repeating the steps S54-S58 until a convergent power flow state without exceeding the limit of the balancing machine is obtained or the maximum iteration step number T of the current training round is reached by adjusting the running state of the generator based on the given load state;
and S510, re-executing step S3 to randomly generate a new training target, and repeating step S59 until the Q network can adjust the power flow to a converged state without balancing-machine limit violations for all load states.
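An illustrative outer loop for S59 and S510 is sketched below, reusing the pieces above; run_power_flow is a hypothetical stand-in for the power flow solver that acts as the environment, stubbed here so the sketch is self-contained.

    import random

    N_LOADS, NUM_LOAD_BITS, T_MAX, NUM_EPISODES = 16, 4, 50, 1000  # illustrative

    def run_power_flow(gen_on, load_id):
        # Hypothetical solver hook: returns (converged, slack_within_limits).
        return True, True  # stub; a real version would run e.g. a Newton-Raphson solve

    for episode in range(NUM_EPISODES):
        load_id = random.randrange(N_LOADS)   # S3: random training target
        gen_on = [1, 0, 1, 1, 0]              # illustrative initial start-stop states
        for t in range(T_MAX):                # S59: at most T steps per round
            s = encode_state(gen_on, load_id, NUM_LOAD_BITS)
            a = select_action(q_net, s, num_gens=len(gen_on))
            gen_on = apply_action(gen_on, a)
            converged, slack_ok = run_power_flow(gen_on, load_id)
            if converged and slack_ok:
                break                         # success criterion of S59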
The present invention also proposes a system 200 for adjusting power flow convergence in a power system, as shown in FIG. 3, comprising:
the acquisition module 201 is used for acquiring the historical base state load state of the power system, determining a state space and an action space according to the load level of the historical base state load state and adjustable electrical elements in the system, and determining the input dimension and the output dimension of the Q neural network model according to the state space and the action space;
the processing module 202 determines a mapping relation between an action space and a start-stop state of the power system generator, and adjusts the running state of the power system generator according to an adjusting action output by the training model;
the training module 203 trains the Q neural network model according to the input and output dimensions, taking the target load level and the generator start-stop states of the power system as inputs and the adjustment actions as outputs;
the adjusting module 204 obtains the target load level and the current start-stop state of the generator, determines an adjusting action according to the Q neural network model, and adjusts the power flow of the power system to a convergence state according to the adjusting action.
The base-state load state comprises a load active power value and a load reactive power value.
The state space comprises the generator start-stop states of the power system and the load states of the power system.
The action space comprises the start-stop states of all adjustable generators in the power system.
Training the Q neural network model specifically comprises:
initializing a current Q network and randomly initializing its weight parameters;
initializing a target Q network and assigning the weight parameters of the current Q network to it; the current Q network and the target Q network have the same structure;
initializing a playback memory unit for storing the experience tuples generated in the training process, wherein each experience tuple comprises a system state transition, an adjustment action and an instant reward;
after the playback memory unit is full, the earliest stored experience tuples are eliminated;
inputting a target load level and the generator start-stop states of the power system to the current Q network;
the current Q network outputs an adjustment action according to the target load level and the generator start-stop states of the power system;
executing the adjustment action with a preset probability, wherein the adjustment action inverts the operating state of the corresponding adjustable generator;
determining the reward value of the adjustment action, acquiring the next operating state of the power system, and storing the state transition to the next operating state, the instant reward and the adjustment action as an experience tuple in the playback memory unit;
randomly extracting a plurality of experiences from the playback memory unit, calculating the prediction deviation of the cumulative reward for each experience, updating the weight parameters of the current Q network along the gradient descent direction according to the sum of the prediction deviations, and copying the weight parameters of the current Q network into the target Q network at a preset frequency;
and performing a preset number of training rounds on the target Q network until the average cumulative reward reaches a high level and remains stable, thereby generating the Q neural network model.
The method does not depend on expert experience, greatly saves manpower, helps to improve the automation level of power system operation mode calculation, and has great engineering application value and promotion prospects.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The scheme in the embodiment of the application can be implemented in various computer languages, such as the object-oriented programming language Java and the interpreted scripting language JavaScript.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (6)

1. A method for adjusting power system power flow convergence, the method comprising:
acquiring historical base-state load states of a power system, determining a state space and an action space according to the load levels of the historical base-state load states and the adjustable electrical elements in the system, and determining the input and output dimensions of a Q neural network model according to the state space and the action space, wherein
the action space comprises the start-stop states of all adjustable generators in the power system, the start-stop state of each generator being represented by a 0-1 variable, where 0 indicates that the generator is shut down and 1 indicates that the generator is running;
determining a mapping relation between the action space and the generator start-stop states of the power system, and adjusting the operating states of the generators according to the adjustment actions output by the model during training;
training the Q neural network model according to the input and output dimensions, taking the target load level and the generator start-stop states of the power system as inputs and the adjustment actions as outputs;
obtaining the target load level and the current generator start-stop states, determining an adjustment action according to the Q neural network model, and adjusting the power flow of the power system to a converged state according to the adjustment action, wherein
The training of the Q neural network model specifically comprises the following steps:
initializing a current Q network and randomly initializing its weight parameters;
initializing a target Q network and assigning the weight parameters of the current Q network to it; the current Q network and the target Q network have the same structure;
initializing a playback memory unit for storing the experience tuples generated in the training process, wherein each experience tuple comprises a system state transition, an adjustment action and an instant reward;
after the playback memory unit is full, the earliest stored experience tuples are eliminated;
inputting a target load level and the generator start-stop states of the power system to the current Q network;
the current Q network outputs an adjustment action according to the target load level and the generator start-stop states of the power system;
executing the adjustment action with a preset probability, wherein the adjustment action inverts the operating state of the corresponding adjustable generator;
determining the reward value of the adjustment action, acquiring the next operating state of the power system, and storing the state transition to the next operating state, the instant reward and the adjustment action as an experience tuple in the playback memory unit;
randomly extracting a plurality of experiences from the playback memory unit, calculating the prediction deviation of the cumulative reward for each experience, updating the weight parameters of the current Q network along the gradient descent direction according to the sum of the prediction deviations, and copying the weight parameters of the current Q network into the target Q network at a preset frequency;
and performing a preset number of training rounds on the target Q network until the average cumulative reward reaches a high level and remains stable, thereby generating the Q neural network model, wherein
The reward function for the reward value is as follows:
[formula (1): reward function, published as an image in the original document]
2. The method of claim 1, wherein the base-state load state comprises a load active power value and a load reactive power value.
3. The method of claim 1, wherein the state space comprises the generator start-stop states of the power system and the load states of the power system.
4. A system for adjusting power system power flow convergence, the system comprising:
the acquisition module acquires the historical base state load state of the power system, determines a state space and an action space according to the load level of the historical base state load state and adjustable electrical elements in the system, and determines the input dimension and the output dimension of the Q neural network model according to the state space and the action space, wherein
the action space comprises the start-stop states of all adjustable generators in the power system, the start-stop state of each generator being represented by a 0-1 variable, where 0 indicates that the generator is shut down and 1 indicates that the generator is running;
the processing module is used for determining the mapping relation between the action space and the start-stop state of the power system generator and adjusting the running state of the power system generator according to the adjustment action output by the training model;
the training module trains the Q neural network model according to the input and output dimensions, taking the target load level and the generator start-stop states of the power system as inputs and the adjustment actions as outputs;
and the adjusting module acquires the target load level and the current generator start-stop states, determines an adjustment action according to the Q neural network model, and adjusts the power flow of the power system to a converged state according to the adjustment action, wherein
The training of the Q neural network model specifically comprises the following steps:
initializing a current Q network and randomly initializing its weight parameters;
initializing a target Q network and assigning the weight parameters of the current Q network to it; the current Q network and the target Q network have the same structure;
initializing a playback memory unit for storing the experience tuples generated in the training process, wherein each experience tuple comprises a system state transition, an adjustment action and an instant reward;
after the playback memory unit is full, the earliest stored experience tuples are eliminated;
inputting a target load level and the generator start-stop states of the power system to the current Q network;
the current Q network outputs an adjustment action according to the target load level and the generator start-stop states of the power system;
executing the adjustment action with a preset probability, wherein the adjustment action inverts the operating state of the corresponding adjustable generator;
determining the reward value of the adjustment action, acquiring the next operating state of the power system, and storing the state transition to the next operating state, the instant reward and the adjustment action as an experience tuple in the playback memory unit;
randomly extracting a plurality of experiences from the playback memory unit, calculating the prediction deviation of the cumulative reward for each experience, updating the weight parameters of the current Q network along the gradient descent direction according to the sum of the prediction deviations, and copying the weight parameters of the current Q network into the target Q network at a preset frequency;
and performing a preset number of training rounds on the target Q network until the average cumulative reward reaches a high level and remains stable, thereby generating the Q neural network model, wherein
The reward function for the reward value is as follows:
[formula (1): reward function, published as an image in the original document]
5. The system of claim 4, wherein the base-state load state comprises a load active power value and a load reactive power value.
6. The system of claim 4, wherein the state space comprises the generator start-stop states of the power system and the load states of the power system.
CN202010187181.9A 2020-03-17 2020-03-17 Method and system for adjusting power flow convergence of power system Active CN111478331B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010187181.9A CN111478331B (en) 2020-03-17 2020-03-17 Method and system for adjusting power flow convergence of power system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010187181.9A CN111478331B (en) 2020-03-17 2020-03-17 Method and system for adjusting power flow convergence of power system

Publications (2)

Publication Number Publication Date
CN111478331A CN111478331A (en) 2020-07-31
CN111478331B (en) 2023-01-06

Family

ID=71747519

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010187181.9A Active CN111478331B (en) 2020-03-17 2020-03-17 Method and system for adjusting power flow convergence of power system

Country Status (1)

Country Link
CN (1) CN111478331B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114362151B (en) * 2021-12-23 2023-12-12 浙江大学 Power flow convergence adjustment method based on deep reinforcement learning and cascade graph neural network

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110443447A (en) * 2019-07-01 2019-11-12 中国电力科学研究院有限公司 A kind of method and system learning adjustment electric power system tide based on deeply
CN110535146A (en) * 2019-08-27 2019-12-03 哈尔滨工业大学 The Method for Reactive Power Optimization in Power of Policy-Gradient Reinforcement Learning is determined based on depth

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08317559A (en) * 1995-05-12 1996-11-29 Toshiba Corp Training simulator for power system operation
US8756047B2 (en) * 2010-09-27 2014-06-17 Sureshchandra B Patel Method of artificial nueral network loadflow computation for electrical power system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110443447A (en) * 2019-07-01 2019-11-12 中国电力科学研究院有限公司 A kind of method and system learning adjustment electric power system tide based on deeply
CN110535146A (en) * 2019-08-27 2019-12-03 哈尔滨工业大学 The Method for Reactive Power Optimization in Power of Policy-Gradient Reinforcement Learning is determined based on depth

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Deep reinforcement learning algorithm for voltage regulation of distribution networks with energy storage systems; Shi Jingjian et al.; Electric Power Construction; 2020-03-01 (No. 03); 73-80 *
Automatic adjustment method for power flow calculation convergence of large power grids based on knowledge, experience and deep reinforcement learning; Wang Tianjing et al.; Proceedings of the CSEE; 2020-03-04; Vol. 40 (No. 8); 2396-2405, S2 *
Wang Tianjing et al. Automatic adjustment method for power flow calculation convergence of large power grids based on knowledge, experience and deep reinforcement learning. Proceedings of the CSEE. 2020, Vol. 40 (No. 8), 2396-2405, S2. *

Also Published As

Publication number Publication date
CN111478331A (en) 2020-07-31

Similar Documents

Publication Publication Date Title
Ghadimi A new hybrid algorithm based on optimal fuzzy controller in multimachine power system
US20210133536A1 (en) Load prediction method and apparatus based on neural network
CN108365608B (en) Uncertain optimization scheduling method and system for regional energy Internet
CN115085202A (en) Power grid multi-region intelligent power collaborative optimization method, device, equipment and medium
CN115293052A (en) Power system active power flow online optimization control method, storage medium and device
CN115940294B (en) Multi-stage power grid real-time scheduling strategy adjustment method, system, equipment and storage medium
CN108323797A (en) Cigarette Weight Control System based on GPR models starts position predicting method and system
CN111478331B (en) Method and system for adjusting power flow convergence of power system
CN116169698A (en) Distributed energy storage optimal configuration method and system for stable new energy consumption
CN104915788B (en) A method of considering the Electrical Power System Dynamic economic load dispatching of windy field correlation
CN113315131A (en) Intelligent power grid operation mode adjusting method and system
Lopez-Garcia et al. Power flow analysis via typed graph neural networks
CN115765050A (en) Power system safety correction control method, system, equipment and storage medium
CN110362378A (en) A kind of method for scheduling task and equipment
CN115995847B (en) Micro-grid black start method, device, system and storage medium
CN106054665B (en) A kind of large-scale photovoltaic inverter system divides group's equivalent modeling method
CN111695623A (en) Large-scale battery energy storage system group modeling method, system and equipment based on fuzzy clustering and readable storage medium
CN114880932B (en) Power grid operating environment simulation method, system, equipment and medium
Jasmin et al. A Reinforcement Learning algorithm to Economic Dispatch considering transmission losses
CN113991752A (en) Power grid quasi-real-time intelligent control method and system
CN113221390A (en) Training method and device for scheduling model
Yasin et al. Optimal least squares support vector machines parameter selection in predicting the output of distributed generation
Zhu et al. Mitigating multi-stage cascading failure by reinforcement learning
CN111598294A (en) Active power distribution network reconstruction algorithm and device based on improved teaching optimization
CN115169754B (en) Energy scheduling method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant