CN115360772A - Power system active safety correction control method, system, equipment and storage medium - Google Patents


Info

Publication number
CN115360772A
CN115360772A CN202210289577.3A CN202210289577A CN115360772A CN 115360772 A CN115360772 A CN 115360772A CN 202210289577 A CN202210289577 A CN 202210289577A CN 115360772 A CN115360772 A CN 115360772A
Authority
CN
China
Prior art keywords
power system
correction control
active safety
safety correction
generator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210289577.3A
Other languages
Chinese (zh)
Other versions
CN115360772B (en)
Inventor
王一迪
於益军
李立新
刘金波
马晓忱
李理
吕闫
唐俊刺
李铁
李桐
徐瑕龄
韩巍
罗雅迪
孙博
刘蒙
张�浩
曹坤
王淼
狄方春
张�杰
商敬安
石上丘
孙略
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
State Grid Tianjin Electric Power Co Ltd
State Grid Jibei Electric Power Co Ltd
State Grid Liaoning Electric Power Co Ltd
Electric Power Research Institute of State Grid Liaoning Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
State Grid Tianjin Electric Power Co Ltd
State Grid Jibei Electric Power Co Ltd
State Grid Liaoning Electric Power Co Ltd
Electric Power Research Institute of State Grid Liaoning Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, China Electric Power Research Institute Co Ltd CEPRI, State Grid Tianjin Electric Power Co Ltd, State Grid Jibei Electric Power Co Ltd, State Grid Liaoning Electric Power Co Ltd, Electric Power Research Institute of State Grid Liaoning Electric Power Co Ltd
Priority to CN202210289577.3A
Publication of CN115360772A
Application granted
Publication of CN115360772B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H02: GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J: CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00: Circuit arrangements for ac mains or ac distribution networks
    • H02J3/38: Arrangements for parallely feeding a single network by two or more generators, converters or transformers
    • H02J3/46: Controlling of the sharing of output between the generators, converters, or transformers
    • H02J3/48: Controlling the sharing of the in-phase component
    • H: ELECTRICITY
    • H02: GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J: CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00: Circuit arrangements for ac mains or ac distribution networks
    • H02J3/04: Circuit arrangements for ac mains or ac distribution networks for connecting networks of the same frequency but supplied from different sources
    • H02J3/06: Controlling transfer of power between connected networks; Controlling sharing of load between connected networks
    • H: ELECTRICITY
    • H02: GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J: CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2203/00: Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
    • H02J2203/10: Power transmission or distribution systems management focussing at grid-level, e.g. load flow analysis, node profile computation, meshed network optimisation, active network management or spinning reserve management
    • H: ELECTRICITY
    • H02: GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J: CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2203/00: Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
    • H02J2203/20: Simulating, e.g. planning, reliability check, modelling or computer assisted design [CAD]

Landscapes

  • Engineering & Computer Science (AREA)
  • Power Engineering (AREA)
  • Supply And Distribution Of Alternating Current (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method, a system, equipment and a storage medium for active safety correction control of a power system. The method comprises: acquiring real-time operation data of the power system; inputting the real-time operation data into a trained agent to obtain an active safety correction control scheme for the power system; and performing active safety correction control on the power system according to that scheme.

Description

Power system active safety correction control method, system, equipment and storage medium
Technical Field
The invention belongs to the technical field of smart grids and relates to an active safety correction control method, system, equipment and storage medium for a power system.
Background
The task of a power system is to supply users, without interruption, with sufficient electrical energy whose quality complies with regulations. As grid interconnection deepens, new technologies are applied, and large-scale new energy sources that are random, intermittent and time-varying are connected, fluctuations of system power and changes in power flow increase greatly, and the power system becomes increasingly complex. The form and characteristics of the grid are changing profoundly, and its security and stability levels constrain each other. At the same time, rising requirements for economic operation and the constraints introduced by electricity market reform push the operating point of the power system toward the edge of its security region. These factors create new security problems for the power system. Safety correction of the power system provides the theoretical basis and technical support for ensuring the greatest possible economy and security of the grid under severe conditions. Operators need a good security analysis tool that provides safe operating strategies, so as to raise the level of safe operation of the system. When the structure of the transmission network is unchanged, there are two traditional methods for active safety correction of a power system:
(1) Sensitivity analysis method. A generator with high sensitivity to the power of one target transmission section, or of a group of them, is selected and used to adjust the transmission power of the target section.
(2) Optimization programming method. The active safety correction problem of the power system is converted into an optimization programming problem that takes the minimum number of adjusted elements or the minimum adjustment amount as the optimization objective and is solved by a mathematical programming method subject to the various security constraints of the system.
The sensitivity method has poor calculation accuracy and sometimes exhibits a sawtooth phenomenon, in which units are adjusted back and forth repeatedly. The mathematical programming method adjusts too many devices, is slow to compute, and may fail to converge. Because active safety correction control is used in real time, the traditional methods can hardly meet the requirements on calculation speed and calculation accuracy at the same time, and in the active safety correction control of an actual power system accuracy is usually sacrificed for speed.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a method, a system, equipment and a storage medium for active safety correction control of a power system that realize active safety correction of the power system with high speed and high accuracy.
To achieve this purpose, the invention adopts the following technical solutions:
In a first aspect, the invention provides an active safety correction control method for a power system, comprising:
acquiring real-time operation data of the power system;
inputting the real-time operation data of the power system into a trained agent to obtain an active safety correction control scheme for the power system, and then performing active safety correction control on the power system according to that scheme.
The active safety correction control method of the power system is further improved as follows:
Before the real-time operation data of the power system is input into the trained agent, the method further comprises:
establishing a categorized experience replay pool, wherein the categorized experience replay pool comprises a successful-experience replay area and a failed-experience replay area;
extracting samples from the categorized experience replay pool;
training the agent with the extracted samples to obtain the trained agent.
The state space s used when training the agent with the extracted samples is:
[Equation image BDA0003561147530000031]
where [symbol image BDA0003561147530000032] is the active output of generator j in the ith sample, [symbol image BDA0003561147530000033] is the line power flow of the ith sample, and [symbol image BDA0003561147530000034] is the load power of the ith sample; $j = 1, \dots, n_{gen}$, $k = 1, \dots, n_{line}$, $m = 1, \dots, n_{load}$, where $n_{gen}$ is the number of generators, $n_{line}$ is the number of branches, $n_{load}$ is the number of loads, and $M$ is the number of samples taken.
The action space used when training the agent with the extracted samples is as follows:
The action space consists of the controllable variables in the power flow calculation. Its continuous variables are the generator output adjustments, so the action of the agent at time t is $a = [\Delta P_{G1}, \dots, \Delta P_{Gj}]$, where $\Delta P_{Gj}$ is the adjustment of the output of generator j.
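As an illustration of the action space, the following minimal sketch applies an action vector of generator output adjustments to the current generator outputs. The function name `apply_action` and the clipping of the result to each generator's output limits are assumptions made for this example; the text itself only defines the action as the vector of adjustments.

```python
from typing import List, Sequence


def apply_action(p_gen: Sequence[float],
                 delta_p: Sequence[float],
                 p_min: Sequence[float],
                 p_max: Sequence[float]) -> List[float]:
    """Return new generator outputs after adding the adjustments delta_p.

    Clipping to [p_min, p_max] is an assumption of this sketch.
    """
    if not (len(p_gen) == len(delta_p) == len(p_min) == len(p_max)):
        raise ValueError("all inputs must have one entry per generator")
    return [min(max(p + d, lo), hi)
            for p, d, lo, hi in zip(p_gen, delta_p, p_min, p_max)]
```

In a training loop, the outputs returned here would feed the next power flow calculation.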
The reward function used when training the agent with the extracted samples is:
$R = \nu_1 r_1 + \nu_2 r_2 + \nu_3 r_3$
where $r_1$, $r_2$ and $r_3$ are the reward value of the line out-of-limit condition, the reward value of the generator output out-of-limit constraint and the reward value of the generator cost, respectively, and $\nu_1$, $\nu_2$ and $\nu_3$ are weight coefficients.
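The weighted sum above can be sketched in a few lines. The function name `total_reward` and the numeric values in the usage note are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def total_reward(r: Sequence[float], v: Sequence[float]) -> float:
    """Weighted sum R = v1*r1 + v2*r2 + v3*r3 of the component rewards."""
    if len(r) != len(v):
        raise ValueError("one weight is needed per reward component")
    return sum(vi * ri for vi, ri in zip(v, r))
```

For example, `total_reward((-2.0, -1.0, -0.5), (1.0, 0.5, 0.2))` combines a line out-of-limit reward, a generator limit reward and a cost reward into a single scalar.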
The reward value $r_1$ of the line out-of-limit condition is:
[Equation image BDA0003561147530000041]
where $n_{line}$ is the number of branches of the power grid, $I_i$ and $T_i$ are the current and the thermal limit of branch i, respectively, and $\varepsilon$ is a constant.
The reward value $r_2$ of the generator output out-of-limit constraint is:
[Equation image BDA0003561147530000042]
where $n_{gen}$ is the number of generators, $P_{Gi}$ is the output of generator i, $P_{Gi\max}$ is the upper limit of the active output of the generator, and $P_{Gi\min}$ is the lower limit of the active output of the generator.
The reward value $r_3$ of the generator cost is:
[Equation image BDA0003561147530000043]
where n is the number of units, $P_{Gi}$ is the generator output, a, b and c are cost coefficients, and d is the start-stop coefficient of the unit.
In a second aspect, the invention provides an active safety correction control system for a power system, comprising:
an acquisition module for acquiring real-time operation data of the power system;
a control module for inputting the real-time operation data of the power system into the trained agent to obtain an active safety correction control scheme for the power system, and then performing active safety correction control on the power system according to that scheme.
In a third aspect, the invention provides a computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the active safety correction control method of the power system when executing the computer program.
In a fourth aspect, the invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the active safety correction control method of the power system.
The invention has the following beneficial effects:
In operation, the active safety correction control method, system, equipment and storage medium of the power system realize active safety correction control through deep reinforcement learning, without depending on a specific physical model, and thereby avoid the problems of calculation accuracy and calculation time caused by repeated adjustment.
Further, the categorized experience replay pool comprises a successful-experience replay area and a failed-experience replay area, which improves the efficiency of agent training.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a flow chart of agent training;
FIG. 3 is a system configuration diagram of the present invention.
Detailed Description
To help those skilled in the art better understand the technical solutions of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present invention, not all of them, and do not limit the scope of the disclosure. In the following description, well-known structures and techniques are omitted so as not to obscure the concepts of the disclosure unnecessarily. All other embodiments that a person skilled in the art can derive from the embodiments given here without creative effort fall within the protection scope of the present invention.
The drawings show schematic block diagrams of embodiments disclosed according to the invention. The figures are not drawn to scale; certain details are exaggerated and some may be omitted for clarity of presentation. The shapes of the regions and layers shown in the drawings, and their relative sizes and positions, are examples only. In practice there may be deviations due to manufacturing tolerances or technical limitations, and a person skilled in the art may design regions or layers with different shapes, sizes and relative positions according to actual needs.
Safety correction: in static security analysis, potential out-of-limit conditions are eliminated by rescheduling the controllable variables of the system. It is generally divided into two sub-problems, active safety correction (active power) and reactive safety correction (reactive power).
Active safety correction control: when the active power of a branch of the system is out of limit, the active power of the generators is adjusted to redistribute the power flow and eliminate the branch power out-of-limit condition.
Power flow out-of-limit: the phenomenon that a branch current exceeds its rated value.
Generator output: the active power that a generator set injects into the power grid.
Example one
Referring to FIG. 1, the active safety correction control method for the power system according to the present invention comprises the following steps:
1) Collect multi-scenario data of the power grid, including power flow out-of-limit information, the topology of the power system, line parameters, generator parameters, load parameters and grid power flow information;
2) Perform power flow calculation on the multi-scenario data of the power grid to obtain real-time operation data of the power grid;
3) Judge whether a power flow out-of-limit condition exists in the power grid. If it does, use the trained agent to obtain an active safety correction control scheme for the power system from the operation data of the power grid, and perform active safety correction control on the power system according to that scheme; otherwise, no active safety correction of the power grid is needed.
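The decision in step 3) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `correction_step`, the out-of-limit test (absolute branch flow greater than the branch limit) and the representation of the trained agent as a plain callable from state to action are all assumptions.

```python
from typing import Callable, List, Optional, Sequence


def correction_step(flows: Sequence[float],
                    limits: Sequence[float],
                    state: Sequence[float],
                    agent: Callable[[Sequence[float]], Sequence[float]]
                    ) -> Optional[List[float]]:
    """Return generator adjustments if any branch is out of limit, else None."""
    # Assumed out-of-limit test: |flow| exceeds the branch limit.
    out_of_limit = any(abs(f) > lim for f, lim in zip(flows, limits))
    if not out_of_limit:
        return None  # no active safety correction is needed
    return list(agent(state))  # the trained agent proposes the adjustments
```

Returning `None` when no branch is out of limit mirrors the "otherwise" branch of step 3).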
The specific process of training the agent is as follows:
31) Build the categorized experience replay pool;
The state, action, reward value and next state of each step, $(s_t, a_t, r, s_{t+1})$, are combined into one experience and stored in the categorized experience replay pool. The pool is used to solve the problems of data correlation and non-stationary distribution that arise when the agent interacts with the environment;
32) Train the TD3 agent offline;
Several groups of experiences, i.e. samples, are extracted from the categorized experience replay pool, and the TD3 agent is trained with each extracted group. During training, the grid power flow is recalculated after every action until all power flow out-of-limit conditions are eliminated or the set number of iterations is reached;
33) Obtain the trained TD3 agent;
Through the accumulated return, the TD3 agent adapts to various out-of-limit conditions. The trained TD3 agent can make real-time decisions for a power system running online and provide an active safety correction control scheme for power flow out-of-limit conditions;
34) Use the trained TD3 agent to perform active safety correction control on the running power system in response to problems in the power grid;
When a power flow out-of-limit condition exists, the trained TD3 agent outputs an active safety correction control scheme and the power grid performs active safety correction control according to it, the adjustment scheme being given online in a single interaction. When the grid operation data are within a reasonable range, i.e. the operating conditions are satisfied, the power grid can operate safely and no active safety correction is needed.
The active safety correction control decision is modeled as follows:
311) Obtain the state space;
The state consists of the variables of the units and the surrounding power grid that the agent can observe, and the state space takes into account as many decision-relevant factors as possible. The continuous variables of the state space comprise generator output, load power and line power flow, and the state space is characterized as:
[Equation image BDA0003561147530000081]
where [symbol image BDA0003561147530000082] is the active output of generator j in the ith sample, [symbol image BDA0003561147530000083] is the line active power flow of the ith sample, and [symbol image BDA0003561147530000084] is the load power of the ith sample; $j = 1, \dots, n_{gen}$, $k = 1, \dots, n_{line}$, $m = 1, \dots, n_{load}$, where $n_{gen}$ is the number of generators, $n_{line}$ is the number of branches, $n_{load}$ is the number of loads, and $M$ is the number of samples.
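To make the state definition concrete, the sketch below flattens the three groups of observations for one sample into a single state vector of length $n_{gen} + n_{line} + n_{load}$. The function name `build_state` and the ordering (generator outputs, then line flows, then load powers) are assumptions for illustration; the original equation image is not reproduced in this text.

```python
from typing import List, Sequence


def build_state(p_gen: Sequence[float],
                p_line: Sequence[float],
                p_load: Sequence[float]) -> List[float]:
    """Concatenate generator outputs, line flows and load powers of one sample."""
    # Assumed ordering: generators first, then lines, then loads.
    return [float(x) for x in (*p_gen, *p_line, *p_load)]
```

A batch of M samples would then be a list of M such vectors.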
312) Determine the action space;
The action space consists of the controllable variables in the power flow calculation. Its continuous variables are the generator output adjustments, so the action of the agent at time t is $a = [\Delta P_{G1}, \dots, \Delta P_{Gj}]$, where $\Delta P_{Gj}$ is the adjustment of the output of generator j.
313) The reward function in the agent training process is:
$R = \nu_1 r_1 + \nu_2 r_2 + \nu_3 r_3$ (2)
where $r_1$, $r_2$, $r_3$ and $r_4$ are the reward value of the line out-of-limit condition, the reward value of the generator output out-of-limit constraint, the reward value of the generator cost and the reward value of the active power balance, respectively, and $\nu_1$, $\nu_2$, $\nu_3$ and $\nu_4$ are weight coefficients;
The reward value $r_1$ of the line out-of-limit condition is:
[Equation image BDA0003561147530000091]
where $n_{line}$ is the number of branches of the grid, $I_i$ and $T_i$ are the current and the thermal limit of branch i, respectively, and $\varepsilon$ is a constant, generally taking the value 0 or 1, that prevents the denominator from being 0.
The reward value $r_2$ of the generator output out-of-limit constraint is:
[Equation image BDA0003561147530000092]
where $n_{gen}$ is the number of generators, $P_{Gi}$ is the output of generator i, $P_{Gi\max}$ is the upper limit of the active output of the generator, and $P_{Gi\min}$ is the lower limit of the active output of the generator.
The reward value $r_3$ of the generator cost is:
[Equation image BDA0003561147530000093]
where n is the number of units, $P_{Gi}$ is the generator output, a, b and c are cost coefficients, and d is the start-stop coefficient of the unit.
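The exact expressions for the component rewards are contained in equation images that are not reproduced in this text. Purely as an illustration of how such terms are commonly shaped, the sketch below uses two assumed forms: a generator limit reward equal to the negative total violation of the output limits, and a cost reward equal to the negative of a quadratic generation cost. Both forms, the function names, and the omission of the start-stop coefficient d are assumptions and should not be read as the patent's formulas.

```python
from typing import Sequence


def generator_limit_reward(p_gen: Sequence[float],
                           p_min: Sequence[float],
                           p_max: Sequence[float]) -> float:
    """Assumed form: negative total violation of generator output limits."""
    violation = 0.0
    for p, lo, hi in zip(p_gen, p_min, p_max):
        violation += max(0.0, p - hi) + max(0.0, lo - p)
    return -violation


def generator_cost_reward(p_gen: Sequence[float],
                          a: Sequence[float],
                          b: Sequence[float],
                          c: Sequence[float]) -> float:
    """Assumed form: negative quadratic cost, sum of a*P^2 + b*P + c."""
    return -sum(ai * p * p + bi * p + ci
                for p, ai, bi, ci in zip(p_gen, a, b, c))
```

With these assumed forms, both terms are at most zero, so the agent is rewarded for keeping generators within limits and for cheaper dispatch.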
The categorized experience replay pool of step 31) is divided into a successful-experience replay area and a failed-experience replay area. The specific classification method is as follows:
The conventional TD3 algorithm uses an experience replay area to solve the problems of correlation between data and non-stationary distribution. Experiences are stored in the replay area in units of $(s_t, a_t, r, s_{t+1})$, and at every update the Actor network and the Critic networks randomly draw a batch of samples from it for optimization. This random sampling leads to low training efficiency and poor convergence of the algorithm. The categorized experience replay pool is therefore divided into a successful-experience replay area and a failed-experience replay area. When the power flow after safety correction is not out of limit, the task has succeeded and the experience is stored in the successful-experience replay area, denoted $T_S$. When the power flow is out of limit, the task has failed and the experience is stored in the failed-experience replay area, denoted $T_f$. Because rewards in reinforcement learning are delayed, some of the experiences stored in $T_S$ immediately before a failure are also related to that failure, so this part of the experience is extracted from $T_S$ in the proportion $\eta$.
The improved sampling mode is as follows:
At each update step, a successful experience $(s_t, a_t, r, s_{t+1})$ is stored in $T_S$; a failed experience $(s_t, a_t, r, s_{t+1})$ is stored in $T_f$, and failure experiences are at the same time extracted from $T_f$ in the proportion $\eta$.
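A minimal sketch of a replay pool with separate success and failure areas is given below. The text leaves some room in how the proportion $\eta$ is applied; this sketch assumes that each mini-batch draws a fraction $\eta$ of its samples from $T_f$ and the rest from $T_S$. The class name `ClassifiedReplayPool`, the capacity, the default $\eta$ and the fixed random seed are all hypothetical.

```python
import random
from collections import deque
from typing import Deque, List, Tuple

Experience = Tuple[list, list, float, list]  # (s_t, a_t, r, s_t+1)


class ClassifiedReplayPool:
    """Replay pool split into a success area T_S and a failure area T_f."""

    def __init__(self, capacity: int = 10000, eta: float = 0.3, seed: int = 0):
        self.success: Deque[Experience] = deque(maxlen=capacity)  # T_S
        self.failure: Deque[Experience] = deque(maxlen=capacity)  # T_f
        self.eta = eta  # assumed: share of each batch taken from T_f
        self.rng = random.Random(seed)

    def store(self, exp: Experience, succeeded: bool) -> None:
        """Store an experience in T_S if the task succeeded, else in T_f."""
        (self.success if succeeded else self.failure).append(exp)

    def sample(self, batch_size: int) -> List[Experience]:
        """Draw about eta*batch_size failures and fill the rest with successes."""
        n_fail = min(round(self.eta * batch_size), len(self.failure))
        n_succ = min(batch_size - n_fail, len(self.success))
        batch = self.rng.sample(list(self.failure), n_fail)
        batch += self.rng.sample(list(self.success), n_succ)
        self.rng.shuffle(batch)
        return batch
```

Using bounded deques means the oldest experiences are discarded once an area is full, which is a common design choice for replay buffers rather than something stated in the text.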
A mini-batch of training data is sampled from the reconstructed experience replay area. TD3 updates the parameters of the current networks by gradient ascent and gradient descent, and updates the parameters of the target networks by a sliding average, so that the target network parameters change slowly and learning is more stable.
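The sliding-average update of the target network parameters is commonly written as $\theta' \leftarrow \tau\theta + (1-\tau)\theta'$. The sketch below shows this update on plain lists of parameters; the function name `soft_update` and the default value of $\tau$ are assumptions, since the text does not give a value.

```python
from typing import List, Sequence


def soft_update(target: Sequence[float],
                current: Sequence[float],
                tau: float = 0.005) -> List[float]:
    """Return tau*current + (1 - tau)*target, element by element."""
    if len(target) != len(current):
        raise ValueError("parameter vectors must have the same length")
    return [tau * c + (1.0 - tau) * t for t, c in zip(target, current)]
```

A small $\tau$ makes the target parameters track the current parameters slowly, which is the stabilizing effect described above.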
In step 32), the specific method for training the TD3 agent offline is as follows:
Referring to FIG. 2, the TD3 agent consists of current networks and target networks, each of which is divided into a policy network that executes actions and value networks that evaluate the quality of the actions. Every policy network and value network is a fully connected neural network consisting of an input layer, an output layer and several hidden layers. The Actor part comprises two networks, the Actor network and the Actor_target network, and the Critic part comprises four networks, the Critic_1 network, the Critic_target1 network, the Critic_2 network and the Critic_target2 network.
The Actor network is the policy network. It obtains the action to be taken from the current state: its input is the state and its output is the action. The Critic network is the current value network. It evaluates the value of the action output by the Actor network and generates the gradient used to update the Actor network; its inputs are the state and the action, and its output is the value of executing the input action in the current environment state. The Actor_target network is the target policy network, used to select the optimal next action according to the next state sampled from the experience replay pool and to update the network parameters. The Critic_target1 and Critic_target2 networks are the target value networks, used to calculate the target values and to update the network parameters.
The Actor network that fits the policy has parameters $\theta$. Its input is the current state $s_t$ and its output is the generator action $a_t$. The hidden layers of the network use the ReLU activation function for the nonlinear transformation, the output layer uses the Sigmoid activation function, and the network parameters are updated according to the deterministic policy gradient theorem:
[Equation image BDA0003561147530000111]
where [symbol image BDA0003561147530000112] is the gradient of the objective function with respect to $\theta$, N is the number of samples randomly drawn from the successful-experience replay pool, $\mu$ and Q denote the Actor network and the Critic network, respectively, i is the sample index, $s_i$ is the state vector of the ith sample, and a is the action at the current time.
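For reference, the sampled form of the deterministic policy gradient that is commonly used in DDPG and TD3 is written out below with the symbols defined above. Whether it matches the patent's equation image term by term cannot be verified from this text, and the use of the first Critic network (parameters $w_1$) for the Actor update is an assumption that follows the usual TD3 convention.

```latex
\nabla_{\theta} J \approx \frac{1}{N} \sum_{i=1}^{N}
  \nabla_{a} Q(s, a \mid w_1)\Big|_{s = s_i,\; a = \mu(s_i \mid \theta)}
  \, \nabla_{\theta}\, \mu(s \mid \theta)\Big|_{s = s_i}
```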
The Actor_target network has parameters $\theta'$. Its input is the next state $s_{t+1}$ and its output is the action in the next state.
The Critic_1 and Critic_2 networks that fit the state-action value function have parameters $w_1$ and $w_2$, respectively. Their inputs are the current state $s_t$ and the action actually performed, $a_t$, and their outputs are the state-action values $Q_{w1}$ and $Q_{w2}$. The hidden layers of the networks use the ReLU activation function for the nonlinear transformation, the output layer uses the Tanh activation function, and the network parameters are updated by mini-batch stochastic gradient descent:
[Equation image BDA0003561147530000121]
where [symbol image BDA0003561147530000122] is the gradient of the loss function, and $y_i$ and $a_i$ are the temporal-difference target value and the action of the ith sample, respectively.
The Critic_target1 and Critic_target2 networks have parameters $w'_1$ and $w'_2$. Their inputs are the next state $s_{t+1}$ and the next-state action output by the target policy network, [symbol image BDA0003561147530000123], and their outputs are the next-state action values $Q'_{w1}$ and $Q'_{w2}$. To prevent the Q value from being overestimated, the TD3 agent selects the smaller of the Q values of the two target networks. The temporal-difference target value y of equation (9) is therefore used when updating both the Critic_target1 network and the Critic_target2 network, and the loss function of equation (10) is shared:
[Equation image BDA0003561147530000124]
[Equation image BDA0003561147530000125]
where r is the reward, $\gamma$ is the discount coefficient, $\mu'$ is the Actor_target network, and $w_i$ and $w'_i$ are the weights of the ith Critic network and the ith Critic_target network, respectively.
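The idea of taking the smaller of the two target Q values can be sketched as follows, using the standard TD3 target $y = r + \gamma \min(Q'_{w1}, Q'_{w2})$ and a mean-squared-error loss. These are the commonly published forms and are assumed here, since equations (9) and (10) are not reproduced in this text; the function names and the default discount value are hypothetical.

```python
from typing import Sequence


def td_target(reward: float, q1_next: float, q2_next: float,
              gamma: float = 0.99) -> float:
    """Clipped double-Q target: r + gamma * min of the two target Q values."""
    return reward + gamma * min(q1_next, q2_next)


def critic_loss(q_values: Sequence[float], targets: Sequence[float]) -> float:
    """Mean squared error between predicted Q values and shared targets y."""
    if len(q_values) != len(targets) or not q_values:
        raise ValueError("need equally sized, non-empty inputs")
    return sum((y - q) ** 2 for q, y in zip(q_values, targets)) / len(q_values)
```

Both critics would be trained against the same list of targets, which is what sharing the target value between the two value networks amounts to in this sketch.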
The Actor is updated once after the Critic has been updated twice, and the parameters of the target policy network and the target value networks are obtained from the parameters of the policy network and the value networks, respectively, by a sliding average:
[Equation image BDA0003561147530000126]
where [symbol image BDA0003561147530000127] is the output of the Actor_target network.
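The delayed update schedule described above can be illustrated with a small counter. In this sketch the Critic is updated at every step, while the Actor and the target networks are updated once every `policy_delay` steps; the function name and the choice to update the targets together with the Actor follow common TD3 practice and are assumptions here.

```python
from typing import Tuple


def run_updates(n_steps: int, policy_delay: int = 2) -> Tuple[int, int, int]:
    """Count critic, actor and target-network updates over n_steps."""
    critic_updates = actor_updates = target_updates = 0
    for step in range(1, n_steps + 1):
        critic_updates += 1              # critics are updated every step
        if step % policy_delay == 0:     # actor is updated after two critic updates
            actor_updates += 1
            target_updates += 1          # sliding-average update of the targets
    return critic_updates, actor_updates, target_updates
```

Delaying the Actor update lets the value estimates settle before the policy is moved toward them.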
At regular intervals during training, the current value network draws a set number of samples from the categorized experience replay pool. From $(s_t, a_t)$ the current value network gives the Q value of executing action $a_t$ in state $s_t$, and from $s_{t+1}$ the target networks give the approximate target y for Q. The value network is updated by gradient descent to minimize the loss function so that Q approaches y, and the policy network is updated by the policy gradient ascent algorithm to maximize the objective function, so that the model favors actions with higher Q values.
The invention uses a deep neural network and the TD3 algorithm to learn a globally optimal decision automatically, by reinforcement learning from interaction with the environment, and gives the optimal strategy quickly online, meeting the requirements of online active safety correction on calculation speed and calculation accuracy. In addition, the TD3 algorithm uses parallel computation, which greatly shortens the calculation time where hardware permits and makes it well suited to fast optimization of the active safety correction control problem of the power grid. The invention is not affected by the system model and has good scalability. It is suitable for active safety correction strategy decisions in multiple scenarios: the TD3 agent is trained on operation data of the power grid from multiple scenarios, so the trained agent can cope with various power flow out-of-limit conditions.
Example two
Referring to FIG. 3, the active safety correction control system of the power system according to the present invention comprises:
an acquisition module 1 for acquiring real-time operation data of the power system;
a control module 2 for inputting the real-time operation data of the power system into the trained agent to obtain an active safety correction control scheme for the power system, and then performing active safety correction control on the power system according to that scheme.
Further, the control module 2 comprises:
a building module 21 for building a categorized experience replay pool, wherein the categorized experience replay pool comprises a successful-experience replay area and a failed-experience replay area;
an extraction module 22 for extracting samples from the categorized experience replay pool;
a training module 23 for training the agent with the extracted samples to obtain the trained agent.
All relevant content of each step in the embodiment of the power system active safety correction control method applies to the functional description of the corresponding functional module of the power system active safety correction control system in this embodiment, and is not repeated here.
The division of modules in the embodiments of the present application is schematic and represents only one logical division of functions; other divisions are possible in an actual implementation. In addition, the functional modules in each embodiment of the present application may be integrated in one processor, may each exist alone physically, or two or more modules may be integrated in one module. The integrated module can be implemented in hardware or as a software functional module.
Example three
A computer device comprises a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the power system active safety correction control method when executing the computer program. The memory may include internal memory, such as high-speed random access memory, and may further include non-volatile memory, such as at least one disk memory. The processor, the network interface and the memory are connected to each other through an internal bus, which may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus or the like; the bus may be divided into an address bus, a data bus, a control bus and so on. The memory is used for storing programs; specifically, a program may comprise program code that includes computer operation instructions. The memory provides instructions and data to the processor.
Example four
A computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the power system active safety correction control method. The computer-readable storage medium may include, for example and without limitation, volatile memory and/or non-volatile memory. The volatile memory may include random access memory (RAM) and/or cache memory, among others. The non-volatile memory may include read-only memory (ROM), hard disk, flash memory, optical disk, magnetic disk and the like.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: although the present invention has been described in detail with reference to the above embodiments, it should be understood by those skilled in the art that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.

Claims (10)

1. An active safety correction control method for a power system is characterized by comprising the following steps:
acquiring real-time operation data of the power system;
and inputting the real-time operation data of the power system into the trained intelligent agent to obtain an active safety correction control scheme of the power system, and then performing active safety correction control on the power system according to the active safety correction control scheme of the power system.
2. The power system active safety correction control method according to claim 1, wherein before inputting the real-time operation data of the power system into the trained agent, the method further comprises:
establishing a categorized experience playback pool, wherein the categorized experience playback pool comprises a successful experience playback area and a failed experience playback area;
extracting samples from the categorized experience playback pool;
and training the intelligent agent by using the extracted sample to obtain the trained intelligent agent.
3. The power system active safety correction control method according to claim 2, wherein the state space s in the process of training the agent by using the extracted samples is:
Figure FDA0003561147520000011
wherein
Figure FDA0003561147520000012
is the active power output of generator j in the i-th sample,
Figure FDA0003561147520000013
is the line active power flow in the i-th sample, and
Figure FDA0003561147520000014
is the load power in the i-th sample; $j = 1, \dots, n_{gen}$; $k = 1, \dots, n_{line}$; $m = 1, \dots, n_{load}$; $n_{gen}$ is the number of generators, $n_{line}$ is the number of branches, $n_{load}$ is the number of loads, and M is the number of samples taken.
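As an illustration of how the quantities named in claim 3 might be assembled into a single state vector, the following Python sketch concatenates, for each of the M samples, the generator active outputs, the line active power flows and the load powers. The function name `build_state` and the ordering of entries are assumptions; the exact layout of s is given by the formula image of the claim and is not reproduced here.

```python
def build_state(gen_output, line_flow, load_power):
    """Flatten per-sample measurements into one state vector.

    gen_output[i]  -> n_gen generator active outputs of sample i
    line_flow[i]   -> n_line line active power flows of sample i
    load_power[i]  -> n_load load powers of sample i
    The per-sample ordering (generators, lines, loads) is an assumption.
    """
    if not (len(gen_output) == len(line_flow) == len(load_power)):
        raise ValueError("all three inputs must cover the same M samples")
    state = []
    for p_gen, p_line, p_load in zip(gen_output, line_flow, load_power):
        state.extend(p_gen)
        state.extend(p_line)
        state.extend(p_load)
    return state


# Toy usage: M = 2 samples, n_gen = 2, n_line = 3, n_load = 1.
s = build_state(
    gen_output=[[1.0, 2.0], [1.5, 2.5]],
    line_flow=[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    load_power=[[9.0], [9.5]],
)
```

The resulting vector has M × (n_gen + n_line + n_load) entries, which is 12 in this toy case.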
4. The power system active safety correction control method according to claim 2, wherein the reward function in the process of training the agent by using the extracted samples is as follows:
$R = \nu_1 r_1 + \nu_2 r_2 + \nu_3 r_3$
where $r_1$, $r_2$ and $r_3$ are, respectively, the reward value for the line out-of-limit condition, the reward value for the generator output out-of-limit constraint and the reward value for the generator cost, and $\nu_1$, $\nu_2$ and $\nu_3$ are weight coefficients.
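The reward of claim 4 combines three component rewards with weight coefficients. A minimal Python sketch follows, reading the formula as a weighted sum; the function name `total_reward` and the numeric weights and component values in the usage lines are hypothetical and serve only to show the arithmetic.

```python
def total_reward(r1, r2, r3, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three component rewards.

    r1: reward for the line out-of-limit condition
    r2: reward for the generator output out-of-limit constraint
    r3: reward for the generator cost
    weights: (v1, v2, v3) weight coefficients; the defaults are assumptions.
    """
    v1, v2, v3 = weights
    return v1 * r1 + v2 * r2 + v3 * r3


# Hypothetical values: a heavier weight on line overloads makes the agent
# prioritize removing power flow violations over reducing cost.
reward = total_reward(r1=-1.0, r2=-0.5, r3=-2.0, weights=(0.5, 0.3, 0.2))
```

The component rewards themselves are defined in claims 5 to 7 and are treated here as already computed inputs.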
5. The power system active safety correction control method according to claim 4, characterized in that the reward value $r_1$ for the line out-of-limit condition is:
Figure FDA0003561147520000021
where $n_{line}$ is the number of branches of the power grid, $I_i$ and $T_i$ are the current and the thermal limit of branch i, respectively, and $\varepsilon$ is a constant.
6. The power system active safety correction control method according to claim 4, characterized in that the reward value $r_2$ for the generator output out-of-limit constraint is:
Figure FDA0003561147520000022
where $n_{gen}$ is the number of generators, $P_{Gi}$ is the output of the generator, $P_{Gi\max}$ is the upper limit of the active power output of the generator, and $P_{Gi\min}$ is the lower limit of the active power output of the generator.
7. The power system active safety correction control method according to claim 4, characterized in that the reward value $r_3$ for the generator cost is:
Figure FDA0003561147520000023
where n is the number of units, $P_{Gi}$ is the output of the generator, a, b and c are the generator cost coefficients, and d is the start-stop coefficient of the unit.
8. An active safety correction control system for a power system, comprising:
the acquisition module (1) is used for acquiring real-time operation data of the power system;
and the control module (2) is used for inputting the real-time operation data of the power system into the trained intelligent agent to obtain an active safety correction control scheme of the power system, and then performing active safety correction control on the power system according to the active safety correction control scheme of the power system.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor when executing the computer program implements the steps of the power system active safety correction control method according to any one of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored, and the computer program, when being executed by a processor, implements the steps of the power system active safety correction control method according to any one of claims 1 to 7.
CN202210289577.3A 2022-03-23 2022-03-23 Active safety correction control method, system, equipment and storage medium for power system Active CN115360772B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210289577.3A CN115360772B (en) 2022-03-23 2022-03-23 Active safety correction control method, system, equipment and storage medium for power system

Publications (2)

Publication Number Publication Date
CN115360772A true CN115360772A (en) 2022-11-18
CN115360772B CN115360772B (en) 2023-08-15

Family

ID=84030580

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210289577.3A Active CN115360772B (en) 2022-03-23 2022-03-23 Active safety correction control method, system, equipment and storage medium for power system

Country Status (1)

Country Link
CN (1) CN115360772B (en)

Also Published As

Publication number Publication date
CN115360772B (en) 2023-08-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant