CN114094592A - Method, system, equipment and storage medium for controlling emergency load of power grid - Google Patents

Method, system, equipment and storage medium for controlling emergency load of power grid

Info

Publication number
CN114094592A
CN114094592A
Authority
CN
China
Prior art keywords
load
power
power system
value
load shedding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111363835.XA
Other languages
Chinese (zh)
Inventor
Li Jian (李健)
Wang Xinying (王新迎)
Ren Hantao (任汉涛)
Chen Ersong (陈二松)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Hebei Electric Power Co Ltd
China Electric Power Research Institute Co Ltd CEPRI
Original Assignee
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Hebei Electric Power Co Ltd
China Electric Power Research Institute Co Ltd CEPRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, Electric Power Research Institute of State Grid Hebei Electric Power Co Ltd, China Electric Power Research Institute Co Ltd CEPRI filed Critical State Grid Corp of China SGCC
Priority to CN202111363835.XA priority Critical patent/CN114094592A/en
Publication of CN114094592A publication Critical patent/CN114094592A/en
Pending legal-status Critical Current


Classifications

    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00Circuit arrangements for ac mains or ac distribution networks
    • H02J3/12Circuit arrangements for ac mains or ac distribution networks for adjusting voltage in ac networks by changing a characteristic of the network load
    • H02J3/14Circuit arrangements for ac mains or ac distribution networks for adjusting voltage in ac networks by changing a characteristic of the network load by switching loads on to, or off from, network, e.g. progressively balanced loading
    • H02J3/144Demand-response operation of the power transmission or distribution network
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00Circuit arrangements for ac mains or ac distribution networks
    • H02J3/001Methods to deal with contingencies, e.g. abnormalities, faults or failures
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2203/00Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
    • H02J2203/10Power transmission or distribution systems management focussing at grid-level, e.g. load flow analysis, node profile computation, meshed network optimisation, active network management or spinning reserve management
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2203/00Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
    • H02J2203/20Simulating, e g planning, reliability check, modelling or computer assisted design [CAD]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02BCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO BUILDINGS, e.g. HOUSING, HOUSE APPLIANCES OR RELATED END-USER APPLICATIONS
    • Y02B70/00Technologies for an efficient end-user side electric power management and consumption
    • Y02B70/30Systems integrating technologies related to power network operation and communication or information technologies for improving the carbon footprint of the management of residential or tertiary loads, i.e. smart grids as climate change mitigation technology in the buildings sector, including also the last stages of power distribution and the control, monitoring or operating management systems at local level
    • Y02B70/3225Demand response systems, e.g. load shedding, peak shaving
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S20/00Management or operation of end-user stationary applications or the last stages of power distribution; Controlling, monitoring or operating thereof
    • Y04S20/20End-user application control systems
    • Y04S20/222Demand response systems, e.g. load shedding, peak shaving

Abstract

The invention discloses a method, a system, equipment and a storage medium for controlling the emergency load of a power grid, wherein the method comprises the following steps: acquiring a state space of a power system under a fault condition; and inputting the state space of the power system under the fault condition into a load shedding control model to obtain load shedding actions, wherein the load shedding control model is obtained by simulation training of a deep reinforcement learning network. The invention studies emergency load shedding of the power grid with a continuous-action deep reinforcement learning algorithm, and provides an intelligent and feasible solution for ensuring the safe and stable operation of the power system.

Description

Method, system, equipment and storage medium for controlling emergency load of power grid
Technical Field
The invention relates to the field of emergency control of a power grid, in particular to a method, a system, equipment and a storage medium for emergency load control of the power grid.
Background
A modern power system is a typical nonlinear system, and to maintain its stable performance the bus voltages of the system must be kept within a standard range. Traditional voltage regulation rules and methods mostly rely on historical experience and offline studies; their effects are not ideal, being sometimes too conservative and sometimes risky, and they cannot adapt well to the random changes of a power system. Meanwhile, as the scale of the power system keeps growing and uncertain factors such as new energy sources and electric vehicles are added, safe and stable operation of the power system becomes more challenging, and blackout accidents have occurred around the world in recent years. Taking measures to reduce the active-power imbalance in the early stage of a power system fault is therefore the main method of preventing grid collapse. When the load power demand exceeds the output limit of the system generators, actively or passively cutting off part of the load is an important measure for maintaining the stability of the power system. With the gradual rise of artificial intelligence technology and its deepening application in the electric power field, an artificial intelligence method for automatically shedding load in an emergency state is urgently needed in the power system to further guarantee safe and stable operation of the power grid.
Currently, reinforcement learning is mainly applied to generator tripping control in the emergency state of the power grid, and most such control applications use Q-learning with discrete actions to cut off generators. For example, a generator-tripping control strategy has been built on the combination of a dueling Q network and a double Q network; a convolutional neural network has been used as a function approximator to estimate the state-action Q function; and the Q-learning algorithm has been improved to accelerate its convergence rate, control the generator output, and generate a voltage control strategy for the power system. This research is mainly directed at generator control. The DQN (deep Q network) algorithm and the algorithms derived from it are effective at learning discrete action strategies, but they struggle with continuous action strategies: although a continuous action can be discretized, the action space that grows after discretization drives the algorithm into the curse of dimensionality. Existing emergency control schemes are usually designed offline for a few typical operation scenarios; little research has applied deep reinforcement learning to load shedding, in particular deep reinforcement learning with continuous action control, and the reward functions used are relatively simple. With the increasing uncertainty and variation in modern power grids, such control and prevention will face significant adaptability and refinement problems.
Disclosure of Invention
The invention aims to provide a method, a system, equipment and a storage medium for controlling the emergency load of a power grid based on deep reinforcement learning, so as to overcome the defects in the prior art.
In order to achieve the purpose, the invention adopts the following technical scheme:
the method for controlling the emergency load of the power grid comprises the following steps:
acquiring a state space of a power system under a fault condition;
and inputting the state space of the power system under the fault condition into a load shedding control model to obtain load shedding actions, wherein the load shedding control model is obtained by carrying out simulation training on a deep reinforcement learning network.
Further, the power system state space includes at least one of: the active power of the bus load; the reactive power of the bus load; the bus node voltage; the active power of the generator; the reactive power of the generator; the rotating speed of the generator; and the power angle of the generator.
Further, the load shedding action satisfies the following condition: if the number of the grid structure nodes of the power system is less than or equal to a first threshold value and the number of the load nodes which can participate in the adjustment is less than or equal to a second threshold value, all the loads are subjected to load shedding action adjustment; if the number of grid structure nodes of the power system is greater than a first threshold value and the number of load nodes which can participate in adjustment is greater than a second threshold value, sorting the voltage sensitivity of the load nodes from large to small by adopting a sensitivity analysis method, and selecting the first N load nodes for preferential removal, wherein N is a positive integer;
the load shedding action of a load node participating in regulation is set between 0% and M% and can be adjusted continuously, wherein M% is the maximum upper limit of load that the node can shed.
Further, the load shedding control model is established by the following steps:
simulating power flow initialization of the power system;
setting load random fluctuation;
setting a fault of the power system, judging whether the active power of the total load of the power system is greater than the active power of the total generator set or not according to the power flow calculation result of the power system during the fault, if so, performing load shedding action, and if not, performing power flow initialization again;
judging whether the load flow calculation result meets constraint conditions or not according to the load flow calculation result of the power system after the load shedding action, if all the constraint conditions are met, directly calculating a final reward value through a reward function, and finishing training; if any constraint condition is not met, calculating the current reward value through the reward function and the penalty function, performing load shedding action again until all the constraint conditions are met, adding the current reward values calculated in each step of the simulation training of the round to obtain a final reward value, and finishing the training.
Further, the reward function is designed as follows:
[Reward function given as a formula image in the original; it specifies the reward values (reward weight coefficient λ_1, reward constants C_1–C_5) awarded when the corresponding constraints are satisfied.]
the penalty function is designed as follows:
[Penalty function given as a formula image in the original; it specifies the penalty values (penalty weight coefficients λ_2, η, δ, γ, non-convergence penalty constant C_6) applied when the corresponding constraints are violated.]
in the formula: L_step is the number of iteration steps of the deep reinforcement learning; v is the node voltage value of the power system; P_loadi and Q_loadi are, respectively, the active power and reactive power cut off at the i-th load node of the power system; P_geni is the active power of the i-th generator node in the power system; P_bal is the active power of the balancing machine in the power system; P_loadi^max and Q_loadi^max are, respectively, the maximum upper limits of active power and reactive power that can be cut off at the i-th load node; P_geni^max is the maximum upper limit of the active power output of the i-th generator in the power system; P_bal^max is the upper limit of the control output of the balancing machine; λ_1 is the reward weight coefficient; C_1, C_2, C_3, C_4, C_5 are the reward constant values when the result satisfies the constraint conditions; λ_2, η, δ, γ are the penalty weight coefficients when the result does not satisfy the constraint conditions; C_6 is the penalty constant value for power flow non-convergence; and r_1–r_6 are the reward values or penalty values under the corresponding constraint conditions;
and adding the reward values or the penalty values to obtain the current reward value calculated by each step of the deep reinforcement learning load shedding action, wherein the current reward value is shown as the following formula:
$R_k = \sum_{j=1}^{6} r_j$
in the formula, R_k is the current reward value of step k, and r_j is the reward value or penalty value obtained at step k of the deep reinforcement learning training according to whether the result satisfies the corresponding constraint condition;
and adding the current reward value of each step to obtain the final reward value of the current round of simulation training, wherein the final reward value is shown as the following formula:
$R_{sum} = \sum_{k=1}^{L_{step}} R_k$
in the formula, R_k is the current reward value of step k, R_sum is the final reward value obtained in the current round of deep reinforcement learning simulation training, and L_step is the number of iteration steps of the deep reinforcement learning.
Furthermore, after the load shedding action is obtained, the reactive power of the power system is adjusted through reactive power compensation, so that the node voltage of the power system is further optimized and improved.
Further, the adjusting of the reactive power of the power system through reactive compensation specifically includes: injecting reactive power at the generator nodes and switching capacitor and reactor banks to further optimize the node voltage of the power system.
The power grid emergency load shedding control system comprises a module for acquiring the power system state space under a fault condition and a load shedding action output module;
the module for acquiring the power system state space under the fault condition is used for obtaining the state space of the power system under the fault condition;
the load shedding action output module is used for inputting the state space of the power system under the fault condition into a load shedding control model to obtain the load shedding action, wherein the load shedding control model is obtained by simulation training of a deep reinforcement learning network.
Further, the process of establishing the load shedding control model comprises the following steps:
simulating power flow initialization of the power system;
setting load random fluctuation;
setting a fault of the power system, judging whether the active power of the total load of the power system is greater than the active power of the total generator set or not according to the power flow calculation result of the power system during the fault, if so, performing load shedding action, and if not, performing power flow initialization again;
judging whether the load flow calculation result meets constraint conditions or not according to the load flow calculation result of the power system after load shedding action, if all the constraint conditions are met, calculating a final reward value directly through a reward function, and finishing training; if any constraint condition is not met, calculating the current reward value through the reward function and the penalty function, performing load shedding action again until all the constraint conditions are met, adding the current reward values calculated in each step of the simulation training of the round to obtain a final reward value, and finishing the training.
Further, the reward function is designed as follows:
[Reward function given as a formula image in the original; it specifies the reward values (reward weight coefficient λ_1, reward constants C_1–C_5) awarded when the corresponding constraints are satisfied.]
the penalty function is designed as follows:
[Penalty function given as a formula image in the original; it specifies the penalty values (penalty weight coefficients λ_2, η, δ, γ, non-convergence penalty constant C_6) applied when the corresponding constraints are violated.]
in the formula: L_step is the number of iteration steps of the deep reinforcement learning; v is the node voltage value of the power system; P_loadi and Q_loadi are, respectively, the active power and reactive power cut off at the i-th load node of the power system; P_geni is the active power of the i-th generator node in the power system; P_bal is the active power of the balancing machine in the power system; P_loadi^max and Q_loadi^max are, respectively, the maximum upper limits of active power and reactive power that can be cut off at the i-th load node; P_geni^max is the maximum upper limit of the active power output of the i-th generator in the power system; P_bal^max is the upper limit of the control output of the balancing machine; λ_1 is the reward weight coefficient; C_1, C_2, C_3, C_4, C_5 are the reward constant values when the result satisfies the constraint conditions; λ_2, η, δ, γ are the penalty weight coefficients when the result does not satisfy the constraint conditions; C_6 is the penalty constant value for power flow non-convergence; and r_1–r_6 are the reward values or penalty values under the corresponding constraint conditions;
and adding the reward values or the penalty values to obtain the current reward value calculated by each step of the deep reinforcement learning load shedding action, wherein the current reward value is shown as the following formula:
$R_k = \sum_{j=1}^{6} r_j$
in the formula, R_k is the current reward value of step k, and r_j is the reward value or penalty value obtained at step k of the deep reinforcement learning training according to whether the result satisfies the corresponding constraint condition;
and adding the current reward value of each step to obtain the final reward value of the current round of simulation training, wherein the final reward value is shown as the following formula:
$R_{sum} = \sum_{k=1}^{L_{step}} R_k$
in the formula, R_k is the current reward value of step k, R_sum is the final reward value obtained in the current round of deep reinforcement learning simulation training, and L_step is the number of iteration steps of the deep reinforcement learning.
A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the grid emergency load control method when executing the computer program.
A computer-readable storage medium, having stored thereon a computer program for, when being executed by a processor, performing the steps of the grid emergency load control method.
Compared with the prior art, the invention has the following beneficial technical effects:
the method is applied to the field of automatic load shedding control in the emergency state of the power grid, improves the accuracy of manual operation of load shedding control actions in the emergency state of the power grid by using a deep reinforcement learning artificial intelligence algorithm, can quickly give a control strategy, and greatly saves the reaction time of operators.
Furthermore, by combining power system stable operation theory, namely stabilizing the controlled voltage within a specified interval, with practical power system operation constraints, a load shedding state space and action space are constructed with the deep reinforcement learning method, and a constraint reward function that accords with power grid operation experience is set to guide the generation of the control strategy. The method can therefore generate control strategies for one or more fault types, effectively avoids the repeated physical modeling of traditional methods, and gives the algorithm a certain generalization capability.
Furthermore, the method can effectively ensure the safe and stable operation of the power system, and improves the load shedding control strategy level of the power system in an emergency state.
Drawings
The accompanying drawings are included to provide a further understanding of the invention, and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention.
FIG. 1 is a flow chart of an embodiment of the present invention.
FIG. 2 is a flow chart of deep reinforcement learning simulation training according to the present invention.
Fig. 3 is a diagram of an IEEE39 node system architecture.
FIG. 4 is a graph of node voltage after the control action of the IEEE39 node system.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The invention provides a power grid emergency load shedding control method based on deep reinforcement learning, which aims to enhance the safe and stable operation capability of a large power grid, improve its anti-interference capability, and realize automatic control of the emergency load shedding of the power system. The invention adopts a load shedding control strategy based on artificial intelligence: the load shedding action strategy selected by deep reinforcement learning supports automatic voltage control, that is, the controlled voltage is stabilized within a specified interval. The method takes the operation data of the power system as basic data, constructs a state space and an action space, and then establishes a voltage stability constraint reward function that accords with the operation characteristics of the power grid to calculate the reward value for deep reinforcement learning. It adopts the deep deterministic policy gradient (DDPG) algorithm with a continuous action strategy, realizing joint driving by data and model. Finally, the deep reinforcement learning network is trained continuously to obtain an automatic load shedding adjustment strategy for the power grid voltage control problem under the emergency control condition of the power system.
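The patent names the deep deterministic policy gradient (DDPG) algorithm but gives no implementation beyond that choice. As a minimal sketch of the actor-critic pair that DDPG relies on, assuming a PyTorch setup (the layer sizes and the sigmoid squashing of the actor output to [0, 1] are illustrative assumptions, not taken from the source):

```python
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Maps a grid state vector to continuous load-shedding fractions in [0, 1]."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, action_dim), nn.Sigmoid(),  # continuous actions in [0, 1]
        )

    def forward(self, state):
        return self.net(state)

class Critic(nn.Module):
    """Estimates Q(s, a) for a state-action pair, used to train the actor."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))
```

DDPG additionally keeps target copies of both networks and a replay buffer; those standard components are omitted here for brevity.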
As shown in fig. 1, the present invention provides a method for controlling an emergency load of an electrical power system based on deep reinforcement learning, the method comprising:
constructing a deep reinforcement learning state space
Because deep reinforcement learning requires continuous interaction between the agent (the deep reinforcement learning network) and the environment, the state space in the power grid emergency load shedding control strategy must consist of data that reflect the power-flow operating state of the grid, so that the current operating state of the grid can be monitored and the recovered operating state observed after the load shedding control strategy is executed. Accordingly, the observable operation state variables of the power system, namely bus load active power, bus load reactive power, bus node voltage, generator active power, generator reactive power, generator rotating speed, and generator power angle, are selected as the components of the deep reinforcement learning state space.
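As a minimal sketch of how such a state vector could be assembled from the seven variable groups listed above (the function name and the NumPy framing are assumptions for illustration):

```python
import numpy as np

def build_state(bus_load_p, bus_load_q, bus_v, gen_p, gen_q, gen_speed, gen_angle):
    """Concatenate the observable operating variables of the power system
    into one flat state vector for the deep reinforcement learning agent."""
    return np.concatenate([bus_load_p, bus_load_q, bus_v,
                           gen_p, gen_q, gen_speed, gen_angle]).astype(np.float32)
```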
Constructing deep reinforcement learning load shedding action conditions
For load shedding of the power grid in an emergency state, not all loads can be cut off arbitrarily: only part of the adjustable loads can participate in the load shedding action while maintaining the voltage stability of the bus nodes, and not every adjustable load can participate in the regulation with a noticeable effect. The method therefore considers two cases. On one hand, if the grid structure of the power system is small, that is, the number of grid structure nodes is less than or equal to a first threshold, and the number of adjustable loads is small, that is, the number of load nodes that can participate in regulation is less than or equal to a second threshold, then all the loads can be regulated by the load shedding action. On the other hand, if the grid structure of the power system is large, that is, the number of grid structure nodes is greater than the first threshold, and the number of loads that can participate in regulation is large, that is, the number of adjustable load nodes is greater than the second threshold, then a sensitivity analysis method can be adopted to sort the voltage sensitivities of the load nodes from large to small, and the first N load nodes are selected for preferential shedding, where N is a positive integer whose value can be set according to experience. The load shedding action of a load node participating in regulation is set between 0% and M% and can be adjusted continuously, where M% is the maximum upper limit of load that the node can shed.
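The two-case selection rule above can be captured in a short sketch; the function signature and the dictionary of voltage sensitivities are illustrative assumptions:

```python
def select_shed_nodes(n_grid_nodes, sensitivity, first_threshold, second_threshold, top_n):
    """Decide which adjustable load nodes participate in the shedding action.

    sensitivity: dict mapping each adjustable load node id to its voltage
    sensitivity (obtained from a sensitivity analysis of the grid).
    """
    adjustable = list(sensitivity)
    if n_grid_nodes <= first_threshold and len(adjustable) <= second_threshold:
        return adjustable  # small grid, few adjustable loads: all participate
    # large grid: rank by voltage sensitivity, descending, and shed the top N first
    return sorted(adjustable, key=sensitivity.get, reverse=True)[:top_n]
```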
Deep reinforcement learning algorithm flow design
The DDPG-based simulation flow of the automatic load shedding algorithm for the power grid emergency state is shown in FIG. 2, and the design of the automatic load shedding adjustment algorithm model for the power grid emergency state is completed according to this flow chart. After the power system suffers a fault and enters an emergency state, it is judged from the power flow calculation result whether the active power P_load of the total load of the power system is greater than the active power P_gen of the total generator set. If it is, the load shedding action is carried out; if it is not, no load shedding action is required, and the power flow can be initialized again in the simulation experiment. According to the power flow calculation result of the power system after the load shedding action, it is judged whether the result satisfies the constraint conditions. If all the constraint conditions are satisfied, the final reward value is calculated directly through the reward function and the training is finished; if any constraint condition is not satisfied, the current reward value is calculated through the reward function and the penalty function, and the load shedding action is performed again until all the constraint conditions are satisfied, after which the current reward values calculated at each step of this round of simulation training are added to obtain the final reward value and the training is finished.
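A minimal sketch of one round of this loop, assuming a hypothetical environment wrapper `env` (power flow initialization, random load fluctuation, fault injection and constraint checking) and an agent exposing `act()`; none of these interfaces come from the source:

```python
def run_training_round(env, agent, max_steps):
    """One round of simulation training following the flow of FIG. 2."""
    state = env.reset()                # power flow init + load fluctuation + fault
    total_reward = 0.0
    for _ in range(max_steps):
        if env.total_load_p() <= env.total_gen_p():
            break                      # no active-power deficit: no shedding needed
        action = agent.act(state)      # continuous shedding fractions per node
        state, reward, all_met = env.step(action)  # rerun power flow, score result
        total_reward += reward         # sum the current reward value of each step
        if all_met:
            break                      # all constraints satisfied: round finished
    return total_reward                # final reward value of this round
```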
Constructing a deep reinforcement learning constraint reward function
The construction of the constraint reward function must consider the operation recovery state of the power grid after the load shedding action in the emergency state, such as whether the power flow converges, the power balance condition of the power system, whether node voltages are out of limit, whether critical sections are out of limit, and the output constraints of the generators and the balancing machine. After the emergency load shedding action is executed, if the running state of the power grid meets all the constraint conditions, the deep reinforcement learning agent obtains a corresponding reward value; if any constraint condition is not met, the agent is punished correspondingly.
The reward function is designed as follows:
[Reward function given as a formula image in the original; it specifies the reward values (reward weight coefficient λ_1, reward constants C_1–C_5) awarded when the corresponding constraints are satisfied.]
the penalty function is designed as follows:
[Penalty function given as a formula image in the original; it specifies the penalty values (penalty weight coefficients λ_2, η, δ, γ, non-convergence penalty constant C_6) applied when the corresponding constraints are violated.]
in the formula: L_step is the number of iteration steps of the deep reinforcement learning; v is the node voltage value of the power system; P_loadi and Q_loadi are, respectively, the active power and reactive power cut off at the i-th load node of the power system; P_geni is the active power of the i-th generator node in the power system; P_bal is the active power of the balancing machine in the power system; P_loadi^max and Q_loadi^max are, respectively, the maximum upper limits of active power and reactive power that can be cut off at the i-th load node; P_geni^max is the maximum upper limit of the active power output of the i-th generator in the power system; P_bal^max is the upper limit of the control output of the balancing machine; λ_1 is the reward weight coefficient; C_1, C_2, C_3, C_4, C_5 are the reward constant values when the result satisfies the constraint conditions; λ_2, η, δ, γ are the penalty weight coefficients when the result does not satisfy the constraint conditions; C_6 is the penalty constant value for power flow non-convergence; and r_1–r_6 are the reward values or penalty values under the corresponding constraint conditions;
and adding the reward values or the penalty values to obtain the current reward value calculated by each step of the deep reinforcement learning load shedding action, wherein the current reward value is shown as the following formula:
$R_k = \sum_{j=1}^{6} r_j$
in the formula, R_k is the current reward value of step k, and r_j is the reward value or penalty value obtained at step k of the deep reinforcement learning training according to whether the result satisfies the corresponding constraint condition;
and adding the current reward value of each step to obtain the final reward value of the current round of simulation training, wherein the final reward value is shown as the following formula:
$R_{sum} = \sum_{k=1}^{L_{step}} R_k$
in the formula, R_k is the current reward value of step k, R_sum is the final reward value obtained in the current round of deep reinforcement learning simulation training, and L_step is the number of iteration steps of the deep reinforcement learning.
Determining a deep reinforcement learning control strategy
A load shedding control model is trained according to the above process, and the strategy action with the highest reward return value is selected as the final emergency load shedding action, thereby realizing the emergency load shedding control of the power system and keeping the controlled node voltages within the constraint interval to ensure stable operation of the power system.
Reactive compensation rationality design
After the power system performs an emergency load shedding action, it should maintain power balance. Active power balance can be adjusted by adjusting the active power of the loads to be cut off. Reactive power must also be adjusted, because load shedding cuts off reactive power in the same proportion as active power; if the node voltages are to be optimized further after the action control strategy, this can be achieved by adjusting the generator terminal voltages, that is, by injecting reactive power at the generator nodes and switching capacitor and reactor banks.
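A bounded sketch of such a reactive touch-up, assuming a hypothetical power flow wrapper `pf` with `voltage()` and `inject_q()` methods (neither interface is from the source):

```python
def reactive_compensation(pf, gen_nodes, q_step, v_low=0.95, v_high=1.05, max_iter=20):
    """Nudge node voltages back inside [v_low, v_high] p.u. by injecting
    reactive power at generator nodes, i.e. switching capacitor or reactor
    steps; max_iter bounds the number of switching operations per node."""
    for node in gen_nodes:
        for _ in range(max_iter):
            v = pf.voltage(node)
            if v < v_low:
                pf.inject_q(node, q_step)    # switch in a capacitor step
            elif v > v_high:
                pf.inject_q(node, -q_step)   # switch in a reactor step
            else:
                break                        # voltage back inside the band
```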
The invention also provides a deep reinforcement learning-based emergency load shedding control system for the power system, which comprises a module for acquiring the power system state space under a fault condition and a load shedding action output module;
the module for acquiring the power system state space under the fault condition is used for obtaining the state space of the power system under the fault condition;
the load shedding action output module is used for inputting the state space of the power system under the fault condition into a load shedding control model to obtain the load shedding action, wherein the load shedding control model is obtained by simulation training of a deep reinforcement learning network.
The invention is illustrated below by way of a specific example:
Aiming at the problem of power grid emergency load shedding control, the invention provides a power grid emergency load shedding control method based on deep reinforcement learning, as shown in FIG. 1. The method specifically comprises the following steps:
1. the observable operation state variables of the power system, such as bus load active power, bus load reactive power, bus node voltage, generator active power, generator reactive power, generator rotating speed, generator power angle and other parameters are selected as the components of the deep reinforcement learning state space.
2. Taking the IEEE39 node system shown in FIG. 3 as an example, and considering that not all loads can be cut off arbitrarily in emergency load shedding of the power grid, only some adjustable loads can participate in the load shedding action. The adjustable load nodes are set as nodes 3, 4, 7, 8, 16, 20 and 24, all adjustable load nodes participate in the regulation, and the load shedding action range can be set in the interval [0%, 40%] of the original load. If the grid structure of the power grid is large and many load nodes can participate in regulation, a sensitivity analysis method can be adopted to sort the voltage sensitivities of the load nodes from large to small, and the load nodes with high voltage sensitivity are selected for preferential shedding.
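A minimal sketch of mapping raw actor outputs onto this [0%, 40%] interval for the seven adjustable nodes; the clipping and the assumption that the actor output is already squashed to [0, 1] are illustrative:

```python
import numpy as np

ADJUSTABLE_NODES = [3, 4, 7, 8, 16, 20, 24]  # adjustable load nodes of the IEEE39 case
MAX_SHED = 0.40                              # 40% upper limit of the original load

def to_shed_fractions(raw_action):
    """Scale raw actor outputs in [0, 1] to per-node shedding fractions."""
    return np.clip(np.asarray(raw_action, dtype=float), 0.0, 1.0) * MAX_SHED

# e.g. to_shed_fractions([1.0, 1.0, 1.0, 0.55, 0.0, 1.0, 1.0])
# -> array([0.4, 0.4, 0.4, 0.22, 0., 0.4, 0.4]), one entry per node above
```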
3. The training of the automatic load shedding control model for the power grid emergency state is completed using the flow chart of the deep-reinforcement-learning-based automatic load shedding algorithm shown in FIG. 2. After the power system breaks down and enters an emergency state, whether the active power of the total load of the power system is greater than the active power of the total generator set can be judged according to the power flow calculation result of the power system. If so, the load shedding action is carried out; if not, no load shedding action is needed, and the power flow can be initialized again in the simulation experiment. It is then judged whether the power flow calculation result meets the constraint conditions: if all the constraint conditions are met, the final reward value is calculated directly through the reward function and the training is finished; if any constraint condition is not met, the current reward value is calculated through the reward function and the penalty function, and the load shedding action is performed again until all the constraint conditions are met, after which the current reward values calculated at each step are added to obtain the final reward value and the training is finished.
4. After the load shedding action is executed, if the running state of the power grid meets all the constraint conditions, the intelligent agent for deep reinforcement learning obtains a corresponding reward value, and if any constraint condition is not met, the intelligent agent receives corresponding punishment.
The reward function is designed as follows:
[Reward function given as a formula image in the original; it specifies the reward values (reward weight coefficient λ_1, reward constants C_1–C_5) awarded when the corresponding constraints are satisfied.]
the penalty function is designed as follows:
[Penalty function given as a formula image in the original; it specifies the penalty values (penalty weight coefficients λ_2, η, δ, γ, non-convergence penalty constant C_6) applied when the corresponding constraints are violated.]
in the formula: L_step is the number of iteration steps of the deep reinforcement learning; v is the node voltage value of the power system; P_loadi and Q_loadi are, respectively, the active power and reactive power cut off at the i-th load node of the power system; P_geni is the active power of the i-th generator node in the power system; P_bal is the active power of the balancing machine in the power system; P_loadi^max and Q_loadi^max are, respectively, the maximum upper limits of active power and reactive power that can be cut off at the i-th load node; P_geni^max is the maximum upper limit of the active power output of the i-th generator in the power system; P_bal^max is the upper limit of the control output of the balancing machine; λ_1 is the reward weight coefficient; C_1, C_2, C_3, C_4, C_5 are the reward constant values when the result satisfies the constraint conditions; λ_2, η, δ, γ are the penalty weight coefficients when the result does not satisfy the constraint conditions; C_6 is the penalty constant value for power flow non-convergence; and r_1–r_6 are the reward values or penalty values under the corresponding constraint conditions;
and adding the reward values or the penalty values to obtain the current reward value calculated by each step of the deep reinforcement learning load shedding action, wherein the current reward value is shown as the following formula:
$R_k = \sum_{j=1}^{6} r_j$
in the formula, R_k is the current reward value of step k, and r_j is the reward value or penalty value obtained at step k of the deep reinforcement learning training according to whether the result satisfies the corresponding constraint condition;
and adding the current reward value of each step to obtain the final reward value of the current round of simulation training, wherein the final reward value is shown as the following formula:
$R_{sum} = \sum_{k=1}^{L_{step}} R_k$
in the formula, R_k is the current reward value of step k, R_sum is the final reward value obtained in the current round of deep reinforcement learning simulation training, and L_step is the number of iteration steps of the deep reinforcement learning.
5. Under normal operation of the power system, all load nodes fluctuate randomly within the interval [90%, 110%]. Generator sets No. 35 and No. 38 are cut off at random to simulate system faults, and the power system enters an emergency state. The node voltage condition of the resulting operating state can then be judged through simulation; as shown in Table 1, the voltage of some load nodes drops below the [0.95, 1.05] per-unit range, and if this condition persists the power system faces a dangerous state.
TABLE 1 Load node voltage per-unit values after a fault occurs

                Node 3   Node 4   Node 7   Node 8   Node 16   Node 20   Node 24
No. 35 fault      0.99     0.93     0.92     0.92      0.98      0.98      0.99
No. 38 fault      0.95     0.88     0.86     0.86      0.97      0.97      0.98
6. At this time, the power system emergency load shedding control model generated by the deep reinforcement learning of the invention is applied, and the percentages of the original load cut off at each node are as shown in Table 2:
TABLE 2 Load node shedding actions after a fault occurs

                Node 3   Node 4   Node 7   Node 8   Node 16   Node 20   Node 24
No. 35 fault       40%      40%      40%    21.9%       0%       40%       40%
No. 38 fault        0%      40%      40%      40%       0%       40%       40%
7. After the power system performs an emergency load shedding action, it should maintain power balance. Active power balance can be adjusted by adjusting the active power of the loads to be cut off. Reactive power must also be adjusted, because load shedding cuts off reactive power in the same proportion as active power; if the node voltages are to be optimized further after the action control strategy, this can be achieved by adjusting the generator terminal voltages, that is, by injecting reactive power at the generator nodes and switching capacitor and reactor banks.
8. Finally, the voltage of each node of the whole power system is restored to within [0.95, 1.05] per unit, ensuring stable operation of the power system; the node voltages are shown in FIG. 4.
The load shedding control method based on artificial intelligence of the invention solves the problem of automatically controlling the load shedding strategy in the emergency state of the power system. Through a data-driven method based on deep reinforcement learning, with the operation data of the power system as the environment information for algorithm interaction, voltage stabilization in the emergency state of the power system is finally realized, operators are better assisted in improving the accuracy of the load shedding control action strategy in the emergency state, and the safe and stable operation level of the power system is improved.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: although the present invention has been described in detail with reference to the above embodiments, those skilled in the art will appreciate that various changes, modifications and equivalents can be made in the embodiments of the invention without departing from the scope of the invention as defined by the appended claims.

Claims (12)

1. The method for controlling the emergency load of the power grid is characterized by comprising the following steps:
acquiring a state space of a power system under a fault condition;
and inputting the state space of the power system under the fault condition into a load shedding control model to obtain load shedding actions, wherein the load shedding control model is obtained by carrying out simulation training on a deep reinforcement learning network.
2. The grid emergency load control method of claim 1, wherein the power system state space comprises at least one of: the active power of the bus load; the reactive power of the bus load; the bus node voltage; the active power of the generator; the reactive power of the generator; the rotating speed of the generator; and the power angle of the generator.
3. The grid emergency load shedding control method according to claim 1, wherein the load shedding action satisfies the following condition: if the number of the grid structure nodes of the power system is less than or equal to a first threshold value and the number of the load nodes which can participate in the adjustment is less than or equal to a second threshold value, all the loads are subjected to load shedding action adjustment; if the number of grid structure nodes of the power system is greater than a first threshold value and the number of load nodes which can participate in adjustment is greater than a second threshold value, sorting the voltage sensitivity of the load nodes from large to small by adopting a sensitivity analysis method, and selecting the first N load nodes for preferential removal, wherein N is a positive integer;
the load shedding action of a load node participating in regulation is set between 0% and M% and can be adjusted continuously, wherein M% is the maximum upper limit of load that the node can shed.
4. The grid emergency load shedding control method according to claim 1, wherein the load shedding control model is established by:
simulating power flow initialization of the power system;
setting load random fluctuation;
setting a fault of the power system, judging whether the active power of the total load of the power system is greater than the active power of the total generator set or not according to the power flow calculation result of the power system during the fault, if so, performing load shedding action, and if not, performing power flow initialization again;
judging whether the load flow calculation result meets constraint conditions or not according to the load flow calculation result of the power system after the load shedding action, if all the constraint conditions are met, directly calculating a final reward value through a reward function, and finishing training; if any constraint condition is not met, calculating the current reward value through the reward function and the penalty function, performing load shedding action again until all the constraint conditions are met, adding the current reward values calculated in each step of the simulation training of the round to obtain a final reward value, and finishing the training.
5. The grid emergency load control method according to claim 4, wherein the reward function is designed as follows:
[Reward function given as a formula image in the original; it specifies the reward values (reward weight coefficient λ_1, reward constants C_1–C_5) awarded when the corresponding constraints are satisfied.]
the penalty function is designed as follows:
[Penalty function given as a formula image in the original; it specifies the penalty values (penalty weight coefficients λ_2, η, δ, γ, non-convergence penalty constant C_6) applied when the corresponding constraints are violated.]
in the formula: L_step is the number of iteration steps of the deep reinforcement learning; v is the node voltage value of the power system; P_loadi and Q_loadi are, respectively, the active power and reactive power cut off at the i-th load node of the power system; P_geni is the active power of the i-th generator node in the power system; P_bal is the active power of the balancing machine in the power system; P_loadi^max and Q_loadi^max are, respectively, the maximum upper limits of active power and reactive power that can be cut off at the i-th load node; P_geni^max is the maximum upper limit of the active power output of the i-th generator in the power system; P_bal^max is the upper limit of the control output of the balancing machine; λ_1 is the reward weight coefficient; C_1, C_2, C_3, C_4, C_5 are the reward constant values when the result satisfies the constraint conditions; λ_2, η, δ, γ are the penalty weight coefficients when the result does not satisfy the constraint conditions; C_6 is the penalty constant value for power flow non-convergence; and r_1–r_6 are the reward values or penalty values under the corresponding constraint conditions;
and adding the reward values or the penalty values to obtain the current reward value calculated by each step of the deep reinforcement learning load shedding action, wherein the current reward value is shown as the following formula:
$R_k = \sum_{j=1}^{6} r_j$
in the formula, R_k is the current reward value of step k, and r_j is the reward value or penalty value obtained at step k of the deep reinforcement learning training according to whether the result satisfies the corresponding constraint condition;
and adding the current reward value of each step to obtain the final reward value of the current round of simulation training, wherein the final reward value is shown as the following formula:
$R_{sum} = \sum_{k=1}^{L_{step}} R_k$
in the formula, R_k is the current reward value of step k, R_sum is the final reward value obtained in the current round of deep reinforcement learning simulation training, and L_step is the number of iteration steps of the deep reinforcement learning.
6. The method for controlling the emergency load of the power grid according to claim 1, wherein after the load shedding action is obtained, the reactive power of the power system is adjusted through the reactive power compensation, so that the node voltage of the power system is further optimized and improved.
7. The method for grid emergency load control according to claim 6, wherein the adjusting of the reactive power of the power system by reactive power compensation specifically comprises: injecting reactive power at the generator nodes and switching capacitor and reactor banks to further optimize the node voltage of the power system.
8. The power grid emergency load shedding control system is characterized by comprising a module for acquiring the power system state space under a fault condition and a load shedding action output module, wherein the power system state space acquisition module is connected with the load shedding action output module;
the module for acquiring the power system state space under the fault condition is used for obtaining the state space of the power system under the fault condition;
the load shedding action output module is used for inputting the state space of the power system under the fault condition into a load shedding control model to obtain the load shedding action, wherein the load shedding control model is obtained by simulation training of a deep reinforcement learning network.
9. The grid emergency load control system according to claim 8, wherein the load shedding control model is established by:
simulating power flow initialization of the power system;
setting load random fluctuation;
setting a fault of the power system, judging whether the active power of the total load of the power system is greater than the active power of the total generator set or not according to the power flow calculation result of the power system during the fault, if so, performing load shedding action, and if not, performing power flow initialization again;
judging whether the load flow calculation result meets constraint conditions or not according to the load flow calculation result of the power system after the load shedding action, if all the constraint conditions are met, directly calculating a final reward value through a reward function, and finishing training; if any constraint condition is not met, calculating the current reward value through the reward function and the penalty function, performing load shedding action again until all the constraint conditions are met, adding the current reward values calculated in each step of the simulation training of the round to obtain a final reward value, and finishing the training.
10. The grid emergency load control system of claim 9, wherein the reward function is designed as follows:
[Reward function given as a formula image in the original; it specifies the reward values (reward weight coefficient λ_1, reward constants C_1–C_5) awarded when the corresponding constraints are satisfied.]
the penalty function is designed as follows:
[Penalty function given as a formula image in the original; it specifies the penalty values (penalty weight coefficients λ_2, η, δ, γ, non-convergence penalty constant C_6) applied when the corresponding constraints are violated.]
in the formula: L_step is the number of iteration steps of the deep reinforcement learning; v is the node voltage value of the power system; P_loadi and Q_loadi are, respectively, the active power and reactive power cut off at the i-th load node of the power system; P_geni is the active power of the i-th generator node in the power system; P_bal is the active power of the balancing machine in the power system; P_loadi^max and Q_loadi^max are, respectively, the maximum upper limits of active power and reactive power that can be cut off at the i-th load node; P_geni^max is the maximum upper limit of the active power output of the i-th generator in the power system; P_bal^max is the upper limit of the control output of the balancing machine; λ_1 is the reward weight coefficient; C_1, C_2, C_3, C_4, C_5 are the reward constant values when the result satisfies the constraint conditions; λ_2, η, δ, γ are the penalty weight coefficients when the result does not satisfy the constraint conditions; C_6 is the penalty constant value for power flow non-convergence; and r_1–r_6 are the reward values or penalty values under the corresponding constraint conditions;
and adding the reward values or the penalty values to obtain the current reward value calculated by each step of the deep reinforcement learning load shedding action, wherein the current reward value is shown as the following formula:
$R_k = \sum_{j=1}^{6} r_j$
in the formula, R_k is the current reward value of step k, and r_j is the reward value or penalty value obtained at step k of the deep reinforcement learning training according to whether the result satisfies the corresponding constraint condition;
and adding the current reward value of each step to obtain the final reward value of the current round of simulation training, wherein the final reward value is shown as the following formula:
$R_{sum} = \sum_{k=1}^{L_{step}} R_k$
in the formula, R_k is the current reward value of step k, R_sum is the final reward value obtained in the current round of deep reinforcement learning simulation training, and L_step is the number of iteration steps of the deep reinforcement learning.
11. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the grid emergency load control method according to any one of claims 1 to 7.
12. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, carries out the steps of the grid emergency load control method according to any one of claims 1 to 7.
CN202111363835.XA 2021-11-17 2021-11-17 Method, system, equipment and storage medium for controlling emergency load of power grid Pending CN114094592A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111363835.XA CN114094592A (en) 2021-11-17 2021-11-17 Method, system, equipment and storage medium for controlling emergency load of power grid

Publications (1)

Publication Number Publication Date
CN114094592A (en) 2022-02-25

Family

ID=80301531

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111363835.XA Pending CN114094592A (en) 2021-11-17 2021-11-17 Method, system, equipment and storage medium for controlling emergency load of power grid

Country Status (1)

Country Link
CN (1) CN114094592A (en)

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113312839A (en) * 2021-05-25 2021-08-27 武汉大学 Power grid emergency auxiliary load shedding decision method and device based on reinforcement learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JIAN LI et al.: "Load Shedding Control Strategy in Power Grid Emergency State Based on Deep Reinforcement Learning", CSEE JOURNAL OF POWER AND ENERGY SYSTEMS, pages 1175-1182 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116245334A (en) * 2023-03-15 2023-06-09 东南大学 Power system risk perception real-time scheduling method based on deep reinforcement learning
CN116245334B (en) * 2023-03-15 2024-04-16 东南大学 Power system risk perception real-time scheduling method based on deep reinforcement learning
CN116345498A (en) * 2023-05-30 2023-06-27 南方电网数字电网研究院有限公司 Frequency emergency coordination control method for data-model hybrid drive power system
CN116345498B (en) * 2023-05-30 2023-09-15 南方电网数字电网研究院有限公司 Frequency emergency coordination control method for data-model hybrid drive power system
CN117495426A (en) * 2023-12-29 2024-02-02 国网山西省电力公司经济技术研究院 New energy power system operation cost rapid calculation method and system
CN117495426B (en) * 2023-12-29 2024-03-29 国网山西省电力公司经济技术研究院 New energy power system operation cost rapid calculation method and system

Similar Documents

Publication Publication Date Title
CN114094592A (en) Method, system, equipment and storage medium for controlling emergency load of power grid
CN111628501B (en) AC/DC large power grid transient voltage stability assessment method and system
CN110932281B (en) Multi-section cooperative correction method and system based on quasi-steady-state sensitivity of power grid
CN103985058B (en) Available transfer capability calculation method based on improved multiple centrality-correction interior point method
CN115940294B (en) Multi-stage power grid real-time scheduling strategy adjustment method, system, equipment and storage medium
CN113077075B (en) New energy uncertainty electric power system safety risk prevention control method and device
CN113761791A (en) Power system automatic operation method and device based on physical information and deep reinforcement learning
CN114678860A (en) Power distribution network protection control method and system based on deep reinforcement learning
CN111159922A (en) Key line identification method and device for cascading failure of power system
CN115021249A (en) Distribution network transient characteristic equivalence method and system considering distributed photovoltaic
CN113011679A (en) Hydropower station flood discharge and power generation combined operation regulation and control method and device and electronic equipment
CN110556828A (en) Online safety and stability emergency control method and system adaptive to equipment power flow change
CN111864744B (en) Online switching method and system for control modes of speed regulator of high-proportion hydroelectric system
CN113872230B (en) New energy fault ride-through control parameter optimization method and device
CN116031957A (en) Large-scale wind farm voltage and frequency recovery control method
CN115940148A (en) Minimum inertia requirement evaluation method and device, electronic equipment and storage medium
CN115133540A (en) Power distribution network model-free real-time voltage control method
CN113725863A (en) Power grid autonomous control and decision method and system based on artificial intelligence
CN113555876A (en) Line power flow regulation and control method and system based on artificial intelligence
CN111082414B (en) Transient voltage calculation method and system
CN111478331B (en) Method and system for adjusting power flow convergence of power system
CN115360772B (en) Active safety correction control method, system, equipment and storage medium for power system
CN110797905B (en) Control method and device for low voltage ride through of wind turbine generator and storage medium
CN117578466B (en) Power system transient stability prevention control method based on dominant function decomposition
CN113689112B (en) Intelligent energy station energy efficiency evaluation method and system by utilizing cloud computing improved analytic hierarchy process

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination