CN112465664B - AVC intelligent control method based on artificial neural network and deep reinforcement learning - Google Patents

Publication number
CN112465664B
Authority
CN
China
Prior art keywords
neural network
function
avc
artificial neural
agent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011263523.7A
Other languages
Chinese (zh)
Other versions
CN112465664A (en)
Inventor
朱勇
陶用伟
王常沛
蒋宏荣
徐坤
李泽群
张韵
杨键
黄琼
杨晓燕
邓钦
郑华
高卫华
王秀境
时敏
李明宏
刘岑俐
肖彬
肖浩宇
王寅
曹杰
陈锐
苏华英
田年杰
代江
刘明顺
吴应双
龙秋风
张丹
欧阳可凤
汪明清
黄才云
潘云
王雨
陈愿米
付麟淞
舒晓晴
吴秋君
蒋进芳
顾本洪
唐洁瑶
廖玉琼
姚璐
肖倩宏
安甦
陈锦龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Power Grid Co Ltd
Kaili Power Supply Bureau of Guizhou Power Grid Co Ltd
Original Assignee
Guizhou Power Grid Co Ltd
Kaili Power Supply Bureau of Guizhou Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou Power Grid Co Ltd, Kaili Power Supply Bureau of Guizhou Power Grid Co Ltd filed Critical Guizhou Power Grid Co Ltd
Priority to CN202011263523.7A priority Critical patent/CN112465664B/en
Publication of CN112465664A publication Critical patent/CN112465664A/en
Application granted granted Critical
Publication of CN112465664B publication Critical patent/CN112465664B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06Electricity, gas or water supply
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/18Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00Circuit arrangements for ac mains or ac distribution networks
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00Circuit arrangements for ac mains or ac distribution networks
    • H02J3/12Circuit arrangements for ac mains or ac distribution networks for adjusting voltage in ac networks by changing a characteristic of the network load
    • H02J3/16Circuit arrangements for ac mains or ac distribution networks for adjusting voltage in ac networks by changing a characteristic of the network load by adjustment of reactive power
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00Circuit arrangements for ac mains or ac distribution networks
    • H02J3/38Arrangements for parallely feeding a single network by two or more generators, converters or transformers
    • H02J3/46Controlling of the sharing of output between the generators, converters, or transformers
    • H02J3/50Controlling the sharing of the out-of-phase component
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2203/00Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
    • H02J2203/20Simulating, e g planning, reliability check, modelling or computer assisted design [CAD]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02EREDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E40/00Technologies for an efficient electrical power generation, transmission or distribution
    • Y02E40/30Reactive power compensation
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02EREDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E40/00Technologies for an efficient electrical power generation, transmission or distribution
    • Y02E40/70Smart grids as climate change mitigation technology in the energy generation sector
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Abstract

The invention discloses an AVC intelligent control method based on an artificial neural network and deep reinforcement learning. The method comprises: dividing the substation into different sub-control areas by combining a situation prediction result of the reactive load of the power grid with the reactive load variation pattern at the new energy grid-connection point; optimizing an action utility function based on the Bellman equation and a minimized loss function, and obtaining a decision metric function in combination with the action utility function; training the agent by optimizing the parameters of its decision model using the gradient of the decision metric function; and inputting the situation prediction results of the different sub-regions and the reactive power variation pattern of the new energy into the agent, which calculates the voltage control quantity of the power system to control the reactive voltage of the power grid. The invention trains the agent by combining an artificial neural network with a multi-agent deterministic-policy reinforcement learning algorithm, thereby improving the active control capability for reactive voltage.

Description

AVC intelligent control method based on artificial neural network and deep reinforcement learning
Technical Field
The invention relates to the technical field of power control, in particular to an AVC intelligent control method based on an artificial neural network and deep reinforcement learning.
Background
In recent years, large-scale power outages caused by insufficient situation awareness during power system operation and control have become increasingly frequent around the world, and wide-area situation awareness of power systems has received growing attention. Wide-area situation awareness of a power system involves collecting steady-state and dynamic, electrical and non-electrical information across the wide-area grid; analyzing, understanding and evaluating equipment state information, grid steady-state data, grid dynamic data, grid transient fault information, grid operating environment information and the like by means of wide-area dynamic security monitoring, data mining, dynamic parameter identification, faster-than-real-time simulation, visualization and similar techniques; and then predicting how the state of the grid will develop. The application of situation awareness technology in power systems is still at an early stage, and bodies such as the U.S. Federal Energy Regulatory Commission and the National Institute of Standards and Technology have listed situation awareness as one of the priority technical fields for smart grids.
With the rapid development of large-scale new energy integration and AC-DC hybrid power grids, uncertainty on both the source side and the load side has increased, the reactive voltage problem of the system has become increasingly prominent, and the safe operation of the power grid is challenged. At present, reactive power optimization control is a global system optimization on a short time scale: control decisions are neither proactive nor predictive, and the influence of the uncertainty of new energy and reactive load on reactive voltage control over a long time scale is not fully considered. As a result, reactive power equipment is adjusted frequently and the overall control effect over a long time scale is unsatisfactory.
Disclosure of Invention
This section is for the purpose of summarizing some aspects of embodiments of the invention and to briefly introduce some preferred embodiments. In this section, as well as in the abstract and the title of the invention of this application, simplifications or omissions may be made to avoid obscuring the purpose of the section, the abstract and the title, and such simplifications or omissions are not intended to limit the scope of the invention.
The present invention has been made in view of the above-mentioned conventional problems.
Therefore, the invention provides an AVC intelligent control method based on an artificial neural network and deep reinforcement learning, which can avoid reactive voltage risks and solve the problem of poor active control of reactive voltage.
To solve the above technical problems, the invention provides the following technical scheme: dividing the substation into different sub-control areas by combining a situation prediction result of the reactive load of the power grid with the reactive load variation pattern at the new energy grid-connection point; optimizing an action utility function based on the Bellman equation and a minimized loss function, and obtaining a decision metric function in combination with the action utility function; training the agent by optimizing the parameters of its decision model using the gradient of the decision metric function; and inputting the situation prediction results of the different sub-regions and the reactive power variation pattern of the new energy into the agent, which calculates the voltage control quantity of the power system to control the reactive voltage of the power grid.
In a preferred scheme of the method, the situation prediction result is obtained by constructing a deep neural network regression model based on a deep artificial neural network and integrating multiple load regression results of that model to obtain the situation prediction result of the reactive load.
In a preferred scheme of the method, constructing the deep neural network regression model comprises building a regression model structure from the characteristics of the reactive load data, taking into account climate environment, season, regional distribution, user load and power grid dispatching control strategies:
Figure BDA0002775396190000021
where k is the order; x^(k) is the k-th order hidden-layer node unit vector; y^(k) is the k-th order output node vector; u^(k) is the k-th order input vector; the symbol in Figure BDA0002775396190000022 is the k-th order feedback state input vector; the symbol in Figure BDA0002775396190000023 is the k-th order feedback state vector; the symbol in Figure BDA0002775396190000024 is the k-th order hidden-layer output vector; ω_i (i = 1, 2, 3, 4, 5, 6) are the connection weight matrices of the layers; g(·) is the transfer function of the output neurons; and f(·) is the transfer function of the middle-layer neurons.
In a preferred scheme of the method, the regression model structure further comprises:
Figure BDA0002775396190000025
Figure BDA0002775396190000026
Figure BDA0002775396190000027
where x^(k-1) is the (k-1)-th order hidden-layer node unit vector; the symbol in Figure BDA0002775396190000028 is the (k-1)-th order feedback state vector; u^(k-1) is the (k-1)-th order input vector; the symbol in Figure BDA0002775396190000029 is the (k-1)-th order feedback state input vector; y^(k-1) is the (k-1)-th order output node vector; the symbol in Figure BDA00027753961900000210 is the (k-1)-th order hidden-layer output vector; and η, b and the symbol in Figure BDA00027753961900000211 are self-feedback gain factors.
In a preferred scheme of the method, the minimized loss function is defined as:
Figure BDA0002775396190000031
where the expression in Figure BDA0002775396190000032 is the minimized loss function whose argument is the training parameter shown in Figure BDA0002775396190000033; E is the expected value; s is the current system state; s' is the environment state at the next moment; a is the action selected in the corresponding state; the symbol in Figure BDA0002775396190000034 is the experience pool; and y_i is the estimate, obtained through the Bellman equation, of the true value of the quantity shown in Figure BDA0002775396190000035.
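The loss formula itself survives only as an image placeholder in this text. As a minimal sketch, assuming the mean-squared form commonly used for Critic training in deterministic-policy actor-critic methods, the expectation over the experience pool can be approximated by a mini-batch average. The function name `critic_loss` and the sample values are illustrative and do not come from the patent.

```python
import numpy as np

def critic_loss(q_values, targets):
    """Mini-batch estimate of E[(Q_i(s, a) - y_i)^2].

    Assumption: the loss is the mean squared error between the Critic's
    estimates and the Bellman targets y_i; the patent's exact formula is
    not recoverable from this text.
    """
    q_values = np.asarray(q_values, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return float(np.mean((q_values - targets) ** 2))

# Hypothetical batch of two samples drawn from the experience pool.
example_loss = critic_loss([1.0, 2.0], [1.0, 4.0])
```

Minimizing this quantity with respect to the Critic parameters pulls the action utility estimate toward the Bellman targets.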
In a preferred scheme of the method, the estimate of the true value is:
Figure BDA0002775396190000036
where r_i is the return value obtained in the i-th iteration; μ is the decision value; γ is the decay rate, with γ ∈ [0, 1]; Q_i' is the Q-value function of the target Critic network for the next state; s' is the next state reached by taking action a in system state s; a' is the action selected by the target Actor network (Figure BDA0002775396190000037) in system state s'; the symbol in Figure BDA0002775396190000038 is the parameter of the target Actor network; and the symbol in Figure BDA0002775396190000039 is the parameter of the target Critic network.
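The target formula is likewise an image placeholder here. The sketch below assumes the standard one-step Bellman target, y_i = r_i + γ · Q_i'(s', a'), where a' is produced by the target Actor network and Q_i' by the target Critic network. The function name `bellman_target` and the numbers are illustrative only.

```python
import numpy as np

def bellman_target(rewards, next_q_values, gamma=0.95):
    """One-step Bellman target y = r + gamma * Q'(s', a').

    Assumption: next_q_values already holds the target Critic's output for
    the next state s' and the target Actor's action a'.
    """
    rewards = np.asarray(rewards, dtype=float)
    next_q_values = np.asarray(next_q_values, dtype=float)
    return rewards + gamma * next_q_values

# Hypothetical rewards and target-Critic outputs for two transitions.
example_targets = bellman_target([1.0, 0.0], [2.0, 4.0], gamma=0.5)
```

A decay rate γ close to 0 makes the agent short-sighted, while a value close to 1 weights later rewards almost as heavily as immediate ones.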
In a preferred scheme of the method, the parameters of the target Actor network are updated from the parameters of the actual Actor network (Figure BDA00027753961900000310):
Figure BDA00027753961900000311
and the parameters of the target Critic network are updated from the parameters of the actual Critic network (Figure BDA00027753961900000312):
Figure BDA00027753961900000313
where τ controls the update rate.
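The two update formulas are image placeholders in this text. A minimal sketch, assuming the usual soft-update convention θ' ← τ·θ + (1 − τ)·θ' applied identically to the Actor and Critic parameters, could look as follows. The function name `soft_update` and the parameter shapes are hypothetical.

```python
import numpy as np

def soft_update(target_params, actual_params, tau=0.01):
    """Blend actual-network parameters into the target network.

    Assumption: theta_target <- tau * theta_actual + (1 - tau) * theta_target,
    applied to each parameter array in turn.
    """
    return [
        tau * np.asarray(actual, dtype=float)
        + (1.0 - tau) * np.asarray(target, dtype=float)
        for target, actual in zip(target_params, actual_params)
    ]

# Hypothetical single-layer parameter list.
updated = soft_update([np.zeros(2)], [np.ones(2)], tau=0.25)
```

A small τ makes the target networks change slowly, which keeps the Bellman targets stable during training.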
In a preferred scheme of the method, the action utility function Q_i(s, a) is defined as the expectation of the sum of the rewards subsequently obtained by the agent of the i-th region after action a is executed in system state s, and the decision metric function is then:
Figure BDA00027753961900000314
In a preferred scheme of the method, the gradient of the decision metric function with respect to the decision function parameters of the i-th regional agent (Figure BDA0002775396190000041) is:
Figure BDA0002775396190000042
where the symbol in Figure BDA0002775396190000043 is the gradient operator; a_i is the action value of the i-th iteration; the expression in Figure BDA0002775396190000044 is the gradient of the action utility function at the i-th iteration; and the expression in Figure BDA0002775396190000045 is the gradient of the target Actor network (Figure BDA0002775396190000046) at the i-th iteration.
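The gradient formula is an image placeholder, but the surrounding definitions describe a product of the action utility gradient and the Actor gradient, which matches the chain rule used in deterministic policy gradient methods. The sketch below illustrates that chain rule under strong simplifying assumptions: a linear Actor a = s·θ with a scalar action, and a toy Critic gradient supplied as a function. None of these names or model choices come from the patent.

```python
import numpy as np

def policy_gradient(theta, states, grad_q_wrt_action):
    """Mini-batch deterministic policy gradient for a linear Actor.

    Assumptions: the Actor is a = states @ theta (scalar action), so
    d(a)/d(theta) equals the state vector; grad_q_wrt_action(states, actions)
    returns dQ/da for each sample.
    """
    states = np.asarray(states, dtype=float)
    actions = states @ np.asarray(theta, dtype=float)   # shape (batch,)
    dq_da = grad_q_wrt_action(states, actions)          # shape (batch,)
    # Chain rule: dQ/dtheta = dQ/da * da/dtheta, averaged over the batch.
    return (dq_da[:, None] * states).mean(axis=0)

# Toy Critic Q(s, a) = -(a - 1)^2, whose gradient in a is -2 * (a - 1).
toy_grad = policy_gradient(
    np.zeros(2), np.eye(2), lambda s, a: -2.0 * (a - 1.0)
)
```

Moving θ a small step along this gradient raises the decision metric; in practice the Actor and Critic would be neural networks and the gradients would come from automatic differentiation.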
In a preferred scheme of the method, the voltage control quantity is calculated on the basis of the Newton-Raphson power flow calculation, using the formula:
Figure BDA0002775396190000047
where U is the voltage control quantity, M_i is the modulation index of the voltage source converter, and U_d is the fundamental voltage of the DC node.
The beneficial effects of the invention are as follows: a prediction of the future reactive voltage situation is formed by analyzing data samples of new energy and reactive load, and the reactive voltage of the power grid is controlled by an agent; the agent is trained by combining an artificial neural network with a multi-agent deterministic-policy reinforcement learning algorithm, which improves the active control capability for reactive voltage.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise. Wherein:
FIG. 1 is a schematic flowchart of an AVC intelligent control method based on an artificial neural network and deep reinforcement learning according to a first embodiment of the present invention;
fig. 2 is a schematic diagram of a transformer substation and a substation system region division of the AVC intelligent control method based on an artificial neural network and deep reinforcement learning according to the first embodiment of the present invention;
fig. 3 is a schematic diagram of an Actor network structure of an AVC intelligent control method based on an artificial neural network and deep reinforcement learning according to a first embodiment of the present invention;
FIG. 4 is a schematic structural diagram of the Critic network of the AVC intelligent control method based on an artificial neural network and deep reinforcement learning according to the first embodiment of the present invention;
FIG. 5 is a schematic diagram of an agent training process of an AVC intelligent control method based on an artificial neural network and deep reinforcement learning according to a first embodiment of the present invention;
FIG. 6 is a schematic diagram of the operation flow of an agent in an AVC intelligent control method based on an artificial neural network and deep reinforcement learning according to a first embodiment of the present invention;
fig. 7 is a schematic diagram of a loss function curve of an Actor network of an AVC intelligent control method based on an artificial neural network and deep reinforcement learning according to a second embodiment of the present invention;
FIG. 8 is a schematic diagram of a loss function curve of the Critic network of an AVC intelligent control method based on an artificial neural network and deep reinforcement learning according to a second embodiment of the present invention;
FIG. 9 is a diagram illustrating the variation of the total reward function and the action times of the AVC intelligent control method based on artificial neural network and deep reinforcement learning according to the second embodiment of the present invention with the training process;
fig. 10 is a schematic diagram of voltage amplitudes of nodes before and after the intelligent agent controls in a certain operating state according to the AVC intelligent control method based on the artificial neural network and the deep reinforcement learning according to the second embodiment of the present invention;
fig. 11 is a schematic diagram of a loss function curve of an Actor network in consideration of new energy output fluctuation according to a second embodiment of the AVC intelligent control method based on an artificial neural network and deep reinforcement learning;
fig. 12 is a schematic diagram of a loss function curve of the Critic network in consideration of new energy output fluctuation in the AVC intelligent control method based on an artificial neural network and deep reinforcement learning according to the second embodiment of the present invention;
fig. 13 is a schematic diagram of the action times of each agent in consideration of new energy fluctuation according to the AVC intelligent control method based on an artificial neural network and deep reinforcement learning according to the second embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, specific embodiments accompanied with figures are described in detail below, and it is apparent that the described embodiments are a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making creative efforts based on the embodiments of the present invention, shall fall within the protection scope of the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways than those specifically described and will be readily apparent to those of ordinary skill in the art without departing from the spirit of the present invention, and therefore the present invention is not limited to the specific embodiments disclosed below.
Furthermore, reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one implementation of the invention. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
The present invention will be described in detail with reference to the drawings, wherein the cross-sectional views illustrating the structure of the device are not enlarged partially in general scale for convenience of illustration, and the drawings are only exemplary and should not be construed as limiting the scope of the present invention. In addition, the three-dimensional dimensions of length, width and depth should be included in the actual fabrication.
Meanwhile, in the description of the present invention, it should be noted that the terms "upper, lower, inner and outer" and the like indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience of describing the present invention and simplifying the description, but do not indicate or imply that the referred device or element must have a specific orientation, be constructed in a specific orientation and operate, and thus, cannot be construed as limiting the present invention. Furthermore, the terms first, second, or third are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
The terms "mounted, connected and connected" in the present invention are to be understood broadly, unless otherwise explicitly specified or limited, for example: can be fixedly connected, detachably connected or integrally connected; they may be mechanically, electrically, or directly connected, or indirectly connected through intervening media, or may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
Example 1
Referring to fig. 1 to 6, a first embodiment of the present invention provides an AVC intelligent control method based on an artificial neural network and deep reinforcement learning, including:
S1: Divide the substation into different sub-control areas by combining the situation prediction result of the reactive load of the power grid with the reactive load variation pattern at the new energy grid-connection point.
(1) Construct a deep neural network regression model based on a deep artificial neural network, and integrate multiple load regression results of the model to obtain the situation prediction result of the reactive load.
It should be noted that, before the deep neural network regression model is constructed, the load data are preprocessed by methods including denoising, normalization and whitening; the massive reactive load data are integrated and erroneous load data are removed, producing a reactive load data set with a complete structure, a standard format and a low error rate.
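The preprocessing steps are named but not specified in detail. As a rough sketch under stated assumptions, erroneous records are taken to be rows with missing or non-finite values, normalization is taken to be z-score standardization, and whitening is taken to be PCA whitening. Denoising is omitted, and the function name `preprocess` and the synthetic data are illustrative only.

```python
import numpy as np

def preprocess(load_data, eps=1e-8):
    """Drop bad rows, standardize each feature, then PCA-whiten.

    Assumptions: rows are samples and columns are features (at least two);
    the patent does not state which normalization or whitening variant is used.
    """
    x = np.asarray(load_data, dtype=float)
    x = x[np.isfinite(x).all(axis=1)]                  # remove erroneous rows
    x = (x - x.mean(axis=0)) / (x.std(axis=0) + eps)   # z-score normalization
    cov = np.cov(x, rowvar=False)                      # feature covariance
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Rotate onto principal axes and rescale each axis to unit variance.
    return (x @ eigvecs) / np.sqrt(np.maximum(eigvals, 0.0) + eps)

# Synthetic correlated data with one corrupted record appended.
rng = np.random.default_rng(0)
mixing = np.array([[1.0, 0.5, 0.0], [0.0, 1.0, 0.5], [0.0, 0.0, 1.0]])
raw = rng.normal(size=(200, 3)) @ mixing
raw = np.vstack([raw, [np.nan, 1.0, 2.0]])
clean = preprocess(raw)
```

After whitening, the features are decorrelated and have unit variance, which generally makes neural network training better conditioned.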
Specifically, constructing the deep neural network regression model comprises the following.
Based on the characteristics of the reactive load data, a regression model structure is constructed that takes into account climate environment, season, regional distribution, user load and the power grid dispatching control strategy. The middle layer of the structure receives input from the input layer, the input context layer, the middle context layer and the output context layer; the output layer receives input from the middle layer and the middle context layer. The mathematical formula of the regression model is:
Figure BDA0002775396190000071
where k is the order; x^(k) is the k-th order hidden-layer node unit vector; y^(k) is the k-th order output node vector; u^(k) is the k-th order input vector; the symbol in Figure BDA0002775396190000072 is the k-th order feedback state input vector; the symbol in Figure BDA0002775396190000073 is the k-th order feedback state vector; the symbol in Figure BDA0002775396190000074 is the k-th order hidden-layer output vector; ω_i (i = 1, 2, 3, 4, 5, 6) are the connection weight matrices of the layers; g(·) is the transfer function of the output neurons; and f(·) is the transfer function of the middle-layer neurons.
The k-th order feedback state vector (Figure BDA0002775396190000075) is:
Figure BDA0002775396190000076
The k-th order feedback state input vector (Figure BDA0002775396190000077) is:
Figure BDA0002775396190000078
The k-th order hidden-layer output vector (Figure BDA0002775396190000079) is:
Figure BDA00027753961900000710
where x^(k-1) is the (k-1)-th order hidden-layer node unit vector; the symbol in Figure BDA00027753961900000711 is the (k-1)-th order feedback state vector; u^(k-1) is the (k-1)-th order input vector; the symbol in Figure BDA00027753961900000712 is the (k-1)-th order feedback state input vector; y^(k-1) is the (k-1)-th order output node vector; the symbol in Figure BDA00027753961900000713 is the (k-1)-th order hidden-layer output vector; and η (η ≥ 0) and the symbol in Figure BDA00027753961900000716 are self-feedback gain factors.
It should be noted that, in this embodiment, g(·) is a linear function and f(·) is the Sigmoid function, which is given by:
Figure BDA00027753961900000715
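The formula image is not reproduced here, but the Sigmoid function has the standard form f(x) = 1 / (1 + e^(-x)). The two transfer functions named in this embodiment can be sketched as follows; the function names are illustrative.

```python
import numpy as np

def sigmoid(x):
    """Standard logistic function, used here for the middle-layer transfer f."""
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))

def linear(x):
    """Identity transfer, used here for the output-layer transfer g."""
    return np.asarray(x, dtype=float)

# The Sigmoid maps any real input into the open interval (0, 1).
mid_value = sigmoid(0.0)
```

The bounded Sigmoid keeps middle-layer activations in (0, 1), while the linear output layer leaves the regression output unbounded.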
Further, the multiple load regression results of the regression model are integrated on the basis of the preprocessed data set to obtain the situation prediction value of the reactive load.
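The text does not say how the regression results are integrated. One simple possibility, shown here purely as an assumption, is a plain or weighted average of the predictions of several regression models. The function name `integrate_predictions` and the values are hypothetical.

```python
import numpy as np

def integrate_predictions(predictions, weights=None):
    """Combine per-model load predictions into a single forecast.

    Assumption: predictions has shape (n_models, n_time_steps) and the
    integration is an arithmetic mean, optionally weighted per model.
    """
    preds = np.asarray(predictions, dtype=float)
    return np.average(preds, axis=0, weights=weights)

# Two hypothetical regression runs over two time steps.
plain = integrate_predictions([[1.0, 2.0], [3.0, 4.0]])
weighted = integrate_predictions([[1.0, 2.0], [3.0, 4.0]], weights=[0.25, 0.75])
```

Averaging several regression results tends to reduce the variance of the forecast compared with any single model.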
(2) Because reactive power compensation devices installed in different new energy plants are different and reactive voltage control methods of the reactive power compensation devices are different, firstly, reactive power characteristics of different reactive power sources are analyzed based on the near-area actual situation of the new energy plants; and (3) integrating the output and load characteristics of the new energy based on a cluster analysis method, and equivalently obtaining the equivalent load characteristic of the near region of the system energy field station to obtain the fluctuation rule of the node voltage under the equivalent load characteristic.
Specifically, cluster analysis (Clustering Analysis) groups objects according to the principle of maximizing intra-class similarity and minimizing inter-class similarity, and is a descriptive data-mining task.
This embodiment uses K-means to partition and cluster the data.
The data set D is divided into K classes, and the cluster quality is evaluated by the sum of squared errors, which is defined as follows:
Figure BDA0002775396190000081
where E is the sum of squared errors over all objects in the data set; p is the point in space that represents a given data object; dist(x, y) is the Euclidean distance in space between point x and point y.
Next, the number of clusters K is determined by the elbow method.
The inflection point of the SSE (sum of the squared errors) curve is located, and the K value at that point is taken as the number of clusters. The SSE is calculated as:
Figure BDA0002775396190000082
where $C_i$ is the i-th cluster, p is a sample point in $C_i$, and $m_i$ is the centroid of $C_i$.
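The K-means partitioning and the elbow method described above can be sketched as follows. This is a minimal illustration and not the patent's implementation: the synthetic data, the farthest-point initialization, and the "largest second difference of the SSE" elbow rule are all assumptions made here for the example.

```python
import numpy as np

def init_farthest(points, k):
    """Deterministic farthest-point initialization (an assumption of this sketch)."""
    centroids = [points[0]]
    for _ in range(k - 1):
        diffs = points[:, None, :] - np.array(centroids)[None, :, :]
        nearest = np.linalg.norm(diffs, axis=2).min(axis=1)
        centroids.append(points[nearest.argmax()])
    return np.array(centroids)

def assign(points, centroids):
    """Label each point with the index of its nearest centroid (Euclidean distance)."""
    diffs = points[:, None, :] - centroids[None, :, :]
    return np.linalg.norm(diffs, axis=2).argmin(axis=1)

def kmeans(points, k, n_iter=100):
    """Lloyd's K-means; returns (centroids, labels)."""
    centroids = init_farthest(points, k)
    for _ in range(n_iter):
        labels = assign(points, centroids)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids, assign(points, centroids)

def sse(points, centroids, labels):
    """Sum over clusters C_i of |p - m_i|^2 for every sample p in C_i."""
    return float(np.sum((points - centroids[labels]) ** 2))

def elbow(sse_by_k):
    """Pick the K at which the SSE curve bends most (largest second difference)."""
    ks = sorted(sse_by_k)
    bend = {k: sse_by_k[k - 1] - 2 * sse_by_k[k] + sse_by_k[k + 1] for k in ks[1:-1]}
    return max(bend, key=bend.get)

# Hypothetical data: three well-separated 2-D groups of 40 samples each.
rng = np.random.default_rng(1)
blob_centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
data = np.vstack([c + 0.3 * rng.standard_normal((40, 2)) for c in blob_centers])

sse_by_k = {k: sse(data, *kmeans(data, k)) for k in range(1, 7)}
best_k = elbow(sse_by_k)
print(sse_by_k, best_k)
```

In practice the SSE curve is often inspected visually; the second-difference rule is only one simple way to automate that choice.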
The region division is shown in Fig. 2; the regions are controlled by 2 different agents, one per region.
S2: Optimize the action utility function based on the Bellman equation and the minimized loss function, and obtain a decision metric function by combining the action utility function.
The minimized loss function is defined as:
Figure BDA0002775396190000083
where
Figure BDA0002775396190000084
is the minimized loss function that takes the training parameter
Figure BDA0002775396190000085
as its independent variable; E is the expected value, s is the current system state, s′ is the environment state at the next moment, a is the action selected in the corresponding state,
Figure BDA0002775396190000086
is the experience pool, and $y_i$ is obtained by applying the Bellman equation to
Figure BDA0002775396190000087
to estimate its true value.
Specifically,
Figure BDA0002775396190000088
where $r_i$ is the return value obtained in the i-th iteration; μ is the decision value; γ is the decay rate, with γ ∈ [0, 1]: when γ = 0, only the immediate return is considered and the long-term return is ignored, and when γ = 1, the long-term return and the immediate return are treated as equally important; $Q_i'$ is the Q-value function of the target Critic network for the next state; s′ is the next state entered by taking action a in system state s; a′ is the action that the target Actor network
Figure BDA0002775396190000091
selects in system state s′;
Figure BDA0002775396190000092
is the parameter of the target Actor network;
Figure BDA0002775396190000093
is the parameter of the target Critic network.
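The Bellman target and the loss described above can be illustrated with a short sketch. The equation images are not reproduced in this text, so the form used here, $y_i = r_i + \gamma Q'(s', \mu'(s'))$ with a mean-squared-error loss, is the usual actor-critic formulation and is assumed rather than taken from the patent. The closed-form `target_actor`, `target_critic` and `critic` functions are hypothetical stand-ins for the neural networks.

```python
import numpy as np

def td_targets(rewards, next_states, target_actor, target_critic, gamma):
    """y_i = r_i + gamma * Q'(s', mu'(s')), evaluated with the target networks."""
    next_actions = target_actor(next_states)
    return rewards + gamma * target_critic(next_states, next_actions)

def critic_loss(critic, states, actions, targets):
    """Mean squared error between the targets y_i and Q(s, a) over the batch."""
    return float(np.mean((targets - critic(states, actions)) ** 2))

# Hypothetical stand-ins for the networks (simple closed-form functions).
target_actor = lambda s: 0.5 * s
target_critic = lambda s, a: s + 2.0 * a
critic = lambda s, a: s - a

# A two-sample batch drawn from a hypothetical experience pool.
states = np.array([1.0, 3.0])
actions = np.array([0.0, 1.0])
rewards = np.array([1.0, 0.0])
next_states = np.array([2.0, 4.0])

y = td_targets(rewards, next_states, target_actor, target_critic, gamma=0.9)
loss = critic_loss(critic, states, actions, y)
print(y, loss)
```

With γ = 0 the targets reduce to the immediate returns, which matches the remark above about γ = 0 ignoring the long-term return.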
Further, the parameters of the target Critic network and the parameters of the target Actor network are updated.
This embodiment uses the Adaptive Moment Estimation (Adam) optimization algorithm to update the parameters.
It should be noted that the momentum gradient descent part of the Adam optimization algorithm (an exponentially weighted average of the gradients) is:
$v_{dW} = \beta_1 v_{dW} + (1 - \beta_1)\, dW$
$v_{db} = \beta_1 v_{db} + (1 - \beta_1)\, db$
The RMSprop part of the Adam optimization algorithm (an exponentially weighted average of the squared gradients) is:
$S_{dW} = \beta_2 S_{dW} + (1 - \beta_2)\, dW^2$
$S_{db} = \beta_2 S_{db} + (1 - \beta_2)\, db^2$
where $\beta_1$ is the first-moment coefficient and $\beta_2$ is the second-moment coefficient.
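The two moment estimates above can be combined into a full Adam step as in the following sketch. The bias correction, the `eps` term, and the final parameter update are the standard Adam formulation and are not spelled out in the text above; the quadratic objective and the learning rate are hypothetical choices made for illustration.

```python
import numpy as np

def adam_step(w, grad, v, s, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a parameter w (scalar or array); t starts at 1."""
    v = beta1 * v + (1.0 - beta1) * grad          # momentum part (first moment)
    s = beta2 * s + (1.0 - beta2) * grad ** 2     # RMSprop part (second moment)
    v_hat = v / (1.0 - beta1 ** t)                # bias correction (standard Adam)
    s_hat = s / (1.0 - beta2 ** t)
    w = w - lr * v_hat / (np.sqrt(s_hat) + eps)
    return w, v, s

# Hypothetical objective f(w) = (w - 3)^2 with gradient 2 * (w - 3).
w, v, s = 0.0, 0.0, 0.0
w1, v1, s1 = adam_step(w, 2.0 * (w - 3.0), v, s, t=1, lr=0.1)

for t in range(1, 1001):
    w, v, s = adam_step(w, 2.0 * (w - 3.0), v, s, t, lr=0.1)
print(w1, w)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly one learning rate regardless of the gradient magnitude.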
Thus, through the parameters of the actual Actor network
Figure BDA0002775396190000094
the target Actor network parameters are updated:
Figure BDA0002775396190000095
and through the parameters of the actual Critic network
Figure BDA0002775396190000096
the target Critic network parameters are updated:
Figure BDA0002775396190000097
where τ controls the update rate, and τ < 1 is usually satisfied.
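A target-network update governed by a rate τ is commonly written as a soft (Polyak) update, target ← τ · actual + (1 − τ) · target. The update equations of the patent are images that are not reproduced here, so this form is an assumption; the sketch below only illustrates how τ sets the speed at which the target parameters track the actual ones.

```python
import numpy as np

def soft_update(target_params, actual_params, tau):
    """Return new target parameters: tau * actual + (1 - tau) * target, per array."""
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(actual_params, target_params)]

# Hypothetical parameter lists (one weight array each).
actual = [np.ones(2)]
target = [np.zeros(2)]

target_1 = soft_update(target, actual, tau=0.1)     # after one update
target_2 = soft_update(target_1, actual, tau=0.1)   # after two updates
print(target_1, target_2)
```

A small τ makes the target networks change slowly, which is the usual reason for keeping τ well below 1.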
Still further, the action utility function $Q_i(s, a)$ is defined as the expectation of the sum of the rewards subsequently obtained by the agent in the i-th area after it performs action a in system state s:
$Q_i(s, a) = E\left(r(s, a) + \gamma \max_{a'} Q_i(s', a')\right)$
where r(s, a) is the return value after executing action a in system state s, and $Q_i(s', a')$ is the goodness of taking action a′ in system state s′.
The decision metric function is then:
Figure BDA0002775396190000101
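The recursive definition of the action utility function given above can be made concrete with a small tabular example. The two-state, two-action environment below is entirely hypothetical and deterministic, so the expectation reduces to a single term; the patent itself approximates Q with a Critic network rather than a table.

```python
import numpy as np

def q_iteration(rewards, next_state, gamma, n_sweeps=200):
    """Repeatedly apply Q(s,a) = r(s,a) + gamma * max_a' Q(s',a') until it settles."""
    q = np.zeros(rewards.shape)
    for _ in range(n_sweeps):
        # q[next_state] has shape (S, A, A): the Q row of the successor of each (s, a).
        q = rewards + gamma * q[next_state].max(axis=-1)
    return q

# Hypothetical environment: action 0 stays in the state, action 1 switches state.
rewards = np.array([[0.0, 1.0],
                    [2.0, 0.0]])
next_state = np.array([[0, 1],
                       [1, 0]])

q_star = q_iteration(rewards, next_state, gamma=0.5)
print(q_star)
```

Here staying in state 1 earns 2 per step, so Q(1, stay) = 2 / (1 − 0.5) = 4, and the other entries follow from the same recursion.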
S3: Train the agent by optimizing its decision model parameters using the gradient of the decision metric function.
The decision model of the agent in the i-th area is optimized through the gradient of the decision metric function, which completes the training of the agent.
Specifically, the gradient with respect to the decision function parameter of the i-th area agent
Figure BDA0002775396190000102
is:
Figure BDA0002775396190000104
where
Figure BDA0002775396190000105
is the gradient operator; $a_i$ is the action value of the i-th iteration;
Figure BDA0002775396190000106
is the gradient of the action utility function at the i-th iteration, and
Figure BDA0002775396190000107
is the gradient of the target Actor network
Figure BDA0002775396190000108
at the i-th iteration.
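The gradient described above combines the gradient of the action utility function with the gradient of the Actor network. A minimal sketch of that chain rule, in the form commonly used for deterministic policy gradients (mean over samples of ∂Q/∂a · ∂μ/∂θ), is given below. The exact expression in the patent is an image that is not reproduced here, and the scalar linear actor and quadratic critic are hypothetical stand-ins chosen so that the optimum is known.

```python
import numpy as np

def policy_gradient(theta, states, actor, dactor_dtheta, dq_da):
    """Estimate dJ/dtheta as the batch mean of dQ/da (at a = mu(s)) times dmu/dtheta."""
    actions = actor(theta, states)
    return float(np.mean(dq_da(states, actions) * dactor_dtheta(theta, states)))

# Hypothetical actor mu(s) = theta * s and critic Q(s, a) = -(a - 2 s)^2,
# for which the best policy is theta = 2.
actor = lambda theta, s: theta * s
dactor_dtheta = lambda theta, s: s
dq_da = lambda s, a: -2.0 * (a - 2.0 * s)

states = np.linspace(-1.0, 1.0, 11)
theta = 0.0
for _ in range(200):
    # Gradient ascent on the decision metric.
    theta += 0.1 * policy_gradient(theta, states, actor, dactor_dtheta, dq_da)
print(theta)
```

In the method itself both functions are neural networks, and the update would be carried out with the Adam optimizer rather than plain gradient ascent.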
S4: Input the situation prediction results of the different sub-regions and the reactive power change rule of the new energy into the agent, and calculate the voltage control quantity of the power system through the agent to control the reactive voltage of the power grid.
Based on Newton-Raphson power flow calculation, the voltage control quantity is calculated as:
Figure BDA0002775396190000109
where U is the voltage control quantity, $M_i$ is the modulation index of the voltage source converter, and $U_d$ is the fundamental voltage of the DC node.
Example 2
To verify and explain the technical effect of the method, this embodiment carries out comparative voltage control tests on a new energy power station without output fluctuation and on one with output fluctuation, and compares the test results to verify the actual effect of the method.
(1) Analysis of the voltage control results when the new energy power station has no output fluctuation
First, the effect of the invention on power system voltage control is analyzed for the case in which the output of the new energy power station is relatively stable. In this case, the active output of every generator set in the power system (including the new energy generator sets) and the load stay near relatively stable values throughout real-time voltage control. The active output of the generators and the load are therefore assumed to remain unchanged while each agent interacts with the power system environment, and only the change in generator terminal voltage caused by generator excitation regulation is considered.
Power system operating state data samples are generated by random sampling. The first 70% of the operating states, whose node loads vary within 0.8-1.2 times the rated load, are used to train the agents of the two areas, and the last 30%, whose node loads vary within 0.7-1.3 times the rated load, are used as the verification set of the regression model.
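The sample generation described above can be sketched as follows. The patent does not state the sampling distribution, the number of samples, or the rated loads, so the uniform distribution, the 1000-sample size, and the `rated_load` values used here are all hypothetical.

```python
import numpy as np

def sample_loads(rated_load, n_samples, low, high, rng):
    """Draw node loads as rated load times a random factor in [low, high)."""
    factors = rng.uniform(low, high, size=(n_samples, rated_load.size))
    return factors * rated_load

rng = np.random.default_rng(0)
rated_load = np.array([1.0, 0.8, 1.5])   # hypothetical rated loads of three nodes
n_total = 1000                           # hypothetical number of operating states
n_train = n_total * 7 // 10              # first 70% for training

train_set = sample_loads(rated_load, n_train, 0.8, 1.2, rng)
val_set = sample_loads(rated_load, n_total - n_train, 0.7, 1.3, rng)
print(train_set.shape, val_set.shape)
```

The verification set deliberately covers a wider load range than the training set, so it also probes operating states slightly outside what the agents saw during training.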
As can be seen from Fig. 7 and Fig. 8, as training proceeds, the loss function of each agent's Actor network first rises markedly, then falls, and finally converges to a stable value. This is because the neural network parameters are initialized randomly: early in training, the output of the Actor network cannot regulate the generator terminal voltage effectively, so the power system voltage goes out of limit and the loss function is high. As the neural network parameters are continuously updated, setting the generator terminal voltage according to the output of the Actor network brings the voltage level of the power system under effective control and the loss function keeps decreasing, which shows that the training algorithm proposed by the method can train the regression model effectively.
Comparing the loss function curves of the two agents shows that the Actor network loss function of agent 1 falls significantly faster than that of agent 2, and that the Critic network loss function of agent 1 fluctuates significantly less than that of agent 2 once it has converged. This is because area 1, controlled by agent 1, has fewer nodes than area 2, controlled by agent 2, and because the control action of agent 2 can also be used to control the node voltage of area 1, whereas the node voltage of area 2 is controlled by agent 2 only. This indicates that, for the training strategy proposed by the method, the fewer nodes an area has and the more units in it that can control the node voltage, the easier the corresponding agent model is to train.
In Fig. 9, the gray line in the left graph is the total reward obtained by each agent in each episode of interaction, and the black line is the smoothed total reward curve; the black dots in the right graph show the number of actions each agent needs in each episode to bring the voltage within limits. Fig. 9 shows that during training the total reward obtained by each agent keeps increasing, while the number of actions each agent needs to bring the voltage within limits keeps decreasing. In other words, with continued training each agent uses as few actions as possible to bring the voltage within limits, and when tested after training, an agent needs only one or two actions to keep the voltage from going out of limit.
An operating state from the test is taken, the sum and the average of the node voltages in each agent's control area are calculated before and after control, and the results are shown in Fig. 10. In Fig. 10, the dotted lines mark the upper and lower limits of the node voltage, the scattered gray dots are the voltages of the individual nodes before control, the black dotted line is the average node voltage before control, and the triangle-marked dotted line is the average node voltage after control. Before control, the node voltages are high overall and exceed the upper limit; after control by the agents, the node voltages move toward the voltage reference value of 1.0 and their average is close to 1.0, which shows that the out-of-limit node voltages are effectively brought under control.
(2) Analysis of the voltage control results considering output fluctuation of the new energy power station
When new energy fluctuation is considered, the uncertainty of the output of the new energy units increases, so during real-time voltage control the output of the new energy units should be treated as a variable. That is, while each agent interacts with the power system environment, the active output of the new energy units is allowed to change, and the change in generator terminal voltage caused by generator excitation regulation is considered at the same time.
Likewise, power system operating state data samples are generated by random sampling, but during the interaction between the agents and the environment the load power of node 2 and the wind turbine output of node 3 are adjusted dynamically. In each interaction step, the range of the randomly generated load and wind turbine output is widened, relative to that used for sample generation, to 0.5-1.3 times the rated power, to reflect the stronger uncertainty of the new energy unit output.
Fig. 11 and Fig. 12 show the loss function curves of the Actor network and the Critic network when new energy output fluctuation is considered. Compared with the convergence of the loss functions in Fig. 7 and Fig. 8, the Actor network loss function falls more slowly and converges to a higher value than when new energy fluctuation is not considered. The Critic network loss function decreases after a certain number of training iterations, but it has difficulty converging to a stable value and keeps fluctuating with a large amplitude, which shows that the model is harder to train when new energy fluctuation is considered.
Fig. 13 shows the number of actions each agent needs to bring the node voltages of its control area within limits when new energy fluctuation is considered. Compared with Fig. 9, each agent needs more control actions, up to 50 or more. However, as training proceeds, the number of actions needed to bring the voltage within limits keeps decreasing and is eventually kept essentially below 5, which shows that, although the model is harder to train when new energy fluctuation is considered, a model with an effective control capability can still be obtained through training.
It should be noted that the above-mentioned embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, which should be covered by the claims of the present invention.

Claims (7)

1. An AVC intelligent control method based on an artificial neural network and deep reinforcement learning, characterized by comprising:
dividing the transformer substation into different sub-control areas by combining the situation prediction result of the reactive load of the power grid and the reactive load change rule of the new energy grid-connected point;
optimizing an action utility function based on a Bellman equation and a minimized loss function, and obtaining a decision metric function by combining the action utility function;
training the agent by optimizing decision model parameters of the agent using the gradient of the decision metric function;
inputting situation prediction results of different sub-control areas and a new energy reactive power change rule into the intelligent agent, and calculating the voltage control quantity of the power system through the intelligent agent to control the reactive voltage of the power grid;
wherein the step of obtaining the situation prediction result comprises:
constructing a deep neural network regression model based on a deep artificial neural network, and integrating a plurality of regression load results of the deep neural network regression model to further obtain a situation prediction result of the reactive load;
the constructing of the deep neural network regression model comprises the following steps:
based on the reactive load data characteristics, considering the climate environment, season, regional distribution, user load and power grid dispatching control strategy, constructing the deep neural network regression model:
Figure FDA0003525993510000011
Figure FDA0003525993510000012
wherein k is the order; $x^{(k)}$ is the k-order hidden layer node unit vector; $y^{(k)}$ is the k-order output node vector; $u^{(k)}$ is the k-order input vector;
Figure FDA0003525993510000013
is the k-order feedback state input vector;
Figure FDA0003525993510000014
is the k-order feedback state vector;
Figure FDA0003525993510000015
is the k-order hidden layer output vector; $\omega_i$ ($i = 1, 2, \dots, 6$) is the connection weight matrix of each layer; g() is the transfer function of the output neurons; f() is the transfer function of the middle-layer neurons;
the deep neural network regression model further includes,
Figure FDA0003525993510000016
Figure FDA0003525993510000017
Figure FDA0003525993510000018
wherein $x^{(k-1)}$ is the (k-1)-order hidden layer node unit vector;
Figure FDA0003525993510000019
is the (k-1)-order feedback state vector; $u^{(k-1)}$ is the (k-1)-order input vector;
Figure FDA00035259935100000110
is the (k-1)-order feedback state input vector; $y^{(k-1)}$ is the (k-1)-order output node vector;
Figure FDA00035259935100000111
is the (k-1)-order hidden layer output vector; $\eta$ and
Figure FDA00035259935100000112
are self-feedback gain factors.
2. The AVC intelligent control method based on artificial neural network and deep reinforcement learning of claim 1, wherein: the minimized loss function comprises,
defining the minimized loss function:
Figure FDA0003525993510000021
wherein
Figure FDA0003525993510000022
is the minimized loss function that takes the training parameter
Figure FDA0003525993510000023
as its independent variable; E is the expected value, s is the current system state, s′ is the environment state at the next moment, a is the action selected in the corresponding state,
Figure FDA0003525993510000024
is the experience pool, and $y_i$ is obtained by applying the Bellman equation to
Figure FDA0003525993510000025
to estimate its true value.
3. The AVC intelligent control method based on artificial neural network and deep reinforcement learning of claim 2, wherein: estimating the true value comprises,
Figure FDA0003525993510000026
wherein $r_i$ is the return value obtained in the i-th iteration; μ is the decision value; γ is the decay rate, with γ ∈ [0, 1]; $Q_i'$ is the Q-value function of the target Critic network for the next state; s′ is the next state entered by taking action a in system state s; a′ is the action that the target Actor network
Figure FDA0003525993510000027
selects in system state s′;
Figure FDA0003525993510000028
is the parameter of the target Actor network;
Figure FDA0003525993510000029
is the parameter of the target Critic network.
4. The AVC intelligent control method based on artificial neural network and deep reinforcement learning of claim 3, wherein: updating the parameters of the target Critic network and the parameters of the target Actor network comprises,
through the parameters of the actual Actor network
Figure FDA00035259935100000210
updating the target Actor network parameters:
Figure FDA00035259935100000211
through the parameters of the actual Critic network
Figure FDA00035259935100000212
updating the target Critic network parameters:
Figure FDA00035259935100000213
wherein τ controls the update rate.
5. The AVC intelligent control method based on artificial neural network and deep reinforcement learning of claim 4, wherein: the decision metric function comprises,
defining the action utility function $Q_i(s, a)$ as the expectation of the sum of the rewards subsequently obtained by the agent in the i-th area after action a is executed in system state s, the decision metric function then being:
Figure FDA0003525993510000031
6. The AVC intelligent control method based on artificial neural network and deep reinforcement learning of claim 5, wherein: the gradient of the decision metric function comprises,
the gradient with respect to the decision function parameter of the i-th area agent
Figure FDA0003525993510000032
is:
Figure FDA0003525993510000033
wherein
Figure FDA0003525993510000034
is the gradient operator; $a_i$ is the action value of the i-th iteration;
Figure FDA0003525993510000035
is the gradient of the action utility function at the i-th iteration, and
Figure FDA0003525993510000036
is the gradient of the target Actor network
Figure FDA0003525993510000037
at the i-th iteration.
7. The AVC intelligent control method based on artificial neural network and deep reinforcement learning of claim 6, wherein: calculating the voltage control quantity comprises,
based on Newton-Raphson power flow calculation, the voltage control quantity is calculated as:
Figure FDA0003525993510000038
wherein U is the voltage control quantity, $M_i$ is the modulation index of the voltage source converter, and $U_d$ is the fundamental voltage of the DC node.
CN202011263523.7A 2020-11-12 2020-11-12 AVC intelligent control method based on artificial neural network and deep reinforcement learning Active CN112465664B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011263523.7A CN112465664B (en) 2020-11-12 2020-11-12 AVC intelligent control method based on artificial neural network and deep reinforcement learning

Publications (2)

Publication Number Publication Date
CN112465664A CN112465664A (en) 2021-03-09
CN112465664B true CN112465664B (en) 2022-05-03

Family

ID=74825674

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113283156B (en) * 2021-03-29 2023-09-15 北京建筑大学 Energy-saving control method for subway station air conditioning system based on deep reinforcement learning
CN112924177B (en) * 2021-04-02 2022-07-19 哈尔滨理工大学 Rolling bearing fault diagnosis method for improved deep Q network
CN113300379B (en) * 2021-05-08 2022-04-29 武汉大学 Electric power system reactive voltage control method and system based on deep learning
CN113363997B (en) * 2021-05-28 2022-06-14 浙江大学 Reactive voltage control method based on multi-time scale and multi-agent deep reinforcement learning
CN113489015B (en) * 2021-06-17 2024-01-26 清华大学 Multi-time-scale reactive voltage control method for power distribution network based on reinforcement learning
CN113725863A (en) * 2021-07-30 2021-11-30 国家电网有限公司 Power grid autonomous control and decision method and system based on artificial intelligence
CN114400675B (en) * 2022-01-21 2023-04-07 合肥工业大学 Active power distribution network voltage control method based on weight mean value deep double-Q network
CN115081702A (en) * 2022-06-14 2022-09-20 国网信息通信产业集团有限公司 Power load prediction method with interpretable characteristic, system and terminal

Citations (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103490431A (en) * 2013-09-29 2014-01-01 华南理工大学 Power distribution network voltage reactive power optimization method based on BART algorithm
CN106713935A (en) * 2017-01-09 2017-05-24 杭州电子科技大学 Fast method for HEVC (High Efficiency Video Coding) block size partition based on Bayes decision
CN106709820A (en) * 2017-01-11 2017-05-24 中国南方电网有限责任公司电网技术研究中心 Electrical power system load prediction method and device based on depth belief network
CN107257133A (en) * 2017-06-12 2017-10-17 浙江群力电气有限公司 A kind of idle work optimization method, device and AVC systems
CN107423839A (en) * 2017-04-17 2017-12-01 湘潭大学 A kind of method of the intelligent building microgrid load prediction based on deep learning
CN107634866A (en) * 2017-10-27 2018-01-26 朱秋华 A kind of distribution network communication system performance estimating method and device
CN108495129A (en) * 2018-03-22 2018-09-04 北京航空航天大学 The complexity optimized method and device of block partition encoding based on deep learning method
WO2018187632A1 (en) * 2017-04-05 2018-10-11 Carnegie Mellon University Deep learning methods for estimating density and/or flow of objects, and related methods and software
CN108964023A (en) * 2018-06-29 2018-12-07 国网上海市电力公司 A kind of busbar voltage situation short term prediction method and system for power grid
CN109343341A (en) * 2018-11-21 2019-02-15 北京航天自动控制研究所 It is a kind of based on deeply study carrier rocket vertically recycle intelligent control method
CN109698556A (en) * 2019-02-25 2019-04-30 深圳市广前电力有限公司 The control method and logical construction of smart grid AVC substation system interface
CN110087092A (en) * 2019-03-11 2019-08-02 西安电子科技大学 Low bit-rate video decoding method based on image reconstruction convolutional neural networks
KR20190109868A (en) * 2018-03-19 2019-09-27 삼성전자주식회사 System and control method of system for processing sound data
CN110474339A (en) * 2019-08-07 2019-11-19 国网福建省电力有限公司 A kind of electric network reactive-load control method based on the prediction of depth generation load
CN110535146A (en) * 2019-08-27 2019-12-03 哈尔滨工业大学 The Method for Reactive Power Optimization in Power of Policy-Gradient Reinforcement Learning is determined based on depth
CN110545416A (en) * 2019-09-03 2019-12-06 国家广播电视总局广播电视科学研究院 ultra-high-definition film source detection method based on deep learning
CN110676842A (en) * 2019-09-23 2020-01-10 南方电网科学研究院有限责任公司 Power distribution network reconstruction and modeling solving method and device for minimally removing fault area
CN110738010A (en) * 2019-10-17 2020-01-31 湖南科技大学 Wind power plant short-term wind speed prediction method integrated with deep learning model
CN110866640A (en) * 2019-11-11 2020-03-06 山东科技大学 Power load prediction method based on deep neural network
CN110958680A (en) * 2019-12-09 2020-04-03 长江师范学院 Energy efficiency-oriented unmanned aerial vehicle cluster multi-agent deep reinforcement learning optimization method
CN111130053A (en) * 2020-01-08 2020-05-08 华南理工大学 Power distribution network overcurrent protection method based on deep reinforcement learning
CN111460650A (en) * 2020-03-31 2020-07-28 北京航空航天大学 Unmanned aerial vehicle end-to-end control method based on deep reinforcement learning
CN111884213A (en) * 2020-07-27 2020-11-03 国网北京市电力公司 Power distribution network voltage adjusting method based on deep reinforcement learning algorithm

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080291999A1 (en) * 2007-05-24 2008-11-27 Julien Lerouge Method and apparatus for video frame marking
US7802286B2 (en) * 2007-07-24 2010-09-21 Time Warner Cable Inc. Methods and apparatus for format selection for network optimization
US9806991B2 (en) * 2015-01-21 2017-10-31 Cisco Technology, Inc. Rendering network policy and monitoring compliance

Patent Citations (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103490431A (en) * 2013-09-29 2014-01-01 华南理工大学 Power distribution network voltage reactive power optimization method based on BART algorithm
CN106713935A (en) * 2017-01-09 2017-05-24 杭州电子科技大学 Fast method for HEVC (High Efficiency Video Coding) block size partition based on Bayes decision
CN106709820A (en) * 2017-01-11 2017-05-24 中国南方电网有限责任公司电网技术研究中心 Electrical power system load prediction method and device based on depth belief network
WO2018187632A1 (en) * 2017-04-05 2018-10-11 Carnegie Mellon University Deep learning methods for estimating density and/or flow of objects, and related methods and software
CN107423839A (en) * 2017-04-17 2017-12-01 湘潭大学 A kind of method of the intelligent building microgrid load prediction based on deep learning
CN107257133A (en) * 2017-06-12 2017-10-17 浙江群力电气有限公司 A kind of idle work optimization method, device and AVC systems
CN107634866A (en) * 2017-10-27 2018-01-26 朱秋华 A kind of distribution network communication system performance estimating method and device
KR20190109868A (en) * 2018-03-19 2019-09-27 삼성전자주식회사 System and control method of system for processing sound data
CN108495129A (en) * 2018-03-22 2018-09-04 北京航空航天大学 The complexity optimized method and device of block partition encoding based on deep learning method
CN108964023A (en) * 2018-06-29 2018-12-07 国网上海市电力公司 A kind of busbar voltage situation short term prediction method and system for power grid
CN109343341A (en) * 2018-11-21 2019-02-15 北京航天自动控制研究所 It is a kind of based on deeply study carrier rocket vertically recycle intelligent control method
CN109698556A (en) * 2019-02-25 2019-04-30 深圳市广前电力有限公司 The control method and logical construction of smart grid AVC substation system interface
CN110087092A (en) * 2019-03-11 2019-08-02 西安电子科技大学 Low bit-rate video decoding method based on image reconstruction convolutional neural networks
CN110474339A (en) * 2019-08-07 2019-11-19 国网福建省电力有限公司 A kind of electric network reactive-load control method based on the prediction of depth generation load
CN110535146A (en) * 2019-08-27 2019-12-03 哈尔滨工业大学 The Method for Reactive Power Optimization in Power of Policy-Gradient Reinforcement Learning is determined based on depth
CN110545416A (en) * 2019-09-03 2019-12-06 国家广播电视总局广播电视科学研究院 ultra-high-definition film source detection method based on deep learning
CN110676842A (en) * 2019-09-23 2020-01-10 南方电网科学研究院有限责任公司 Power distribution network reconstruction and modeling solving method and device for minimally removing fault area
CN110738010A (en) * 2019-10-17 2020-01-31 湖南科技大学 Wind power plant short-term wind speed prediction method integrated with deep learning model
CN110866640A (en) * 2019-11-11 2020-03-06 山东科技大学 Power load prediction method based on deep neural network
CN110958680A (en) * 2019-12-09 2020-04-03 长江师范学院 Energy efficiency-oriented unmanned aerial vehicle cluster multi-agent deep reinforcement learning optimization method
CN111130053A (en) * 2020-01-08 2020-05-08 华南理工大学 Power distribution network overcurrent protection method based on deep reinforcement learning
CN111460650A (en) * 2020-03-31 2020-07-28 北京航空航天大学 Unmanned aerial vehicle end-to-end control method based on deep reinforcement learning
CN111884213A (en) * 2020-07-27 2020-11-03 国网北京市电力公司 Power distribution network voltage adjusting method based on deep reinforcement learning algorithm

Also Published As

Publication number Publication date
CN112465664A (en) 2021-03-09

Similar Documents

Publication Publication Date Title
CN112465664B (en) AVC intelligent control method based on artificial neural network and deep reinforcement learning
Lin et al. An improved moth-flame optimization algorithm for support vector machine prediction of photovoltaic power generation
CN102270309B (en) Short-term electric load prediction method based on ensemble learning
CN110009529B (en) Transient frequency acquisition method based on stack noise reduction automatic encoder
CN109787236A (en) A kind of power system frequency Tendency Prediction method based on deep learning
CN112507614B (en) Comprehensive optimization method for power grid in distributed power supply high-permeability area
CN105631483A (en) Method and device for predicting short-term power load
CN113363998B (en) Power distribution network voltage control method based on multi-agent deep reinforcement learning
CN104636985A (en) Method for predicting radio disturbance of electric transmission line by using improved BP (back propagation) neural network
Wan et al. Data-driven hierarchical optimal allocation of battery energy storage system
Ye et al. Combined approach for short-term wind power forecasting based on wave division and Seq2Seq model using deep learning
CN113471982B (en) Cloud edge cooperation and power grid privacy protection distributed power supply in-situ voltage control method
CN114362175B (en) Wind power prediction method and system based on deep deterministic policy gradient algorithm
Roukerd et al. Probabilistic-possibilistic flexibility-based unit commitment with uncertain negawatt demand response resources considering Z-number method
Zou et al. Wind turbine power curve modeling using an asymmetric error characteristic-based loss function and a hybrid intelligent optimizer
Li et al. Short term prediction of photovoltaic power based on FCM and CG-DBN combination
CN107239850A (en) Medium- and long-term power load forecasting method based on system dynamics model
CN104834975A (en) Power network load factor prediction method based on intelligent algorithm optimization combination
CN109242136A (en) Microgrid wind power chaos-genetic-BP neural network prediction method
CN109858665A (en) Photovoltaic short-term power prediction method based on feature selection and ANFIS-PSO
CN103618315B (en) Grid voltage reactive power optimization method based on BART algorithm and super-absorbent wall
CN107706938B (en) Wind power fluctuation interval analysis method based on quantile regression
CN112365074A (en) Artificial intelligence decision-making method based on power grid regulation and control data
CN111799820A (en) Double-layer intelligent hybrid zero-star cloud energy storage countermeasure regulation and control method for power system
Li et al. Reactive power convex optimization of active distribution network based on Improved Grey Wolf Optimizer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant