CN112564118A - Distributed real-time voltage control method capable of expanding quantum deep width learning - Google Patents


Info

Publication number
CN112564118A
Authority
CN
China
Prior art keywords
action
representing
network
quantum
expandable
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011319512.6A
Other languages
Chinese (zh)
Other versions
CN112564118B (en)
Inventor
殷林飞
陆悦江
陆造树
高放
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangxi University
Original Assignee
Guangxi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangxi University
Priority to CN202011319512.6A
Publication of CN112564118A
Application granted
Publication of CN112564118B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00 Circuit arrangements for ac mains or ac distribution networks
    • H02J3/12 Circuit arrangements for ac mains or ac distribution networks for adjusting voltage in ac networks by changing a characteristic of the network load
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Electricity, gas or water supply
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00 Circuit arrangements for ac mains or ac distribution networks
    • H02J3/12 Circuit arrangements for ac mains or ac distribution networks for adjusting voltage in ac networks by changing a characteristic of the network load
    • H02J3/16 Circuit arrangements for ac mains or ac distribution networks for adjusting voltage in ac networks by changing a characteristic of the network load by adjustment of reactive power
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2203/00 Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
    • H02J2203/20 Simulating, e.g. planning, reliability check, modelling or computer assisted design [CAD]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E40/00 Technologies for an efficient electrical power generation, transmission or distribution
    • Y02E40/30 Reactive power compensation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E40/00 Technologies for an efficient electrical power generation, transmission or distribution
    • Y02E40/70 Smart grids as climate change mitigation technology in the energy generation sector
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Abstract

The invention provides a distributed real-time voltage control method based on expandable quantum deep width learning. The method combines a distributed structure with an expandable quantum deep width learning method and is used for voltage control of power systems. First, the method combines the ideas of deep learning and width (broad) learning, introduces the density matrix of quantum mechanics, and proposes an expandable quantum deep width neural network. Second, the method uses the expandable quantum deep width neural network to fit the four networks of the deep deterministic policy gradient (DDPG) structure, which yields the expandable quantum deep width learning method. Finally, the voltage of the power system is controlled in real time through the distributed structure. The method can achieve real-time global optimal control of the power system voltage, reduce the computational burden on the controllers while maintaining control accuracy, speed up computation, relax the communication-reliability requirements of the voltage control process, and preserve the privacy of each regional power system's information.

Description

Distributed real-time voltage control method capable of expanding quantum deep width learning
Technical Field
The invention belongs to the field of power system voltage control and relates to a real-time voltage control method using distributed artificial intelligence, suitable for real-time control of power system voltage.
Background
As more and more intermittent energy sources are connected to the power system, the uncertainty of their reactive power output increases the risk of grid voltage limit violations and tightens the real-time requirements on voltage control. Existing grid voltage control is mainly centralized three-layer voltage control, in which voltage control instructions must be passed down layer by layer before they are executed, so it is difficult to apply to power systems containing a large amount of intermittent energy. A real-time voltage control method for the power system is therefore needed.
In recent years, with the rapid increase in the number of distributed power sources, the scale of power systems has also grown rapidly. When the traditional centralized three-layer voltage control method faces a large number of generating units, the central controller must process a large amount of information from the whole network, so its computational load becomes excessive; centralized control also places high demands on the reliability of the communication technology. With a distributed voltage control method, the large amount of information that the central controller would have to process is spread across several regional controllers. This reduces the computational burden on each controller and speeds up computation, relaxes the communication-reliability requirements of the voltage control process, and preserves the privacy of each region's power system information. In addition, the distributed voltage control method can achieve global optimal control of the grid voltage through information exchange among the regional controllers.
At present, deep learning attracts attention from many researchers in many fields, but the slow training caused by its multiple-hidden-layer structure has not been solved effectively. Width (broad) learning has no multiple-hidden-layer structure, so its training speed is not limited in this way, but its training accuracy still needs improvement. The deep deterministic policy gradient (DDPG) method works in continuous action spaces and therefore suits the continuous control actions of power system voltage control. Deep learning, width learning, and the deep deterministic policy gradient method can therefore be combined to obtain a method that trains quickly, achieves high training accuracy, and suits continuous action processes.
In quantum physics, the state of an isolated system is described by a quantum state, which contains all the information about that system. Quantum states are classified into pure states and mixed states, and the pure state is a special case of the mixed state. The state of a subsystem in a many-body system is typically a mixed state, which can be described by a density matrix. Describing a system by a mixed state captures its state more accurately, so using the mixed-state density matrix to describe the system state can improve how well an intelligent method learns the system's input-output relationship.
In summary, in order to replace the traditional centralized three-layer voltage control method, it is necessary to provide a distributed real-time voltage control method based on quantum mechanics and artificial intelligence methods.
Disclosure of Invention
The invention provides a distributed real-time voltage control method based on expandable quantum deep width learning. The method combines a distributed structure with an expandable quantum deep width learning method and achieves real-time control of the whole-network voltage through information exchange among the regional voltage controllers. While meeting the required voltage control accuracy, it relaxes the requirements on communication technology, reduces the computational burden on the controllers, and preserves the privacy of each regional power system's information.
The proposed voltage control method is based on a distributed structure and trains on the input and output data of each regional voltage controller using the expandable quantum deep width learning method. This learning method has two main parts: the expandable quantum deep width neural network and the deep deterministic policy gradient (DDPG) structure. The expandable quantum deep width neural network learns the mapping between input and output data, and its density matrix ρ describes the state of the input data more accurately, which improves the learning performance of the network. The DDPG structure lets the expandable quantum deep width learning method interact with the environment in real time to obtain a better learning result.
The method mainly comprises the following steps in the using process:
step (1): inputting training data into an expandable quantum deep width neural network for training;
step (2): fitting four networks of the expandable quantum deep width learning method by using the trained expandable quantum deep width neural network, and training the expandable quantum deep width learning method by using training data;
step (3): the real-time global optimal control of the voltage of the power system is realized through a distributed structure.
To train the expandable quantum deep width learning method, training data is first input into the expandable quantum deep width neural network for training. The network consists of two parts, a depth part and a width part. The input of the network is randomly distributed between the depth part and the width part, and the density matrix ρ from quantum mechanics is introduced to describe the state of the input data more accurately, so as to achieve optimal distribution of the input data:
[Equation (1), the input data distribution formula, is an image in the source and is not reproduced in this text.]
where $X_1$ is the input data of the depth part; $X_2$ is the input data of the width part; $X$ is the input of the expandable quantum deep width neural network; $\lambda$ is the distribution factor, with value range $[0, 1]$; and $\rho$ is the density matrix, calculated by
$$\rho = \sum_i p_i \, |\psi_i\rangle\langle\psi_i| \qquad (2)$$
where $|\psi_i\rangle$ and $p_i$ denote a pure quantum state and the probability of that pure state, respectively.
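As an illustration of the density matrix definition, $\rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|$, the following minimal NumPy sketch builds ρ from a list of pure states and probabilities. The function name `density_matrix` and the two-level example states are assumptions made for illustration; the source does not say how input data is encoded into quantum states, and the distribution formula (1) is not reproduced here.

```python
import numpy as np

def density_matrix(states, probs):
    """Return rho = sum_i p_i |psi_i><psi_i| (illustrative helper, not from the source)."""
    states = np.asarray(states, dtype=complex)
    probs = np.asarray(probs, dtype=float)
    if not np.isclose(probs.sum(), 1.0):
        raise ValueError("probabilities must sum to 1")
    # Normalize each state so that every outer product has unit trace.
    states = states / np.linalg.norm(states, axis=1, keepdims=True)
    # Sum over i of p_i * psi_i[j] * conj(psi_i[k]).
    return np.einsum("i,ij,ik->jk", probs, states, states.conj())

# Hypothetical example: an equal mixture of |0> and |+> in a two-level system.
psi0 = np.array([1.0, 0.0])
psi_plus = np.array([1.0, 1.0]) / np.sqrt(2.0)
rho = density_matrix([psi0, psi_plus], [0.5, 0.5])

# Purity Tr(rho^2) is below 1 for a genuinely mixed state.
purity = np.trace(rho @ rho).real
```

A valid density matrix is Hermitian with unit trace, which gives a quick sanity check on any encoding that is plugged in.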
The output of a hidden layer in the depth part is calculated by
$$a^{l+1} = f(W^l a^l + b^l) \qquad (3)$$
where $a^{l+1}$ and $a^l$ denote the outputs of the $(l+1)$-th and $l$-th hidden layers, respectively; $W^l$ is the weight matrix of the $l$-th layer; $b^l$ is the bias matrix of the $l$-th layer; and $f(\cdot)$ is the activation function.
If the expandable quantum deep width neural network has $L$ hidden layers in total, the output of the depth part can be expressed as
$$Y_1 = a^L = f(W_1^L a^{L-1} + b^L) \qquad (4)$$
where $Y_1$ is the output of the depth part; $a^L$ and $a^{L-1}$ denote the outputs of the $L$-th and $(L-1)$-th hidden layers, respectively; $W_1^L$ is the weight matrix of the $L$-th layer of the depth part; and $b^L$ is the bias matrix of the $L$-th layer.
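The layer recursion $a^{l+1} = f(W^l a^l + b^l)$ of the depth part can be sketched in a few lines of NumPy. The layer sizes, the `tanh` activation, and the random weights below are assumptions chosen only to make the example runnable; the source does not specify them.

```python
import numpy as np

def deep_forward(x, weights, biases, f=np.tanh):
    """Apply a^{l+1} = f(W^l a^l + b^l) layer by layer and return the last output."""
    a = x
    for W, b in zip(weights, biases):
        a = f(W @ a + b)
    return a

# Hypothetical network: 4 inputs, two hidden layers of 8 units, 2 outputs.
rng = np.random.default_rng(0)
sizes = [4, 8, 8, 2]
weights = [rng.normal(size=(n_out, n_in))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=n_out) for n_out in sizes[1:]]

x1 = rng.normal(size=4)                   # stands in for the depth-part input X_1
y1 = deep_forward(x1, weights, biases)    # stands in for the depth-part output Y_1
```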
The input data $X_2$ of the width part is first converted into a feature node matrix $Z$. If the input data $X_2$ has $n$ feature nodes, the feature node transformation can be expressed as
$$Z_i = \phi(X_2 W_{ei} + \beta_{ei}), \quad i = 1, \ldots, n \qquad (5)$$
where $Z_i$ denotes the $i$-th feature node matrix, and all feature nodes are collected as $Z^n = [Z_1, \ldots, Z_n]$; $W_{ei}$ is a random weight matrix; $\beta_{ei}$ is a random bias; and $\phi(\cdot)$ denotes a random mapping function.
Subsequently, the feature node matrix is converted into an enhanced node matrix. If the width part has $m$ groups of enhanced nodes, the enhanced node transformation can be expressed as
$$H_i = \xi(Z^n W_{hi} + \beta_{hi}), \quad i = 1, \ldots, m \qquad (6)$$
where $H_i$ denotes the $i$-th enhanced node matrix, and all enhanced nodes are collected as $H^m = [H_1, \ldots, H_m]$; $W_{hi}$ is a random weight matrix; $\beta_{hi}$ is a random bias; and $\xi(\cdot)$ denotes a random mapping function.
The output $Y_2$ of the width part can be expressed as
$$Y_2 = [Z^n \mid H^m] \, W_2 \qquad (7)$$
where $W_2$ denotes the connection weight matrix of the width part.
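The width part can be sketched as follows: random mappings produce the feature nodes $Z_i = \phi(X_2 W_{ei} + \beta_{ei})$ and the enhanced nodes $H_i = \xi(Z^n W_{hi} + \beta_{hi})$, and the output is $Y_2 = [Z^n \mid H^m] W_2$. The source does not state how $W_2$ is obtained; the ridge-regularized least-squares solve below is an assumption borrowed from common broad learning practice, and the node counts, `tanh` mappings, and toy data are likewise hypothetical.

```python
import numpy as np

def width_part_features(X2, n, m, k, rng, phi=np.tanh, xi=np.tanh):
    """Build [Z^n | H^m] from random feature-node and enhanced-node mappings."""
    d = X2.shape[1]
    # Feature nodes: Z_i = phi(X2 W_ei + beta_ei), i = 1..n, with random W_ei, beta_ei.
    Z = [phi(X2 @ rng.normal(size=(d, k)) + rng.normal(size=k)) for _ in range(n)]
    Zn = np.hstack(Z)
    # Enhanced nodes: H_i = xi(Z^n W_hi + beta_hi), i = 1..m.
    H = [xi(Zn @ rng.normal(size=(Zn.shape[1], k)) + rng.normal(size=k))
         for _ in range(m)]
    Hm = np.hstack(H)
    return np.hstack([Zn, Hm])

def fit_output_weights(A, T, c=1e-3):
    """Assumed ridge solution W2 = (A^T A + c I)^{-1} A^T T (not stated in the source)."""
    return np.linalg.solve(A.T @ A + c * np.eye(A.shape[1]), A.T @ T)

rng = np.random.default_rng(0)
X2 = rng.normal(size=(50, 3))                    # hypothetical width-part input
T = np.sin(X2.sum(axis=1, keepdims=True))        # hypothetical training target
A = width_part_features(X2, n=3, m=2, k=5, rng=rng)
W2 = fit_output_weights(A, T)
Y2 = A @ W2                                      # width-part output, as in Y_2 = [Z|H] W_2
```

Because the hidden mappings are random and fixed, only $W_2$ is fitted, which is why this part avoids the iterative training cost of a multi-layer network.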
Finally, the output $Y$ of the expandable quantum deep width neural network is obtained by combining the outputs of the depth part and the width part:
[Equation (8), which combines $Y_1$ and $Y_2$ into $Y$, is an image in the source and is not reproduced in this text.]
where $Y$ denotes the output of the expandable quantum deep width neural network; $Y_1$ and $Y_2$ denote the outputs of the depth part and the width part, respectively; $f(\cdot)$ is the activation function; $W_1^L$ is the weight matrix of the $L$-th layer of the depth part; $a^{L-1}$ is the output of the $(L-1)$-th hidden layer; $b^L$ is the bias matrix of the $L$-th layer; $Z^n = [Z_1, \ldots, Z_n]$ denotes all feature node matrices; $H^m = [H_1, \ldots, H_m]$ denotes all enhanced node matrices; and $W_2$ is the connection weight matrix of the width part.
The deep deterministic policy gradient (DDPG) structure consists mainly of an action network and an evaluation network. Each of them contains two sub-networks with identical structure, a current network and a target network. The DDPG structure therefore comprises four networks: the action current network, the action target network, the evaluation current network, and the evaluation target network. In the expandable quantum deep width learning method, each of these four networks is fitted by an expandable quantum deep width neural network and parameterized by a parameter θ. The action network generates a deterministic action policy and hence deterministic actions; the evaluation network approximates the true value function Q to guide the update of the action policy.
After the training of the expandable quantum deep width neural network is completed, the trained network is used to fit the four networks of the expandable quantum deep width learning method, and the training data is used to train the learning method. In the deep deterministic policy gradient structure, the action $a$ of the agent is determined by the policy function $\pi$. Given the instantaneous state $s_t$ of the environment, the action $a_t$ of the agent, and the deterministic action policy $\mu$ generated by the parameterized action current network $\theta^\mu$, the value function $Q$ is
$$Q^\mu(s_t, a_t) = E_{r_t, s_{t+1} \sim \Psi}\left[ r(s_t, a_t) + \gamma \, Q^\mu(s_{t+1}, \mu(s_{t+1})) \right] \qquad (9)$$
where $E[\cdot]$ denotes the expected value; $r(s_t, a_t)$ is the reward obtained for taking action $a_t$ in state $s_t$; $\gamma$ is the discount factor in the Bellman equation; and $\Psi$ is the distribution over which the expectation of the state $s_{t+1}$ and the reward $r_t$ is taken.
The value function $Q$ is approximated by the evaluation current network $\theta^Q$. The loss of the evaluation current network $\theta^Q$ measures the difference between the two sides of the Bellman equation:
$$L(\theta^Q) = E_{s_t \sim \rho^\psi,\, a_t \sim \psi,\, r_t \sim \Psi}\left[ \left( Q(s_t, a_t \mid \theta^Q) - y_t \right)^2 \right] \qquad (10)$$
where $L(\theta^Q)$ is the loss of the value function $Q$; $E[\cdot]$ denotes the expected value; $\rho^\psi$ is the distribution of the state $s_t$ under the current deterministic policy $\psi$; $\psi$ is the current deterministic action policy; $\Psi$ is the distribution of the reward $r_t$; and $Q(s_t, a_t \mid \theta^Q)$ is the value function of the evaluation current network $\theta^Q$ in state $s_t$ with action $a_t$. The target $y_t$ can be expressed as
$$y_t = r(s_t, a_t) + \gamma \, Q(s_{t+1}, \mu(s_{t+1}) \mid \theta^Q) \qquad (11)$$
where $r(s_t, a_t)$ is the reward obtained in state $s_t$ with action $a_t$, and $Q(s_{t+1}, \mu(s_{t+1}) \mid \theta^Q)$ is the value function of the evaluation current network $\theta^Q$ in state $s_{t+1}$ with action $\mu(s_{t+1})$.
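The target $y_t = r(s_t, a_t) + \gamma Q(s_{t+1}, \mu(s_{t+1}))$ and the evaluation-network loss can be checked on a tiny batch. This is a minimal sketch: the function names and the numbers are hypothetical, and the next-state Q values are supplied directly instead of being produced by a network.

```python
import numpy as np

def td_target(r, q_next, gamma):
    """y_t = r(s_t, a_t) + gamma * Q(s_{t+1}, mu(s_{t+1}))."""
    return r + gamma * q_next

def critic_loss(q_pred, y):
    """Mean squared difference between Q(s_t, a_t) and the target y_t over a mini-batch."""
    return np.mean((q_pred - y) ** 2)

# Hypothetical mini-batch of three transitions.
r = np.array([1.0, 0.0, -1.0])        # rewards
q_next = np.array([2.0, 1.0, 0.0])    # Q values at the next state
q_pred = np.array([2.5, 1.0, -1.0])   # current Q estimates

y = td_target(r, q_next, gamma=0.5)   # [2.0, 0.5, -1.0]
loss = critic_loss(q_pred, y)         # (0.25 + 0.25 + 0.0) / 3
```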
The action network updates its policy using the temporal-difference error provided by the evaluation network, and the policy gradient update can be expressed as
$$\nabla_{\theta^\mu} J \approx E_{s_t \sim \rho^\psi}\left[ \nabla_a Q(s, a \mid \theta^Q)\big|_{s = s_t,\, a = \mu(s_t)} \; \nabla_{\theta^\mu} \mu(s \mid \theta^\mu)\big|_{s = s_t} \right] \qquad (12)$$
where $\nabla_{\theta^\mu} J$ is the policy gradient; $E[\cdot]$ denotes the expected value; $\rho^\psi$ is the distribution of the state $s_t$ under the current deterministic policy $\psi$; $\nabla_a Q(s, a \mid \theta^Q)\big|_{s = s_t,\, a = \mu(s_t)}$ is the gradient of the value function of the evaluation current network $\theta^Q$ in state $s_t$ with action $\mu(s_t)$; and $\nabla_{\theta^\mu} \mu(s \mid \theta^\mu)\big|_{s = s_t}$ is the action gradient of the action current network $\theta^\mu$ in state $s_t$.
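The chain rule behind the policy gradient (the gradient of Q with respect to the action, multiplied by the gradient of the action with respect to the action-network parameters) can be demonstrated with a toy problem. Everything below is an assumption made for illustration: the action network is linear, $\mu(s) = \theta \cdot s$, and the evaluation network is replaced by a fixed analytic function $Q(s, a) = -(a - w^\ast \cdot s)^2$ whose best action is known.

```python
import numpy as np

rng = np.random.default_rng(0)
w_star = np.array([0.5, -1.0])   # hypothetical: the toy Q is maximized at a = w_star . s
theta = np.zeros(2)              # parameters of the linear action network mu(s) = theta . s

def dq_da(S, A):
    """Gradient of the toy value function Q(s, a) = -(a - w_star . s)^2 w.r.t. a."""
    return -2.0 * (A - S @ w_star)

for _ in range(500):
    S = rng.normal(size=(32, 2))     # mini-batch of states s_t
    A = S @ theta                    # deterministic actions mu(s_t)
    # Policy gradient: mean over the batch of dQ/da * dmu/dtheta, where dmu/dtheta = s.
    grad = np.mean(dq_da(S, A)[:, None] * S, axis=0)
    theta += 0.05 * grad             # gradient ascent on the expected Q value
```

In this toy setting the ascent drives `theta` to `w_star`, since that is where the action gradient of Q vanishes for every state.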
After each training step is completed, the target networks are soft updated as follows:
$$\theta^{Q'} \leftarrow \tau \theta^Q + (1 - \tau)\, \theta^{Q'}, \qquad \theta^{\mu'} \leftarrow \tau \theta^\mu + (1 - \tau)\, \theta^{\mu'} \qquad (13)$$
where $\theta^Q$, $\theta^{Q'}$, $\theta^\mu$, and $\theta^{\mu'}$ denote the evaluation current network, the evaluation target network, the action current network, and the action target network, respectively; and $\tau$ is a small constant in $[0, 1]$.
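A soft update blends a small fraction τ of the current-network parameters into the target-network parameters, which is the usual form in deep deterministic policy gradient methods. The sketch below assumes parameters are stored in a dictionary of NumPy arrays; that storage layout and the example values are hypothetical.

```python
import numpy as np

def soft_update(target_params, current_params, tau=0.01):
    """Return tau * current + (1 - tau) * target for every named parameter."""
    return {name: tau * current_params[name] + (1.0 - tau) * target_params[name]
            for name in current_params}

# Hypothetical parameters of a current network and its target network.
current = {"W": np.ones((2, 2)), "b": np.full(2, 4.0)}
target = {"W": np.zeros((2, 2)), "b": np.zeros(2)}

target = soft_update(target, current, tau=0.1)
```

With a small τ the target networks change slowly, which keeps the target $y_t$ stable while the current networks are trained.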
Real-time global optimal control of the power system voltage is achieved through the distributed structure as follows. The distributed structure removes the central controller of the centralized three-layer voltage control method, divides the whole power system into N regional power systems, and provides a regional voltage controller for each regional power system. Each regional voltage controller guides the primary voltage controllers of its region to generate voltage control actions. Adjacent regional voltage controllers exchange the reactive power shortage value ΔQ with each other, and a regional power system with surplus reactive power can transmit reactive power to a regional power system with a reactive power shortage through the inter-regional tie lines. Through repeated exchanges and rapid adjustment by each regional voltage controller, the inter-regional tie lines of the power system all satisfy a consistency protocol, and global optimal control of the power system voltage is achieved. The consistency protocol is
[Equation (14), the consistency protocol, is an image in the source and is not reproduced in this text.]
where $u_i$ is the voltage value of the $i$-th inter-regional tie line; $K_i$ and $K_{ij}$ are constant feedback gain matrices; $x$ denotes all controllable variables within a region; $i$ and $j$ denote the $i$-th and $j$-th regional power systems, respectively; and $N$ is the number of regional power systems.
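The patent's own consistency protocol is not recoverable from this text, so the following is not its formula. As a generic illustration of how neighbor-only exchange can drive all regions to agreement without a central controller, this sketch runs a textbook first-order consensus iteration, $x_i \leftarrow x_i + \varepsilon \sum_j a_{ij}(x_j - x_i)$. The four-region chain topology, the step size, and the per-unit starting values are all assumptions.

```python
import numpy as np

def consensus_step(x, adjacency, eps):
    """One neighbor-exchange update: x_i <- x_i + eps * sum_j a_ij * (x_j - x_i)."""
    degree = adjacency.sum(axis=1)
    return x + eps * (adjacency @ x - degree * x)

# Hypothetical chain of four regions: 1-2, 2-3, 3-4 are connected by tie lines.
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]], dtype=float)

x0 = np.array([1.05, 0.98, 0.95, 1.02])   # hypothetical per-unit regional quantities
x = x0.copy()
for _ in range(200):
    x = consensus_step(x, adjacency, eps=0.3)
```

Each region only uses values from its direct neighbors, yet with a symmetric topology and a small enough step size all regions converge to the network-wide average.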
Compared with the prior art, the invention has the following advantages and effects:
(1) Compared with the traditional centralized control method, the distributed voltage control method provided by the invention relaxes the communication-reliability requirements of the power system voltage control process, reduces the computational burden on the voltage controllers, improves the real-time performance of grid voltage control, and preserves the privacy of each region's grid information.
(2) Inspired by deep learning, width learning, and the deep deterministic policy gradient (DDPG) method, the invention provides an expandable quantum deep width learning method. Compared with traditional deep reinforcement learning, the expandable quantum deep width learning method is not constrained by the multiple hidden layers of a deep neural network, trains quickly, predicts accurately, can interact with the environment in real time, and suits continuous action processes.
(3) The invention introduces the quantum state of quantum mechanics into deep reinforcement learning and provides an expandable quantum deep width learning method. The density matrix ρ that describes the quantum state describes the state of the input data more accurately, which improves the learning performance of the expandable quantum deep width neural network, so that the expandable quantum deep width learning method learns better than traditional deep reinforcement learning.
Drawings
FIG. 1 is a schematic diagram of a distributed voltage control architecture for the method of the present invention.
FIG. 2 is a schematic structural diagram of the expandable quantum deep width neural network of the method of the present invention.
FIG. 3 is a schematic diagram of an expandable quantum deep width learning method of the present invention.
Detailed Description
The invention provides a distributed real-time voltage control method based on expandable quantum deep width learning, which is described in detail below with reference to the accompanying drawings:
FIG. 1 is a schematic diagram of a distributed voltage control architecture for the method of the present invention. First, the smart grid is divided into N regional power systems according to its actual topology, and the power lines that connect neighboring regional power systems in that topology are called inter-regional tie lines. The regional power systems of the smart grid send reactive power to, and receive it from, adjacent regional power systems mainly through the inter-regional tie lines. Within a regional power system, the voltage is mainly controlled by the regional voltage controller and the primary voltage controllers. The regional voltage controller sends and receives the information exchanged with neighboring regional power systems and provides a voltage reference value to each primary voltage controller in the region. After a primary voltage controller receives the voltage reference value from the regional voltage controller, it rapidly adjusts its output voltage, and the voltage of the regional power system is adjusted accordingly. When the reactive power reserve of a regional power system is insufficient, that is, when a reactive power shortage occurs, the regional power system sends the reactive power shortage value to its adjacent regional power systems. After the adjacent regional power systems receive this shortage value, their regional voltage controllers provide voltage reference values to their primary voltage controllers, the primary voltage controllers rapidly adjust their reactive power output, and reactive power is transmitted through the inter-regional tie lines to the regional power system with the shortage, which achieves coordinated multi-region voltage control.
FIG. 2 is a schematic structural diagram of the expandable quantum deep width neural network of the method of the present invention. First, the input data $X$ is randomly divided into the input $X_1$ of the depth part and the input $X_2$ of the width part. The input data $X_1$ of the depth part is fed into the multiple hidden layers for learning and training; at the same time, the input data $X_2$ of the width part is converted into the feature node matrix $Z$, which is further converted into the enhanced node matrix $H$. The output $Y_1$ of the depth part is obtained through learning in the multiple hidden layers of the depth part; the output $Y_2$ of the width part is obtained through the conversion calculation of the feature node matrix $Z$ and the enhanced node matrix $H$; and the output $Y_1$ of the depth part and the output $Y_2$ of the width part are combined to obtain the output $Y$ of the expandable quantum deep width neural network.
FIG. 3 is a schematic diagram of the expandable quantum deep width learning method of the present invention. During training of the expandable quantum deep width learning method, the action current network outputs an action $\mu(s_t)$. To make the method explore randomly, noise $n_t$ is added to the action, giving $a_t = \mu(s_t) + n_t$, which then acts on the environment. The environment returns a reward for the action and generates the state at the next moment. The tuple $(s_t, a_t, r_t, s_{t+1})$ is stored in an experience replay pool to break the temporal correlation of the data. Small batches are sampled at random from the experience replay pool and fed to the action network and the evaluation network. The action target network generates the action $\mu'(s_{t+1})$ for the next moment from the sampled mini-batch and the action policy, and outputs it to the evaluation target network. After the evaluation target network receives the sampled mini-batch and $\mu'(s_{t+1})$, it calculates $y_t$ and passes it to the evaluation current network to calculate the loss function. The evaluation current network calculates the gradient of the Q value and passes it to the optimizer, which updates the parameters of the evaluation current network. The evaluation current network also passes the temporal-difference error to the action current network, which calculates its policy gradient and passes it to the optimizer, and the optimizer updates the parameters of the action current network. At the same time, the action current network passes the action $a = \mu(s_t)$ to the evaluation current network to calculate the Q-value gradient. Finally, the network parameters of the action target network and the evaluation target network are soft updated with the small constant $\tau$.
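Two pieces of the training loop described for FIG. 3 are easy to sketch in isolation: the exploration noise added to the action, $a_t = \mu(s_t) + n_t$, and the experience replay pool that stores $(s_t, a_t, r_t, s_{t+1})$ and returns random mini-batches. The class name `ReplayPool`, the Gaussian noise model, and the dummy transitions are assumptions; the source does not name the noise distribution or the pool size.

```python
import random
from collections import deque

import numpy as np

class ReplayPool:
    """Fixed-size pool of (s_t, a_t, r_t, s_{t+1}) transitions with random sampling."""

    def __init__(self, capacity, seed=0):
        self.buffer = deque(maxlen=capacity)   # oldest transitions are dropped first
        self.rng = random.Random(seed)

    def store(self, s, a, r, s_next):
        self.buffer.append((s, a, r, s_next))

    def sample(self, batch_size):
        # Sampling at random breaks the temporal correlation between transitions.
        batch = self.rng.sample(list(self.buffer), batch_size)
        s, a, r, s_next = map(np.array, zip(*batch))
        return s, a, r, s_next

    def __len__(self):
        return len(self.buffer)

def noisy_action(mu_s, sigma, rng):
    """a_t = mu(s_t) + n_t with Gaussian noise n_t (the noise model is an assumption)."""
    return mu_s + rng.normal(scale=sigma, size=np.shape(mu_s))

np_rng = np.random.default_rng(0)
pool = ReplayPool(capacity=5)
for t in range(10):
    s = np.full(3, float(t))                    # dummy state
    a = noisy_action(np.zeros(2), 0.1, np_rng)  # dummy action with exploration noise
    pool.store(s, a, float(t), s + 1.0)

states, actions, rewards, next_states = pool.sample(batch_size=3)
```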

Claims (4)

1. A distributed real-time voltage control method based on expandable quantum deep width learning, characterized in that a distributed structure is combined with an expandable quantum deep width learning method, and real-time control of the whole-network voltage is achieved through information exchange among the regional voltage controllers, so that, while the voltage control accuracy is met, the requirements on communication technology are relaxed, the computational burden on the controllers is reduced, and the privacy of each regional power system's information is preserved; the method mainly comprises the following steps:
step (1): inputting training data into an expandable quantum deep width neural network for training;
step (2): fitting four networks of the expandable quantum deep width learning method by using the trained expandable quantum deep width neural network, and training the expandable quantum deep width learning method by using training data;
step (3): the real-time global optimal control of the voltage of the power system is realized through a distributed structure.
2. The real-time voltage control method of distributed expandable quantum deep width learning according to claim 1, wherein in step (1), the input of the expandable quantum deep width neural network is randomly distributed between the depth part and the width part, and the density matrix ρ of quantum mechanics is introduced to describe the state of the input data more accurately, so as to achieve an optimal distribution of the input data; the input data are distributed as
Figure FDA0002792399200000011
In the formula, X_1 is the input data of the depth part; X_2 is the input data of the width part; X is the input of the expandable quantum deep width neural network; λ is the distribution factor, with value range [0,1]; ρ is the density matrix, which can be calculated by
Figure FDA0002792399200000012
In the formula, |ψ_i⟩ and p_i respectively represent a pure state of the quantum state and the probability corresponding to that pure state;
finally, the output Y of the expandable quantum deep width neural network is obtained as
Figure FDA0002792399200000013
In the formula, Y represents the output of the expandable quantum deep width neural network; Y_1 and Y_2 represent the outputs of the depth part and the width part, respectively; f(·) represents the activation function; W_1^L is the weight matrix of the L-th layer of the depth part; a^{L-1} represents the output of the (L-1)-th hidden layer; b^L represents the bias matrix of the L-th layer; Z^n = [Z_1, …, Z_n] represents all feature node matrices; H^m = [H_1, …, H_m] represents all enhancement node matrices; W_2 represents the connection weight matrix of the width part.
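As an illustration of the density matrix referred to in claim 2, the sketch below uses the standard quantum-mechanical definition ρ = Σ_i p_i |ψ_i⟩⟨ψ_i| for pure states |ψ_i⟩ with probabilities p_i. The function name `density_matrix` and the example states are assumptions; how ρ and λ enter the input-distribution formula is given only in the figure and is not reproduced here.

```python
import numpy as np


def density_matrix(states, probs):
    """Return rho = sum_i p_i |psi_i><psi_i| for pure states psi_i with probabilities p_i."""
    probs = np.asarray(probs, dtype=float)
    if not np.isclose(probs.sum(), 1.0):
        raise ValueError("probabilities must sum to 1")
    dim = len(states[0])
    rho = np.zeros((dim, dim), dtype=complex)
    for psi, p in zip(states, probs):
        psi = np.asarray(psi, dtype=complex)
        psi = psi / np.linalg.norm(psi)  # normalise so each |psi_i> is a unit vector
        rho += p * np.outer(psi, psi.conj())
    return rho


# Equal mixture of |0> and |+> on a single qubit (illustrative states only).
rho = density_matrix([[1, 0], [1, 1]], [0.5, 0.5])
```

The result is Hermitian with unit trace, which are the two properties any valid density matrix must satisfy.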
3. The real-time voltage control method of distributed expandable quantum deep width learning according to claim 1, wherein the expandable quantum deep width learning method proposed in step (2) uses the expandable quantum deep width neural network to fit each of the four networks in the structure of the deep deterministic policy gradient method, and parameterizes each network with a parameter θ; the action network is mainly used to generate a deterministic action strategy and thereby a deterministic action, and the evaluation network is mainly used to approximate the true value function Q to guide the updating of the action strategy; the action a of the agent is determined by a policy function π; given the instantaneous state s_t of the environment, the action a_t of the agent, and the deterministic action strategy μ generated by the parameterized action current network θ^μ, the value function Q is obtained as
Figure FDA0002792399200000021
In the formula, E[·] represents the expected value; r(s_t, a_t) represents the reward obtained in state s_t with action a_t; γ denotes the discount factor of the Bellman equation; Ψ represents the expected-value distribution corresponding to state s_{t+1} and reward r_t;
the value function Q is approximated by the evaluation current network θ^Q, and the loss of the evaluation current network θ^Q is the difference between the two sides of the Bellman equation
Figure FDA0002792399200000022
In the formula, L(θ^Q) represents the loss value of the value function Q; E[·] represents the expected value; ρ^ψ represents the distribution of state s_t under the current deterministic policy ψ; ψ denotes the current deterministic action policy; Ψ represents the expected-value distribution corresponding to reward r_t; Q(s_t, a_t | θ^Q) represents the value function of the evaluation current network θ^Q in state s_t with action a_t; y_t can be expressed as
y_t = r(s_t, a_t) + γ Q(s_{t+1}, μ(s_{t+1}) | θ^Q)
In the formula, r(s_t, a_t) represents the reward obtained in state s_t with action a_t; Q(s_{t+1}, μ(s_{t+1}) | θ^Q) represents the value function of the evaluation current network θ^Q in state s_{t+1} with action μ(s_{t+1});
the action network updates its strategy according to the temporal-difference error provided by the evaluation network, and the gradient update of the strategy can be expressed as
Figure FDA0002792399200000023
In the formula,
Figure FDA0002792399200000024
represents the gradient value of the strategy; E[·] represents the expected value; ρ^ψ represents the distribution of state s_t under the current deterministic policy ψ;
Figure FDA0002792399200000025
represents the value function gradient of the evaluation current network θ^Q in state s_t with action μ(s_t);
Figure FDA0002792399200000026
represents the action gradient of the action current network θ^μ in state s_t;
after one training process is completed, the target networks need to be soft-updated, and the soft update process is as follows
θ^{Q'} ← τ θ^Q + (1 − τ) θ^{Q'}
θ^{μ'} ← τ θ^μ + (1 − τ) θ^{μ'}
In the formula, θ^Q, θ^{Q'}, θ^μ and θ^{μ'} respectively represent the evaluation current network, the evaluation target network, the action current network and the action target network; τ is a small constant with a value in [0,1].
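The target value, evaluation loss and soft update in claim 3 can be sketched numerically as below. This is a hedged illustration: the function names are hypothetical, the squared-error form of the loss is the usual deep deterministic policy gradient choice and is assumed here because the loss formula itself is only available as a figure, and `q_next` is passed in so the sketch does not decide which network supplies Q(s_{t+1}, μ(s_{t+1})).

```python
import numpy as np


def td_target(r, q_next, gamma):
    """y_t = r(s_t, a_t) + gamma * Q(s_{t+1}, mu(s_{t+1}))."""
    return r + gamma * q_next


def critic_loss(q, y):
    """Mean squared difference between Q(s_t, a_t) and y_t (squared form is an assumption)."""
    return float(np.mean((q - y) ** 2))


def soft_update(target, current, tau):
    """theta' <- tau * theta + (1 - tau) * theta', applied to each parameter array."""
    return {k: tau * current[k] + (1.0 - tau) * target[k] for k in target}
```

With a small τ, the target parameters move only a fraction of the way toward the current parameters at each update, which is what keeps the target values y_t slowly varying.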
4. The real-time voltage control method of distributed expandable quantum deep width learning according to claim 1, wherein the distributed structure in step (3) divides the power system into N regional power systems, each regional power system is equipped with a regional voltage controller, and each regional voltage controller is responsible for directing the voltage control actions of the primary voltage controllers of its region; through multiple exchanges and rapid adjustment among the regional voltage controllers, the tie lines between the regions of the power system all satisfy a consistency protocol, and the global optimal control of the voltage of the power system is realized; the consistency protocol is
Figure RE-FDA0002948013060000031
In the formula, u_i represents the voltage value of the i-th inter-regional tie line; K_i and K_ij are constant feedback gain matrices; x represents all controllable variables within a region; i and j represent the i-th and j-th regional power systems, respectively; N represents the number of divided regional power systems.
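To give a feel for how repeated exchanges between regional controllers can drive their values to agreement, the sketch below runs a generic discrete-time average consensus iteration. It is not the consistency protocol of claim 4, whose formula with the gains K_i and K_ij is available only as a figure; the update rule, the step size `eps`, the line-graph topology and the per-unit values are all assumptions chosen for illustration.

```python
import numpy as np


def consensus_step(x, adjacency, eps):
    """One synchronous update x_i <- x_i + eps * sum_j a_ij * (x_j - x_i)."""
    x = np.asarray(x, dtype=float)
    a = np.asarray(adjacency, dtype=float)
    # sum_j a_ij * (x_j - x_i) equals (A x)_i - deg_i * x_i
    return x + eps * (a @ x - a.sum(axis=1) * x)


# Three regional controllers on a line graph 0 - 1 - 2 (hypothetical topology).
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
x = np.array([1.02, 0.97, 1.04])  # per-unit values, illustrative only
for _ in range(200):
    x = consensus_step(x, adjacency, eps=0.3)
```

Each controller uses only its neighbours' values, so no central coordinator is needed; with a symmetric adjacency matrix and a step size below the reciprocal of the largest node degree, all entries of `x` approach the average of the initial values.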
CN202011319512.6A 2020-11-23 2020-11-23 Distributed real-time voltage control method capable of expanding quantum deep width learning Active CN112564118B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011319512.6A CN112564118B (en) 2020-11-23 2020-11-23 Distributed real-time voltage control method capable of expanding quantum deep width learning

Publications (2)

Publication Number Publication Date
CN112564118A true CN112564118A (en) 2021-03-26
CN112564118B CN112564118B (en) 2022-03-18

Family

ID=75044766

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011319512.6A Active CN112564118B (en) 2020-11-23 2020-11-23 Distributed real-time voltage control method capable of expanding quantum deep width learning

Country Status (1)

Country Link
CN (1) CN112564118B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2616876C1 (en) * 2015-12-17 2017-04-19 Федеральное государственное автономное образовательное учреждение высшего образования "Национальный исследовательский Нижегородский государственный университет им. Н.И. Лобачевского" METHOD FOR MONITORING PRESENCE OF GaAs MATRIX DEEP DEFECTS CONNECTED WITH EMBEDDING INAS QUANTUM DOTS THEREIN
CN108964042A (en) * 2018-07-24 2018-12-07 合肥工业大学 Regional power grid operating point method for optimizing scheduling based on depth Q network
CN109951438A (en) * 2019-01-15 2019-06-28 中国科学院信息工程研究所 A kind of communication optimization method and system of distribution deep learning
CN110225535A (en) * 2019-06-04 2019-09-10 吉林大学 Heterogeneous wireless network vertical handoff method based on depth deterministic policy gradient
CN110378467A (en) * 2019-06-17 2019-10-25 浙江大学 A kind of quantization method for deep learning network parameter
CN110429652A (en) * 2019-08-28 2019-11-08 广西大学 A kind of intelligent power generation control method for expanding the adaptive Dynamic Programming of deep width
CN110450153A (en) * 2019-07-08 2019-11-15 清华大学 A kind of mechanical arm article active pick-up method based on deeply study
CN110490867A (en) * 2019-08-22 2019-11-22 四川大学 Metal increasing material manufacturing forming dimension real-time predicting method based on deep learning
US20200020117A1 (en) * 2018-07-16 2020-01-16 Ford Global Technologies, Llc Pose estimation
CN111555297A (en) * 2020-05-21 2020-08-18 广西大学 Unified time scale voltage control method with tri-state energy unit
CN111769547A (en) * 2020-06-12 2020-10-13 广西大学 Real-time regulation and control method for three-layer linkage mechanism interactive comprehensive energy system
CN111832812A (en) * 2020-06-27 2020-10-27 南通大学 Wind power short-term prediction method based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
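Ali Ehsan et al.: "Optimal integration and planning of renewable distributed generation in the power distribution networks: A review of analytical techniques", Applied Energy *
Gong Jinxia et al.: "Coordinated optimization of active distribution networks based on the deep deterministic policy gradient algorithm", Automation of Electric Power Systems *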

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113346543A (en) * 2021-06-03 2021-09-03 广西大学 Distributed micro-grid voltage multilayer cooperative control method
CN113346543B (en) * 2021-06-03 2022-10-11 广西大学 Distributed micro-grid voltage multilayer cooperative control method
CN114202066A (en) * 2022-02-21 2022-03-18 北京邮电大学 Network control method and device, electronic equipment and storage medium
CN114202066B (en) * 2022-02-21 2022-04-26 北京邮电大学 Network control method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN112564118B (en) 2022-03-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant