CN115130733A - Hydrogen-containing building energy system operation control method combining optimization and learning - Google Patents

Publication number
CN115130733A
Authority
CN
China
Prior art keywords
hydrogen
subsystem
energy storage
slot
energy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210631486.3A
Other languages
Chinese (zh)
Other versions
CN115130733B (en
Inventor
余亮
张予涵
任静怡
岳东
窦春霞
张腾飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN202210631486.3A
Publication of CN115130733A
Application granted
Publication of CN115130733B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00 Circuit arrangements for ac mains or ac distribution networks
    • H02J3/38 Arrangements for parallely feeding a single network by two or more generators, converters or transformers
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2300/00 Systems for supplying or distributing electric power characterised by decentralized, dispersed, or local generation
    • H02J2300/30 The power source being a fuel cell
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2310/00 The network for supplying or distributing electric power characterised by its spatial reach or by the load
    • H02J2310/10 The network having a local or delimited stationary reach
    • H02J2310/12 The local stationary network supplying a household or a building
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Economics (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Marketing (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Power Engineering (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Primary Health Care (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention discloses an operation control method for a hydrogen-containing building energy system that combines optimization and learning, in the field of building energy system operation control. The method comprises the following steps: establishing an expected operation cost minimization problem model of the hydrogen-containing building energy system, and converting the problem model into a plurality of single-time-slot optimization sub-problem models; decomposing each single-time-slot optimization sub-problem model into an upper-layer sub-problem model and a lower-layer sub-problem model; solving the upper-layer sub-problem model by a convex optimization method, and calculating the heat production of the fuel cell from the solution of the upper-layer sub-problem; taking the heat production of the fuel cell as an input state of the lower-layer sub-problem model; solving the lower-layer sub-problem model to obtain an optimal control strategy for the thermal energy subsystem; and controlling the operation of the hydrogen-containing building energy system in real time. By combining the advantages of a model-based convex optimization method and a model-free learning method, the invention minimizes the operation cost while maintaining high thermal comfort.

Description

Hydrogen-containing building energy system operation control method combining optimization and learning
Technical Field
The invention belongs to the field of building energy system operation control, and particularly relates to a hydrogen-containing building energy system operation control method.
Background
Buildings account for a significant share of total energy consumption and carbon emissions worldwide. In 2019, buildings accounted for about 30% of global energy consumption and about 28% of global carbon emissions. At present, global energy supply depends mainly on non-renewable sources such as fossil fuels, so energy depletion and environmental pollution are becoming increasingly serious. In recent years, hydrogen energy has attracted much attention because it is clean, renewable, widely available, convenient to store and transport, and efficient to use, and it is recognized as a promising substitute for fossil fuels. In addition, the coordinated operation of a hydrogen energy storage system with other energy storage systems (such as thermal and electrical energy storage systems) helps improve the energy efficiency of buildings. Therefore, the operation control of hydrogen-containing building energy systems is worth intensive research.
Existing research has proposed a number of operation control methods for hydrogen-containing building energy systems, such as stochastic programming and model predictive control. The goal of these methods is to minimize the system operating cost (mainly energy cost and carbon emission cost). Despite these advances, none of the prior art considers building thermal dynamics, which means that the high thermal inertia of buildings (i.e., the damped and delayed response of the indoor temperature to a stimulus such as a sudden stop of heating) is not fully exploited to reduce system operating costs.
When building thermal dynamics are considered in a hydrogen-containing building energy system, optimal control of system operation faces four challenges: (1) there are a number of uncertain system parameters; (2) there are a number of temporally and spatially coupled operational constraints; (3) the fuel cell in the hydrogen energy storage system generates electricity and heat simultaneously, which couples the electrical energy flow and the thermal energy flow; (4) it is difficult to establish an explicit building thermodynamic model that is both accurate and tractable for control. Learning-based methods also face difficulties: the action-space dimension of single-agent deep reinforcement learning grows sharply as the number of thermal zones increases, and multi-agent deep reinforcement learning must coordinate heterogeneous agents, which makes effective learning difficult as the number of agents increases.
Disclosure of Invention
The invention aims to provide an operation control method for a hydrogen-containing building energy system that combines optimization and learning, and that uses the advantages of both a model-based convex optimization method and a model-free learning method to minimize the operation cost under high thermal comfort.
In order to achieve this purpose, the technical scheme adopted by the invention is as follows:
The invention provides a method for controlling the operation of a hydrogen-containing building energy system by combining optimization and learning, which comprises the following steps:
establishing an expected operation cost minimization problem model of the hydrogen-containing building energy system according to the operation constraints and parameter uncertainty of the hydrogen-containing building energy system; converting the expected operation cost minimization problem into a plurality of single-time-slot optimization sub-problem models by using a Lyapunov optimization framework;
decomposing the single-time-slot optimization sub-problem model into an upper-layer sub-problem model corresponding to the electricity-hydrogen subsystem and a lower-layer sub-problem model corresponding to the thermal energy subsystem;
solving the upper-layer sub-problem model by a convex optimization method, and calculating the heat production of the fuel cell from the solution of the upper-layer sub-problem;
taking the heat production of the fuel cell as an input state of the lower-layer sub-problem model; remodeling the lower-layer sub-problem model based on a Markov game framework, and solving it with a multi-agent attention deep deterministic policy gradient algorithm to obtain an optimal control strategy for the thermal energy subsystem;
and controlling the operation of the hydrogen-containing building energy system in real time according to the convex optimization solution method of the upper-layer sub-problem model and the optimal control strategy of the thermal energy subsystem.
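The data flow of the steps above can be sketched for a single time slot as follows. This is only an illustrative skeleton: the callables `solve_upper`, `fuel_cell_heat` and `thermal_policy` are hypothetical stand-ins for the convex solver, the fuel-cell heat calculation and the learned policy, none of which are specified in code form in the text.

```python
from typing import Callable, Dict, List, Sequence


def control_step(
    solve_upper: Callable[[Dict[str, float]], Dict[str, float]],
    fuel_cell_heat: Callable[[Dict[str, float]], float],
    thermal_policy: Callable[[Sequence[float]], Sequence[float]],
    observation: Dict[str, float],
    thermal_state: Sequence[float],
) -> Dict[str, object]:
    """Run one time slot of the two-layer scheme (all callables are hypothetical)."""
    # Upper layer: model-based optimization of the electricity-hydrogen subsystem.
    upper_solution = solve_upper(observation)
    # The fuel-cell heat production is derived from the upper-layer solution.
    q_fc = fuel_cell_heat(upper_solution)
    # Lower layer: the learned policy receives the fuel-cell heat as part of its state.
    state: List[float] = [q_fc, *thermal_state]
    heating_actions = list(thermal_policy(state))
    return {"upper": upper_solution, "q_fc": q_fc, "heating": heating_actions}
```

The point of the sketch is the ordering: the upper layer is solved first, and only its derived heat output crosses into the lower layer.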
Preferably, the hydrogen-containing building energy system expected operation cost minimization problem model is expressed by the following formula:
Figure BDA0003680126570000031
s.t. the operation constraint of the electric energy subsystem, the operation constraint of the hydrogen energy subsystem and the operation constraint of the heat energy subsystem;
where $C_{1,t}$ is the cost of buying and selling electricity in time slot t, $C_{2,t}$ is the carbon emission cost in time slot t, $C_{3,t}$ is the loss cost of the electrical energy storage system in time slot t, $C_{4,t}$ is the operation and maintenance cost of the hydrogen energy subsystem in time slot t, $C_{5,t}$ is the loss cost of the thermal energy subsystem in time slot t, $C_{6,t}$ is the natural gas purchase cost in time slot t, and T represents the time slot length; the decision variables $\Theta$ include: the energy trading volume between the local energy system and the main power grid, the charging and discharging power of the electrical energy storage system, the input power of the electrolyzer, the output power of the fuel cell, the heat supply power of each room, the charging and discharging power of the thermal energy storage system, and the natural gas consumption.
Preferably, the method for converting the expected operation cost minimization problem into a plurality of single-time-slot optimization sub-problem models by using the Lyapunov optimization framework comprises the following steps:
judging the controllability of the hydrogen-containing building energy system; for a hydrogen-containing building energy system that meets the controllable conditions, constructing virtual queues for the electrical energy subsystem and the hydrogen energy subsystem; defining a Lyapunov function according to the virtual queues, and calculating the weighted sum $\Delta Y(t)$ of the single-time-slot Lyapunov drift and the operation cost; and converting the expected operation cost minimization problem model of the hydrogen-containing building energy system into a plurality of single-time-slot optimization sub-problem models by minimizing the weighted sum $\Delta Y(t)$, and calculating the optimal system parameters in the single-time-slot optimization sub-problem models.
Preferably, the controllable conditions are expressed as follows:
$v_{\max} > \tau_{\max}$
$v_{\min} > \tau_{\min}$
Figure BDA0003680126570000032
Figure BDA0003680126570000033
Figure BDA0003680126570000041
Figure BDA0003680126570000042
$v_{\max} = \max_t v_t$, $\tau_{\max} = \max_t \tau_t$, $v_{\min} = \min_t v_t$, $\tau_{\min} = \min_t \tau_t$
Figure BDA0003680126570000043
Figure BDA0003680126570000044
where $v_{\max}$ and $v_{\min}$ respectively represent the highest and lowest electricity purchase prices; $\tau_{\max}$ and $\tau_{\min}$ respectively represent the highest and lowest electricity selling prices; $\eta_{bc}$ and $\eta_{bd}$ respectively represent the charging efficiency and the discharging efficiency of the electrical energy storage system; $\mu_c$ is a weighting parameter that represents the importance of carbon emissions relative to energy costs; [Figure BDA0003680126570000045] and [Figure BDA0003680126570000046] respectively represent the maximum rate and the minimum rate of carbon emission; $\psi_{BESS}$ is the depreciation coefficient of the electrical energy storage system; $B_{\max}$ and $B_{\min}$ respectively represent the maximum and minimum energy storage levels of the electrical energy storage system; [Figure BDA0003680126570000047] and [Figure BDA0003680126570000048] respectively represent the rated charging power and the rated discharging power of the electrical energy storage system; $\omega_{el}$ and $\omega_{fc}$ respectively represent the conversion coefficients of the electrolyzer and the fuel cell; [Figure BDA0003680126570000049] and [Figure BDA00036801265700000410] are indicator variables that respectively indicate whether the electrolyzer and the fuel cell are on or off; $H_{\max}$ and $H_{\min}$ respectively represent the maximum and minimum energy storage levels of the hydrogen energy storage system; [Figure BDA00036801265700000411] and [Figure BDA00036801265700000412] respectively represent the rated power of the electrolyzer and of the fuel cell; and $\Delta t$ represents the slot length.
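As a minimal sketch, the two price conditions that survive in text form, $v_{\max} > \tau_{\max}$ and $v_{\min} > \tau_{\min}$, can be checked from price series as shown below. The function name and the list-based inputs are illustrative assumptions. The remaining controllable conditions exist only as formula images and are not reproduced here, so passing this check is necessary but not sufficient.

```python
from typing import Sequence


def price_conditions_hold(buy_prices: Sequence[float], sell_prices: Sequence[float]) -> bool:
    """Check v_max > tau_max and v_min > tau_min over the given price series.

    Only the two price inequalities stated in the text are covered; the other
    controllable conditions are not available in text form.
    """
    v_max, v_min = max(buy_prices), min(buy_prices)
    tau_max, tau_min = max(sell_prices), min(sell_prices)
    return v_max > tau_max and v_min > tau_min
```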
Preferably, the method of calculating the weighted sum $\Delta Y(t)$ of the single-time-slot Lyapunov drift and the operation cost comprises:
the Lyapunov function $L(t)$ is expressed as:
Figure BDA00036801265700000413
where $X_{B,t} = B_t + W_B$ and $X_{H,t} = H_t + W_H$; $\omega_r$ is a weighting factor that unifies the dimensions of $X_{B,t}$ and $X_{H,t}$; $B_t$ is the energy storage level of the electrical energy storage system in time slot t; $H_t$ is the energy storage level of the hydrogen energy storage system in time slot t; $W_B$ is the optimal parameter of the electrical energy storage system, and $W_H$ is the optimal parameter of the hydrogen energy storage system. The dynamic constraints that $B_t$ and $H_t$ must satisfy are respectively expressed as:
Figure BDA00036801265700000414
Figure BDA0003680126570000051
where $P_{bc,t}$ and $P_{bd,t}$ respectively represent the charging power and the discharging power of the electrical energy storage system, and $P_{el,t}$ and $P_{fc,t}$ respectively represent the input power of the electrolyzer and the output power of the fuel cell in time slot t. The single-time-slot Lyapunov drift is expressed as:
$\Lambda_t = E\{L(t+1) - L(t) \mid X(t)\}$,
Figure BDA0003680126570000052
Figure BDA0003680126570000053
Figure BDA0003680126570000054
where $X(t) = (X_{B,t}, X_{H,t})$ and $E\{\cdot\}$ denotes the expectation operator.
The single-time-slot Lyapunov drift $\Lambda_t$ can then be bounded as:
$\Lambda_t \le \xi_{BH} + E\{\Gamma_0 \mid X(t)\}$,
Figure BDA0003680126570000055
The weighted sum $\Delta Y(t)$ of the single-time-slot Lyapunov drift and the operation cost is then calculated as:
Figure BDA0003680126570000056
where V is a weighting parameter.
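The drift-plus-cost idea can be illustrated numerically for one realized transition. This sketch makes two assumptions that the text does not confirm, because the original formulas are images: the Lyapunov function is taken to be the common quadratic form $L(t) = \tfrac{1}{2}(X_{B,t}^2 + \omega_r X_{H,t}^2)$, and the weighted sum is taken to be the drift plus V times the slot cost. The expectation is replaced by a single sample.

```python
def lyapunov(b: float, h: float, w_b: float, w_h: float, omega_r: float) -> float:
    """Assumed quadratic Lyapunov function of the shifted queues X_B and X_H."""
    x_b = b + w_b  # X_{B,t} = B_t + W_B (from the text)
    x_h = h + w_h  # X_{H,t} = H_t + W_H (from the text)
    return 0.5 * (x_b ** 2 + omega_r * x_h ** 2)


def drift_plus_cost(
    b: float, h: float, b_next: float, h_next: float, cost: float,
    w_b: float, w_h: float, omega_r: float, v: float,
) -> float:
    """Sampled single-slot drift plus V times the slot cost (assumed form)."""
    drift = lyapunov(b_next, h_next, w_b, w_h, omega_r) - lyapunov(b, h, w_b, w_h, omega_r)
    return drift + v * cost
```

A larger V places more weight on the operation cost relative to keeping the storage levels near their shifted targets.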
Preferably, the single-time-slot optimization sub-problem model is expressed as:
Figure BDA0003680126570000057
Figure BDA0003680126570000061
Figure BDA0003680126570000062
Figure BDA0003680126570000063
Figure BDA0003680126570000064
Figure BDA0003680126570000065
The optimal parameter $W_B$ of the electrical energy storage system is calculated as:
Figure BDA0003680126570000066
The optimal parameter $W_H$ of the hydrogen energy storage system is calculated as:
Figure BDA0003680126570000067
s.t. the operating constraints of the electrical energy subsystem, the operating constraints of the hydrogen energy subsystem and the operating constraints of the thermal energy subsystem.
Preferably, the single-time-slot optimization sub-problem model is decomposed, according to information certainty, into an upper-layer sub-problem model corresponding to the electricity-hydrogen subsystem and a lower-layer sub-problem model corresponding to the thermal energy subsystem, as follows:
the upper-layer sub-problem model corresponding to the electricity-hydrogen subsystem is expressed as:
Figure BDA0003680126570000068
s.t. the operating constraints of the electrical energy subsystem and the operating constraints of the hydrogen energy subsystem;
the lower-layer sub-problem model corresponding to the thermal energy subsystem is expressed as:
$\min\; V(C_{5,t} + C_{6,t})$
s.t. the operating constraints of the thermal energy subsystem.
Preferably, the method for remodeling the lower-layer sub-problem model based on the Markov game framework comprises the following steps:
the environment state of the thermal energy subsystem is expressed as:
$s_t = (Q_{fc,t}, Q_{th,t}, \beta_{in,i,t}, \beta_{out,t}, t)$,
Figure BDA0003680126570000071
where $Q_{fc,t}$ represents the heat production of the fuel cell in time slot t; $Q_{th,t}$ represents the energy storage level of the thermal energy storage system in the thermal energy subsystem in time slot t; $\beta_{in,i,t}$ is the indoor temperature of the i-th room in time slot t; $\beta_{out,t}$ is the outdoor temperature in time slot t; t represents the time interval between two consecutive action decisions executed by the hydrogen-containing building energy system; $\eta_{tc}$ and $\eta_{td}$ respectively represent the injection efficiency and the release efficiency of the thermal energy storage system in the thermal energy subsystem; $P_{tc,t}$ and $P_{td,t}$ respectively represent the injection power and the release power of the thermal energy storage system in the thermal energy subsystem in time slot t;
the action of the thermal energy subsystem is expressed as:
$a_t = (P_{sp,1,t}, P_{sp,2,t}, \ldots, P_{sp,i,t})$, $1 \le i \le N_b$
where $P_{sp,i,t}$ is the heat supply power of the i-th room in time slot t, and $N_b$ is the number of rooms;
the reward of the thermal energy subsystem is expressed as:
Figure BDA0003680126570000072
where
Figure BDA0003680126570000073
and $\kappa_{th}$ is a penalty factor.
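As an illustration of the state definition above, the following sketch assembles the environment state as a flat vector. The function names and the flat-list layout are assumptions. The per-agent local observation in `local_observation` is likewise hypothetical, since the text does not spell out what each agent observes. The reward is omitted because its formula is not available in text form.

```python
from typing import List, Sequence


def build_state(
    q_fc: float, q_th: float, indoor_temps: Sequence[float], outdoor_temp: float, t: int
) -> List[float]:
    """Flatten s_t = (Q_fc, Q_th, beta_in for every room, beta_out, t) into a list."""
    return [q_fc, q_th, *indoor_temps, outdoor_temp, float(t)]


def local_observation(
    i: int, q_fc: float, q_th: float, indoor_temps: Sequence[float], outdoor_temp: float, t: int
) -> List[float]:
    """Hypothetical observation for room i: shared quantities plus its own temperature."""
    return [q_fc, q_th, indoor_temps[i], outdoor_temp, float(t)]
```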
Preferably, the method for solving with the multi-agent attention deep deterministic policy gradient algorithm comprises the following steps:
at the beginning of each time slot, acquiring the environment state of the thermal energy subsystem;
based on the current environment state of the thermal energy subsystem, the deep neural network outputs the current heat supply action of the hydrogen-containing building energy system to control the thermal energy subsystem;
acquiring the reward of the next time slot and the environment state of the next time slot; storing the reward and the environment state of each time slot in an experience pool;
computing the loss function $L(\theta_i)$ of the deep neural network and the policy gradient [Figure BDA0003680126570000081];
extracting training samples from the experience pool and training the deep neural network with the multi-agent attention deep deterministic policy gradient algorithm; and iterating the deep neural network according to the loss function $L(\theta_i)$ and the policy gradient [Figure BDA0003680126570000082] to obtain the optimal control strategy of the thermal energy subsystem.
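The experience pool used in these steps can be sketched as a fixed-capacity buffer with uniform random sampling. The class name, the tuple layout (state, action, reward, next state) and uniform sampling are illustrative assumptions. The text states only that rewards and environment states are stored and that training samples are drawn from the pool.

```python
import random
from collections import deque
from typing import Any, List, Optional, Tuple

Transition = Tuple[Any, Any, float, Any]


class ExperiencePool:
    """Fixed-capacity pool; the oldest transitions are discarded when full."""

    def __init__(self, capacity: int, seed: Optional[int] = None) -> None:
        self._buffer: deque = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def store(self, state: Any, action: Any, reward: float, next_state: Any) -> None:
        # Storing the action alongside state and reward is an assumption.
        self._buffer.append((state, action, reward, next_state))

    def sample(self, batch_size: int) -> List[Transition]:
        # Uniform sampling without replacement, capped at the current pool size.
        size = min(batch_size, len(self._buffer))
        return self._rng.sample(list(self._buffer), size)

    def __len__(self) -> int:
        return len(self._buffer)
```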
Preferably, the framework of the multi-agent attention deep deterministic policy gradient algorithm comprises multiple agents (indexed by i), wherein each agent is provided with its own deep neural network, and each deep neural network comprises an actor network, a target actor network, a critic network and a target critic network; the actor network and the target actor network have the same structure, and the critic network and the target critic network have the same structure;
the number of neurons in the input layer of the actor network equals the number of components of the environment state $s_t$, and the number of neurons in the output layer equals the number of components of the action $a_t$; the critic network of each agent comprises an observation-action encoder module, an attention mechanism module and a multilayer perceptron module;
the input of the actor network of the i-th agent is $o_i$ and its output is $a_i$; the input of the critic network includes $o_i$, $a_i$ and [Figure BDA0003680126570000083], and its output is $Q_i(o, a)$,
Figure BDA0003680126570000084
Figure BDA0003680126570000085
where $o_i$ is the local observation of the i-th agent; $a_i$ is the output action; $e_i$ is the encoding of the local observation and action of the i-th agent; and $Q_i(o, a)$ is the Q value output by the critic network. In the critic network of the i-th agent, the input of the attention module is [Figure BDA0003680126570000086] and its output is $x_i$, which represents the contribution of the other agents;
the contribution $x_i$ of the other agents is expressed as:
Figure BDA0003680126570000091
where $W_{value,j}$ is the value transformation matrix associated with the j-th agent; [Figure BDA0003680126570000092] is a nonlinear activation function; and $w_j$ is the weight associated with the j-th agent;
the weight $w_j$ associated with the j-th agent is expressed as:
Figure BDA0003680126570000093
where $W_{key,i}$ and $W_{query,i}$ are the transformation matrices associated with the i-th agent.
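The attention step can be illustrated with a small pure-Python sketch. It assumes a common form: weights obtained from a softmax over key-query dot products, and the contribution $x_i$ as the weighted sum of activated value vectors of the other agents. The exact formulas in the source are images, so this form is an assumption, as are the use of ReLU for the nonlinear activation and of single shared matrices in place of the per-agent $W_{key,i}$, $W_{query,i}$ and $W_{value,j}$.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Sequence[float]) -> Vector:
    """Matrix-vector product for small dense matrices."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def softmax(scores: Sequence[float]) -> Vector:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_contribution(
    i: int, encodings: Sequence[Vector], w_key: Matrix, w_query: Matrix, w_value: Matrix
) -> Tuple[Vector, Vector]:
    """Return (x_i, weights over the other agents) for agent i (assumed form)."""
    query = matvec(w_query, encodings[i])
    others = [j for j in range(len(encodings)) if j != i]
    # Score each other agent by the dot product of its key with agent i's query.
    scores = [
        sum(k * q for k, q in zip(matvec(w_key, encodings[j]), query)) for j in others
    ]
    weights = softmax(scores)
    # ReLU is an assumed choice for the nonlinear activation.
    values = [[max(0.0, v) for v in matvec(w_value, encodings[j])] for j in others]
    dim = len(values[0])
    x_i = [sum(w * val[d] for w, val in zip(weights, values)) for d in range(dim)]
    return x_i, weights
```

Because the weights are computed over however many other agents are present, the same matrices can be reused when rooms are added, which is the scalability property the text attributes to the attention mechanism.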
Preferably, the loss function $L(\theta_i)$ and the policy gradient [Figure BDA0003680126570000094] used to train the deep neural network are expressed as:
Figure BDA0003680126570000095
Figure BDA0003680126570000096
Figure BDA0003680126570000097
where $\pi$ denotes the policy of the agent (represented by the actor network); y denotes the output Q value of the target critic network, and $\pi'$ denotes the target policy of the agent (represented by the target actor network); [Figure BDA0003680126570000098] denotes the Q value output by the critic network of the i-th agent under the policy $\pi$; and $\pi_i(a_i \mid o_i)$ denotes the output of the actor network of the i-th agent.
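For orientation only, the following sketch shows the standard temporal-difference form that critic losses of this family usually take: a target $y = r + \gamma Q'$ built from the target critic, and a mean squared error between the critic output and y. Whether the patent's loss matches this form exactly cannot be confirmed from the text, since the formulas are images. The discount factor `gamma` and both function names are assumptions.

```python
from typing import List, Sequence


def td_targets(
    rewards: Sequence[float], next_target_q: Sequence[float], gamma: float
) -> List[float]:
    """Assumed target y = r + gamma * Q_target(next observation, next action)."""
    return [r + gamma * q for r, q in zip(rewards, next_target_q)]


def critic_loss(q_values: Sequence[float], targets: Sequence[float]) -> float:
    """Mean squared error between critic outputs and targets over a batch."""
    return sum((q - y) ** 2 for q, y in zip(q_values, targets)) / len(q_values)
```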
Compared with the prior art, the invention has the following beneficial effects:
The electricity-hydrogen subsystem is operated by optimization based on the upper-layer sub-problem model, the optimization result is then used as an input state for the operation of the thermal energy subsystem, and the optimal operation control strategy of the thermal energy subsystem is learned by multi-agent deep reinforcement learning, which avoids heterogeneous agents; the attention mechanism makes the learning of the optimal operation control strategy of the thermal energy subsystem highly scalable.
The method combines the advantages of a model-based convex optimization method and a model-free learning method, and minimizes the operation cost under high thermal comfort without requiring prior information about the uncertain parameters or an explicit building thermodynamic model.
Drawings
Fig. 1 is a flowchart of a method for controlling operation of a hydrogen-containing building energy system by combined optimization and learning according to an embodiment of the present invention;
FIG. 2 is a network framework diagram of the multi-agent attention deep deterministic policy gradient algorithm of the present invention;
FIG. 3 is a graph of average temperature deviation of an embodiment of the present invention compared to other solutions;
FIG. 4 is a graph comparing the average operating cost of an embodiment of the present invention with other solutions.
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
A method for controlling the operation of a hydrogen-containing building energy system by combining optimization and learning comprises the following steps:
establishing an expected operation cost minimization problem model of the hydrogen-containing building energy system according to the operation constraint conditions and parameter uncertainty of the hydrogen-containing building energy system;
the expected operation cost minimization problem model of the hydrogen-containing building energy system is expressed as:
Figure BDA0003680126570000101
Figure BDA0003680126570000102
$C_{2,t} = \mu_c \mu_{e,t} P_{g,t} \Delta t$
$C_{3,t} = \psi_{BESS} (|P_{bc,t}| + |P_{bd,t}|)$
Figure BDA0003680126570000111
$C_{5,t} = \psi_{TESS} (|P_{tc,t}| + |P_{td,t}|)$
Figure BDA0003680126570000112
s.t. the operating constraints of the electrical energy subsystem, the operating constraints of the hydrogen energy subsystem and the operating constraints of the thermal energy subsystem;
where $C_{1,t}$ is the cost of buying and selling electricity in time slot t, $C_{2,t}$ is the carbon emission cost in time slot t, $C_{3,t}$ is the loss cost of the electrical energy storage system in time slot t, $C_{4,t}$ is the operation and maintenance cost of the hydrogen energy subsystem in time slot t, $C_{5,t}$ is the loss cost of the thermal energy subsystem in time slot t, $C_{6,t}$ is the natural gas purchase cost in time slot t, and T represents the time slot length; $v_t$ and $\tau_t$ respectively represent the electricity purchase price and the electricity selling price in time slot t; $P_{g,t}$ is the energy trading volume between the hydrogen-containing building energy system and the main power grid in time slot t; $\mu_c$ is the carbon emission cost coefficient, in RMB/kg; $\mu_{e,t}$ is the carbon emission rate of the main power grid in time slot t; $\psi_{BESS}$ is the battery depreciation coefficient, in RMB/kW; $P_{bc,t}$ and $P_{bd,t}$ respectively represent the charging power and the discharging power of the electrical energy storage system; [Figure BDA0003680126570000113] and [Figure BDA0003680126570000114] respectively represent the operation and maintenance cost, the start-up cost and the shut-down cost of a component x ($x \in \{el, fc\}$) in the hydrogen energy storage system, where "el" and "fc" respectively denote the electrolyzer and the fuel cell; [Figure BDA0003680126570000115] and [Figure BDA0003680126570000116] respectively represent the logical indicator variables associated with the on/off state, the start-up state and the shut-down state of the component x, where
Figure BDA0003680126570000117
Figure BDA0003680126570000118
$\psi_{TESS}$ is the depreciation coefficient of the thermal energy storage system, in RMB/kW; $P_{tc,t}$ and $P_{td,t}$ respectively represent the injection power and the release power of the thermal energy storage system in time slot t; $\eta_{gb}$ represents the conversion efficiency of natural gas into heat energy; $P_{gb,t}$ represents the thermal power output by the natural gas boiler; and $\lambda_{gb}$ represents the price of natural gas, in RMB/kWh.
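The three cost terms that are given in text form, $C_{2,t}$, $C_{3,t}$ and $C_{5,t}$, can be evaluated directly, as in the sketch below. The function names are illustrative. $C_{1,t}$, $C_{4,t}$ and $C_{6,t}$ are not included because their formulas are only available as images.

```python
def carbon_cost(mu_c: float, mu_e_t: float, p_g_t: float, dt: float) -> float:
    """C_2 = mu_c * mu_e * P_g * dt (carbon emission cost for one slot)."""
    return mu_c * mu_e_t * p_g_t * dt


def battery_loss_cost(psi_bess: float, p_bc_t: float, p_bd_t: float) -> float:
    """C_3 = psi_BESS * (|P_bc| + |P_bd|) (electrical storage depreciation)."""
    return psi_bess * (abs(p_bc_t) + abs(p_bd_t))


def thermal_loss_cost(psi_tess: float, p_tc_t: float, p_td_t: float) -> float:
    """C_5 = psi_TESS * (|P_tc| + |P_td|) (thermal storage depreciation)."""
    return psi_tess * (abs(p_tc_t) + abs(p_td_t))
```

For example, with a hypothetical $\psi_{BESS}$ of 0.5 RMB/kW and 2 kW of charging in a slot, the electrical storage loss cost is 1.0 RMB.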
In the above operation cost minimization problem for the hydrogen-containing building energy system with hybrid hydrogen-electricity-heat energy storage, the decision variables $\Theta$ include: the energy trading volume between the local energy system and the main power grid, the charging and discharging power of the electrical energy storage system, the input power of the electrolyzer, the output power of the fuel cell, the heat supply power of each room, the charging and discharging power of the thermal energy storage system, and the natural gas consumption. The operating constraints to be considered, which relate to the hydrogen energy storage system, the electrical energy storage system, the thermal energy storage system and the comfortable room temperature range, are as follows:
(1) the hydrogen energy storage system should satisfy the following constraints: h is not less than 0 t ≤H max
Figure BDA0003680126570000121
Figure BDA0003680126570000122
P_{el,t} · P_{fc,t} = 0, where H_max is the maximum storage capacity of the hydrogen tank;
Figure BDA0003680126570000123
and
Figure BDA0003680126570000124
are the rated powers of the electrolyzer and the fuel cell, respectively.
(2) The electrical energy storage system needs to satisfy the following constraints: B_min ≤ B_t ≤ B_max
Figure BDA0003680126570000125
Figure BDA0003680126570000126
P_{bc,t} · P_{bd,t} = 0, where B_min and B_max are the minimum and maximum energy levels of the electrical energy storage system, respectively;
Figure BDA0003680126570000127
are the maximum charging power and the maximum discharging power of the electrical energy storage system, respectively.
(3) During charging and discharging, the thermal energy storage system must satisfy the following operating constraints:
Figure BDA0003680126570000128
Figure BDA0003680126570000129
P_{td,t} · P_{tc,t} = 0, where
Figure BDA00036801265700001210
is the maximum capacity of the thermal energy storage system;
Figure BDA00036801265700001211
and
Figure BDA00036801265700001212
respectively the maximum released power and the maximum injected power of the thermal energy storage system.
(4) The thermal load demand meets the following operating constraints:
Figure BDA00036801265700001213
β_{in,i,t+1} = F_i(P_{sp,i,t}, β_{out,t}, β_{in,i,t}, ε_{i,t}), where
Figure BDA00036801265700001214
and
Figure BDA00036801265700001215
are the lower limit and the upper limit of the comfortable temperature range in building i, respectively; β_{in,i,t} is the indoor temperature of the i-th room in time slot t; F_i is the thermodynamic model of building i; ε_{i,t} is the random thermal disturbance in time slot t;
Figure BDA00036801265700001216
representing the maximum heat supply power within the building i.
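The storage constraints in items (1) and (2) can be illustrated with a small feasibility check. This is a minimal sketch, not part of the claimed method: the class and function names are hypothetical, and the power bounds are assumed to take the form 0 ≤ P ≤ rated (or maximum) power, because the exact bound expressions appear only in the equation images.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageLimits:
    h_max: float       # H_max: maximum storage capacity of the hydrogen tank
    b_min: float       # B_min: minimum energy level of the battery
    b_max: float       # B_max: maximum energy level of the battery
    p_el_rated: float  # rated power of the electrolyzer
    p_fc_rated: float  # rated power of the fuel cell
    p_bc_max: float    # maximum battery charging power
    p_bd_max: float    # maximum battery discharging power


def violated_constraints(lim, h, b, p_el, p_fc, p_bc, p_bd, tol=1e-9):
    """Return the names of the constraints violated by one operating point."""
    checks = {
        "hydrogen_level": 0.0 <= h <= lim.h_max,
        # Assumed bound form: 0 <= power <= rated/maximum power.
        "electrolyzer_power": 0.0 <= p_el <= lim.p_el_rated,
        "fuel_cell_power": 0.0 <= p_fc <= lim.p_fc_rated,
        # P_el * P_fc = 0: electrolyzer and fuel cell never run together.
        "el_fc_exclusive": abs(p_el * p_fc) <= tol,
        "battery_level": lim.b_min <= b <= lim.b_max,
        "charge_power": 0.0 <= p_bc <= lim.p_bc_max,
        "discharge_power": 0.0 <= p_bd <= lim.p_bd_max,
        # P_bc * P_bd = 0: no simultaneous charging and discharging.
        "charge_discharge_exclusive": abs(p_bc * p_bd) <= tol,
    }
    return [name for name, ok in checks.items() if not ok]
```

An empty list means the operating point is feasible with respect to these constraints; the thermal storage and comfort constraints in items (3) and (4) could be added in the same way.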
Converting the expected operation cost minimization problem into a plurality of single-time-slot optimization sub-problem models by means of the Lyapunov optimization framework comprises the following steps:
Judging the controllability of the hydrogen-containing building energy system. The controllability conditions are expressed as:
v_max > τ_max
v_min > τ_min
Figure BDA00036801265700001217
Figure BDA0003680126570000131
Figure BDA0003680126570000132
Figure BDA0003680126570000133
v_max = max_t v_t, τ_max = max_t τ_t, v_min = min_t v_t, τ_min = min_t τ_t
Figure BDA0003680126570000134
Figure BDA0003680126570000135
where v_max and v_min are the highest and the lowest electricity buying prices, respectively; τ_max and τ_min are the highest and the lowest electricity selling prices, respectively; η_bc and η_bd are the charging efficiency and the discharging efficiency of the electrical energy storage system, respectively; μ_c is a weighting parameter that represents the importance of carbon emissions relative to energy costs;
Figure BDA0003680126570000136
and
Figure BDA0003680126570000137
are the maximum rate and the minimum rate of carbon emission, respectively; ψ_BESS is the depreciation coefficient of the electrical energy storage system; B_max and B_min are the maximum and the minimum energy storage levels of the electrical energy storage system, respectively;
Figure BDA0003680126570000138
and
Figure BDA0003680126570000139
are the rated injection power and the rated release power of the electrical energy storage system, respectively; ω_el and ω_fc are the conversion coefficients of the electrolyzer and the fuel cell, respectively;
Figure BDA00036801265700001310
and
Figure BDA00036801265700001311
are indicator variables that show whether the electrolyzer and the fuel cell, respectively, are on or off; H_max and H_min are the maximum and the minimum energy storage levels of the hydrogen energy storage system, respectively;
Figure BDA00036801265700001312
and
Figure BDA00036801265700001313
respectively representing rated power of the electrolytic cell and the fuel cell; Δ t represents the slot length.
For a hydrogen-containing building energy system that satisfies the controllability conditions, virtual queues are constructed for the electrical energy subsystem and the hydrogen energy subsystem. A Lyapunov function is defined from the virtual queues, and the weighted sum ΔY(t) of the single-time-slot Lyapunov drift and the operation cost is calculated as follows:
The Lyapunov function L(t) is expressed as:
Figure BDA00036801265700001314
where X_{B,t} = B_t + W_B and X_{H,t} = H_t + W_H; ω_r is a weighting factor that unifies the dimensions of X_{B,t} and X_{H,t}; B_t is the energy storage level of the electrical energy storage system in time slot t; H_t is the energy storage level of the hydrogen energy storage system in time slot t; W_B is the optimal parameter of the electrical energy storage system; W_H is the optimal parameter of the hydrogen energy storage system. The dynamic constraints that B_t and H_t must satisfy are expressed as:
Figure BDA0003680126570000141
where P_{bc,t} and P_{bd,t} are the charging power and the discharging power of the electrical energy storage system, respectively; P_{el,t} and P_{fc,t} are the input power of the electrolyzer and the output power of the fuel cell in time slot t, respectively.
The single-time-slot Lyapunov drift is expressed as:
Λ_t = E{L(t+1) - L(t) | X(t)},
Figure BDA0003680126570000142
Figure BDA0003680126570000143
Figure BDA0003680126570000144
where X(t) = (X_{B,t}, X_{H,t}) and E{·} denotes the expectation operation.
The single-time-slot Lyapunov drift Λ_t can then be bounded as:
Λ_t ≤ ξ_B + ξ_H + E{Γ_0 | X(t)},
Figure BDA0003680126570000145
The weighted sum ΔY(t) of the single-time-slot Lyapunov drift and the operation cost is calculated as:
Figure BDA0003680126570000151
where V is a weighting parameter.
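The virtual-queue construction and the drift-plus-cost weighting can be illustrated numerically. This is a minimal sketch under stated assumptions: the Lyapunov function is assumed to have the common quadratic form L(t) = ½·X_B² + ½·ω_r·X_H², since the exact expression appears only in an equation image, and the function names are hypothetical. The drift computed here is one sample realization of L(t+1) - L(t), not the conditional expectation Λ_t.

```python
def virtual_queues(b, h, w_b, w_h):
    """Shifted storage levels: X_B = B + W_B and X_H = H + W_H."""
    return b + w_b, h + w_h


def lyapunov(x_b, x_h, omega_r):
    """Assumed quadratic Lyapunov function of the two virtual queues."""
    return 0.5 * x_b ** 2 + 0.5 * omega_r * x_h ** 2


def sample_drift_plus_cost(x_now, x_next, omega_r, cost, v):
    """One-sample drift L(t+1) - L(t) plus the operation cost weighted by V."""
    drift = lyapunov(x_next[0], x_next[1], omega_r) - lyapunov(x_now[0], x_now[1], omega_r)
    return drift + v * cost
```

A larger V puts more weight on the operation cost relative to queue stability; a smaller V keeps the storage levels closer to the targets implied by W_B and W_H.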
By minimizing the weighted sum ΔY(t), the expected operation cost minimization problem model of the hydrogen-containing building energy system is converted into a plurality of single-time-slot optimization sub-problem models, each expressed as:
Figure BDA0003680126570000152
Figure BDA0003680126570000153
Figure BDA0003680126570000154
Figure BDA0003680126570000155
Figure BDA0003680126570000156
Figure BDA0003680126570000157
The optimal system parameters in the single-time-slot optimization sub-problem model are then calculated and determined. The optimal parameter W_B of the electrical energy storage system is calculated as:
Figure BDA0003680126570000158
The optimal parameter W_H of the hydrogen energy storage system is calculated as:
Figure BDA0003680126570000159
s.t. the operating constraints of the electrical energy subsystem, the operating constraints of the hydrogen energy subsystem and the operating constraints of the thermal energy subsystem.
According to information certainty, the single-time-slot optimization sub-problem model is decomposed into an upper-layer sub-problem model corresponding to the electricity-hydrogen subsystem and a lower-layer sub-problem model corresponding to the thermal energy subsystem, as follows:
The upper-layer sub-problem model corresponding to the electricity-hydrogen subsystem is expressed as:
Figure BDA0003680126570000161
s.t. the operation constraint of the electric energy subsystem and the operation constraint of the hydrogen energy subsystem;
The lower-layer sub-problem model corresponding to the thermal energy subsystem is expressed as:
min V(C_{5,t} + C_{6,t})
s.t. operating constraints of the thermal energy subsystem.
The upper-layer sub-problem model is solved with a convex optimization method, and the heat generated by the fuel cell is calculated from the solution, as follows:
Because the objective function of the upper-layer sub-problem is non-convex, convex relaxation is applied to it; that is, the objective function is adjusted to:
Figure BDA0003680126570000162
The maximum difference between the adjusted objective function and the original objective function is
Figure BDA0003680126570000163
After the objective function is adjusted, the whole problem becomes a linear program, so the optimal solution can be obtained quickly. The heat generated by the fuel cell is then obtained from the solution as Q_{fc,t} = η_hr · η_h2e · P_{fc,t} · Δt, where η_hr is the heat recovery efficiency, η_h2e is the heat-to-electricity ratio of the fuel cell, and P_{fc,t} is the output power of the fuel cell.
The heat generated by the fuel cell is taken as an input state of the lower-layer sub-problem model, and the lower-layer sub-problem model is re-modeled within the Markov game framework, as follows:
The environment state of the thermal energy subsystem is expressed as:
s_t = (Q_{fc,t}, Q_{th,t}, β_{in,i,t}, β_{out,i,t}, t),
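The heat-recovery relation Q_{fc,t} = η_hr · η_h2e · P_{fc,t} · Δt can be evaluated directly. This is a minimal sketch: the function name and the example values are hypothetical, and the units (kW and hours, giving kWh) are an assumption.

```python
def fuel_cell_heat(p_fc, dt, eta_hr, eta_h2e):
    """Recovered fuel-cell heat Q_fc = eta_hr * eta_h2e * P_fc * dt.

    p_fc: fuel cell output power (assumed kW); dt: slot length (assumed h);
    eta_hr: heat recovery efficiency; eta_h2e: heat-to-electricity ratio.
    """
    if p_fc < 0 or dt <= 0:
        raise ValueError("p_fc must be non-negative and dt must be positive")
    return eta_hr * eta_h2e * p_fc * dt
```

For instance, with the illustrative values p_fc = 10, dt = 1, eta_hr = 0.8 and eta_h2e = 1.2, the function returns approximately 9.6. The result is what the lower-layer sub-problem receives as the state component Q_{fc,t}.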
Figure BDA0003680126570000171
where Q_{fc,t} is the heat generated by the fuel cell in time slot t; Q_{th,t} is the energy storage level of the thermal energy storage system in the thermal energy subsystem in time slot t; β_{in,i,t} is the indoor temperature of the i-th room in time slot t; β_{out,t} is the outdoor temperature in time slot t; t represents the time interval between two consecutive action decisions executed by the hydrogen-containing building energy system; η_tc and η_td are the injection efficiency and the release efficiency of the thermal energy storage system, respectively; P_{tc,t} and P_{td,t} are the injection power and the release power of the thermal energy storage system in time slot t, respectively;
The action of the thermal energy subsystem is expressed as:
a_t = (P_{sp,1,t}, P_{sp,2,t}, …, P_{sp,i,t}), 1 ≤ i ≤ N_b
where P_{sp,i,t} is the heat supply power of the i-th room in time slot t; N_b is the number of rooms;
The reward of the thermal energy subsystem is expressed as:
Figure BDA0003680126570000172
where
Figure BDA0003680126570000173
where κ_th is a penalty factor.
Solving with the multi-agent attention-based deep deterministic policy gradient algorithm to obtain the optimal control strategy of the thermal energy subsystem comprises the following steps:
at the beginning of each time slot, acquiring the environment state of the thermal energy subsystem;
according to the current environment state of the thermal energy subsystem, the deep neural network outputs the current heat supply action of the hydrogen-containing building energy system to control the thermal energy subsystem;
acquiring the reward and the environment state of the next time slot, and storing the reward and the environment state of each time slot in an experience pool;
computing the loss function L(θ_i) of the deep neural network and the policy gradient
Figure BDA0003680126570000181
extracting training samples from the experience pool, training the deep neural network with the multi-agent attention-based deep deterministic policy gradient algorithm, and obtaining the loss function L(θ_i) and the policy gradient
Figure BDA0003680126570000182
iterating the deep neural network to obtain the optimal control strategy of the thermal energy subsystem.
The loss function L(θ_i) used to train the deep neural network and the policy gradient
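The experience pool used in the steps above can be sketched as a fixed-capacity replay buffer. This is a minimal sketch, not the patented implementation: the class name, the capacity handling (oldest samples are discarded first), and the choice to store the action together with the state, reward, and next state are assumptions.

```python
import random
from collections import deque


class ExperiencePool:
    """Fixed-capacity store of transitions for off-policy training."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen drops the oldest transition once it is full.
        self._buffer = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def store(self, state, action, reward, next_state):
        """Append one transition observed at the end of a time slot."""
        self._buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        """Draw a uniformly random mini-batch without replacement."""
        if batch_size > len(self._buffer):
            raise ValueError("not enough stored transitions")
        return self._rng.sample(list(self._buffer), batch_size)

    def __len__(self):
        return len(self._buffer)
```

In a training loop, `store` would be called once per time slot after the reward and next state are observed, and `sample` would supply the mini-batches from which the loss function and policy gradient are computed.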
Figure BDA0003680126570000183
are expressed as follows:
Figure BDA0003680126570000184
Figure BDA0003680126570000185
Figure BDA0003680126570000186
where π is the policy of the agent (represented by the actor network); y is the Q value output by the target critic network; π' is the target policy of the agent (represented by the target actor network);
Figure BDA0003680126570000187
is the Q value output by the critic network of the i-th agent under policy π; π_i(a_i | o_i) is the output of the actor network of the i-th agent.
The architecture of the multi-agent attention-based deep deterministic policy gradient algorithm comprises i agents. Each agent has its own deep neural network, which comprises an actor network, a target actor network, a critic network, and a target critic network. The actor network and the target actor network have the same structure, and the critic network and the target critic network have the same structure;
the number of neurons in the input layer of the actor network equals the number of components of the environment state s_t, and the number of neurons in the output layer equals the number of components of the action a_t; the critic network of each agent comprises an action-behavior encoder module, an attention mechanism module, and a multilayer perceptron module;
the input to the actor network of the i-th agent is o_i and its output is a_i; the input to the critic network includes o_i, a_i, and
Figure BDA0003680126570000188
and its output is Q_i(o, a),
Figure BDA0003680126570000189
Figure BDA0003680126570000191
Wherein o is i Is the local observed state of the ith agent; a is i Is an action of output; e.g. of a cylinder i Code representing local observations and behaviors of the ith agent; q i (o, a) is the Q value of the critic network output, and in the critic network of the ith agent, the input to the attention module is
Figure BDA0003680126570000192
and the output is x_i, which represents the contribution of the other agents.
The contribution x_i of the other agents is expressed as:
Figure BDA0003680126570000193
where W_{value,j} is the value transformation matrix associated with the j-th agent;
Figure BDA0003680126570000194
is a non-linear activation function;
w_j is the weight associated with the j-th agent, expressed as:
Figure BDA0003680126570000195
where W_{key,i} and W_{query,i} are the key and query transformation matrices associated with the i-th agent, respectively.
The operation of the hydrogen-containing building energy system is controlled in real time according to the convex optimization solution of the upper-layer sub-problem model and the optimal control strategy of the thermal energy subsystem.
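The attention step in the critic network can be sketched in plain Python. This is a minimal sketch under stated assumptions, since the exact expressions for x_i and w_j appear only in equation images: the weights are assumed to be a softmax over key-query dot products, ReLU is assumed as the non-linear activation, and a single shared set of matrices is used for brevity where the text associates W_key, W_query, and W_value with individual agents. All function names are hypothetical.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_contribution(i, encodings, w_key, w_query, w_value):
    """Return (x_i, weights): the attended contribution of all agents j != i."""
    query = matvec(w_query, encodings[i])
    others = [j for j in range(len(encodings)) if j != i]
    # Score of agent j: dot product of its key with agent i's query.
    scores = [
        sum(k * q for k, q in zip(matvec(w_key, encodings[j]), query))
        for j in others
    ]
    weights = softmax(scores)
    # Value of agent j: ReLU applied to its transformed encoding.
    values = [[max(0.0, v) for v in matvec(w_value, encodings[j])] for j in others]
    x_i = [
        sum(w * val[d] for w, val in zip(weights, values))
        for d in range(len(values[0]))
    ]
    return x_i, dict(zip(others, weights))
```

The returned x_i would be concatenated with the agent's own encoding e_i and passed to the multilayer perceptron module that outputs Q_i(o, a).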
Figure 3 compares the performance of the method of the invention with that of other schemes. Scheme 1 applies joint control to the electrical energy storage system and the hydrogen energy storage system: when there is a surplus of renewable energy, both storage systems are charged; otherwise, both are discharged. In addition, an ON-OFF strategy controls the building heat supply power: when the indoor temperature is below the lower limit, the input thermal power is 0; when the indoor temperature is above the upper limit, the input thermal power is the maximum heat supply power. Scheme 2 uses a deep Q-network (DQN) algorithm to control the electrical energy storage system and the hydrogen energy storage system, and likewise uses the ON-OFF strategy to control the building heat supply power. Scheme 3 uses the multi-agent deep deterministic policy gradient (MADDPG) algorithm for joint control of all energy storage devices and thermal loads. Scheme 4 is similar to the method of the invention but does not use the attention mechanism. As can be seen from Fig. 4, the method of the invention significantly reduces the operation cost while maintaining high thermal comfort (e.g., an average temperature deviation of less than 0.03 ℃). Specifically, the average operation cost is reduced by 30.09%, 20.31%, 25.66%, and 18.53% compared with Scheme 1, Scheme 2, Scheme 3, and Scheme 4, respectively.
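The ON-OFF baseline used in Schemes 1 and 2 can be written as a simple threshold rule. This is a minimal sketch that follows the rule exactly as the text states it; the function name is hypothetical, and holding the previous power while the temperature is inside the comfort band is an assumption, because the text does not say what happens there.

```python
def on_off_heat_power(temp_in, temp_low, temp_high, p_max, p_prev):
    """Threshold rule for the building heat supply power.

    Below the lower limit the input thermal power is 0; above the upper
    limit it is the maximum heat supply power (as stated in the text).
    Inside the band the previous power is kept (assumption).
    """
    if temp_in < temp_low:
        return 0.0
    if temp_in > temp_high:
        return p_max
    return p_prev
```

Because the rule reacts only to the current indoor temperature, it ignores prices, fuel-cell heat, and storage levels, which is consistent with its role as a comparison baseline.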
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, it is possible to make various improvements and modifications without departing from the technical principle of the present invention, and those improvements and modifications should be considered as the protection scope of the present invention.

Claims (10)

1. An operation control method for a hydrogen-containing building energy system combining optimization and learning, characterized by comprising the following steps: establishing an expected operation cost minimization problem model of the hydrogen-containing building energy system according to the operating constraints and parameter uncertainty of the hydrogen-containing building energy system; converting the expected operation cost minimization problem into a plurality of single-time-slot optimization sub-problem models by means of the Lyapunov optimization framework;
decomposing the single-time-slot optimization sub-problem model into an upper layer sub-problem model corresponding to the electric-hydrogen subsystem and a lower layer sub-problem model corresponding to the heat energy subsystem;
solving the upper sub-problem model by adopting a convex optimization method, and calculating according to the solving result of the upper sub-problem to obtain the heat production quantity of the fuel cell;
taking the heat generated by the fuel cell as an input state of the lower-layer sub-problem model; re-modeling the lower-layer sub-problem model within the Markov game framework, and solving it with a multi-agent attention-based deep deterministic policy gradient algorithm to obtain the optimal control strategy of the thermal energy subsystem;
and controlling the operation of the hydrogen-containing building energy system in real time according to the convex optimization solution of the upper-layer sub-problem model and the optimal control strategy of the thermal energy subsystem.
2. The method for controlling the operation of the hydrogen-containing building energy system based on the combined optimization and learning of claim 1, wherein the problem model of minimizing the expected operation cost of the hydrogen-containing building energy system is expressed by the following formula:
Figure FDA0003680126560000011
s.t. the operation constraint of the electric energy subsystem, the operation constraint of the hydrogen energy subsystem and the operation constraint of the heat energy subsystem;
where C_{1,t} is the cost of buying and selling electricity in time slot t; C_{2,t} is the carbon emission cost in time slot t; C_{3,t} is the loss cost of the electrical energy storage system in time slot t; C_{4,t} is the operation and maintenance cost of the hydrogen energy subsystem in time slot t; C_{5,t} is the loss cost of the thermal energy subsystem in time slot t; C_{6,t} is the natural gas purchase cost in time slot t; T represents the time slot length; the decision variables Θ include: the energy traded between the local energy system and the main power grid, the charging and discharging power of the electrical energy storage system, the input power of the electrolyzer, the output power of the fuel cell, the heat supply power of each room, the charging and discharging power of the thermal energy storage system, and the natural gas consumption.
3. The method for controlling the operation of a hydrogen-containing building energy system based on combined optimization and learning of claim 2, wherein the method for converting the desired operation cost minimization problem into a plurality of single-time-slot optimization sub-problem models by using a Lyapunov optimization framework comprises:
judging the controllability of the hydrogen-containing building energy system; for a hydrogen-containing building energy system that satisfies the controllability conditions, constructing virtual queues for the electrical energy subsystem and the hydrogen energy subsystem; defining a Lyapunov function from the virtual queues, and calculating the weighted sum ΔY(t) of the single-time-slot Lyapunov drift and the operation cost; and, by minimizing the weighted sum ΔY(t), converting the expected operation cost minimization problem model of the hydrogen-containing building energy system into a plurality of single-time-slot optimization sub-problem models, and calculating and determining the optimal system parameters in the single-time-slot optimization sub-problem models.
4. The method for controlling the operation of the hydrogen-containing building energy system through combined optimization and learning according to claim 3, wherein the expression formula of the controllable conditions is as follows:
v_max > τ_max
v_min > τ_min
Figure FDA0003680126560000021
Figure FDA0003680126560000022
Figure FDA0003680126560000023
Figure FDA0003680126560000024
v_max = max_t v_t, τ_max = max_t τ_t, v_min = min_t v_t, τ_min = min_t τ_t
Figure FDA0003680126560000025
Figure FDA0003680126560000031
where v_max and v_min are the highest and the lowest electricity buying prices, respectively; τ_max and τ_min are the highest and the lowest electricity selling prices, respectively; η_bc and η_bd are the charging efficiency and the discharging efficiency of the electrical energy storage system, respectively; μ_c is a weighting parameter that represents the importance of carbon emissions relative to energy costs;
Figure FDA0003680126560000032
and
Figure FDA0003680126560000033
are the maximum rate and the minimum rate of carbon emission, respectively; ψ_BESS is the depreciation coefficient of the electrical energy storage system; B_max and B_min are the maximum and the minimum energy storage levels of the electrical energy storage system, respectively;
Figure FDA0003680126560000034
and
Figure FDA0003680126560000035
are the rated injection power and the rated release power of the electrical energy storage system, respectively; ω_el and ω_fc are the conversion coefficients of the electrolyzer and the fuel cell, respectively;
Figure FDA0003680126560000036
and
Figure FDA0003680126560000037
are indicator variables that show whether the electrolyzer and the fuel cell, respectively, are on or off; H_max and H_min are the maximum and the minimum energy storage levels of the hydrogen energy storage system, respectively;
Figure FDA0003680126560000038
and
Figure FDA0003680126560000039
respectively representing rated power of the electrolytic cell and the fuel cell; Δ t represents the slot length.
5. The method for controlling the operation of the hydrogen-containing building energy system based on the combined optimization and learning of claim 4, wherein the method for calculating the weighted sum Δ Y (t) of the single-slot lyapunov drift and the operation cost comprises:
the Lyapunov function L (t) is expressed by the formula:
Figure FDA00036801265600000310
where X_{B,t} = B_t + W_B and X_{H,t} = H_t + W_H; ω_r is a weighting factor that unifies the dimensions of X_{B,t} and X_{H,t}; B_t is the energy storage level of the electrical energy storage system in time slot t; H_t is the energy storage level of the hydrogen energy storage system in time slot t; W_B is the optimal parameter of the electrical energy storage system; W_H is the optimal parameter of the hydrogen energy storage system;
the dynamic constraints that B_t and H_t must satisfy are expressed as:
Figure FDA00036801265600000311
Figure FDA0003680126560000041
where P_{bc,t} and P_{bd,t} are the charging power and the discharging power of the electrical energy storage system, respectively; P_{el,t} and P_{fc,t} are the input power of the electrolyzer and the output power of the fuel cell in time slot t, respectively;
the single-time-slot Lyapunov drift is expressed as:
Λ_t = E{L(t+1) - L(t) | X(t)},
Figure FDA0003680126560000042
Figure FDA0003680126560000043
Figure FDA0003680126560000044
where X(t) = (X_{B,t}, X_{H,t}) and E{·} denotes the expectation operation;
the single-time-slot Lyapunov drift Λ_t can then be bounded as:
Λ_t ≤ ξ_B + ξ_H + E{Γ_0 | X(t)},
Figure FDA0003680126560000045
the weighted sum ΔY(t) of the single-time-slot Lyapunov drift and the operation cost is calculated as:
Figure FDA0003680126560000046
where V is a weighting parameter.
6. The method for controlling the operation of the hydrogen-containing building energy system through the combined optimization and learning of the claim 5 is characterized in that the expression formula of the single-time-slot optimization subproblem model is as follows:
Figure FDA0003680126560000051
s.t. the operation constraint of the electric energy subsystem, the operation constraint of the hydrogen energy subsystem and the operation constraint of the heat energy subsystem;
Figure FDA0003680126560000052
Figure FDA0003680126560000053
Figure FDA0003680126560000054
Figure FDA0003680126560000055
Figure FDA0003680126560000056
the optimal parameter W_B of the electrical energy storage system is calculated as:
Figure FDA0003680126560000057
the optimal parameter W_H of the hydrogen energy storage system is calculated as:
Figure FDA0003680126560000058
7. The method for controlling the operation of the hydrogen-containing building energy system based on the combined optimization and learning of claim 6, wherein the single-time-slot optimization sub-problem model is decomposed, according to information certainty, into an upper-layer sub-problem model corresponding to the electricity-hydrogen subsystem and a lower-layer sub-problem model corresponding to the thermal energy subsystem, as follows:
the upper-layer sub-problem model corresponding to the electricity-hydrogen subsystem is expressed as:
Figure FDA0003680126560000061
s.t. the operation constraint of the electric energy subsystem and the operation constraint of the hydrogen energy subsystem;
the lower-layer sub-problem model corresponding to the thermal energy subsystem is expressed as:
min V(C_{5,t} + C_{6,t})
s.t. operating constraints of the thermal energy subsystem.
8. The method for controlling the operation of the hydrogen-containing building energy system based on the combined optimization and learning of claim 7, wherein the method for modeling the lower layer subproblem model again based on the Markov game framework comprises the following steps:
the environmental state expression of the thermal energy subsystem is as follows:
s_t = (Q_{fc,t}, Q_{th,t}, β_{in,i,t}, β_{out,i,t}, t),
Figure FDA0003680126560000062
where Q_{fc,t} is the heat generated by the fuel cell in time slot t; Q_{th,t} is the energy storage level of the thermal energy storage system in the thermal energy subsystem in time slot t; β_{in,i,t} is the indoor temperature of the i-th room in time slot t; β_{out,t} is the outdoor temperature in time slot t; t represents the time interval between two consecutive action decisions executed by the hydrogen-containing building energy system; η_tc and η_td are the injection efficiency and the release efficiency of the thermal energy storage system, respectively; P_{tc,t} and P_{td,t} are the injection power and the release power of the thermal energy storage system in time slot t, respectively;
the action of the thermal energy subsystem is expressed as:
a_t = (P_{sp,1,t}, P_{sp,2,t}, …, P_{sp,i,t}), 1 ≤ i ≤ N_b
where P_{sp,i,t} is the heat supply power of the i-th room in time slot t; N_b is the number of rooms;
the reward of the thermal energy subsystem is expressed as:
Figure FDA0003680126560000071
where
Figure FDA0003680126560000072
wherein, κ th Is a penalty factor.
9. The method for controlling the operation of a hydrogen-containing building energy system based on combined optimization and learning of claim 8, wherein the method for solving by using a multi-agent depth of attention deterministic strategy gradient algorithm comprises:
at the beginning of each time slot, acquiring the environmental state of the heat energy subsystem;
the deep neural network outputs the current heat supply behavior of the hydrogen-containing building energy system to control the heat energy subsystem according to the environmental state of the current heat energy subsystem;
acquiring the reward of the next time slot and the environmental state of the next time slot; storing the rewards and the environment state of each time slot into an experience pool;
computing the loss function L(θ_i) of the deep neural network and the policy gradient
Figure FDA0003680126560000073
extracting training samples from the experience pool, training the deep neural network with the multi-agent attention deep deterministic policy gradient algorithm, and obtaining the loss function L(θ_i) and the policy gradient
Figure FDA0003680126560000074
iterating the deep neural network to obtain the optimal control strategy of the heat energy subsystem;
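The steps above can be sketched as a per-slot interaction loop with an experience pool. This is only a minimal illustration under stated assumptions: the names `ExperiencePool`, `run_time_slots`, `env_step`, `policy` and `update` are hypothetical, uniform sampling is assumed because the claim does not specify a sampling scheme, and the action is stored alongside the reward and state although the claim only names the latter two.

```python
import random
from collections import deque


class ExperiencePool:
    """Fixed-capacity pool of (state, action, reward, next_state) tuples."""

    def __init__(self, capacity):
        # The oldest transitions are discarded once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement (an assumption).
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)


def run_time_slots(env_step, policy, update, pool, initial_state,
                   num_slots, batch_size):
    """Interact for num_slots slots, store each transition, and call
    update on a sampled batch once the pool holds enough samples."""
    state = initial_state
    for _ in range(num_slots):
        action = policy(state)                        # heat supply behavior
        reward, next_state = env_step(state, action)  # reward and next state
        pool.store(state, action, reward, next_state)
        if len(pool) >= batch_size:
            update(pool.sample(batch_size))           # one training step
        state = next_state
    return state
```

In this sketch `policy` stands in for the actor network and `update` for one training step on the sampled batch; both are placeholders supplied by the caller.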
the loss function L(θ_i) for training the deep neural network and the policy gradient
Figure FDA0003680126560000075
are expressed as follows:
Figure FDA0003680126560000076
Figure FDA0003680126560000077
Figure FDA0003680126560000078
where π represents the policy of the agent (represented by the actor network); y represents the Q value output by the target critic network; and π' represents the target policy of the agent (represented by the target actor network);
Figure FDA0003680126560000079
represents the Q value output by the critic network of the i-th agent under the policy π; π_i(a_i|o_i) represents the actor network output of the i-th agent.
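As a rough illustration of how a critic loss of this kind is commonly evaluated, the sketch below assumes the form that is usual in deterministic actor-critic methods: a temporal-difference target y built from the target critic's Q value, and a mean squared error between the critic output and y. The expressions in the claim are given only as figures, so this form, the discount factor `gamma`, and the function names `td_target` and `critic_loss` are assumptions, not the claimed formulas.

```python
def td_target(reward, next_q_target, gamma):
    """Assumed target: y = r + gamma * Q'(o', a'), where Q' is the target
    critic and a' comes from the target actor (target policy)."""
    return reward + gamma * next_q_target


def critic_loss(q_values, targets):
    """Assumed loss: mean squared error between critic outputs and targets."""
    n = len(q_values)
    return sum((q - y) ** 2 for q, y in zip(q_values, targets)) / n
```

For example, `td_target(1.0, 2.0, 0.5)` evaluates to 2.0, and `critic_loss([1.0, 3.0], [2.0, 1.0])` evaluates to 2.5.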
10. The method for controlling the operation of a hydrogen-containing building energy system combining optimization and learning according to claim 9, wherein the framework of the multi-agent attention deep deterministic policy gradient algorithm comprises i agents, each agent having its own deep neural network, and each deep neural network comprising an actor network, a target actor network, a critic network and a target critic network; the actor network and the target actor network have the same structure, and the critic network and the target critic network have the same structure;
the number of neurons in the input layer of the actor network equals the number of components of the environment state s_t, and the number of neurons in the output layer equals the number of components of the behavior a_t; the critic network of the agent comprises an action behavior encoder module, an attention mechanism module and a multilayer perceptron module;
the input to the actor network of the i-th agent in the attention mechanism module is o_i and the output is a_i; the input to the critic network includes o_i, a_i and
Figure FDA0003680126560000081
the output is Q_i(o, a),
Figure FDA0003680126560000082
Figure FDA0003680126560000083
where o_i is the local observed state of the i-th agent; a_i is the output action; e_i represents the encoding of the local observation and behavior of the i-th agent; Q_i(o, a) is the Q value output by the critic network; and in the critic network of the i-th agent, the input to the attention module is
Figure FDA0003680126560000084
and the output is x_i, where x_i represents the contribution of the other agents;
the contribution x_i of the other agents is expressed as follows:
Figure FDA0003680126560000085
where W_{value,j} represents the value transformation matrix associated with the j-th agent;
Figure FDA0003680126560000086
is a non-linear activation function;
w_j is the weight associated with the j-th agent;
the weight w_j associated with the j-th agent is expressed as:
Figure FDA0003680126560000091
where W_{key,i} and W_{query,i} are, respectively, the key and query transformation matrices associated with the i-th agent.
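The attention computation described in claim 10 can be illustrated with a small sketch. Because the expressions for x_i and w_j appear only as figures, the sketch assumes the form commonly used in attention-based critics: w_j is a softmax over the inner product of the key-transformed encoding e_j and the query-transformed encoding e_i, and x_i is the w_j-weighted sum of the activated, value-transformed encodings of the other agents. The function names and the choice of leaky ReLU as the non-linear activation are assumptions.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def leaky_relu(vector, slope=0.01):
    # Stand-in for the unspecified non-linear activation (an assumption).
    return [x if x > 0 else slope * x for x in vector]


def attention_contribution(i, encodings, w_key_i, w_query_i, w_values):
    """Return (x_i, weights) for agent i.

    encodings: one encoding e_j per agent; w_key_i and w_query_i are the
    matrices of agent i; w_values[j] is the value matrix of agent j.
    Requires at least two agents.
    """
    query = matvec(w_query_i, encodings[i])
    others = [j for j in range(len(encodings)) if j != i]
    scores = [dot(matvec(w_key_i, encodings[j]), query) for j in others]
    # Softmax over the other agents, shifted by the maximum for stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    values = [leaky_relu(matvec(w_values[j], encodings[j])) for j in others]
    x_i = [sum(w * v[k] for w, v in zip(weights, values))
           for k in range(len(values[0]))]
    return x_i, dict(zip(others, weights))
```

The returned weights sum to one over the other agents, so x_i is a convex combination of their transformed encodings.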
CN202210631486.3A 2022-06-06 2022-06-06 Hydrogen-containing building energy system operation control method combining optimization and learning Active CN115130733B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210631486.3A CN115130733B (en) 2022-06-06 2022-06-06 Hydrogen-containing building energy system operation control method combining optimization and learning

Publications (2)

Publication Number Publication Date
CN115130733A true CN115130733A (en) 2022-09-30
CN115130733B CN115130733B (en) 2024-07-09

Family

ID=83378492

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210631486.3A Active CN115130733B (en) 2022-06-06 2022-06-06 Hydrogen-containing building energy system operation control method combining optimization and learning

Country Status (1)

Country Link
CN (1) CN115130733B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110458443A (en) * 2019-08-07 2019-11-15 Nanjing University of Posts and Telecommunications A kind of wisdom home energy management method and system based on deeply study
US20200301924A1 (en) * 2019-03-20 2020-09-24 Guangdong University Of Technology Method for constructing sql statement based on actor-critic network
CN112966444A (en) * 2021-03-12 2021-06-15 Nanjing University of Posts and Telecommunications Intelligent energy optimization method and device for building multi-energy system
US20220036392A1 (en) * 2020-08-03 2022-02-03 Desong Bian Deep Reinforcement Learning Based Real-time scheduling of Energy Storage System (ESS) in Commercial Campus

Also Published As

Publication number Publication date
CN115130733B (en) 2024-07-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant