CN115130733A - Hydrogen-containing building energy system operation control method combining optimization and learning - Google Patents
- Publication number
- CN115130733A (application number CN202210631486.3A)
- Authority
- CN
- China
- Prior art keywords
- hydrogen
- subsystem
- energy storage
- slot
- storage system
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J3/00—Circuit arrangements for AC mains or AC distribution networks
- H02J3/38—Arrangements for parallely feeding a single network by two or more generators, converters or transformers
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J2300/00—Systems for supplying or distributing electric power characterised by decentralized, dispersed, or local generation
- H02J2300/30—The power source being a fuel cell
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J2310/00—The network for supplying or distributing electric power characterised by its spatial reach or by the load
- H02J2310/10—The network having a local or delimited stationary reach
- H02J2310/12—The local stationary network supplying a household or a building
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Economics (AREA)
- Health & Medical Sciences (AREA)
- Human Resources & Organizations (AREA)
- Strategic Management (AREA)
- General Physics & Mathematics (AREA)
- Marketing (AREA)
- Tourism & Hospitality (AREA)
- General Health & Medical Sciences (AREA)
- General Business, Economics & Management (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Evolutionary Computation (AREA)
- Data Mining & Analysis (AREA)
- Public Health (AREA)
- Water Supply & Treatment (AREA)
- Biophysics (AREA)
- General Engineering & Computer Science (AREA)
- Primary Health Care (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Power Engineering (AREA)
- Development Economics (AREA)
- Game Theory and Decision Science (AREA)
- Entrepreneurship & Innovation (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Feedback Control In General (AREA)
Abstract
Description
Technical Field
The invention belongs to the field of building energy system operation control, and in particular relates to an operation control method for a hydrogen-containing building energy system.
Background
Buildings account for a large share of global energy consumption and carbon emissions. In 2019, buildings consumed about 30% of global energy and produced about 28% of global carbon emissions. The global energy supply currently relies mainly on non-renewable sources such as fossil fuels, which has led to increasingly serious energy depletion and environmental pollution. In recent years, hydrogen energy has attracted wide attention because it is clean, renewable, widely available, convenient to store and transport, and efficient to use, and it is recognized as a promising alternative to fossil fuels. In addition, coordinated operation of a hydrogen energy storage system with other energy storage systems (such as thermal and electrical energy storage systems) helps improve building energy efficiency. The operation control of hydrogen-containing building energy systems therefore merits in-depth study.
Existing studies have proposed several operation control methods for hydrogen-containing building energy systems, such as stochastic programming and model predictive control. These methods aim to minimize the system operating cost (mainly the energy cost and the carbon emission cost). Although they have made progress, none of them considers building thermal dynamics. As a result, high building thermal inertia (the phenomenon in which the indoor temperature responds to an initial excitation, such as a sudden stop in heating, in a weakened and delayed manner) is not fully exploited to reduce the operating cost.
When building thermal dynamics are taken into account in a hydrogen-containing building energy system, optimal operation control faces four challenges: (1) many system parameters are uncertain; (2) many operational constraints are coupled in time and space; (3) the fuel cell in the hydrogen energy storage system generates electricity and heat simultaneously, which couples the electrical and thermal energy flows; (4) it is difficult to establish an explicit building thermal dynamics model that is both accurate and tractable for building control. Specifically, the action-space dimension of single-agent deep reinforcement learning grows sharply with the number of thermal zones, while multi-agent deep reinforcement learning must coordinate heterogeneous agents and has difficulty learning effectively as the number of agents increases.
Summary of the Invention
The purpose of the invention is to provide an operation control method for a hydrogen-containing building energy system that combines optimization and learning. It exploits the dual advantages of model-based convex optimization and model-free learning to minimize the operating cost while maintaining high thermal comfort.
To achieve the above object, the invention adopts the following technical solution:
A first aspect of the invention provides an operation control method for a hydrogen-containing building energy system that combines optimization and learning, comprising:
establishing an expected operating cost minimization problem model of the hydrogen-containing building energy system according to its operational constraints and parameter uncertainties, and using the Lyapunov optimization framework to transform the expected operating cost minimization problem into multiple single-slot optimization sub-problem models;
decomposing the single-slot optimization sub-problem model into an upper-level sub-problem model corresponding to the electricity-hydrogen subsystem and a lower-level sub-problem model corresponding to the thermal energy subsystem;
solving the upper-level sub-problem model with a convex optimization method, and calculating the heat produced by the fuel cell from the solution of the upper-level sub-problem;
taking the fuel cell heat production as an input state of the lower-level sub-problem model, re-modeling the lower-level sub-problem model within the Markov game framework, and solving it with a multi-agent attention deep deterministic policy gradient algorithm to obtain the optimal control strategy of the thermal energy subsystem;
controlling the operation of the hydrogen-containing building energy system in real time according to the convex optimization solution of the upper-level sub-problem model and the optimal control strategy of the thermal energy subsystem.
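The data flow of the steps above can be sketched in a few lines of Python. This is only an illustration of how the upper-level solution feeds the lower-level policy within one time slot; `control_step` and all three toy functions (including the 0.5 heat-to-power ratio and the 1.5 kW per-room demand) are hypothetical stand-ins, not the patent's solver or trained networks.

```python
from typing import Callable, Dict, List


def control_step(
    upper_solver: Callable[[Dict[str, float]], Dict[str, float]],
    fuel_cell_heat: Callable[[Dict[str, float]], float],
    thermal_policy: Callable[[Dict[str, float]], List[float]],
    measurements: Dict[str, float],
) -> Dict[str, object]:
    """Run one time slot of the two-layer scheme."""
    # Upper layer: model-based optimization of the electricity-hydrogen subsystem.
    upper_decision = upper_solver(measurements)
    # The fuel cell heat production follows from the upper-layer solution.
    q_fc = fuel_cell_heat(upper_decision)
    # Lower layer: the learned policy receives the fuel cell heat as part of its state.
    thermal_state = dict(measurements, q_fc=q_fc)
    heat_supply = thermal_policy(thermal_state)
    return {"upper": upper_decision, "q_fc": q_fc, "heat_supply": heat_supply}


# Toy stand-ins (all hypothetical) that only demonstrate the data flow.
def toy_upper_solver(m: Dict[str, float]) -> Dict[str, float]:
    # Run the fuel cell at 2 kW when the purchase price is high, otherwise keep it off.
    return {"p_fc": 2.0 if m["buy_price"] > 0.5 else 0.0}


def toy_fuel_cell_heat(d: Dict[str, float]) -> float:
    # Assumed fixed heat-to-power ratio of 0.5.
    return 0.5 * d["p_fc"]


def toy_thermal_policy(s: Dict[str, float]) -> List[float]:
    # Supply the part of an assumed 1.5 kW demand not covered by fuel cell heat, for two rooms.
    return [max(1.5 - s["q_fc"], 0.0)] * 2


result = control_step(
    toy_upper_solver, toy_fuel_cell_heat, toy_thermal_policy,
    {"buy_price": 0.8, "t_out": 5.0},
)
```

Passing the three stages as callables keeps the sketch independent of any particular convex solver or neural network library.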
Preferably, the expected operating cost minimization problem model of the hydrogen-containing building energy system is expressed as:
s.t. the operational constraints of the electrical energy subsystem, the hydrogen energy subsystem, and the thermal energy subsystem;
where, for time slot $t$, $C_{1,t}$ is the cost of buying and selling electricity, $C_{2,t}$ is the carbon emission cost, $C_{3,t}$ is the degradation cost of the electrical energy storage system, $C_{4,t}$ is the operation and maintenance cost of the hydrogen energy subsystem, $C_{5,t}$ is the degradation cost of the thermal energy subsystem, and $C_{6,t}$ is the natural gas purchase cost; $T$ denotes the time slot length. The decision variables $\Theta$ comprise: the energy traded between the local energy system and the main grid, the charging and discharging power of the electrical energy storage system, the electrolyzer input power, the fuel cell output power, the heat supply power of each room, the charging and discharging power of the thermal energy storage system, and the natural gas consumption.
Preferably, the method for transforming the expected operating cost minimization problem into multiple single-slot optimization sub-problem models using the Lyapunov optimization framework comprises:
determining the controllability of the hydrogen-containing building energy system; for a hydrogen-containing building energy system that satisfies the controllable condition, constructing virtual queues for the electrical energy subsystem and the hydrogen energy subsystem; defining the Lyapunov function from the virtual queues and calculating the weighted sum $\Delta Y(t)$ of the single-slot Lyapunov drift and the operating cost; and, by minimizing the weighted sum $\Delta Y(t)$, transforming the expected operating cost minimization problem model into multiple single-slot optimization sub-problem models and determining the optimal system parameters in the single-slot optimization sub-problem model.
Preferably, the controllable condition is expressed as:
$v_{\max} > \tau_{\max}$,
$v_{\min} > \tau_{\min}$,
$v_{\max} = \max_t v_t$, $\tau_{\max} = \max_t \tau_t$, $v_{\min} = \min_t v_t$, $\tau_{\min} = \min_t \tau_t$,
where $v_{\max}$ and $v_{\min}$ are the highest and lowest electricity purchase prices; $\tau_{\max}$ and $\tau_{\min}$ are the highest and lowest electricity selling prices; $\eta_{bc}$ and $\eta_{bd}$ are the charging and discharging efficiencies of the electrical energy storage system; $\mu_c$ is a weighting parameter that expresses the importance of carbon emissions relative to energy cost; $\psi_{BESS}$ is the depreciation coefficient of the electrical energy storage system; $B_{\max}$ and $B_{\min}$ are the maximum and minimum energy storage levels of the electrical energy storage system; $\omega_{el}$ and $\omega_{fc}$ are the conversion coefficients of the electrolyzer and the fuel cell; $H_{\max}$ and $H_{\min}$ are the maximum and minimum energy storage levels of the hydrogen energy storage system; and $\Delta t$ is the time slot length. The condition further involves the maximum and minimum carbon emission rates, the rated injection and release powers of the electrical energy storage system, the indicator variables for whether the electrolyzer and the fuel cell are switched on, and the rated powers of the electrolyzer and the fuel cell.
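As a small illustration, the two price inequalities stated above can be checked directly from price series. The function name `price_condition_holds` and the sample prices are hypothetical, and the sketch covers only the inequalities $v_{\max} > \tau_{\max}$ and $v_{\min} > \tau_{\min}$, not any further terms of the controllable condition.

```python
from typing import Sequence


def price_condition_holds(buy_prices: Sequence[float], sell_prices: Sequence[float]) -> bool:
    """Check v_max > tau_max and v_min > tau_min over a price horizon."""
    v_max, v_min = max(buy_prices), min(buy_prices)
    tau_max, tau_min = max(sell_prices), min(sell_prices)
    return v_max > tau_max and v_min > tau_min


# Purchase prices dominate selling prices at both extremes.
ok = price_condition_holds([0.5, 0.8, 1.2], [0.3, 0.4, 0.6])
# The lowest purchase price equals the lowest selling price, so the strict inequality fails.
not_ok = price_condition_holds([0.5, 0.8], [0.5, 0.6])
```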
Preferably, the method for calculating the weighted sum $\Delta Y(t)$ of the single-slot Lyapunov drift and the operating cost comprises:
The Lyapunov function $L(t)$ is expressed as:
where $X_{B,t} = B_t + W_B$ and $X_{H,t} = H_t + W_H$; $\omega_r$ is a weighting coefficient that unifies the dimensions of $X_{B,t}$ and $X_{H,t}$; $B_t$ is the energy storage level of the electrical energy storage system in time slot $t$; $H_t$ is the energy storage level of the hydrogen energy storage system in time slot $t$; $W_B$ is the optimal parameter of the electrical energy storage system; and $W_H$ is the optimal parameter of the hydrogen energy storage system. $B_t$ and $H_t$ must each satisfy a dynamics constraint, in which $P_{bc,t}$ and $P_{bd,t}$ are the charging and discharging powers of the electrical energy storage system, and $P_{el,t}$ and $P_{fc,t}$ are the electrolyzer input power and the fuel cell output power in time slot $t$.
The single-slot Lyapunov drift is expressed as:
$\Lambda_t = \mathbb{E}\{L(t+1) - L(t) \mid X(t)\}$,
where $X(t) = (X_{B,t}, X_{H,t})$ and $\mathbb{E}\{\cdot\}$ denotes expectation.
The single-slot Lyapunov drift $\Lambda_t$ can then be bounded as:
$\Lambda_t \le \xi_B + \xi_H + \mathbb{E}\{\Gamma_0 \mid X(t)\}$,
The weighted sum $\Delta Y(t)$ of the single-slot Lyapunov drift and the operating cost is expressed as:
where $V$ is a weighting parameter.
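The drift-plus-cost idea can be illustrated numerically. The sketch below makes two labeled assumptions: a quadratic Lyapunov function $L = \tfrac{1}{2}(X_B^2 + \omega_r X_H^2)$, which is the usual choice in Lyapunov optimization but is not confirmed by the text, and a weighted sum of the form drift plus $V$ times cost. It evaluates a single realization rather than the conditional expectation, and all function names and numbers are hypothetical.

```python
from typing import Tuple


def virtual_queue(level: float, shift: float) -> float:
    """X = level + W, as in X_{B,t} = B_t + W_B and X_{H,t} = H_t + W_H."""
    return level + shift


def lyapunov(x_b: float, x_h: float, omega_r: float) -> float:
    # ASSUMPTION: quadratic form; the text's own L(t) may differ.
    return 0.5 * (x_b ** 2 + omega_r * x_h ** 2)


def drift_plus_cost(
    x_now: Tuple[float, float],
    x_next: Tuple[float, float],
    omega_r: float,
    v: float,
    cost: float,
) -> float:
    # One-sample drift L(t+1) - L(t); the text takes its conditional expectation.
    drift = lyapunov(*x_next, omega_r) - lyapunov(*x_now, omega_r)
    # ASSUMPTION: weighted sum = drift + V * operating cost of the slot.
    return drift + v * cost


# Battery level rises from 4 to 5 (W_B = -6); hydrogen level stays at 3 (W_H = -5).
x_now = (virtual_queue(4.0, -6.0), virtual_queue(3.0, -5.0))
x_next = (virtual_queue(5.0, -6.0), virtual_queue(3.0, -5.0))
value = drift_plus_cost(x_now, x_next, omega_r=1.0, v=2.0, cost=1.0)
```

A larger `v` puts more weight on the operating cost relative to keeping the virtual queues small.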
Preferably, the single-slot optimization sub-problem model is expressed as:
The optimal parameter $W_B$ of the electrical energy storage system is calculated as:
The optimal parameter $W_H$ of the hydrogen energy storage system is calculated as:
s.t. the operational constraints of the electrical energy subsystem, the hydrogen energy subsystem, and the thermal energy subsystem.
Preferably, the single-slot optimization sub-problem model is decomposed, according to information certainty, into an upper-level sub-problem model corresponding to the electricity-hydrogen subsystem and a lower-level sub-problem model corresponding to the thermal energy subsystem, as follows:
The upper-level sub-problem model corresponding to the electricity-hydrogen subsystem is expressed as:
s.t. the operational constraints of the electrical energy subsystem and the hydrogen energy subsystem;
The lower-level sub-problem model corresponding to the thermal energy subsystem is expressed as:
$\min\; V\,(C_{5,t} + C_{6,t})$ s.t. the operational constraints of the thermal energy subsystem.
Preferably, the method for re-modeling the lower-level sub-problem model within the Markov game framework comprises:
The environment state of the thermal energy subsystem is expressed as:
$s_t = (Q_{fc,t},\, Q_{th,t},\, \beta_{in,i,t},\, \beta_{out,t},\, t)$,
where $Q_{fc,t}$ is the heat produced by the fuel cell in time slot $t$; $Q_{th,t}$ is the energy storage level of the thermal energy storage system in the thermal energy subsystem in time slot $t$; $\beta_{in,i,t}$ is the indoor temperature of the $i$-th room in time slot $t$; $\beta_{out,t}$ is the outdoor temperature in time slot $t$; the component $t$ refers to the time interval between two consecutive action decisions of the hydrogen-containing building energy system; $\eta_{tc}$ and $\eta_{td}$ are the injection and release efficiencies of the thermal energy storage system; and $P_{tc,t}$ and $P_{td,t}$ are the injection and release powers of the thermal energy storage system in time slot $t$;
The action of the thermal energy subsystem is expressed as:
$a_t = (P_{sp,1,t},\, P_{sp,2,t},\, \ldots,\, P_{sp,i,t})$, $1 \le i \le N_b$,
where $P_{sp,i,t}$ is the heat supply power of the $i$-th room in time slot $t$, and $N_b$ is the number of rooms;
The reward of the thermal energy subsystem is expressed as:
where $\kappa_{th}$ is the penalty coefficient.
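The state and action definitions above map naturally onto simple tuples. The following sketch assumes that each room is handled by one agent whose local observation contains the shared quantities plus that room's own indoor temperature; the function names and sample values are hypothetical, and the reward is not modeled here.

```python
from typing import List, Tuple

# (Q_fc, Q_th, beta_in_i, beta_out, t)
Observation = Tuple[float, float, float, float, int]


def local_observations(
    q_fc: float,
    q_th: float,
    indoor_temps: List[float],
    outdoor_temp: float,
    slot: int,
) -> List[Observation]:
    """Build one observation per room from shared and room-specific quantities."""
    # ASSUMPTION: agent i sees only its own room temperature beta_in_i.
    return [(q_fc, q_th, temp_in, outdoor_temp, slot) for temp_in in indoor_temps]


def joint_action(per_room_power: List[float]) -> Tuple[float, ...]:
    """Stack the per-room heat supply powers P_sp,i into the action a_t."""
    return tuple(per_room_power)


obs = local_observations(1.0, 3.5, [20.5, 22.0], 4.0, 7)
action = joint_action([0.8, 0.3])
```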
Preferably, the method for solving with the multi-agent attention deep deterministic policy gradient algorithm comprises:
at the beginning of each time slot, obtaining the environment state of the thermal energy subsystem;
having the deep neural network output, according to the current environment state of the thermal energy subsystem, the current heat supply behavior of the hydrogen-containing building energy system to control the thermal energy subsystem;
obtaining the reward and the environment state of the next time slot, and storing the reward and environment state of each time slot in the experience pool;
computing the loss function $L(\theta_i)$ and the policy gradient of the deep neural network: training samples are drawn from the experience pool, the deep neural network is trained with the multi-agent attention deep deterministic policy gradient algorithm, and the network is iterated according to the loss function $L(\theta_i)$ and the policy gradient to obtain the optimal control strategy of the thermal energy subsystem.
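The experience pool used in these steps can be sketched as a bounded buffer with uniform random sampling. The class name `ExperiencePool`, the fixed capacity, and storing the action alongside state, reward, and next state are assumptions following common replay-buffer practice; the text itself only mentions storing rewards and environment states.

```python
import random
from collections import deque
from typing import Deque, List, Tuple

# (state, action, reward, next_state); including the action is an assumption.
Transition = Tuple[tuple, tuple, float, tuple]


class ExperiencePool:
    """Bounded buffer that discards the oldest transitions when full."""

    def __init__(self, capacity: int) -> None:
        self._buffer: Deque[Transition] = deque(maxlen=capacity)

    def store(self, state: tuple, action: tuple, reward: float, next_state: tuple) -> None:
        self._buffer.append((state, action, reward, next_state))

    def sample(self, batch_size: int) -> List[Transition]:
        # Uniform sampling without replacement from the stored transitions.
        return random.sample(list(self._buffer), batch_size)

    def __len__(self) -> int:
        return len(self._buffer)


pool = ExperiencePool(capacity=3)
for step in range(5):
    # Dummy transitions; the reward field carries the step index for inspection.
    pool.store((float(step),), (0.0,), float(step), (float(step + 1),))
batch = pool.sample(2)
```

Sampling from the pool rather than training on consecutive slots reduces the correlation between training samples.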
Preferably, the architecture of the multi-agent attention deep deterministic policy gradient algorithm comprises $i$ agents, each with its own deep neural network. Each deep neural network comprises an actor network, a target actor network, a critic network, and a target critic network. The actor network and the target actor network have the same structure, and the critic network and the target critic network have the same structure;
The number of neurons in the input layer of the actor network equals the number of components of the environment state $s_t$, and the number of neurons in its output layer equals the number of behaviors in $a_t$. The critic network of each agent comprises an action-behavior encoder module, an attention mechanism module, and a multilayer perceptron module;
In the attention mechanism module, the input of the $i$-th agent's actor network is $o_i$ and its output is $a_i$; the inputs of the critic network include $o_i$ and $a_i$, and its output is $Q_i(o,a)$,
where $o_i$ is the local observation of the $i$-th agent; $a_i$ is the output action; $e_i$ is the encoding of the local observation and behavior of the $i$-th agent; and $Q_i(o,a)$ is the Q value output by the critic network. In the critic network of the $i$-th agent, the attention module outputs $x_i$, which represents the contribution of the other agents;
The contribution $x_i$ of the other agents is expressed as:
where $W_{value,j}$ is the value transformation matrix associated with the $j$-th agent, applied together with a nonlinear activation function;
$w_j$ is the weight associated with the $j$-th agent;
The weight $w_j$ associated with the $j$-th agent is expressed as:
where $W_{key,i}$ and $W_{query,i}$ are the transformation matrices associated with the $i$-th agent.
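The attention step can be sketched in plain Python. The scoring rule (a softmax over key-query dot products) and the ReLU activation are assumptions borrowed from common attention-based critics, not formulas confirmed by the text, and a single shared set of matrices is used for brevity even though the text associates them with individual agents. All function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def softmax(scores: Vector) -> Vector:
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relu(v: Vector) -> Vector:
    return [max(0.0, a) for a in v]


def attention_contribution(
    i: int, encodings: List[Vector], w_key: Matrix, w_query: Matrix, w_value: Matrix
) -> Vector:
    """Weighted sum of the other agents' transformed encodings, seen from agent i."""
    query = matvec(w_query, encodings[i])
    others = [j for j in range(len(encodings)) if j != i]
    # ASSUMPTION: score_j = (W_key e_j) . (W_query e_i), normalized by softmax.
    scores = [
        sum(k * q for k, q in zip(matvec(w_key, encodings[j]), query)) for j in others
    ]
    weights = softmax(scores)
    # ASSUMPTION: ReLU as the nonlinear activation on W_value e_j.
    values = [relu(matvec(w_value, encodings[j])) for j in others]
    return [sum(w * val[d] for w, val in zip(weights, values)) for d in range(len(values[0]))]


identity = [[1.0, 0.0], [0.0, 1.0]]
encodings = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
x_0 = attention_contribution(0, encodings, identity, identity, identity)
```

Because the weights are normalized over the other agents, the size of $x_i$ does not grow with the number of agents, which is what makes this kind of critic scale.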
Preferably, the loss function $L(\theta_i)$ and the policy gradient used to train the deep neural network are expressed as:
where $\pi$ denotes the agents' policy (represented by the actor networks); $y$ denotes the output Q value of the target critic network; $\pi'$ denotes the agents' target policy (represented by the target actor networks); the critic network of the $i$-th agent outputs the Q value under policy $\pi$; and $\pi_i(a_i \mid o_i)$ denotes the output of the $i$-th agent's actor network.
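For orientation, a critic loss of the kind commonly used in deterministic policy gradient methods can be sketched as a mean squared error between critic outputs and targets. The one-step target $y = r + \gamma Q'$ with a discount factor $\gamma$ is an assumption taken from standard practice, and the function names are hypothetical; the text's own loss expression may differ.

```python
from typing import Sequence


def td_target(reward: float, gamma: float, target_q_next: float) -> float:
    # ASSUMPTION: one-step target built from the target critic's next-state Q value.
    return reward + gamma * target_q_next


def critic_loss(q_values: Sequence[float], targets: Sequence[float]) -> float:
    """Mean squared error between critic outputs and their targets."""
    return sum((q - y) ** 2 for q, y in zip(q_values, targets)) / len(q_values)


y = td_target(reward=1.0, gamma=0.9, target_q_next=2.0)
loss = critic_loss([1.0, 2.0], [2.0, 2.0])
```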
Compared with the prior art, the invention has the following beneficial effects:
The electricity-hydrogen subsystem is operated by optimization based on the upper-level sub-problem model, and the optimization result is used as an input state for the operation of the thermal energy subsystem. Multi-agent deep reinforcement learning is then used to learn the optimal operation control strategy of the thermal energy subsystem, which avoids heterogeneous agents. The attention mechanism makes the learning of this strategy highly scalable.
The invention exploits the dual advantages of model-based convex optimization and model-free learning, and minimizes the operating cost under high thermal comfort without requiring prior information on the uncertain parameters or an explicit building thermal dynamics model.
Description of the Drawings
Fig. 1 is a flow chart of an operation control method for a hydrogen-containing building energy system combining optimization and learning, provided by an embodiment of the invention;
Fig. 2 is a network framework diagram of the multi-agent attention deep deterministic policy gradient algorithm of the invention;
Fig. 3 compares the average temperature deviation of the embodiment of the invention with that of other schemes;
Fig. 4 compares the average operating cost of the embodiment of the invention with that of other schemes.
具体实施方式Detailed ways
下面结合附图对本发明作进一步描述。以下实施例仅用于更加清楚地说明本发明的技术方案,而不能以此来限制本发明的保护范围。The present invention will be further described below in conjunction with the accompanying drawings. The following examples are only used to illustrate the technical solutions of the present invention more clearly, and cannot be used to limit the protection scope of the present invention.
An operation control method for a hydrogen-containing building energy system combining optimization and learning comprises the following steps.
According to the operating constraints and parameter uncertainties of the hydrogen-containing building energy system, an expected operating cost minimization problem model is established.
The model minimizes the expected operating cost, which is built from six per-slot cost components $C_{1,t},\dots,C_{6,t}$, including:
$C_{2,t} = \mu_c \mu_{e,t} P_{g,t} \Delta t$
$C_{3,t} = \psi_{BESS}\,(|P_{bc,t}| + |P_{bd,t}|)$
$C_{5,t} = \psi_{TESS}\,(|P_{tc,t}| + |P_{td,t}|)$
s.t. the operating constraints of the electrical energy subsystem, the hydrogen energy subsystem, and the thermal energy subsystem.
In these formulas, $C_{1,t}$ is the cost of buying and selling electricity in slot $t$, $C_{2,t}$ is the carbon emission cost in slot $t$, $C_{3,t}$ is the wear cost of the electrical energy storage system in slot $t$, $C_{4,t}$ is the operation and maintenance cost of the hydrogen energy subsystem in slot $t$, $C_{5,t}$ is the wear cost of the thermal energy subsystem in slot $t$, and $C_{6,t}$ is the natural gas purchase cost in slot $t$. $T$ denotes the horizon length in time slots; $v_t$ and $\tau_t$ are the electricity buying and selling prices in slot $t$; $P_{g,t}$ is the energy traded between the hydrogen-containing building energy system and the main grid in slot $t$; $\mu_c$ is the carbon emission cost coefficient, in RMB/kg; $\mu_{e,t}$ is the carbon emission rate of the main grid in slot $t$; $\psi_{BESS}$ is the battery depreciation coefficient, in RMB/kW; $P_{bc,t}$ and $P_{bd,t}$ are the charging and discharging powers of the electrical energy storage system. The hydrogen subsystem cost $C_{4,t}$ involves the operation and maintenance cost, start-up cost, and shut-down cost of each component $x \in \{el, fc\}$ of the hydrogen energy storage system, where "el" and "fc" denote the electrolyzer and the fuel cell, together with logical indicator variables for the ON/OFF state, start-up state, and shut-down state of component $x$. $\psi_{TESS}$ is the depreciation coefficient of the thermal energy storage system, in RMB/kW; $P_{tc,t}$ and $P_{td,t}$ are the injection and release powers of the thermal energy storage system in slot $t$; $\eta_{gb}$ is the efficiency of converting natural gas into heat; $P_{gb,t}$ is the thermal power output of the natural gas boiler; $\lambda_{gb}$ is the natural gas price, in RMB/kWh.
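The three cost components whose expressions are given above can be evaluated directly. The following is a minimal sketch; the function names and all numeric values are illustrative assumptions and do not come from the patent.

```python
def carbon_cost(mu_c, mu_e_t, p_g_t, dt):
    """C_{2,t} = mu_c * mu_{e,t} * P_{g,t} * dt (carbon emission cost)."""
    return mu_c * mu_e_t * p_g_t * dt


def storage_wear_cost(psi, p_charge, p_discharge):
    """C_{3,t} or C_{5,t}: psi * (|P_charge| + |P_discharge|)."""
    return psi * (abs(p_charge) + abs(p_discharge))


# Hypothetical values, chosen only to exercise the formulas.
c2 = carbon_cost(mu_c=0.05, mu_e_t=0.8, p_g_t=10.0, dt=1.0)
c3 = storage_wear_cost(psi=0.01, p_charge=5.0, p_discharge=0.0)   # battery (psi_BESS)
c5 = storage_wear_cost(psi=0.002, p_charge=0.0, p_discharge=20.0)  # thermal store (psi_TESS)
print(c2, c3, c5)
```

The same helper serves both storage systems because $C_{3,t}$ and $C_{5,t}$ share one functional form and differ only in the depreciation coefficient.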
In the above operating cost minimization problem for the hydrogen-containing building energy system with hybrid hydrogen, electrical, and thermal storage, the decision variables $\Theta$ include: the energy traded between the local energy system and the main grid, the charging and discharging powers of the electrical energy storage system, the electrolyzer input power, the fuel cell output power, the heat supply power of each room, the charging and discharging powers of the thermal energy storage system, and the natural gas consumption. The constraints to be considered are the operating constraints of the hydrogen energy storage system, the electrical energy storage system, and the thermal energy storage system, and the constraints on the comfortable temperature range of the rooms, as follows:
(1) The hydrogen energy storage system shall satisfy $0 \le H_t \le H_{max}$ and $P_{el,t} \cdot P_{fc,t} = 0$, together with rated-power limits on the electrolyzer and the fuel cell, where $H_{max}$ is the maximum storage capacity of the hydrogen tank.
(2) The electrical energy storage system shall satisfy $B_{min} \le B_t \le B_{max}$ and $P_{bc,t} \cdot P_{bd,t} = 0$, together with limits on its maximum charging and discharging powers, where $B_{min}$ and $B_{max}$ are the minimum and maximum energy levels of the electrical energy storage system.
(3) During charging and discharging, the thermal energy storage system shall satisfy $P_{td,t} \cdot P_{tc,t} = 0$, together with a limit on its maximum capacity and limits on its maximum release power and maximum injection power.
(4) The heat load demand shall satisfy $\beta_{in,i,t+1} = F_i(P_{sp,i,t}, \beta_{out,t}, \beta_{in,i,t}, \varepsilon_{i,t})$, with the indoor temperature kept between the lower and upper limits of the comfortable temperature range of room $i$ and the heat supply power kept below the maximum heat supply power of room $i$. Here $\beta_{in,i,t}$ is the indoor temperature of room $i$ in slot $t$; $F_i$ is the thermal dynamics model of room $i$; and $\varepsilon_{i,t}$ is the random thermal disturbance in slot $t$.
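Constraints (1) to (3) share one pattern: a bounded storage level, bounded powers, and a complementarity condition that forbids charging and discharging in the same slot. A minimal feasibility check is sketched below; the function name, the argument names for the power limits, and the tolerance are assumptions made for illustration.

```python
def storage_feasible(level, level_min, level_max,
                     p_in, p_out, p_in_max, p_out_max, tol=1e-9):
    """Check level bounds, power bounds, and p_in * p_out = 0 for one slot."""
    within_level = level_min - tol <= level <= level_max + tol
    within_power = (-tol <= p_in <= p_in_max + tol
                    and -tol <= p_out <= p_out_max + tol)
    # Complementarity: the store may charge or discharge, but not both.
    exclusive = abs(p_in * p_out) <= tol
    return within_level and within_power and exclusive


# Hypothetical battery: B_min = 10, B_max = 100, both power limits 20.
print(storage_feasible(50.0, 10.0, 100.0, 15.0, 0.0, 20.0, 20.0))  # charging only
print(storage_feasible(50.0, 10.0, 100.0, 15.0, 5.0, 20.0, 20.0))  # charging and discharging
```

For the hydrogen tank the same check applies with `level_min = 0`, the electrolyzer input as `p_in`, and the fuel cell output as `p_out`.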
The method of using the Lyapunov optimization framework to transform the expected operating cost minimization problem into multiple single-slot optimization sub-problem models includes:
Determining the controllability of the hydrogen-containing building energy system, where the controllability conditions include:
$v_{max} > \tau_{max}$,
$v_{min} > \tau_{min}$,
$v_{max} = \max_t v_t$, $\tau_{max} = \max_t \tau_t$, $v_{min} = \min_t v_t$, $\tau_{min} = \min_t \tau_t$,
where $v_{max}$ and $v_{min}$ are the highest and lowest electricity buying prices, and $\tau_{max}$ and $\tau_{min}$ are the highest and lowest electricity selling prices. The other quantities entering the controllability conditions are: $\eta_{bc}$ and $\eta_{bd}$, the charging and discharging efficiencies of the electrical energy storage system; $\mu_c$, a weighting parameter expressing the importance of carbon emissions relative to energy cost; the maximum and minimum carbon emission rates; $\psi_{BESS}$, the depreciation coefficient of the electrical energy storage system; $B_{max}$ and $B_{min}$, the maximum and minimum energy levels of the electrical energy storage system; the rated injection and release powers of the electrical energy storage system; $\omega_{el}$ and $\omega_{fc}$, the conversion coefficients of the electrolyzer and the fuel cell; the indicator variables showing whether the electrolyzer and the fuel cell are switched on; $H_{max}$ and $H_{min}$, the maximum and minimum energy levels of the hydrogen energy storage system; the rated powers of the electrolyzer and the fuel cell; and $\Delta t$, the time slot length.
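The two price inequalities can be checked directly from price series. The sketch below covers only those two inequalities; the function name and the sample prices are hypothetical, and any further controllability conditions would need to be checked separately.

```python
def price_conditions_hold(buy_prices, sell_prices):
    """Return True when v_max > tau_max and v_min > tau_min."""
    v_max, v_min = max(buy_prices), min(buy_prices)
    tau_max, tau_min = max(sell_prices), min(sell_prices)
    return v_max > tau_max and v_min > tau_min


# Hypothetical hourly prices (RMB/kWh): buying is always dearer than selling.
buy = [0.9, 1.2, 0.6, 0.8]
sell = [0.5, 0.7, 0.3, 0.4]
print(price_conditions_hold(buy, sell))
```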
For a hydrogen-containing building energy system that satisfies the controllability conditions, virtual queues are constructed for the electrical energy subsystem and the hydrogen energy subsystem. A Lyapunov function is defined on the virtual queues, and the weighted sum $\Delta Y(t)$ of the single-slot Lyapunov drift and the operating cost is computed as follows.
The Lyapunov function $L(t)$ is defined on the virtual queues
$X_{B,t} = B_t + W_B$, $X_{H,t} = H_t + W_H$,
where $\omega_r$ is a weighting coefficient that brings $X_{B,t}$ and $X_{H,t}$ to a common dimension; $B_t$ is the energy level of the electrical energy storage system in slot $t$; $H_t$ is the energy level of the hydrogen energy storage system in slot $t$; $W_B$ is the optimal parameter of the electrical energy storage system; and $W_H$ is the optimal parameter of the hydrogen energy storage system. $B_t$ and $H_t$ evolve according to the storage dynamics, in which $P_{bc,t}$ and $P_{bd,t}$ are the charging and discharging powers of the electrical energy storage system, and $P_{el,t}$ and $P_{fc,t}$ are the electrolyzer input power and the fuel cell output power in slot $t$.
The single-slot Lyapunov drift is:
$\Lambda_t = E\{L(t+1) - L(t) \mid X(t)\}$,
where $X(t) = (X_{B,t}, X_{H,t})$ and $E\{\cdot\}$ denotes expectation.
The single-slot Lyapunov drift $\Lambda_t$ can then be upper-bounded as:
$\Lambda_t \le \xi_B + \xi_H + E\{\Gamma_0 \mid X(t)\}$.
The weighted sum $\Delta Y(t)$ combines the single-slot Lyapunov drift $\Lambda_t$ with the operating cost,
where $V$ is the weighting parameter applied to the cost.
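A drift-plus-penalty evaluation for one realized slot can be sketched as follows. The expression for $L(t)$ is not reproduced in the text, so this sketch assumes the common quadratic form $L(t) = \tfrac{1}{2}(X_{B,t}^2 + \omega_r X_{H,t}^2)$; it also evaluates a single sample path rather than the conditional expectation. Function names and numbers are illustrative assumptions.

```python
def lyapunov(x_b, x_h, omega_r):
    """Assumed quadratic Lyapunov function over the two virtual queues."""
    return 0.5 * (x_b ** 2 + omega_r * x_h ** 2)


def drift_plus_penalty(b_t, h_t, b_next, h_next, w_b, w_h, omega_r, v, cost_t):
    """Sample drift L(t+1) - L(t) plus V times the slot cost."""
    l_now = lyapunov(b_t + w_b, h_t + w_h, omega_r)          # X = level + W
    l_next = lyapunov(b_next + w_b, h_next + w_h, omega_r)
    return (l_next - l_now) + v * cost_t


# Hypothetical slot: the battery charges from 5 to 6, the hydrogen tank is idle.
value = drift_plus_penalty(b_t=5.0, h_t=2.0, b_next=6.0, h_next=2.0,
                           w_b=-10.0, w_h=-4.0, omega_r=1.0, v=2.0, cost_t=3.0)
print(value)
```

A larger $V$ puts more weight on cost relative to queue stability, which is the usual trade-off in Lyapunov optimization.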
By minimizing the weighted sum $\Delta Y(t)$, the expected operating cost minimization problem model of the hydrogen-containing building energy system is transformed into multiple single-slot optimization sub-problem models, each subject to the operating constraints of the electrical energy subsystem, the hydrogen energy subsystem, and the thermal energy subsystem.
The optimal system parameters of the single-slot optimization sub-problem model are then computed, namely the optimal parameter $W_B$ of the electrical energy storage system and the optimal parameter $W_H$ of the hydrogen energy storage system.
According to information certainty, the single-slot optimization sub-problem model is decomposed into an upper-level sub-problem model corresponding to the electricity-hydrogen subsystem and a lower-level sub-problem model corresponding to the thermal energy subsystem, as follows.
The upper-level sub-problem model, corresponding to the electricity-hydrogen subsystem, is solved
s.t. the operating constraints of the electrical energy subsystem and the hydrogen energy subsystem.
The lower-level sub-problem model, corresponding to the thermal energy subsystem, is:
$\min\; V\,(C_{5,t} + C_{6,t})$
s.t. the operating constraints of the thermal energy subsystem.
The upper-level sub-problem model is solved by convex optimization, and the heat produced by the fuel cell is computed from the solution, as follows.
Because the objective function of the upper-level sub-problem is non-convex, it is replaced by a convex relaxation whose maximum gap to the original objective is bounded. After this adjustment the whole problem is a linear program, so the optimal solution can be obtained quickly. The heat produced by the fuel cell is then obtained from the solution as $Q_{fc,t} = \eta_{hr}\,\eta_{h2e}\,P_{fc,t}\,\Delta t$, where $\eta_{hr}$ is the heat recovery efficiency, $\eta_{h2e}$ is the heat-to-electricity ratio of the fuel cell, and $P_{fc,t}$ is the fuel cell output power.
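The hand-off from the upper level to the lower level is a single multiplication. The sketch below evaluates $Q_{fc,t} = \eta_{hr}\,\eta_{h2e}\,P_{fc,t}\,\Delta t$; the function name and the numeric values are hypothetical.

```python
def fuel_cell_heat(eta_hr, eta_h2e, p_fc_t, dt):
    """Recoverable fuel cell heat Q_{fc,t} = eta_hr * eta_h2e * P_{fc,t} * dt."""
    return eta_hr * eta_h2e * p_fc_t * dt


# Hypothetical slot: 4 kW of fuel cell output over a 1 h slot.
q_fc = fuel_cell_heat(eta_hr=0.5, eta_h2e=1.0, p_fc_t=4.0, dt=1.0)
print(q_fc)  # heat passed to the thermal subsystem as part of its state
```

When the upper-level solution keeps the fuel cell off ($P_{fc,t} = 0$), the heat input to the thermal subsystem is zero for that slot.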
The heat produced by the fuel cell is used as an input state of the lower-level sub-problem model, which is reformulated within the Markov game framework as follows.
The environment state of the thermal energy subsystem is:
$s_t = (Q_{fc,t}, Q_{th,t}, \beta_{in,i,t}, \beta_{out,t}, t)$,
where $Q_{fc,t}$ is the heat produced by the fuel cell in slot $t$; $Q_{th,t}$ is the energy level of the thermal energy storage system of the thermal energy subsystem in slot $t$; $\beta_{in,i,t}$ is the indoor temperature of room $i$ in slot $t$; $\beta_{out,t}$ is the outdoor temperature in slot $t$; and $t$ is the time slot index, one slot being the interval between two consecutive action decisions of the hydrogen-containing building energy system. $\eta_{tc}$ and $\eta_{td}$ are the injection and release efficiencies of the thermal energy storage system, and $P_{tc,t}$ and $P_{td,t}$ are its injection and release powers in slot $t$.
The action of the thermal energy subsystem is:
$a_t = (P_{sp,1,t}, P_{sp,2,t}, \dots, P_{sp,N_b,t})$,
where $P_{sp,i,t}$ ($1 \le i \le N_b$) is the heat supply power of room $i$ in slot $t$, and $N_b$ is the number of rooms.
The reward of the thermal energy subsystem
includes a penalty term weighted by the penalty coefficient $\kappa_{th}$.
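The state and action definitions map naturally onto small data structures. The sketch below builds one agent's observation vector, clips a heat supply action to its admissible range, and computes a comfort penalty. The reward expression is not reproduced in the text, so the penalty used here (linear in the distance outside the comfort range, scaled by $\kappa_{th}$) is an assumption, as are all class, function, and argument names.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    q_fc: float      # fuel cell heat Q_{fc,t}
    q_th: float      # thermal storage level Q_{th,t}
    beta_in: float   # indoor temperature of room i
    beta_out: float  # outdoor temperature
    t: int           # time slot index

    def as_vector(self):
        """Flatten to the list fed to an actor network."""
        return [self.q_fc, self.q_th, self.beta_in, self.beta_out, float(self.t)]


def clip_action(p_sp, p_sp_max):
    """Keep the heat supply power within [0, p_sp_max]."""
    return min(max(p_sp, 0.0), p_sp_max)


def comfort_penalty(beta_in, beta_low, beta_high, kappa_th):
    """Assumed penalty: -kappa_th times the distance outside the comfort range."""
    deviation = max(beta_low - beta_in, 0.0) + max(beta_in - beta_high, 0.0)
    return -kappa_th * deviation


obs = Observation(q_fc=1.0, q_th=2.0, beta_in=18.0, beta_out=5.0, t=3)
print(obs.as_vector(), clip_action(12.0, 10.0),
      comfort_penalty(obs.beta_in, 19.0, 24.0, kappa_th=2.0))
```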
The problem is solved with the multi-agent attention deep deterministic policy gradient algorithm to obtain the optimal control strategy of the thermal energy subsystem, as follows.
At the beginning of each time slot, the environment state of the thermal energy subsystem is obtained.
Based on the current environment state of the thermal energy subsystem, the deep neural network outputs the current heat supply action of the hydrogen-containing building energy system, which is applied to control the thermal energy subsystem.
The reward and the environment state of the next time slot are obtained, and the reward and environment state of each time slot are stored in the experience pool.
Training samples are drawn from the experience pool, and the deep neural network is trained with the multi-agent attention deep deterministic policy gradient algorithm: the loss function $L(\theta_i)$ and the policy gradient are computed and used to update the deep neural network iteratively, yielding the optimal control strategy of the thermal energy subsystem.
In the expressions for the loss function $L(\theta_i)$ and the policy gradient used to train the deep neural network,
$\pi$ denotes the agents' policy (represented by the actor networks); $y$ denotes the target Q value output by the target critic network; $\pi'$ denotes the agents' target policy (represented by the target actor networks); the critic network of agent $i$ outputs the Q value under policy $\pi$; and $\pi_i(a_i \mid o_i)$ denotes the output of the actor network of agent $i$.
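The experience pool and the critic's regression target can be sketched without any deep learning library. The loss expression is not reproduced in the text, so the sketch assumes the standard deterministic policy gradient form: a target $y = r + \gamma Q'$ and a mean squared error loss. The discount factor `gamma` and all names are hypothetical.

```python
import random
from collections import deque


class ExperiencePool:
    """Fixed-capacity store of (obs, action, reward, next_obs) transitions."""

    def __init__(self, capacity):
        self._buffer = deque(maxlen=capacity)  # oldest transitions drop out first

    def store(self, obs, action, reward, next_obs):
        self._buffer.append((obs, action, reward, next_obs))

    def sample(self, batch_size):
        """Draw a mini-batch uniformly without replacement."""
        k = min(batch_size, len(self._buffer))
        return random.sample(list(self._buffer), k)

    def __len__(self):
        return len(self._buffer)


def td_target(reward, next_q_target, gamma):
    """y = r + gamma * Q'(next_obs, next_action) from the target critic."""
    return reward + gamma * next_q_target


def critic_loss(q_values, targets):
    """Mean squared error between critic outputs and targets."""
    return sum((q - y) ** 2 for q, y in zip(q_values, targets)) / len(q_values)


pool = ExperiencePool(capacity=2)
for step in range(3):
    pool.store([float(step)], [0.5], -1.0, [float(step + 1)])
print(len(pool), len(pool.sample(5)), td_target(1.0, 2.0, 0.5))
```

In a full implementation the critic and actor parameters would be updated from these quantities by an automatic differentiation framework; that part is omitted here.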
The architecture of the multi-agent attention deep deterministic policy gradient algorithm comprises multiple agents, indexed by $i$. Each agent has its own deep neural network, comprising an actor network, a target actor network, a critic network, and a target critic network. The actor network and the target actor network have the same structure, as do the critic network and the target critic network.
The number of neurons in the input layer of the actor network equals the number of components of the environment state $s_t$, and the number of neurons in the output layer equals the number of actions in $a_t$. The critic network of each agent comprises an action-behavior encoder module, an attention mechanism module, and a multilayer perceptron module.
The actor network of agent $i$ takes $o_i$ as input and outputs $a_i$; the inputs of the critic network include $o_i$ and $a_i$, and its output is $Q_i(o, a)$,
where $o_i$ is the local observation of agent $i$; $a_i$ is the action it outputs; $e_i$ is the encoding of the local observation and action of agent $i$; and $Q_i(o, a)$ is the Q value output by the critic network. Within the critic network of agent $i$, the attention module outputs $x_i$, which represents the contribution of the other agents.
In the expression for the contribution $x_i$ of the other agents,
$W_{value,j}$ is the value transformation matrix associated with agent $j$, the transformed values pass through a nonlinear activation function,
and $w_j$ is the weight associated with agent $j$.
In the expression for the weight $w_j$,
$W_{key,i}$ and $W_{query,i}$ are the key and query transformation matrices associated with agent $i$.
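The attention module can be illustrated with plain lists. The expressions for $x_i$ and $w_j$ are not reproduced in the text, so this sketch assumes the usual key-query-value form: each weight $w_j$ is a softmax over the inner product of the transformed encodings, and $x_i$ is the weighted sum of the activated value vectors of the other agents. It further assumes one shared set of matrices rather than per-agent matrices, and a ReLU activation. All names are hypothetical.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_contribution(i, encodings, w_key, w_query, w_value):
    """Contribution x_i of the other agents to agent i (needs >= 2 agents)."""
    query = matvec(w_query, encodings[i])
    others = [j for j in range(len(encodings)) if j != i]
    # Score each other agent by the inner product of its key with the query.
    scores = [sum(k * q for k, q in zip(matvec(w_key, encodings[j]), query))
              for j in others]
    weights = softmax(scores)
    # ReLU-activated value vectors of the other agents.
    values = [[max(z, 0.0) for z in matvec(w_value, encodings[j])] for j in others]
    return [sum(w * val[d] for w, val in zip(weights, values))
            for d in range(len(values[0]))]


identity = [[1.0, 0.0], [0.0, 1.0]]
e = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # three agents' encodings e_j
print(attention_contribution(0, e, identity, identity, identity))
```

Because the weights are computed per pair of agents rather than fixed by the number of agents, the same critic structure can be reused when rooms are added, which is the scalability property the attention mechanism is meant to provide.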
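The operation of the hydrogen-containing building energy system is controlled in real time according to the convex optimization solution of the upper-level sub-problem model and the optimal control strategy of the thermal energy subsystem.
Fig. 3 compares the performance of the method of the present invention with that of other schemes. Scheme 1 jointly controls the electrical energy storage system and the hydrogen energy storage system: when there is surplus renewable energy, both storage systems are charged; otherwise, both are discharged. In addition, an ON-OFF strategy controls the building heat supply power: when the indoor temperature is below the lower limit, the input heat power is 0; when the indoor temperature is above the upper limit, the input heat power is the maximum heat supply power. Scheme 2 uses the deep Q-network (DQN) algorithm to control the electrical and hydrogen energy storage systems, and also uses the ON-OFF strategy to control the building heat supply power. Scheme 3 uses the multi-agent deep deterministic policy gradient (MADDPG) algorithm to jointly control all energy storage devices and heat loads. Scheme 4 is similar to the method of the present invention but does not use the attention mechanism. As Fig. 4 shows, the method of the present invention significantly reduces the operating cost while maintaining high thermal comfort (for example, an average temperature deviation below 0.03 °C). Specifically, compared with Schemes 1, 2, 3, and 4, it reduces the average operating cost by 30.09%, 20.31%, 25.66%, and 18.53%, respectively.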
Those skilled in the art will appreciate that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make further improvements and modifications without departing from the technical principles of the present invention, and such improvements and modifications shall also fall within the scope of protection of the present invention.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210631486.3A CN115130733B (en) | 2022-06-06 | 2022-06-06 | Hydrogen-containing building energy system operation control method combining optimization and learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115130733A true CN115130733A (en) | 2022-09-30 |
CN115130733B CN115130733B (en) | 2024-07-09 |
Family
ID=83378492
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210631486.3A Active CN115130733B (en) | 2022-06-06 | 2022-06-06 | Hydrogen-containing building energy system operation control method combining optimization and learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115130733B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115907220A (en) * | 2022-12-26 | 2023-04-04 | 南京邮电大学 | Method and system for online optimization operation of hydrogen-containing energy system in uncertain environment |
CN119065418A (en) * | 2024-11-05 | 2024-12-03 | 宁波隆跃科技有限公司 | Mold temperature adaptive adjustment method and system based on machine learning |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110458443A (en) * | 2019-08-07 | 2019-11-15 | 南京邮电大学 | A smart home energy management method and system based on deep reinforcement learning |
US20200301924A1 (en) * | 2019-03-20 | 2020-09-24 | Guangdong University Of Technology | Method for constructing sql statement based on actor-critic network |
CN112966444A (en) * | 2021-03-12 | 2021-06-15 | 南京邮电大学 | Intelligent energy optimization method and device for building multi-energy system |
US20220036392A1 (en) * | 2020-08-03 | 2022-02-03 | Desong Bian | Deep Reinforcement Learning Based Real-time scheduling of Energy Storage System (ESS) in Commercial Campus |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115907220A (en) * | 2022-12-26 | 2023-04-04 | 南京邮电大学 | Method and system for online optimization operation of hydrogen-containing energy system in uncertain environment |
CN119065418A (en) * | 2024-11-05 | 2024-12-03 | 宁波隆跃科技有限公司 | Mold temperature adaptive adjustment method and system based on machine learning |
CN119065418B (en) * | 2024-11-05 | 2025-03-04 | 宁波隆跃科技有限公司 | Mold temperature adaptive adjustment method and system based on machine learning |
Also Published As
Publication number | Publication date |
---|---|
CN115130733B (en) | 2024-07-09 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |