CN116610037A - Comprehensive optimization control method for air quantity of ocean platform ventilation system

Info

Publication number
CN116610037A
CN116610037A CN202310868753.3A CN202310868753A CN116610037A CN 116610037 A CN116610037 A CN 116610037A CN 202310868753 A CN202310868753 A CN 202310868753A CN 116610037 A CN116610037 A CN 116610037A
Authority
CN
China
Prior art keywords
agent
cabin
fan
air valve
intelligent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310868753.3A
Other languages
Chinese (zh)
Other versions
CN116610037B (en)
Inventor
崔璨
王树青
王立豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ocean University of China
Original Assignee
Ocean University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ocean University of China
Priority to CN202310868753.3A
Publication of CN116610037A
Application granted
Publication of CN116610037B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/04 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
    • G05B13/042 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02B CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO BUILDINGS, e.g. HOUSING, HOUSE APPLIANCES OR RELATED END-USER APPLICATIONS
    • Y02B30/00 Energy efficient heating, ventilation or air conditioning [HVAC]
    • Y02B30/70 Efficient control or regulation technologies, e.g. for control of refrigerant flow, motor or heating

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Ventilation (AREA)

Abstract

The invention provides a comprehensive optimization control method for the air quantity of an ocean platform ventilation system. Each cabin air valve is defined as a cabin air valve agent and the total air valve is defined as a fan agent; the observation of each cabin air valve agent at the current time, the observation of the fan agent at the current time, and the actions of the cabin air valve agents and the fan agent are then defined. Penalties for the cabin air valve agents and for the fan agent are defined, taking the hazardous gas concentration and human comfort into account. Based on the cabin air valve agent penalties, the fan agent penalties and the total reward function of the fan agent and the cabin air valve agents, an air quantity control neural network is trained, and the trained policy networks are used to make control decisions. By introducing a hazardous gas penalty, the invention can effectively suppress the accumulation and diffusion of hazardous gas when the leakage rate is small, protecting the safety and health of staff on the ocean platform.

Description

Comprehensive optimization control method for air quantity of ocean platform ventilation system
Technical Field
The invention relates to the technical field of ocean engineering, in particular to ventilation control methods for ocean platforms, and more particularly to a comprehensive optimization control method for the air quantity of an ocean platform ventilation system.
Background
The ocean contains more than one third of the earth's oil and gas resources, 44% of which lie in deep sea areas with water depths of more than 300 meters. Ocean platforms are used for offshore operations and for developing deep-sea oil and gas resources.
The ventilation system manages ventilation on the ocean platform and is an essential part of it: it provides a comfortable and healthy environment for workers and a safe and reliable operating environment for production equipment. For both comfort and safety, and in both living and production areas, the use of a ventilation system on the ocean platform must be considered.
During oil and gas production, various toxic, harmful, flammable or explosive hazardous gases can be generated, such as methane (CH4), carbon monoxide (CO) and hydrogen sulfide (H2S). When a harmful gas reaches a certain concentration it can injure the human body, and in severe cases it can cause death. At higher concentrations, some flammable and explosive gases mix with air to form explosive mixtures, which can burn or explode on contact with open flame or high heat. Research shows that when the leakage rate is small, proper ventilation measures can effectively suppress the accumulation and diffusion of hazardous gases, prevent them from reaching dangerous concentrations, and protect the safety and health of workers on the ocean platform.
Compared with other ventilation systems, the ventilation system of an ocean platform has its own requirements and difficulties. For example, every cabin of the ocean platform must ensure the comfort and health of workers and prevent the accumulation and diffusion of hazardous gases; an ocean platform often has several areas that need ventilation, the cabin environments differ, and their different ventilation requirements must each be met; and, on the premise of safety and comfort, energy consumption must also be considered and energy use reduced.
In the prior art, ocean platform ventilation is mainly implemented by monitoring parameters in the cabin such as air flow, temperature and humidity, pressure and the concentration of the relevant gases. This approach provides only the basic function of ventilation control and depends on hardware such as fans, sensors and a PLC (programmable logic controller). The parameters are tuned manually by staff, and the tuning quality depends on their experience and skill, which introduces great uncertainty into the quality and performance of the ocean platform ventilation system.
There are also methods that build a thermodynamic model of the ocean platform cabins. Because the requirements of the cabins differ, each cabin must be modeled separately, and an accurate thermodynamic model is very difficult to build, which leads to complicated design and heavy computation. Model-based methods also generalize poorly and cannot fully cope with varied ventilation environments. Most importantly, many existing ocean platform ventilation systems consider only safety and comfort, and do not consider the influence of energy conservation, harmful gases and other factors on the ventilation control strategy.
Disclosure of Invention
The invention aims to solve one of the technical problems and provides a comprehensive optimization control method for the air quantity of an ocean platform ventilation system.
In order to achieve the above purpose, the invention adopts the following technical scheme:
A comprehensive optimization control method for the air quantity of an ocean platform ventilation system, wherein the platform comprises a main ventilation pipeline and a plurality of cabins; a total air valve is provided on the main ventilation pipeline, and the main ventilation pipeline communicates with each cabin; each cabin is provided with a variable air volume box, and each box is provided with a cabin air valve;
the control method comprises the following steps:
S1: defining the cabin air valve agents and the fan agent;
defining each cabin air valve as a cabin air valve agent, and defining the total air valve as the fan agent;
defining the observation of cabin air valve agent $i$ at the current time $t$:
wherein $i \in \{1, \dots, N\}$ and $N$ is the number of cabins;
defining the observation of the fan agent at the current time $t$:
wherein the observations are composed of the following quantities: the outdoor temperature of the ocean platform at time $t$; the set $M$ of the other cabin areas of the ocean platform excluding cabin $i$; the indoor temperature of cabin $i$ at time $t$; the indoor temperature of cabin $j$ ($j \in M$) at time $t$; the number of people in cabin $i$ at time $t$; the index of the time interval within a day; and the methane, carbon monoxide and hydrogen sulfide concentrations in cabin $i$ at time $t$;
defining the actions of the cabin air valve agents and the fan agent:
$$a_t = (a_{1,t}, \dots, a_{N,t}, a_{N+1,t})$$
wherein $a_{i,t}$ is the action of the $i$-th cabin air valve agent at time $t$, and $a_{N+1,t}$ is the action of the fan agent at time $t$;
defining the states of the cabin air valve agents and the fan agent:
$$s_t = (s_{1,t}, \dots, s_{N,t}, s_{N+1,t})$$
wherein $s_{i,t}$ is the state of the $i$-th cabin air valve agent at time $t$, $i \in \{1, \dots, N\}$, and $s_{N+1,t}$ is the state of the fan agent at time $t$;
S2: defining the penalties for the cabin air valve agents and for the fan agent;
defining a hazardous gas concentration penalty for the cabin air valve agents, comprising a methane penalty, a carbon monoxide penalty and a hydrogen sulfide penalty;
setting the hazardous gas over-range penalty of cabin air valve agent $i$:
defining a hazardous gas concentration penalty for the fan agent, comprising a methane penalty, a carbon monoxide penalty and a hydrogen sulfide penalty;
setting the hazardous gas over-range penalty of the fan agent:
wherein the coefficients are set weights; the terms of the first expression are the methane, carbon monoxide and hydrogen sulfide penalties set for the cabin air valve agent; and the terms of the second expression are the methane, carbon monoxide and hydrogen sulfide penalties set for the fan agent;
defining the out-of-comfort penalty of the cabin air valve agent, which combines two terms through a weight:
wherein one term is the penalty for the cabin air valve agent exceeding the comfort temperature range, the other term is the penalty for the cabin air valve agent exceeding the comfort humidity range, and the weight balances the penalty for exceeding the temperature comfort range against the penalty for exceeding the humidity comfort range;
defining the power consumption penalty:
wherein the terms are the energy consumption of the cooling coil at time $t$ and the energy consumption of the supply fan at time $t$, and the remaining factor is the sampling interval;
defining the total reward function of the fan agent and the cabin air valve agents:
wherein the coefficients are reward weights;
S3: training an air quantity control neural network based on the cabin air valve agent penalties, the fan agent penalties and the total reward function of the fan agent and the cabin air valve agents, wherein each cabin air valve agent and the fan agent each correspond to a policy network and a value network, and using the trained policy networks to make the control decisions of each cabin air valve agent and the fan agent.
In some embodiments of the invention, the step of neural network training includes:
S31: initializing the experience data pool and the neural network parameters, and defining the number of rounds $T$ and the number of steps per round;
S32: resetting the parameters and environments of all cabin air valve agents and the fan agent to obtain the initial observations of the fan agent and each cabin air valve agent;
S33: obtaining the action output of each cabin air valve agent and the fan agent from the policy function:
$$a_{i,t} = \mu_i(o_{i,t}; \theta_i) + \epsilon$$
wherein $\epsilon$ is normally distributed noise with mean 0, $o_{i,t}$ is the state observation of agent $i$, and $\theta_i$ is the policy network parameter of agent $i$; for the cabin air valve agents $i \in \{1, \dots, N\}$, where $N$ is the number of cabins, and for the fan agent $i = N+1$;
S34: executing the action $a_t$ to obtain the rewards $r_t$ of the cabin air valve agents and the fan agent at the current time and their observations $o_{t+1}$ at the next time, and transmitting $o_t$, $a_t$, $r_t$ and $o_{t+1}$ to the central processing unit, wherein $o_t$ is the observation at the current time;
S35: storing the transition $(o_t, a_t, r_t, o_{t+1})$ in the experience pool $D$, and updating the state $o_t \leftarrow o_{t+1}$;
S36: each cabin air valve agent and the fan agent randomly select from $D$ a transition dataset $(o, a, r, o')$ of size $b$, wherein $o'$ represents the observation at the next time;
S37: all $N+1$ target policy networks make predictions:
$$\hat a_{i,t+1} = \mu_i(o_{i,t+1}; \theta_i^-)$$
wherein $\hat a_{i,t+1}$ is the predicted action of the $i$-th agent at time $t+1$, $o_{i,t+1}$ is the state observation of the $i$-th agent at time $t+1$, and $\theta_i^-$ is the parameter of the target policy network of agent $i$ before the update; for the cabin air valve agents $i \in \{1, \dots, N\}$, where $N$ is the number of cabins, and for the fan agent $i = N+1$;
S38: all $2(N+1)$ target value networks make predictions:
$$\hat q^-_{1,i,t+1} = q_i(s_{t+1}, \hat a_{t+1}; w^-_{1,i}), \qquad \hat q^-_{2,i,t+1} = q_i(s_{t+1}, \hat a_{t+1}; w^-_{2,i})$$
taking the smaller of the two:
$$\hat q^-_{i,t+1} = \min\left(\hat q^-_{1,i,t+1},\ \hat q^-_{2,i,t+1}\right)$$
calculating the TD target:
$$\hat y_{i,t} = r_{i,t} + \gamma\, \hat q^-_{i,t+1}$$
wherein $\hat q^-_{1,i,t+1}$ and $\hat q^-_{2,i,t+1}$ are the value predictions for time $t+1$ made by the 1st and 2nd target value networks of agent $i$, $s_{t+1}$ is the state of the $N+1$ agents at time $t+1$, $\hat a_{t+1}$ is the predicted action of the $N+1$ agents at time $t+1$, $w^-_{1,i}$ and $w^-_{2,i}$ are the parameters of the 1st and 2nd target value networks of agent $i$ before the update, $\hat q^-_{i,t+1}$ is the smaller of the two target value predictions of agent $i$, $\hat y_{i,t}$ is the TD target of agent $i$ at time $t$, $r_{i,t}$ is the reward obtained by agent $i$ in the transition from time $t$ to time $t+1$, and $\gamma$ is the discount factor; for the cabin air valve agents $i \in \{1, \dots, N\}$, where $N$ is the number of cabins, and for the fan agent $i = N+1$;
S39: all $2(N+1)$ value networks make predictions:
$$\hat q_{1,i,t} = q_i(s_t, a_t; w_{1,i}), \qquad \hat q_{2,i,t} = q_i(s_t, a_t; w_{2,i})$$
calculating the TD errors:
$$\delta_{1,i,t} = \hat q_{1,i,t} - \hat y_{i,t}, \qquad \delta_{2,i,t} = \hat q_{2,i,t} - \hat y_{i,t}$$
wherein $\hat q_{1,i,t}$ and $\hat q_{2,i,t}$ are the value predictions for time $t$ made by the 1st and 2nd value networks of agent $i$, $s_t$ is the state of the cabin air valve agents and the fan agent at time $t$, $a_t$ is the action of the cabin air valve agents and the fan agent at time $t$, $w_{1,i}$ and $w_{2,i}$ are the parameters of the 1st and 2nd value networks of agent $i$ before the update, and $\delta_{1,i,t}$ and $\delta_{2,i,t}$ are the TD errors of the 1st and 2nd value networks of agent $i$; for the cabin air valve agents $i \in \{1, \dots, N\}$, where $N$ is the number of cabins, and for the fan agent $i = N+1$;
S40: randomly drawing a quadruple $(s_t, a_t, r_t, s_{t+1})$ from the experience replay array and updating all $2(N+1)$ value networks:
$$w^{new}_{1,i} = w_{1,i} - \alpha\, \delta_{1,i,t}\, \nabla_{w} q_i(s_t, a_t; w_{1,i}), \qquad w^{new}_{2,i} = w_{2,i} - \alpha\, \delta_{2,i,t}\, \nabla_{w} q_i(s_t, a_t; w_{2,i})$$
wherein $w^{new}_{1,i}$ and $w^{new}_{2,i}$ are the updated parameters of the 1st and 2nd value networks of agent $i$, $\alpha$ is the learning rate, $\delta_{1,i,t}$ and $\delta_{2,i,t}$ are the TD errors of the 1st and 2nd value networks of agent $i$, and $\nabla_{w}$ denotes the gradient with respect to the parameters of the value network; for the cabin air valve agents $i \in \{1, \dots, N\}$, where $N$ is the number of cabins, and for the fan agent $i = N+1$;
S41: once every set interval, all $N+1$ policy networks make predictions to obtain $\hat a_t$, and the $N+1$ policy networks and the $3(N+1)$ target networks are updated;
letting all $N+1$ policy networks make predictions:
$$\hat a_{i,t} = \mu_i(o_{i,t}; \theta_i)$$
summarizing the predictions into:
$$\hat a_t = (\hat a_{1,t}, \dots, \hat a_{N+1,t})$$
updating all $N+1$ policy networks:
$$\theta^{new}_i = \theta_i + \beta\, \nabla_{\theta_i} \mu_i(o_{i,t}; \theta_i)\, \nabla_{\hat a_{i,t}} q_i(s_t, \hat a_t; w_{1,i})$$
updating all $3(N+1)$ target networks:
$$\theta_i^{-} \leftarrow \rho\, \theta^{new}_i + (1-\rho)\, \theta_i^{-}, \qquad w^{-}_{1,i} \leftarrow \rho\, w^{new}_{1,i} + (1-\rho)\, w^{-}_{1,i}, \qquad w^{-}_{2,i} \leftarrow \rho\, w^{new}_{2,i} + (1-\rho)\, w^{-}_{2,i}$$
wherein $\theta_i$ is the parameter of the policy network of agent $i$ before the update, $\beta$ is the policy learning rate, and $\rho$ is the update rate; for the cabin air valve agents $i \in \{1, \dots, N\}$, where $N$ is the number of cabins, and for the fan agent $i = N+1$;
S42: repeating S31-S41 until the maximum number of rounds $T$ is reached.
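The quantities computed in steps S38 to S41 can be illustrated with a small sketch. It assumes the usual twin-delayed form (minimum of two target value predictions, a discounted TD target, and a soft update of target parameters); the function names, the discount factor `gamma`, the update rate `rate` and the update delay `delay` are hypothetical and are not values given in the text.

```python
def td_target(reward, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: reward plus discounted minimum of the two
    target value predictions (gamma is an assumed discount factor)."""
    return reward + gamma * min(q1_next, q2_next)


def td_errors(q1, q2, target):
    """TD errors of the two value networks against the shared TD target."""
    return q1 - target, q2 - target


def soft_update(target_params, online_params, rate=0.01):
    """Move every target parameter a fraction `rate` toward its online value."""
    return [rate * w + (1.0 - rate) * w_t
            for w_t, w in zip(target_params, online_params)]


def should_update_policy(step, delay=2):
    """Delayed update: refresh policy and target networks every `delay` steps."""
    return step % delay == 0


# One toy transition for a single agent.
y = td_target(reward=-1.0, q1_next=2.0, q2_next=1.5, gamma=0.9)
d1, d2 = td_errors(q1=1.0, q2=2.0, target=0.5)
new_targets = soft_update([0.0, 1.0], [1.0, 1.0], rate=0.1)
```

Taking the minimum of the two target predictions is what counters value overestimation; the value networks can be updated at every step while the policy and target networks are refreshed less often.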
In some embodiments of the invention:
the methane gas penalty is set for the cabin air valve intelligent agent as follows:
the carbon monoxide gas penalty is set for the cabin air valve intelligent agent as follows:
the hydrogen sulfide gas punishment is set for the cabin air valve intelligent agent:
wherein the expressions use the methane, carbon monoxide and hydrogen sulfide concentrations in cabin $i$ at time $t$, and the highest safe concentrations of methane, carbon monoxide and hydrogen sulfide in cabin $i$.
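The exact penalty expressions are not reproduced in this text. As a minimal sketch, assuming each gas penalty is simply the amount by which the measured concentration exceeds the highest safe concentration (a hinge form, which is an assumption), the cabin-level penalty could be computed as follows. The dictionary keys, weights and numeric values are hypothetical.

```python
def over_limit(concentration, limit):
    """Amount by which a concentration exceeds its limit (0 when within range)."""
    return max(0.0, concentration - limit)


def cabin_gas_penalty(conc, limits, weights):
    """Weighted sum of the per-gas exceedances for one cabin (non-negative)."""
    return sum(weights[g] * over_limit(conc[g], limits[g]) for g in limits)


# Hypothetical readings and limits for one cabin.
limits = {"CH4": 1.0, "CO": 20.0, "H2S": 20.0}
weights = {"CH4": 1.0, "CO": 1.0, "H2S": 1.0}
readings = {"CH4": 0.4, "CO": 26.0, "H2S": 5.0}
penalty = cabin_gas_penalty(readings, limits, weights)
```

With this form the penalty is zero whenever all three gases are within their limits, so the agent is only pushed to ventilate harder when a limit is actually exceeded.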
In some embodiments of the invention:
methane gas punishment is set for the fan agent:
carbon monoxide gas punishment is set for the fan intelligent body:
the punishment of hydrogen sulfide gas is set for the fan intelligent body:
wherein the expressions use the methane, carbon monoxide and hydrogen sulfide concentrations in cabin $i$ at time $t$, and the highest safe concentrations of methane, carbon monoxide and hydrogen sulfide in cabin $i$.
In some embodiments of the invention:
defining the penalty for the cabin air valve agent exceeding the comfort temperature range:
defining the penalty for the cabin air valve agent exceeding the comfort humidity range:
wherein the occupancy indicator only shows whether platform operators are present in cabin $i$ at time $t$ and has no other physical meaning: it equals 1 if someone is present and 0 if no one is present; the remaining quantities are the indoor temperature of cabin $i$ at time $t$, the maximum and minimum comfort temperatures of cabin $i$, the indoor humidity of cabin $i$ at time $t$, and the maximum and minimum comfort humidities of cabin $i$.
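A minimal sketch of how such a comfort penalty might look, assuming the penalty is the distance outside the comfort band and is applied only when the cabin is occupied. The hinge form, the function names and the `humidity_weight` parameter are assumptions made for illustration.

```python
def out_of_range(value, low, high):
    """Distance by which a value lies outside [low, high] (0 when inside)."""
    return max(0.0, value - high) + max(0.0, low - value)


def comfort_penalty(occupied, temp, humidity, temp_range, hum_range,
                    humidity_weight=1.0):
    """Temperature term plus weighted humidity term, gated by occupancy."""
    if not occupied:
        return 0.0  # the occupancy indicator is 0, so no comfort penalty applies
    temp_term = out_of_range(temp, *temp_range)
    hum_term = out_of_range(humidity, *hum_range)
    return temp_term + humidity_weight * hum_term


# Occupied cabin that is 2 degrees too warm and 5 % too humid.
p = comfort_penalty(True, 30.0, 75.0, (24.0, 28.0), (30.0, 70.0),
                    humidity_weight=0.5)
```

Gating on occupancy means an empty cabin contributes no comfort penalty, which leaves the agents free to save energy there.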
In some embodiments of the invention, the out-of-comfort penalty of the cabin air valve agent is defined as 0.
In some embodiments of the invention:
wherein $c_p$ is the specific heat of air, $\eta$ is the inverse coefficient of performance of the cooling coil, $d_r$ is the fresh air ratio, $m_i^z$ is the air volume of zone $i$, $P_{f,ref}(k)$ is the reference fan energy consumption, $m_{ref}$ is the total air volume provided by the reference fan, $T_i(k)$ and $T_o(k)$ are the cabin temperature and the outdoor temperature at time $k$, respectively, and $T_c$ is the set temperature of the cabin.
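The energy expressions themselves are not reproduced in this text. The symbols listed are consistent with a common steady-state model in which coil power is proportional to airflow times the gap between the mixed-air temperature and the set temperature, and fan power scales with the cube of the total airflow relative to a reference fan. The sketch below assumes that form; the function names and default values are hypothetical.

```python
def cooling_coil_power(zone_flows, zone_temps, outdoor_temp, set_temp,
                       fresh_air_ratio, cp=1.005, eta=0.3):
    """Assumed form: cp * eta * sum_i m_i * (d_r*T_o + (1 - d_r)*T_i - T_c)."""
    total = 0.0
    for m_i, t_i in zip(zone_flows, zone_temps):
        mixed = fresh_air_ratio * outdoor_temp + (1.0 - fresh_air_ratio) * t_i
        total += m_i * (mixed - set_temp)
    return cp * eta * total


def fan_power(zone_flows, p_ref, m_ref):
    """Assumed cubic fan law: P_ref * (total airflow / reference airflow)^3."""
    return p_ref * (sum(zone_flows) / m_ref) ** 3


def energy_penalty(p_coil, p_fan, interval_h=0.25):
    """Energy used over one sampling interval (power times interval length)."""
    return (p_coil + p_fan) * interval_h


p_c = cooling_coil_power([1.0], [26.0], 30.0, 18.0, 0.5, cp=1.0, eta=0.5)
p_f = fan_power([1.0, 1.0], p_ref=8.0, m_ref=4.0)
e = energy_penalty(p_c, p_f, interval_h=0.25)
```

The cubic term is why lowering total airflow saves fan energy disproportionately: halving the airflow in this model cuts fan power to one eighth.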
The comprehensive optimization control method for the air quantity of the ocean platform ventilation system has the beneficial effects that:
1. The invention introduces a hazardous gas penalty, so that when the leakage rate of hazardous gases (methane, carbon monoxide and hydrogen sulfide) is small, their accumulation and diffusion can be effectively suppressed and they are prevented from reaching dangerous concentrations, protecting the safety and health of workers on the ocean platform. The invention also takes energy consumption into account and reduces it on the premise of ensuring safety and comfort, contributing to the "dual carbon" goals.
2. The ocean platform ventilation control method provided by the invention adjusts cabin air volume automatically. Training the agents requires only the current observations; it needs no prior knowledge of the uncertain parameters in the system and no thermodynamic model or manual parameter tuning, which avoids the inaccurate control caused by an inaccurate model or poorly tuned parameters. A ventilation system trained with the method can quickly adjust the ventilation quantity from any given initial value and bring the control indexes into a reasonable range.
3. The MATD3 reinforcement learning algorithm is applied to training of the offshore platform ventilation system for the first time, and the trained offshore platform ventilation system can control the ventilation quantity of each cabin under the conditions of given outdoor temperature and the number of people in each cabin so as to meet different ventilation requirements of each cabin, prevent accumulation and diffusion of toxic dangerous gases and ensure safety and comfort of workers on the offshore platform; meanwhile, energy consumption is considered on the premise of ensuring safety and comfort, and the use of energy sources is reduced.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of the single-agent neural network of the invention;
FIG. 2 is a flowchart of the neural network training process of the present invention;
FIG. 3 is a schematic diagram of a distributed decision according to the present invention.
Detailed Description
In order to make the technical problems, technical schemes and beneficial effects to be solved more clear, the invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The invention provides a comprehensive optimization control method for the air quantity of an ocean platform ventilation system.
The main ventilation pipeline is provided with a total air valve and communicates with each cabin; each cabin is provided with a variable air volume box, and each box is provided with a cabin air valve. The total air valve controls the total ventilation quantity of the whole main ventilation pipeline, and each cabin air valve controls the ventilation quantity of its cabin. The opening of each cabin air valve can be controlled as required by the specific use of the cabin, so as to control the ventilation quantity of that cabin.
In this embodiment, the ocean platform comprises $N$ cabins, that is, $N$ variable air volume boxes, and further comprises the total air valve in one air handling unit. For these $N+1$ agents, the action, state and reward of each agent are defined, a suitable neural network is built, and a multi-agent deep reinforcement learning model is constructed. The ventilation target is achieved through the cooperation of the agents: given the outdoor temperature and the number of people in each cabin, the trained ocean platform ventilation system controls the ventilation quantity of each cabin so as to meet the different ventilation requirements of each cabin, prevent the accumulation and diffusion of toxic hazardous gases, and ensure the safety and comfort of staff on the ocean platform, while reducing energy use on the premise of ensuring safety and comfort.
In the ocean platform ventilation system, indoor thermal comfort is adjusted through the air valve angle of the variable air volume boxes and the air supply volume of the air handling unit. The indoor temperature and humidity of each cabin are selected to represent thermal comfort, and the indoor temperature $T_{i,t}$ and humidity $H_{i,t}$ of cabin $i$ at time $t$ need to be kept within a certain range:
$$T^{min}_i \le T_{i,t} \le T^{max}_i, \qquad H^{min}_i \le H_{i,t} \le H^{max}_i$$
where $T^{min}_i$ and $T^{max}_i$ are the minimum and maximum comfort temperatures of cabin $i$, and $H^{min}_i$ and $H^{max}_i$ are the minimum and maximum comfort humidities of cabin $i$.
According to the design method for offshore platform HVAC and cold storage systems (Q/HS 3008-2016) issued by China National Offshore Oil Corporation, cabins are generally divided into living areas, manned working areas and unmanned working areas. Living areas and manned working areas are governed mainly by the temperature and humidity requirements of workers, and their comfort is given priority; unmanned working areas are ventilated and cooled mainly to keep the equipment operating normally. For example, in living areas and manned working areas (laboratories and instrument rooms) the indoor temperature is 24-28 ℃ in summer and 18-22 ℃ in winter, and the indoor relative humidity is 30%-70%; in unmanned working areas (such as battery rooms, transformer rooms and generator rooms) there is no requirement on indoor humidity, and the indoor temperature is 40-45 ℃ in summer and -10-5 ℃ in winter.
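As an illustration, the ranges quoted above for living areas and manned working areas can be kept in a small lookup table and checked against sensor readings. The table layout and the function name are hypothetical; only the numeric ranges come from the paragraph above.

```python
# Comfort ranges for living areas and manned working areas, as quoted above.
COMFORT_RANGES = {
    "summer": {"temp_c": (24.0, 28.0), "rh_pct": (30.0, 70.0)},
    "winter": {"temp_c": (18.0, 22.0), "rh_pct": (30.0, 70.0)},
}


def in_comfort_range(season, temp_c, rh_pct):
    """True when both temperature and relative humidity are inside the band."""
    limits = COMFORT_RANGES[season]
    t_lo, t_hi = limits["temp_c"]
    h_lo, h_hi = limits["rh_pct"]
    return t_lo <= temp_c <= t_hi and h_lo <= rh_pct <= h_hi


summer_ok = in_comfort_range("summer", 26.0, 55.0)
winter_too_warm = in_comfort_range("winter", 26.0, 55.0)
```

The same reading can be comfortable in one season and out of range in another, which is why the comfort limits are stored per season rather than as a single band.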
In practice, the indoor temperature of each cabin of the ocean platform is affected by many factors, for example, heat transfer can be performed between adjacent cabins, heat transfer can be performed between the cabin and the outside of the cabin, heat can be generated by staff in the cabin, and the temperature in the cabin can be changed by ventilation system air outlet. In the present invention, the indoor temperature is obtained by an in-cabin temperature sensor, and the indoor humidity is obtained by an in-cabin humidity sensor.
When the ocean platform is used for exploiting petroleum or natural gas, dangerous gases such as methane, carbon monoxide, hydrogen sulfide and the like can be leaked, if the leaked dangerous gases are not discharged in time, aggregation and diffusion can occur in a cabin, the safety and the health of ocean platform staff are affected, and serious casualties and property loss can be caused. The invention selects the concentration of dangerous gases (methane, carbon monoxide and hydrogen sulfide) in each cabin in the ocean platform as an index to represent the air quality. In order to protect the safety and health of offshore platform personnel, the concentration of hazardous gases must be controlled within a safe range:
$$C^{CH_4}_{i,t} \le C^{CH_4,max}_i, \qquad C^{CO}_{i,t} \le C^{CO,max}_i, \qquad C^{H_2S}_{i,t} \le C^{H_2S,max}_i$$
wherein $C^{CH_4}_{i,t}$ is the methane gas concentration in cabin $i$ at time $t$, and $C^{CH_4,max}_i$ is the highest methane gas concentration within the safe range for cabin $i$; $C^{CO}_{i,t}$ is the carbon monoxide gas concentration in cabin $i$ at time $t$, and $C^{CO,max}_i$ is the highest carbon monoxide gas concentration within the safe range for cabin $i$; $C^{H_2S}_{i,t}$ is the hydrogen sulfide gas concentration in cabin $i$ at time $t$, and $C^{H_2S,max}_i$ is the highest hydrogen sulfide gas concentration within the safe range for cabin $i$.
According to the Standards for Indoor Air Quality (GB/T 18883-2022), the CO concentration in living areas should be lower than 10 mg/m³ (8 ppm); according to Occupational Exposure Limits for Hazardous Agents in the Workplace, Part 1: Chemical Hazardous Agents (GBZ 2.1-2019), the carbon monoxide concentration in working areas must not exceed 20 mg/m³ (16 ppm); according to regulations issued by the General Office of the Ministry of Emergency Management of the People's Republic of China, the lower explosive limit (LEL) of methane is 5% and the first-level alarm point of a methane gas alarm is 20% LEL, that is, the indoor methane concentration is kept within 1%; according to the safety regulations for hydrogen sulfide protection in shallow-sea petroleum operations (SY 6504-2010), the threshold concentration of hydrogen sulfide is 10 mg/m³ (about 6 ppm) and the safe hydrogen sulfide concentration in working areas is 20 mg/m³ (about 12 ppm).
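The limits quoted above can be collected in one table and used to flag which gases are out of range. This is only a sketch: the key names are hypothetical, and using the 10 mg/m³ hydrogen sulfide threshold as the living-area limit is an assumption, since the paragraph does not tie that value to a specific area.

```python
LEL_CH4_PCT = 5.0                      # lower explosive limit of methane, % by volume
CH4_ALARM_PCT = 0.20 * LEL_CH4_PCT     # first-level alarm at 20 % LEL, i.e. 1 %

GAS_LIMITS = {
    # H2S limit for living areas is an assumption (threshold value reused).
    "living": {"CH4_pct": CH4_ALARM_PCT, "CO_mg_m3": 10.0, "H2S_mg_m3": 10.0},
    "working": {"CH4_pct": CH4_ALARM_PCT, "CO_mg_m3": 20.0, "H2S_mg_m3": 20.0},
}


def gases_over_limit(area, readings):
    """Names of the gases whose reading exceeds the limit for this area type."""
    limits = GAS_LIMITS[area]
    return sorted(g for g, value in readings.items() if value > limits[g])


readings = {"CH4_pct": 0.4, "CO_mg_m3": 25.0, "H2S_mg_m3": 12.0}
working_alerts = gases_over_limit("working", readings)
living_alerts = gases_over_limit("living", readings)
```

The same readings trigger different alerts depending on the area type, which mirrors the separate living-area and working-area limits in the standards cited above.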
The invention adopts the multi-agent twin delayed deep deterministic policy gradient (Multi-agent Twin Delayed Deep Deterministic Policy Gradient, MATD3) reinforcement learning algorithm to train the agents. MATD3 is an improvement on the multi-agent deep deterministic policy gradient (Multi-agent Deep Deterministic Policy Gradient, MADDPG) reinforcement learning algorithm: it addresses MADDPG's tendency to overestimate values by introducing three techniques, namely clipped double Q-learning, delayed policy updates and target policy smoothing. Verification shows that, compared with the MADDPG algorithm used in earlier work, the MATD3 algorithm trains the policy networks and value networks better and thereby greatly improves the training effect.
For a deep reinforcement learning algorithm, the most important points are how to select or define the agents' states (observations and states in the multi-agent case), actions and rewards, and how to build a suitable neural network. These are described in detail below.
Specifically, the comprehensive optimization control method for the air quantity of the ocean platform ventilation system provided by the invention comprises the following steps of.
S1: defining a cabin intelligent agent and a blast gate intelligent agent.
Define the branch air valve of each cabin as cabin air valve agent $i$, and define the main air valve as the fan agent, where $i \in \{1, 2, \dots, N\}$ and $N$ is the number of cabins. During operation of the ocean platform, the air valve agent of each cabin adjusts its air output based on the current state so as to maintain indoor thermal comfort and reduce the hazardous gas concentration.
The observation $o_i(t)$ of cabin air valve agent $i$ is defined at the current time $t$.
Because the fan agent has a large influence on the hazardous gas concentration in every cabin of the ocean platform, the fan agent takes the hazardous gas concentrations of all cabins as part of its own observation; the observation of the fan agent is likewise defined at the current time $t$.
The quantities that make up these observations are: the outdoor temperature of the ocean platform at time $t$; the indoor temperature of cabin $i$ at time $t$; the indoor temperature at time $t$ of each cabin $j$ in the set $M$, where $M$ is the set of cabin areas of the ocean platform other than cabin $i$; the number of personnel in cabin $i$ at time $t$; the index of the current time interval within the day (for example, when the time interval $\tau$ is 15 minutes, a day contains 24 × 60 / 15 = 96 intervals); and the methane, carbon monoxide, and hydrogen sulfide concentrations in cabin $i$ at time $t$. In the above definitions, the cabin air valve agents and the fan agent may be collectively referred to as agents and indexed by $i$, with $i = 1, \dots, N$ for the cabin air valve agents and $i = N+1$ for the fan agent.
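As an illustration of how such an observation could be assembled, the following Python sketch builds a flat observation vector for one cabin air valve agent. The `CabinReading` type, the field order, and the function names are assumptions made for this example and are not specified by the source; only the list of quantities and the 15-minute example (96 intervals per day) come from the text above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CabinReading:
    """Sensor readings of one cabin at one time step (hypothetical container)."""
    temperature: float
    occupants: int
    ch4: float
    co: float
    h2s: float


def time_slot_index(minute_of_day: int, tau_minutes: int = 15) -> int:
    """Index of the current interval within the day (0..95 when tau is 15 minutes)."""
    return minute_of_day // tau_minutes


def cabin_observation(i: int, cabins: List[CabinReading],
                      outdoor_temp: float, minute_of_day: int) -> List[float]:
    """Assumed layout: outdoor temperature, own temperature, temperatures of
    the other cabins, occupancy, time slot index, then the three gas readings."""
    own = cabins[i]
    others = [c.temperature for j, c in enumerate(cabins) if j != i]
    return [outdoor_temp, own.temperature, *others,
            float(own.occupants), float(time_slot_index(minute_of_day)),
            own.ch4, own.co, own.h2s]
```

With this assumed layout, an installation with $N$ cabins gives each cabin agent an observation of length $N + 6$.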
For cabin air valve agent $i$, the action is the ventilation quantity of ocean platform cabin $i$ at time $t$; for the fan agent, the action is the angle of the main air valve in the air handling unit at time $t$.
The invention represents the actions of the $N+1$ agents by a set of action values, defining the actions of the cabin air valve agents and the fan agent as:
$$A(t) = \{a_1(t), \dots, a_N(t), a_{N+1}(t)\}$$
where $a_i(t)$ denotes the action of the $i$-th cabin air valve agent at time $t$, and $a_{N+1}(t)$ denotes the action of the fan agent at time $t$.
The states of the cabin air valve agents and the fan agent are defined as:
$$S(t) = \{s_1(t), \dots, s_N(t), s_{N+1}(t)\}$$
where $s_i(t)$ denotes the state of the $i$-th cabin air valve agent at time $t$, and $s_{N+1}(t)$ denotes the state of the fan agent at time $t$.
S2: a penalty for the cabin damper agent and a damper agent penalty are defined.
The design of the invention is initially hoped to make the trained intelligent air valve and the air blower simultaneously take into consideration three aspects of safety, comfort and energy conservation by means of a deep reinforcement learning method so as to realize comprehensive optimization control of the air quantity of the ocean platform ventilation system. Thus the invention winsMainly consists of three parts: firstly, punishment is carried out on dangerous gas concentration in each cabin area beyond a safety range; secondly, punishment of indoor thermal comfort exceeding a comfort area is carried out; thirdly, punishment of power consumption of the ocean platform ventilation system is achieved.
S21: defining a hazardous gas concentration penalty for the damper agent, comprising: methane gas punishment, carbon monoxide gas punishment, hydrogen sulfide gas punishment.
The concentration of the dangerous gas in the room is related to the ventilation quantity of the air valve in the cabin and the total air valve opening degree of the unit.
For cabin air valve agent $i$ ($i = 1, \dots, N$), separate penalties are defined for the methane, carbon monoxide, and hydrogen sulfide concentrations exceeding the safe range, and the hazardous gas over-range penalty of cabin air valve agent $i$ is set from these three penalties and the set weights.
Hazardous gas concentration penalties are likewise defined for the fan agent, comprising a methane gas penalty, a carbon monoxide gas penalty, and a hydrogen sulfide gas penalty for concentrations exceeding the safe range, and the hazardous gas over-range penalty of the fan agent is set from these three penalties and the set weights.
In these definitions, the weights are set values. The methane, carbon monoxide, and hydrogen sulfide penalties set for an air valve agent, and the corresponding penalties set for the fan agent, are determined from the methane, carbon monoxide, and hydrogen sulfide concentrations in the cabin at time $t$ and from the highest safe concentrations of methane, carbon monoxide, and hydrogen sulfide for that cabin.
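The penalty expressions themselves are not reproduced in this text. The following is a minimal Python sketch of one plausible form, assuming a hinge-style penalty that is zero inside the safe range and grows with the normalised excess above the highest safe concentration. The function names, the normalisation, and the weight values are illustrative assumptions, not the patented formulas.

```python
from typing import Dict

GASES = ("ch4", "co", "h2s")


def over_limit_penalty(concentration: float, max_safe: float) -> float:
    """Assumed form: 0 inside the safe range, negative normalised excess above it."""
    return -max(0.0, concentration - max_safe) / max_safe


def gas_penalty(concentrations: Dict[str, float],
                limits: Dict[str, float],
                weights: Dict[str, float]) -> float:
    """Weighted sum of the three per-gas penalties (the weights are hypothetical)."""
    return sum(weights[g] * over_limit_penalty(concentrations[g], limits[g])
               for g in GASES)
```

For a fan agent, the same per-cabin value could be accumulated over all cabins, which matches the statement that the fan agent observes the gas concentrations of every cabin; that aggregation is also an assumption.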
S22: Define the over-comfort penalty of the cabin air valve agents.
Indoor thermal comfort has two components, indoor temperature and indoor humidity, so the comfort penalty is considered in two parts: a temperature penalty and a humidity penalty.
A penalty is defined for a cabin air valve agent when the indoor temperature leaves the comfort temperature range, and a second penalty is defined when the indoor humidity leaves the comfort humidity range.
In these definitions, an occupancy indicator records only whether platform personnel are present in cabin $i$ at time $t$ and has no other physical meaning: it equals 1 if personnel are present and 0 if no one is present. The temperature penalty is determined from the indoor temperature of cabin $i$ at time $t$ and the maximum and minimum comfort temperatures of cabin $i$; dividing by the maximum and minimum comfort temperatures removes the influence of dimension. The humidity penalty is determined from the indoor humidity of cabin $i$ at time $t$ and the maximum and minimum comfort humidities of cabin $i$; dividing by the maximum and minimum comfort humidities likewise removes the influence of dimension.
Because the indoor thermal comfort of each cabin is not strongly correlated with the angle of the main air valve, in some embodiments of the invention the over-comfort penalty of the fan agent is defined as 0.
The penalty of cabin air valve agent $i$ ($i = 1, \dots, N$) for leaving the thermal comfort range is then the sum of two terms: the penalty for the air valve agent exceeding the comfort temperature range and the penalty for the air valve agent exceeding the comfort humidity range, with a weight that balances the temperature penalty against the humidity penalty.
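The comfort penalty described above can be sketched as follows. This is a hedged illustration only: the text states that the penalty is gated by occupancy, that deviations are divided by the comfort bounds to remove dimension, and that a weight balances temperature against humidity, but the exact expressions are not reproduced here. The piecewise form, the placement of the weight `beta`, and all names are assumptions.

```python
from typing import Tuple


def range_penalty(value: float, low: float, high: float) -> float:
    """Assumed form: 0 inside [low, high]; otherwise the negative deviation
    divided by the violated bound, which makes the penalty dimensionless."""
    if value > high:
        return -(value - high) / high
    if value < low:
        return -(low - value) / low
    return 0.0


def comfort_penalty(occupied: bool,
                    temperature: float, temp_range: Tuple[float, float],
                    humidity: float, humidity_range: Tuple[float, float],
                    beta: float) -> float:
    """Temperature penalty plus beta times humidity penalty, only when occupied."""
    if not occupied:
        return 0.0
    return (range_penalty(temperature, *temp_range)
            + beta * range_penalty(humidity, *humidity_range))
```

Gating on occupancy means an empty cabin contributes no comfort penalty, so the agent is free to reduce ventilation there for energy saving as long as the gas limits are respected.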
S23: a power consumption penalty is defined.
The energy consumption in the ventilation system mainly comprises two parts, namely cooling coil energy consumption and air supply machine energy consumption.
wherein :c p is the specific heat of air, which is used for the air,ηfor the inverse coefficient of performance of the cooling coil,d r is the fresh air ratio,m i z is a regionThe air quantity is controlled by the air quantity,P f,ref (k) For the purpose of reference to the energy consumption of the wind turbine,m ref for the total air volume provided by the reference fan,T i (k)、T o (k) Respectively iskThe time zone temperature and the outdoor temperature,T c a temperature is set for the cabin.
The power consumption calculation formula is:
thus, the invention defines the penalty to the power consumption of the ocean platform ventilation system as follows:
wherein :is->Time cooling coil energy consumption/>Is->Energy consumption of air supply machine at moment->Is the sampling interval;
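The power consumption formula is not reproduced in this text. The sketch below shows one common way to compute the two parts from the listed quantities, assuming a fresh-air mixing model for the cooling coil load and a cubic fan law relative to the reference fan. Both model choices, and every function name, are assumptions for illustration rather than the formula of the patent.

```python
from typing import Sequence


def cooling_coil_power(c_p: float, eta: float, d_r: float,
                       zone_flows: Sequence[float], zone_temps: Sequence[float],
                       t_out: float, t_c: float) -> float:
    """Assumed model: mixed air (fresh-air ratio d_r) is cooled down to t_c;
    eta is the reciprocal of the coil's coefficient of performance."""
    load = 0.0
    for flow, t_zone in zip(zone_flows, zone_temps):
        t_mix = d_r * t_out + (1.0 - d_r) * t_zone
        load += flow * max(0.0, t_mix - t_c)
    return eta * c_p * load


def fan_power(p_ref: float, m_ref: float, zone_flows: Sequence[float]) -> float:
    """Assumed cubic fan law: power scales with (total flow / reference flow)^3."""
    return p_ref * (sum(zone_flows) / m_ref) ** 3


def power_penalty(p_coil: float, p_fan: float, dt: float) -> float:
    """Negative energy used over one sampling interval dt."""
    return -(p_coil + p_fan) * dt
```

Under the cubic law, halving the total air volume cuts the fan power to one eighth, which is why even modest ventilation reductions in unoccupied cabins matter for the energy term.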
The total reward function of the fan agent and the cabin air valve agents is then defined; it combines the penalties defined above using two reward weights.
After the agents select and execute their actions at time $t$, the environment moves to its state at time $t+1$ and returns the corresponding reward; each agent then adjusts its own policy according to the reward obtained so as to maximize the total return.
The parameters required for the above calculations are acquired as follows. For each cabin $i$ at time $t$, the indoor temperature and the outdoor temperature are acquired by temperature sensors; the humidity in the cabin is acquired by a humidity sensor; the number of personnel in the cabin is obtained by an electronic counting sensor; and the hazardous gas concentrations in the cabin (methane, carbon monoxide, and hydrogen sulfide) are obtained by dedicated toxic and harmful gas sensors. The power consumption can be obtained by calculation or from a power meter.
S3: Train the air quantity control neural networks based on the penalties of the cabin air valve agents, the penalty of the fan agent, and the total reward function of the fan agent and the cabin air valve agents. The trained policy networks are then used to make the control decisions for the cabin air valves and the fan.
The MATD3 algorithm adopted by the invention is a multi-agent reinforcement learning algorithm with centralized training and distributed decision-making; training the value networks and the policy networks requires the state of the whole environment rather than the observation of a single agent. Each agent has two policy networks (one evaluation network and one target network) for outputting actions and four value networks (two evaluation networks and two target networks) for real-time evaluation.
The neural network structure corresponding to a single agent $i$ is shown in fig. 1 (the target networks are not shown).
All policy networks have the same structure, comprising an input layer, several hidden layers, and an output layer, and each layer consists of a linear function and an activation function. For the input layer and the hidden layers, the invention selects the Leaky ReLU function as the activation function (it avoids the "dead neuron" problem of the ReLU function); for the output layer, the invention selects the softmax function as the activation function. In the invention, the dimension of the policy network input layer is the dimension of the agent's observation, the dimension of the output layer is the dimension of the agent's action, and the remaining dimensions are chosen by the user.
All value networks also have the same structure, comprising an input layer, an output layer, and several hidden layers, and each layer consists of a linear function and an activation function. For the input layer and the hidden layers, the invention selects the Leaky ReLU function as the activation function; for the output layer, the invention selects the softmax function as the activation function. In the invention, the dimension of the value network input layer is the sum of the dimension of the environment state and the dimension of the agents' actions, and the dimension of the output layer is 1, representing the value of executing the given action in the given state; the remaining dimensions are chosen by the user.
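The policy network structure described above (linear layers, Leaky ReLU on the input and hidden layers, softmax on the output layer) can be sketched in plain Python as follows. The layer sizes, the weight initialisation, the Leaky ReLU slope of 0.01, and all names are assumptions chosen for the example; a real implementation would normally use a deep learning framework.

```python
import math
import random
from typing import List, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weights, biases)


def leaky_relu(x: float, slope: float = 0.01) -> float:
    """Leaky ReLU keeps a small gradient for negative inputs."""
    return x if x > 0 else slope * x


def softmax(values: List[float]) -> List[float]:
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def linear(inputs: List[float], layer: Layer) -> List[float]:
    weights, biases = layer
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in: int, n_out: int, rng: random.Random) -> Layer:
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def policy_forward(observation: List[float], layers: List[Layer]) -> List[float]:
    """Leaky ReLU after every layer except the last, softmax on the output."""
    hidden = observation
    for layer in layers[:-1]:
        hidden = [leaky_relu(v) for v in linear(hidden, layer)]
    return softmax(linear(hidden, layers[-1]))


rng = random.Random(0)
obs_dim, hidden_dim, action_dim = 9, 16, 2  # hypothetical sizes
layers = [init_layer(obs_dim, hidden_dim, rng),
          init_layer(hidden_dim, hidden_dim, rng),
          init_layer(hidden_dim, action_dim, rng)]
action = policy_forward([0.1] * obs_dim, layers)
```

Because of the softmax output, the components of `action` are positive and sum to 1, so they would need to be scaled to the physical range of the valve in use.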
MATD3 is an off-policy algorithm, so the policy networks and the value networks can be trained with experience replay.
Let the current parameters of the evaluation networks of the $i$-th agent ($i = 1, \dots, N+1$, where $N+1$ is the total number of agents), namely its policy network and its two value networks, be:
$\theta_i$, $w_{i,1}$, $w_{i,2}$
and let the parameters of the corresponding target networks be:
$\theta_i^-$, $w_{i,1}^-$, $w_{i,2}^-$
Each time, the central controller randomly draws a transition quadruple from the experience replay array, and all policy networks and all value networks are then updated as follows.
In some embodiments of the present invention, the neural network training process, referring to fig. 2, includes the following steps.
S31: Initialize the experience data pool and the neural network parameters; define the number of rounds $T$ and the number of steps per round;
S32: Reset the parameters and environment of all cabin air valve agents and the fan agent to obtain the initial observations of the fan agent and of each cabin air valve agent;
S33: Based on the policy function, obtain the action output of each cabin air valve agent and the fan agent as the output of the policy of agent $i$ for its observation $o_i$ plus exploration noise, where the noise is normally distributed with mean 0; for the cabin air valve agents $i = 1, \dots, N$, where $N$ is the number of cabins, and for the fan agent $i = N+1$;
S34: Execute the actions to obtain the rewards of the cabin air valve agents and the fan agent at the current time and the observations $o'$ of the cabin air valve agents and the fan agent at the next time, and transmit the current observation $o$, the action, the reward, and $o'$ to the central processing unit, where $o$ is the observation at the current time;
S35: Store the transition in the experience pool $D$;
S36: For each cabin air valve agent and the fan agent, randomly sample a transition dataset of size $b$ from $D$, where $o'$ denotes the observation at the next time;
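Steps S35 and S36 amount to an experience replay buffer. A minimal sketch, assuming a fixed-capacity pool that discards the oldest transitions and uniform random sampling; the class name, the capacity handling, and the seed argument are assumptions and are not described in the source.

```python
import random
from collections import deque
from typing import Any, List, Optional, Tuple

Transition = Tuple[Any, Any, Any, Any]  # (o, a, r, o_next)


class ReplayBuffer:
    """Experience pool D holding (o, a, r, o') transitions."""

    def __init__(self, capacity: int, seed: Optional[int] = None) -> None:
        self._storage: deque = deque(maxlen=capacity)  # oldest entries drop out
        self._rng = random.Random(seed)

    def store(self, o: Any, a: Any, r: Any, o_next: Any) -> None:
        self._storage.append((o, a, r, o_next))

    def sample(self, b: int) -> List[Transition]:
        """Draw up to b transitions uniformly at random without replacement."""
        return self._rng.sample(list(self._storage), min(b, len(self._storage)))

    def __len__(self) -> int:
        return len(self._storage)
```

Sampling uniformly from past transitions breaks the correlation between consecutive time steps, which is what allows an off-policy algorithm to reuse old experience.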
s37: all ofN+1The predictions are made by the individual target policy networks,; wherein ,/>Representation->Time->Action pre-measurement of individual agent, +.>Representation->Time->State observance of individual agents, +.>Representing intelligent agent->Parameters before policy network update; wherein, for the cabin air valve intelligent agent,,/>for the number of cabins, for the fan agent +.>
S38: all 2%N+1) The individual target value networks make predictions:
taking the small of the two:
calculating a TD target:
wherein ,representing intelligent agent->1 st target value network pair of (2)tThe value predictions made at time +1,representing intelligent agent->Is the 2 nd target value network pairtValue predictions made at time +1, +.>Representation->Time of dayStatus of individual agent->Representation->Personal agent pairtPredicted amount of motion at +1, +.>Representing intelligent agent->Before updating the 1 st target value network, of->Representing intelligent agent->Before updating the 2 nd target value network of (a), a +.>Representing intelligent agent->The lesser of the predicted values of the two target value networks,/->Representation oftTime intelligent bodyTD target of->Intelligent body->At the position oftAwards obtained at time +1; wherein, for cabin air valve intelligent body, +.>,/>For the number of cabins, for the fan agent +.>
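The clipped double Q-learning step described in S38 can be sketched as below. The discount factor `gamma` and the sign convention of the TD error (prediction minus target) follow the usual TD3 formulation and are assumptions here, since the corresponding formulas are not reproduced in this text.

```python
from typing import Tuple


def td_target(reward: float, q1_next: float, q2_next: float,
              gamma: float = 0.99) -> float:
    """Reward plus the discounted smaller of the two target-network predictions.
    Taking the minimum counteracts value overestimation."""
    return reward + gamma * min(q1_next, q2_next)


def td_errors(q1: float, q2: float, target: float) -> Tuple[float, float]:
    """TD error of each evaluation value network against the shared target."""
    return q1 - target, q2 - target
```

For example, with a reward of 1.0, target predictions of 2.0 and 3.0, and `gamma=0.5`, the target is built from the smaller prediction 2.0 rather than from 3.0.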
S39: All $2(N+1)$ value networks make predictions, and the TD errors are calculated. The 1st and 2nd value networks of agent $i$ each make a value prediction for time $t$ from the state of the cabin air valve agents and the fan agent at time $t$ and the actions of the cabin air valve agents and the fan agent at time $t$, using the parameters of the respective value network before the update; the TD error of each value network of agent $i$ is the difference between its prediction and the TD target; for the cabin air valve agents $i = 1, \dots, N$, where $N$ is the number of cabins, and for the fan agent $i = N+1$;
S40: Using the quadruples randomly drawn from the experience replay array, update the value networks of all $N+1$ agents. The updated parameters of the 1st and 2nd value networks of agent $i$ are obtained from the parameters before the update, the learning rate, the TD error of the respective value network, and the gradient of that value network with respect to its parameters; for the cabin air valve agents $i = 1, \dots, N$, where $N$ is the number of cabins, and for the fan agent $i = N+1$;
S41: Every $k$ rounds, update the $N+1$ policy networks and the $3(N+1)$ target networks: let all $N+1$ policy networks make predictions; collect the predictions into a joint action; update all $N+1$ policy networks; and update all $3(N+1)$ target networks. These updates use the parameters of the policy network of agent $i$ before the update and the update rate; for the cabin air valve agents $i = 1, \dots, N$, where $N$ is the number of cabins, and for the fan agent $i = N+1$;
S42: Repeat S31 to S41 until the maximum number of rounds $T$ is reached.
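The delayed update in S41 can be illustrated with the following sketch. It assumes the usual soft (Polyak) update, in which each target parameter moves a fraction `rate` of the way toward its evaluation counterpart, and a simple modulo test for the update interval $k$; the source names only an update rate and an interval, so the exact rule and the function names are assumptions.

```python
from typing import List


def soft_update(target_params: List[float], eval_params: List[float],
                rate: float) -> List[float]:
    """New target parameters: rate * evaluation + (1 - rate) * old target."""
    return [rate * e + (1.0 - rate) * t
            for t, e in zip(target_params, eval_params)]


def should_update_policy(round_index: int, k: int) -> bool:
    """Delayed policy update: act only on every k-th round."""
    return round_index % k == 0
```

Updating the policy and target networks less often than the value networks gives the value estimates time to settle before the policy is adjusted against them.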
The MATD3 algorithm adopted by the invention follows a "centralized training + distributed decision" architecture: after training is completed, the value networks are no longer needed, and only the policy networks are used for control decisions. As shown in fig. 3, each policy network is deployed to its corresponding agent, and the $i$-th agent makes decisions independently and locally based on its local observation $o_i$.
The method provided by the invention can overcome the above problems. It controls the temperature, the humidity, and the hazardous gas concentration in the ocean platform cabins, ensuring the safety and comfort of ocean platform personnel. It controls several cabins at the same time and meets the ventilation requirements of different cabins. It does not require a separate thermodynamic model for each specific cabin, which avoids the control inaccuracy caused by an inaccurate model and gives the method strong generality. It does not require manual parameter tuning, which avoids the poor performance caused by inaccurate tuning by personnel. A ventilation system trained with the method can quickly adjust the ventilation quantity from any given initial value, bring the control indices into a reasonable range, meet the fresh air requirements of personnel, and eliminate the safety hazard caused by hazardous gas. At the same time, the method takes energy consumption into account on the premise of ensuring safety and comfort, and thus contributes in practice to the dual-carbon goals.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.

Claims (7)

1. The comprehensive optimization control method for the air quantity of the ocean platform ventilation system is characterized in that the ocean platform comprises a main ventilation pipeline and a plurality of cabins, a main air valve is arranged on the main ventilation pipeline, and the main ventilation pipeline is communicated with each cabin; each cabin is provided with a variable air box, and each variable air box is provided with a cabin air dividing valve;
the control method comprises the following steps:
S1: define the cabin air valve agents and the fan agent;
define each cabin air dividing valve as a cabin air valve agent, and define the main air valve as the fan agent;
define the observation $o_i(t)$ of cabin air valve agent $i$ at the current time $t$, where $i \in \{1, 2, \dots, N\}$ and $N$ is the number of cabins;
define the observation of the fan agent at the current time $t$;
wherein the quantities that make up the observations are: the outdoor temperature of the ocean platform at time $t$; the indoor temperature of cabin $i$ at time $t$; the indoor temperature at time $t$ of each cabin $j$ in the set $M$, where $M$ is the set of cabin areas of the ocean platform other than cabin $i$; the number of personnel in cabin $i$ at time $t$; the index of the time interval within the day; and the methane, carbon monoxide, and hydrogen sulfide concentrations in cabin $i$ at time $t$;
define the actions of the cabin air valve agents and the fan agent:
$$A(t) = \{a_1(t), \dots, a_N(t), a_{N+1}(t)\}$$
wherein $a_i(t)$ denotes the action of the $i$-th cabin air valve agent at time $t$, $i = 1, \dots, N$, and $a_{N+1}(t)$ denotes the action of the fan agent at time $t$;
define the states of the cabin air valve agents and the fan agent:
$$S(t) = \{s_1(t), \dots, s_N(t), s_{N+1}(t)\}$$
wherein $s_i(t)$ denotes the state of the $i$-th cabin air valve agent at time $t$, $i = 1, \dots, N$, and $s_{N+1}(t)$ denotes the state of the fan agent at time $t$;
S2: define the penalties for the cabin air valve agents and the penalty for the fan agent;
define the hazardous gas concentration penalties of the cabin air valve agents, comprising: a methane gas penalty, a carbon monoxide gas penalty, and a hydrogen sulfide gas penalty;
set the hazardous gas over-range penalty of cabin air valve agent $i$ from the methane gas penalty, the carbon monoxide gas penalty, and the hydrogen sulfide gas penalty set for the cabin air valve agent, using set weights;
define the hazardous gas concentration penalties of the fan agent, comprising: a methane gas penalty, a carbon monoxide gas penalty, and a hydrogen sulfide gas penalty;
set the hazardous gas over-range penalty of the fan agent from the methane gas penalty, the carbon monoxide gas penalty, and the hydrogen sulfide gas penalty set for the fan agent, using set weights;
define the over-comfort penalty of the cabin air valve agents as the sum of two terms: the penalty for the cabin air valve agent exceeding the comfort temperature range and the penalty for the cabin air valve agent exceeding the comfort humidity range, with a weight that balances the penalty for exceeding the temperature comfort range against the penalty for exceeding the humidity comfort range;
define the power consumption penalty from the cooling coil energy consumption at time $t$, the supply fan energy consumption at time $t$, and the sampling interval;
define the total reward function of the fan agent and the cabin air valve agents, using two reward weights;
S3: based on the penalties of the cabin air valve agents, the penalty of the fan agent, and the total reward function of the fan agent and the cabin air valve agents, train the air quantity control neural networks, wherein each cabin air valve agent and the fan agent correspond to a policy network and a value network, and the trained policy networks are used to make the control decisions of each cabin air valve agent and the fan agent.
2. The method for comprehensively optimizing and controlling the air quantity of the ocean platform ventilation system according to claim 1, wherein the step of training the neural network comprises the following steps:
S31: initialize the experience data pool and the neural network parameters; define the number of rounds $T$ and the number of steps per round;
S32: reset the parameters and environment of all cabin air valve agents and the fan agent to obtain the initial observations of the fan agent and of each cabin air valve agent;
S33: based on the policy function, obtain the action output of each cabin air valve agent and the fan agent as the output of the policy of agent $i$ for its observation $o_i$ plus exploration noise, wherein the noise is normally distributed with mean 0; for the cabin air valve agents $i = 1, \dots, N$, where $N$ is the number of cabins, and for the fan agent $i = N+1$;
S34: execute the actions to obtain the rewards of the cabin air valve agents and the fan agent at the current time and the observations $o'$ of the cabin air valve agents and the fan agent at the next time, and transmit the current observation $o$, the action, the reward, and $o'$ to the central processing unit, wherein $o$ is the observation at the current time;
S35: store the transition in the experience pool $D$, and update the state;
S36: for each cabin air valve agent and the fan agent, randomly sample a transition dataset of size $b$ from $D$, wherein $o'$ denotes the observation at the next time;
S37: all $N+1$ target policy networks make predictions: from the observation of the $i$-th agent at time $t+1$, and using the policy network parameters of agent $i$ before the update, each target policy network outputs the predicted action of the $i$-th agent at time $t+1$; for the cabin air valve agents $i = 1, \dots, N$, where $N$ is the number of cabins, and for the fan agent $i = N+1$;
S38: all $2(N+1)$ target value networks make predictions, the smaller of the two predictions of each agent is taken, and the TD target is calculated; wherein the 1st and 2nd target value networks of agent $i$ each make a value prediction for time $t+1$ from the state of the $N+1$ agents at time $t+1$ and the actions predicted for the $N+1$ agents at time $t+1$, using the parameters of the respective target value network before the update; the smaller of the two predicted values of agent $i$ is retained; and the TD target of agent $i$ at time $t$ is calculated from this retained value and the reward obtained by agent $i$; for the cabin air valve agents $i = 1, \dots, N$, where $N$ is the number of cabins, and for the fan agent $i = N+1$;
S39: all $2(N+1)$ value networks make predictions, and the TD errors are calculated; wherein the 1st and 2nd value networks of agent $i$ each make a value prediction for time $t$ from the state of the cabin air valve agents and the fan agent at time $t$ and the actions of the cabin air valve agents and the fan agent at time $t$, using the parameters of the respective value network before the update; and the TD error of each value network of agent $i$ is the difference between its prediction and the TD target; for the cabin air valve agents $i = 1, \dots, N$, where $N$ is the number of cabins, and for the fan agent $i = N+1$;
S40: using the quadruples randomly drawn from the experience replay array, update the value networks of all $N+1$ agents; wherein the updated parameters of the 1st and 2nd value networks of agent $i$ are obtained from the parameters before the update, the learning rate, the TD error of the respective value network, and the gradient of that value network with respect to its parameters; for the cabin air valve agents $i = 1, \dots, N$, where $N$ is the number of cabins, and for the fan agent $i = N+1$;
S41: every $k$ rounds, update the $N+1$ policy networks and the $3(N+1)$ target networks: let all $N+1$ policy networks make predictions; collect the predictions into a joint action; update all $N+1$ policy networks; and update all $3(N+1)$ target networks; wherein the updates use the parameters of the policy network of agent $i$ before the update and the update rate; for the cabin air valve agents $i = 1, \dots, N$, where $N$ is the number of cabins, and for the fan agent $i = N+1$;
S42: repeat S31 to S41 until the maximum number of rounds $T$ is reached.
3. The method for comprehensively optimizing and controlling the air quantity of the ocean platform ventilation system according to claim 1, which is characterized in that:
a methane gas penalty, a carbon monoxide gas penalty, and a hydrogen sulfide gas penalty are each set for the cabin air valve agent;
wherein the penalties are determined from the methane, carbon monoxide, and hydrogen sulfide concentrations in cabin $i$ at time $t$ and from the highest safe concentrations of methane, carbon monoxide, and hydrogen sulfide for cabin $i$.
4. The method for comprehensively optimizing and controlling the air quantity of the ocean platform ventilation system according to claim 1, which is characterized in that:
a methane gas penalty, a carbon monoxide gas penalty, and a hydrogen sulfide gas penalty are each set for the fan agent;
wherein the penalties are determined from the methane, carbon monoxide, and hydrogen sulfide concentrations in each cabin at time $t$ and from the highest safe concentrations of methane, carbon monoxide, and hydrogen sulfide for that cabin.
5. The method for comprehensively optimizing and controlling the air quantity of the ocean platform ventilation system according to claim 1, which is characterized in that:
a penalty is defined for the cabin air valve agent exceeding the comfort temperature range, and a penalty is defined for the cabin air valve agent exceeding the comfort humidity range;
wherein an occupancy indicator records only whether platform personnel are present in cabin $i$ at time $t$ and has no other physical meaning, being equal to 1 if personnel are present and 0 if no one is present; the temperature penalty is determined from the indoor temperature of cabin $i$ at time $t$ and the maximum and minimum comfort temperatures of cabin $i$; and the humidity penalty is determined from the indoor humidity of cabin $i$ at time $t$ and the maximum and minimum comfort humidities of cabin $i$.
6. The method for comprehensively optimizing and controlling the air quantity of the ocean platform ventilation system according to claim 1, which is characterized in that: the over-comfort penalty of the fan agent is defined as 0.
7. The method for comprehensively optimizing and controlling the air quantity of the ocean platform ventilation system according to claim 1, which is characterized in that:
wherein: $c_p$ is the specific heat of air, $\eta$ is the reciprocal of the coefficient of performance of the cooling coil, $d_r$ is the fresh air ratio, $m_i^z$ is the zone air quantity, $P_{f,ref}(k)$ is the energy consumption of the reference fan, $m_{ref}$ is the total air volume supplied by the reference fan, $T_i(k)$ and $T_o(k)$ are respectively the cabin temperature and the outdoor temperature at time $k$, and $T_c$ is the set temperature of the cabin.
CN202310868753.3A 2023-07-17 2023-07-17 Comprehensive optimization control method for air quantity of ocean platform ventilation system Active CN116610037B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310868753.3A CN116610037B (en) 2023-07-17 2023-07-17 Comprehensive optimization control method for air quantity of ocean platform ventilation system

Publications (2)

Publication Number Publication Date
CN116610037A true CN116610037A (en) 2023-08-18
CN116610037B CN116610037B (en) 2023-09-29

Family

ID=87678544

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310868753.3A Active CN116610037B (en) 2023-07-17 2023-07-17 Comprehensive optimization control method for air quantity of ocean platform ventilation system

Country Status (1)

Country Link
CN (1) CN116610037B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109344490A (en) * 2018-09-27 2019-02-15 中国石油大学(华东) A kind of ocean platform based on BRANN model fires risk analysis method
CN113449458A (en) * 2021-07-15 2021-09-28 海南大学 Multi-agent depth certainty strategy gradient method based on course learning
CN114216256A (en) * 2021-12-22 2022-03-22 中国海洋大学 Ventilation system air volume control method of off-line pre-training-on-line learning
CN114484822A (en) * 2022-02-10 2022-05-13 中国海洋大学 Ocean platform ventilation system control method based on temperature and hydrogen sulfide concentration control
WO2023019536A1 (en) * 2021-08-20 2023-02-23 上海电气电站设备有限公司 Deep reinforcement learning-based photovoltaic module intelligent sun tracking method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109344490A (en) * 2018-09-27 2019-02-15 China University of Petroleum (East China) Ocean platform fire risk analysis method based on a BRANN model
CN113449458A (en) * 2021-07-15 2021-09-28 Hainan University Multi-agent deep deterministic policy gradient method based on curriculum learning
WO2023019536A1 (en) * 2021-08-20 2023-02-23 Shanghai Electric Power Generation Equipment Co., Ltd. Deep reinforcement learning-based photovoltaic module intelligent sun tracking method
CN114216256A (en) * 2021-12-22 2022-03-22 Ocean University of China Ventilation system air volume control method based on offline pre-training and online learning
CN114484822A (en) * 2022-02-10 2022-05-13 Ocean University of China Ocean platform ventilation system control method based on temperature and hydrogen sulfide concentration control

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
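Liu Shengxiang; Lin Qunxu; Yang Zhicai; Wu Yueyu; Zhai Yujiang: "Research on balance control of a two-wheeled robot based on the deep deterministic policy gradient algorithm", Mechanical Engineer (机械工程师), no. 03 *
Li Chunxiao et al.: "Research on an air volume control method for multi-zone ventilation systems based on deep reinforcement learning", Control Engineering of China (《控制工程》) *
Xu Nuo; Yang Zhenwei: "Multi-agent cooperation based on the MADDPG algorithm under sparse rewards", Modern Computer (现代计算机), no. 15 *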

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117450637A (en) * 2023-12-25 2024-01-26 Ocean University of China Layered optimization control method for ocean platform ventilation system
CN117450637B (en) * 2023-12-25 2024-03-19 Ocean University of China Layered optimization control method for ocean platform ventilation system

Also Published As

Publication number Publication date
CN116610037B (en) 2023-09-29

Similar Documents

Publication Publication Date Title
CN116610037B (en) Comprehensive optimization control method for air quantity of ocean platform ventilation system
CN114484822B (en) Ocean platform ventilation system control method based on temperature and hydrogen sulfide concentration control
Zhang et al. Control of a novel synthetical index for the local indoor air quality by the artificial neural network and genetic algorithm
Xie et al. A prediction model of ammonia emission from a fattening pig room based on the indoor concentration using adaptive neuro fuzzy inference system
Yeo et al. Computational fluid dynamics evaluation of pig house ventilation systems for improving the internal rearing environment
Yan et al. Quantifying uncertainty in outdoor air flow control and its impacts on building performance simulation and fault detection
CN111829003A (en) Power plant combustion control system and control method
CN108760592A (en) Online measurement method for unburned carbon in flue dust based on a BP neural network
Michaelides et al. Contaminant event monitoring in multi-zone buildings using the state-space method
Babadi et al. CFD modeling of air flow, humidity, CO2 and NH3 distributions in a caged laying hen house with tunnel ventilation system
Pichler et al. Simulation-assisted building energy performance improvement using sensible control decisions
Chao et al. Fuzzy logic controller design for staged heating and ventilating systems
Liu et al. Building information modelling-enabled multi-objective optimization for energy consumption parametric analysis in green buildings design using hybrid machine learning algorithms
CN106707999A (en) Building energy-saving system based on self-adaptive controller, control method and simulation
Wang et al. Evaluation of the Alberta air infiltration model using measurements and inter-model comparisons
Nassif Modeling and Optimization of HVAC Systems Using Artificial Intelligence Approaches.
Song Intelligent PID controller based on fuzzy logic control and neural network technology for indoor environment quality improvement
CN117022633B (en) Ventilation control method of prefabricated cabin ventilation system for ship or ocean platform
Tennakoon et al. A fuzzy inference system prototype for indoor air and temperature quality monitoring and hazard detection
Javed et al. Modelling and optimization of residential heating system using random neural networks
Zhong et al. Airflow optimizing control research based on genetic algorithm during mine fire period
Li et al. Comparative studies and optimizations of air distribution of underground building ventilation systems based on response surface methodology: A case study
Wu et al. Large-scale experimental investigation of effect of mechanical ventilation on smoke temperature in ship engine room
Jiao et al. Development of field-zone-net model for fire smoke propagation simulation in ships
Nkeshita et al. Prediction of indoor Total Volatile Organic Compound in a university hostel using a Neural Network Model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant