CN111637614A - Intelligent control method for active ventilation floor in data center - Google Patents

Publication number
CN111637614A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010455152.6A
Other languages
Chinese (zh)
Other versions
CN111637614B (en
Inventor
万剑雄
周杰
熊伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inner Mongolia University of Technology
Original Assignee
Inner Mongolia University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inner Mongolia University of Technology
Priority to CN202010455152.6A
Publication of CN111637614A
Application granted
Publication of CN111637614B
Legal status: Active
Anticipated expiration

Classifications

    • F: MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
    • F24: HEATING; RANGES; VENTILATING
    • F24F: AIR-CONDITIONING; AIR-HUMIDIFICATION; VENTILATION; USE OF AIR CURRENTS FOR SCREENING
    • F24F11/00: Control or safety arrangements
    • F24F11/89: Arrangement or mounting of control or safety devices
    • F: MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
    • F24: HEATING; RANGES; VENTILATING
    • F24F: AIR-CONDITIONING; AIR-HUMIDIFICATION; VENTILATION; USE OF AIR CURRENTS FOR SCREENING
    • F24F11/00: Control or safety arrangements
    • F24F11/62: Control or safety arrangements characterised by the type of control or by internal processing, e.g. using fuzzy logic, adaptive control or estimation of values
    • F24F11/63: Electronic processing
    • F24F11/64: Electronic processing using pre-stored data

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Chemical & Material Sciences (AREA)
  • Combustion & Propulsion (AREA)
  • Mechanical Engineering (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Air Conditioning Control Device (AREA)

Abstract

An intelligent control method for an active ventilation floor in a data center. A Markov decision process model is established for the data center rack hotspot problem, and three algorithms for solving the model are provided: a basic intelligent algorithm, a sample-value variant intelligent algorithm and a structural variant intelligent algorithm, each of which can serve as the core of the active ventilation floor control algorithm. The model consists of four parts: system state, action, reward and value function. Its solution is to keep selecting the optimal action over a sequence of system states so that the cumulative reward of the system is maximized. By continually exploring and learning the complex relationship between the rack air-inlet temperature distribution and the fan speed of the active ventilation floor, the control algorithm can ultimately generate the optimal PWM duty-cycle value from the rack air-inlet temperature distribution and adjust the fan speed, so that the inlet temperature distribution becomes uniform and rack hotspots are mitigated. Compared with other schemes, the invention is more generally applicable, easier to deploy and more cost-effective.

Figure 202010455152

Description

Intelligent control method for an active ventilation floor in a data center

Technical field

The invention belongs to the technical field of automatic control, and in particular relates to an intelligent control method for an active ventilation floor in a data center.

Background

A rack hotspot is a point, at one or several locations on a rack in a data center machine room, where the temperature is significantly higher than at other locations. Excessive temperature reduces the working efficiency of some servers in the data center, which lowers overall power density and also reduces reliability. This is clearly contrary to the needs of a data center.

Mitigating or eliminating rack hotspots through global regulation, for example by raising the power of the room air conditioner to supply enough cold air, inevitably leaves most rack areas overcooled. This wastes cooling resources and further inflates cooling energy consumption, which already accounts for nearly half of a data center's total energy consumption. Rack-level cooling schemes are therefore better suited to mitigating rack hotspots.

Rack-level cooling schemes already exist, such as installing adaptive ventilation floors, installing baffles, or enclosing individual racks and fitting them with ventilation ducts. All of these are "passive" schemes: they cannot actively deliver cold airflow to the rack, so they are powerless when the cold air supply is insufficient.

The active ventilation floor is another rack-level cooling scheme. It mitigates rack hotspots by actively delivering cold air, and it is easier to deploy and more cost-effective than the schemes above. The main difficulty in controlling it lies in the diversity and dynamics of its placement environment: the relative positions of the room air conditioner and the racks and the distribution of servers inside a rack differ; the containment of cold and hot aisles differs, as do server rack standards and sealing; the power of the room air conditioner and the heat load of the servers in different racks differ; and so on. As a result, the thermal behavior and airflow of a data center are generally difficult to describe with an analytical model.

Most existing research on active ventilation floors concerns performance modeling and evaluation based on measurement or simulation. There is as yet no research literature on the control problem of active ventilation floors.

Summary of the invention

To overcome the above shortcomings of the prior art, the object of the invention is to provide an intelligent control method for an active ventilation floor in a data center that, without raising the power of the room air conditioner, automatically learns the optimal operating policy and plans rack airflow so that the temperature distribution at the rack air inlet becomes uniform and rack hotspots are mitigated. No complex airflow and heat exchange model has to be built and calibrated, which improves the general applicability of the active ventilation floor.

To achieve this object, the invention adopts the following technical solution:

An intelligent control method for an active ventilation floor in a data center establishes a Markov decision process model for the data center rack hotspot problem and provides three algorithms for solving the model: a basic intelligent algorithm, a sample-value variant intelligent algorithm and a structural variant intelligent algorithm, each of which can serve as the core of the active ventilation floor control algorithm. The model consists of four parts: system state, action, reward and value function. The solution of the model is to keep selecting the optimal action over a sequence of system states so that the cumulative reward of the system is maximized. By continually exploring and learning the complex relationship between the rack air-inlet temperature distribution and the fan speed of the active ventilation floor, the control algorithm can ultimately generate the optimal PWM duty-cycle value from the rack air-inlet temperature distribution and adjust the fan speed, so that the inlet temperature distribution becomes uniform and rack hotspots are mitigated.

Compared with the prior art, the beneficial effects of the invention are:

The invention does not require building and calibrating a complex airflow and heat exchange model. Using an intelligent control algorithm, it copes with the diversity and dynamics of the environment in which the active ventilation floor is placed and automatically matches the rack air-inlet temperature distribution to the optimal fan speed. It is only necessary to replace the original ordinary ventilation floor with an active ventilation floor running the invention; the invention then operates autonomously, improves the rack air-inlet temperature distribution and mitigates rack hotspots. Compared with other schemes, the invention is more generally applicable, easier to deploy and more cost-effective.

Description of the drawings

Figure 1 shows the design and deployment of the active ventilation floor.

Detailed description

Embodiments of the invention are described in detail below with reference to the drawing and examples.

Figure 1 is a schematic diagram of the detailed deployment of the invention. A number of first temperature sensors 1 are evenly distributed over the air inlet of rack 2 to monitor the temperature distribution at the air inlet of rack 2, and a second temperature sensor is installed under the active ventilation floor to monitor the supply-air temperature under the floor.

In this field, a rack 2 is a cuboid iron enclosure holding a number of servers, and many racks are arranged in rows. Within a row, the left and right panels of a rack usually sit flush against the neighboring racks. The front panel is the air inlet, through which cold air is drawn in to cool the servers, and the rear panel is the air outlet, through which the heated air is exhausted. Monitoring the rack air-inlet temperature distribution means monitoring the temperature at certain positions on the front panel. The temperatures at these positions make up the distribution, so the number of first temperature sensors 1 depends on the number of these positions.

The intelligent control method runs on a PC. PC 6 is connected to microcontroller 3, microcontroller 3 is connected to driver board 4, and driver board 4, once connected to switching power supply 5 (12 V, 20 A), is connected to active ventilation floor fan 7. From the temperature distribution returned by first temperature sensors 1, the method generates a PWM duty-cycle value and passes it to microcontroller 3. Microcontroller 3 generates the corresponding PWM signal and sends it to driver board 4, which uses the PWM signal to control the voltage that switching power supply 5 supplies to fan 7. Controlling the fan supply voltage adjusts the fan speed.

The control method comprises the following parts:

1. For a data center with a raised-floor structure, a Markov decision process model of the rack hotspot problem is established. The raised floor is the air supply structure of the data center: the machine room floor is elevated to leave an under-floor space 60-100 cm high through which the room air conditioner delivers cold air, and most data centers in China currently use this construction. The model consists of the following four parts, A to D:

A. System state: the set of rack air-inlet temperature distributions with history, given by:

φ_t = {s_{t-p}, …, s_x, …, s_{t-1}, s_t}, where

Figure BDA0002508955000000031

Here φ_t is the system state at time t; s_{t-p}, s_x, s_{t-1} and s_t are the rack air-inlet temperature distributions at times t-p, x, t-1 and t, with x ∈ [t-p, t]; p is the history length; T_i is the reading of the first temperature sensor numbered i,

Figure BDA0002508955000000032

Figure BDA0002508955000000033

is the set of first temperature sensors, and

Figure BDA0002508955000000034

is the total number of first temperature sensors.
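The state definition above can be sketched as a rolling window over inlet snapshots. This is a minimal illustration only: the class name `InletHistory` and its methods are hypothetical and do not come from the patent, and the sketch assumes that a state is usable only once p+1 snapshots have been collected.

```python
from collections import deque


class InletHistory:
    """Rolling window over the p+1 most recent inlet snapshots s_{t-p}..s_t."""

    def __init__(self, p):
        if p < 0:
            raise ValueError("history length p must be non-negative")
        # A bounded deque drops the oldest snapshot automatically.
        self.window = deque(maxlen=p + 1)

    def push(self, readings):
        """Store one snapshot s_t, i.e. one reading T_i per inlet sensor."""
        self.window.append(tuple(readings))

    def state(self):
        """Return phi_t as a tuple of snapshots, or None until the window is full."""
        if len(self.window) < self.window.maxlen:
            return None
        return tuple(self.window)
```

With p = 1, each state holds the previous and the current snapshot, and pushing a new snapshot shifts the window by one time step.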

B. Action space

Figure BDA0002508955000000041

defined as the discretized PWM duty-cycle values, given by:

Figure BDA0002508955000000042

where a is an action in

Figure BDA0002508955000000043

DC is the PWM signal duty cycle, max(DC) is the maximum duty cycle, D_DRL is the equal-division step used to discretize DC, and k is the number of D_DRL steps contained in an action;
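Because the formula itself is only available as an image, the following is a sketch under stated assumptions rather than the patent's definition: it assumes each action is a = k · D_DRL with integer k running from 0 up to max(DC) / D_DRL, and it expresses duty cycles as integer percentages. The function name `build_action_space` is hypothetical.

```python
def build_action_space(max_dc=100, d_drl=10):
    """Enumerate discretized duty cycles k * d_drl for k = 0 .. max_dc // d_drl.

    Assumption: duty cycles are integer percentages and d_drl divides max_dc.
    """
    if d_drl <= 0 or max_dc <= 0:
        raise ValueError("max_dc and d_drl must be positive")
    if max_dc % d_drl != 0:
        raise ValueError("d_drl must divide max_dc evenly")
    return [k * d_drl for k in range(max_dc // d_drl + 1)]
```

A smaller D_DRL gives finer control over fan speed at the cost of a larger action space for the network to learn over.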

C. Reward R_{t+1}, composed of two parts, a quantitative index of the uniformity of the rack air-inlet temperature distribution and the energy consumption of the active ventilation floor fan, given by:

Figure BDA0002508955000000044

where R_{t+1} is the reward obtained after the system takes an action at time t;

Figure BDA0002508955000000045

expresses the uniformity of the rack air-inlet temperature distribution. Its value is never positive, and the closer it is to 0, the more uniform the distribution. T_{t,i} is the reading of the first temperature sensor numbered i at time t,

Figure BDA0002508955000000046

is the rack reference temperature at time t,

Figure BDA0002508955000000047

T_{t,under} is the reading of the second temperature sensor at time t, and ΔT is a fixed positive temperature difference set according to how strongly cold and hot air mix above and below the active ventilation floor. The term -(A_ref × DC_t)^3 expresses the fan energy consumption. Its value is never positive, and the closer it is to 0, the lower the fan energy consumption. A_ref is a reference action value that keeps this term on the same order of magnitude as the uniformity term, and DC_t is the duty cycle of the PWM square wave at time t.
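The two-part reward can be sketched as follows. Only the energy term -(A_ref × DC_t)^3 is stated in the text; the uniformity term and the reference temperature are images in the source, so this sketch assumes, purely for illustration, a reference temperature of T_{t,under} + ΔT and a uniformity term equal to the negative sum of squared deviations from that reference. The function name `reward` and its parameter names are hypothetical.

```python
def reward(inlet_temps, t_under, delta_t, a_ref, duty_cycle):
    """Illustrative reward: uniformity term plus fan-energy term, both <= 0."""
    # Assumed reference temperature: under-floor supply temperature plus offset.
    t_ref = t_under + delta_t
    # Assumed uniformity term: negative sum of squared deviations from t_ref.
    uniformity = -sum((t - t_ref) ** 2 for t in inlet_temps)
    # Energy term as stated in the text: -(A_ref * DC_t)^3.
    energy = -((a_ref * duty_cycle) ** 3)
    return uniformity + energy
```

Under these assumptions the reward is 0 only when every inlet sensor reads the reference temperature and the fan is off, and it becomes more negative as the inlet becomes less uniform or the fan works harder.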

D. The value function Q(φ_t, a_t) is an action-value function, given by:

Figure BDA0002508955000000048

where the value function Q(φ_t, a_t) is called the Q function,

Figure BDA0002508955000000049

is the action taken by the system at time t,

Figure BDA00025089550000000410

is the expectation operator, y is a future time offset relative to time t, R_{t+y+1} is the reward obtained after the system takes an action at time t+y, and γ is the discount factor, 0 ≤ γ < 1, which expresses how much weight is given to the effect that taking an action in a given state has on the system's future rewards, i.e. on the environment. γ^y is γ to the power y and is the discount applied to R_{t+y+1} at time t+y.
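The quantity inside the expectation is a discounted sum of future rewards, Σ_y γ^y R_{t+y+1}. The sketch below evaluates that sum for one finite, already observed reward sequence; truncating the horizon and the function name `discounted_return` are assumptions made for illustration.

```python
def discounted_return(rewards, gamma):
    """Sum gamma**y * rewards[y] over a finite list of future rewards."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must satisfy 0 <= gamma < 1")
    total = 0.0
    # Accumulate from the last reward backwards: G = r + gamma * G.
    for r in reversed(rewards):
        total = r + gamma * total
    return total
```

With γ = 0 only the immediate reward counts; values of γ closer to 1 give later rewards more weight.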

E. The model can be summarized as follows: in the system state at any time t, the optimal action is selected so that the cumulative reward is maximized. The model is formulated as:

Figure BDA0002508955000000051

subject to

Figure BDA0002508955000000052

where γ^t is the discount applied to R_{t+1} at time t.

2. Solution of the model and solving algorithms

a. Solution of the model: once the optimal Q function has been computed, the optimal action can be selected from it in the system state at any time t, so that the cumulative reward is maximized. The optimal Q function is computed as:

Figure BDA0002508955000000053

At any time t, the optimal action is selected by:

Figure BDA0002508955000000054

where Q*(φ_t, a_t) is the optimal Q function, φ_{t+1} is the system state at time t+1, and a is any one of the actions the system may take at time t+1, i.e. an action in the action space

Figure BDA0002508955000000055

b. Solving algorithm: compute the optimal Q function and select the optimal action at each decision, so that the cumulative reward is maximized. The solving algorithms are the basic intelligent algorithm, the sample-value variant intelligent algorithm and the structural variant intelligent algorithm. All three train a neural network on sample records (φ_t, a_t, R_{t+1}, φ_{t+1}) accumulated through successive decisions, so that the network approximates the Q function and can then select the optimal action that maximizes the cumulative reward of the model. Here φ_t is the system state at time t, a_t is the action taken by the system at time t, R_{t+1} is the reward obtained after taking a_t, and φ_{t+1} is the system state at time t+1. The three algorithms are designed as follows:

a) The basic intelligent algorithm uses two neural networks of identical structure to approximate the Q function. One approximates the Q sample function and computes the Q sample value; it is called the targ network. The other approximates the Q prediction function and computes the Q predicted value; it is called the eval network. The sample records are used to compute the difference between the Q sample value and the Q predicted value, and this difference is used to train and update the neural network. The Q sample value is computed as:

Figure BDA0002508955000000061

where Q_{t+1,target} is the Q sample value; R_{t+1} and φ_{t+1} are taken from the sample record; γ, with 0 ≤ γ < 1, is the discount factor; Q(φ_{t+1}, a; θ_{t,target}) is the set of Q sample values output by the targ network; a ranges over all actions the system may take at time t+1;

Figure BDA0002508955000000062

is the action space; and θ_{t,target} is the parameter set of the targ network at time t.

The neural network is updated as follows:

Figure BDA0002508955000000063

where δ_{t+1} is the difference between the Q sample value and the corresponding Q predicted value; Q(φ_t, a_t; θ_{t,eval}) is the Q predicted value corresponding to a_t in the set of Q predictions output by the eval network; φ_t and a_t are taken from the sample record; θ_{t,eval} is the parameter set of the eval network at time t and θ_{t+1,eval} is its parameter set at time t+1;

Figure BDA0002508955000000064

is the gradient of

Figure BDA0002508955000000065

with respect to θ_{t,eval}; α is the learning step size of the neural network; θ_target is the parameter set of the targ network, and θ_eval the parameter set of the eval network, at times t that are integer multiples of N (including 0).
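The two-network scheme described above matches the familiar target-network form of deep Q-learning. Since the formulas are images in the source, the sketch below assumes that form: the Q sample value is R_{t+1} + γ · max_a Q(φ_{t+1}, a; θ_target), and δ_{t+1} is the sample value minus the eval network's prediction for the action actually taken. Network outputs are represented as plain lists of floats, and the function names are hypothetical.

```python
def q_sample_value(reward, gamma, q_targ_next):
    """Assumed form: R + gamma * max over the targ network's outputs for phi_{t+1}."""
    return reward + gamma * max(q_targ_next)


def td_error(q_sample, q_eval_current, action_index):
    """delta = Q sample value minus the eval network's prediction for a_t."""
    return q_sample - q_eval_current[action_index]
```

In a full implementation the eval network parameters would be moved along the gradient of its prediction, scaled by α and δ_{t+1}, while the targ network stays fixed between the periodic copies made every N steps.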

b) The sample-value variant intelligent algorithm computes the Q sample value with the formula:

Figure BDA0002508955000000066

where Q_{t+1,target} is the Q sample value; R_{t+1} and φ_{t+1} are taken from the sample record; Q(φ_{t+1}, a; θ_{t,target}) is the set of Q sample values output by the targ network;

Figure BDA0002508955000000071

is, within the set of Q sample values output by the targ network, the Q sample value of the action that maximizes Q_eval(φ_{t+1}, a; θ_{t,eval}); Q_eval(φ_{t+1}, a; θ_{t,eval}) is the set of Q predictions output by the eval network; a is any one of the actions the system may take at time t+1, i.e. an action in the action space

Figure BDA0002508955000000072

θ_{t,eval} is the parameter set of the eval network at time t, and θ_{t,target} is the parameter set of the targ network at time t;

The neural network structure and update rule of the sample-value variant intelligent algorithm are the same as those of the basic intelligent algorithm.
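The variant described above selects the next action with the eval network but reads its value from the targ network, which corresponds to the Double DQN style of target. The sketch below assumes that form, with network outputs given as plain lists indexed by action; the function name `double_q_sample_value` is hypothetical.

```python
def double_q_sample_value(reward, gamma, q_eval_next, q_targ_next):
    """R + gamma * Q_targ(phi', a*), where a* maximizes Q_eval(phi', a)."""
    if len(q_eval_next) != len(q_targ_next):
        raise ValueError("both networks must score the same action space")
    # Action selection uses the eval network...
    best = max(range(len(q_eval_next)), key=lambda i: q_eval_next[i])
    # ...while the value of that action comes from the targ network.
    return reward + gamma * q_targ_next[best]
```

Decoupling selection from evaluation in this way is commonly used to reduce the overestimation that arises when a single network both picks and scores the maximizing action.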

c) The structural variant intelligent algorithm uses two neural networks of identical structure and places a DN layer as the penultimate layer of each network. The DN layer is divided into a V segment and an A segment. The V segment has 1 neuron node and represents the system state at time t. The A segment has as many neurons as there are elements in the action space and represents all actions that may be taken in that system state. The DN layer is computed as:

Figure BDA0002508955000000073

where Q(φ_t, a_t; θ_t, θ_{t,V}, θ_{t,A}) is the final output of the neural network; φ_t and a_t are taken from the sample record; θ_t is the set of network parameters before the DN layer of the structural variant network at time t; θ_{t,V} and θ_{t,A} are the parameters of the V segment and the A segment of the DN layer at time t; V(φ_t; θ_t, θ_{t,V}) is the output of the V segment; A(φ_t, a_t; θ_t, θ_{t,A}) is the A-segment output corresponding to a_t; A(φ_t, a'; θ_t, θ_{t,A}) is the complete output of the A segment; a' ranges over all actions the system may take in state φ_t; and

Figure BDA0002508955000000074

is the number of elements in the action space.

The neural network is then trained and updated using the same Q sample value calculation and update rule as in the sample-value variant intelligent algorithm.
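The V and A segments described above resemble a dueling network head. The formula is an image in the source, so this sketch assumes the usual mean-subtracted aggregation, Q(a) = V + A(a) − mean over a' of A(a'), which is consistent with the listed symbols (the full A-segment output and the number of actions). The function name `dueling_q_values` is hypothetical.

```python
def dueling_q_values(v, advantages):
    """Combine a scalar state value V with per-action advantages A(a)."""
    if not advantages:
        raise ValueError("the A segment needs at least one action")
    mean_adv = sum(advantages) / len(advantages)
    # Subtracting the mean keeps V and A identifiable from their sum.
    return [v + a - mean_adv for a in advantages]
```

Under this aggregation the Q values average to V over the action space, so the V segment carries the value of the state and the A segment only the relative merit of each duty-cycle choice.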

3. By continually exploring and learning the complex relationship between the rack air-inlet temperature distribution and the fan speed of the active ventilation floor, the method ultimately generates the optimal PWM duty-cycle value from the rack air-inlet temperature distribution and adjusts the fan speed, so that the inlet temperature distribution becomes uniform and rack hotspots are mitigated. Its run logic on the PC is as follows:

1: For the chosen control algorithm, build and initialize the corresponding neural networks, and set the targ network parameters equal to the eval network parameters; set up the buffer array for sample records; set the reference temperature T_t;

2: Set the initial time t = 0; denote by τ the time of a sample record in the buffer array; set the initial exploration probability ε, the amount Δε by which the exploration probability decreases with t, and the minimum exploration probability ε_min;

3: Select actions at random for Z time steps, and store the record (φ_z, a_z, R_{z+1}, φ_{z+1}) produced at each step, with z ∈ [0, Z) and z+1 ∈ [0, Z], in the buffer array;

4: Obtain the initial rack air-inlet temperature distribution

Figure BDA0002508955000000081

5: Loop body begins;

6: Obtain the p historical rack air-inlet temperature distributions, which together form a system state φ_t = {s_{t-p}, …, s_{t-1}, s_t};

7: If t = 0, select the action a_t = max(DC) and go to 9; otherwise go to 8;

8: Select the action using the following formula:

Figure BDA0002508955000000082

9: Execute a_t: the PC sends the duty-cycle command to the microcontroller, which changes the fan speed; obtain the rack air-inlet temperature distribution s_{t+1} at the next time step, and compute R_{t+1} according to the reward formula in claim 4;

10: From the latest p temperature distribution history entries, form the next state φ_{t+1} = {s_{t+1-p}, …, s_t, s_{t+1}}, and store (φ_t, a_t, R_{t+1}, φ_{t+1}) in the buffer array;

11: Randomly draw Y sample records (φ_τ, a_τ, R_{τ+1}, φ_{τ+1}) from the buffer array;

12: Depending on the control algorithm, use the Y records to compute the Q sample values with the following formula:

Figure BDA0002508955000000083

13: Update the eval network using the learning step size α and the following loss function:

Figure BDA0002508955000000084

14: Set the exploration probability ε to the minimum of ε - Δε and ε_min;

15: If t mod N = 0, the targ network copies the eval network parameters; otherwise go to 16;

16: Increase time t by 1;

17: Loop body ends.
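The bookkeeping in steps 8, 11, 14 and 15 can be sketched with a few helpers. The selection formula of step 8 is an image in the source, so the sketch assumes an ε-greedy rule: explore with probability ε, otherwise take the action with the highest Q value. It also floors ε at ε_min using max(...), on the assumption that a decaying schedule is the intent of step 14. All function names are hypothetical.

```python
import random


def epsilon_greedy(q_values, actions, eps, rng=random):
    """Step 8 (assumed form): random action with probability eps, else greedy."""
    if rng.random() < eps:
        return rng.choice(actions)
    best = max(range(len(actions)), key=lambda i: q_values[i])
    return actions[best]


def sample_minibatch(buffer, y, rng=random):
    """Step 11: draw up to Y distinct sample records from the buffer."""
    return rng.sample(buffer, min(y, len(buffer)))


def decay_epsilon(eps, delta_eps, eps_min):
    """Step 14 (assumed intent): lower eps by delta_eps, never below eps_min."""
    return max(eps - delta_eps, eps_min)


def should_sync_target(t, n):
    """Step 15: the targ network copies the eval network every N steps."""
    return t % n == 0
```

Passing a seeded `random.Random` instance as `rng` makes exploration and minibatch sampling reproducible, which helps when comparing the three algorithms on recorded temperature data.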

In summary, the invention establishes a Markov decision process model for the data center rack hotspot problem and provides three algorithms for solving the model: a basic intelligent algorithm, a sample-value variant intelligent algorithm and a structural variant intelligent algorithm, each of which can serve as the core of the active ventilation floor control algorithm. The model consists of four parts: system state, action, reward and value function. Its solution is to keep selecting the optimal action over a sequence of system states so that the cumulative reward of the system is maximized. By continually exploring and learning the complex relationship between the rack air-inlet temperature distribution and the fan speed of the active ventilation floor, the control algorithm can ultimately generate the optimal PWM duty-cycle value from the rack air-inlet temperature distribution and adjust the fan speed, so that the inlet temperature distribution becomes uniform and rack hotspots are mitigated. Compared with other schemes, the invention is more generally applicable, easier to deploy and more cost-effective.

Claims (6)

1. An intelligent control method for an active ventilation floor in a data center, characterized by comprising the following steps:
Step 1: install a number of first temperature sensors at the rack air inlet to monitor the rack air inlet temperature distribution, and install one second temperature sensor under the active ventilation floor to monitor the supply air temperature under the active ventilation floor;
Step 2: establish a Markov decision process model for the data center rack hotspot problem, the model consisting of four parts: the system state φ_t, the behavior space
Figure FDA0002508954990000011
the reward R_{t+1}, and the value function Q(φ_t, a_t);
wherein the system state φ_t at time t is defined as the set of rack air inlet temperature distributions with history, given by: φ_t = {s_{t-p}, …, s_x, …, s_{t-1}, s_t}, where
Figure FDA0002508954990000012
where s_{t-p}, s_x, s_{t-1}, and s_t are the rack air inlet temperature distributions at times t−p, x, t−1, and t respectively, x ∈ [t−p, t], and p is the history length; T_i is the reading of the first temperature sensor numbered i,
Figure FDA0002508954990000013
Figure FDA0002508954990000014
is the set of first temperature sensors, and
Figure FDA0002508954990000015
is the total number of first temperature sensors;
the behavior space
Figure FDA0002508954990000016
is defined as the set of discretized PWM signal duty cycle values, given by:
Figure FDA0002508954990000017
where a is a behavior in
Figure FDA0002508954990000018
DC is the PWM signal duty cycle, max(DC) is the maximum duty cycle, D_DRL is the equal-division ratio used to discretize DC, and k is the number of D_DRL units contained in a behavior;
the reward R_{t+1} consists of two parts, a quantitative index of the uniformity of the rack air inlet temperature distribution and the energy consumption of the active ventilation floor fan, given by:
Figure FDA0002508954990000019
where R_{t+1} is the reward obtained after the system takes a behavior at time t,
Figure FDA0002508954990000021
represents the uniformity of the rack air inlet temperature distribution; its value is always negative, and the closer it is to 0, the more uniform the distribution; T_{t,i} is the temperature reading of the first temperature sensor numbered i at time t,
Figure FDA0002508954990000022
is the rack reference temperature at time t,
Figure FDA0002508954990000023
T_{t,under} is the reading of the second temperature sensor at time t, and ΔT is a fixed positive temperature difference set according to the degree of mixing of hot and cold air above and below the active ventilation floor; −(A_ref × DC_t)^3 represents the energy consumption of the active ventilation floor fan; its value is always negative, and the closer it is to 0, the lower the fan energy consumption; A_ref is a reference behavior value that keeps this term at the same order of magnitude as the temperature uniformity term, and DC_t is the duty cycle of the PWM square wave at time t;
the value function Q(φ_t, a_t) is the behavior value function, given by:
Figure FDA0002508954990000024
where the value function Q(φ_t, a_t) is called the Q function,
Figure FDA0002508954990000025
is the behavior taken by the system at time t,
Figure FDA0002508954990000026
is the expectation function, y is a future time offset relative to time t, R_{t+y+1} is the reward obtained after the system takes a behavior at time t+y, and γ is the decay factor, which expresses how much weight is given to the effect that taking a behavior in a given state has on the system's future rewards, that is, on the environment, with 0 ≤ γ < 1; γ^y is γ raised to the power y and is the decay factor applied to R_{t+y+1} at time t+y;
the Markov decision process model is summarized as: in the system state at any time t, select the optimal behavior so that the cumulative reward of the system is maximized, given by:
Figure FDA0002508954990000027
subject to
Figure FDA0002508954990000028
where γ^t is the decay factor applied to R_{t+1} at time t;
Step 3: solve the model; by continuously exploring and learning the complex relationship between the rack air inlet temperature distribution and the fan speed of the active ventilation floor, the optimal PWM signal duty cycle is ultimately generated from the rack air inlet temperature distribution and the fan speed of the active ventilation floor is adjusted, making the rack air inlet temperature distribution uniform and alleviating the rack hotspot problem.
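The four model components of claim 1 can be illustrated with a short Python sketch. It is a hedged reading of the definitions, not the formulas in the figures: all function names are hypothetical, duty cycles are expressed as integer percentages for the behavior space, the reference temperature is assumed to be T_{t,under} + ΔT, and the uniformity term is assumed to be a negative mean squared deviation from that reference. Only the energy term −(A_ref × DC_t)^3 and the discount weights γ^y come directly from the text.

```python
from collections import deque


def build_action_space(max_dc, d_drl):
    """Discretized duty cycles a = k * d_drl for k = 0 .. max_dc // d_drl (assumed form)."""
    return [k * d_drl for k in range(max_dc // d_drl + 1)]


def make_state(history, p):
    """phi_t = (s_{t-p}, ..., s_{t-1}, s_t): the p + 1 most recent distributions."""
    if len(history) < p + 1:
        raise ValueError("need at least p + 1 temperature distributions")
    return tuple(tuple(s) for s in list(history)[-(p + 1):])


def reward(temps, t_under, delta_t, a_ref, dc):
    """R_{t+1} = uniformity term + fan energy term, both non-positive.

    ASSUMPTIONS: t_ref = t_under + delta_t, and the uniformity term is the
    negative mean squared deviation from t_ref. dc is a fraction in [0, 1].
    """
    t_ref = t_under + delta_t
    uniformity = -sum((t - t_ref) ** 2 for t in temps) / len(temps)
    energy = -(a_ref * dc) ** 3          # energy term as stated in the text
    return uniformity + energy


def discounted_return(rewards, gamma):
    """Sum over y of gamma**y * R_{t+y+1} for a finite reward sequence."""
    return sum(gamma ** y * r for y, r in enumerate(rewards))


# Hypothetical readings from two inlet sensors over three time steps.
history = deque([[20.0, 21.0], [22.0, 23.0], [24.0, 25.0]])
state = make_state(history, p=1)
actions = build_action_space(100, 25)
```

The Q function is the expectation of `discounted_return` over future rewards, so a controller that ranks behaviors by Q is ranking them by expected discounted reward.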
2. The intelligent control method for an active ventilation floor in a data center according to claim 1, characterized in that, in step 2, once the optimal Q function has been computed, the optimal behavior can be selected from it in the system state at any time t so as to maximize the cumulative reward, the optimal Q function being computed as:
Figure FDA0002508954990000031
At any time t, the optimal behavior is selected by:
Figure FDA0002508954990000032
where Q*(φ_t, a_t) is the optimal Q function, φ_{t+1} is the system state at time t+1, and a is any one of the behaviors the system may take at time t+1, that is, a behavior belonging to
Figure FDA0002508954990000033
(the behavior space).
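For orientation, the symbol definitions in claim 2 match the standard Bellman optimality equation and greedy behavior selection used in Q-learning. The block below is a hedged reconstruction from those definitions, not a copy of the figures; the symbol \mathcal{A} for the behavior space is an assumption.

```latex
Q^{*}(\varphi_t, a_t) = \mathbb{E}\left[ R_{t+1} + \gamma \max_{a \in \mathcal{A}} Q^{*}(\varphi_{t+1}, a) \,\middle|\, \varphi_t, a_t \right]

a_t^{*} = \arg\max_{a \in \mathcal{A}} Q^{*}(\varphi_t, a)
```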
3. The intelligent control method for an active ventilation floor in a data center according to claim 1, characterized in that, in step 3, the model is solved using the basic intelligent algorithm, the sample-value variant intelligent algorithm, and the structural variant intelligent algorithm; sample records (φ_t, a_t, R_{t+1}, φ_{t+1}) accumulated through continuous decision-making are used to train a neural network so that the network approximates the Q function and the optimal behavior can then be selected, maximizing the cumulative reward of the model, where φ_{t+1} is the system state at time t+1.
4. The intelligent control method for an active ventilation floor in a data center according to claim 3, characterized in that the basic intelligent algorithm uses two neural networks of identical structure to approximate the Q function: one approximates the Q sample function and computes the Q sample value, and is called the targ network; the other approximates the Q prediction function and computes the Q predicted value, and is called the eval network; the difference between the Q sample value and the Q predicted value is computed from the sample records and used to train and update the neural network, the Q sample value being computed as:
Figure FDA0002508954990000034
In the sample-value variant intelligent algorithm, the Q sample value is computed as:
Figure FDA0002508954990000041
where Q_{t+1,target} is the Q sample value, R_{t+1} and φ_{t+1} are taken from the sample record, Q(φ_{t+1}, a; θ_{t,target}) is the set of Q samples output by the targ network,
Figure FDA0002508954990000042
is, within the set of Q samples output by the targ network, the Q sample value corresponding to the behavior that maximizes Q_eval(φ_{t+1}, a; θ_{t,eval}), Q_eval(φ_{t+1}, a; θ_{t,eval}) is the set of Q predictions output by the eval network, a is any one of the behaviors the system may take at time t+1, that is, a behavior belonging to
Figure FDA0002508954990000043
(the behavior space), θ_{t,eval} is the set of eval network parameters at time t, and θ_{t,target} is the set of targ network parameters at time t;
The neural network is updated as follows:
Figure FDA0002508954990000044
where δ_{t+1} is the difference between the Q sample value and the corresponding Q predicted value, Q(φ_t, a_t; θ_{t,eval}) is the Q predicted value corresponding to a_t in the set of Q predictions output by the eval network, φ_t and a_t are taken from the sample record, θ_{t+1,eval} is the set of eval network parameters at time t+1,
Figure FDA0002508954990000045
is the gradient of
Figure FDA0002508954990000046
with respect to θ_{t,eval}, α is the learning step size of the neural network, θ_target is the set of targ network parameters when time t is an integer multiple of N (including 0), and θ_eval is the set of eval network parameters when time t is an integer multiple of N (including 0).
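The two Q sample value rules and the update step of claim 4 can be sketched as follows. This is an illustration under stated assumptions rather than the patent's code: the function names are hypothetical, the targets follow the usual DQN and Double DQN forms that the symbol definitions suggest, and the neural network is replaced by an assumed linear model Q(φ, a) = θ[a] · φ so that the gradient is simply φ.

```python
def dqn_target(r, q_targ_next, gamma):
    """Basic algorithm (assumed form): R + gamma * max_a Q(phi', a; theta_target)."""
    return r + gamma * max(q_targ_next)


def double_dqn_target(r, q_targ_next, q_eval_next, gamma):
    """Sample-value variant: eval picks the behavior, targ supplies its value."""
    best = max(range(len(q_eval_next)), key=q_eval_next.__getitem__)
    return r + gamma * q_targ_next[best]


def td_update(theta, phi, a, target, alpha):
    """One step theta <- theta + alpha * delta * grad Q for the ASSUMED linear model."""
    q_pred = sum(w * x for w, x in zip(theta[a], phi))
    delta = target - q_pred                      # delta_{t+1}: sample minus prediction
    theta[a] = [w + alpha * delta * x for w, x in zip(theta[a], phi)]
    return delta


# Hypothetical numbers: two behaviors, two state features.
theta = [[0.0, 0.0], [0.0, 0.0]]
delta = td_update(theta, [1.0, 2.0], 1, 3.0, 0.5)
```

The two targets differ only in which network chooses the next behavior: in the variant, the targ network no longer both selects and evaluates it.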
5. The intelligent control method for an active ventilation floor in a data center according to claim 4, characterized in that the structural variant intelligent algorithm uses two neural networks of identical structure and places a DN layer as the penultimate layer of each network; the DN layer is divided into a V segment and an A segment, where the V segment has 1 neuron node, representing the system state at time t, and the number of neurons in the A segment equals the number of elements in the behavior space, representing all behaviors that may be taken in that system state; the DN layer is computed as:
Figure FDA0002508954990000047
where Q(φ_t, a_t; θ_t, θ_{t,V}, θ_{t,A}) is the final output of the neural network, φ_t and a_t are taken from the sample record, θ_t is the set of network parameters before the DN layer of the structural variant neural network at time t, θ_{t,V} is the set of V-segment parameters of the DN layer at time t, θ_{t,A} is the set of A-segment parameters of the DN layer at time t, V(φ_t; θ_t, θ_{t,V}) is the V-segment output, A(φ_t, a_t; θ_t, θ_{t,A}) is the A-segment output corresponding to a_t, A(φ_t, a'; θ_t, θ_{t,A}) denotes all A-segment outputs, a' ranges over all behaviors the system may take in state φ_t, and
Figure FDA0002508954990000051
is the number of elements in the behavior space;
thereafter, the neural network is trained and updated using the same Q sample value computation and network update rule as the sample-value variant intelligent algorithm.
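The DN layer resembles the aggregation step of a dueling network. A minimal Python sketch follows, assuming the formula subtracts the mean A-segment output (suggested by the reference to the number of elements in the behavior space); the function name and the numbers are hypothetical.

```python
def dn_layer(v, advantages):
    """Q(phi, a) = V(phi) + A(phi, a) - mean over a' of A(phi, a') (assumed form)."""
    mean_a = sum(advantages) / len(advantages)
    return [v + adv - mean_a for adv in advantages]


# Hypothetical outputs: V segment = 2.0, A segment = [1.0, 3.0] for two behaviors.
q_values = dn_layer(2.0, [1.0, 3.0])
```

Subtracting the mean keeps the V and A segments identifiable: adding a constant to every A output leaves the resulting Q values unchanged.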
6. The intelligent control method for an active ventilation floor in a data center according to claim 1, characterized in that the operating logic of the intelligent control method is as follows:
1: Depending on the control algorithm, construct and initialize the corresponding neural networks, and set the targ network parameters equal to the eval network parameters; set up the cache array for the sample records; set the reference temperature
Figure FDA0002508954990000054
2: Set the initial time t = 0, and denote the time of a sample record in the cache array by τ; set the initial behavior exploration probability ε, the decrement Δε by which the exploration probability decreases with t, and the minimum exploration probability ε_min;
3: Select behaviors at random for Z time steps, and store the record (φ_z, a_z, R_{z+1}, φ_{z+1}) generated at each time step, with z ∈ [0, Z) and z+1 ∈ [0, Z], in the cache array;
4: Obtain the initial rack air inlet temperature distribution
Figure FDA0002508954990000052
5: Start of the loop body;
6: Obtain the p historical rack air inlet temperature distributions, which together form a system state φ_t = {s_{t-p}, …, s_{t-1}, s_t};
7: If t = 0, select the behavior a_t = max(DC) and go to 9; otherwise go to 8;
8: Select the behavior using the following formula:
Figure FDA0002508954990000053
9: Execute a_t: the PC sends the duty cycle command to the microcontroller, which changes the fan speed; obtain the rack air inlet temperature distribution s_{t+1} at the next time step, and compute R_{t+1} according to the reward formula in claim 4;
10: From the latest p temperature distribution history entries, form the next state φ_{t+1} = {s_{t+1-p}, …, s_t, s_{t+1}}, and store (φ_t, a_t, R_{t+1}, φ_{t+1}) in the cache array;
11: Randomly draw Y sample records (φ_τ, a_τ, R_{τ+1}, φ_{τ+1}) from the cache array;
12: Depending on the control algorithm, compute the Q sample values from the Y records using the following formula:
Figure FDA0002508954990000061
13: Update the eval network using the learning step size α and the following loss function:
Figure FDA0002508954990000062
14: Set the exploration probability ε to the minimum of ε − Δε and ε_min;
15: If t mod N = 0, the targ network copies the eval network parameters; otherwise go to 16;
16: Increase time t by 1;
17: End of the loop body.
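The cache array of steps 3, 10, and 11 and the behavior selection of step 8 can be sketched in Python. The class and function names are hypothetical, the bounded capacity is an assumption (the text does not say whether old records are discarded), and step 8 is assumed to be an ε-greedy rule because the procedure maintains an exploration probability ε.

```python
import random
from collections import deque


class ReplayBuffer:
    """Cache array of (phi, a, r, phi_next) sample records."""

    def __init__(self, capacity):
        self.records = deque(maxlen=capacity)    # oldest records are dropped first

    def store(self, phi, a, r, phi_next):
        self.records.append((phi, a, r, phi_next))

    def sample(self, y, rng=random):
        """Step 11: draw up to y records uniformly without replacement."""
        return rng.sample(list(self.records), min(y, len(self.records)))


def select_action(q_values, actions, eps, rng=random):
    """Step 8 (assumed epsilon-greedy form): explore with probability eps."""
    if rng.random() < eps:
        return rng.choice(actions)               # random behavior
    best = max(range(len(actions)), key=q_values.__getitem__)
    return actions[best]                         # behavior with the largest Q prediction


# Hypothetical usage: three duty cycle behaviors and their predicted Q values.
buf = ReplayBuffer(capacity=2)
for step in range(3):
    buf.store(("phi", step), 50, -1.0, ("phi", step + 1))
greedy = select_action([0.1, 0.9, 0.3], [0, 50, 100], eps=0.0)
```

Sampling records at random rather than in arrival order reduces the correlation between consecutive training samples.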
CN202010455152.6A 2020-05-26 2020-05-26 Intelligent control method for active ventilation floor in data center Active CN111637614B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010455152.6A CN111637614B (en) 2020-05-26 2020-05-26 Intelligent control method for active ventilation floor in data center

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010455152.6A CN111637614B (en) 2020-05-26 2020-05-26 Intelligent control method for active ventilation floor in data center

Publications (2)

Publication Number Publication Date
CN111637614A true CN111637614A (en) 2020-09-08
CN111637614B CN111637614B (en) 2021-06-08

Family

ID=72329604

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010455152.6A Active CN111637614B (en) 2020-05-26 2020-05-26 Intelligent control method for active ventilation floor in data center

Country Status (1)

Country Link
CN (1) CN111637614B (en)

Patent Citations (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030050003A1 (en) * 2001-09-07 2003-03-13 International Business Machines Corporation Air flow management system for an internet data center
CN101634480A (en) * 2009-08-05 2010-01-27 于郡东 System and method for controlling under-floor air distribution air-conditioning fan device in machine room
CN201854532U (en) * 2010-05-17 2011-06-01 罗达俊 Temperature control unit and automatic floor pressurization system therewith
CN103292426A (en) * 2012-02-27 2013-09-11 华为技术有限公司 Machine room cooling device and cooling air supply adjusting method
CN203024346U (en) * 2013-01-23 2013-06-26 东北石油大学 Air conditioner control system of machine room
CN103631351A (en) * 2013-12-17 2014-03-12 北京百度网讯科技有限公司 Fan control method and device of server and server
CN203720764U (en) * 2013-12-30 2014-07-16 国家计算机网络与信息安全管理中心 Cooling device used for closed-door cabinet
CN106461260A (en) * 2014-05-15 2017-02-22 三星电子株式会社 Method and apparatus for controlling temperature
CN204902119U (en) * 2015-08-25 2015-12-23 贵州电网公司信息通信分公司 Computer room of data center temperature intelligent control device
CN105975029A (en) * 2016-06-13 2016-09-28 天津欧迈通信技术有限公司 Distributed temperature control case radiating system
CN106528941A (en) * 2016-10-13 2017-03-22 内蒙古工业大学 Data center energy consumption optimization resource control algorithm under server average temperature constraint
CN206283772U (en) * 2016-12-27 2017-06-27 贵州电网有限责任公司信息中心 A kind of active air-flow drainage ventilation unit
CN111096094A (en) * 2017-09-06 2020-05-01 维谛公司 Cooling unit energy optimization via intelligent supply air temperature setpoint control
CN108446783A (en) * 2018-01-29 2018-08-24 杭州电子科技大学 A kind of prediction of new fan operation power and monitoring method
CN108921223A (en) * 2018-07-05 2018-11-30 广东水利电力职业技术学院(广东省水利电力技工学校) A kind of server cooling system and control method, computer program, computer
CN109654674A (en) * 2018-12-11 2019-04-19 珠海格力电器股份有限公司 Air conditioning system fan control method, air conditioning system and computer readable storage medium
CN110528815A (en) * 2019-06-26 2019-12-03 中电万维信息技术有限责任公司 A kind of based on the controllable anti-static ventilation of the novel Wind Volume of data center machine room Slab
CN110831407A (en) * 2019-11-06 2020-02-21 上海理工大学 Data room tower type multi-hole air supply temperature control device
CN111144793A (en) * 2020-01-03 2020-05-12 南京邮电大学 Commercial building HVAC control method based on multi-agent deep reinforcement learning
CN111126605A (en) * 2020-02-13 2020-05-08 创新奇智(重庆)科技有限公司 Data center machine room control method and device based on reinforcement learning algorithm
CN111351180A (en) * 2020-03-06 2020-06-30 上海外高桥万国数据科技发展有限公司 System and method for realizing energy conservation and temperature control of data center by applying artificial intelligence

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JIANXIONG WAN et al.: "Air Flow Measurement and Management for", IEEE ACCESS *
Li Yongli (李永利) et al.: "Research on a machine-learning-based active floor model for data centers", Computer Simulation (《计算机仿真》) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117588394A (en) * 2024-01-18 2024-02-23 华土木(厦门)科技有限公司 AIoT-based intelligent linkage control method and system for vacuum pump
CN117588394B (en) * 2024-01-18 2024-04-05 华土木(厦门)科技有限公司 AIoT-based intelligent linkage control method and system for vacuum pump

Also Published As

Publication number Publication date
CN111637614B (en) 2021-06-08

Similar Documents

Publication Publication Date Title
JP7075944B2 (en) How to operate the predictive building control system and the equipment in the system
CN109189190B (en) Data center heat management method based on temperature prediction
Huang et al. A neural network-based multi-zone modelling approach for predictive control system design in commercial buildings
Fang et al. Cross temporal-spatial transferability investigation of deep reinforcement learning control strategy in the building HVAC system level
EP3585139B1 (en) A chassis intelligent airflow control and cooling regulation mechanism
CN106949598A (en) Network center's machine room energy-saving optimization method when network traffic load changes
WO2011006344A1 (en) Temperature regulating device and intelligent temperature control method for sand dust environment test system
CN110852498A (en) A method for predicting data center energy efficiency value PUE based on GRU neural network
Wan et al. Intelligent rack-level cooling management in data centers with active ventilation tiles: A deep reinforcement learning approach
CN115103562A (en) Distributed intelligent control method of data center air conditioner
KR102148726B1 (en) Method for controlling economizer air conditioning system
CN116907036A (en) Deep reinforcement learning water chilling unit control method based on cold load prediction
CN115732810A (en) Control method of electric vehicle battery pack heating system
CN111637614A (en) Intelligent control method for active ventilation floor in data center
CN115361841B (en) Shielding pump control system and method suitable for all-condition operation of data center
CN111601490B (en) Reinforced learning control method for data center active ventilation floor
He et al. Efficient model-free control of chiller plants via cluster-based deep reinforcement learning
CN115717758A (en) Indoor space temperature and humidity regulation and control method and system
Tashiro et al. Application of convolutional neural network to prediction of temperature distribution in data centers
CN117930647B (en) Data center cooling control method and device based on thermal prediction model
WO2021234763A1 (en) Indoor temperature estimation device, program, and indoor temperature estimation method
CN117574613A (en) Temperature prediction system of automatic machine room based on artificial intelligence
CN117525480A (en) Fuel cell automobile integrated thermal management system and method based on neural network
CN116954329A (en) Method, device, equipment, medium and program product for regulating state of refrigeration system
TWI811858B (en) Temperature control system and temperature control method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant