CN111601490B - Reinforcement learning control method for data center active ventilation floor - Google Patents

Reinforcement learning control method for data center active ventilation floor

Info

Publication number: CN111601490B
Application number: CN202010456237.6A
Authority: CN (China)
Other languages: Chinese (zh)
Other versions: CN111601490A
Prior art keywords: rack, time, value, active ventilation, ventilation floor
Inventors: 万剑雄, 周杰, 熊伟
Original and current assignee: Inner Mongolia University of Technology
Application filed by Inner Mongolia University of Technology; published as CN111601490A; application granted and published as CN111601490B
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H05: ELECTRIC TECHNIQUES NOT OTHERWISE PROVIDED FOR
    • H05K: PRINTED CIRCUITS; CASINGS OR CONSTRUCTIONAL DETAILS OF ELECTRIC APPARATUS; MANUFACTURE OF ASSEMBLAGES OF ELECTRICAL COMPONENTS
    • H05K7/00: Constructional details common to different types of electric apparatus
    • H05K7/20: Modifications to facilitate cooling, ventilating, or heating
    • H05K7/20709: Modifications to facilitate cooling, ventilating, or heating for server racks or cabinets; for data centers, e.g. 19-inch computer racks
    • H05K7/20836: Thermal management, e.g. server temperature control

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Thermal Sciences (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Air Conditioning Control Device (AREA)

Abstract

A reinforcement learning control method for the active ventilation floor of a data center. A Markov decision process model is established for the rack hot-spot problem of a raised-floor-structure data center, and a reinforcement learning model solving algorithm is provided as the core of the reinforcement learning control algorithm. Without increasing the machine-room air-conditioning power, the method intelligently controls the fan speed of the active ventilation floor (an ordinary ventilation floor with a fan attached to its underside) according to the current rack temperature distribution, homogenizing the rack-inlet temperature distribution by actively delivering a sufficient amount of cold air. This alleviates the rack hot-spot problem common in raised-floor data centers, saves refrigeration energy, and ensures server safety and stability. Compared with existing rack-level airflow management methods for data centers, the method is easier to deploy, more cost-effective, and more universally applicable.

Description

Reinforcement learning control method for data center active ventilation floor
Technical Field
The invention belongs to the technical field of automatic control, and particularly relates to a reinforcement learning control method for an active ventilation floor of a data center.
Background
A rack hot spot is a location on a data center machine-room rack whose temperature is significantly higher than that of other locations. Excessive temperatures lower the operating efficiency of the affected servers, reducing the data center's overall power density as well as its reliability, which clearly runs counter to data center requirements.
Alleviating or eliminating rack hot spots by global regulation, for example raising the machine-room air-conditioning power to supply abundant cold air, inevitably leaves most rack areas over-cooled. Besides wasting cooling resources, this further inflates a cooling bill that already accounts for nearly half of a data center's total energy consumption. Rack-level cooling solutions are therefore better suited to mitigating the rack hot-spot problem.
Rack-level refrigeration solutions already exist, such as installing adaptive ventilation floors, installing baffles, or enclosing individual racks and providing them with dedicated ventilation ducts. These, however, are "passive" cooling solutions: they do not actively deliver a cooling airflow to the racks, and they fall short when the cold-air supply is insufficient.
The active ventilation floor is another rack-level refrigeration scheme that relieves the rack hot-spot problem by actively delivering cold air; compared with the schemes above, it is easier to deploy and more cost-effective. Its main control difficulty lies in the diversity and dynamics of the deployment environment: machine-room air conditioners, racks, and the servers within racks are laid out differently; cold and hot aisles are contained to different degrees; server racks follow different standards and sealing conditions; and machine-room air-conditioning power and the thermal load of individual rack servers vary. As a result, the thermal efficiency and airflow behavior of a data center are generally difficult to capture with an analytic model.
Most existing research on active ventilation floors concerns performance modeling and evaluation based on measurement or simulation; to date there is no published research on the active ventilation floor control problem.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention aims to provide a reinforcement learning control method for the active ventilation floor of a data center that automatically learns an optimal operating strategy and plans rack airflow without increasing the machine-room air-conditioning power, so that the rack temperature distribution becomes uniform and the rack hot-spot problem is alleviated. No complex airflow and heat-exchange models need to be established or calibrated, which improves the universality of the active ventilation floor.
In order to achieve the purpose, the invention adopts the technical scheme that:
a reinforcement learning control method for an active ventilation floor of a data center is used for establishing a Markov decision process model for a rack hotspot problem of a lifting floor structure data center and providing a reinforcement learning model solving algorithm and an array algorithm as the core of a reinforcement learning control algorithm. The model consists of four parts, namely a system state, a behavior, an incentive and a value function, the solution of the model is that the optimal behavior is continuously selected under a series of system states to maximize the accumulated incentive of the system, the reinforcement learning control algorithm utilizes whether the temperature distribution of the air inlet of the rack is uniform and whether the energy consumption of the active ventilation floor is low as evaluation standards, and adjusts the rotating speed of the fan of the active ventilation floor by continuously exploring and learning the complex relation between the duty ratio value of the PWM signal and the rising, the lowering or the constant maintaining of the value, so that the temperature distribution of the air inlet of the rack is uniform, and the hot spot problem of the rack is relieved.
Compared with the prior art, the invention has the beneficial effects that:
The invention requires no complex airflow and heat-exchange models to be established or calibrated. Using an array control algorithm, it copes with the diversity and dynamics of the active ventilation floor's deployment environment, automatically learning the relationship between the PWM duty-cycle value and raising, lowering, or maintaining it from whether the rack-inlet temperature distribution is uniform and from the floor's energy consumption. To operate the invention, one only needs to replace the original ordinary ventilation floor with an active ventilation floor.
Compared with an intelligent control method that combines three intelligent algorithms, the reinforcement learning control method using the array algorithm is simpler and requires less computing-resource overhead.
Conversely, compared with the array-algorithm reinforcement learning control method, the intelligent control method using three intelligent algorithms defines states and behaviors in a way that addresses the hot-spot problem more directly and effectively, and its non-discretized state definition and approximation of the Q function strengthen its universality.
Drawings
FIG. 1 is a diagram of the active ventilation floor design and deployment. In the figure, reference numeral 1 is a temperature sensor, 2 is a rack, 3 is a microcontroller, 4 is a drive board, 5 is a switching power supply, 6 is a PC, and 7 is an active ventilation floor.
Detailed Description
The embodiments of the present invention will be described in detail below with reference to the drawings and examples.
Fig. 1 is a schematic diagram of a detailed deployment of the invention: a number of first temperature sensors 1 are evenly distributed at the air inlet of a rack 2 to monitor the inlet temperature distribution of the rack 2, and a second temperature sensor is additionally installed below the active ventilation floor to monitor the supply-air temperature under the floor.
In the field, a rack 2 is a rectangular metal cabinet containing a number of servers, and multiple racks are placed in rows. Within a row, the left and right panels of a rack usually abut neighboring racks; the front panel is the air inlet that draws in cold air to cool the servers, and the rear panel is the air outlet that exhausts the heated air. Monitoring the rack-inlet temperature distribution means monitoring the temperature at certain positions on the front panel; these readings together form the inlet temperature distribution, so the number of first temperature sensors 1 depends on the number of monitored positions.
The active ventilation floor reinforcement learning control method runs on a PC: the PC 6 is connected to a microcontroller 3, the microcontroller 3 to a drive board 4, and the drive board 4, powered by a switching power supply 5 (12 V, 20 A), to the active ventilation floor fan 7. From the temperature distribution returned by the first temperature sensors 1, the PC generates a PWM-signal duty-cycle value and transmits it to the microcontroller 3; the microcontroller 3 generates the corresponding PWM signal and passes it to the drive board 4; the drive board 4 regulates, according to the PWM signal, the voltage that the switching power supply 5 delivers to the fan 7, so that controlling the fan's supply voltage adjusts its rotational speed.
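To make the control chain concrete, here is a minimal Python sketch of the PC side of this link. It is illustrative only: the patent does not specify a wire protocol, so the `FloorLink` class, the `DC:<value>` command string, and the comma-separated temperature reply are all invented assumptions (in practice the transport would be a serial port, e.g. via pyserial):

```python
class FloorLink:
    """PC side of the chain PC -> microcontroller -> drive board -> fan (hypothetical protocol)."""

    def __init__(self, transport):
        # `transport` is any object with write()/readline(), e.g. a pyserial Serial port.
        self.transport = transport

    def send_duty_cycle(self, duty_percent: float) -> None:
        # Send the PWM duty-cycle value; the microcontroller turns it into a PWM signal.
        self.transport.write(f"DC:{duty_percent:.1f}\n".encode())

    @staticmethod
    def parse_temperatures(reply: bytes) -> list[float]:
        # Parse a hypothetical comma-separated sensor reply, e.g. b"23.1,24.0,26.5".
        return [float(x) for x in reply.decode().strip().split(",")]
```

In deployment, `transport` would be an opened serial connection to the microcontroller 3; the command and reply formats here are stand-ins, not the patent's.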
The control method comprises the following steps:
1. A Markov decision process model is established for the rack hot-spot problem of the raised-floor structure (a data center air-supply structure in which the machine-room floor is elevated, leaving a 60-100 cm underfloor space through which the machine-room air conditioners deliver cold air; most domestic data centers currently use this structure). The model consists of the following four parts:
A. The system state s_t is defined as the discretized duty cycle of the PWM-signal square wave:

s_t ∈ 𝒮,  𝒮 = { s | s = k · D_TQ, k = 0, 1, …, max(DC)/D_TQ }

where s_t is the system state at time t, 𝒮 is the state space, s is an element of 𝒮, DC is the PWM square-wave duty-cycle value, max(DC) is the maximum of DC, D_TQ is the discretization step of DC, and k is the number of D_TQ steps making up a given state.
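As a concrete illustration of this discretization, consider the following sketch. The numeric values are assumptions (the patent does not fix max(DC) or D_TQ; 100% and 10% are chosen only for illustration):

```python
MAX_DC = 100.0   # assumed maximum PWM duty cycle max(DC), in percent
D_TQ = 10.0      # assumed discretization step of the duty cycle

# State space S: every multiple of D_TQ from 0 up to max(DC), i.e. s = k * D_TQ
STATES = [k * D_TQ for k in range(int(MAX_DC / D_TQ) + 1)]

def to_state(dc: float) -> float:
    """Map a raw duty-cycle value to the nearest discretized state s_t."""
    k = round(dc / D_TQ)
    return min(max(k, 0), int(MAX_DC / D_TQ)) * D_TQ
```

With these assumed values the state space has 11 elements: 0%, 10%, ..., 100%.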
B. The system behavior space 𝒜 is defined as the change in rotational speed of the active ventilation floor fan, i.e., 𝒜 = { +D_TQ, 0, −D_TQ }: each behavior raises, maintains, or lowers the PWM duty cycle (and hence the fan speed) by one quantization step.
C. The reward R_{t+1} combines a quantitative index of the uniformity of the rack-inlet temperature distribution with the energy consumption of the active ventilation floor fan:

R_{t+1} = − (1/|𝒩|) · Σ_{i∈𝒩} (T_{t,i} − T_t^ref)² − (A_ref · DC_t)³,  with T_t^ref = T_{t,under} + Δ_T

where R_{t+1} is the reward obtained after the system takes an action at time t. The first term quantifies the uniformity of the rack-inlet temperature distribution; its value is non-positive, and the closer it is to 0, the more uniform the distribution. T_{t,i} is the reading at time t of the first temperature sensor numbered i; T_t^ref is the rack reference temperature at time t; T_{t,under} is the reading of the second temperature sensor at time t; Δ_T is a fixed positive temperature offset set according to the degree of mixing of cold and hot air above the active ventilation floor; 𝒩 is the set of first temperature sensors and |𝒩| their total number. The second term, −(A_ref · DC_t)³, represents the fan energy consumption of the active ventilation floor; it is likewise non-positive, and the closer to 0, the lower the consumption. A_ref is a reference behavior value that keeps this term on the same order of magnitude as the uniformity term, and DC_t is the PWM square-wave duty cycle at time t.
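A minimal sketch of this reward, under stated assumptions: the uniformity term is read here as the mean squared deviation of the inlet readings from the reference temperature (the original image formula is not reproduced in the text), and the values of `A_REF` and `DELTA_T` are invented for illustration:

```python
A_REF = 0.02      # assumed reference scaling value A_ref for the energy term
DELTA_T = 2.0     # assumed fixed offset above the under-floor supply temperature

def reward(inlet_temps: list[float], t_under: float, dc_t: float) -> float:
    """R_{t+1}: uniformity term (non-positive) plus fan-energy term (non-positive)."""
    t_ref = t_under + DELTA_T                     # rack reference temperature T_t^ref
    uniformity = -sum((t - t_ref) ** 2 for t in inlet_temps) / len(inlet_temps)
    energy = -(A_REF * dc_t) ** 3                 # -(A_ref * DC_t)^3
    return uniformity + energy
```

Both terms are at most 0, so a reward near 0 means a uniform inlet distribution reached at low fan energy, which matches the evaluation criteria stated above.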
D. The cost function Q(s_t, a_t) is the behavior value function:

Q(s_t, a_t) = 𝔼[ Σ_{y=0}^{∞} γ^y · R_{t+y+1} | s_t, a_t ]

where the cost function Q(s, a) is referred to as the Q function; a_t ∈ 𝒜 is the action taken by the system at time t; 𝔼 is the expectation function; y is the future time relative to time t; R_{t+y+1} is the reward obtained after the system takes an action at time t + y; γ is the attenuation factor, 0 ≤ γ < 1, expressing how much the model values future rewards (environmental influence); and γ^y, the y-th power of γ, is the attenuation factor applied to R_{t+y+1}.
E. The Markov decision process model can be summarized as maximizing the cumulative reward by selecting the optimal behavior in the system state at every time t, with the model formula:

max 𝔼[ Σ_{t=0}^{∞} γ^t · R_{t+1} ]  subject to  s_t ∈ 𝒮, a_t ∈ 𝒜

where γ^t is the attenuation factor applied to the system's R_{t+1} at time t.
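The objective above, the expected discounted cumulative reward, can be illustrated for a finite reward trajectory (a sketch; γ = 0.9 is an assumed value, not specified in the patent):

```python
GAMMA = 0.9   # assumed attenuation (discount) factor, 0 <= gamma < 1

def discounted_return(rewards: list[float]) -> float:
    """Sum over t of gamma^t * R_{t+1} for a finite trajectory of rewards."""
    return sum(GAMMA ** t * r for t, r in enumerate(rewards))
```

Because γ < 1, rewards further in the future contribute less, which is what lets the infinite sum in the model remain bounded.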
2. Model solution and solving algorithm
a. Model solution: an optimal Q function is computed; in the system state at any time t, the optimal behavior is then selected according to the optimal Q function, maximizing the cumulative reward. The optimal Q function satisfies:

Q*(s_t, a_t) = 𝔼[ R_{t+1} + γ · max_{a∈𝒜} Q*(s_{t+1}, a) ]

At any time t, the optimal behavior is selected by:

a_t = argmax_{a∈𝒜} Q*(s_t, a)

where Q*(s_t, a_t) denotes the optimal Q function; s_{t+1} is the system state at time t + 1; a is any one of all the actions the system may take at time t + 1, i.e., any behavior in the action space 𝒜; and max_{a∈𝒜} Q*(s_{t+1}, a) is the largest optimal-Q value obtainable in state s_{t+1} over all actions a ∈ 𝒜.
b. Solving algorithm: compute the optimal Q function and select the optimal behavior at each decision so as to maximize the cumulative reward. The reinforcement learning model solving algorithm is an array (tabular) algorithm: the Q function is stored in a two-dimensional array whose row index is the state and whose column index is the behavior. The difference δ_{t+1} between the Q sample value Q_{t+1,target} and the Q query value Q_t(s_t, a_t) is used to iteratively update the Q values in the array and compute the optimal Q function; the optimal behavior is then selected by querying the array, maximizing the model's cumulative reward. The Q sample value is computed, following the optimal-Q formula, from the R_{t+1} and s_{t+1} obtained from the running system in real time; the Q query value is the value found in the two-dimensional array at the row and column given by the s_t and a_t observed in real time.

The Q sample value is calculated as:

Q_{t+1,target} = R_{t+1} + γ · max_{a∈𝒜} Q_t(s_{t+1}, a)

where max_{a∈𝒜} Q_t(s_{t+1}, a) is the largest Q query value in row s_{t+1} of the two-dimensional array at time t. The array is updated as:

Q_{t+1}(s_t, a_t) = Q_t(s_t, a_t) + β(s_t, a_t) · δ_{t+1},  δ_{t+1} = Q_{t+1,target} − Q_t(s_t, a_t)

where Q_t(s_t, a_t) is the query value for s_t and a_t in the array at time t; Q_{t+1}(s_t, a_t) is that entry at time t + 1; and β(s_t, a_t) ∈ [0, 1] is the learning step size for each state-behavior pair in the array.
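The array update above is the classic tabular Q-learning rule. A minimal sketch follows; the table sizes, integer state/action encodings, and the γ and β values are illustrative assumptions:

```python
GAMMA = 0.9                                   # assumed attenuation factor

N_STATES, N_ACTIONS = 11, 3                   # e.g. 11 duty-cycle states x {lower, keep, raise}
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]      # two-dimensional array: row=state, col=behavior
beta = [[0.5] * N_ACTIONS for _ in range(N_STATES)]   # learning step size per state-behavior pair

def update(s: int, a: int, r: float, s_next: int) -> float:
    """One iteration: Q sample = r + gamma * max of row s_next; delta = sample - query."""
    q_target = r + GAMMA * max(Q[s_next])     # Q_{t+1,target}
    delta = q_target - Q[s][a]                # delta_{t+1}
    Q[s][a] += beta[s][a] * delta             # Q_{t+1}(s_t, a_t)
    return delta
```

Each call performs exactly one "query, sample, difference, update" step of the array algorithm described above.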
3. The model is solved with the reinforcement learning model solving algorithm: using whether the rack-inlet temperature distribution is uniform and whether the active-ventilation-floor energy consumption is low as evaluation criteria, the algorithm continuously explores and learns the complex relationship between the PWM duty cycle and raising, lowering, or maintaining it, adjusting the fan speed of the active ventilation floor so that the rack-inlet temperature distribution becomes uniform and the rack hot-spot problem is alleviated. The PC-side operating logic is:

1: set the reference temperature T_t^ref; initialize β(s_t, a_t); initialize the array;
2: set the initial time t = 0; set the exploration warm-up interval random_slots, the initial exploration probability ε, the exploration decay rate δ_ε per step t, and the minimum exploration probability ε_min;
3: select the initial state s_0 = max(DC);
4: loop begin
5: if t < random_slots, select a behavior at random from the behavior space and go to 7; otherwise go to 6;
6: update the exploration probability ε = max(ε − δ_ε, ε_min) and select a behavior by: a_t = argmax_{a∈𝒜} Q_t(s_t, a) with probability 1 − ε, or a random a ∈ 𝒜 with probability ε;
7: execute a_t (the PC sends the duty-cycle command to the microcontroller), obtain the next system state s_{t+1} (the PC sends a temperature-request command to obtain the rack temperature distribution), and compute R_{t+1} from the reward formula;
8: update the corresponding value in the array according to the update formula;
9: increase the time t by 1;
10: loop end.
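Putting steps 1-10 together, the PC-side loop might be simulated as below. This is a sketch under assumptions: the `execute` function is a toy stand-in for step 7 (in deployment it would send the duty-cycle command to the microcontroller and read back the rack temperature distribution), and all numeric parameters are invented:

```python
import random

N_STATES, N_ACTIONS = 11, 3                    # discretized duty cycles x {lower, keep, raise}
GAMMA, BETA = 0.9, 0.5                         # assumed discount factor and step size
EPS, EPS_DECAY, EPS_MIN = 1.0, 0.01, 0.05      # assumed exploration schedule
RANDOM_SLOTS = 50                              # pure-exploration warm-up interval (step 5)

Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

def execute(s: int, a: int) -> tuple[int, float]:
    """Toy stand-in for step 7: apply the action, observe next state and reward."""
    s_next = min(max(s + (a - 1), 0), N_STATES - 1)        # action 0/1/2 -> lower/keep/raise
    return s_next, -abs(s_next - 7) - 0.001 * s_next ** 3  # invented reward, peaks near state 7

s = N_STATES - 1                 # step 3: s_0 = max(DC)
eps = EPS
for t in range(500):             # steps 4-10
    if t < RANDOM_SLOTS:         # step 5: warm-up, pure exploration
        a = random.randrange(N_ACTIONS)
    else:                        # step 6: decaying epsilon-greedy selection
        eps = max(eps - EPS_DECAY, EPS_MIN)
        a = (random.randrange(N_ACTIONS) if random.random() < eps
             else max(range(N_ACTIONS), key=lambda x: Q[s][x]))
    s_next, r = execute(s, a)    # step 7
    Q[s][a] += BETA * (r + GAMMA * max(Q[s_next]) - Q[s][a])   # step 8: array update
    s = s_next                   # step 9
```

The structure mirrors the listing above; only `execute` and the constants would change in a real deployment.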
In summary, the invention establishes a Markov decision process model for the rack hot-spot problem of a raised-floor-structure data center and provides a reinforcement learning model solving algorithm as the core of a reinforcement learning control algorithm. Without increasing the machine-room air-conditioning power, it intelligently controls the fan speed of the active ventilation floor (an ordinary ventilation floor with a fan attached to its underside) according to the current rack temperature distribution, homogenizing the rack-inlet temperature distribution by actively delivering a sufficient amount of cold air. This alleviates the rack hot-spot problem common in raised-floor data centers, saving refrigeration energy and ensuring server safety and stability. Compared with existing rack-level airflow management methods for data centers, it is easier to deploy, more cost-effective, and more universally applicable.

Claims (3)

1. A reinforcement learning control method for a data center active ventilation floor, characterized by comprising the following steps:
step 1, arranging a number of first temperature sensors at the air inlet of a rack to monitor the rack-inlet temperature distribution, and arranging a second temperature sensor under the active ventilation floor to monitor the supply-air temperature under the floor;
step 2, establishing a Markov decision process model for the rack hot-spot problem of the raised-floor-structure data center, the model consisting of four parts: a system state s_t, a behavior space 𝒜, a reward R_{t+1}, and a cost function Q(s_t, a_t);

wherein: s_t is the system state at time t and 𝒮 is the state space, the state being defined as the discretized duty cycle of the PWM-signal square wave:

s_t ∈ 𝒮,  𝒮 = { s | s = k · D_TQ, k = 0, 1, …, max(DC)/D_TQ }

where s is an element of 𝒮, DC is the PWM square-wave duty-cycle value, max(DC) is the maximum of DC, D_TQ is the discretization step of DC, and k is the number of D_TQ steps making up a given state; the PWM square wave is generated as follows: a PWM-signal duty-cycle value is generated according to the temperature distribution returned by the first temperature sensors and transmitted to the microcontroller, and the microcontroller generates the corresponding PWM signal from that duty-cycle value;

the behavior space 𝒜 is defined as the change in rotational speed of the active ventilation floor fan, 𝒜 = { +D_TQ, 0, −D_TQ }, i.e., raising, maintaining, or lowering the duty cycle by one quantization step;
the reward R_{t+1} combines a quantitative index of the uniformity of the rack-inlet temperature distribution with the energy consumption of the active ventilation floor fan, calculated as:

R_{t+1} = − (1/|𝒩|) · Σ_{i∈𝒩} (T_{t,i} − T_t^ref)² − (A_ref · DC_t)³,  with T_t^ref = T_{t,under} + Δ_T

where R_{t+1} is the reward obtained after the system takes an action at time t; the first term quantifies the uniformity of the rack-inlet temperature distribution, its value being non-positive, with values closer to 0 indicating a more uniform distribution; T_{t,i} is the reading at time t of the first temperature sensor numbered i; T_t^ref is the rack reference temperature at time t; T_{t,under} is the reading of the second temperature sensor at time t; Δ_T is a fixed positive temperature offset set according to the degree of mixing of cold and hot air above the active ventilation floor; 𝒩 is the set of first temperature sensors and |𝒩| their total number; the second term, −(A_ref · DC_t)³, represents the fan energy consumption of the active ventilation floor, likewise non-positive, with values closer to 0 indicating lower consumption, where A_ref is a reference behavior value keeping this term on the same order of magnitude as the uniformity term, and DC_t is the PWM square-wave duty cycle at time t;
the cost function Q(s_t, a_t) is the behavior value function:

Q(s_t, a_t) = 𝔼[ Σ_{y=0}^{∞} γ^y · R_{t+y+1} | s_t, a_t ]

where the cost function Q(s, a) is referred to as the Q function; a_t ∈ 𝒜 is the action taken by the system at time t; 𝔼 is the expectation function; y is the future time relative to time t; R_{t+y+1} is the reward obtained after the system takes an action at time t + y; γ is the attenuation factor, 0 ≤ γ < 1; and γ^y, the y-th power of γ, is the attenuation factor applied to R_{t+y+1};
the Markov decision process model is summarized as: in the system state at any time t, the cumulative reward is maximized by selecting the optimal behavior, with the model formula:

max 𝔼[ Σ_{t=0}^{∞} γ^t · R_{t+1} ]  subject to  s_t ∈ 𝒮, a_t ∈ 𝒜

where γ^t is the attenuation factor applied to the system's R_{t+1} at time t;
and step 3, solving the model with the reinforcement learning model solving algorithm: using whether the rack-inlet temperature distribution is uniform and whether the active-ventilation-floor energy consumption is low as evaluation criteria, continuously exploring and learning the complex relationship between the PWM duty cycle and raising, lowering, or maintaining it, and adjusting the fan speed of the active ventilation floor so that the rack-inlet temperature distribution becomes uniform and the rack hot-spot problem is alleviated.
2. The reinforcement learning control method for the data center active ventilation floor according to claim 1, wherein in step 2 an optimal Q function is computed, i.e., in the system state at any time t the optimal behavior can be selected according to the optimal Q function so that the cumulative reward is maximized; the optimal Q function satisfies:

Q*(s_t, a_t) = 𝔼[ R_{t+1} + γ · max_{a∈𝒜} Q*(s_{t+1}, a) ]

and at any time t the optimal behavior is selected by:

a_t = argmax_{a∈𝒜} Q*(s_t, a)

where Q*(s_t, a_t) denotes the optimal Q function; s_{t+1} is the system state at time t + 1; a is any one of all the actions the system may take at time t + 1, i.e., any behavior in the action space 𝒜; and max_{a∈𝒜} Q*(s_{t+1}, a) is the largest optimal-Q value obtainable in state s_{t+1} over all actions a ∈ 𝒜.
3. The reinforcement learning control method for the data center active ventilation floor according to claim 1, wherein in step 3 the reinforcement learning model solving algorithm is an array algorithm: the Q function is stored in a two-dimensional array whose row index is the state and whose column index is the behavior; the difference δ_{t+1} between the Q sample value Q_{t+1,target} and the Q query value Q_t(s_t, a_t) is used to iteratively update the Q values in the array and compute the optimal Q function, and the optimal behavior is then selected by querying the array, maximizing the model's cumulative reward; the Q sample value is computed, following the optimal-Q formula, from the R_{t+1} and s_{t+1} obtained from the running system in real time, and the Q query value is the array entry at the row and column given by the s_t and a_t observed in real time;

the Q sample value is calculated as:

Q_{t+1,target} = R_{t+1} + γ · max_{a∈𝒜} Q_t(s_{t+1}, a)

where max_{a∈𝒜} Q_t(s_{t+1}, a) is the largest Q query value in row s_{t+1} of the two-dimensional array at time t; the array is updated as:

Q_{t+1}(s_t, a_t) = Q_t(s_t, a_t) + β(s_t, a_t) · δ_{t+1}

where Q_t(s_t, a_t) is the query value for s_t and a_t in the array at time t; Q_{t+1}(s_t, a_t) is that entry at time t + 1; and β(s_t, a_t) ∈ [0, 1] is the learning step size for each state-behavior pair in the array.
CN202010456237.6A (priority and filing date 2020-05-26) Reinforcement learning control method for data center active ventilation floor, Active, CN111601490B (en)


Publications (2)

Publication Number | Publication Date
CN111601490A (en) | 2020-08-28
CN111601490B (en) | 2022-08-02

Family

ID=72186518

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010456237.6A Active CN111601490B (en) 2020-05-26 2020-05-26 Reinforcement learning control method for data center active ventilation floor

Country Status (1)

Country Link
CN (1) CN111601490B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114020079B (en) * 2021-11-03 2022-09-16 北京邮电大学 Indoor space temperature and humidity regulation and control method and device

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05159075A (en) * 1991-12-03 1993-06-25 Nippon Telegr & Teleph Corp <Ntt> Interpolation method based on continuous-valued Markov probability field
CN103473613A (en) * 2013-09-09 2013-12-25 武汉理工大学 Landscape structure-surface temperature-electricity consumption coupling model and application thereof
JP2015082224A (en) * 2013-10-23 2015-04-27 日本電信電話株式会社 Stochastic server load amount estimation method and server load amount estimation device
CN106528941A (en) * 2016-10-13 2017-03-22 内蒙古工业大学 Data center energy consumption optimization resource control algorithm under server average temperature constraint
CN108446783A (en) * 2018-01-29 2018-08-24 杭州电子科技大学 A kind of prediction of new fan operation power and monitoring method
WO2019154739A1 (en) * 2018-02-07 2019-08-15 Abb Schweiz Ag Method and system for controlling power consumption of a data center based on load allocation and temperature measurements
CN110322977A (en) * 2019-07-10 2019-10-11 河北工业大学 Reliability analysis method for a nuclear power reactor core water level monitoring system
CN111144793A (en) * 2020-01-03 2020-05-12 南京邮电大学 Commercial building HVAC control method based on multi-agent deep reinforcement learning

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8478451B2 (en) * 2009-12-14 2013-07-02 Intel Corporation Method and apparatus for dynamically allocating power in a data center
US20130226501A1 (en) * 2012-02-23 2013-08-29 Infosys Limited Systems and methods for predicting abnormal temperature of a server room using hidden markov model
US20140324240A1 (en) * 2012-12-14 2014-10-30 Alcatel-Lucent Usa Inc. Method And System For Disaggregating Thermostatically Controlled Appliance Energy Usage From Other Energy Usage
CN109983481B (en) * 2016-09-26 2023-08-15 D-波系统公司 System, method and apparatus for sampling from a sampling server
US20180100662A1 (en) * 2016-10-11 2018-04-12 Mitsubishi Electric Research Laboratories, Inc. Method for Data-Driven Learning-based Control of HVAC Systems using High-Dimensional Sensory Observations


Also Published As

Publication number Publication date
CN111601490A (en) 2020-08-28

Similar Documents

Publication Publication Date Title
CN106949598B Energy-saving optimization method for network center machine rooms when network traffic load changes
US6832490B2 (en) Cooling of data centers
US20120136487A1 (en) Modulized heat-dissipation control method for datacenter
CN104728997A (en) Air conditioner for constant temperature control, constant temperature control system and constant temperature control method
CN111601490B (en) Reinforced learning control method for data center active ventilation floor
CN103615782A (en) Refrigerating unit cluster controlling method and device
CN107917509A Control method for a computer-room air conditioning system
Wan et al. Intelligent rack-level cooling management in data centers with active ventilation tiles: A deep reinforcement learning approach
CN111836524A Method for regulating the variable air volume of in-row precision air conditioners in a data center based on IT load changes
CN115793751A Thermal management method and device for battery cabinet, battery cabinet, and readable storage medium
Hamann et al. Methods and techniques for measuring and improving data center best practices
CN111637614B (en) Intelligent control method for data center active ventilation floor
CN106642583B Intelligent jet cooling system for data centers and control method thereof
CN117395942A Automatic cooling-capacity scheduling system based on an intelligent computing center
CN116390455A Modular data center machine room with cabinets in fish-scale arrangement and control method
CN110008515B (en) Renewable energy data center management method and device
CN116954329A (en) Method, device, equipment, medium and program product for regulating state of refrigeration system
KR102314866B1 (en) System for controlling temperature of computer room
CN114126369A (en) Heat dissipation control method of photovoltaic inverter
Yu et al. Hierarchical fuzzy rule-based control of renewable energy building systems
US20240194968A1 (en) Active fan balancing
Wu et al. Data center job scheduling algorithm based on temperature prediction
Lin et al. Thermal Modeling and Thermal-aware Energy Saving Methods for Cloud Data Centers: A Review
CN117537451B (en) Method and system for controlling low power consumption of intelligent thermoelectric air conditioning equipment
CN115349448B (en) Temperature control system and control method for cultivation house

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant