WO2023199482A1 - Electric power amount reduction control device, electric power amount reduction control method, electric power amount reduction control system, and program - Google Patents

Electric power amount reduction control device, electric power amount reduction control method, electric power amount reduction control system, and program

Info

Publication number
WO2023199482A1
Authority
WO
WIPO (PCT)
Prior art keywords
power consumption
server
air conditioning
accelerator
gpu
Prior art date
Application number
PCT/JP2022/017844
Other languages
French (fr)
Japanese (ja)
Inventor
彦俊 中里
誠亮 新井
雅志 金子
Original Assignee
日本電信電話株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電信電話株式会社
Priority to PCT/JP2022/017844
Publication of WO2023199482A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00: Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/16: Constructional details or arrangements
    • G06F1/20: Cooling means
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00: Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26: Power supply means, e.g. regulation thereof
    • G06F1/32: Means for saving power
    • G06F1/3203: Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3206: Monitoring of events, devices or parameters that trigger a change in power modality
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00: Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26: Power supply means, e.g. regulation thereof
    • G06F1/32: Means for saving power
    • G06F1/3203: Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234: Power saving characterised by the action undertaken
    • G06F1/329: Power saving characterised by the action undertaken by task scheduling
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The present invention relates to a power consumption reduction control device, a power consumption reduction control method, a power consumption reduction control system, and a program that reduce power consumption in a data center (hereinafter sometimes referred to as "DC").
  • DC: data center
  • DCs: data centers
  • Air conditioning accounts for a large proportion of the power consumed in data centers (DCs), and as the number and scale of DCs increase, there is a need to reduce the power consumed by air conditioning. Furthermore, the amount of data processed in a DC tends to increase year by year, so it is necessary to improve the power consumption efficiency of the entire DC (the amount of power consumed by the entire DC to process a given amount of data).
  • Non-Patent Document 1 discloses a technique for optimizing the overall power consumption of a DC by considering both the power consumption of air conditioning and the power consumption of servers (IT devices).
  • The air-conditioning-linked IT load placement optimization method for data centers described in Non-Patent Document 1 collects operating information and monitoring information from the IT equipment in the data center, predicts future changes in the loads on the IT equipment, and calculates the power increase for the air conditioning equipment that corresponds to the resulting power increase. It then solves an optimization problem that minimizes the objective function, which is the power amount of the data center, such that the load aggregation rate on the IT equipment increases over time, that is, such that the number of operating IT devices is reduced. This yields the placement of IT loads (virtual machines) on IT equipment that minimizes the power consumption of the data center.
  • However, in Non-Patent Document 1, the air conditioning power model used to calculate the power of the air conditioning equipment adopts a general-purpose, rule-based standard that does not depend on the equipment conditions that differ from DC to DC. It was therefore difficult to optimize the total amount of power consumed by the DC while taking into account individual equipment conditions such as the location of air conditioning equipment, airflow, server arrangement within the DC, and thermal cooling efficiency.
  • the accelerator is, for example, an FPGA (Field Programmable Gate Array), an ASIC (Application Specific Integrated Circuit), or a TPU (Tensor Processing Unit).
  • The CPU, GPU, and accelerator differ in the amount of power consumption required for server cooling during load processing. Specifically, it is assumed that CPUs show almost no fluctuation in power consumption within the normal temperature range, whereas GPU servers and accelerators experience fluctuations in power consumption even within the normal temperature range.
  • The inventors of this application set the inlet temperature of the GPU server to 20 degrees Celsius and to 33 degrees Celsius, both of which are within the normal temperature range, and measured the temperature of the GPU card in the GPU server (GPU temperature), the amount of power consumption, and the fan rotation rate.
  • Comparing inlet temperatures of 20 degrees Celsius and 33 degrees Celsius, the GPU temperature is approximately 10 degrees higher at 33 degrees Celsius, as shown in FIG.
  • the power consumption of the GPU card also increases by approximately 20 W at 33 degrees Celsius.
  • the fan rotation rate of the GPU server also increases by approximately 10% at 33° C., as shown in FIG. Note that the horizontal axis in FIGS. 1 to 3 represents time [minutes: seconds] from the start of measurement.
  • The present invention was made in view of these points, and its object is to reduce the total power consumption, consisting of server power consumption and air conditioning power consumption, in an environment where CPU servers, GPU servers, accelerators, and the like coexist.
  • A power consumption reduction control device according to the present invention controls a CPU server, a GPU server, an accelerator, and a plurality of air conditioners. A plurality of placement control areas, in each of which any one of the CPU server, the GPU server, and the accelerator is arranged, and an air conditioning control area, which is an area for measuring the effect of air conditioning control by the plurality of air conditioners, are set. The power consumption reduction control device includes the following units.
  • An air conditioning control value generation unit that generates an air conditioning control value including at least a target temperature.
  • An air conditioning control execution unit that executes control of the plurality of air conditioners using the air conditioning control value.
  • A reward calculation unit that, in a plurality of placement patterns in which processing loads are placed on the GPU server and the accelerator, calculates a reward for evaluating the result of the air conditioning control execution unit controlling the plurality of air conditioners using the air conditioning control value, with the target temperature as an index, and determines whether the reward satisfies a predetermined condition.
  • An operation history creation unit that acquires temperature distribution information and the air conditioning power consumption of the plurality of air conditioners as the control result based on an air conditioning control value determined to satisfy the predetermined condition, and creates operation history information associated with the predicted heat generation amount of each placement control area in each of the plurality of placement patterns.
  • A placement pattern calculation unit that calculates, using processing load information, a plurality of placement patterns in which new processing loads are placed.
  • An area heat generation estimation unit that, for each of the calculated placement patterns, estimates the predicted heat generation amount of each placement control area by summing the heat generation amounts when processing loads are placed on the CPU servers, GPU servers, and accelerators belonging to that placement control area.
  • An operation history information extraction unit that, based on the estimated predicted heat generation amount of each placement control area, refers to the operation history information and extracts the temperature distribution information and the air conditioning power consumption obtained when control is performed using the air conditioning control value in each placement pattern.
  • A server power consumption prediction unit that, using the extracted temperature distribution information and information regarding the new processing load, calculates the server power consumption of each placement control area in each placement pattern by totaling the power consumption of each CPU server, each GPU server, and each accelerator.
  • A placement pattern determination unit that calculates, for each placement pattern, the total of the server power consumption of each placement control area and the air conditioning power consumption, and determines the placement pattern that minimizes the calculated total as the placement pattern in which the processing load is placed.
  • the total power consumption consisting of server power consumption and air conditioning power consumption can be reduced.
  • FIG. 1 is a diagram for comparing GPU temperatures when the inlet temperature is 20° C. and 33° C.
  • FIG. 2 is a diagram for comparing the power consumption of the GPU card when the inlet temperature is 20° C. and 33° C.
  • FIG. 3 is a diagram for comparing the fan rotation rate of the GPU server when the inlet temperature is 20° C. and 33° C.
  • FIG. 4 is a diagram showing the overall configuration of a power amount reduction control system including a power amount reduction control device according to the present embodiment.
  • FIG. 5 is a functional block diagram showing a configuration example of a power amount reduction control device according to the present embodiment.
  • FIG. 6 is a diagram for explaining situation classification according to the present embodiment.
  • FIG. 7 is a diagram for explaining temperature distribution information according to the present embodiment.
  • FIG. 2 is a hardware configuration diagram showing an example of a computer that implements the functions of the power reduction control device according to the present embodiment.
  • FIG. 4 is a diagram showing the overall configuration of the power amount reduction control system 1 including the power amount reduction control device 100 according to the present embodiment.
  • The power consumption reduction control system 1 includes a data center (DC 10), which has a plurality of CPU servers 3, GPU servers 4, accelerators 5, and a plurality of air conditioners 2, and a power consumption reduction control device 100.
  • The plurality of CPU servers 3, GPU servers 4, and accelerators 5 accommodated in a predetermined control area (placement control area) may hereinafter be referred to as a "server group" or collectively as "servers." In FIG. 4 and in FIG. 7, which will be described later, the CPU server 3 is represented by an unfilled hexagon, the GPU server 4 by a hexagon with a plurality of diagonal lines, and the accelerator 5 by a hexagon filled with dots.
  • This server group includes at least one of the CPU server 3, the GPU server 4, and the accelerator 5. For example, it also covers the case where there is no accelerator 5 and the server group consists of CPU servers 3 and GPU servers 4.
  • the power amount reduction control device 100 may be provided inside the DC 10 or may be provided in a location different from the DC 10 and may control a plurality of DCs 10.
  • This power consumption reduction control device 100 may acquire status information from, and transmit air conditioning control information to, the air conditioners 2 installed in the DC 10 (air conditioners "1", "2", and "3" in FIG. 4) via an air conditioning management device (not shown), or it may be directly communicatively connected to each air conditioner 2 without going through an air conditioning management device.
  • Likewise, the power consumption reduction control device 100 may acquire status information from, and transmit control information to, the CPU servers 3, GPU servers 4, and accelerators 5 accommodated as server groups in the DC 10 via a server management device (not shown), or it may be directly communicatively connected to each CPU server 3, GPU server 4, and accelerator 5.
  • The CPU server 3, the GPU server 4, and the accelerator 5 are arranged as shown in FIG.
  • The space in which they are arranged is divided into areas, and each area is controlled as a "placement control area."
  • This placement control area 30 is an area that accommodates a group of servers on which processing loads (virtual resources, processing of GPUs, accelerators, etc.) are placed.
  • FIG. 4 shows an example in which placement control areas "1" to "6" are provided.
  • a virtualization infrastructure is constructed in the CPU server 3, and the description will be made assuming that it is operated using containers and VMs.
  • OpenStack (registered trademark) is primarily used for managing and operating physical machines and virtual machines (VMs).
  • Kubernetes (registered trademark) is mainly used for managing and operating containers.
  • In this specification, an application (consisting of one or more containers, one or more VMs, etc.) that is virtualized on a virtualization platform is referred to as a virtual resource.
  • the minimum execution unit of an application is a Pod, which is made up of one or more containers.
  • An "air conditioning control area" is provided as shown in FIG. 4 in association with the placement control area 30 of the server group.
  • The air conditioning control area 20 is a grouped area in which the effect of air conditioning control on room temperature is measured, and it is assumed to face either the suction port side or the discharge port side of each server (CPU server 3, GPU server 4, accelerator 5).
  • The air blown from the air conditioner 2 is sent to the air conditioning control areas 20 on the suction side (in FIG. 4, air conditioning control areas "3", "4", "7", and "8") via, for example, piping installed under the floor of the DC 10.
  • a plurality of sensors are installed in each of the air conditioning control areas 20. Furthermore, temperature sensors are also installed at the suction ports of the GPU servers 4 and accelerators 5 in each placement control area 30. Furthermore, a sensor (temperature sensor, etc.) is installed outside the DC 10 as well. Information obtained from these sensors (sensor information) can be acquired by the power consumption reduction control device 100 via a communication line or the like.
  • The power consumption reduction control device 100 predicts the amount of heat generated for each placement control area 30 ("predicted heat generation amount of placement control area," described later) in each placement pattern in which loads are placed on the server resources (in this embodiment, CPU server 3/GPU server 4/accelerator 5). Note that "/" here means "and/or".
  • The power consumption reduction control device 100 sets air conditioning control values for the air conditioner 2 in multiple stages for each situation ("Situation," described later) defined by the external temperature of the DC 10, the floor temperature, and the predicted amount of heat generation in the placement control areas 30, and holds the temperature distribution information and air conditioning power consumption information obtained when control is performed at each stage.
  • The power consumption reduction control device 100 calculates the server power consumption of each placement control area 30 ("total server power consumption," described later) based on the temperature distribution information and the like obtained when controlling at each stage, and determines the placement pattern that minimizes the total of the server power consumption and the air conditioning power consumption (details will be described later).
  • the power amount reduction control device 100 will be described in detail below.
  • FIG. 5 is a functional block diagram showing a configuration example of the power amount reduction control device 100 according to the present embodiment.
  • The power consumption reduction control device 100 predicts the amount of heat generated (predicted heat generation amount) for each placement control area 30 in each load placement pattern of the server resources (CPU server 3, GPU server 4, accelerator 5), and acquires the temperature distribution information 64 and air conditioning power consumption information 65 obtained when the air conditioners 2 are controlled in the corresponding situation (Situation). Then, the power consumption reduction control device 100 calculates the server power consumption of each CPU server 3, GPU server 4, and accelerator 5 using the learning models, and sums these values to obtain the total power consumption of the CPU servers 3, of the GPU servers 4, and of the accelerators 5.
  • The power consumption reduction control device 100 then calculates the total of the total server power consumption (described later), which is the sum of the total power consumption of the CPU servers 3, GPU servers 4, and accelerators 5 in each placement control area 30, and the air conditioning power consumption. It determines the placement pattern that minimizes this total, and executes load placement and air conditioning control based on that placement pattern.
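The selection step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the device's implementation: the `PatternEstimate` class, its field names, and the numeric values are hypothetical, and the estimates are assumed to have already been produced by the prediction steps.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PatternEstimate:
    """Hypothetical container for the estimates of one placement pattern."""
    name: str
    # Placement control area id -> predicted total server power consumption.
    server_power_by_area: Dict[int, float]
    # Air conditioning power consumption taken from the operation history.
    air_conditioning_power: float

    def total_power(self) -> float:
        # Total = sum of server power over all areas + air conditioning power.
        return sum(self.server_power_by_area.values()) + self.air_conditioning_power


def choose_placement_pattern(estimates: List[PatternEstimate]) -> PatternEstimate:
    """Return the pattern whose total power consumption is smallest."""
    if not estimates:
        raise ValueError("at least one placement pattern is required")
    return min(estimates, key=lambda e: e.total_power())


# Illustrative values only; units are arbitrary but must be consistent.
patterns = [
    PatternEstimate("pattern-A", {1: 120.0, 2: 80.0}, 60.0),
    PatternEstimate("pattern-B", {1: 100.0, 2: 90.0}, 55.0),
]
best = choose_placement_pattern(patterns)
```

Because the comparison is made on the combined total, a pattern with higher server power can still be chosen if it lowers air conditioning power by a larger amount.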
  • This power consumption reduction control device 100 is constituted by a computer including a control section, an input/output section, and a storage section (all not shown).
  • The input/output unit inputs and outputs information to and from each device in the DC 10 (each air conditioner 2 and each server (CPU server 3, GPU server 4, accelerator 5)) and the like.
  • This input/output unit is composed of a communication interface that sends and receives information via a communication line, and an input/output interface that inputs and outputs information between an input device such as a keyboard (not shown) and an output device such as a monitor (not shown).
  • the storage unit includes a hard disk, flash memory, RAM (Random Access Memory), and the like.
  • This storage section temporarily stores programs for executing each function of the control section and information necessary for processing of the control section.
  • This storage unit also stores operation history information 201, which holds, for each Situation in each placement pattern, the control values for the air conditioner 2 (air conditioning control value information 63), the temperature distribution information 64 obtained as the control result, the air conditioning power consumption information 65, and the like.
  • The storage unit further stores basic power consumption information 301 for calculating the predicted heat generation amount of each server (CPU server 3, GPU server 4, accelerator 5), a CPU server power amount learning model 302 for predicting the power consumption of the CPU server 3, a GPU server power amount learning model 303 for predicting the power consumption of the GPU server 4, and an accelerator power amount learning model 304 for predicting the power consumption of the accelerator 5 (details will be described later).
  • the control unit is in charge of overall processing executed by the power consumption reduction control device 100, and is configured to include an air conditioning control unit 200 and a server control unit 300, as shown in FIG.
  • The air conditioning control unit 200 uses the average temperature of the floor in the DC 10 before control (floor average temperature), the outside temperature (outside air temperature), and the predicted amount of heat generation for each placement control area 30 as the components of a Situation. For each Situation, it executes each air conditioning control, acquires the temperature distribution information 64 in each control turn, and calculates the air conditioning power consumption information 65, thereby generating the operation history information 201.
  • the phase in which this operation history information 201 is generated is referred to as a learning phase.
  • When the air conditioning control unit 200 acquires information on the predicted heat generation amount of each placement control area 30 from the server control unit 300 during the operation phase, in which load placement and air conditioning control are actually executed, it selects the corresponding Situation (Situation classification 62) and outputs the temperature distribution information 64 and air conditioning power consumption information 65 to the server control unit 300. Then, the air conditioning control unit 200 causes each air conditioner 2 to perform air conditioning control in the placement pattern determined by the server control unit 300.
  • the air conditioning control unit 200 includes a situation recognition unit 210, an operation history information generation unit 220, an operation history information extraction unit 230, and an air conditioning control execution unit 240.
  • the situation recognition unit 210 acquires information on external factors that are parameter elements that constitute the Situation. Then, the situation recognition unit 210 divides each external world factor into a plurality of ranges, defines a combination of each range area as one situation, and determines a situation classification 62 indicating which situation it belongs to based on the acquired information on the external world factor. .
  • the situation recognition section 210 includes an external world factor acquisition section 211 and a situation determination section 212.
  • the external world factor acquisition unit 211 acquires information on the measurement results of external world factors.
  • the external factor is an element that affects an increase or decrease in air conditioning power consumption, and means a parameter element that constitutes the Situation classification 62.
  • the external factors are (1) the average floor temperature in the DC 10 before control, (2) the outside temperature (outside air temperature), and (3) the predicted amount of heat generation for each placement control area 30.
  • the external factor acquisition unit 211 calculates the floor average temperature in the DC 10 before control as follows.
  • the external factor acquisition unit 211 calculates the average value of the temperatures acquired from the temperature sensors of the air conditioning control area 20, and calculates the average temperature for each air conditioning control area 20. Then, the external factor acquisition unit 211 averages the calculated average temperature for each air conditioning control area 20 over the entire floor, and sets the obtained temperature as the floor average temperature.
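The two-step averaging described above can be sketched as follows. The function name and the input layout (a mapping from air conditioning control area id to that area's sensor readings) are assumptions made for illustration.

```python
from statistics import mean
from typing import Dict, List


def floor_average_temperature(readings_by_area: Dict[int, List[float]]) -> float:
    """Average the sensors within each area, then average those area means."""
    # Step 1: average temperature of each air conditioning control area.
    area_means = {area: mean(values) for area, values in readings_by_area.items()}
    # Step 2: average of the per-area averages over the entire floor.
    return mean(area_means.values())


# Illustrative readings in degrees Celsius (hypothetical values).
readings = {
    1: [20.0, 22.0],        # area mean 21.0
    2: [24.0, 26.0, 28.0],  # area mean 26.0
}
floor_avg = floor_average_temperature(readings)
```

Note that this is an average of area averages, so each area carries equal weight regardless of how many sensors it contains. In the example, the result differs from the plain mean of all five readings.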
  • the external temperature is information obtained from a temperature sensor installed outside the DC 10.
  • the predicted amount of heat generation for each placement control area 30 is information calculated by the server control unit 300 (details will be described later).
  • When the external world factor acquisition unit 211 acquires information on these external world factors, it outputs the information to the Situation determination unit 212.
  • the Situation determining unit 212 determines to which Situation classification 62 the information acquired by the external world factor acquiring unit 211 belongs. Each external factor is divided into a plurality of ranges between the minimum value and the maximum value depending on the characteristics of the external factor. Then, a combination of ranges obtained by dividing each external factor is defined as one Situation. This will be explained below with reference to FIG.
  • each of the external factors is defined as a "factor”, and a range for division is defined (hereinafter referred to as "division definition").
  • the external factor of "factor 1” shown in the Situation classification 62 in FIG. 6 is "floor average temperature”, and the division definition is "0-48 degrees divided into 6".
  • the external factor of "factor 2" is “external temperature”, and the division definition is "0-48 degrees divided into 6”.
  • the external factor of "factor 3” is "predicted amount of heat generation in placement control area 1", and the division definition is "0-200W divided into 20".
  • the external factor of "factor 8” is “predicted heat generation amount of placement control area 6", and the division definition is "0-200W divided into 20".
  • the external world factor information acquired by the Situation determination unit 212 is the external world factor information 61 shown in FIG.
  • Since the value of "factor 1" (floor average temperature) is "25", the Situation determination unit 212 determines that its "range" is the "24-32 range" (24 degrees or more and less than 32 degrees) and sets the "factor range identifier" to "factor1-4".
  • This "factor range identifier" is information that identifies the range to which a value belongs. For example, with 0-48 degrees divided into 6, it is "factor1-1" for 0 degrees or more and less than 8 degrees, "factor1-2" for 8 degrees or more and less than 16 degrees, "factor1-3" for 16 degrees or more and less than 24 degrees, and so on. The same applies to the other "factors".
  • the Situation determining unit 212 combines the information on the "factor range identifiers" of the external factors to form a "Situation classification" and determines that it is "factor1-4_factor2-4_factor3-4_factor4-4_factor5-5_factor6-5_factor7-4_factor8-4". In this way, the Situation determining unit 212 determines the "Situation classification" based on the acquired information on external factors.
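The classification procedure above can be sketched as follows. The `FactorDefinition` class and function names are hypothetical, the input values are chosen only so that the result reproduces the identifier string quoted above (the actual values in the figure are not given here), and clamping out-of-range values into the first or last range is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FactorDefinition:
    """Division definition of one external factor (hypothetical structure)."""
    minimum: float
    maximum: float
    divisions: int


def factor_range_identifier(index: int, value: float, definition: FactorDefinition) -> str:
    """Return e.g. 'factor1-4' for the 1-based range that contains the value."""
    width = (definition.maximum - definition.minimum) / definition.divisions
    position = int((value - definition.minimum) // width) + 1
    # Assumption: values outside [minimum, maximum) are clamped to the edge ranges.
    position = max(1, min(definition.divisions, position))
    return f"factor{index}-{position}"


def situation_classification(values: List[float], definitions: List[FactorDefinition]) -> str:
    """Join the factor range identifiers of all external factors with '_'."""
    return "_".join(
        factor_range_identifier(i, value, definition)
        for i, (value, definition) in enumerate(zip(values, definitions), start=1)
    )


temperature = FactorDefinition(0.0, 48.0, 6)   # 0-48 degrees divided into 6
heat = FactorDefinition(0.0, 200.0, 20)        # 0-200 W divided into 20
definitions = [temperature, temperature] + [heat] * 6

# Hypothetical measurements: floor average, outside air, heat of areas 1 to 6.
values = [25.0, 30.0, 35.0, 35.0, 45.0, 45.0, 35.0, 35.0]
classification = situation_classification(values, definitions)
```

Using a string key of this form makes it straightforward to look up operation history entries by Situation, for example as dictionary keys.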
  • The operation history information generation unit 220 generates the operation history information 201 based on the temperature distribution information 64 and the air conditioning power consumption information 65 obtained as a result of controlling the air conditioner 2 with air conditioning control values that divide the control of the air conditioner 2 into multiple stages in each Situation.
  • The operation history information generation unit 220 includes an air conditioning control value generation unit 221, a reward calculation unit 222, and an operation history creation unit 223.
  • The air conditioning control value generation unit 221 generates air conditioning control values that divide the control of the air conditioner 2 into multiple stages in each Situation. Specifically, for each control parameter that can be changed in each air conditioner 2 (for example, set temperature (target temperature), air volume, etc.), the air conditioning control value generation unit 221 divides the range between the lower limit value and the upper limit value into M stages. Then, air conditioning control value information 63 for controlling (one) air conditioner 2 is generated by combining the parameters of each stage.
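One way to enumerate such control values is sketched below. The parameter names, limits, and the choice of evenly spaced stages that include both limits are assumptions; the text only states that each parameter is divided into M stages between its lower and upper limits.

```python
from itertools import product
from typing import Dict, List, Tuple


def stage_values(lower: float, upper: float, stages: int) -> List[float]:
    """Return `stages` evenly spaced values from lower to upper, inclusive."""
    if stages < 2:
        raise ValueError("at least two stages are required")
    step = (upper - lower) / (stages - 1)
    return [lower + i * step for i in range(stages)]


def generate_control_values(
    parameter_limits: Dict[str, Tuple[float, float]], stages: int
) -> List[Dict[str, float]]:
    """Combine the stages of every parameter into candidate control values."""
    names = list(parameter_limits)
    grids = [stage_values(*parameter_limits[name], stages) for name in names]
    # Cartesian product: one candidate per combination of parameter stages.
    return [dict(zip(names, combination)) for combination in product(*grids)]


# Hypothetical limits for one air conditioner.
limits = {"target_temperature": (18.0, 28.0), "air_volume": (0.0, 100.0)}
candidates = generate_control_values(limits, stages=3)
```

With P parameters and M stages, this produces M to the power P candidates per air conditioner, so M is best kept small when every candidate must actually be executed during the learning phase.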
  • The air conditioning control execution unit 240 executes air conditioning control in multiple patterns. For example, it may control the air conditioners one at a time in the order "1" → "2" → "3", control them in combination (air conditioners "1" and "2", air conditioners "2" and "3", or air conditioners "1" and "3"), or control air conditioners "1", "2", and "3" simultaneously.
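The control patterns listed above (single units in order, pairs, and all units at once) correspond to the non-empty subsets of the air conditioners. A small sketch, with a hypothetical function name:

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def control_patterns(air_conditioner_ids: Sequence[str]) -> List[Tuple[str, ...]]:
    """List every non-empty group of air conditioners, smallest groups first."""
    patterns: List[Tuple[str, ...]] = []
    for size in range(1, len(air_conditioner_ids) + 1):
        patterns.extend(combinations(air_conditioner_ids, size))
    return patterns


patterns_for_three = control_patterns(["1", "2", "3"])
```

For three air conditioners this yields seven groups: three single units, three pairs, and one group containing all three.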
  • The reward calculation unit 222 calculates a reward (temperature reward) as an index for evaluating the result of executing control using the air conditioning control value generated by the air conditioning control value generation unit 221. Then, the reward calculation unit 222 determines whether the control result satisfies a predetermined reward, that is, whether the air conditioning control value satisfies a predetermined condition.
  • the reward calculation unit 222 defines two types of rewards, a high temperature warning reward and a low temperature warning reward, for each air conditioning control area 20, and calculates the reward for the control result for each turn.
  • the high temperature warning reward is applied when the temperature before control is higher than the target temperature, that is, when the room temperature is high and the temperature is controlled to be lowered.
  • the low temperature warning reward is applied when the temperature before control is lower than the target temperature, that is, when the room temperature is too low and the temperature is controlled to increase.
  • The reward calculation unit 222 uses the difference between the "target temperature of the turn" and the "temperature after turn control," that is, the deviation between the target temperature and the current temperature, as an index. For example, in the case of the high temperature warning reward, if the "temperature after turn control" is less than or equal to the "target temperature of the turn," the reward is "100%." The reward is then reduced by 10% for every 1 degree by which the "temperature after turn control" exceeds the "target temperature of the turn." Note that this reward is not limited to the above values and can be set arbitrarily.
  • Similarly, in the case of the low temperature warning reward, if the "temperature after turn control" is greater than or equal to the "target temperature of the turn," the reward is "100%."
  • The reward is then reduced by 10% for every 1 degree by which the "temperature after turn control" falls below the "target temperature of the turn." Note that this reward is not limited to the above values and can be set arbitrarily.
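The two temperature rewards can be sketched as follows. The function names, the linear treatment of fractional degrees, and the passing threshold of 80% are assumptions for illustration; the text gives only the 100% baseline and the 10% change per degree, and notes that these values can be set arbitrarily.

```python
def high_temperature_reward(target: float, after: float, penalty_per_degree: float = 10.0) -> float:
    """Reward in percent when the goal is to cool down to the target."""
    # Full reward at or below the target; penalised per degree above it.
    excess = max(0.0, after - target)
    return 100.0 - penalty_per_degree * excess


def low_temperature_reward(target: float, after: float, penalty_per_degree: float = 10.0) -> float:
    """Reward in percent when the goal is to warm up to the target."""
    # Full reward at or above the target; penalised per degree below it.
    shortfall = max(0.0, target - after)
    return 100.0 - penalty_per_degree * shortfall


def passes(reward: float, threshold: float = 80.0) -> bool:
    """Hypothetical pass check against an assumed threshold."""
    return reward >= threshold


# Illustrative turn: target 25 degrees, 27 degrees measured after control.
reward = high_temperature_reward(target=25.0, after=27.0)
```

In this sketch the reward is not floored at zero, so a large deviation yields a negative value; whether to clip it is a design choice the text leaves open.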
  • the reward calculation unit 222 may determine that the test is finally passed when both the high temperature warning reward and the low temperature warning reward are passed. For example, if the initial temperature of air conditioner 2 before control is higher than the target temperature and control is performed using the air conditioning control value based on the high temperature warning reward, the predetermined passing threshold is exceeded, but the target temperature is exceeded and the temperature becomes low. There is a possibility of excessive control. In this case, excessive air conditioning power consumption occurs. Therefore, if the temperature before control is lower than the target temperature, control is performed based on the low temperature warning reward until it is determined to pass. In this manner, the reward calculation unit 222 determines that both the high temperature warning reward and the low temperature warning reward are passed, thereby making it possible to select an air conditioning control value that can be controlled within an appropriate range.
  • The reward calculation unit 222 also rejects the air conditioning control value when a predetermined condition is not satisfied.
  • For the GPU server 4, if the GPU temperature (GPU card temperature) does not fall within a predetermined range determined from the relationship between the GPU temperature and the power consumption of the GPU server, the control value is judged as a failure.
  • For the accelerator 5, if the accelerator temperature (temperature inside the accelerator) does not fall within a predetermined range determined from the relationship between the accelerator temperature and power consumption, the control value is judged as a failure. This process avoids unnecessarily storing operation history that is unlikely to be adopted when the DC 10 actually processes a load, because it results in a temperature inappropriate for managing the devices within the DC 10.
  • The operation history creation unit 223 acquires temperature distribution information 64 and air conditioning power consumption information 65 as a result of the air conditioning control execution unit 240 controlling each air conditioner 2 based on the air conditioning control value information 63 generated by the air conditioning control value generation unit 221 in each Situation.
  • That is, the operation history creation unit 223 causes the air conditioning control execution unit 240 to execute control with each air conditioning control value generated by the air conditioning control value generation unit 221 in each Situation (one value for each combination pattern in which each parameter is divided into M stages), and acquires the temperature distribution information 64 and air conditioning power consumption information 65 obtained when that air conditioning control value is executed.
  • The operation history creation unit 223 excludes from the creation of the operation history information 201 the air conditioning control values and control results that the reward calculation unit 222 determines as failing because they do not satisfy a predetermined condition.
  • FIG. 7 is a diagram for explaining the temperature distribution information 64 according to this embodiment.
  • The temperature distribution information 64 is temperature information measured over a period of time by a temperature sensor 44 provided on the suction port side of the GPU server 4 and a temperature sensor 55 provided on the suction port side of the accelerator 5, as shown in the figure.
  • The temperature sensors 44 and 55 measure temperature after their identification information has been associated in advance with the identification information of the GPU server 4 and the accelerator 5. From the start to the end of the turn, all the temperature sensors 44 and 55 on the floor measure the temperature at each time transition (at predetermined time intervals), and the resulting information is generated as temperature distribution information 64.
  • Hereinafter, the temperature measured by the temperature sensor 44 provided on the suction port side of the GPU server 4 is referred to as the "GPU suction port temperature".
  • The temperature measured by the temperature sensor 55 provided on the suction port side of the accelerator 5 is referred to as the "accelerator suction port temperature".
  • The air conditioning power consumption information 65 is the total power consumption of the air conditioners 2, measured by a power consumption measuring means (not shown) that monitors the air conditioners 2 during the turn in which the air conditioning control execution unit 240 controls each air conditioner 2 based on the air conditioning control value information 63. For example, the power consumption of each air conditioner 2 is measured at each time transition (predetermined time interval), and the sum of the power consumption measured in that turn is calculated as the air conditioning power consumption information 65.
  • The operation history creation unit 223 creates operation history information 201 in which the Situation (Situation classification 62) and the air conditioning control value information 63 in effect when the air conditioning control execution unit 240 executed the air conditioning control are associated with the temperature distribution information 64 and air conditioning power consumption information 65 obtained as a result of that control, and stores it in a storage unit (not shown).
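  • One way to picture the operation history information 201 is as a list of records keyed by Situation classification. The sketch below is illustrative only: the class names, field names, and dictionary layouts are hypothetical, and the description does not specify a storage format.

```python
from dataclasses import dataclass


@dataclass
class OperationHistoryRecord:
    situation_class: str        # Situation classification 62
    control_values: dict        # air conditioning control value information 63
    temperature_samples: dict   # temperature distribution information 64: sensor id -> readings
    aircon_power_samples: dict  # air conditioner id -> power readings taken during the turn

    @property
    def aircon_power_total(self) -> float:
        """Air conditioning power consumption information 65: sum over all units and samples."""
        return sum(sum(readings) for readings in self.aircon_power_samples.values())


class OperationHistory:
    def __init__(self):
        self._records = []

    def store(self, record, passed):
        """Keep only records whose control value was judged as passing."""
        if passed:
            self._records.append(record)

    def extract(self, situation_class):
        """Return every stored record for the given Situation classification."""
        return [r for r in self._records if r.situation_class == situation_class]
```

  • In this sketch, `extract` returns the stored temperature and power results for one Situation classification, which is the kind of lookup the operation phase performs against the history.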
  • The operation history information extraction unit 230 acquires information on the predicted heat generation amount of each placement control area 30 from the server control unit 300 (area heat generation amount estimating unit 320) in the operation phase. Then, the operation history information extraction unit 230 determines the Situation classification 62 at the start of the control turn via the situation recognition unit 210. Then, the operation history information extraction unit 230 extracts temperature distribution information 64 and air conditioning power consumption information 65, which are the results of control using each air conditioning control value information 63 in the determined Situation classification 62, from the operation history information 201. The operation history information extraction unit 230 outputs the extracted temperature distribution information 64 and air conditioning power consumption information 65 to the server control unit 300 (server power consumption prediction unit 330, arrangement pattern determination unit 340).
  • the air conditioning control execution unit 240 controls the air conditioner 2 in the plurality of patterns described above in each Situation using the air conditioning control value generated by the air conditioning control value generation unit 221. Furthermore, the air conditioning control execution unit 240 executes air conditioning control for each air conditioner 2 in the optimal arrangement pattern determined by the server control unit 300 in the operation phase.
  • The server control unit 300 calculates placement patterns that allocate the load to each server resource (CPU server 3, GPU server 4, accelerator 5) based on the generation/deletion schedule information of the virtual resources that become the processing load on the CPU and of the load on the GPU/accelerator.
  • The server control unit 300 then calculates the amount of heat generated for each placement control area 30 in each placement pattern (predicted amount of heat generation for the placement control area 30).
  • The server control unit 300 calculates the total server power consumption, which is the sum of the server power consumption of each placement control area 30, based on the temperature distribution information 64 and the like obtained from the air conditioning control unit 200 for the air conditioning control at each stage in each Situation.
  • the server control unit 300 calculates the total value of the total server power consumption and the air conditioning power consumption, and determines an arrangement pattern that minimizes the total amount.
  • The server control unit 300 includes an arrangement pattern calculation unit 310, an area heat generation estimation unit 320, a server power consumption prediction unit 330, and an arrangement pattern determination unit 340.
  • The placement pattern calculation unit 310 acquires a generation/deletion schedule (hereinafter referred to as "load processing schedule information") for the virtual resources that become a processing load on the CPU server 3 and for the load on the GPUs/accelerators. Then, based on the latest resource usage status (for example, the usage rate of the CPU, GPU, accelerator, etc.), the placement pattern calculation unit 310 calculates placement patterns in which the new load is placed on each server resource (CPU server 3, GPU server 4, accelerator 5). Note that the placement pattern calculation unit 310 ensures that, after the load is allocated, the resource occupation amount of each server resource is equal to or less than its load capacity (upper limit) × a predetermined threshold.
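  • The capacity constraint on a placement can be sketched as a simple feasibility check. The function names, the units of usage and capacity, and the threshold value of 0.8 are hypothetical; the description only states that occupancy after allocation must not exceed the load capacity multiplied by a predetermined threshold.

```python
def fits(current_usage, new_load, capacity, threshold=0.8):
    """True if the resource stays within capacity x threshold after taking the new load."""
    return current_usage + new_load <= capacity * threshold


def feasible_servers(servers, new_load, threshold=0.8):
    """servers maps a server name to a (current usage, load capacity) pair."""
    return [name for name, (usage, capacity) in servers.items()
            if fits(usage, new_load, capacity, threshold)]
```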
  • The area heat generation estimation unit 320 predicts the power consumption of each server resource (CPU server 3, GPU server 4, accelerator 5) for each placement pattern calculated by the placement pattern calculation unit 310, with reference to the basic power consumption information 301. Then, the area heat generation amount estimating unit 320 calculates the total predicted heat generation amount for each placement control area 30 in each placement pattern based on the server arrangement configuration for each placement control area 30.
  • This basic power consumption information 301 is the power consumption that serves as the standard for a normal state at a predetermined temperature (for example, 18 degrees Celsius), and does not take into account changes in server power consumption due to changes in the inlet temperature of each server resource.
  • For example, suppose the load processing schedule information is to execute 12 Pods of CPU processing "a", 5 Pods of GPU processing "b", and 10 Pods of FPGA processing "c" in placement control area "1".
  • Suppose also that the basic power consumption information 301 is "100 W" for one Pod per server for CPU, "5 kW" for one Pod per server for GPU, and "20 W" for one Pod per server for FPGA.
  • In this case, the power consumption of the CPU server 3 in the placement control area "1" is "1.2 kW",
  • the power consumption of the GPU server 4 is "5 kW", and
  • the power consumption of the FPGA is "0.2 kW".
  • the area calorific value estimating unit 320 calculates the calorific value W of the corresponding placement control area 30 using the following equation (1).
  • Calorific value W of the placement control area = (CPU server power consumption of the applicable placement control area) × kc + (GPU server power consumption of the applicable placement control area) × kg + (accelerator power consumption of the applicable placement control area) × ka ... Formula (1)
  • kc, kg, and ka are coefficients obtained by measuring in advance the amount of load on air conditioning cooling in each room in the DC 10 when the CPU server 3, GPU server 4, and accelerator 5 are in operation.
  • The area calorific value estimating unit 320 uses the above formula (1) to sum the calorific values of the CPU server 3, GPU server 4, and accelerator 5 in each placement control area 30, and treats the result as the predicted amount of heat generation of that placement control area 30. Then, the area heat generation amount estimating unit 320 outputs the calculated predicted heat generation amount of each placement control area 30 to the air conditioning control unit 200 (operation history information extraction unit 230).
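  • Formula (1) can be written as a short function. In this sketch the coefficient values are placeholders chosen for illustration (the description obtains kc, kg, and ka by prior measurement), and the function and key names are hypothetical. The input figures are the per-area power consumption values from the example for placement control area "1".

```python
def area_heat(cpu_kw, gpu_kw, acc_kw, kc, kg, ka):
    """Formula (1): weighted sum of the power consumption in one placement control area."""
    return cpu_kw * kc + gpu_kw * kg + acc_kw * ka


def predicted_heat_by_area(area_power_kw, kc, kg, ka):
    """area_power_kw maps an area id to its CPU, GPU, and accelerator power in kW."""
    return {area: area_heat(p["cpu"], p["gpu"], p["acc"], kc, kg, ka)
            for area, p in area_power_kw.items()}


# Placement control area "1" from the example: 1.2 kW CPU, 5 kW GPU, 0.2 kW FPGA.
example = {"1": {"cpu": 1.2, "gpu": 5.0, "acc": 0.2}}
heat = predicted_heat_by_area(example, kc=1.0, kg=1.0, ka=1.0)  # placeholder coefficients
```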
  • The server power consumption prediction unit 330 uses the load processing schedule information (information regarding new processing loads) for the CPU server 3, GPU server 4, and accelerator 5, together with the temperature distribution information 64 and the like acquired from the air conditioning control unit 200 (operation history information extraction unit 230),
  • to calculate the total power consumption (CPU server total power consumption, GPU server total power consumption, and accelerator total power consumption) for each placement control area 30.
  • The server power consumption prediction unit 330 then adds up the CPU server total power consumption, GPU server total power consumption, and accelerator total power consumption in each placement control area 30, and calculates the server power consumption of each placement control area 30.
  • The server power consumption prediction unit 330 includes a CPU power amount prediction unit 331, a GPU power amount prediction unit 332, and an accelerator power amount prediction unit 333.
  • The CPU power amount prediction unit 331 predicts the power consumption of each CPU server 3 using the CPU server power amount learning model 302, based on information on the amount of virtual resources to be newly allocated according to the load processing schedule information (for example, the number of CPU cores) and the resource usage status of the CPU server 3 at that time (for example, CPU usage rate). More specifically, when switching control turns, the CPU power amount prediction unit 331 deletes virtual resources (for example, Pods) whose processing was completed in the previous control turn.
  • Then, the CPU power amount prediction unit 331 obtains the resource usage rate (for example, CPU usage rate, memory usage rate, etc.) of each CPU server 3 excluding the deleted Pods.
  • The CPU power amount prediction unit 331 calculates a predicted value of the resource usage rate when a new Pod is placed in each CPU server 3 based on the load processing schedule information.
  • The CPU power amount prediction unit 331 calculates the power consumption of each CPU server 3 by inputting this predicted value of the resource usage rate into the CPU server power amount learning model 302.
  • The CPU power amount prediction unit 331 calculates the CPU server total power consumption by summing the power consumption calculated for each CPU server 3 within each placement control area 30, based on the placement configuration of the CPU servers 3 in each placement control area 30.
  • The CPU server power consumption learning model 302 is a learning model that uses the resource usage status of the CPU server 3 (for example, CPU usage rate, memory usage rate, etc.) as input information, and uses the power consumption of the CPU server 3 as output information.
  • This CPU server power amount learning model 302 is created in advance using the resource usage status of the CPU server 3 and the server power consumption, which is result information at that time, as learning data.
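  • The CPU-side procedure (remove completed Pods, add newly scheduled Pods, predict per-server power from the resulting usage rate, then sum per placement control area) can be sketched as follows. The linear function standing in for the CPU server power amount learning model 302, the data layout, and all names are hypothetical; the actual model is trained in advance from measured usage and power data.

```python
def model_302(cpu_usage_rate):
    """Hypothetical stand-in for the learned model: usage rate (0..1) -> power in W."""
    return 100.0 + 200.0 * cpu_usage_rate


def cpu_area_totals(servers, completed_pods, new_pods, model=model_302):
    """servers: {name: {"area": str, "cores": int, "pods": {pod: cores_used}}}."""
    totals = {}
    for name, server in servers.items():
        # Drop Pods whose processing finished in the previous control turn.
        pods = {p: c for p, c in server["pods"].items() if p not in completed_pods}
        # Place the Pods newly scheduled on this server.
        pods.update(new_pods.get(name, {}))
        usage_rate = sum(pods.values()) / server["cores"]
        area = server["area"]
        totals[area] = totals.get(area, 0.0) + model(usage_rate)
    return totals
```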
  • The GPU power amount prediction unit 332 predicts the GPU server power consumption of each GPU server 4 using the GPU server power amount learning model 303, based on the type of processing load that is newly scheduled to be processed (hereinafter referred to as "load type") obtained from the load processing schedule information, the GPU inlet temperature, the number of GPU cards, and the like. Furthermore, the GPU power amount prediction unit 332 calculates the total GPU server power consumption, which is the sum of the power consumption of each GPU server 4 in each placement control area 30, based on the placement configuration of the GPU servers 4 in each placement control area 30.
  • The load type is a type of load that depends on the purpose for which the GPU server 4 is used, such as image processing, machine learning processing, network processing, or virtual space processing, and it is assumed that each load type can be identified from the load processing schedule information. Furthermore, it is assumed that each GPU server 4 executes a single type of application based on the load processing schedule information.
  • For this GPU server power consumption learning model 303, there are two methods: a method of directly predicting GPU server power consumption (one-stage method), and a method of predicting GPU server power consumption in two stages via the GPU temperature (GPU card temperature) (two-stage method).
  • In the one-stage method, one learning model is used as the GPU server power amount learning model 303.
  • This GPU server power consumption learning model 303 is a learning model that uses the GPU inlet temperature, load type, and number of GPU cards as input information, and uses the power consumption of the GPU server 4 as output information.
  • This GPU server power amount learning model 303 is created in advance using the GPU inlet temperature, load type, number of GPU cards, and information on the power consumption of the GPU server 4 at that time as learning data.
  • The GPU power amount prediction unit 332 predicts the GPU server power consumption of each GPU server 4 using the GPU server power amount learning model 303, based on the GPU inlet temperature, load type, and number of GPU cards for each GPU server 4 that is not assigned processing at the beginning of the turn.
  • Note that, for the GPU inlet temperature, the current inlet temperature (GPU inlet temperature) of each GPU server 4 is used at the beginning of the turn, and thereafter the GPU inlet temperature of each GPU server 4 indicated by the temperature distribution information 64 acquired from the air conditioning control unit 200 (temperature distribution information 64 starting from the same temperature as the current GPU inlet temperature) is used (the same applies to the two-stage method).
  • the first GPU learning model 303a is a learning model that uses GPU inlet temperature, load type, and number of GPU cards as input information, and uses GPU temperature as output information.
  • This first GPU learning model 303a is created in advance using the GPU inlet temperature, load type, number of GPU cards, and current GPU temperature as learning data.
  • the second GPU learning model 303b is a learning model that uses the GPU temperature as input information and uses the power consumption of the GPU server 4 as output information. This second GPU learning model 303b is created in advance using the GPU temperature and the power consumption of the GPU server 4 at that time as learning data.
  • When adopting the two-stage method, the GPU power amount prediction unit 332 predicts the GPU temperature using the first GPU learning model 303a, based on the GPU inlet temperature, load type, and number of GPU cards for each GPU server 4 that is not assigned processing at the beginning of the turn. Then, the GPU power amount prediction unit 332 predicts the GPU server power consumption of each GPU server 4 based on the predicted GPU temperature using the second GPU learning model 303b.
  • The GPU power amount prediction unit 332 adds up the predicted power consumption of each GPU server 4 within each placement control area 30, based on the placement configuration of the GPU servers 4 in each placement control area 30, and calculates the GPU server total power consumption for each placement control area 30.
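  • The difference between the one-stage and two-stage methods can be sketched as follows. The three model functions are hypothetical stand-ins with made-up coefficients (the real learning models 303, 303a, and 303b are trained in advance on measured data), and the server records and helper names are illustrative only.

```python
# Hypothetical temperature rise per GPU card for each load type (degrees Celsius).
LOAD_TEMP_RISE_C = {"image": 5.0, "ml": 10.0}


def model_303a(inlet_c, load_type, n_cards):
    """Stand-in for first model 303a: inputs -> GPU temperature."""
    return inlet_c + LOAD_TEMP_RISE_C[load_type] * n_cards


def model_303b(gpu_temp_c):
    """Stand-in for second model 303b: GPU temperature -> server power (W)."""
    return 50.0 + 10.0 * gpu_temp_c


def model_303(inlet_c, load_type, n_cards):
    """Stand-in for the one-stage model 303: inputs -> server power (W) directly."""
    return 50.0 + 10.0 * (inlet_c + LOAD_TEMP_RISE_C[load_type] * n_cards)


def predict_one_stage(server):
    return model_303(server["inlet_c"], server["load_type"], server["n_cards"])


def predict_two_stage(server):
    gpu_temp = model_303a(server["inlet_c"], server["load_type"], server["n_cards"])
    return model_303b(gpu_temp)


def area_totals(servers, predict):
    """Sum the predicted power of the servers in each placement control area."""
    totals = {}
    for server in servers:
        totals[server["area"]] = totals.get(server["area"], 0.0) + predict(server)
    return totals
```

  • The two-stage form exposes the predicted GPU temperature as an intermediate value, which can also be checked against a permitted temperature range. The same structure would apply to the accelerator models.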
  • The accelerator power amount prediction unit 333 predicts the accelerator power consumption of each accelerator 5 using the accelerator power amount learning model 304, based on the type of accelerator processing load (load type) that is newly scheduled to be processed, obtained from the load processing schedule information, the accelerator inlet temperature, the number of accelerator processing circuits, and the like. Further, the accelerator power amount prediction unit 333 calculates the total accelerator power consumption amount, which is the sum of the power consumption amounts of each accelerator 5 in each placement control area 30, based on the arrangement configuration of the accelerators 5 in each placement control area 30.
  • The load type is a type of load that depends on the purpose for which the accelerator 5 is used, such as image processing, machine learning processing, internet processing, or encryption processing, and it is assumed that the load type can be identified from the load processing schedule information. Further, it is assumed that each accelerator 5 executes a single type of application based on the load processing schedule.
  • For this accelerator power consumption learning model 304, as with the GPU server power consumption learning model 303, there are two methods: a method of directly predicting accelerator power consumption (one-stage method), and a method of predicting accelerator power consumption in two stages via the accelerator temperature (temperature inside the accelerator) (two-stage method).
  • In the one-stage method, one learning model is used as the accelerator power amount learning model 304.
  • This accelerator power consumption learning model 304 is a learning model that uses the accelerator inlet temperature, load type, and number of accelerator processing circuits as input information, and uses the power consumption of the accelerator 5 as output information.
  • This accelerator power amount learning model 304 is created in advance using the accelerator inlet temperature, load type, number of accelerator processing circuits, and information on the power consumption of the accelerator 5 at that time as learning data.
  • The accelerator power amount prediction unit 333 predicts the accelerator power consumption of each accelerator 5 using the accelerator power amount learning model 304, based on the accelerator inlet temperature, load type, and number of accelerator processing circuits for each accelerator 5 that is not assigned processing at the beginning of the turn. Note that, for the accelerator suction port temperature, the current suction port temperature of each accelerator 5 (accelerator suction port temperature) is used at the beginning of the turn, and thereafter the accelerator suction port temperature of each accelerator 5 indicated by the temperature distribution information 64 acquired from the air conditioning control unit 200 (temperature distribution information 64 starting from the same temperature as the current accelerator suction port temperature) is used (the same applies to the two-stage method).
  • the first accelerator learning model 304a is a learning model that uses the accelerator inlet temperature, the load type, and the number of accelerator processing circuits as input information, and uses the accelerator temperature as output information.
  • This first accelerator learning model 304a is created in advance using the accelerator inlet temperature, load type, number of accelerator processing circuits, and accelerator temperature at that time as learning data.
  • the second accelerator learning model 304b is a learning model that uses the accelerator temperature as input information and uses the power consumption of the accelerator 5 as output information. This second accelerator learning model 304b is created in advance using the accelerator temperature and information on the power consumption of the accelerator 5 at that time as learning data.
  • When adopting the two-stage method, the accelerator power amount prediction unit 333 predicts the accelerator temperature using the first accelerator learning model 304a, based on the accelerator inlet temperature, load type, and number of accelerator processing circuits for each accelerator 5 that is not assigned processing at the beginning of the turn. Then, the accelerator power amount prediction unit 333 predicts the accelerator power consumption of each accelerator 5 based on the predicted accelerator temperature using the second accelerator learning model 304b.
  • The accelerator power amount prediction unit 333 adds up the predicted power consumption of each accelerator 5 within each placement control area 30, based on the arrangement configuration of the accelerators 5 in each placement control area 30, and calculates the accelerator total power consumption for each placement control area 30.
  • The server power consumption prediction unit 330 totals the CPU server total power consumption, the GPU server total power consumption, and the accelerator total power consumption in each placement control area 30, and calculates the server power consumption of each placement control area 30.
  • the placement pattern determination unit 340 sums up the server power consumption of each placement control area 30 in each placement pattern, and calculates the total server power consumption (total server power consumption).
  • The arrangement pattern determination unit 340 calculates the sum of the calculated total server power consumption and the air conditioning power consumption for that arrangement pattern obtained from the air conditioning control unit 200, and determines the arrangement pattern that minimizes this sum.
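  • The final selection reduces to taking the minimum of (total server power consumption + air conditioning power consumption) over the candidates. In this sketch, each candidate is assumed to pair one arrangement pattern with the air conditioning power consumption taken from the operation history; the dictionary keys and function names are hypothetical.

```python
def total_power(candidate):
    """Total server power (sum over placement control areas) plus air conditioning power."""
    return sum(candidate["area_server_power"].values()) + candidate["aircon_power"]


def select_pattern(candidates):
    """Return the candidate arrangement pattern with the smallest combined power."""
    return min(candidates, key=total_power)
```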
  • FIG. 8 is a flowchart showing the flow of operation history information generation processing executed by the power consumption reduction control device 100 according to the present embodiment.
  • The air conditioning control value generation unit 221 of the air conditioning control unit 200 (operation history information generation unit 220) of the power consumption reduction control device 100 generates air conditioning control values in which the control parameters that can be changed in each air conditioner 2 (for example, set temperature (target temperature), air volume, etc.) are divided into multiple stages (step S1). Specifically, the air conditioning control value generation unit 221 divides each parameter into M stages from an upper limit value to a lower limit value, and combines the parameters of each stage to generate air conditioning control value information 63 for each air conditioner 2.
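  • Step S1 amounts to building a grid over the control parameters. The sketch below assumes evenly spaced stages that include both the upper and lower limit, which the description does not specify; the parameter names, limits, and function names are hypothetical.

```python
from itertools import product


def stages(lower, upper, m):
    """Split the range from upper down to lower into m evenly spaced stage values."""
    if m == 1:
        return [upper]
    step = (upper - lower) / (m - 1)
    return [upper - i * step for i in range(m)]


def control_values(param_limits, m):
    """param_limits maps a parameter name to (lower limit, upper limit).

    Returns one dict per combination of stage values across all parameters.
    """
    names = list(param_limits)
    grids = [stages(param_limits[n][0], param_limits[n][1], m) for n in names]
    return [dict(zip(names, combo)) for combo in product(*grids)]
```

  • For example, two parameters divided into M = 3 stages yield 3 × 3 = 9 candidate control values per air conditioner, so the number of candidates grows quickly with the number of parameters and stages.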
  • The air conditioning control execution unit 240 executes air conditioning control in a plurality of patterns (step S2). For example, the air conditioning control execution unit 240 executes air conditioning control with each air conditioning control value in multiple patterns, such as controlling the air conditioners in the order "1" → "2" → "3", controlling a combination of air conditioners (for example, air conditioners "1" and "2", or air conditioners "1" and "3"), or controlling air conditioners "1", "2", and "3" simultaneously.
  • The operation history information generation unit 220 (reward calculation unit 222) calculates a reward (temperature reward) as an index for evaluating the result of performing air conditioning control using the air conditioning control value generated by the air conditioning control value generation unit 221.
  • The reward calculation unit 222 then determines whether the control result satisfies a predetermined reward, that is, whether the air conditioning control value satisfies a predetermined condition.
  • The reward calculation unit 222 determines a pass if the calculated reward is equal to or higher than a predetermined threshold and predetermined conditions are satisfied, such as the average floor temperature after the control turn, the GPU temperature, and the accelerator temperature being within their specified ranges.
  • The operation history information generation unit 220 acquires temperature distribution information 64 and air conditioning power consumption information 65 using the air conditioning control value information 63 that was generated by the air conditioning control value generation unit 221 in each Situation and determined to be acceptable by the reward calculation unit 222 (step S4).
  • This temperature distribution information 64 is information obtained by measuring, at predetermined time intervals, the GPU suction port temperature measured by the temperature sensor 44 provided on the suction port side of the GPU server 4 and the accelerator suction port temperature measured by the temperature sensor 55 provided on the suction port side of the accelerator 5.
  • the air conditioning power consumption information 65 is the total power consumption of each air conditioner 2 measured in a predetermined control turn.
  • The operation history creation unit 223 creates operation history information 201 by associating the Situation (Situation classification 62) and the air conditioning control value information 63 in effect when the air conditioning control execution unit 240 executed the air conditioning control with the temperature distribution information 64 and air conditioning power consumption information 65 obtained as a result of the control (step S5), and stores it in the storage unit.
  • The power consumption reduction control device 100 executes this operation history information 201 generation process in advance, in the learning phase before the operation phase.
  • FIG. 9 is a flowchart showing the flow of the arrangement pattern determination process executed by the power amount reduction control device 100 according to the present embodiment.
  • The server control unit 300 (arrangement pattern calculation unit 310) of the power reduction control device 100 acquires load processing schedule information and, at the start of each control turn, calculates placement patterns for placing a new load on each server resource (CPU server 3, GPU server 4, accelerator 5) based on the latest resource usage status (for example, usage rate) (step S10).
  • The area heat generation estimation unit 320 predicts the power consumption of each server resource (CPU server 3, GPU server 4, accelerator 5) for each placement pattern calculated by the placement pattern calculation unit 310, with reference to the basic power consumption information 301. Then, the area heat generation estimation unit 320 calculates the total predicted heat generation amount (predicted heat generation amount of the placement control area) for each placement control area 30 in each placement pattern based on the server placement configuration for each placement control area 30 (step S11). Then, the area heat generation amount estimating unit 320 outputs the calculated predicted heat generation amount of each placement control area 30 to the air conditioning control unit 200 (operation history information extraction unit 230).
  • the operation history information extraction unit 230 of the air conditioning control unit 200 acquires information on the predicted heat generation amount of each layout control area 30 from the server control unit 300 (area heat generation amount estimating unit 320). Then, the operation history information extraction unit 230 determines the Situation classification 62 at the start of the control turn via the situation recognition unit 210 (Step S12). The operation history information extraction unit 230 extracts temperature distribution information 64 and air conditioning power consumption information 65, which are the results of control using each air conditioning control value information 63 in the determined Situation classification 62, from the operation history information 201 (step S13). The operation history information extraction unit 230 outputs the extracted temperature distribution information 64 and air conditioning power consumption information 65 to the server control unit 300.
  • The server power consumption prediction unit 330 of the server control unit 300 acquires the load processing schedule information for the CPU server 3, GPU server 4, and accelerator 5, and receives the temperature distribution information 64 of each placement pattern from the air conditioning control unit 200 (operation history information extraction unit 230). Using this information, it calculates the total power consumption of each device type (CPU server total power consumption, GPU server total power consumption, and accelerator total power consumption) (step S14).
  • The server power consumption prediction unit 330 (CPU power amount prediction unit 331) takes as input the amount of virtual resources (for example, the number of CPU cores) to be newly allocated based on the load processing schedule and the resource usage status (for example, CPU usage rate) of the CPU server 3 at that time, and predicts the power consumption of each CPU server 3 using the CPU server power consumption learning model 302. The CPU power amount prediction unit 331 then calculates the CPU server total power consumption, which is the sum of the power consumption of the CPU servers 3 in each placement control area 30, based on the arrangement configuration of the CPU servers 3 in that placement control area 30.
  • The server power consumption prediction unit 330 (GPU power amount prediction unit 332) predicts the power consumption of each GPU server 4 using the GPU server power consumption learning model 303, based on the load type of the new load obtained from the load processing schedule information, the GPU inlet temperature obtained from the temperature distribution information 64, and the number of GPU cards. The GPU power amount prediction unit 332 then calculates the GPU server total power consumption, which is the sum of the power consumption of the GPU servers 4 in each placement control area 30, based on the arrangement configuration of the GPU servers 4 in that placement control area 30.
  • The server power consumption prediction unit 330 (accelerator power amount prediction unit 333) predicts the power consumption of each accelerator 5 using the accelerator power consumption learning model 304, based on the load type of the new load obtained from the load processing schedule information, the accelerator inlet temperature obtained from the temperature distribution information 64, and the number of accelerator processing circuits. The accelerator power amount prediction unit 333 then calculates the accelerator total power consumption, which is the sum of the power consumption of the accelerators 5 in each placement control area 30, based on the arrangement configuration of the accelerators 5 in that placement control area 30.
  • The server power consumption prediction unit 330 sums the CPU server total power consumption, the GPU server total power consumption, and the accelerator total power consumption in each placement control area 30 to calculate the server power consumption of each placement control area 30 (step S15).
  • the placement pattern determination unit 340 sums up the server power consumption of each placement control area 30 in each placement pattern, and calculates the total server power consumption (total server power consumption).
  • The placement pattern determination unit 340 calculates the sum of the total server power consumption and the air conditioning power consumption for each placement pattern obtained from the air conditioning control unit 200, and determines the placement pattern that minimizes this sum (step S16).
  • In this way, in a data center environment where CPU servers, GPU servers, accelerators, and the like coexist, the power reduction control device 100 can determine the processing load placement pattern and air conditioning control value that reduce the total power consumption of the data center, which consists of server power consumption and air conditioning power consumption.
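Steps S15 and S16 above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class `AreaPrediction`, the function `select_pattern`, the pattern names, and all energy values are hypothetical and chosen only to show the summation per placement control area and the selection of the minimum total.

```python
from dataclasses import dataclass


@dataclass
class AreaPrediction:
    """Predicted energy (kWh) per device type in one placement control area."""
    cpu_total_kwh: float
    gpu_total_kwh: float
    accel_total_kwh: float

    def server_kwh(self) -> float:
        # Step S15: server power consumption of one placement control area.
        return self.cpu_total_kwh + self.gpu_total_kwh + self.accel_total_kwh


def select_pattern(patterns):
    """patterns maps a pattern name to (list of AreaPrediction, air conditioning kWh)."""
    totals = {}
    for name, (areas, aircon_kwh) in patterns.items():
        total_server_kwh = sum(area.server_kwh() for area in areas)
        # Step S16: add the air conditioning energy extracted for this pattern.
        totals[name] = total_server_kwh + aircon_kwh
    best = min(totals, key=totals.get)
    return best, totals


# Hypothetical inputs: two candidate placement patterns, two areas each.
patterns = {
    "pattern_A": ([AreaPrediction(1.0, 2.0, 0.5), AreaPrediction(1.0, 0.0, 0.0)], 3.0),
    "pattern_B": ([AreaPrediction(1.0, 1.0, 0.5), AreaPrediction(1.0, 1.25, 0.0)], 2.5),
}
best, totals = select_pattern(patterns)
print(best, totals)
```

In this made-up example, pattern_B has the higher server energy (4.75 kWh against 4.5 kWh) but the lower combined total, which is the kind of case the combined evaluation is meant to catch.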
  • FIG. 10 is a hardware configuration diagram showing an example of a computer 900 that implements the functions of the power consumption reduction control device 100 according to the present embodiment.
  • The computer 900 includes a CPU (Central Processing Unit) 901, a ROM (Read Only Memory) 902, a RAM (Random Access Memory) 903, an HDD (Hard Disk Drive) 904, an input/output I/F (Interface) 905, a communication I/F 906, and a media I/F 907.
  • the CPU 901 operates based on a program stored in the ROM 902 or HDD 904, and performs control by the control unit.
  • the ROM 902 stores a boot program executed by the CPU 901 when the computer 900 is started, programs related to the hardware of the computer 900, and the like.
  • the CPU 901 controls an input device 910 such as a mouse or a keyboard, and an output device 911 such as a display or printer via an input/output I/F 905.
  • the CPU 901 acquires data from the input device 910 via the input/output I/F 905 and outputs the generated data to the output device 911.
  • Note that a GPU (Graphics Processing Unit) or the like may be used as the processor in addition to the CPU 901.
  • the HDD 904 stores programs executed by the CPU 901 and data used by the programs.
  • The communication I/F 906 receives data from other devices via a communication network (for example, NW (Network) 920) and outputs it to the CPU 901, and also sends data generated by the CPU 901 to other devices via the communication network.
  • the media I/F 907 reads the program or data stored in the recording medium 912 and outputs it to the CPU 901 via the RAM 903.
  • the CPU 901 loads a program related to target processing from the recording medium 912 onto the RAM 903 via the media I/F 907, and executes the loaded program.
  • the recording medium 912 is an optical recording medium such as a DVD (Digital Versatile Disc) or a PD (Phase change rewritable disk), a magneto-optical recording medium such as an MO (Magneto Optical disk), a magnetic recording medium, a semiconductor memory, or the like.
  • The CPU 901 of the computer 900 realizes the functions of the power consumption reduction control device 100 by executing a program loaded onto the RAM 903. Furthermore, data in the RAM 903 is stored in the HDD 904.
  • the CPU 901 reads a program related to target processing from the recording medium 912 and executes it. In addition, the CPU 901 may read a program related to target processing from another device via a communication network (NW 920).
  • The power consumption reduction control device according to the present invention is a power consumption reduction control device 100 that controls a CPU server 3, a GPU server 4, an accelerator 5, and a plurality of air conditioners 2. A plurality of placement control areas 30, in each of which any of the CPU server 3, the GPU server 4, and the accelerator 5 are arranged, and an air conditioning control area 20, which is an area for measuring the effect of air conditioning control by the plurality of air conditioners 2, are set. The power consumption reduction control device 100 is characterized by comprising the following units.
  • An air conditioning control value generation unit 221 that generates an air conditioning control value, including at least a target temperature, to be set in the plurality of air conditioners 2.
  • An air conditioning control execution unit 240 that executes control of the plurality of air conditioners 2 using the air conditioning control value.
  • A reward calculation unit 222 that, for a plurality of placement patterns in which processing loads are placed on the CPU server 3, the GPU server 4, and the accelerator 5, calculates a reward that evaluates, using the target temperature as an index, the result of the air conditioning control execution unit 240 controlling the plurality of air conditioners 2 with the air conditioning control value, and determines whether the reward satisfies a predetermined condition.
  • An operation history creation unit 223 that acquires temperature distribution information 64 and the air conditioning power consumption of the plurality of air conditioners 2 as the control result for the air conditioning control value determined to satisfy the predetermined condition, and creates operation history information 201 in which these are associated with the predicted heat generation amount of each placement control area 30 in each of the plurality of placement patterns.
  • A placement pattern calculation unit 310 that calculates a plurality of placement patterns in which new processing loads are placed, using information on the processing loads on the CPU server 3, the GPU server 4, and the accelerator 5.
  • An area heat generation amount estimation unit 320 that, for each calculated placement pattern, estimates the predicted heat generation amount of each placement control area 30 by summing the heat generated when processing loads are placed on the CPU server 3, GPU server 4, and accelerator 5 belonging to that placement control area 30.
  • An operation history information extraction unit 230 that, using information on the predicted heat generation amount of each placement control area 30, refers to the operation history information 201 and extracts the temperature distribution information 64 and the air conditioning power consumption obtained when control is performed with the air conditioning control value in each placement pattern.
  • A server power consumption prediction unit 330 that, using the extracted temperature distribution information 64 and information on the new processing load, calculates for each placement control area 30 in each placement pattern the server power consumption, which is the sum of the power consumption of each CPU server 3, each GPU server 4, and each accelerator 5.
  • A placement pattern determination unit 340 that, for each placement pattern, sums the server power consumption of the placement control areas 30 to obtain the total server power consumption, calculates the sum of this total and the extracted air conditioning power consumption, and determines the placement pattern that minimizes the calculated sum as the placement pattern for placing the processing load.
  • In this way, the power reduction control device 100 can reduce the total power consumption, consisting of server power consumption and air conditioning power consumption, in an environment where the CPU server 3, GPU server 4, and accelerator 5 coexist.
  • The power consumption reduction control device is a power consumption reduction control device 100 that controls a CPU server 3, a GPU server 4, an accelerator 5, and a plurality of air conditioners 2, which are included in a data center 10. A plurality of placement control areas 30, in each of which any of the CPU servers 3, GPU servers 4, and accelerators 5 are placed as a group of servers on which processing loads are placed, and an air conditioning control area 20, which is an area for measuring the effect of air conditioning control by the plurality of air conditioners 2, are set. The device comprises the following units.
  • An external factor acquisition unit 211 that acquires external factor information.
  • A Situation determination unit 212 that divides the value of each external factor into ranges of a predetermined width, combines the divided ranges of the external factors to define Situation classifications 62, and determines which Situation classification 62 the acquired external factor information belongs to.
  • An air conditioning control value generation unit 221 that generates, for each of the Situation classifications 62, an air conditioning control value including at least a target temperature to be set in the plurality of air conditioners 2.
  • An air conditioning control execution unit 240 that executes control of the plurality of air conditioners 2 using the air conditioning control value.
  • A reward calculation unit 222 that calculates a reward evaluating the control result using the target temperature as an index, and determines whether the reward satisfies a predetermined condition.
  • An operation history creation unit 223 that acquires, as the control result for the air conditioning control value determined to satisfy the predetermined condition, temperature distribution information 64 indicating the GPU inlet temperature (the temperature at the inlet of the GPU server 4) and the accelerator inlet temperature (the temperature at the inlet of the accelerator 5), together with the air conditioning power consumption of the plurality of air conditioners 2 under that air conditioning control value, and creates operation history information 201 that associates the acquired temperature distribution information 64 and air conditioning power consumption with the Situation classification 62 and the air conditioning control value in effect when the air conditioning control was executed.
  • An operation history information extraction unit 230 that acquires information on the predicted heat generation amount of each placement control area 30, determines the current Situation classification 62 via the Situation determination unit 212, and, referring to the operation history information 201, extracts the temperature distribution information 64 and the air conditioning power consumption obtained when control is performed with the air conditioning control value in each placement pattern.
  • A placement pattern calculation unit 310 that acquires load processing schedule information indicating the schedule for generating and deleting processing loads for the CPU server 3, GPU server 4, and accelerator 5, and calculates a plurality of placement patterns in which new processing loads are placed.
  • An area heat generation amount estimation unit 320 that estimates the predicted heat generation amount of each placement control area 30 by summing the heat generated when processing loads are placed on the CPU server 3, GPU server 4, and accelerator 5 belonging to that placement control area 30.
  • A server power consumption prediction unit 330 that, using information including the load processing schedule information and the extracted temperature distribution information 64, calculates for each placement pattern the CPU server total power consumption (the sum of the power consumption of the CPU servers 3 in the placement control area 30), the GPU server total power consumption (the sum of the power consumption of the GPU servers 4 in the placement control area 30), and the accelerator total power consumption (the sum of the power consumption of the accelerators 5 in the placement control area 30), and calculates the server power consumption of each placement control area 30 by summing these three totals.
  • A placement pattern determination unit 340 that, for each placement pattern, sums the server power consumption of the placement control areas 30 to obtain the total server power consumption, calculates the sum of this total and the extracted air conditioning power consumption, and determines the placement pattern that minimizes the calculated sum as the placement pattern for placing the processing load.
  • In this way, the power consumption reduction control device 100 can reduce the total power consumption of the data center 10, consisting of server power consumption and air conditioning power consumption, in a data center 10 environment in which CPU servers 3, GPU servers 4, and accelerators 5 are mixed.
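The Situation classification described above (dividing each external factor into ranges of a predetermined width and combining the ranges) can be sketched as a simple binning function, with the operation history keyed by the resulting classification. This is only an illustration: the factor names `outside_temp_c` and `humidity_pct`, the range widths, and the history entries are hypothetical, since this passage does not list the actual external factors.

```python
# Hypothetical external factors and range widths (assumptions for illustration).
RANGE_WIDTHS = {"outside_temp_c": 5.0, "humidity_pct": 20.0}


def situation_class(factors, range_widths=RANGE_WIDTHS):
    """Return the Situation classification as a tuple of (factor, bin index)."""
    return tuple(
        (name, int(factors[name] // width))
        for name, width in sorted(range_widths.items())
    )


# Operation history keyed by (Situation classification, target temperature).
# Each entry holds (inlet temperature in deg C, air conditioning energy in kWh).
history = {}
s = situation_class({"outside_temp_c": 27.3, "humidity_pct": 55.0})
history[(s, 24.0)] = (26.5, 3.0)
history[(s, 22.0)] = (24.0, 3.8)

# Extract every recorded control result for the current Situation.
current = situation_class({"outside_temp_c": 29.9, "humidity_pct": 41.0})
candidates = {key[1]: value for key, value in history.items() if key[0] == current}
print(current)
print(candidates)
```

Both sets of readings fall into the same bins, so the lookup for the current Situation returns the two control results recorded earlier.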
  • The power consumption reduction control device 100 further includes: a CPU server power consumption learning model 302 that uses the resource usage status of the CPU server 3 as input information and the power consumption of the CPU server 3 as output information; a GPU server power consumption learning model 303 that uses the GPU inlet temperature, the type of processing load, and the number of GPU cards as input information and the power consumption of the GPU server 4 as output information; and an accelerator power consumption learning model 304 that uses the accelerator inlet temperature, the type of processing load, and the number of accelerator processing circuits of the accelerator 5 as input information and the power consumption of the accelerator 5 as output information. The server power consumption prediction unit 330 is characterized by calculating the power consumption of the CPU server 3 using the CPU server power consumption learning model 302, the power consumption of the GPU server 4 using the GPU server power consumption learning model 303, and the power consumption of the accelerator 5 using the accelerator power consumption learning model 304.
  • In this way, the power consumption reduction control device 100 can suitably calculate each power consumption amount by using the CPU server power consumption learning model 302, the GPU server power consumption learning model 303, and the accelerator power consumption learning model 304.
  • Two GPU learning models are provided: a first GPU learning model that uses GPU temperature as output information, and a second GPU learning model that uses GPU temperature as input information and the power consumption of the GPU server 4 as output information. The server power consumption prediction unit 330 is characterized by calculating the GPU temperature using the first GPU learning model and then calculating the power consumption of the GPU server 4 using the second GPU learning model.
  • In this way, the power consumption reduction control device 100 can suitably calculate the power consumption of the GPU server 4 by first calculating the GPU temperature with the first GPU learning model and then applying the second GPU learning model.
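The two-stage prediction described above can be sketched as two chained functions. This is a minimal illustration, not the learning models of the source: the function names, the linear form, and every coefficient are hypothetical placeholders, and the first stage is assumed to take the same inputs as the GPU server power consumption learning model 303 (inlet temperature, load type, number of GPU cards). In practice each stage would be a trained regression model.

```python
# All coefficients below are hypothetical placeholders, not values from the source.
BASE_GPU_TEMP_C = {"inference": 55.0, "training": 65.0}


def first_gpu_model(inlet_temp_c, load_type, num_cards):
    """Stage 1: predict GPU temperature (deg C) from inlet temperature and load."""
    return (BASE_GPU_TEMP_C[load_type]
            + 0.75 * (inlet_temp_c - 20.0)
            + 0.5 * (num_cards - 1))


def second_gpu_model(gpu_temp_c):
    """Stage 2: predict GPU server power (W) from the GPU temperature."""
    return 900.0 + 8.0 * (gpu_temp_c - 55.0)


def predict_gpu_server_power(inlet_temp_c, load_type, num_cards):
    # Chain the stages: inlet temperature -> GPU temperature -> server power.
    return second_gpu_model(first_gpu_model(inlet_temp_c, load_type, num_cards))


cool = predict_gpu_server_power(20.0, "inference", 1)
warm = predict_gpu_server_power(32.0, "inference", 1)
print(cool, warm)
```

Splitting the prediction this way makes the intermediate GPU temperature observable, so each stage can be checked against measurements separately.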
  • Two accelerator learning models are provided: a first accelerator learning model that uses the accelerator inlet temperature, the type of processing load, and the number of accelerator processing circuits of the accelerator 5 as input information and the accelerator temperature as output information, and a second accelerator learning model that uses the accelerator temperature as input information and the power consumption of the accelerator 5 as output information. The server power consumption prediction unit 330 is characterized by calculating the accelerator temperature using the first accelerator learning model and then calculating the power consumption of the accelerator 5 using the second accelerator learning model.
  • In this way, the power consumption reduction control device 100 can suitably calculate the power consumption of the accelerator 5 by first calculating the accelerator temperature with the first accelerator learning model and then applying the second accelerator learning model.
  • The power consumption reduction control device 100 includes basic power consumption information 301 indicating the reference power consumption at a predetermined temperature for each of the CPU server 3, GPU server 4, and accelerator 5. The area heat generation amount estimation unit 320 is characterized by calculating the heat generation amount of each of the CPU server 3, GPU server 4, and accelerator 5 by calculating the power consumption of each using the basic power consumption information 301.
  • In this way, because the power consumption reduction control device 100 holds basic power consumption information 301 indicating the reference power consumption at a predetermined temperature, it can estimate the heat generation amount of each of the CPU server 3, GPU server 4, and accelerator 5.
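A minimal sketch of the area heat estimate follows. It assumes that the heat generated by a device can be approximated by the reference energy it consumes, which is an assumption made here for illustration. The table `BASIC_POWER_KWH`, the area names, and the values are hypothetical stand-ins for the basic power consumption information 301.

```python
# Hypothetical reference energy per loaded device at the predetermined temperature (kWh).
BASIC_POWER_KWH = {"cpu": 0.25, "gpu": 1.0, "accelerator": 0.5}


def estimate_area_heat(pattern):
    """pattern maps each placement control area to the device kinds that receive load.

    Returns the predicted heat generation per area, approximated by summed energy.
    """
    return {
        area: sum(BASIC_POWER_KWH[kind] for kind in devices)
        for area, devices in pattern.items()
    }


# One hypothetical placement pattern across three placement control areas.
pattern = {"area_1": ["cpu", "cpu", "gpu"], "area_2": ["accelerator"], "area_3": []}
heat = estimate_area_heat(pattern)
print(heat)
```

The per-area values are what the operation history information extraction unit would use as a lookup key when searching the operation history.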
  • Air conditioning control system
  • 2 Air conditioner
  • 3 CPU server
  • 4 GPU server
  • 5 Accelerator
  • 10 Data center (DC)
  • 20 Air conditioning control area
  • 30 Placement control area
  • 62 Situation classification
  • 63 Air conditioning control value information
  • 64 Temperature distribution information
  • 65 Air conditioning power consumption information
  • 100 Electric power reduction control device
  • 200 Air conditioning control unit
  • 201 Operation history information
  • 210 Situation recognition unit
  • 211 External factor acquisition unit
  • 212 Situation determination unit
  • 220 Operation history information generation unit
  • 221 Air conditioning control value generation unit
  • 222 Reward calculation unit
  • 223 Operation history creation unit
  • 230 Operation history information extraction unit
  • 240 Air conditioning control execution unit
  • 300 Server control unit
  • 301 Basic power consumption information
  • 302 CPU server power consumption learning model
  • 303 GPU server power consumption learning model
  • 304 Accelerator power consumption learning model
  • 310 Arrangement pattern calculation unit
  • 320 Area heat generation estimation unit
  • 330 Server power consumption prediction unit
  • 331 CPU power consumption prediction unit
  • 332 GPU power consumption prediction unit
  • 333 Accelerator power consumption prediction unit
  • 340 Arrangement pattern determination unit

Abstract

An electric power amount reduction control device (100) comprises: an air conditioning control value generation unit (221) that generates an air conditioning control value; an air conditioning control execution unit (240) that causes control of a plurality of air conditioners to be executed using the air conditioning control value; an operation history creation unit (223) that acquires temperature distribution information (64) and the air conditioning electric power consumption amounts of the plurality of air conditioners as a control result and creates operation history information (201); an arrangement pattern calculation unit (310) that calculates an arrangement pattern for a processing load; an area heat generation amount estimation unit (320) that estimates a predicted heat generation amount for each of arrangement control areas; a server electric power consumption amount prediction unit (330) that calculates a server electric power consumption amount for each of the arrangement control areas (30); and an arrangement pattern determination unit (340) that calculates the total of the server electric power consumption amount and the air conditioning electric power consumption amounts in each arrangement pattern and determines an arrangement pattern in which the total is smallest.

Description

Electric energy reduction control device, electric energy reduction control method, electric energy reduction control system, and program
The present invention relates to a power consumption reduction control device, a power consumption reduction control method, a power consumption reduction control system, and a program that reduce power consumption in a data center (hereinafter sometimes referred to as "DC").
Air conditioning accounts for a large proportion of the power consumption of a data center (DC), and as the number and scale of DCs increase, there is a need to reduce the power consumption of air conditioning. Furthermore, the amount of data processed in DCs is increasing year by year, so the power consumption efficiency of the DC as a whole (the power consumed by the entire DC to process a given amount of data) needs to be improved.
The technique described in Non-Patent Document 1 has been disclosed as a technique for optimizing the power consumption of an entire DC by considering both the power consumption of air conditioning and the power consumption of servers (IT devices).
In the air-conditioning-linked IT load placement optimization method for data centers described in Non-Patent Document 1, operating information and monitoring information of the IT equipment in the data center are collected to predict future changes in the load on the IT equipment, and the power increase of the air conditioning equipment corresponding to the power increase of the IT equipment is calculated. Then, an optimization problem that minimizes an objective function, namely the power consumption of the data center, is solved so that the load aggregation rate on the IT equipment increases over time, that is, so that the number of operating IT devices is reduced. In this way, the placement of IT loads (virtual machines) on IT equipment that minimizes the power consumption of the data center is calculated.
However, in the technique described in Non-Patent Document 1, the air conditioning power model used to calculate the power of the air conditioning equipment adopts a general-purpose rule-based standard that does not depend on the equipment conditions that differ from DC to DC. It has therefore been difficult to optimize the total power consumption of a DC while taking into account individual equipment conditions such as the location of the air conditioning equipment, airflow, the server arrangement within the DC, and thermal cooling efficiency.
Furthermore, within a DC, load processing is expected to be performed in an environment in which CPU (Central Processing Unit) servers, GPU (Graphics Processing Unit) servers, and accelerators coexist. Here, an accelerator is, for example, an FPGA (Field Programmable Gate Array), an ASIC (Application Specific Integrated Circuit), or a TPU (Tensor Processing Unit).
CPUs, GPUs, and accelerators differ in the power consumption required for server cooling during load processing. Specifically, a CPU shows almost no fluctuation in power consumption within the normal temperature range, whereas GPU servers and accelerators are expected to show fluctuations in power consumption even within the normal temperature range.
The inventors of the present application set the inlet temperature of a GPU server to 20°C and 33°C, both within the normal temperature range, and measured the temperature of the GPU cards in the GPU server (GPU temperature), the power consumption, and the fan rotation rate.
As a result, as shown in FIG. 1, the GPU temperature is approximately 10 degrees higher at an inlet temperature of 33°C than at 20°C. As shown in FIG. 2, the power consumption of a GPU card is also approximately 20 W higher at 33°C. As shown in FIG. 3, the fan rotation rate of the GPU server is likewise approximately 10% higher at 33°C. Note that the horizontal axis in FIGS. 1 to 3 represents the time [minutes:seconds] from the start of measurement.
In the experiment, when the inlet temperature was 20°C, the power consumption of the four GPU cards was 0.66 kWh and that of the GPU server body excluding the GPU cards was 0.33 kWh, giving 0.99 kWh for the GPU server as a whole. When the inlet temperature was 33°C, the power consumption of the four GPU cards was 0.72 kWh and that of the GPU server body excluding the GPU cards was 0.37 kWh, giving 1.09 kWh for the GPU server as a whole. These results confirm that raising the inlet temperature from 20°C to 33°C increases the power consumption of the entire GPU server by about 9%.
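The figures reported for the 20°C and 33°C measurements can be checked with a few lines of Python. The variable names are illustrative; the values are those stated in the text. The stated increase of about 9% is consistent with taking the 0.10 kWh increase relative to the 33°C total, while relative to the 20°C baseline it comes to about 10%.

```python
# Measured energy (kWh) as reported in the text.
cards_20, body_20 = 0.66, 0.33   # inlet temperature 20 deg C
cards_33, body_33 = 0.72, 0.37   # inlet temperature 33 deg C

total_20 = cards_20 + body_20
total_33 = cards_33 + body_33
increase = total_33 - total_20

# Round before printing to hide floating-point noise.
print(round(total_20, 2), round(total_33, 2), round(increase, 2))
print(round(100 * increase / total_20, 1))  # increase relative to the 20 deg C total
print(round(100 * increase / total_33, 1))  # increase relative to the 33 deg C total
```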
In other words, conventional research has not been able to model the differences in the power consumption characteristics of the cooling function with respect to inlet temperature in an environment where CPU servers, GPU servers, accelerators, and the like coexist. More specifically, when the air conditioning control level is set to "strong", the inlet temperature becomes "low" and the server power consumption falls to "low", but the air conditioning power consumption becomes "high". Conversely, when the air conditioning control level is set to "low", the inlet temperature becomes "high" and the server power consumption rises to "high", but the air conditioning power consumption can be kept "low". That is, the air conditioning power consumption resulting from air conditioning control and the server power consumption are in a trade-off relationship. It has therefore been difficult to accurately estimate the power consumption of the entire DC, which consists of server power consumption and air conditioning power consumption, and it has not been possible to reduce the power consumption of the entire DC by finding an optimal solution for load placement and air conditioning control within the DC.
The present invention has been made in view of these points, and an object of the present invention is to reduce the total power consumption, consisting of server power consumption and air conditioning power consumption, in an environment where CPU servers, GPU servers, accelerators, and the like coexist.
A power consumption reduction control device according to the present invention is a power consumption reduction control device that controls a CPU server, a GPU server, an accelerator, and a plurality of air conditioners, wherein a plurality of placement control areas, in each of which any of the CPU server, the GPU server, and the accelerator are arranged, and an air conditioning control area, which is an area for measuring the effect of air conditioning control by the plurality of air conditioners, are set, and the power consumption reduction control device is characterized by comprising: an air conditioning control value generation unit that generates an air conditioning control value, including at least a target temperature, to be set in the plurality of air conditioners; an air conditioning control execution unit that executes control of the plurality of air conditioners using the air conditioning control value; a reward calculation unit that, for a plurality of placement patterns in which processing loads are placed on the CPU server, the GPU server, and the accelerator, calculates a reward that evaluates, using the target temperature as an index, the result of the air conditioning control execution unit controlling the plurality of air conditioners with the air conditioning control value, and determines whether the reward satisfies a predetermined condition; an operation history creation unit that acquires temperature distribution information and the air conditioning power consumption of the plurality of air conditioners as the control result for the air conditioning control value determined to satisfy the predetermined condition, and creates operation history information in which these are associated with the predicted heat generation amount of each placement control area in each of the plurality of placement patterns; a placement pattern calculation unit that calculates a plurality of placement patterns in which new processing loads are placed, using information on the processing loads on the CPU server, the GPU server, and the accelerator; an area heat generation amount estimation unit that, for each calculated placement pattern, estimates the predicted heat generation amount of each placement control area by summing the heat generated when processing loads are placed on the CPU server, GPU server, and accelerator belonging to that placement control area; an operation history information extraction unit that, using information on the predicted heat generation amount of each placement control area, refers to the operation history information and extracts the temperature distribution information and the air conditioning power consumption obtained when control is performed with the air conditioning control value in each placement pattern; a server power consumption prediction unit that, using the extracted temperature distribution information and information on the new processing load, calculates for each placement control area in each placement pattern a server power consumption that is the sum of the power consumption of each CPU server, each GPU server, and each accelerator; and a placement pattern determination unit that, for each placement pattern, sums the server power consumption of the placement control areas, calculates the sum of the resulting total server power consumption and the extracted air conditioning power consumption, and determines the placement pattern that minimizes the calculated sum as the placement pattern for placing the processing load.
According to the present invention, in an environment where CPU servers, GPU servers, accelerators, and the like coexist, the total power consumption, consisting of server power consumption and air conditioning power consumption, can be reduced.
FIG. 1 is a diagram comparing GPU temperatures at inlet temperatures of 20°C and 33°C.
FIG. 2 is a diagram comparing the power consumption of the GPU card at inlet temperatures of 20°C and 33°C.
FIG. 3 is a diagram comparing the fan rotation rate of the GPU server at inlet temperatures of 20°C and 33°C.
FIG. 4 is a diagram showing the overall configuration of a power consumption reduction control system including the power consumption reduction control device according to the present embodiment.
FIG. 5 is a functional block diagram showing a configuration example of the power consumption reduction control device according to the present embodiment.
FIG. 6 is a diagram for explaining the Situation classification according to the present embodiment.
FIG. 7 is a diagram for explaining the temperature distribution information according to the present embodiment.
FIG. 8 is a flowchart showing the flow of the operation history information generation process executed by the power consumption reduction control device according to the present embodiment.
FIG. 9 is a flowchart showing the flow of the placement pattern determination process executed by the power consumption reduction control device according to the present embodiment.
FIG. 10 is a hardware configuration diagram showing an example of a computer that implements the functions of the power consumption reduction control device according to the present embodiment.
Next, a mode for carrying out the present invention (hereinafter referred to as "this embodiment") will be described.
FIG. 4 is a diagram showing the overall configuration of the power consumption reduction control system 1 including the power consumption reduction control device 100 according to this embodiment.
As shown in FIG. 4, the power consumption reduction control system 1 comprises the power consumption reduction control device 100 and a data center (DC 10) that has a plurality of air conditioners 2 and a plurality of CPU servers 3, GPU servers 4, and accelerators 5 accommodated in predetermined control areas (the "placement control areas" described later). The CPU servers 3, GPU servers 4, and accelerators 5 accommodated in the predetermined control areas (placement control areas) may hereinafter be referred to as a "server group" or collectively as "servers." In FIG. 4 and in FIG. 7 described later, a CPU server 3 is drawn as an unfilled hexagon, a GPU server 4 as a hatched hexagon, and an accelerator 5 as a dotted hexagon. The server group is not limited to the case where it includes at least one each of the CPU server 3, the GPU server 4, and the accelerator 5. It also covers the case where there are no GPU servers 4 and the group consists of CPU servers 3 and accelerators 5, and the case where there are no accelerators 5 and the group consists of CPU servers 3 and GPU servers 4.
Note that the power consumption reduction control device 100 may be provided inside the DC 10, or it may be provided at a location separate from the DC 10 and control a plurality of DCs 10.
The power consumption reduction control device 100 may acquire status information from, and transmit air conditioning control information to, the air conditioners 2 provided in the DC 10 (air conditioners "1", "2", and "3" in FIG. 4) via an air conditioning management device (not shown), or it may be communicatively connected to each air conditioner 2 directly, without going through an air conditioning management device.
Likewise, the power consumption reduction control device 100 may acquire status information from, and transmit control information to, the CPU servers 3, GPU servers 4, and accelerators 5 accommodated as the server group in the DC 10 via a server management device (not shown), or it may be communicatively connected to each CPU server 3, GPU server 4, and accelerator 5 directly.
In the DC 10 of this embodiment, all of the accommodated servers (CPU servers 3, GPU servers 4, and accelerators 5) are partitioned, as shown in FIG. 4, by the area in which the CPU servers 3, GPU servers 4, and accelerators 5 are placed, and each partition is controlled as a "placement control area." A placement control area 30 is an area that accommodates a cohesive group of servers on which processing loads (virtual resources, GPU or accelerator processing, and so on) are placed. FIG. 4 shows an example in which placement control areas "1" to "6" are provided.
The description below assumes that, in the DC 10, a virtualization platform is built on the CPU servers 3 and operated using containers and VMs. Known open source virtualization platforms include OpenStack (registered trademark), software for building cloud environments, and Kubernetes (registered trademark), software for operating and managing containerized workloads and services. OpenStack is used mainly for managing and operating physical machines and virtual machines (VMs). Kubernetes is used mainly for managing and operating containers.
In this specification, an application virtualized on a virtualization platform (consisting of one or more containers, one or more VMs, and so on) is referred to as a virtual resource. In Kubernetes, the minimum execution unit of an application is a Pod, which consists of one or more containers.
In this embodiment, "air conditioning control areas" are provided in association with the placement control areas 30 of the server group, as shown in FIG. 4. An air conditioning control area 20 is a cohesive area in which the effect of air conditioning control on room temperature is measured, and it faces either the inlet side or the outlet side of each server (CPU server 3, GPU server 4, accelerator 5).
Air blown from the air conditioners 2 passes through, for example, piping installed under the floor of the DC 10 and is blown out from the air conditioning control areas 20 on the inlet side (air conditioning control areas "3", "4", "7", and "8" in FIG. 4). Air whose temperature has been raised by the heat of the servers is then taken in through the intake ports of piping provided in the air conditioning control areas 20 on the outlet side (air conditioning control areas "1", "2", "5", and "6" in FIG. 4), producing an airflow that returns to the air conditioners 2.
A plurality of sensors (temperature sensors and the like) are installed in each air conditioning control area 20. Temperature sensors are also installed at the inlets of the GPU servers 4 and accelerators 5 in each placement control area 30. In addition, sensors (temperature sensors and the like) are installed outside the DC 10. The power consumption reduction control device 100 can acquire the information obtained from these sensors (sensor information) via a communication line or the like.
The power consumption reduction control device 100 according to this embodiment predicts the amount of heat generated in each placement control area 30 (the "predicted heat generation amount of the placement control area" described later) for each placement pattern in which loads are placed on the server resources (in this embodiment, the CPU servers 3, GPU servers 4, and accelerators 5). The prediction is based on creation/deletion schedule information for the virtual resources that impose processing load on the CPU servers 3 and for the loads on the GPUs/accelerators. Here, "/" means "and/or." For each situation (the "Situation" described later) defined by the temperature outside the DC 10, the floor temperature, and the predicted heat generation amount of each placement control area 30, the power consumption reduction control device 100 provides air conditioning control values for the air conditioners 2 in multiple stages and retains the temperature distribution information and air conditioning power consumption information obtained when control is performed at each stage. Then, based on the temperature distribution information and the like obtained at each stage, the power consumption reduction control device 100 calculates the server power consumption of each placement control area 30 (the "total server power consumption" described later) and determines the placement pattern that minimizes the sum of that server power consumption and the air conditioning power consumption (details are given later). The power consumption reduction control device 100 is described in detail below.
<Power consumption reduction control device>
FIG. 5 is a functional block diagram showing a configuration example of the power consumption reduction control device 100 according to this embodiment.
The power consumption reduction control device 100 predicts the amount of heat generated (predicted heat generation amount) in each placement control area 30 for each placement pattern of loads on the server resources (CPU servers 3, GPU servers 4, accelerators 5), and acquires the temperature distribution information 64 and the air conditioning power consumption information 65 obtained when the air conditioners 2 are controlled in that situation (Situation). The power consumption reduction control device 100 then uses learning models to calculate the server power consumption of each CPU server 3, GPU server 4, and accelerator 5, and sums these to obtain the total power consumption of the CPU servers 3, GPU servers 4, and accelerators 5 in each placement control area 30. The power consumption reduction control device 100 adds up these per-area totals to obtain the overall server power consumption (the "total server power consumption" described later), calculates the sum of this value and the air conditioning power consumption, determines the placement pattern for which that sum is smallest, and executes load placement and air conditioning control accordingly.
The power consumption reduction control device 100 is constituted by a computer including a control unit, an input/output unit, and a storage unit (none of which are shown).
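The selection step described above can be sketched as follows. This is a minimal illustration, not the device's actual implementation: the class `PatternEstimate`, its field names, and the numbers in the example are all hypothetical, and both quantities are assumed to be expressed in the same energy unit.

```python
from dataclasses import dataclass


@dataclass
class PatternEstimate:
    """Predicted consumption for one candidate placement pattern (hypothetical type)."""
    name: str
    server_energy_by_area: dict[int, float]  # placement control area -> server consumption
    aircon_energy: float                     # air conditioning consumption from the history


def total_energy(est: PatternEstimate) -> float:
    # Total server consumption (sum over placement control areas) plus air conditioning.
    return sum(est.server_energy_by_area.values()) + est.aircon_energy


def choose_pattern(estimates: list[PatternEstimate]) -> PatternEstimate:
    # Pick the placement pattern whose combined consumption is smallest.
    return min(estimates, key=total_energy)


# Hypothetical example values, for illustration only.
candidates = [
    PatternEstimate("pattern-A", {1: 10.0, 2: 5.0}, aircon_energy=8.0),
    PatternEstimate("pattern-B", {1: 7.0, 2: 7.0}, aircon_energy=10.0),
]
best = choose_pattern(candidates)
```

Note that a pattern with the lowest server consumption is not necessarily chosen: in the example, pattern-B has lower server consumption but a higher combined total once air conditioning is included.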
The input/output unit inputs and outputs information exchanged with the devices in the DC 10 (the air conditioners 2 and the servers (CPU servers 3, GPU servers 4, accelerators 5)) and the like. The input/output unit consists of a communication interface that sends and receives information via a communication line, and an input/output interface that exchanges information with input devices such as a keyboard and output devices such as a monitor (neither shown).
The storage unit is constituted by a hard disk, flash memory, RAM (Random Access Memory), or the like.
The storage unit temporarily stores programs for executing the functions of the control unit and information needed for the processing of the control unit. The storage unit also stores operation history information 201, which records, for each Situation in each placement pattern, the control values of the air conditioners 2 (air conditioning control value information 63) together with the resulting temperature distribution information 64, air conditioning power consumption information 65, and so on. In addition, the storage unit stores basic power consumption information 301 for calculating the predicted heat generation amount of each server (CPU server 3, GPU server 4, accelerator 5), a CPU server power learning model 302 for predicting the power consumption of the CPU servers 3, a GPU server power learning model 303 for predicting the power consumption of the GPU servers 4, and an accelerator power learning model 304 for predicting the power consumption of the accelerators 5 (details are given later).
The control unit is responsible for the overall processing executed by the power consumption reduction control device 100 and, as shown in FIG. 5, includes an air conditioning control unit 200 and a server control unit 300.
≪Air conditioning control unit≫
The air conditioning control unit 200 treats the average floor temperature in the DC 10 before control (floor average temperature), the outside temperature (outside air temperature), and the predicted heat generation amount of each placement control area 30 as the components of a Situation. For each air conditioning control stage of each Situation (the air conditioning control values of each air conditioner 2 are set in stages), it acquires the temperature distribution information 64 in each control turn and calculates the air conditioning power consumption information 65, thereby generating the operation history information 201. The phase in which this operation history information 201 is generated is called the learning phase. In the operation phase, in which load placement and air conditioning control are actually executed, the air conditioning control unit 200 receives the predicted heat generation amount of each placement control area 30 from the server control unit 300, extracts the operation history information 201 corresponding to the applicable Situation (Situation classification 62), and outputs the temperature distribution information 64 and the air conditioning power consumption information 65 to the server control unit 300. The air conditioning control unit 200 then causes each air conditioner 2 to execute the air conditioning control for the placement pattern determined by the server control unit 300.
The air conditioning control unit 200 includes a situation recognition unit 210, an operation history information generation unit 220, an operation history information extraction unit 230, and an air conditioning control execution unit 240.
The situation recognition unit 210 acquires information on the external factors, which are the parameter elements that make up a Situation. The situation recognition unit 210 divides each external factor into a plurality of ranges, treats each combination of ranges as one Situation, and, from the acquired external factor information, determines the Situation classification 62 indicating which Situation applies.
The situation recognition unit 210 includes an external factor acquisition unit 211 and a Situation determination unit 212.
The external factor acquisition unit 211 acquires information on the measured external factors. Here, an external factor is an element that affects whether air conditioning power consumption increases or decreases, and it is a parameter element of the Situation classification 62. In this embodiment, the external factors are (1) the floor average temperature in the DC 10 before control, (2) the outside temperature (outside air temperature), and (3) the predicted heat generation amount of each placement control area 30.
(1) The external factor acquisition unit 211 calculates the floor average temperature in the DC 10 before control as follows. For each air conditioning control area 20, it averages the temperatures acquired from the temperature sensors in that area to obtain the average temperature of the area. It then averages these per-area average temperatures over the entire floor and uses the result as the floor average temperature.
(2) The outside temperature is information obtained from a temperature sensor installed outside the DC 10.
(3) The predicted heat generation amount of each placement control area 30 is information calculated by the server control unit 300 (details are given later).
When the external factor acquisition unit 211 has acquired this external factor information, it outputs the information to the Situation determination unit 212.
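The two-step averaging in item (1) can be sketched as below. The function name and the input layout (a mapping from air conditioning control area to its sensor readings) are assumptions made for illustration.

```python
from statistics import mean


def floor_average_temperature(readings_by_area: dict[int, list[float]]) -> float:
    """Average the sensors within each air conditioning control area,
    then average those per-area means over the whole floor."""
    area_means = [mean(temps) for temps in readings_by_area.values()]
    return mean(area_means)


# Hypothetical readings: area 1 has two sensors, area 2 has one.
example = {1: [20.0, 22.0], 2: [30.0]}
floor_avg = floor_average_temperature(example)
```

Because this is a mean of per-area means, each area carries equal weight regardless of how many sensors it contains. With the example readings the result differs from the plain mean over all three sensors.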
The Situation determination unit 212 determines which Situation classification 62 the information acquired by the external factor acquisition unit 211 belongs to.
Each external factor is divided into a plurality of ranges between its minimum and maximum values according to the characteristics of that factor. Each combination of these ranges across the external factors is defined as one Situation. This is explained below with reference to FIG. 6.
As shown in FIG. 6, each external factor is treated as a "factor," and the ranges into which it is divided are defined (hereinafter, the "division definition").
For example, in the Situation classification 62 of FIG. 6, the external factor "factor1" is the "floor average temperature," and its division definition is "0 to 48 degrees divided into 6." The external factor "factor2" is the "outside temperature," and its division definition is "0 to 48 degrees divided into 6." The external factor "factor3" is the "predicted heat generation amount of placement control area 1," and its division definition is "0 to 200 W divided into 20." Continuing in the same way, the external factor "factor8" is the "predicted heat generation amount of placement control area 6," and its division definition is "0 to 200 W divided into 20."
Suppose that the external factor information acquired by the Situation determination unit 212 is the external factor information 61 shown in FIG. 6. In this case, because the value of "factor1" (floor average temperature) is "25," the Situation determination unit 212 determines that it falls in the "24-32 range" (24 degrees or more and less than 32 degrees) and sets the "factor range identifier" to "factor1-4." The factor range identifier identifies the range to which a value belongs. For example, with 0 to 48 degrees divided into 6, values of 0 degrees or more and less than 8 degrees are "factor1-1," values of 8 degrees or more and less than 16 degrees are "factor1-2," and values of 16 degrees or more and less than 24 degrees are "factor1-3." The same applies to the other factors.
The Situation determination unit 212 combines the factor range identifiers of the external factors into a "Situation classification" and determines that it is "factor1-4_factor2-4_factor3-4_factor4-4_factor5-5_factor6-5_factor7-4_factor8-4".
In this way, the Situation determination unit 212 determines the "Situation classification" from the acquired external factor information.
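The range lookup and identifier concatenation described above can be sketched as follows. The division definitions (0 to 48 degrees in 6 ranges, 0 to 200 W in 20 ranges) and the value 25 for factor1 come from the text; the other input values are hypothetical, chosen only so that each falls in the range named in the text, and clamping out-of-range values to the first or last range is an assumption.

```python
def range_index(value: float, lo: float, hi: float, divisions: int) -> int:
    """1-based index of the equal-width range [lo, hi) that contains value.
    Out-of-range values are clamped to the first/last range (an assumption)."""
    width = (hi - lo) / divisions
    idx = int((value - lo) // width) + 1
    return max(1, min(divisions, idx))


def situation_classification(values: list[float],
                             definitions: list[tuple[float, float, int]]) -> str:
    # Build "factor<i>-<range>" for each factor and join them with underscores.
    parts = [
        f"factor{i}-{range_index(v, lo, hi, n)}"
        for i, (v, (lo, hi, n)) in enumerate(zip(values, definitions), start=1)
    ]
    return "_".join(parts)


# factor1, factor2: 0-48 degrees in 6 ranges; factor3..factor8: 0-200 W in 20 ranges.
DEFINITIONS = [(0, 48, 6), (0, 48, 6)] + [(0, 200, 20)] * 6

# Only factor1 = 25 is taken from the text; the remaining values are hypothetical.
observed = [25, 26, 35, 38, 42, 45, 31, 33]
classification = situation_classification(observed, DEFINITIONS)
```

With these inputs the sketch yields the same classification string as the example in the text, which gives a quick check that the half-open ranges (lower bound inclusive, upper bound exclusive) are handled as described.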
Returning to FIG. 5, the operation history information generation unit 220 generates the operation history information 201 from the temperature distribution information 64 and the air conditioning power consumption information 65 that result from controlling the air conditioners 2 with air conditioning control values that divide the control of the air conditioners 2 into multiple stages in each Situation.
The operation history information generation unit 220 includes an air conditioning control value generation unit 221, a reward calculation unit 222, and an operation history creation unit 223.
The air conditioning control value generation unit 221 generates, for each Situation, air conditioning control values that divide the control of the air conditioners 2 into multiple stages.
Specifically, for the control parameters that can be changed on each air conditioner 2 (for example, set temperature (target temperature) and air volume), the air conditioning control value generation unit 221 divides each parameter into M stages between its upper limit and its lower limit. It then combines the parameter stages to generate the air conditioning control value information 63 to be applied to a (single) air conditioner 2.
Based on the generated air conditioning control value information 63, the air conditioning control execution unit 240 executes air conditioning control in a plurality of patterns: for example, controlling the air conditioners in the order "1", "2", "3"; controlling them in the pairs "1" and "2", "2" and "3", or "1" and "3"; or controlling air conditioners "1", "2", and "3" simultaneously.
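A rough sketch of the M-stage control value generation and the air conditioner groupings described above is shown below. The parameter names, the limits, and the choice to space the stages evenly with both limits included are assumptions; the text only states that each parameter is divided into M stages between its upper and lower limits.

```python
from itertools import combinations, product


def stages(lower: float, upper: float, m: int) -> list[float]:
    """M evenly spaced values from the upper limit down to the lower limit.
    Even spacing with both limits included is an assumption."""
    if m == 1:
        return [upper]
    step = (upper - lower) / (m - 1)
    return [upper - i * step for i in range(m)]


def control_values(limits: dict[str, tuple[float, float]], m: int) -> list[dict[str, float]]:
    # Every combination of one stage per parameter is one candidate control value.
    names = list(limits)
    grids = [stages(lo, hi, m) for lo, hi in (limits[n] for n in names)]
    return [dict(zip(names, combo)) for combo in product(*grids)]


def aircon_groups(ids: list[str]) -> list[tuple[str, ...]]:
    # Singles, pairs, ..., up to all air conditioners controlled together.
    return [g for r in range(1, len(ids) + 1) for g in combinations(ids, r)]


# Hypothetical limits: (lower, upper) for each adjustable parameter.
LIMITS = {"target_temp_c": (20.0, 28.0), "air_volume_pct": (40.0, 100.0)}
candidates = control_values(LIMITS, m=3)
groups = aircon_groups(["1", "2", "3"])
```

With P parameters and M stages there are M**P candidate control values per air conditioner, so the number of control turns in the learning phase grows quickly as M or the number of parameters increases.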
The reward calculation unit 222 calculates a reward (temperature reward) as an index for evaluating the result of executing control with the air conditioning control values generated by the air conditioning control value generation unit 221. The reward calculation unit 222 then determines whether the control result achieves a predetermined reward, that is, whether the air conditioning control value satisfies a predetermined condition.
The reward calculation unit 222 defines two types of reward for each air conditioning control area 20, a high temperature warning reward and a low temperature warning reward, and calculates the reward for the control result of each turn.
The high temperature warning reward applies when the temperature before control is higher than the target temperature, that is, when the room temperature is high and control acts to lower it. The low temperature warning reward applies when the temperature before control is lower than the target temperature, that is, when the room temperature is too low and control acts to raise it.
When calculating a reward, the reward calculation unit 222 uses as its index the difference between the "target temperature of the turn" and the "temperature after turn control," that is, the deviation of the current temperature from the target temperature.
For example, for the high temperature warning reward, the reward is "100%" if the "temperature after turn control" is at or below the "target temperature of the turn." The reward is reduced by "10%" for every 1 degree by which the "temperature after turn control" exceeds the "target temperature of the turn." These reward values are not limiting and can be set arbitrarily.
For the low temperature warning reward, on the other hand, the reward is "100%" if the "temperature after turn control" is at or above the "target temperature of the turn." The reward is reduced by "10%" for every 1 degree by which the "temperature after turn control" falls below the "target temperature of the turn." These reward values are not limiting and can be set arbitrarily.
A reward calculated by the reward calculation unit 222 that is at or above a predetermined threshold (%) satisfies the predetermined condition and passes; a reward below the predetermined threshold (%) fails.
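The two rewards and the pass/fail test can be sketched as follows, using the example figures from the text (100% at or within the target, minus 10% per degree of deviation). Applying the penalty linearly to fractional degrees, flooring the reward at 0%, and the threshold of 70% are assumptions made for illustration; the text notes that the reward values can be set arbitrarily.

```python
def high_temp_reward(target_c: float, after_c: float, penalty_per_degree: float = 10.0) -> float:
    """Reward (%) when control is lowering the temperature toward the target.
    100% at or below target; reduced per degree above it (linear, floored at 0: assumptions)."""
    excess = max(0.0, after_c - target_c)
    return max(0.0, 100.0 - penalty_per_degree * excess)


def low_temp_reward(target_c: float, after_c: float, penalty_per_degree: float = 10.0) -> float:
    """Reward (%) when control is raising the temperature toward the target.
    100% at or above target; reduced per degree below it."""
    shortfall = max(0.0, target_c - after_c)
    return max(0.0, 100.0 - penalty_per_degree * shortfall)


def passes(reward_pct: float, threshold_pct: float = 70.0) -> bool:
    # The threshold value is hypothetical; the text only says "a predetermined threshold (%)".
    return reward_pct >= threshold_pct


# Target of 25 degrees; after one control turn the area is at 27 degrees.
reward = high_temp_reward(target_c=25.0, after_c=27.0)
ok = passes(reward)
```

Each reward is one-sided: the high temperature warning reward does not penalize overshooting below the target, so a control value can score 100% while cooling more than necessary.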
The reward calculation unit 222 may determine a final pass only when both the high temperature warning reward and the low temperature warning reward pass. For example, when the initial temperature before the air conditioners 2 are controlled is higher than the target temperature and control is performed with air conditioning control values based on the high temperature warning reward, the predetermined pass threshold may be exceeded, yet the control may overshoot the target temperature and make the temperature too low. In that case, air conditioning power is consumed excessively. Therefore, when the temperature before control is lower than the target temperature, control is performed based on the low temperature warning reward until a pass is determined. By requiring a pass on both the high temperature warning reward and the low temperature warning reward in this way, the reward calculation unit 222 makes it possible to select air conditioning control values that keep control within an appropriate range.
In addition, the reward calculation unit 222 rejects an air conditioning control value when the average temperature of the sensors in each air conditioning control area 20 after turn control (floor average temperature) falls outside a prescribed range (for example, 5°C to 35°C). For the GPU server 4, the value is rejected when the GPU temperature (GPU card temperature) does not fall within a prescribed range set in advance from the relationship between the GPU temperature and the power consumption of the GPU server. For the accelerator 5, the value is rejected when the accelerator temperature (the temperature inside the accelerator) does not fall within a prescribed range set in advance from the relationship between the accelerator temperature and power consumption. This processing avoids needlessly storing operation history that would never be adopted when the DC 10 actually processes loads, because it results in temperatures unsuitable for managing the equipment in the DC 10.
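As an illustration only, the pass/fail judgment described above can be sketched in Python as follows. The function names, the 80% pass threshold, and the reading that 10 points are deducted from 100% for each degree below the target are assumptions made for this sketch and are not specified in the embodiment. Only the low temperature warning reward and the floor average temperature check are shown.

```python
def low_temp_warning_reward(temp_after: float, target: float) -> float:
    """Reward (in percent) for the low temperature warning case."""
    if temp_after >= target:
        return 100.0
    # Assumed reading: deduct 10 points per degree below the target.
    return 100.0 - 10.0 * (target - temp_after)


def is_pass(reward: float, floor_avg_temp: float,
            pass_threshold: float = 80.0,   # hypothetical threshold (%)
            floor_min: float = 5.0,         # example range from the text
            floor_max: float = 35.0) -> bool:
    """Reject out-of-range floor temperatures, then apply the threshold."""
    if not (floor_min <= floor_avg_temp <= floor_max):
        return False
    return reward >= pass_threshold


reward = low_temp_warning_reward(temp_after=22.0, target=24.0)
print(reward, is_pass(reward, floor_avg_temp=22.0))
```

A control value that fails either check would simply not be recorded in the operation history.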
The operation history creation unit 223 acquires temperature distribution information 64 and air conditioning power consumption information 65 as the result of the air conditioning control execution unit 240 controlling each air conditioner 2 based on the air conditioning control value information 63 generated by the air conditioning control value generation unit 221 in each Situation.
That is, for each Situation, the operation history creation unit 223 causes the air conditioning control execution unit 240 to execute control with the air conditioning control value of each combination pattern, generated by the air conditioning control value generation unit 221 by dividing each parameter into M stages, and acquires the temperature distribution information 64 and the air conditioning power consumption information 65 obtained when that air conditioning control value is executed.
Note that the operation history creation unit 223 excludes from the operation history information 201 any air conditioning control value, and its control result, that the reward calculation unit 222 has rejected for failing to satisfy the predetermined condition.
FIG. 7 is a diagram for explaining the temperature distribution information 64 according to this embodiment. The temperature distribution information 64 is temperature information measured over the time course of a control turn by the temperature sensors 44 provided on the inlet side of the GPU servers 4 and the temperature sensors 55 provided on the inlet side of the accelerators 5, as shown in FIG. 7. The identification information of the temperature sensors 44 and 55 is associated in advance with the identification information of the GPU servers 4 and the accelerators 5 before temperature measurement is performed. From the start to the end of the turn, all the temperature sensors 44 and 55 on the floor measure the temperature at each time step (at predetermined time intervals), and the results are generated as the temperature distribution information 64.
In the following, the temperature measured by the temperature sensor 44 provided on the inlet side of the GPU server 4 is referred to as the "GPU inlet temperature", and the temperature measured by the temperature sensor 55 provided on the inlet side of the accelerator 5 is referred to as the "accelerator inlet temperature".
The air conditioning power consumption information 65 is the total power consumption of the air conditioners 2, measured by power consumption measuring means (not shown) that monitors the air conditioners 2, during the turn in which the air conditioning control execution unit 240 controlled each air conditioner 2 based on the air conditioning control value information 63. For example, the power consumption of each air conditioner 2 is measured at each time step (at predetermined time intervals), and the sum of the power consumption measured during the turn is calculated as the air conditioning power consumption information 65.
The operation history creation unit 223 creates operation history information 201 in which the Situation (Situation classification 62) and the air conditioning control value information 63 in effect when the air conditioning control execution unit 240 executed the air conditioning control are associated with the temperature distribution information 64 and the air conditioning power consumption information 65 obtained as the control result, and stores it in a storage unit (not shown).
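One possible in-memory shape for a single entry of the operation history information 201 is sketched below. The class name, field names, key formats, and the use of per-interval samples are hypothetical choices made for illustration; the embodiment only states which pieces of information are associated with each other.

```python
from dataclasses import dataclass


@dataclass
class OperationHistoryRecord:
    situation: str                                     # Situation classification 62
    control_values: dict[str, float]                   # air conditioning control value information 63
    temperature_distribution: dict[str, list[float]]   # temperature distribution information 64
    ac_energy_total: float                             # air conditioning power consumption information 65


def build_record(situation, control_values, sensor_series, ac_interval_energy):
    """Sum the per-interval readings of every air conditioner over the turn."""
    total = sum(sum(samples) for samples in ac_interval_energy.values())
    return OperationHistoryRecord(situation, control_values, sensor_series, total)


record = build_record(
    "S1",
    {"ac-1.setpoint": 24.0},
    {"sensor-44-01": [22.0, 21.5]},
    {"ac-1": [1.0, 1.5], "ac-2": [0.5, 0.5]},
)
print(record.ac_energy_total)
```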
In the operation phase, the operation history information extraction unit 230 acquires the predicted heat generation amount of each placement control area 30 from the server control unit 300 (area heat generation amount estimating unit 320). The operation history information extraction unit 230 then determines, via the situation recognition unit 210, the Situation classification 62 at the start of the control turn.
The operation history information extraction unit 230 then extracts from the operation history information 201 the temperature distribution information 64 and the air conditioning power consumption information 65 that resulted from control with each piece of air conditioning control value information 63 in the determined Situation classification 62, and outputs the extracted information to the server control unit 300 (server power consumption prediction unit 330, placement pattern determination unit 340).
In the learning phase, the air conditioning control execution unit 240 controls the air conditioners 2 in the plurality of patterns described above for each Situation, using the air conditioning control values generated by the air conditioning control value generation unit 221.
In the operation phase, the air conditioning control execution unit 240 executes the air conditioning control of each air conditioner 2 for the optimal placement pattern determined by the server control unit 300.
≪Server control unit≫
The server control unit 300 calculates the heat generation amount of each placement control area 30 (the predicted heat generation amount of the placement control area 30) for each placement pattern in which loads are placed on the respective server resources (CPU server 3, GPU server 4, accelerator 5), based on the generation/deletion schedule information of the virtual resources that become processing loads on the CPUs and of the loads on the GPUs/accelerators. Then, based on information such as the temperature distribution information 64 acquired from the air conditioning control unit 200 for air conditioning control at each stage of each Situation, the server control unit 300 calculates the total server power consumption obtained by summing the server power consumption of the placement control areas 30. The server control unit 300 calculates the sum of the total server power consumption and the air conditioning power consumption, and determines the placement pattern that minimizes this sum.
The server control unit 300 includes a placement pattern calculation unit 310, an area heat generation amount estimating unit 320, a server power consumption prediction unit 330, and a placement pattern determination unit 340.
At the start of each control turn, the placement pattern calculation unit 310 acquires the generation/deletion schedule of the virtual resources that become processing loads on the CPU servers 3 and of the loads on the GPUs/accelerators (hereinafter referred to as "load processing schedule information").
Based on the latest resource usage status (for example, the usage rates of the CPUs, GPUs, and accelerators), the placement pattern calculation unit 310 then calculates placement patterns in which the new loads are placed on the server resources (CPU server 3, GPU server 4, accelerator 5).
Note that the placement pattern calculation unit 310 ensures that, after the loads are placed on the server resources (CPU server 3, GPU server 4, accelerator 5), the resource occupancy of each server resource is no more than the load capacity (upper limit) × a predetermined threshold.
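The occupancy constraint above can be expressed as a simple check, sketched here in Python. The function name, the resource units, and the threshold value of 0.8 are assumptions for illustration; the embodiment only says that a predetermined threshold is applied to the load capacity.

```python
def fits(current_usage: float, new_load: float,
         capacity: float, threshold: float = 0.8) -> bool:
    """True if occupancy after placement stays within capacity x threshold."""
    return current_usage + new_load <= capacity * threshold


# A server with capacity 100 that already uses 50 units (hypothetical numbers).
print(fits(current_usage=50.0, new_load=20.0, capacity=100.0))
print(fits(current_usage=50.0, new_load=40.0, capacity=100.0))
```

A candidate placement pattern would be kept only if this check holds for every server resource it touches.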
For each placement pattern calculated by the placement pattern calculation unit 310, the area heat generation amount estimating unit 320 predicts the power consumption of each server resource (CPU server 3, GPU server 4, accelerator 5) by referring to the basic power consumption information 301. The area heat generation amount estimating unit 320 then calculates the total predicted heat generation amount of each placement control area 30 in each placement pattern, based on the server arrangement of each placement control area 30.
The basic power consumption information 301 is a reference power consumption for a normal state, for example at a predetermined temperature (18°C), and does not take into account changes in server power consumption caused by changes in temperature at, for example, the inlet of each server resource.
Specifically, suppose for example that the load processing schedule information causes placement control area "1" to execute 12 Pods of CPU processing "a", 5 instances of GPU processing "b", and 10 instances of FPGA processing "c". Suppose also that the basic power consumption information 301 gives "100 W" per Pod per server for the CPU, "5 kW" per instance per server for the GPU, and "20 W" per instance per server for the FPGA. In this case, the power consumption in placement control area "1" is "1.2 kW" for the CPU servers 3, "5 kW" for the GPU servers 4, and "0.2 kW" for the FPGAs.
The area heat generation amount estimating unit 320 then obtains the heat generation amount W of the placement control area 30 in question using Equation (1) below.
Heat generation amount W of the placement control area = (CPU server power consumption of the placement control area) × kc + (GPU server power consumption of the placement control area) × kg + (accelerator power consumption of the placement control area) × ka ... Equation (1)
Here, kc, kg, and ka are coefficients for the CPU server 3, the GPU server 4, and the accelerator 5, respectively, obtained by measuring in advance the load each places on air conditioning cooling in each room of the DC 10 while operating.
Using Equation (1), the area heat generation amount estimating unit 320 sums the heat generation of the CPU servers 3, the GPU servers 4, and the accelerators 5 to calculate the heat generation amount of each placement control area 30, which is taken as the predicted heat generation amount of that placement control area 30.
The area heat generation amount estimating unit 320 then outputs the calculated predicted heat generation amount of each placement control area 30 to the air conditioning control unit 200 (operation history information extraction unit 230).
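Equation (1) is a weighted sum and can be sketched in Python as below. The per-area consumption values are the ones stated for placement control area "1" in the example above; the coefficient values for kc, kg, and ka are purely hypothetical, since the embodiment obtains them by prior measurement and gives no numbers.

```python
def area_heat(cpu_kw: float, gpu_kw: float, acc_kw: float,
              kc: float, kg: float, ka: float) -> float:
    """Equation (1): weighted sum of per-area power consumption."""
    return cpu_kw * kc + gpu_kw * kg + acc_kw * ka


# Consumption values from the worked example; coefficients are assumed.
w = area_heat(cpu_kw=1.2, gpu_kw=5.0, acc_kw=0.2, kc=1.0, kg=1.1, ka=0.9)
print(round(w, 2))
```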
At the start of each control turn, the server power consumption prediction unit 330 uses the load processing schedule information for the CPU servers 3, the GPU servers 4, and the accelerators 5 (information on new processing loads) and the temperature distribution information 64 for the placement pattern acquired from the air conditioning control unit 200 (operation history information extraction unit 230) to calculate, for each of the CPU servers 3, the GPU servers 4, and the accelerators 5, the total power consumption of each placement control area 30 (CPU server total power consumption, GPU server total power consumption, and accelerator total power consumption).
For each placement pattern, the server power consumption prediction unit 330 sums the CPU server total power consumption, the GPU server total power consumption, and the accelerator total power consumption of each placement control area 30 to calculate the server power consumption of that placement control area 30.
The server power consumption prediction unit 330 includes a CPU power amount prediction unit 331, a GPU power amount prediction unit 332, and an accelerator power amount prediction unit 333.
At the start of each control turn, the CPU power amount prediction unit 331 acquires information on the amount of virtual resources scheduled to be newly placed according to the load processing schedule information (for example, the number of CPU cores) and the resource usage status of the CPU servers 3 at that time (for example, the CPU usage rate). The CPU power amount prediction unit 331 then predicts the power consumption of each CPU server 3 using the CPU server power amount learning model 302.
More specifically, when control turns are switched, the CPU power amount prediction unit 331 deletes the virtual resources (for example, Pods) whose processing finished in the previous control turn. At the start of the current control turn, the CPU power amount prediction unit 331 acquires the resource usage rates (for example, CPU usage rate and memory usage rate) of each CPU server 3 with the deleted Pods excluded. Based on the load processing schedule information, the CPU power amount prediction unit 331 calculates the predicted resource usage rates for the case where the new Pods are placed on each CPU server 3, and inputs these predicted values to the CPU server power amount learning model 302 to calculate the power consumption of each CPU server 3.
Furthermore, based on the arrangement of the CPU servers 3 in each placement control area 30, the CPU power amount prediction unit 331 calculates, for each placement control area 30, the CPU server total power consumption by summing the power consumption calculated for the CPU servers 3 in that area.
Here, the CPU server power amount learning model 302 is a learning model that takes the resource usage status of a CPU server 3 (for example, the CPU usage rate and memory usage rate) as input and outputs the power consumption of the CPU server 3. The CPU server power amount learning model 302 is created in advance using, as training data, the resource usage status of the CPU servers 3 and the server power consumption observed at that time.
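The CPU-side prediction flow (remove finished Pods, add newly scheduled Pods, predict the usage rate, feed it to the learning model, and sum per area) might look like the following Python sketch. All names and data shapes are hypothetical, only the CPU usage rate is used as a feature, and `toy_model` is a made-up linear stand-in for the CPU server power amount learning model 302, whose actual form the embodiment does not specify.

```python
from collections import defaultdict
from typing import Callable


def predict_cpu_area_totals(
    pods_by_server: dict[str, dict[str, float]],  # server -> {pod: cores used}
    cores_by_server: dict[str, float],            # server -> total cores
    area_by_server: dict[str, str],               # server -> placement control area
    finished: set[str],                           # Pods completed last turn
    new_pods: dict[str, dict[str, float]],        # server -> {pod: cores}
    model: Callable[[float], float],              # CPU usage rate -> consumption
) -> dict[str, float]:
    totals: dict[str, float] = defaultdict(float)
    for server, pods in pods_by_server.items():
        # Drop Pods that finished in the previous control turn.
        remaining = {p: c for p, c in pods.items() if p not in finished}
        # Add the Pods newly scheduled on this server.
        remaining.update(new_pods.get(server, {}))
        usage = sum(remaining.values()) / cores_by_server[server]
        totals[area_by_server[server]] += model(usage)
    return dict(totals)


def toy_model(cpu_usage: float) -> float:
    """Stand-in for learning model 302 (assumed linear, illustrative only)."""
    return 100.0 + 200.0 * cpu_usage


cpu_totals = predict_cpu_area_totals(
    {"s1": {"a": 4.0, "b": 4.0}, "s2": {"c": 2.0}},
    {"s1": 16.0, "s2": 16.0},
    {"s1": "area-1", "s2": "area-1"},
    finished={"b"},
    new_pods={"s2": {"d": 6.0}},
    model=toy_model,
)
print(cpu_totals)
```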
The GPU power amount prediction unit 332 predicts the GPU server power consumption of each GPU server 4 using the GPU server power amount learning model 303, based on the type of processing load newly scheduled for processing as obtained from the load processing schedule information (hereinafter referred to as the "load type"), the GPU inlet temperature, the number of GPU cards, and the like. Furthermore, based on the arrangement of the GPU servers 4 in each placement control area 30, the GPU power amount prediction unit 332 calculates, for each placement control area 30, the GPU server total power consumption by summing the power consumption of the GPU servers 4 in that area.
The load type is the type of load corresponding to the purpose for which the GPU server 4 is run, for example image processing, machine learning processing, network processing, or virtual space processing, and each load type is assumed to be identifiable from the load processing schedule information. It is also assumed that each GPU server 4 executes a single type of application, based on the load processing schedule information.
For the GPU server power amount learning model 303, there are two methods: one that predicts the GPU server power consumption directly (one-stage method), and one that predicts it in two stages via the GPU temperature (GPU card temperature) (two-stage method).
In the one-stage method, a single learning model is used as the GPU server power amount learning model 303.
This GPU server power amount learning model 303 takes the GPU inlet temperature, the load type, and the number of GPU cards as input and outputs the power consumption of the GPU server 4. It is created in advance using, as training data, the GPU inlet temperature, the load type, and the number of GPU cards together with the power consumption of the GPU server 4 observed at that time.
When the one-stage method is adopted, the GPU power amount prediction unit 332 predicts, at the beginning of a turn, the GPU server power consumption of each GPU server 4 to which no processing has been assigned, using the GPU server power amount learning model 303 with the GPU inlet temperature, the load type, and the number of GPU cards.
For the GPU inlet temperature, the current inlet temperature of each GPU server 4 is used at the beginning of the turn; after that, the GPU inlet temperature of each GPU server 4 indicated by the temperature distribution information 64 acquired from the air conditioning control unit 200 is used (that is, temperature distribution information 64 that starts from the same temperature as the current GPU inlet temperature). The same applies to the two-stage method.
In the two-stage method, two learning models (a first GPU learning model 303a and a second GPU learning model 303b) are used as the GPU server power amount learning model 303.
The first GPU learning model 303a takes the GPU inlet temperature, the load type, and the number of GPU cards as input and outputs the GPU temperature. It is created in advance using, as training data, the GPU inlet temperature, the load type, and the number of GPU cards together with the GPU temperature observed at that time.
The second GPU learning model 303b takes the GPU temperature as input and outputs the power consumption of the GPU server 4. It is created in advance using, as training data, the GPU temperature and the power consumption of the GPU server 4 observed at that time.
When the two-stage method is adopted, the GPU power amount prediction unit 332 first predicts, at the beginning of a turn, the GPU temperature of each GPU server 4 to which no processing has been assigned, using the first GPU learning model 303a with the GPU inlet temperature, the load type, and the number of GPU cards. It then predicts the GPU server power consumption of each GPU server 4 from the predicted GPU temperature, using the second GPU learning model 303b.
Based on the arrangement of the GPU servers 4 in each placement control area 30, the GPU power amount prediction unit 332 sums the predicted power consumption of the GPU servers 4 in each placement control area 30 to calculate the GPU server total power consumption of that area.
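The two-stage method chains two models and then aggregates per placement control area. The Python sketch below shows that structure only: the `GpuServer` class, the function names, and both toy models (with their offsets and factors) are invented for illustration and stand in for the first GPU learning model 303a and the second GPU learning model 303b.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GpuServer:
    area: str          # placement control area the server belongs to
    inlet_temp: float  # GPU inlet temperature (deg C)
    load_type: str     # e.g. "ml" or "image"
    gpu_cards: int


def two_stage_area_totals(
    servers: list[GpuServer],
    temp_model: Callable[[float, str, int], float],  # stands in for model 303a
    power_model: Callable[[float], float],           # stands in for model 303b
) -> dict[str, float]:
    totals: dict[str, float] = {}
    for s in servers:
        gpu_temp = temp_model(s.inlet_temp, s.load_type, s.gpu_cards)  # stage 1
        consumption = power_model(gpu_temp)                            # stage 2
        totals[s.area] = totals.get(s.area, 0.0) + consumption
    return totals


# Toy stand-ins with made-up coefficients (assumptions, not learned models).
LOAD_OFFSET = {"ml": 30.0, "image": 20.0}


def toy_temp_model(inlet: float, load_type: str, cards: int) -> float:
    return inlet + LOAD_OFFSET[load_type] + 2.0 * cards


def toy_power_model(gpu_temp: float) -> float:
    return 10.0 * gpu_temp


gpu_totals = two_stage_area_totals(
    [GpuServer("area-1", 20.0, "ml", 4),
     GpuServer("area-1", 22.0, "image", 2),
     GpuServer("area-2", 18.0, "ml", 1)],
    toy_temp_model,
    toy_power_model,
)
print(gpu_totals)
```

The one-stage method would replace the two calls inside the loop with a single model that maps the same three inputs directly to consumption; the accelerator prediction has the same shape with the number of processing circuits in place of the number of GPU cards.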
The accelerator power amount prediction unit 333 predicts the accelerator power consumption of each accelerator 5 using the accelerator power amount learning model 304, based on the type of accelerator processing load newly scheduled for processing as obtained from the load processing schedule information (load type), the accelerator inlet temperature, the number of accelerator processing circuits, and the like. Furthermore, based on the arrangement of the accelerators 5 in each placement control area 30, the accelerator power amount prediction unit 333 calculates, for each placement control area 30, the accelerator total power consumption by summing the power consumption of the accelerators 5 in that area.
The load type is the type of load corresponding to the purpose for which the accelerator 5 is run, for example image processing, machine learning processing, internet processing, or encryption processing, and the load type is assumed to be identifiable from the load processing schedule information. It is also assumed that each accelerator 5 executes a single type of application, based on the load processing schedule.
As with the GPU server power amount learning model 303, there are two methods for the accelerator power amount learning model 304: one that predicts the accelerator power consumption directly (one-stage method), and one that predicts it in two stages via the accelerator temperature (the temperature inside the accelerator) (two-stage method).
In the one-stage method, a single learning model is used as the accelerator power amount learning model 304.
This accelerator power amount learning model 304 takes the accelerator inlet temperature, the load type, and the number of accelerator processing circuits as input and outputs the power consumption of the accelerator 5. It is created in advance using, as training data, the accelerator inlet temperature, the load type, and the number of accelerator processing circuits together with the power consumption of the accelerator 5 observed at that time.
When the one-stage method is adopted, the accelerator power amount prediction unit 333 predicts, at the beginning of a turn, the accelerator power consumption of each accelerator 5 to which no processing has been assigned, using the accelerator power amount learning model 304 with the accelerator inlet temperature, the load type, and the number of accelerator processing circuits.
For the accelerator inlet temperature, the current inlet temperature of each accelerator 5 is used at the beginning of the turn; after that, the accelerator inlet temperature of each accelerator 5 indicated by the temperature distribution information 64 acquired from the air conditioning control unit 200 is used (that is, temperature distribution information 64 that starts from the same temperature as the current accelerator inlet temperature). The same applies to the two-stage method.
In the two-stage method, two learning models (a first accelerator learning model 304a and a second accelerator learning model 304b) are used as the accelerator power amount learning model 304.
The first accelerator learning model 304a takes the accelerator inlet temperature, the load type, and the number of accelerator processing circuits as input and outputs the accelerator temperature. It is created in advance using, as training data, the accelerator inlet temperature, the load type, and the number of accelerator processing circuits together with the accelerator temperature observed at that time.
The second accelerator learning model 304b is a learning model that uses the accelerator temperature as input information and the power consumption of the accelerator 5 as output information. This second accelerator learning model 304b is created in advance using the accelerator temperature and the power consumption of the accelerator 5 at that time as learning data.
When the two-stage method is adopted, the accelerator power amount prediction unit 333 predicts, at the beginning of each turn, the accelerator temperature of each accelerator 5 to which no processing has been assigned, using the first accelerator learning model 304a with the accelerator inlet temperature, the load type, and the number of accelerator processing circuits as inputs. The accelerator power amount prediction unit 333 then predicts the accelerator power consumption of each accelerator 5 from the predicted accelerator temperature, using the second accelerator learning model 304b.
Based on the arrangement configuration of the accelerators 5 in each placement control area 30, the accelerator power amount prediction unit 333 calculates the accelerator total power consumption of each placement control area 30 by summing the predicted power consumption of the accelerators 5 in that area.
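The two-stage prediction and the per-area summation described above can be sketched as follows. This is only an illustration under stated assumptions: the two learning models are represented as plain callables, and the dictionary keys (`area`, `inlet_temp`, `load_type`, `n_circuits`) are hypothetical names that do not appear in the specification.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for the trained models 304a and 304b.
TempModel = Callable[[float, str, int], float]  # (inlet temp, load type, circuits) -> accelerator temp
PowerModel = Callable[[float], float]           # accelerator temp -> power consumption


def predict_accelerator_power(inlet_temp: float, load_type: str, n_circuits: int,
                              model_a: TempModel, model_b: PowerModel) -> float:
    """Stage 1 predicts the accelerator temperature, stage 2 maps it to power consumption."""
    accelerator_temp = model_a(inlet_temp, load_type, n_circuits)
    return model_b(accelerator_temp)


def accelerator_total_by_area(accelerators: List[dict],
                              model_a: TempModel, model_b: PowerModel) -> Dict[str, float]:
    """Sum the predicted power consumption of the accelerators in each placement control area."""
    totals: Dict[str, float] = {}
    for acc in accelerators:
        power = predict_accelerator_power(
            acc["inlet_temp"], acc["load_type"], acc["n_circuits"], model_a, model_b)
        totals[acc["area"]] = totals.get(acc["area"], 0.0) + power
    return totals
```

Any regression model could take the place of the two callables; the sketch only fixes the order in which they are chained.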
Then, for each placement pattern, the server power consumption prediction unit 330 sums the CPU server total power consumption, the GPU server total power consumption, and the accelerator total power consumption of each placement control area 30 to calculate the server power consumption of that placement control area 30.
For each placement pattern, the placement pattern determination unit 340 sums the server power consumption of the placement control areas 30 to calculate the total server power consumption. The placement pattern determination unit 340 then adds the calculated total server power consumption to the air conditioning power consumption for that placement pattern, obtained from the air conditioning control unit 200, and determines the placement pattern for which this sum is smallest.
<Processing flow>
Next, the flow of processing executed by the power amount reduction control device 100 according to this embodiment will be described.
First, the operation history information generation process performed in the learning phase is described with reference to FIG. 8. Then, the placement pattern determination process performed in the operation phase is described with reference to FIG. 9.
≪Operation history information generation process≫
FIG. 8 is a flowchart showing the flow of the operation history information generation process executed by the power amount reduction control device 100 according to the present embodiment.
First, the air conditioning control value generation unit 221 of the air conditioning control unit 200 (operation history information generation unit 220) of the power amount reduction control device 100 generates air conditioning control values, divided into multiple stages, for the control parameters whose settings can be changed in each air conditioner 2 (for example, the set temperature (target temperature) and the air volume) (step S1).
Specifically, the air conditioning control value generation unit 221 divides each parameter into M stages between its upper limit value and its lower limit value, and combines the parameter values of the stages to generate the air conditioning control value information 63 for each air conditioner 2.
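A minimal sketch of step S1 follows. It assumes the M stages are evenly spaced between the upper and lower limits, which the text does not state; the function names and the parameter names in the usage note are likewise illustrative.

```python
from itertools import product
from typing import Dict, List, Tuple


def stage_values(upper: float, lower: float, m: int) -> List[float]:
    """Return m values from upper down to lower, evenly spaced (an assumption)."""
    if m < 2:
        raise ValueError("at least two stages are needed to span upper and lower limits")
    step = (upper - lower) / (m - 1)
    return [upper - i * step for i in range(m)]


def generate_control_values(param_ranges: Dict[str, Tuple[float, float]],
                            m: int) -> List[Dict[str, float]]:
    """Combine the staged values of every parameter into candidate control values."""
    names = list(param_ranges)
    grids = [stage_values(*param_ranges[name], m) for name in names]
    return [dict(zip(names, combo)) for combo in product(*grids)]
```

For example, `generate_control_values({"target_temp": (28.0, 20.0), "air_volume": (3.0, 1.0)}, 3)` yields nine candidate settings for one air conditioner, since each of the two parameters takes three values.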
Then, based on the generated air conditioning control value information 63, the air conditioning control execution unit 240 executes air conditioning control in a plurality of patterns (step S2). For example, for each air conditioning control value, the air conditioning control execution unit 240 executes air conditioning control in patterns such as controlling the air conditioners in the order "1" → "2" → "3", controlling them in the combinations "1" and "2", "2" and "3", and "1" and "3", or controlling air conditioners "1", "2", and "3" simultaneously.
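The control patterns in this example can be enumerated mechanically. The sketch below is one possible reading, in which a pattern is a list of steps and each step names the air conditioners controlled together. This representation is an assumption, not something the specification defines.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def control_patterns(unit_ids: Sequence[int]) -> List[List[Tuple[int, ...]]]:
    """Enumerate control patterns: sequential, every proper combination, and all at once."""
    ids = list(unit_ids)
    patterns = [[(i,) for i in ids]]               # one unit at a time, in order
    for size in range(2, len(ids)):                # combinations of two or more units
        patterns.extend([combo] for combo in combinations(ids, size))
    if len(ids) > 1:
        patterns.append([tuple(ids)])              # all units simultaneously
    return patterns
```

With three air conditioners this gives five patterns: the sequence 1 → 2 → 3, the three pairs, and the simultaneous control of all three.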
Next, in step S3, the operation history information generation unit 220 (reward calculation unit 222) calculates a reward (temperature reward) as an index for evaluating the result of executing air conditioning control with the air conditioning control values generated by the air conditioning control value generation unit 221. The reward calculation unit 222 then determines whether the control result satisfies a predetermined reward, that is, whether the air conditioning control value satisfies a predetermined condition.
The reward calculation unit 222 judges an air conditioning control value as passing when the calculated reward is equal to or greater than a predetermined threshold and predetermined conditions are satisfied, such as the floor average temperature, the GPU temperature, and the accelerator temperature after the control turn being within specified ranges.
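The pass/fail judgment of step S3 can be sketched as below. The reward itself is taken as an already computed number, because its formula is not given in this passage. The argument names and the range table are hypothetical.

```python
from typing import Dict, Tuple


def is_passing(reward: float, reward_threshold: float,
               measured: Dict[str, float],
               allowed_ranges: Dict[str, Tuple[float, float]]) -> bool:
    """Pass only if the reward reaches the threshold and every monitored
    temperature after the control turn lies within its allowed range."""
    if reward < reward_threshold:
        return False
    return all(low <= measured[name] <= high
               for name, (low, high) in allowed_ranges.items())
```

A control value that earns a high reward but leaves, say, the GPU temperature outside its range is still rejected, which matches the "and" in the condition above.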
Subsequently, the operation history information generation unit 220 (operation history creation unit 223) acquires the temperature distribution information 64 and the air conditioning power consumption information 65 as the result of the air conditioning control execution unit 240 controlling each air conditioner 2 with the air conditioning control value information 63 that was generated by the air conditioning control value generation unit 221 in each Situation and judged as passing by the reward calculation unit 222 (step S4).
The temperature distribution information 64 is obtained by measuring, at predetermined time intervals, the GPU inlet temperature measured by the temperature sensor 44 provided on the inlet side of the GPU server 4 and the accelerator inlet temperature measured by the temperature sensor 55 provided on the inlet side of the accelerator 5. The air conditioning power consumption information 65 is the total power consumption of each air conditioner 2 measured in a given control turn.
Then, the operation history creation unit 223 creates the operation history information 201 by associating the Situation (Situation classification 62) and the air conditioning control value information 63 in effect when the air conditioning control execution unit 240 executed the air conditioning control with the temperature distribution information 64 and the air conditioning power consumption information 65 obtained as the control result (step S5), and stores it in the storage unit.
The power amount reduction control device 100 performs this generation process in advance, in the learning phase that precedes the operation phase, so that the operation history information 201 is already available when operation begins.
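One way to picture an entry of the operation history information 201 is the record below. The field names are hypothetical, and the temperature distribution is simplified to a single value per sensor, whereas the text describes measurements taken at predetermined intervals.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class OperationHistoryRecord:
    situation: Tuple[int, ...]                          # Situation classification 62
    control_values: Tuple[Tuple[str, float], ...]       # air conditioning control value information 63
    inlet_temperatures: Tuple[Tuple[str, float], ...]   # temperature distribution information 64
    aircon_power: float                                 # air conditioning power consumption information 65


def add_record(history: Dict[Tuple[int, ...], List[OperationHistoryRecord]],
               situation, control_values: Dict[str, float],
               inlet_temperatures: Dict[str, float],
               aircon_power: float) -> OperationHistoryRecord:
    """Associate a Situation and control values with the measured result and store it."""
    record = OperationHistoryRecord(
        situation=tuple(situation),
        control_values=tuple(sorted(control_values.items())),
        inlet_temperatures=tuple(sorted(inlet_temperatures.items())),
        aircon_power=aircon_power,
    )
    history.setdefault(record.situation, []).append(record)
    return record
```

Keying the store by Situation is a design choice made here so that all results recorded under the same Situation can be retrieved together.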
≪Placement pattern determination process≫
FIG. 9 is a flowchart showing the flow of the placement pattern determination process executed by the power amount reduction control device 100 according to the present embodiment.
First, the server control unit 300 (placement pattern calculation unit 310) of the power amount reduction control device 100 acquires load processing schedule information and, at the start of each control turn, calculates placement patterns for placing new loads on the server resources (CPU server 3, GPU server 4, accelerator 5) based on the most recent resource usage status (for example, the usage rates of the CPUs, GPUs, and accelerators) (step S10).
Next, for each placement pattern calculated by the placement pattern calculation unit 310, the area heat generation amount estimation unit 320 predicts the power consumption of each server resource (CPU server 3, GPU server 4, accelerator 5) by referring to the basic power consumption information 301. Then, based on the server arrangement configuration of each placement control area 30, the area heat generation amount estimation unit 320 calculates the total predicted heat generation amount of each placement control area 30 in each placement pattern (the predicted heat generation amount of the placement control area) (step S11).
The area heat generation amount estimation unit 320 then outputs the calculated predicted heat generation amount of each placement control area 30 to the air conditioning control unit 200 (operation history information extraction unit 230).
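The per-area aggregation in step S11 can be sketched as follows. Two assumptions are made purely for illustration: the basic power consumption information 301 is modeled as a lookup from a resource identifier to its predicted power consumption under load, and the predicted heat generation of a resource is taken to be equal to that power consumption. The passage does not specify either point.

```python
from typing import Dict, Iterable


def predicted_heat_by_area(loaded_resources: Iterable[str],
                           basic_power: Dict[str, float],
                           area_of: Dict[str, str]) -> Dict[str, float]:
    """Sum, per placement control area, the predicted heat of the resources
    that receive load in one placement pattern (heat == power is an assumption)."""
    totals: Dict[str, float] = {}
    for resource in loaded_resources:
        area = area_of[resource]
        totals[area] = totals.get(area, 0.0) + basic_power[resource]
    return totals
```

Running this once per placement pattern gives the set of per-area values that is passed to the air conditioning control unit 200.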
Next, the operation history information extraction unit 230 of the air conditioning control unit 200 acquires the information on the predicted heat generation amount of each placement control area 30 from the server control unit 300 (area heat generation amount estimation unit 320). The operation history information extraction unit 230 then determines the Situation classification 62 at the start of the control turn via the situation recognition unit 210 (step S12).
From the operation history information 201, the operation history information extraction unit 230 extracts the temperature distribution information 64 and the air conditioning power consumption information 65 that resulted from control with each piece of air conditioning control value information 63 in the determined Situation classification 62 (step S13). The operation history information extraction unit 230 outputs the extracted temperature distribution information 64 and air conditioning power consumption information 65 to the server control unit 300.
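Step S13 amounts to filtering the stored history by the Situation determined in step S12. A minimal sketch, assuming each history entry is a dictionary with the hypothetical keys shown:

```python
from typing import Dict, List, Tuple


def extract_results(history: List[Dict], situation: Tuple[int, ...]) -> List[Tuple]:
    """Return (control values, temperature distribution, air conditioning power)
    for every history entry recorded under the given Situation classification."""
    return [
        (entry["control_values"], entry["temperature_distribution"], entry["aircon_power"])
        for entry in history
        if entry["situation"] == situation
    ]
```

Each returned tuple corresponds to one air conditioning control value that was judged as passing during the learning phase.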
Next, the server power consumption prediction unit 330 of the server control unit 300 calculates, for each of the CPU servers 3, GPU servers 4, and accelerators 5, the total power consumption of each placement control area 30 (CPU server total power consumption, GPU server total power consumption, and accelerator total power consumption), based on the load processing schedule information for the CPU servers 3, GPU servers 4, and accelerators 5 and using information such as the temperature distribution information 64 of the placement patterns obtained from the air conditioning control unit 200 (operation history information extraction unit 230) (step S14).
Specifically, at the start of each control turn, the server power consumption prediction unit 330 (CPU power amount prediction unit 331) acquires information on the amount of virtual resources scheduled to be newly placed according to the load processing schedule (for example, the number of CPU cores) and the resource usage status of the CPU servers 3 at that time (for example, the CPU usage rate), and predicts the power consumption of each CPU server 3 using the CPU server power amount learning model 302. Further, based on the arrangement configuration of the CPU servers 3 in each placement control area 30, the CPU power amount prediction unit 331 calculates, for each placement control area 30, the CPU server total power consumption, which is the sum of the power consumption of the CPU servers 3 in that area.
Likewise, at the start of each control turn, the server power consumption prediction unit 330 (GPU power amount prediction unit 332) predicts the GPU server power consumption of each GPU server 4 using the GPU server power amount learning model 303, based on the load type of the new load obtained from the load processing schedule information, the GPU inlet temperature obtained from the temperature distribution information 64, and the number of GPU cards. Further, based on the arrangement configuration of the GPU servers 4 in each placement control area 30, the GPU power amount prediction unit 332 calculates, for each placement control area 30, the GPU server total power consumption, which is the sum of the power consumption of the GPU servers 4 in that area.
Likewise, at the start of each control turn, the server power consumption prediction unit 330 (accelerator power amount prediction unit 333) predicts the accelerator power consumption of each accelerator 5 using the accelerator power amount learning model 304, based on the load type of the new load obtained from the load processing schedule information, the accelerator inlet temperature obtained from the temperature distribution information 64, and the number of accelerator processing circuits. Further, based on the arrangement configuration of the accelerators 5 in each placement control area 30, the accelerator power amount prediction unit 333 calculates, for each placement control area 30, the accelerator total power consumption, which is the sum of the power consumption of the accelerators 5 in that area.
Then, for each placement pattern, the server power consumption prediction unit 330 sums the CPU server total power consumption, the GPU server total power consumption, and the accelerator total power consumption of each placement control area 30 to calculate the server power consumption of that placement control area 30 (step S15).
For each placement pattern, the placement pattern determination unit 340 sums the server power consumption of the placement control areas 30 to calculate the total server power consumption. The placement pattern determination unit 340 then adds the calculated total server power consumption to the air conditioning power consumption for that placement pattern, obtained from the air conditioning control unit 200, and determines the placement pattern for which this sum is smallest (step S16).
In this way, in a data center environment in which CPU servers, GPU servers, accelerators, and the like coexist, the power amount reduction control device 100 can determine the processing load placement pattern and the air conditioning control values that reduce the total power consumption of the data center, which consists of server power consumption and air conditioning power consumption.
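Steps S15 and S16 reduce to two sums and a minimum. The sketch below assumes that the per-area totals from step S14 are available as dictionaries keyed by area name and that each candidate carries one air conditioning power consumption value. The identifiers are illustrative only.

```python
from typing import Dict, Tuple


def server_power_by_area(cpu: Dict[str, float], gpu: Dict[str, float],
                         acc: Dict[str, float]) -> Dict[str, float]:
    """Step S15: add the CPU, GPU, and accelerator totals of each placement control area."""
    areas = set(cpu) | set(gpu) | set(acc)
    return {a: cpu.get(a, 0.0) + gpu.get(a, 0.0) + acc.get(a, 0.0) for a in areas}


def choose_placement_pattern(candidates: Dict[str, Tuple[Dict[str, float], float]]) -> str:
    """Step S16: pick the candidate whose total server power consumption plus
    air conditioning power consumption is smallest."""
    def total(pattern_id: str) -> float:
        per_area, aircon_power = candidates[pattern_id]
        return sum(per_area.values()) + aircon_power
    return min(candidates, key=total)
```

A pattern with higher server consumption can still win if it lets the air conditioners run with a cheaper setting, which is why the two terms are minimized jointly rather than separately.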
<Hardware configuration>
The power amount reduction control device 100 according to the present embodiment is realized, for example, by a computer 900 configured as shown in FIG. 10.
FIG. 10 is a hardware configuration diagram showing an example of the computer 900 that realizes the functions of the power amount reduction control device 100 according to the present embodiment. The computer 900 has a CPU (Central Processing Unit) 901, a ROM (Read Only Memory) 902, a RAM 903, an HDD (Hard Disk Drive) 904, an input/output I/F (Interface) 905, a communication I/F 906, and a media I/F 907.
The CPU 901 operates based on a program stored in the ROM 902 or the HDD 904 and performs the control of the control unit. The ROM 902 stores a boot program executed by the CPU 901 when the computer 900 starts up, programs related to the hardware of the computer 900, and the like.
The CPU 901 controls an input device 910 such as a mouse or keyboard and an output device 911 such as a display or printer via the input/output I/F 905. The CPU 901 acquires data from the input device 910 via the input/output I/F 905 and outputs generated data to the output device 911. Note that a GPU (Graphics Processing Unit) or the like may be used as a processor together with the CPU 901.
The HDD 904 stores the programs executed by the CPU 901, the data used by those programs, and the like. The communication I/F 906 receives data from other devices via a communication network (for example, NW (Network) 920) and outputs it to the CPU 901, and also transmits data generated by the CPU 901 to other devices via the communication network.
The media I/F 907 reads a program or data stored in a recording medium 912 and outputs it to the CPU 901 via the RAM 903. The CPU 901 loads a program for the target processing from the recording medium 912 onto the RAM 903 via the media I/F 907 and executes the loaded program. The recording medium 912 is an optical recording medium such as a DVD (Digital Versatile Disc) or PD (Phase change rewritable Disk), a magneto-optical recording medium such as an MO (Magneto Optical disk), a magnetic recording medium, a semiconductor memory, or the like.
For example, when the computer 900 functions as the power amount reduction control device 100 of the present invention, the CPU 901 of the computer 900 realizes the functions of the power amount reduction control device 100 by executing a program loaded onto the RAM 903. The data in the RAM 903 is stored in the HDD 904. The CPU 901 reads the program for the target processing from the recording medium 912 and executes it. Alternatively, the CPU 901 may read the program for the target processing from another device via the communication network (NW 920).
<Effect>
The effects of the power amount reduction control device 100 and the like according to the present invention are described below.
The power amount reduction control device according to the present invention is a power amount reduction control device 100 that controls CPU servers 3, GPU servers 4, and accelerators 5, as well as a plurality of air conditioners 2, wherein a plurality of placement control areas 30, in each of which any of the CPU servers 3, GPU servers 4, and accelerators 5 are placed, and air conditioning control areas 20, which are areas for measuring the effect of air conditioning control by the plurality of air conditioners 2, are set, and the power amount reduction control device 100 comprises: an air conditioning control value generation unit 221 that generates air conditioning control values, including at least a target temperature, to be set in the plurality of air conditioners 2; an air conditioning control execution unit 240 that causes control of the plurality of air conditioners 2 to be executed using the air conditioning control values; a reward calculation unit 222 that, for a plurality of placement patterns in which processing loads are placed on the CPU servers 3, GPU servers 4, and accelerators 5, calculates a reward that evaluates, with the target temperature as an index, the result of the air conditioning control execution unit 240 controlling the plurality of air conditioners 2 with the air conditioning control values, and determines whether the reward satisfies a predetermined condition; an operation history creation unit 223 that acquires temperature distribution information 64 and the air conditioning power consumption of the plurality of air conditioners as the control result of the air conditioning control values determined to satisfy the predetermined condition, and creates operation history information 201 associated with the predicted heat generation amount of each placement control area 30 in each of the plurality of placement patterns; a placement pattern calculation unit 310 that calculates a plurality of placement patterns for placing a new processing load, using information on the processing loads on the CPU servers 3, GPU servers 4, and accelerators 5; an area heat generation amount estimation unit 320 that, for each calculated placement pattern, estimates the predicted heat generation amount of each placement control area 30 by summing the heat generation amounts that result when processing loads are placed on the CPU servers 3, GPU servers 4, and accelerators 5 belonging to that placement control area 30; an operation history information extraction unit 230 that refers to the operation history information 201 using the information on the predicted heat generation amount of each placement control area 30, and extracts the temperature distribution information 64 and the air conditioning power consumption for the case where control is performed with the air conditioning control values in each placement pattern; a server power consumption prediction unit 330 that, using the extracted temperature distribution information 64 and information on the new processing load, calculates, for each placement control area 30 in each placement pattern, a server power consumption that is the sum of the power consumption of each CPU server 3, each GPU server 4, and each accelerator 5; and a placement pattern determination unit 340 that, for each placement pattern, sums the server power consumption of the placement control areas 30, calculates the sum of the resulting total server power consumption and the extracted air conditioning power consumption, and determines the placement pattern for which the calculated sum is smallest as the placement pattern in which the processing load is placed.
With this configuration, the power amount reduction control device 100 can reduce the total power consumption, consisting of server power consumption and air conditioning power consumption, in an environment in which CPU servers 3, GPU servers 4, and accelerators 5 coexist.
Further, the power amount reduction control device according to the present invention is a power amount reduction control device 100 that controls CPU servers 3, GPU servers 4, and accelerators 5, as well as a plurality of air conditioners 2, of a data center 10, wherein, on a floor of the data center 10, a plurality of placement control areas 30, in each of which any of the CPU servers 3, GPU servers 4, and accelerators 5 are placed as a group of servers on which processing loads are placed, and air conditioning control areas 20, which are areas for measuring the effect of air conditioning control by the plurality of air conditioners 2, are set, and the power amount reduction control device comprises: an external factor acquisition unit 211 that acquires information on external factors relating to air conditioning control, including a floor average temperature calculated from the average of the temperatures measured in the plurality of air conditioning control areas 20, an outside temperature that is the temperature outside the data center 10, and a predicted heat generation amount of each placement control area 30 that is the amount predicted when processing loads are placed on the server groups; a Situation determination unit 212 that divides the value of each external factor into ranges of a predetermined width, defines combinations of the ranges of the respective external factors as Situation classifications 62, and determines to which Situation classification 62 the acquired external factor information belongs; an air conditioning control value generation unit 221 that, for each Situation classification 62, generates air conditioning control values, including at least a target temperature, to be set in the plurality of air conditioners 2; an air conditioning control execution unit 240 that causes control of the plurality of air conditioners 2 to be executed using the air conditioning control values; a reward calculation unit 222 that calculates a reward that evaluates, with the target temperature as an index, the result of the air conditioning control execution unit 240 controlling the plurality of air conditioners 2 with the air conditioning control values, and determines whether the reward satisfies a predetermined condition; an operation history creation unit 223 that acquires, as the control result of the air conditioning control values determined to satisfy the predetermined condition, temperature distribution information 64 indicating a GPU inlet temperature, which is the temperature at the inlet of the GPU server 4, and an accelerator inlet temperature, which is the temperature at the inlet of the accelerator, together with the air conditioning power consumption of the plurality of air conditioners 2 when control with those air conditioning control values is performed, and creates operation history information 201 in which the Situation classification 62 and the air conditioning control values in effect when the air conditioning control was executed are associated with the temperature distribution information 64 and the air conditioning power consumption acquired as the control result; an operation history information extraction unit 230 that, upon acquiring the information on the predicted heat generation amount of each placement control area 30, determines the current Situation classification 62 via the Situation determination unit 212, refers to the operation history information 201, and extracts the temperature distribution information 64 and the air conditioning power consumption for the case where control is performed with the air conditioning control values in each placement pattern; a placement pattern calculation unit 310 that acquires load processing schedule information indicating the schedule for generating and deleting processing loads on the CPU servers 3, GPU servers 4, and accelerators 5, and calculates placement patterns for placing new processing loads on the CPU servers 3, GPU servers 4, and accelerators 5; an area heat generation amount estimation unit 320 that, for each calculated placement pattern, estimates the predicted heat generation amount of each placement control area 30 by summing the heat generation amounts that result when processing loads are placed on the CPU servers 3, GPU servers 4, and accelerators 5 belonging to that placement control area 30; a server power consumption prediction unit 330 that, using the load processing schedule information and the extracted temperature distribution information 64, calculates, for each placement pattern, a CPU server total power consumption that is the sum of the power consumption of the CPU servers 3 in a placement control area 30, a GPU server total power consumption that is the sum of the power consumption of the GPU servers in the placement control area 30, and an accelerator total power consumption that is the sum of the power consumption of the accelerators in the placement control area 30, and sums the CPU server total power consumption, the GPU server total power consumption, and the accelerator total power consumption of that placement control area to calculate the server power consumption of each placement control area 30; and a placement pattern determination unit 340 that, for each placement pattern, sums the server power consumption of the placement control areas 30, calculates the sum of the resulting total server power consumption and the extracted air conditioning power consumption, and determines the placement pattern for which the calculated sum is smallest as the placement pattern in which the processing load is placed.
In this way, the power consumption reduction control device 100 can reduce the total power consumption of the data center 10, consisting of server power consumption and air conditioning power consumption, in a data center 10 environment in which CPU servers 3, GPU servers 4, and accelerators 5 coexist.
The power consumption reduction control device 100 further includes: a CPU server power consumption learning model 302 that takes the resource usage of the CPU server 3 as input information and outputs the power consumption of that CPU server 3; a GPU server power consumption learning model 303 that takes the GPU inlet temperature, the type of processing load, and the number of GPU cards of the GPU server 4 as input information and outputs the power consumption of that GPU server 4; and an accelerator power consumption learning model 304 that takes the accelerator inlet temperature, the type of processing load, and the number of accelerator processing circuits of the accelerator 5 as input information and outputs the power consumption of that accelerator 5. The server power consumption prediction unit 330 calculates the power consumption of the CPU server 3 using the CPU server power consumption learning model 302, the power consumption of the GPU server 4 using the GPU server power consumption learning model 303, and the power consumption of the accelerator 5 using the accelerator power consumption learning model 304.
In this way, the power consumption reduction control device 100 can suitably calculate the power consumption of each of the CPU server 3, the GPU server 4, and the accelerator 5 by using the CPU server power consumption learning model 302, the GPU server power consumption learning model 303, and the accelerator power consumption learning model 304.
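The inputs and outputs of the three learning models can be illustrated with a short sketch. The source does not specify the model type or any coefficients, so the three functions below are hypothetical stand-ins with invented linear formulas. Only their input and output signatures follow the text.

```python
from typing import List, Tuple

# Hypothetical load-type scaling factors (assumed for illustration).
LOAD_FACTOR = {"inference": 1.0, "training": 2.0}


def cpu_model(resource_usage: float) -> float:
    """Stand-in for model 302: resource usage (0..1) -> power consumption (kWh)."""
    return 0.1 + 0.3 * resource_usage


def gpu_model(inlet_temp_c: float, load_type: str, num_cards: int) -> float:
    """Stand-in for model 303: inlet temperature, load type, card count -> kWh."""
    return num_cards * LOAD_FACTOR[load_type] * (0.2 + 0.01 * (inlet_temp_c - 20.0))


def accel_model(inlet_temp_c: float, load_type: str, num_circuits: int) -> float:
    """Stand-in for model 304: inlet temperature, load type, circuit count -> kWh."""
    return num_circuits * LOAD_FACTOR[load_type] * (0.05 + 0.002 * (inlet_temp_c - 20.0))


def predict_area_power(
    cpu_servers: List[float],
    gpu_servers: List[Tuple[float, str, int]],
    accelerators: List[Tuple[float, str, int]],
) -> float:
    # Sum the per-device predictions for one placement control area.
    cpu_total = sum(cpu_model(usage) for usage in cpu_servers)
    gpu_total = sum(gpu_model(*gpu) for gpu in gpu_servers)
    accel_total = sum(accel_model(*acc) for acc in accelerators)
    return cpu_total + gpu_total + accel_total


area_kwh = predict_area_power(
    cpu_servers=[0.5],
    gpu_servers=[(25.0, "training", 2)],
    accelerators=[(20.0, "inference", 4)],
)
```

Keeping one model per device class lets each model use the inputs relevant to that class: resource usage for CPU servers, and inlet temperature plus load type and unit count for GPU servers and accelerators.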
Alternatively, instead of the GPU server power consumption learning model 303, the power consumption reduction control device 100 may include a first GPU learning model that takes the GPU inlet temperature, the type of processing load, and the number of GPU cards of the GPU server 4 as input information and outputs a GPU temperature indicating the temperature of the GPU card, and a second GPU learning model that takes the GPU temperature as input information and outputs the power consumption of that GPU server. In this case, the server power consumption prediction unit 330 first calculates the GPU temperature using the first GPU learning model and then calculates the power consumption of the GPU server 4 using the second GPU learning model.
With this configuration as well, the power consumption reduction control device 100 can suitably calculate the power consumption of the GPU server 4 by calculating the GPU temperature with the first GPU learning model and then applying the second GPU learning model.
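The two-stage variant amounts to composing two models. The sketch below assumes simple invented formulas for both stages. The function names and all constants are hypothetical, and the source does not state what form either learning model takes.

```python
# Hypothetical temperature rise over inlet temperature per load type (deg C).
TEMP_RISE_C = {"inference": 20.0, "training": 35.0}


def first_gpu_model(inlet_temp_c: float, load_type: str, num_cards: int) -> float:
    """Stage 1 stand-in: inlet temperature, load type, card count -> GPU temperature."""
    return inlet_temp_c + TEMP_RISE_C[load_type] + 1.5 * (num_cards - 1)


def second_gpu_model(gpu_temp_c: float) -> float:
    """Stage 2 stand-in: GPU temperature -> GPU server power consumption (kWh)."""
    return 0.5 + 0.02 * gpu_temp_c


def predict_gpu_server_power(inlet_temp_c: float, load_type: str, num_cards: int) -> float:
    # Compute the intermediate GPU temperature first, then map it to power.
    gpu_temp_c = first_gpu_model(inlet_temp_c, load_type, num_cards)
    return second_gpu_model(gpu_temp_c)


gpu_kwh = predict_gpu_server_power(25.0, "training", 3)
```

The first and second accelerator learning models described next follow the same pattern, with the number of accelerator processing circuits in place of the number of GPU cards.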
Similarly, instead of the accelerator power consumption learning model 304, the power consumption reduction control device 100 may include a first accelerator learning model that takes the accelerator inlet temperature, the type of processing load, and the number of accelerator processing circuits of the accelerator 5 as input information and outputs an accelerator temperature indicating the temperature inside the accelerator, and a second accelerator learning model that takes the accelerator temperature as input information and outputs the power consumption of that accelerator. In this case, the server power consumption prediction unit 330 first calculates the accelerator temperature using the first accelerator learning model and then calculates the power consumption of the accelerator using the second accelerator learning model.
With this configuration as well, the power consumption reduction control device 100 can suitably calculate the power consumption of the accelerator 5 by calculating the accelerator temperature with the first accelerator learning model and then applying the second accelerator learning model.
The power consumption reduction control device 100 also holds basic power consumption information 301 indicating the reference power consumption at a predetermined temperature for each of the CPU server 3, the GPU server 4, and the accelerator 5. The area heat generation estimation unit 320 uses the basic power consumption information 301 to calculate the power consumption of each of the CPU server 3, the GPU server 4, and the accelerator 5, and from this calculates the heat generation amount of each of them.
By holding basic power consumption information that indicates the reference power consumption at a predetermined temperature, the power consumption reduction control device 100 can thus estimate the heat generation amount of each of the CPU server 3, the GPU server 4, and the accelerator 5.
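A rough sketch of this estimate is given below. It assumes that the basic power consumption information 301 can be represented as a lookup table keyed by device type and load type, and that the heat generated equals the energy consumed. Both are simplifying assumptions made here for illustration, and the table contents, names, and area labels are hypothetical.

```python
from typing import Dict, List, Tuple

Placement = Tuple[str, str]  # (device type, load type)

# Hypothetical reference power consumption (kWh) at the predetermined temperature.
BASE_KWH: Dict[Placement, float] = {
    ("cpu", "web"): 0.2,
    ("gpu", "training"): 1.5,
    ("accelerator", "inference"): 0.3,
}


def area_heat_kwh(placements: List[Placement]) -> float:
    # Assumption: heat generation equals reference power consumption,
    # summed over every device in the area that receives a load.
    return sum(BASE_KWH[placement] for placement in placements)


def predicted_heat_by_area(pattern: Dict[str, List[Placement]]) -> Dict[str, float]:
    # One predicted heat generation amount per placement control area.
    return {area: area_heat_kwh(placements) for area, placements in pattern.items()}


pattern = {
    "area_1": [("cpu", "web"), ("gpu", "training"), ("gpu", "training")],
    "area_2": [("accelerator", "inference")],
}
heat = predicted_heat_by_area(pattern)
```

The resulting per-area values play the role of the predicted heat generation amounts that are used to look up the operation history information 201.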
The present invention is not limited to the embodiments described above, and many modifications can be made by those with ordinary skill in the art within the technical idea of the present invention.
 1   Power consumption reduction control system
 2   Air conditioner
 3   CPU server
 4   GPU server
 5   Accelerator
 10  Data center (DC)
 20  Air conditioning control area
 30  Placement control area
 62  Situation classification
 63  Air conditioning control value information
 64  Temperature distribution information
 65  Air conditioning power consumption information
 100 Power consumption reduction control device
 200 Air conditioning control unit
 201 Operation history information
 210 Situation recognition unit
 211 External factor acquisition unit
 212 Situation determination unit
 220 Operation history information generation unit
 221 Air conditioning control value generation unit
 222 Reward calculation unit
 223 Operation history creation unit
 230 Operation history information extraction unit
 240 Air conditioning control execution unit
 300 Server control unit
 301 Basic power consumption information
 302 CPU server power consumption learning model
 303 GPU server power consumption learning model
 304 Accelerator power consumption learning model
 310 Placement pattern calculation unit
 320 Area heat generation estimation unit
 330 Server power consumption prediction unit
 331 CPU power consumption prediction unit
 332 GPU power consumption prediction unit
 333 Accelerator power consumption prediction unit
 340 Placement pattern determination unit

Claims (9)

  1.  A power consumption reduction control device that controls a CPU server, a GPU server, and an accelerator, as well as a plurality of air conditioners, wherein
      a plurality of placement control areas, in each of which any of the CPU server, the GPU server, and the accelerator is placed, and an air conditioning control area, which is an area for measuring the effect of air conditioning control by the plurality of air conditioners, are set,
      the power consumption reduction control device comprising:
      an air conditioning control value generation unit that generates an air conditioning control value, including at least a target temperature, to be set for the plurality of air conditioners;
      an air conditioning control execution unit that causes the plurality of air conditioners to be controlled using the air conditioning control value;
      a reward calculation unit that, for a plurality of placement patterns in which processing loads are placed on the CPU server, the GPU server, and the accelerator, calculates a reward evaluating, with the target temperature as an index, the result of the air conditioning control execution unit controlling the plurality of air conditioners with the air conditioning control value, and determines whether the reward satisfies a predetermined condition;
      an operation history creation unit that acquires, as the control result of the air conditioning control value determined to satisfy the predetermined condition, temperature distribution information and the air conditioning power consumption of the plurality of air conditioners, and creates operation history information in which these are associated with the predicted heat generation amount of each placement control area in each of the plurality of placement patterns;
      a placement pattern calculation unit that calculates, using information on the processing loads on the CPU server, the GPU server, and the accelerator, a plurality of placement patterns for placing a new processing load;
      an area heat generation estimation unit that estimates, for each calculated placement pattern, the predicted heat generation amount of each placement control area by summing the heat generation amounts that result when processing loads are placed on the CPU server, the GPU server, and the accelerator belonging to that placement control area;
      an operation history information extraction unit that refers to the operation history information using the information on the predicted heat generation amount of each placement control area, and extracts the temperature distribution information and the air conditioning power consumption obtained when control is performed with the air conditioning control value in each placement pattern;
      a server power consumption prediction unit that, using the extracted temperature distribution information and information on the new processing load, calculates, for each placement pattern and for each placement control area, a server power consumption that is the sum of the power consumption of each CPU server, the power consumption of each GPU server, and the power consumption of each accelerator; and
      a placement pattern determination unit that, for each placement pattern, sums the server power consumption of the placement control areas to obtain a total server power consumption, calculates the sum of the total server power consumption and the extracted air conditioning power consumption, and determines the placement pattern that minimizes the calculated sum as the placement pattern in which the processing load is placed.
  2.  A power consumption reduction control device that controls a CPU server, a GPU server, and an accelerator, as well as a plurality of air conditioners, of a data center, wherein
      on a floor of the data center, a plurality of placement control areas, in each of which any of the CPU server, the GPU server, and the accelerator is placed as a group of servers on which processing loads are placed, and an air conditioning control area, which is an area for measuring the effect of air conditioning control by the plurality of air conditioners, are set,
      the power consumption reduction control device comprising:
      an external factor acquisition unit that acquires information on external factors relating to air conditioning control, the external factors including a floor average temperature calculated from the average of the temperatures measured in a plurality of the air conditioning control areas, an outside temperature that is the temperature outside the data center, and a predicted heat generation amount of each placement control area, which is the amount predicted when the processing load is placed on the server group;
      a Situation determination unit that divides the value of each external factor into ranges of a predetermined width, defines each combination of the ranges of the external factors as a Situation classification, and determines to which Situation classification the acquired external factor information belongs;
      an air conditioning control value generation unit that generates, for each Situation classification, an air conditioning control value, including at least a target temperature, to be set for the plurality of air conditioners;
      an air conditioning control execution unit that causes the plurality of air conditioners to be controlled using the air conditioning control value;
      a reward calculation unit that calculates a reward evaluating, with the target temperature as an index, the result of the air conditioning control execution unit controlling the plurality of air conditioners with the air conditioning control value, and determines whether the reward satisfies a predetermined condition;
      an operation history creation unit that acquires, as the control result of the air conditioning control value determined to satisfy the predetermined condition, temperature distribution information indicating a GPU inlet temperature, which is the temperature at the inlet of the GPU server, and an accelerator inlet temperature, which is the temperature at the inlet of the accelerator, together with the air conditioning power consumption of the plurality of air conditioners when controlled with that air conditioning control value, and creates operation history information in which the temperature distribution information and the air conditioning power consumption acquired as the control result are associated with the Situation classification and the air conditioning control value in effect when the air conditioning control was executed;
      an operation history information extraction unit that, upon acquiring information on the predicted heat generation amount of each placement control area, determines the current Situation classification via the Situation determination unit, refers to the operation history information, and extracts the temperature distribution information and the air conditioning power consumption obtained when control is performed with the air conditioning control value in each placement pattern;
      a placement pattern calculation unit that acquires load processing schedule information indicating a schedule for generating and deleting processing loads for the CPU server, the GPU server, and the accelerator, and calculates placement patterns in which a new processing load is placed on each of the CPU server, the GPU server, and the accelerator;
      an area heat generation estimation unit that estimates, for each calculated placement pattern, the predicted heat generation amount of each placement control area by summing the heat generation amounts that result when processing loads are placed on the CPU server, the GPU server, and the accelerator belonging to that placement control area;
      a server power consumption prediction unit that, using the load processing schedule information and the extracted temperature distribution information, calculates, for each placement pattern, a CPU server total power consumption that is the sum of the power consumption of the CPU servers in the placement control area, a GPU server total power consumption that is the sum of the power consumption of the GPU servers in the placement control area, and an accelerator total power consumption that is the sum of the power consumption of the accelerators in the placement control area, and sums the CPU server total power consumption, the GPU server total power consumption, and the accelerator total power consumption of that placement control area to calculate the server power consumption of each placement control area; and
      a placement pattern determination unit that, for each placement pattern, sums the server power consumption of the placement control areas to obtain a total server power consumption, calculates the sum of the total server power consumption and the extracted air conditioning power consumption, and determines the placement pattern that minimizes the calculated sum as the placement pattern in which the processing load is placed.
  3.  The power consumption reduction control device according to claim 1 or claim 2, further comprising:
      a CPU server power consumption learning model that takes the resource usage of the CPU server as input information and outputs the power consumption of that CPU server; a GPU server power consumption learning model that takes the GPU inlet temperature, the type of the processing load, and the number of GPU cards of the GPU server as input information and outputs the power consumption of that GPU server; and an accelerator power consumption learning model that takes the accelerator inlet temperature, the type of the processing load, and the number of accelerator processing circuits of the accelerator as input information and outputs the power consumption of that accelerator,
      wherein the server power consumption prediction unit calculates the power consumption of the CPU server using the CPU server power consumption learning model, calculates the power consumption of the GPU server using the GPU server power consumption learning model, and calculates the power consumption of the accelerator using the accelerator power consumption learning model.
  4.  The power consumption reduction control device according to claim 3, comprising,
      instead of the GPU server power consumption learning model, a first GPU learning model that takes the GPU inlet temperature, the type of the processing load, and the number of GPU cards of the GPU server as input information and outputs a GPU temperature indicating the temperature of the GPU card, and a second GPU learning model that takes the GPU temperature as input information and outputs the power consumption of that GPU server,
      wherein the server power consumption prediction unit calculates the GPU temperature using the first GPU learning model and then calculates the power consumption of the GPU server using the second GPU learning model.
  5.  The power consumption reduction control device according to claim 3, comprising,
      instead of the accelerator power consumption learning model, a first accelerator learning model that takes the accelerator inlet temperature, the type of the processing load, and the number of accelerator processing circuits of the accelerator as input information and outputs an accelerator temperature indicating the temperature inside the accelerator, and a second accelerator learning model that takes the accelerator temperature as input information and outputs the power consumption of that accelerator,
      wherein the server power consumption prediction unit calculates the accelerator temperature using the first accelerator learning model and then calculates the power consumption of the accelerator using the second accelerator learning model.
  6.  The power consumption reduction control device according to claim 1 or claim 2, further comprising
      basic power consumption information indicating a reference power consumption at a predetermined temperature for each of the CPU server, the GPU server, and the accelerator,
      wherein the area heat generation estimation unit calculates the heat generation amount of each of the CPU server, the GPU server, and the accelerator by calculating the power consumption of each of the CPU server, the GPU server, and the accelerator using the basic power consumption information.
  7.  A power consumption reduction control method for a power consumption reduction control device that controls a CPU server, a GPU server, and an accelerator, as well as a plurality of air conditioners, wherein
      a plurality of placement control areas, in each of which any of the CPU server, the GPU server, and the accelerator is placed, and an air conditioning control area, which is an area for measuring the effect of air conditioning control by the plurality of air conditioners, are set,
      the power consumption reduction control device executing:
      a step of generating an air conditioning control value, including at least a target temperature, to be set for the plurality of air conditioners;
      a step of causing the plurality of air conditioners to be controlled using the air conditioning control value;
      a step of calculating, for a plurality of placement patterns in which processing loads are placed on the CPU server, the GPU server, and the accelerator, a reward evaluating, with the target temperature as an index, the result of controlling the plurality of air conditioners with the air conditioning control value, and determining whether the reward satisfies a predetermined condition;
      a step of acquiring, as the control result of the air conditioning control value determined to satisfy the predetermined condition, temperature distribution information and the air conditioning power consumption of the plurality of air conditioners, and creating operation history information in which these are associated with the predicted heat generation amount of each placement control area in each of the plurality of placement patterns;
      a step of calculating, using information on the processing loads on the CPU server, the GPU server, and the accelerator, a plurality of placement patterns for placing a new processing load;
      a step of estimating, for each calculated placement pattern, the predicted heat generation amount of each placement control area by summing the heat generation amounts that result when processing loads are placed on the CPU server, the GPU server, and the accelerator belonging to that placement control area;
      a step of referring to the operation history information using the information on the predicted heat generation amount of each placement control area, and extracting the temperature distribution information and the air conditioning power consumption obtained when control is performed with the air conditioning control value in each placement pattern;
      a step of calculating, using the extracted temperature distribution information and information on the new processing load, for each placement pattern and for each placement control area, a server power consumption that is the sum of the power consumption of each CPU server, the power consumption of each GPU server, and the power consumption of each accelerator; and
      a step of summing, for each placement pattern, the server power consumption of the placement control areas to obtain a total server power consumption, calculating the sum of the total server power consumption and the extracted air conditioning power consumption, and determining the placement pattern that minimizes the calculated sum as the placement pattern in which the processing load is placed.
  8.  A power consumption reduction control system comprising a power consumption reduction control device that controls a CPU server, a GPU server, and an accelerator, as well as a plurality of air conditioners, wherein
    a plurality of placement control areas, in each of which any of the CPU server, the GPU server, and the accelerator is placed, and an air conditioning control area, which is an area in which the effect of air conditioning control by the plurality of air conditioners is measured, are set, and
    the power consumption reduction control device comprises:
    an air conditioning control value generation unit that generates air conditioning control values, including at least a target temperature, to be set in the plurality of air conditioners;
    an air conditioning control execution unit that causes the plurality of air conditioners to be controlled using the air conditioning control values;
    a reward calculation unit that, for a plurality of placement patterns in which processing loads are placed on the CPU server, the GPU server, and the accelerator, calculates a reward that evaluates, using the target temperature as an index, the result of the air conditioning control execution unit controlling the plurality of air conditioners with the air conditioning control values, and determines whether the reward satisfies a predetermined condition;
    an operation history creation unit that acquires, as the result of control with the air conditioning control values determined to satisfy the predetermined condition, temperature distribution information and the air conditioning power consumption of the plurality of air conditioners, and creates operation history information in which these are associated with the predicted heat generation amount of each placement control area in each of the plurality of placement patterns;
    a placement pattern calculation unit that calculates, using information on the processing loads on the CPU server, the GPU server, and the accelerator, a plurality of placement patterns in which a new processing load is placed;
    an area heat generation estimation unit that estimates, for each calculated placement pattern, the predicted heat generation amount of each placement control area by summing the amounts of heat generated when the processing load is placed on the CPU server, the GPU server, and the accelerator belonging to that placement control area;
    an operation history information extraction unit that refers to the operation history information using the predicted heat generation amount of each placement control area and extracts the temperature distribution information and the air conditioning power consumption obtained when control is performed with the air conditioning control values in each placement pattern;
    a server power consumption prediction unit that calculates, using the extracted temperature distribution information and information on the new processing load, for each placement pattern and for each placement control area, a server power consumption that is the sum of the power consumption of each CPU server, each GPU server, and each accelerator; and
    a placement pattern determination unit that, for each placement pattern, sums the server power consumption of the placement control areas, calculates the total of the resulting total server power consumption and the extracted air conditioning power consumption, and determines the placement pattern for which the calculated total is minimum as the placement pattern in which the processing load is placed.
  9.  A program for causing a computer to function as the power consumption reduction control device according to claim 1 or claim 2.
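The selection procedure recited in the claims (enumerate placement patterns, estimate the heat of each placement control area, look up the matching operation history, predict server power consumption, and pick the pattern with the smallest total) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the applicant's implementation: the linear energy model in ENERGY_MODEL, the fan penalty FAN_WH_PER_DEGREE, the reference temperature, the nearest-neighbour history lookup, and all class and function names are hypothetical and do not appear in the claims. All quantities are treated as energy over one fixed interval (Wh).

```python
from dataclasses import dataclass
from math import dist

# Hypothetical linear model: (idle, full-load) energy per interval, in Wh.
ENERGY_MODEL = {"cpu": (100.0, 300.0), "gpu": (150.0, 600.0), "accelerator": (50.0, 200.0)}
FAN_WH_PER_DEGREE = 5.0   # assumed extra energy per degree above REFERENCE_TEMP
REFERENCE_TEMP = 25.0     # assumed temperature (deg C) needing no extra fan effort


@dataclass(frozen=True)
class Device:
    name: str
    kind: str    # "cpu", "gpu" or "accelerator"
    area: str    # placement control area the device belongs to
    load: float  # current utilisation, 0.0 to 1.0


@dataclass(frozen=True)
class HistoryRecord:
    area_heat: dict   # area -> predicted heat generation (Wh)
    area_temp: dict   # area -> temperature under the accepted control values
    ac_energy: float  # air conditioning energy for the interval (Wh)


def device_energy(kind, load):
    idle, full = ENERGY_MODEL[kind]
    return idle + (full - idle) * load


def placement_patterns(devices, new_load):
    """One pattern per device that can absorb the whole new load."""
    patterns = []
    for target in devices:
        if target.load + new_load <= 1.0:
            loads = {d.name: d.load for d in devices}
            loads[target.name] += new_load
            patterns.append(loads)
    return patterns


def area_heat(devices, loads):
    """Predicted heat per area; consumed energy is assumed to become heat."""
    heat = {}
    for d in devices:
        heat[d.area] = heat.get(d.area, 0.0) + device_energy(d.kind, loads[d.name])
    return heat


def extract_history(history, heat):
    """Nearest record by Euclidean distance over the per-area heat vector."""
    areas = sorted(heat)
    target = [heat[a] for a in areas]
    return min(history, key=lambda r: dist([r.area_heat[a] for a in areas], target))


def server_energy_by_area(devices, loads, area_temp):
    """Server energy per area, including the assumed temperature penalty."""
    energy = {}
    for d in devices:
        fan = FAN_WH_PER_DEGREE * max(0.0, area_temp[d.area] - REFERENCE_TEMP)
        energy[d.area] = energy.get(d.area, 0.0) + device_energy(d.kind, loads[d.name]) + fan
    return energy


def choose_pattern(devices, new_load, history):
    """Return (total_energy, loads) for the cheapest pattern, or None if none fits."""
    best = None
    for loads in placement_patterns(devices, new_load):
        record = extract_history(history, area_heat(devices, loads))
        servers = sum(server_energy_by_area(devices, loads, record.area_temp).values())
        total = servers + record.ac_energy
        if best is None or total < best[0]:
            best = (total, loads)
    return best
```

Calling choose_pattern with a list of Device objects, the size of the new load, and a list of HistoryRecord entries returns the minimum total and the per-device loads of the chosen pattern. The nearest-neighbour lookup is one simple way to match a predicted heat distribution to stored history; the claims only require that the history be referenced using the predicted heat generation amounts.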
PCT/JP2022/017844 2022-04-14 2022-04-14 Electric power amount reduction control device, electric power amount reduction control method, electric power amount reduction control system, and program WO2023199482A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/017844 WO2023199482A1 (en) 2022-04-14 2022-04-14 Electric power amount reduction control device, electric power amount reduction control method, electric power amount reduction control system, and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/017844 WO2023199482A1 (en) 2022-04-14 2022-04-14 Electric power amount reduction control device, electric power amount reduction control method, electric power amount reduction control system, and program

Publications (1)

Publication Number Publication Date
WO2023199482A1 (en) 2023-10-19

Family

ID=88329400

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/017844 WO2023199482A1 (en) 2022-04-14 2022-04-14 Electric power amount reduction control device, electric power amount reduction control method, electric power amount reduction control system, and program

Country Status (1)

Country Link
WO (1) WO2023199482A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013152552A (en) * 2012-01-24 2013-08-08 Hitachi Ltd Operation management method for information processing system
JP2015050378A (en) * 2013-09-03 2015-03-16 日本電信電話株式会社 Air conditioning control method and air conditioning control system
JP2018048750A (en) * 2016-09-20 2018-03-29 株式会社東芝 Air conditioning control device, air conditioning control method, and air conditioning control program
WO2019154739A1 (en) * 2018-02-07 2019-08-15 Abb Schweiz Ag Method and system for controlling power consumption of a data center based on load allocation and temperature measurements

Similar Documents

Publication Publication Date Title
Moore et al. Making Scheduling "Cool": Temperature-Aware Workload Placement in Data Centers.
US8904383B2 (en) Virtual machine migration according to environmental data
CN102096460B (en) Method and apparatus for dynamically allocating power in a data center
US8677365B2 (en) Performing zone-based workload scheduling according to environmental conditions
JP7254819B2 (en) How to optimize fan efficiency and/or operating performance or fan placement
US9037880B2 (en) Method and system for automated application layer power management solution for serverside applications
US20120005505A1 (en) Determining Status Assignments That Optimize Entity Utilization And Resource Power Consumption
Lee et al. Proactive thermal-aware resource management in virtualized HPC cloud datacenters
US20120254400A1 (en) System to improve operation of a data center with heterogeneous computing clouds
JP2011505035A (en) System integration to meet exergy loss targets
Arroba et al. Heuristics and metaheuristics for dynamic management of computing and cooling energy in cloud data centers
Zhang et al. An energy and SLA-aware resource management strategy in cloud data centers
Ran et al. Optimizing data center energy efficiency via event-driven deep reinforcement learning
WO2023199482A1 (en) Electric power amount reduction control device, electric power amount reduction control method, electric power amount reduction control system, and program
CN113225994B (en) Intelligent air conditioner control method facing data center
Rahmani et al. Kullback-Leibler distance criterion consolidation in cloud
Marcel et al. Thermal aware workload consolidation in cloud data centers
JP6455937B2 (en) Simulation apparatus, simulation method, and program
Wolke et al. Evaluating dynamic resource allocation strategies in virtualized data centers
Lin et al. Allocating workload to minimize the power consumption of data centers
EP2575003B1 (en) Method for determining assignment of loads of data center and information processing system
Zhang et al. Real time thermal management controller for data center
CN114741160A (en) Dynamic virtual machine integration method and system based on balanced energy consumption and service quality
WO2023105557A1 (en) Electric power amount reduction control device, electric power amount reduction control method, electric power amount reduction control system, and program
CN113094149B (en) Data center virtual machine placement method, system, medium and equipment

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22937461

Country of ref document: EP

Kind code of ref document: A1