WO2010050080A1 - Physical computer, method for controlling cooling device, and server system - Google Patents

Physical computer, method for controlling cooling device, and server system

Publication number
WO2010050080A1
Authority
WO
WIPO (PCT)
Prior art keywords
temperature
fan
processor
cooling
server
Prior art date
Application number
PCT/JP2009/000806
Other languages
French (fr)
Japanese (ja)
Inventor
陽子 志賀
加藤 猛
高本 良史
林 真一
Original Assignee
株式会社日立製作所
Priority date
Filing date
Publication date
Application filed by 株式会社日立製作所 filed Critical 株式会社日立製作所
Publication of WO2010050080A1 publication Critical patent/WO2010050080A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/16Constructional details or arrangements
    • G06F1/20Cooling means
    • G06F1/206Cooling means comprising thermal management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26Power supply means, e.g. regulation thereof
    • G06F1/32Means for saving power
    • G06F1/3203Power management, i.e. event-based initiation of a power-saving mode
    • HELECTRICITY
    • H05ELECTRIC TECHNIQUES NOT OTHERWISE PROVIDED FOR
    • H05KPRINTED CIRCUITS; CASINGS OR CONSTRUCTIONAL DETAILS OF ELECTRIC APPARATUS; MANUFACTURE OF ASSEMBLAGES OF ELECTRICAL COMPONENTS
    • H05K7/00Constructional details common to different types of electric apparatus
    • H05K7/20Modifications to facilitate cooling, ventilating, or heating
    • H05K7/20709Modifications to facilitate cooling, ventilating, or heating for server racks or cabinets; for data centers, e.g. 19-inch computer racks
    • H05K7/20836Thermal management, e.g. server temperature control
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present invention relates to a control method for a physical computer and a cooling device, and more particularly to a method for controlling an output of a fan and a cooling device of a physical computer in accordance with a CPU operating rate.
  • Patent Document 1 discloses an invention in which the fan is controlled based on the temperature of the heat-generating component and the temperature change to improve the cooling efficiency.
  • Patent Document 2 discloses an invention in which a fan is controlled based on the temperature of a heat-generating component and a temperature change to improve cooling efficiency.
  • Patent Document 3 discloses an invention in which an air flow in a machine room is monitored and ventilation is performed according to the air flow.
  • Patent Document 4 discloses an invention that avoids failure and shutdown by controlling heat dissipation by hibernation when the temperature exceeds a certain value in order to avoid occurrence of a malfunction due to forced shutdown.
  • the control method of the present invention is a control method of a management computer connected to a cooling device and to a server device having a processor and a fan. The management computer acquires, from the server device, the temperature and operating rate of the processor, the rotation speed of the fan, and the intake air temperature to the server device; calculates from these values the estimated temperature of the processor after a predetermined period has elapsed; when the estimated temperature is equal to or higher than a first predetermined value, determines a target rotational speed of the fan at which the estimated temperature after the period elapses is equal to or lower than the predetermined value; and instructs the server device to set the target rotational speed.
  • the server system of the present invention includes a server device that has a processor and a fan and that measures the temperature and operating rate of the processor, the rotational speed of the fan, and the intake air temperature; a cooling device; and a management computer that calculates an estimated temperature of the processor after a predetermined period from the temperature and operating rate of the processor, the rotation speed of the fan, and the intake air temperature, and that, when the estimated temperature is equal to or higher than a first predetermined value, determines a target rotational speed of the fan at which the estimated temperature after the period is equal to or lower than the predetermined value.
  • according to the present invention, cooling the CPU in advance and maintaining it at the optimum temperature minimizes power consumption due to leakage current and increases the cooling efficiency.
  • DESCRIPTION OF SYMBOLS: 110: Power saving control server, 111: Operation information monitoring unit, 112: Temperature monitoring unit, 113: CPU temperature estimation unit, 114: Cooling control determination unit, 115: Fan rotation speed determination unit, 116: Fan monitoring / control unit, 117: Cooling control unit, 121: Server configuration information, 122: Heat generation profile, 123: Fan profile, 124: Server operation history, 125: CPU temperature profile, 126: Rack / cooling map, 127: Cooling profile, 200: Physical computer, 223: Measurement agent
  • FIG. 1 is a diagram showing a system configuration of an embodiment of the present invention.
  • the system configuration of this embodiment is an information processing system or a storage system.
  • the system includes one management computer 100, one or more physical computers 200, one or more storage devices 230, a cooling device 151 that cools the computer room in which the management computer 100, the physical computers 200, and the storage devices 230 are installed, and a cooling device control unit 150 that controls the cooling device 151.
  • the management computer 100, the physical computer 200, and the cooling device 151 are connected via the management network 225. Further, the physical computer 200 and the storage device 230 are connected by, for example, a Fibre Channel network 226.
  • the cooling device control unit 150 may be stored as a program in the memory of the physical computer so as to collectively control the cooling device 151.
  • the cooling device control unit 150 may instead be stored as a program in a memory in the management computer.
  • FIG. 2 is a diagram showing the management computer 100 in one embodiment of the present invention.
  • the management computer 100 manages the physical computer 200, the storage device 230, and the cooling device control unit 150. It exchanges information with the plurality of physical computers 200 to detect their operation status.
  • the cooling device 151 is individually controlled via the cooling device control unit 150 according to the detected operating status of the plurality of physical computers 200. Further, the fan rotation speeds of the plurality of physical computers 200 and the cooling device are controlled according to the detected operating status of the plurality of physical computers 200.
  • the management computer 100 includes a CPU (Central Processing Unit) 101, a storage device 105 such as a hard disk device or a flash memory, a memory 102, a bus 107, a network interface 104, and a disk interface 103.
  • the server program 110 is stored in the memory 102.
  • the server program includes an operation information monitoring unit 111, a temperature monitoring unit 112, a CPU temperature estimation unit 113, a fan rotation speed determination unit 115, a fan monitoring / control unit 116, and a cooling control unit 117. These programs are initially stored in the magnetic disk 105, transferred to the memory 102 as necessary, and then executed by the CPU 101.
  • the operation information / power monitoring unit 111 collects operation information and power consumption information of the physical computer 200.
  • the temperature monitoring unit 112 acquires the intake air temperature, CPU temperature, and exhaust temperature of the physical computer.
  • the fan monitoring / control unit 116 acquires information on the fan rotation speed and issues an instruction to change the fan rotation speed.
  • the CPU temperature estimation unit 113 estimates the temperature after a certain time of the CPU built in the physical computer.
  • the fan rotation speed determination unit 115 determines a fan rotation speed that lowers the temperature of the CPU after a predetermined time to a target value.
  • the cooling control determination unit 114 acquires the rack / cooling map 126 and determines the output of the cooling device 151.
  • the cooling control unit 117 issues an instruction to control the output of the cooling device 151.
  • the storage device 105 stores the operation information history 124, the server configuration information 121, the heat generation profile 122, the fan profile 123, the CPU temperature profile 125, the cooling device profile 127, the CPU temperature range 128, the CPU optimum temperature 129, the rack / cooling map 126, and the cooling device profile 2010.
  • FIG. 3 is a diagram showing a hardware configuration of the physical computer 200 in one embodiment of the present invention.
  • the physical computer 200 includes a CPU 201, a memory 202, a storage device 205 such as a hard disk device or a flash memory, a bus 207, a network interface 204, a disk interface 203, a fan 208, and a BMC (Baseboard Management Controller) 207.
  • BMC 207 performs monitoring of server inlet temperature, exhaust temperature, CPU temperature, monitoring / control of fan speed, and power supply control.
  • the memory 202 stores an OS 222, a measurement agent program 223 that collects operation information of the physical computer, and a business service program 224. These programs are first stored in the magnetic disk 205, transferred to the memory 202 as necessary, and then executed by the CPU 201. Note that these programs may be stored in the magnetic disk 205 by being read from a portable recording medium or by being downloaded from another computer or storage device via a network connected to each device.
  • each process of the server program 110 of the management computer is realized by the CPU executing the corresponding program. These processes can also be realized in hardware by integrating them into processing units that perform each process, such as a measurement agent determination unit and a measurement unit.
  • the measurement agent 223 is a software program that runs on the computer 200 and collects operation information such as a CPU usage rate, a memory usage rate, and a network interface usage rate of a device in which the measurement agent 223 operates and records it as a measurement counter.
  • the operation information / power monitoring unit 111 of the server program 110 of the management computer transmits an operation information collection request by SNMP (Simple Network Management Protocol) to the measurement agent 223.
  • the measurement agent 223 receives this operation information collection request, and transmits the value of the measurement counter designated by the object ID (Identification) in the request to the operation information / power monitoring unit 111.
  • the server program 110 can centrally manage the operation information of a plurality of management targets by receiving the value of the measurement counter and recording it as operation information.
  • FIG. 4 is a diagram showing a device arrangement of the computer room 400 in which the physical computer 200, the storage device 230, the cooling device 151, and the like are installed according to an embodiment of the present invention.
  • each rack 401 and the cooling devices 151a and 151b are fixed on the floor.
  • a plurality of outlets 431 to 435 are installed on the floor.
  • a motor 440 is fixed to each of the air outlets 431 to 435, and an opening / closing plate 442 that opens and closes the air outlet according to the rotational drive of the motor 440 is fixed to the motor rotating shaft 441.
  • the physical computer 200 and the storage device 230 are stored in the rack 401a.
  • the racks 401b to 401c similarly store the physical computer 200 and the storage device 230 (not shown).
  • the cooling devices 151 a and 151 b are attached to the side surface of the computer room 400 and are configured as one element of the cooling device 151 for keeping the temperature of the computer room 400 constant.
  • the cooling devices 151 a and 151 b remove the heat discharged from each server by sending cold air under the floor and blowing out the cold air from the air outlet (perforated tile) 431. At this time, in accordance with an instruction from the management computer 100, control is performed to open one of the outlets 431 to 435 and close the other outlets.
  • as control of the cooling device 151, the air outlets 431 to 435 are driven by the rotation of the motor 440. For example, the air outlet 433 is closed and the other air outlets 431, 432, 434, and 435 are opened. Alternatively, control is performed such that, of the air outlets 431 to 435, the air outlets 431 and 435 are closed and the other air outlets 432, 433, and 434 are opened.
  • the cooling device in the present embodiment is a general computer room air conditioner (CRAC: Computer Room Air Conditioner), but is not limited thereto.
  • the cooling facility may be a liquid cooling device that removes heat discharged from each server by circulating the cooled liquid refrigerant through the pipes and circulating through each rack. In the liquid cooling device, there is a valve in front of the pipe that leads to each rack, and the cooling output is adjusted by opening and closing the valve in the same manner as the outlet. Further, the cooling facility may be an outside air cooling device that removes heat discharged from each server by taking in outside cold air and sending cool air from under the floor in the same manner as the computer room cooling device.
  • FIG. 5 is a diagram showing the server configuration information 121 in one embodiment of the present invention.
  • the server configuration information 121 includes a rack / physical computer map 500 (FIG. 5(a)), which shows the correspondence between the racks installed in the computer room 400 and the physical computers stored in the racks, and a physical computer list 510 (FIG. 5(b)).
  • the rack / physical computer map 500 (FIG. 5A) includes a rack ID 501 that is an identifier of the rack and a physical computer ID 502 that is an identifier of the physical computer 200 stored in each rack.
  • the physical computer list 510 (FIG. 5(b)) is composed of one or more records including a physical computer ID 511, a chassis number 512, a component identifier (item) 513, and a component value 514 of the physical computer 200, and represents the processing capability of the physical computer 200.
  • the physical computer ID 511 stores an identifier of each physical computer.
  • the chassis number 512 is used to specify the chassis that stores the blade server when the physical computer is a blade server. In the case of a non-modular type server such as a 1U server, “-” is recorded.
  • a blade server shares a fan and a power source among a plurality of servers, and may have a management processor that manages server configuration and power on / off.
  • when the management target physical computer is a blade server, the IP address and the port number necessary for connecting to the management processor are managed as server configuration information (not shown).
  • the server configuration information 121 is often determined by a designer of a managed system at the time of system construction and managed by a document or software.
  • the physical computer configuration information may be created based on such managed configuration information, or may be created from dynamically collected information.
  • FIG. 6 is a diagram showing a rack / cooling device map 126 according to an embodiment of the present invention.
  • the rack / cooling device map 126 consists of one or more records including a rack identifier 601, an identifier 602 of the cooling device 151, an outlet 603 located at the front of the rack, and an outlet 604 located at the rear of the rack. Each record represents the correspondence between a rack, the cooling device 151 that cools that rack, and the identification numbers of the outlets that blow air to and ventilate the rack.
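As a rough illustration of how the rack / physical computer map and the rack / cooling device map might be used together, the following sketch resolves a physical computer to the cooling device and outlets that serve its rack. The dictionary layout, the identifiers such as `pc-1` and `rack-a`, and the function name `cooling_for` are all assumptions made for this example and do not come from the patent.

```python
# All identifiers and values below are hypothetical illustration data.
RACK_PC_MAP = {"rack-a": ["pc-1", "pc-2"], "rack-b": ["pc-3"]}

RACK_COOLING_MAP = [
    {"rack": "rack-a", "cooling_device": "151a", "front_outlet": 431, "rear_outlet": 432},
    {"rack": "rack-b", "cooling_device": "151b", "front_outlet": 433, "rear_outlet": 434},
]


def cooling_for(pc_id):
    """Return (cooling_device, front_outlet, rear_outlet) for the rack holding pc_id."""
    # Step 1: find the rack(s) that store the physical computer.
    racks = [rack for rack, pcs in RACK_PC_MAP.items() if pc_id in pcs]
    # Step 2: collect the cooling records of those racks.
    return [
        (rec["cooling_device"], rec["front_outlet"], rec["rear_outlet"])
        for rec in RACK_COOLING_MAP
        if rec["rack"] in racks
    ]
```

An unknown physical computer ID simply yields an empty list, which a caller could treat as "no cooling device to adjust".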
  • FIG. 7 is a diagram showing operation information 710 (FIG. 7A) and power information 720 (FIG. 7B) in an embodiment of the present invention.
  • the operation information 710 indicates the resource usage status of one physical computer 200. As an example, it is composed of one or more records including a measurement date 711, a measurement day 712, a measurement time 713, and a CPU operation rate 714. The unit of the CPU operation rate is %.
  • the operation information shown here can be acquired by WMI (Windows Management Instrumentation) in the case of Windows (registered trademark) and by the top command in the case of Linux.
  • the power information 720 (FIG. 7B) indicates the power consumption status of each physical computer 200.
  • it is composed of one or more records including a measurement date 721, a measurement day 722, a measurement time 723, a physical computer power amount 724, and a chassis power amount 726.
  • when the physical computer 200 is a blade server, the power amounts of a plurality of physical computers and the power amount of the chassis are managed in one table.
  • when the physical computer 200 is not a blade server, only the power amount of one physical computer is managed.
  • FIG. 8 is a diagram showing a heat generation profile 122 in one embodiment of the present invention.
  • the heat generation profile 122 is composed of one or more records including a CPU operation rate 801 and a heat generation amount 802.
  • the operation rate 801 is the operation rate of the CPU, and the heat generation amount 802 is the heat generation amount of the CPU. That is, each record represents the amount of heat generated with respect to the operating rate of the CPU of the physical computer 200.
  • the heat generation profile 122 is different for each type of CPU included in the physical computer 200. There are various methods for obtaining the heat generation profile. For example, the past operating rate and heat generation history of the CPU can be recorded and the profile obtained from that history. Alternatively, processing can be run on the CPU in advance and the relationship between the operating rate and the heat generation amount measured. It is also possible to record the relationship between the operating rate and the heat generation amount as provided by the CPU vendor.
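A heat generation profile of this kind can be pictured as a small lookup table. The sketch below is only an illustration: the record values, the watt unit, and the use of linear interpolation between records are assumptions, since the text does not say how operating rates that fall between records are handled.

```python
from bisect import bisect_left

# Hypothetical records: (CPU operating rate in %, heat generation amount in W).
HEAT_PROFILE = [(0, 20.0), (50, 60.0), (100, 100.0)]


def heat_for_rate(rate, profile=HEAT_PROFILE):
    """Return the heat generation amount for an operating rate.

    Values between records are linearly interpolated (an assumption);
    values outside the table are clamped to the first or last record.
    """
    rates = [r for r, _ in profile]
    if rate <= rates[0]:
        return profile[0][1]
    if rate >= rates[-1]:
        return profile[-1][1]
    i = bisect_left(rates, rate)
    (r0, h0), (r1, h1) = profile[i - 1], profile[i]
    return h0 + (h1 - h0) * (rate - r0) / (r1 - r0)
```

A separate table of this shape would be kept per CPU type, since the profile differs by CPU.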
  • FIG. 9 is a diagram showing, in one embodiment of the present invention, the CPU temperature profile 125 (FIG. 9(a)), the CPU temperature range 128 (FIG. 9(b)), the CPU optimum temperature 129 (FIG. 9(c)), and the relationships between temperature rise and power consumption due to leakage current and between temperature rise and fan power consumption (FIG. 9(d)).
  • the CPU temperature profile 125 (FIG. 9A) is composed of one or more records including a CPU heat generation amount 901 and a CPU temperature change 902 per fixed time. That is, each record represents a temperature change amount after a certain time with respect to the heat generation amount of the CPU.
  • the temperature change of an object is a value obtained by dividing the amount of heat given from the outside by the heat capacity, but the ease of heat transfer differs depending on the nature of the object. That is, even when the same amount of heat is given, the temperature finally reached and the rate of temperature change differ depending on the material and structure of the CPU. Therefore, the CPU temperature profile is a table determined for each CPU type.
  • the CPU temperature profile 125 is different for each type of CPU included in the physical computer 200.
  • the CPU temperature range 128 (FIG. 9B) includes an upper limit value 911 and a lower limit value 912, and indicates a temperature range where the CPU operates safely.
  • the upper and lower limits of the temperature at which the CPU operates normally are determined by the manufacturing vendor. In many servers, when the CPU temperature exceeds this range due to a rise in room temperature or a fan failure, a program for monitoring the hardware state of the server issues a warning to the outside of the server.
  • the CPU optimum temperature 129 is a temperature that minimizes the sum of the power consumed by the leakage current of the CPU and the power consumed by the fan that cools the CPU. Since the CPU optimum temperature 129 varies depending on the intake air temperature, the CPU optimum temperature 129 is composed of one or more records including the intake air temperature 921 and the CPU optimum temperature 922.
  • the CPU optimum temperature 129 will be described.
  • a small amount of current (leakage current) flows through the CPU even when it is in an OFF state due to miniaturization of the semiconductor, but this leakage current has a characteristic that it rises exponentially as the temperature rises. For this reason, even in the idle state, when the temperature of the CPU increases, the power consumption also increases exponentially.
  • fan power consumption, on the other hand, is proportional to the fan wind speed and inversely proportional to the square of the CPU temperature rise. Therefore, as shown in FIG. 9(d), there is a trade-off: keeping the temperature low suppresses the leakage current but increases the power consumption of the fan, whereas allowing the temperature to rise reduces the power consumption of the fan but increases the leakage current.
  • the CPU optimum temperature 129 is the temperature at which the sum of the power consumed by the leak current and the power consumed by the fan is minimized. It may be measured by the administrator or disclosed by the server vendor.
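The trade-off behind the CPU optimum temperature can be illustrated numerically. The sketch below assumes two made-up power models: a leakage term that grows exponentially with CPU temperature, and a fan term that is inversely proportional to the square of the CPU temperature rise, interpreted here as the rise above the intake air temperature. The coefficients, the interpretation of "temperature rise", and the whole-degree scan are all assumptions for illustration, not values or methods from the patent.

```python
import math


def leakage_power(cpu_c):
    """Hypothetical leakage model: grows exponentially with CPU temperature."""
    return 5.0 * math.exp(0.04 * (cpu_c - 40.0))


def fan_power(cpu_c, intake_c):
    """Hypothetical fan model: inversely proportional to the square of the
    CPU temperature rise above the intake air temperature."""
    return 4000.0 / (cpu_c - intake_c) ** 2


def total_power(cpu_c, intake_c):
    """Sum of leakage power and fan power at a given CPU temperature."""
    return leakage_power(cpu_c) + fan_power(cpu_c, intake_c)


def optimum_cpu_temp(intake_c, upper_c=90):
    """Scan whole-degree candidates and return the one with the lowest total."""
    candidates = range(int(intake_c) + 5, upper_c + 1)
    return min(candidates, key=lambda t: total_power(t, intake_c))
```

Running `optimum_cpu_temp` for several intake temperatures would produce a table of the same shape as the records described above (intake air temperature paired with CPU optimum temperature).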
  • FIG. 10 is a diagram showing a fan profile 123 in one embodiment of the present invention.
  • the fan profile 123 is composed of one or more records, each including a fan rotation speed 1001 of the physical computer 200 and a CPU temperature change 1002 that can be achieved per fixed time by cooling the CPU with air blown from the fan.
  • the cooling efficiency of CPU by a fan changes with inlet temperature. Therefore, the temperature change of the CPU that can be changed per certain time by the fan differs depending on the intake air temperature.
  • although the fan profile of the present embodiment shows the cases of inlet air temperatures of 21 °C, 22 °C, and 23 °C (1003), it is not limited to these.
  • the relationship between the server CPU and the fans is not necessarily one-to-one; a plurality of servers may be cooled by a shared fan. Even in that case, the fan is configured to cool each CPU uniformly, so the fan profile 123 as described above can be defined.
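One possible in-memory form of such a fan profile is a table keyed first by intake air temperature and then by fan rotation speed. In the sketch below every number is hypothetical, and choosing the nearest tabulated intake temperature for values in between is an assumption made for this example.

```python
# {intake temperature in °C: {fan rpm: CPU temperature change per fixed time in °C}}
# All values are hypothetical illustration data.
FAN_PROFILE = {
    21: {3000: -1.5, 4000: -2.2, 5000: -3.0},
    22: {3000: -1.2, 4000: -1.9, 5000: -2.6},
    23: {3000: -1.0, 4000: -1.6, 5000: -2.2},
}


def fan_cooling_effect(intake_c, rpm, profile=FAN_PROFILE):
    """Look up the CPU temperature change at the nearest tabulated intake temperature."""
    nearest = min(profile, key=lambda t: abs(t - intake_c))
    return profile[nearest][rpm]
```

The negative sign convention (cooling lowers the CPU temperature) is a choice of this sketch.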
  • FIG. 20 is a diagram showing a cooling device profile 2010 and a cooling control pattern 2020 in one embodiment of the present invention.
  • the cooling device profile 2010 is divided for each cooling device and consists of one or more records including a cooling device output stage 2011, the inlet air temperature change 2012 of each cooling target rack at that output stage, and the power consumption 2013.
  • the cooling control pattern 2020 represents combinations of the outputs of the plurality of cooling devices and the resulting power consumption, and consists of one or more records including a combination number 2021, the output 2022 of each cooling device, and the power consumption 2023 of the entire cooling device at that time.
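To make the relationship between per-device output stages and the combined pattern records concrete, the following sketch enumerates every combination of stages for two cooling devices and totals their power consumption. The stage counts, the kW figures, and the function name `build_patterns` are assumptions for illustration only.

```python
from itertools import product

# Hypothetical per-device data: output stage -> power consumption in kW.
DEVICE_POWER = {
    "151a": {1: 2.0, 2: 3.5, 3: 5.0},
    "151b": {1: 1.5, 2: 3.0, 3: 4.5},
}


def build_patterns(device_power=DEVICE_POWER):
    """Enumerate every combination of output stages with its total power."""
    names = sorted(device_power)
    stage_lists = [sorted(device_power[name]) for name in names]
    patterns = []
    for number, stages in enumerate(product(*stage_lists), start=1):
        outputs = dict(zip(names, stages))
        total = sum(device_power[name][stage] for name, stage in outputs.items())
        patterns.append({"number": number, "outputs": outputs, "power": total})
    return patterns
```

Each resulting dictionary mirrors one record: a combination number, the output of each cooling device, and the total power at that setting.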
  • the information described in FIGS. 5 to 10 is written in a definition file by the administrator. However, this information may instead be input from a GUI (Graphical User Interface) or acquired from another server via a network.
  • FIG. 11 is a diagram showing a control flow of the physical computer and the cooling device by the management computer in one embodiment of the present invention.
  • the operation information monitoring unit 111 of the server program 110 of the management computer reads the server configuration information 121 from the storage device 105 (S1101) and thereby identifies the physical computers 200 to be managed. The operation information monitoring unit 111 then collects the operation information and power consumption of the physical computer 200 (S1102) and stores them in the server operation history 124.
  • the CPU temperature estimation unit 113 estimates the CPU temperature of each physical computer 200 after a certain period of time based on the stored operation history and on the heat generation profile 122 and the CPU temperature profile 125 stored in the storage device (S1103). This estimation process will be described with reference to FIG. 12.
  • FIG. 12 is a diagram showing a CPU temperature estimation flow in one embodiment of the present invention. This process is executed by the CPU temperature estimation unit.
  • the CPU temperature estimation unit 113 refers to the server operation history 124 and acquires the CPU operation rate 714 of the physical computer 200 that is the processing target (S1201).
  • the amount of heat generated with respect to the CPU operating rate is obtained with reference to the heat generation profile 122 (S1202).
  • the CPU type of the physical computer 200 is obtained by referring to the server list 510 of the server configuration information 121, and the CPU temperature profile corresponding to the CPU type is used.
  • the CPU temperature change with respect to the generated heat quantity is obtained with reference to the CPU temperature profile 125 (S1203).
  • the CPU temperature estimation unit 113 acquires the current CPU temperature and the inlet temperature to the server device using the temperature monitoring unit, and acquires the current fan rotation speed using the fan monitoring / control unit 116.
  • the CPU temperature change (cooling effect) in the case where the current fan speed is maintained from the current time until a certain time has elapsed is obtained (S1204).
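The estimation steps S1201 to S1204 can be sketched as two table lookups followed by a sum. The tables, the step-wise lookup rule, and in particular the additive combination of the heating term and the fan cooling term are assumptions of this example; the text does not spell out how the two temperature changes are combined.

```python
# Hypothetical tables for one CPU type.
HEAT_PROFILE = [(0, 20.0), (50, 60.0), (100, 100.0)]        # (operating rate %, heat)
CPU_TEMP_PROFILE = [(0.0, 0.0), (50.0, 2.0), (100.0, 5.0)]  # (heat, rise in °C per period)


def step_lookup(table, key):
    """Return the value of the last record whose key does not exceed `key`."""
    value = table[0][1]
    for k, v in table:
        if k <= key:
            value = v
    return value


def estimate_cpu_temp(current_c, operating_rate, fan_effect_c):
    """Estimate the CPU temperature after one period.

    fan_effect_c is the temperature change at the current fan speed
    (negative when the fan cools the CPU).
    """
    heat = step_lookup(HEAT_PROFILE, operating_rate)  # S1202: heat for the operating rate
    rise = step_lookup(CPU_TEMP_PROFILE, heat)        # S1203: temperature change for that heat
    return current_c + rise + fan_effect_c            # S1204: apply the fan's cooling effect
```

For example, a CPU at 60 °C running at a 100 % operating rate with a fan effect of -3.0 °C would be estimated at 62 °C under these made-up tables.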
  • the fan monitoring / control unit 116 connects to the BMC 207 by SSH (Secure Shell), executes a command for acquiring the fan rotation speed, and acquires the value of the fan rotation speed.
  • the temperature monitoring unit acquires the intake air temperature to the physical computer, and the fan rotation speed determination unit 115 refers to the acquired intake air temperature and the fan profile 123. Then, the number of fan rotations necessary to keep the CPU temperature within the upper limit value within a certain time is calculated (S1105).
  • FIG. 13 is a diagram showing a fan rotational speed determination flow in one embodiment of the present invention.
  • the fan rotational speed determination unit 115 obtains the difference between the estimated CPU temperature after the elapse of a predetermined time and the threshold value of the CPU temperature, and takes it as the CPU temperature change amount to be realized (S1301).
  • the process waits for a certain period of time (S1108) and returns to monitoring of operation information, temperature, and rotation speed (S1102).
  • the threshold value may be an upper limit value of the CPU temperature range 128.
  • the threshold value may be a value obtained by subtracting a certain value from the upper limit value. Setting the threshold below the upper limit in this way allows the CPU to stay within the upper limit even when the operation amount suddenly increases.
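The fan speed determination can be sketched as follows: compute the temperature change to be realized as the gap between the threshold and the estimated temperature, then pick the lowest fan speed in the profile that achieves it. The profile values are hypothetical, and treating the tabulated change as the additional change the fan must deliver, as well as preferring the lowest sufficient speed, are assumptions of this example.

```python
# Hypothetical fan profile at one intake temperature: rpm -> CPU temperature change in °C.
FAN_PROFILE_21C = {3000: -1.5, 4000: -2.2, 5000: -3.0}


def target_fan_speed(estimated_c, threshold_c, profile=FAN_PROFILE_21C):
    """Return the lowest rpm whose cooling covers the required change, or None.

    None means that no fan speed in the profile can bring the estimated
    temperature down to the threshold.
    """
    required = threshold_c - estimated_c  # negative when cooling is needed
    for rpm in sorted(profile):
        if profile[rpm] <= required:
            return rpm
    return None
```

With a threshold of 70 °C and an estimate of 72 °C, the required change is -2.0 °C, so the sketch selects 4000 rpm from the made-up table.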
  • in the above, whether or not to control the rotation speed of the fan is determined by whether the estimated temperature after a certain time has elapsed reaches the threshold value, but the present invention is not limited to this.
  • for example, the inlet temperature is acquired by the temperature monitoring unit 112, and the CPU optimum temperature 129, which minimizes the sum of the power consumed by the CPU leakage current and the fan power consumption, is determined from the acquired inlet temperature. The determination (S1104) may then be made based on whether the estimated temperature lies within the range from the CPU optimum temperature minus a constant temperature to the CPU optimum temperature plus a constant temperature.
  • the server program 110 confirms whether or not the physical computer 200 can change the rotational speed of the fan to the calculated rotational speed (S1106). For example, it is confirmed whether the fan rotation speed obtained in the process 1105 exceeds the maximum value. If it can be changed (S1106: Y), the fan monitoring / control unit 116 instructs the physical computer 200 to change the rotational speed of the fan (S1107). Then, after a predetermined time has elapsed (S1108), the process returns to monitoring of operation information, temperature, and rotation speed (S1102).
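One pass of the decision logic around S1104 to S1107 might look like the sketch below. The action names, the function signature, and the exact branch labels in the comments are assumptions made for illustration; only the overall ordering (check the estimate, try the fan, fall back to the cooling device) follows the text.

```python
def control_step(estimated_c, threshold_c, target_rpm, max_rpm):
    """Decide the action for one pass of the control loop.

    target_rpm is the fan speed computed to reach the threshold, or None
    when no fan speed is sufficient. Returns a tuple (action, value).
    """
    if estimated_c < threshold_c:
        return ("wait", None)              # no control needed; wait and monitor again
    if target_rpm is not None and target_rpm <= max_rpm:
        return ("set_fan", target_rpm)     # the fan alone can handle it
    return ("cool_room", None)             # the fan cannot; lower the intake temperature
```

A caller would loop over this function, sleeping for the fixed period between passes.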
  • if the rotational speed cannot be changed (S1106: N), the cooling control unit 117 controls the cooling facilities 151a and 151b installed in the computer room 400 to lower the inlet temperature of the physical computer 200 (S1520). This process will be described with reference to FIG. 21.
  • FIG. 21 is a diagram for explaining a cooling control flow in one embodiment of the present invention.
  • the cooling control unit 117 refers to the fan profile 123 to obtain a target inlet temperature that achieves the CPU temperature change amount to be realized (S2101).
  • here, the fan rotation speed is assumed to be the maximum value that can be realized. That is, if 5000 revolutions in FIG. 10 is the maximum value and the CPU temperature change is -3.0 °C, the target inlet air temperature is obtained as 21 °C.
  • the cooling control unit 117 refers to the server configuration information 121 to identify the rack in which the physical computer 200 to be controlled is stored. Then, referring to the rack / cooling map 126, the cooling facility responsible for cooling the specified rack is specified. Multiple racks may be specified. And the cooling control part 117 instruct
  • as a specific method for determining the output of the cooling device, the cooling device profile 2010 is referenced and an output stage is selected in which the rack temperature change amount is equal to or greater than the intake air temperature change target value.
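The stage selection just described can be sketched as a table lookup. The layout of the cooling device profile is an assumption here (a list of hypothetical `(stage, rack_temp_drop_c, power_w)` tuples), as is the choice of the lowest stage that meets the target.

```python
from typing import List, Optional, Tuple

# Hypothetical profile row: (output stage, rack temperature drop in deg C, power in W)
ProfileRow = Tuple[int, float, float]


def select_output_stage(profile: List[ProfileRow], target_drop_c: float) -> Optional[int]:
    """Return the lowest output stage whose rack temperature drop meets the target."""
    for stage, drop_c, _power_w in sorted(profile):
        if drop_c >= target_drop_c:
            return stage
    return None  # no stage of this device can achieve the target
```

Choosing the lowest qualifying stage is only one possible policy; it tends to keep the power drawn by the cooling device small.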
  • a cooling control pattern list 2030 listing the combinations of outputs that the cooling equipment 151a and the cooling equipment 151b can take is created (S2103).
  • for example, there are patterns such as: the cooling equipment 151a at stage 1 and the cooling equipment 151b at stage 3; both outputs at stage 2; and the cooling equipment 151a at stage 3 and the cooling equipment 151b at stage 1.
  • the power consumption of a cooling device for a given temperature change, that is, for a given cooling temperature, varies depending on the characteristics of the device and its distance to the rack, and therefore the power consumption of each pattern is different.
  • the output of the cooling equipment 151a, 151b is changed to lower the inlet temperature of the physical computer 200 (S2105).
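The creation of the cooling control pattern list (S2103) and the choice of one pattern can be sketched as below. Several points are assumptions rather than statements from the text: the per-device profile layout, the additive model for the combined temperature drop, and the policy of picking the pattern with the lowest total power.

```python
from itertools import product
from typing import Dict, Optional, Tuple

# Hypothetical profile: device id -> {stage: (rack temperature drop in deg C, power in W)}
Profiles = Dict[str, Dict[int, Tuple[float, float]]]


def cheapest_pattern(profiles: Profiles,
                     target_drop_c: float) -> Optional[Tuple[Dict[str, int], float]]:
    """Enumerate all stage combinations and return the lowest-power one that meets the target."""
    devices = sorted(profiles)
    best = None
    for stages in product(*(sorted(profiles[d]) for d in devices)):
        # Assumption: temperature drops of the individual devices simply add up.
        drop = sum(profiles[d][s][0] for d, s in zip(devices, stages))
        power = sum(profiles[d][s][1] for d, s in zip(devices, stages))
        if drop >= target_drop_c and (best is None or power < best[1]):
            best = (dict(zip(devices, stages)), power)
    return best
```

For two devices with three stages each, nine patterns are examined; patterns that reach the same temperature drop can still differ in power, which is why the comparison is made on total power.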
  • the operation information / temperature / revolution speed monitoring is resumed (S1102).
  • the physical computer profile 124 of the present embodiment focuses only on the CPU that is the main heat generating component, but the heat generation profile may be information corresponding to the utilization rate of other components in addition to the CPU.
  • the IT device may be a storage device or a network device.
  • the IT device is a storage device
  • the amount of heat generated by the device changes not only with the CPU operation rate but also with IOPS (Input/Output Per Second), which indicates the number of data inputs and outputs to the device.
  • IOPS (Input/Output Per Second)
  • the calorific value can be estimated based on the above.
  • the heat generation amount can be estimated based on the port usage rate.
  • the power saving control server 110 includes: the operation information monitoring unit 111, which collects the operation information and power consumption information of the physical computer 200; the temperature monitoring unit 112, which collects the inlet temperature, CPU temperature, and exhaust temperature of the managed server; the fan monitoring / control unit 116, which monitors the fan rotation speed of the managed server and changes the rotation speed of the fan; the CPU temperature estimation unit 113, which reads the heat generation profile 122 and the CPU temperature profile 125 and estimates the temperature of the CPU built into the managed server after a certain time; the fan rotation number determination unit 115, which reads the fan profile 123 and determines a fan rotation speed that lowers the CPU temperature after a certain time to a target value; the cooling control determining unit 114, which reads the cooling device profile 127 and the rack / cooling map 126 and determines the output of the cooling device 151; and the cooling control unit 117, which instructs control of the cooling device 151. With these units, the CPU is cooled in advance and kept at the optimum temperature, so that the power consumed by leakage current can be minimized and cooling efficiency can be increased.
  • by controlling the server fan and the air conditioning in cooperation, the intake air temperature can be lowered in advance by adjusting the air conditioning even if the CPU temperature cannot be kept within the specified range by controlling the fan alone. As a result, the temperature of the CPU is kept within the specified range, and failures due to heat can be avoided.
  • the BMC in the server device controls the fan according to an instruction from the management computer.
  • the server program 110 may be stored in the memory of the physical computer, and the operation information history 124, server configuration information 121, heat generation profile 122, fan profile 123, CPU temperature profile 125, cooling device profile 127, CPU temperature range 128, CPU optimum temperature 129, rack / cooling map 126, and cooling device profile 2010 may be stored in the storage device, so that the fan control and the air conditioning device control in this embodiment are performed in the physical computer.
  • the system configuration of one embodiment of the present invention is the same as in FIG.
  • the physical computer is the same as that shown in FIG.
  • FIG. 14 is a diagram showing a management computer according to an embodiment of the present invention.
  • the difference from the management computer of the first embodiment is that the operation history information 124 is not held, but the job execution schedule 132 and the server / job map 131 are held.
  • Description of the server configuration information 121, the heat generation profile 122, the fan profile 123, the CPU temperature profile 125, the rack / cooling map 126, the CPU temperature range 128, and the CPU optimum temperature 129 already described in the first embodiment will be omitted.
  • the server program 110 obtains the CPU operating rate for a certain period from the job execution schedule (load fluctuation) of the business executed by the managed server, and estimates the temperature of the CPU after a certain time based on that CPU operating rate and the heat generation amount for the operating rate determined for each CPU type. When it is confirmed that the CPU temperature will exceed the upper limit value of the temperature range in which the CPU operates stably, the server program controls the fan speed of the managed server and, as necessary, the output of the cooling device.
  • FIG. 15 is a configuration diagram showing a server / job map 131 (FIG. 15A) and a job execution schedule 132 (FIG. 15B) according to an embodiment of the present invention.
  • the server job map 131 (FIG. 15A) is composed of one or more records including a physical computer ID 1401 that is an identifier of the physical computer and a business type 1402 that indicates the type of business. Each record indicates which business is being executed on which physical computer.
  • the job execution schedule 132 (FIG. 15B) is composed, for each physical computer, of one or more records including a day 1411, a day of the week 1412, a start time 1413, an end time 1414, a job ID 1415, and an average CPU operation rate 1416. Each record indicates the average value of the CPU operation rate generated by the business for each day of the week and time zone.
  • the CPU operation rate of the server is predicted based on the batch job execution schedule, the time variation of the business request, or the server On / Off schedule.
  • the job execution schedule 132 is obtained from the server operation history when processing equivalent to the batch job execution schedule, the time variation of the business request, or the server On / Off schedule is executed.
  • FIG. 16 is a flowchart showing the control flow of the second embodiment of the present invention.
  • the operation information monitoring unit 111 of the server program 110 acquires server configuration information 121 (S1601), and acquires information on the physical computer 200 to be managed. Then, the job execution schedule information of the physical computer to be managed is acquired.
  • the server program 110 confirms the job execution schedule and confirms whether the end time of the job executed by the physical computer 200 has been reached (S1603). If it is the job end time (S1603: Y), the server / job map 131 is referenced to identify the next job to be executed by the physical computer 200, and the job execution schedule 132 corresponding to the job is referred to. The start and end times of jobs executed by the physical computer 200 and the average CPU operating rate are acquired (S1604).
  • the CPU temperature estimation unit 113 estimates the CPU temperature of each physical computer 200 after a certain period of time based on the average CPU operating rate, the heat generation profile 122, and the CPU temperature profile 125 (S1604).
  • the temperature estimation process is the same as in the first embodiment.
  • the power consumption can be suppressed by reducing or stopping the rotation speed of the fan.
  • the process 1606 is the same as that of the first embodiment.
  • the job to be executed and the CPU operating rate for processing the job are acquired from the job execution schedule.
  • however, the source is not limited to the job execution schedule; for example, data such as jobs processed in the past and their CPU utilization rates may be stored, the CPU utilization rate for a certain period may be predicted from the stored data, and the predicted CPU utilization rate may be used instead.
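The lookup of the average CPU operating rate from the job execution schedule can be sketched as follows. The record fields mirror those listed for the job execution schedule 132 (day of the week, start time, end time, job ID, average CPU operation rate), but the class and function names, the string encoding of the day of the week, and the half-open time interval are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import time
from typing import List, Optional


@dataclass(frozen=True)
class ScheduleRecord:
    day_of_week: str      # e.g. "Mon" (hypothetical encoding)
    start: time
    end: time
    job_id: str
    avg_cpu_rate: float   # average CPU operating rate in percent


def find_scheduled_job(schedule: List[ScheduleRecord],
                       day_of_week: str, now: time) -> Optional[ScheduleRecord]:
    """Return the record whose time slot covers `now`, or None if nothing is scheduled."""
    for rec in schedule:
        # A job's slot is treated as [start, end), so at an end time the next job is found.
        if rec.day_of_week == day_of_week and rec.start <= now < rec.end:
            return rec
    return None
```

The `avg_cpu_rate` of the returned record would then be fed to the CPU temperature estimation in place of a measured operating rate.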
  • the system configuration of one embodiment of the present invention is the same as in FIG.
  • the physical computer is the same as that shown in FIG.
  • FIG. 17 is a diagram showing a management computer according to an embodiment of the present invention.
  • the difference from the management computer of the first embodiment is that the server program holds the rule determination unit 133 and the storage device holds the cooling control rule.
  • description of the server configuration information 121, heat generation profile 122, fan profile 123, operation history information 124, CPU temperature profile 125, rack / cooling map 126, CPU temperature range 128, and CPU optimum temperature 129 already described in the first embodiment is omitted.
  • FIG. 18 is a configuration diagram showing a cooling control rule 131 of the third embodiment of the present invention.
  • the cooling control rule 131 includes a rule 1010 and an action 1020.
  • the rule 1010 includes one or more records including an evaluation item 1011 and a threshold value 1012. Each record represents one condition. A condition is satisfied when the value of the evaluation item is equal to or greater than the threshold value, and the control specified by the action 1020 is executed when the conditions expressed by all the records are satisfied. Alternatively, the control designated by the action 1020 may be executed when the condition of a single item is satisfied.
  • the cooling control rule is a condition that can be determined that an abnormal temperature rise has started, and is defined in advance by a computer room administrator.
  • for example, when the CPU temperature of all the physical computers 200 stored in a certain rack exceeds 60 °C, the exhaust temperature exceeds 40 °C, and the fan rotation speed is more than 10,000 revolutions / second, it is considered that a heat pool has formed at the rear surface of the rack, where the exhaust of the physical computers 200 stored in the rack comes out.
  • action 1020 is executed to discharge this heat. It instructs the system to maximize the output of the cooling device 151 for cooling the rack and to open the grating plate at the outlet located on the back of the rack 100%.
  • when the intake air temperature of all the physical computers 200 stored in the same rack exceeds a certain threshold value, it is considered that heat has accumulated at the front surface of the rack. In that case, it is also effective to use the cooling device 151 for cooling the rack and to open the grating plate at the outlet located in front of the rack 100%.
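The evaluation of a cooling control rule can be sketched as follows. The comparison (value equal to or greater than the threshold) and the all-records versus single-record modes follow the description of rule 1010; the item names, the function name, and the use of one aggregated reading per rack are assumptions for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical rule record: (evaluation item name, threshold value)
Condition = Tuple[str, float]


def rule_fires(conditions: List[Condition], readings: Dict[str, float],
               require_all: bool = True) -> bool:
    """Return True when the action should run for the given rack readings."""
    # A condition is satisfied when the item's value is equal to or greater than its threshold.
    results = [readings[item] >= threshold for item, threshold in conditions]
    return all(results) if require_all else any(results)


# Thresholds taken from the example in the text; the item names are made up.
heat_pool_rule = [("cpu_temp_c", 60), ("exhaust_temp_c", 40), ("fan_speed", 10000)]
```

To express "all physical computers in the rack exceed the threshold", the reading for each item could be the minimum over the servers in that rack, which is one possible aggregation.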
  • FIG. 19 is a flowchart showing a control flow of the third embodiment of the present invention.
  • the operation information monitoring unit 111 of the power saving control server 110 acquires the server configuration information 121 (S1901).
  • the operation information monitoring unit 111 collects operation information and power consumption of the physical computer 200 and stores them in the server operation history 124.
  • the temperature monitoring unit 112 collects the current CPU temperature and exhaust temperature.
  • the fan monitoring / control unit 116 collects the fan rotation speed (S1902).
  • the rule determination unit 133 compares the value of each item of the rule 1010 of the cooling control rule 131 with the threshold value 1012 for the physical computers 200 stored in each rack, based on the collected information (S1903).
  • when the conditions are satisfied, the rack / cooling map 126 is referred to, the cooling device 151 that cools the rack and the air outlet located on the back of the rack are specified, and the cooling control designated by action 1020 is executed (S1905).
  • the process returns to monitoring of operation information and the like (S1902).
  • according to the present embodiment, it is possible to reduce sensor installation cost compared with a method of detecting a temperature increase by installing a large number of sensors around the management target device.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Computer Hardware Design (AREA)
  • Thermal Sciences (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Power Sources (AREA)
  • Cooling Or The Like Of Electrical Apparatus (AREA)

Abstract

Cooling control for reducing the overall power consumption of a system is determined. A method for controlling a management computer connected to a server device having a processor and a fan and a cooling device is characterized in that the temperature and operating rate of the processor, the rotational speed of the fan, and the temperature of the air sent into the server are acquired from the server, the temperature of the processor after a predetermined time elapse is estimated by calculation on the basis of the temperature and operating rate of the processor, the rotational speed of the fan, and the temperature of the sent air, the target rotational speed of the fan at which the estimated temperature after the time elapse is below a first predetermined value is determined if the estimated temperature is above the first predetermined value, and the server device is instructed to change the rotational speed to the target one.

Description

Control method for physical computer and cooling device, and server system
 The present invention relates to a control method for a physical computer and a cooling device, and more particularly to a method for controlling the outputs of the fan of a physical computer and of a cooling device in accordance with the CPU operating rate.
 With higher-performance processors and the emergence of high-density IT (Information Technology) equipment such as blade servers, the increase in power consumption of IT systems has become a major problem. To address this problem, technologies that reduce power consumption, such as low-power processors and high-efficiency cooling methods, are being developed. However, there is a limit to the power saving achievable by a single device, and obtaining a greater power-saving effect requires an effort that covers the entire computer room, including the IT system and the cooling device.
 The cooling of current IT systems is described first. As the operating rate of a server rises, components such as the CPU and memory generate heat. The server is equipped with a cooling fan; it detects the temperature inside the server chassis and, when the temperature exceeds a certain threshold value, generates an airflow to cool the heat-generating components. The heat is discharged from the chassis by this airflow. Regarding the control of this cooling fan, Patent Document 1 discloses an invention that controls the fan based on the temperature of the heat-generating component and its change in order to make cooling more efficient. Patent Document 2 also discloses an invention that controls the fan based on the temperature of the heat-generating component and its change in order to make cooling more efficient.
 Meanwhile, the computer room in which the servers are installed has a cooling device for removing the heat generated by the equipment. The cooling device determines its output from the temperature of a fixed sensor and keeps the room temperature constant. However, because of the uneven heat generated by IT equipment and the arrangement of the equipment, it is difficult to make the temperature of the computer room uniform. Regarding the control of this cooling device, Patent Document 3 discloses an invention that monitors the airflow in a machine room and ventilates according to the airflow.
 When the heat distribution in the computer room is highly uneven, equipment located in a heat pool may become unstable due to CPU thermal runaway, or the CPU temperature control circuit may activate and forcibly reduce processing performance or shut the equipment down. To avoid problems caused by a forced shutdown, Patent Document 4 discloses an invention that, when the temperature exceeds a certain value, controls heat dissipation through hibernation to avoid failures and shutdown.
 In addition, Patent Document 5 discloses a method that predicts the temperature after a certain time based on the current processing of the CPU, the temperature characteristics of the CPU, and user instructions input from outside, and moves the processing of the CPU to another CPU when the predicted temperature exceeds a reference value.
JP 2002-268775 A; JP 2008-84173 A; JP 2006-208000 A; JP 2008-158787 A; JP 2007-241376 A
 In the future, rather than uniformly cooling the entire computer room with a large-scale cooling device, cooling devices that can control temperature locally will become important for eliminating uneven heat. For example, directional cooling devices and per-rack cooling devices attached to the back of a rack are already being shipped. In facilities that cool with cold air from under the floor, the opening and closing of the floor grating plates (perforated tiles) will be controlled to change where air is blown from under the floor and to cool hot spots intensively.
 There is also a delay between the rise in the operating rate of a heat-generating component, the generation of heat, and the detection of that heat by a temperature sensor. There is a further delay from the detection of the temperature rise and the increase in fan or air-conditioning output until a cooling effect actually appears. For this reason, when a bursty load occurs, the temperature rises temporarily even if the fan output increases, and only afterwards settles at a temperature at which stable operation is possible. However, when the temperature of a CPU rises, its leakage current increases exponentially, and more cooling power is required as well.
 As a result, cooling capacity may be insufficient, so that operation becomes unstable due to CPU thermal runaway, or the CPU temperature control circuit activates and forcibly reduces processing performance or shuts the equipment down. Even if a forced shutdown is avoided by hibernation, processing must still be interrupted.
 In addition, even if the cooling fan of a server increases its rotation speed, when the airflow behind the server is insufficient the heat is not carried to the exhaust vents but diffuses to surrounding equipment, raising the temperature of other equipment and possibly causing failures. Because current cooling devices adjust their output based on temperature information measured at a single point, the air conditioning does not respond until the heat generated by a server reaches that measurement point, so this situation can arise.
 To achieve the above object, the control method of the present invention is a control method for a management computer connected to a server device having a processor and a fan and to a cooling device, characterized by: acquiring from the server the temperature and operating rate of the processor, the rotation speed of the fan, and the intake air temperature of the server; calculating, from the temperature and operating rate of the processor, the rotation speed of the fan, and the intake air temperature, an estimated temperature of the processor after a predetermined period has elapsed; when the estimated temperature is equal to or higher than a first predetermined value, determining a target rotation speed of the fan at which the estimated temperature after the period has elapsed is equal to or lower than the predetermined value; and instructing the server device to set the target rotation speed.
 The server system of the present invention is characterized by comprising: the server device, which has a processor and a fan and measures the temperature and operating rate of the processor, the rotation speed of the fan, and the intake air temperature; and a management computer, which is connected to the server device and the cooling device, calculates an estimated temperature of the processor after a predetermined period has elapsed from the temperature and operating rate of the processor, the rotation speed of the fan, and the intake air temperature, and, when the estimated temperature is equal to or higher than a first predetermined value, determines a target rotation speed of the fan at which the estimated temperature after the period has elapsed is equal to or lower than the predetermined value.
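 The control method summarized above can be sketched as a single planning step. This is only an illustration under stated assumptions: the estimator is passed in as a function because its actual form is not given in this passage, `toy_estimate` is a made-up linear model used solely to exercise the code, and choosing the lowest sufficient speed from a list of candidate speeds is one possible policy.

```python
from typing import Callable, Optional, Sequence

# Estimator signature (hypothetical): (cpu_temp_c, util_percent, fan_rpm, inlet_c) -> estimated temp
Estimator = Callable[[float, float, int, float], float]


def plan_fan_speed(cpu_temp_c: float, util_percent: float, fan_rpm: int, inlet_c: float,
                   threshold_c: float, candidate_rpms: Sequence[int],
                   estimate: Estimator) -> Optional[int]:
    """Return the rotation speed to instruct, or None if no candidate is sufficient."""
    # Estimated processor temperature after the predetermined period at the current speed.
    if estimate(cpu_temp_c, util_percent, fan_rpm, inlet_c) < threshold_c:
        return fan_rpm  # below the first predetermined value: keep the current speed
    # Otherwise look for the lowest higher speed whose estimate is at or below the threshold.
    for rpm in sorted(candidate_rpms):
        if rpm > fan_rpm and estimate(cpu_temp_c, util_percent, rpm, inlet_c) <= threshold_c:
            return rpm
    return None  # the fan alone cannot reach the target


def toy_estimate(cpu_temp_c: float, util_percent: float, fan_rpm: int, inlet_c: float) -> float:
    """Made-up model: load heats the CPU, fan speed cools it, warm intake air adds heat."""
    return cpu_temp_c + 0.25 * util_percent - fan_rpm / 500 + 0.5 * (inlet_c - 25.0)
```

 A `None` result corresponds to the case where the inlet temperature would have to be lowered by the cooling device instead.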
 According to the present invention, by cooling the CPU in advance and keeping it at the optimum temperature, power consumption due to leakage current can be minimized and cooling efficiency can be increased.
FIG. 1 is an example of a system configuration diagram showing the system configuration of the first embodiment of the present invention.
FIG. 2 is an example of a configuration diagram showing the hardware configuration of the power saving control server of the first embodiment.
FIG. 3 is an example of a configuration diagram showing the hardware configuration of the physical computer that is the management target of the first embodiment.
FIG. 4(a) is an internal configuration diagram of the computer room of the first embodiment, and FIG. 4(b) is an example of a cross-sectional view of the main part showing the relationship between the motor installed in the floor of the computer room and the air outlet.
FIG. 5 is an example of a configuration diagram showing the server configuration information of the first embodiment.
FIG. 6 is an example of a configuration diagram showing the rack / cooling map of the first embodiment.
FIG. 7 is an example of a configuration diagram showing the operation information and power consumption information in the first embodiment.
FIG. 8 is an example of a configuration diagram showing the heat generation profile in the first embodiment.
FIG. 9 is an example of a configuration diagram showing the CPU temperature profile in the first embodiment.
FIG. 10 is an example of a configuration diagram showing the fan profile in the first embodiment.
FIG. 11 is an example of a flowchart showing the power saving control flow in the first embodiment.
FIG. 12 is an example of a flowchart showing the CPU temperature estimation flow in the first embodiment.
FIG. 13 is an example of a flowchart showing the fan rotation speed determination flow in the first embodiment.
FIG. 14 is an example of a system configuration diagram showing the system configuration of the second embodiment.
FIG. 15 is an example of a configuration diagram showing the job execution schedule and the server / job map of the second embodiment.
FIG. 16 is an example of a flowchart showing the power saving control flow of the second embodiment.
FIG. 17 is an example of a system configuration diagram showing the system configuration of the third embodiment.
FIG. 18 is an example of a configuration diagram showing the cooling control rule of the third embodiment.
FIG. 19 is an example of a flowchart showing the power saving control flow of the third embodiment.
FIG. 20 is an example of the cooling device profile of the first embodiment.
FIG. 21 is an example of a flowchart of the cooling device section of the first embodiment.
Explanation of symbols
110: power saving control server, 111: operation information monitoring unit, 112: temperature monitoring unit, 113: CPU temperature estimation unit, 114: cooling control determination unit, 115: fan rotation speed determination unit, 116: fan monitoring / control unit, 117: cooling control unit, 121: server configuration information, 122: heat generation profile, 123: fan profile, 124: server operation history, 125: CPU temperature profile, 126: rack / cooling map, 127: cooling profile, 200: physical computer, 223: measurement agent
 Hereinafter, several embodiments of the present invention will be described with reference to the drawings.
 FIG. 1 is a diagram showing the system configuration of an embodiment of the present invention. The system of this embodiment is an information processing system, a storage system, or the like. For example, it comprises one management computer 100, one or more physical computers 200, one or more storage devices 230, a cooling device 151 that cools the computer room in which the management computer 100, the physical computers 200, and the storage devices 230 are installed, and a cooling device control unit 150 that controls the cooling device 151. The management computer 100, the physical computers 200, and the cooling device 151 are connected via the management network 225. The physical computers 200 and the storage devices 230 are connected by, for example, a fiber channel network 226. Here, the cooling device control unit 150 may be stored as a program in the memory of a physical computer so as to control the cooling devices 151 collectively. Alternatively, the cooling device control unit 150 may be stored as a program in the memory of the management computer.
 FIG. 2 is a diagram showing the management computer 100 in one embodiment of the present invention.
 An outline of the operation of the management computer 100 is given here; details are described below with reference to the drawings. The management computer 100 manages the physical computers 200, the storage devices 230, and the cooling device control unit 150. It exchanges information with the plurality of physical computers 200 and detects their operating status. According to the detected operating status of the plurality of physical computers 200, it controls the cooling devices 151 individually via the cooling device control unit 150. It also controls the fan rotation speeds of the plurality of physical computers 200 and the cooling devices according to the detected operating status.
The management computer 100 comprises a central processing unit (CPU) 101, a storage device 105 such as a hard disk device or flash memory, a memory 102, a bus 107, a network interface 104, and a disk interface 103.
The memory 102 stores a server program 110. The server program includes an operation information monitoring unit 111 (also called the operation information / power monitoring unit 111), a temperature monitoring unit 112, a CPU temperature estimation unit 113, a cooling control determination unit 114, a fan rotation speed determination unit 115, a fan monitoring / control unit 116, and a cooling control unit 117. These programs are initially stored on the magnetic disk 105, transferred to the memory 102 as needed, and then executed by the CPU 101. The operation information / power monitoring unit 111 collects operation information and power consumption information of the physical computers 200. The temperature monitoring unit 112 acquires the intake air temperature, CPU temperature, and exhaust temperature of each physical computer. The fan monitoring / control unit 116 acquires fan rotation speed information and issues instructions to change the fan rotation speed. The CPU temperature estimation unit 113 estimates the temperature that the CPU built into a physical computer will reach after a fixed time. The fan rotation speed determination unit 115 determines the fan rotation speed that brings the CPU temperature after the fixed time down to a target value. The cooling control determination unit 114 acquires the rack / cooling map 126 and determines the output of the cooling device 151. The cooling control unit 117 issues instructions to control the output of the cooling device 151.
The storage device 105 stores an operation information history 124, server configuration information 121, a heat generation profile 122, a fan profile 123, a CPU temperature profile 125, a cooling device profile 127, a CPU temperature range 128, a CPU optimum temperature 129, a rack / cooling map 126, and a cooling device profile 2010.
FIG. 3 is a diagram showing the hardware configuration of the physical computer 200 in one embodiment of the present invention.
The physical computer 200 comprises a central processing unit (CPU) 201, a memory 202, a storage device 205 such as a hard disk device or flash memory, a bus 207, a network interface 204, a disk interface 203, a fan 208, and a BMC (Baseboard Management Controller) 207.
The BMC 207 monitors the server intake air temperature, exhaust temperature, and CPU temperature, monitors and controls the fan rotation speed, and performs power control.
The memory 202 stores an OS 222, a measurement agent program 223 that collects operation information of this physical computer, and a business service program 224. These programs are first stored on the magnetic disk 205, transferred to the memory 202 as needed, and then executed by the CPU 201. The programs may be placed on the magnetic disk 205 by reading them from a portable recording medium, or by downloading them from another computer or storage device over a network connected to each device.
Each process of the server program 110 of the management computer is realized by the CPU executing the corresponding program. These processes can also be realized in hardware, for example as integrated circuits forming processing units such as a measurement agent determination unit and a measurement unit.
The measurement agent 223 is a software program that runs on the computer 200, collects operation information such as the CPU usage rate, memory usage rate, and network interface usage rate of the device on which it runs, and records that information as measurement counters. The operation information / power monitoring unit 111 of the server program 110 on the management computer sends the measurement agent 223 an operation information collection request using SNMP (Simple Network Management Protocol). The measurement agent 223 receives the request and returns to the operation information / power monitoring unit 111 the value of the measurement counter designated by the object ID (identifier) in the request. By receiving these counter values and recording them as operation information, the server program 110 can centrally manage the operation information of multiple managed targets.
FIG. 4 is a diagram showing the equipment layout of the computer room 400 in which the physical computers 200, the storage devices 230, the cooling device 151, and so on are installed in one embodiment of the present invention.
Four racks 401a, 401b, 401c, and 401d and cooling devices 151a and 151b are arranged in the computer room 400. Each rack 401 and the cooling devices 151a and 151b are fixed on the floor. A plurality of air outlets 431 to 435 are installed in the floor. As shown for example in FIG. 4(b), each of the air outlets 431 to 435 has a motor 440 fixed to it, and an opening/closing plate 442, which opens and closes the outlet as the motor 440 rotates, is fixed to the motor rotating shaft 441.
The rack 401a houses physical computers 200 and storage devices 230. The racks 401b to 401c likewise house physical computers 200 and storage devices 230 (not shown).
The cooling devices 151a and 151b are attached to the side walls of the computer room 400 and form elements of the cooling device 151, which keeps the temperature of the computer room 400 constant. The cooling devices 151a and 151b send cold air under the floor, and the cold air blows out of the air outlets (perforated tiles) 431, removing the heat exhausted by each server. In accordance with instructions from the management computer 100, some of the air outlets 431 to 435 are opened and the others are closed. For example, when the servers housed in the racks 401a and 401d have a high operating rate and the servers housed in the racks 401b and 401c are idle, the cooling device 151 is controlled so that, by rotation of the motors 440, the air outlet 433 is closed and the other air outlets 431, 432, 434, and 435 are opened.
Conversely, when the servers housed in the racks 401b and 401c have a high operating rate and the servers housed in the racks 401a and 401d are idle, the cooling device 151 is controlled so that, by rotation of the motors 440, the air outlets 431 and 435 are closed and the other air outlets 432, 433, and 434 are opened.
The cooling device in this embodiment is a common computer room air conditioner (CRAC), but it is not limited to this. The cooling equipment may be a liquid cooling device in which a chilled liquid refrigerant flows through pipes and circulates through each rack, removing the heat exhausted by each server. In a liquid cooling device, a valve is placed ahead of the pipe leading to each rack, and the cooling output is adjusted by opening and closing the valve, in the same way as with the air outlets. The cooling equipment may also be an outside-air cooling device that draws in cold outside air and, like the computer room air conditioner, sends cold air up from under the floor to remove the heat exhausted by each server.
FIG. 5 is a diagram showing the server configuration information 121 in one embodiment of the present invention.
The server configuration information 121 consists of a rack / physical computer map 500 (FIG. 5(a)), which shows the correspondence between the racks installed in the computer room 400 and the physical computers housed in them, and a physical computer list 510 (FIG. 5(b)).
The rack / physical computer map 500 (FIG. 5(a)) consists of a rack ID 501, which identifies a rack, and a physical computer ID 502, which identifies each physical computer 200 housed in that rack.
The physical computer list 510 (FIG. 5(b)) consists of one or more records, each made up of a physical computer ID 511, a chassis number 512, a component identifier (item) 513, and a component value 514, and represents the processing capability of the physical computers 200. The physical computer ID 511 stores the identifier of each physical computer. The chassis number 512 identifies the chassis that houses the blade server when the physical computer is a blade server. For a non-modular server such as a 1U server, "-" is recorded. Blade servers share fans and power supplies among multiple servers and may have a management processor that manages the server configuration and power on/off. When a managed physical computer is a blade server, the CPU temperature and the rotation speed of the shared fans can be obtained by connecting to this management processor rather than to the individual servers. The IP address and port number needed to connect to the management processor are assumed to be managed as part of the server configuration information, although they are not shown.
The server configuration information 121 is often decided by the designer of the managed system when the system is built and is managed in documents or by software. The physical computer configuration information may be created from such managed configuration information or from dynamically collected information.
FIG. 6 is a diagram showing the rack / cooling device map 126 in one embodiment of the present invention.
The rack / cooling device map 126 consists of one or more records, each made up of a rack identifier 601, an identifier 602 of a cooling device 151, an air outlet 603 located at the front of the rack, and an air outlet 604 located at the rear of the rack. Each record expresses the correspondence between a rack, the cooling device 151 that cools it, and the identification numbers of the air outlets that blow air toward and ventilate that rack.
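As an illustration only, the records of the rack / cooling device map could be held as below and used to derive which floor outlets to open for a set of busy racks. The assignment of outlets and cooling devices to racks is a hypothetical layout chosen to be consistent with the two examples given for FIG. 4; the class name, field names, and the helper function are assumptions, not taken from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackCoolingRecord:
    rack_id: str          # rack identifier (601)
    cooling_device: str   # cooling device identifier (602)
    front_outlet: int     # outlet at the front of the rack (603)
    rear_outlet: int      # outlet at the rear of the rack (604)


# Hypothetical layout: outlets 431..435 alternate with racks 401a..401d.
RACK_COOLING_MAP = [
    RackCoolingRecord("401a", "151a", 431, 432),
    RackCoolingRecord("401b", "151a", 433, 432),
    RackCoolingRecord("401c", "151b", 433, 434),
    RackCoolingRecord("401d", "151b", 435, 434),
]


def outlets_to_open(busy_racks):
    """Return the sorted outlet numbers serving any rack in busy_racks."""
    opened = set()
    for record in RACK_COOLING_MAP:
        if record.rack_id in busy_racks:
            opened.update((record.front_outlet, record.rear_outlet))
    return sorted(opened)


def outlets_to_close(busy_racks):
    """Return the sorted outlet numbers that serve no busy rack."""
    every_outlet = set()
    for record in RACK_COOLING_MAP:
        every_outlet.update((record.front_outlet, record.rear_outlet))
    return sorted(every_outlet - set(outlets_to_open(busy_racks)))
```

Under this assumed layout, passing the racks 401a and 401d leaves only outlet 433 closed, and passing 401b and 401c leaves outlets 431 and 435 closed, matching the behavior described for FIG. 4.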
FIG. 7 is a diagram showing the operation information 710 (FIG. 7(a)) and the power information 720 (FIG. 7(b)) in one embodiment of the present invention.
The operation information 710 (FIG. 7(a)) shows the resource usage of one physical computer 200. As an example, it consists of one or more records, each made up of a measurement date 711, a measurement day of the week 712, a measurement time 713, and a CPU operating rate 714. The unit of the CPU operating rate is %. The operation information shown here can be obtained with WMI (Windows Management Instrumentation) on Windows (registered trademark) and with the top command on Linux.
The power information 720 (FIG. 7(b)) shows the power consumption of each physical computer 200 and consists of one or more records, each made up of a measurement date 721, a measurement day of the week 722, a measurement time 723, the energy consumption 724 of the physical computer, and the energy consumption 726 of the chassis. When the physical computers 200 are blade servers, the energy consumption of multiple physical computers and of the chassis is managed in a single table. When the physical computer 200 is not a blade server, only the energy consumption of the one physical computer is managed.
FIG. 8 is a diagram showing the heat generation profile 122 in one embodiment of the present invention.
The heat generation profile 122 consists of one or more records, each made up of a CPU operating rate 801 and a heat generation amount 802. Here, the operating rate 801 is the operating rate of the CPU, and the heat generation amount 802 is the amount of heat generated by the CPU. In other words, each record gives the amount of heat generated at a given operating rate of the CPU of the physical computer 200.
The heat generation profile 122 differs for each type of CPU in the physical computers 200. It can be obtained in various ways. For example, a history of the CPU's past operating rate and heat generation can be recorded and the profile derived from that history. The relationship between operating rate and heat generation can also be measured by running workloads on the CPU in advance, or the relationship provided by the CPU vendor can be recorded.
FIG. 9 is a diagram showing, in one embodiment of the present invention, the CPU temperature profile 125 (FIG. 9(a)), the CPU temperature range 128 (FIG. 9(b)), the CPU optimum temperature 129 (FIG. 9(c)), and the relationship between temperature rise and the power consumed by leakage current together with the relationship between temperature rise and fan power consumption (FIG. 9(d)).
The CPU temperature profile 125 (FIG. 9(a)) consists of one or more records, each made up of a CPU heat generation amount 901 and a CPU temperature change 902 per fixed time. In other words, each record gives the temperature change after the fixed time for a given CPU heat generation amount.
In general, the temperature change of an object equals the amount of heat supplied from outside divided by the object's heat capacity, but how readily heat is conducted depends on the properties of the object. That is, even for the same amount of heat, the temperature finally reached and the rate of temperature change differ with the material and structure of the CPU. The CPU temperature profile is therefore a table determined for each CPU type.
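The relation stated above can be written as follows. The symbols are introduced here only for illustration: $Q$ is the heat supplied to the CPU, $C$ its heat capacity, and $\Delta T$ the resulting temperature change.

```latex
\[
  \Delta T = \frac{Q}{C}
  \qquad\text{e.g. } Q = 50\ \mathrm{J},\; C = 25\ \mathrm{J/K}
  \;\Rightarrow\; \Delta T = 2\ \mathrm{K}.
\]
```

Because this ideal relation ignores how quickly heat is conducted away, the embodiment tabulates the temperature change per fixed time for each CPU type rather than computing it from $C$ directly.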
As described above, the CPU temperature profile 125 differs for each type of CPU in the physical computers 200. It can be obtained in various ways. For example, a history of the CPU's past heat generation and temperature change can be recorded and the profile derived from that history. The relationship between heat generation and temperature change can also be measured by running workloads on the CPU in advance, or the relationship provided by the CPU vendor can be recorded.
The CPU temperature range 128 (FIG. 9(b)) consists of an upper limit 911 and a lower limit 912 and indicates the temperature range in which the CPU operates safely. Normally, the manufacturer specifies upper and lower limits of the temperature at which a CPU operates correctly. In many servers, when the CPU temperature leaves this range because of a rise in room temperature or a fan failure, a program that monitors the server's hardware state issues a warning outside the server.
The CPU optimum temperature 129 is the temperature that minimizes the sum of the power consumed by CPU leakage current and the power consumed by the fan that cools the CPU. Because the CPU optimum temperature 129 varies with the intake air temperature, it consists of one or more records, each made up of an intake air temperature 921 and a CPU optimum temperature 922.
The CPU optimum temperature 129 is explained next. As semiconductors are miniaturized, a small current (leakage current) flows in the CPU even in the OFF state, and this leakage current rises exponentially with temperature. Consequently, even when the CPU is idle, its power consumption rises exponentially as its own temperature rises. To limit the CPU power consumed by leakage current it is better not to let the CPU temperature rise, but holding the temperature down requires power for cooling. In general, fan power consumption is proportional to the fan's air speed and inversely proportional to the square of the CPU temperature rise. As shown in FIG. 9(d), this creates a trade-off: keeping the temperature low suppresses leakage current but increases fan power consumption, while allowing the temperature to rise increases leakage current but lowers fan power consumption. The CPU optimum temperature 129 is the temperature at which the sum of the leakage power and the fan power is smallest. It can be measured by the administrator, and it could also be published by the server vendor.
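The trade-off can be illustrated with a small numerical sketch. It assumes a leakage power that grows exponentially with CPU temperature and a fan power inversely proportional to the square of the CPU temperature rise above the intake air temperature, as the paragraph states. The coefficients `a`, `b`, and `k`, the function names, and the grid search are all hypothetical choices for illustration, not values or methods from the specification.

```python
import math


def leak_power(t_cpu, a=0.5, b=0.05):
    """Leakage power (W), assumed to grow exponentially with CPU temperature."""
    return a * math.exp(b * t_cpu)


def fan_power(t_cpu, t_inlet, k=4000.0):
    """Fan power (W), assumed inversely proportional to the squared temperature rise."""
    rise = t_cpu - t_inlet
    return k / (rise * rise)


def total_power(t_cpu, t_inlet):
    """Sum of leakage power and fan power at a given CPU temperature."""
    return leak_power(t_cpu) + fan_power(t_cpu, t_inlet)


def optimum_cpu_temperature(t_inlet, t_min, t_max, step=0.5):
    """Grid-search the CPU temperature in [t_min, t_max] with the lowest total power."""
    best_t, best_p = None, math.inf
    steps = int(round((t_max - t_min) / step))
    for i in range(steps + 1):
        t = t_min + i * step
        if t <= t_inlet:
            continue  # the CPU cannot be cooled below the intake air temperature
        p = total_power(t, t_inlet)
        if p < best_p:
            best_t, best_p = t, p
    return best_t
```

Calling `optimum_cpu_temperature` once per intake air temperature would yield a table of the same shape as the records of intake air temperature 921 and CPU optimum temperature 922.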
FIG. 10 is a diagram showing the fan profile 123 in one embodiment of the present invention.
The fan profile 123 consists of one or more records, each made up of a fan rotation speed 1001 of the physical computer 200 and a CPU temperature change 1002, which is the change in CPU temperature per fixed time that the fan can achieve by blowing air to cool the CPU. The efficiency with which the fan cools the CPU varies with the intake air temperature, so the achievable CPU temperature change per fixed time also differs with the intake air temperature. The fan profile of this embodiment shows the cases of intake air temperatures of 21°C, 22°C, and 23°C (1003), but it is not limited to these.
The relationship between server CPUs and fans is not necessarily one-to-one. For example, in blade servers, where multiple servers are housed in one enclosure, multiple servers may be cooled by shared fans. Even in that case the fans are configured to cool each CPU uniformly, so a fan profile 123 like the one described above can be defined.
FIG. 20 is a diagram showing the cooling device profile 2010 and the cooling control pattern 2020 in one embodiment of the present invention.
The cooling device profile 2010 is kept separately for each cooling device and consists of one or more records, each made up of an output level 2011 of the cooling device, the intake air temperature change 2012 of each rack cooled at that output level, and the power consumption 2013.
The cooling control pattern 2020 expresses combinations of the outputs of multiple cooling devices and the resulting power consumption. It consists of one or more records, each made up of a combination number 2021, the output 2022 of each cooling device, and the power consumption 2023 of the cooling devices as a whole for that combination.
The information shown in FIGS. 5 to 10 is written to a definition file by the administrator. This information may instead be entered through a GUI (Graphical User Interface) or obtained from another server over a network.
Next, the control of the physical computers and the cooling device by the management computer in one embodiment of the present invention is described with reference to the drawings.
FIG. 11 is a diagram showing the flow by which the management computer controls the physical computers and the cooling device in one embodiment of the present invention.
First, the operation information monitoring unit 111 of the server program 110 on the management computer reads the server configuration information 121 from the storage device 105 (S1101). Having identified the physical computers 200 to be managed, the operation information monitoring unit 111 collects their operation information and power consumption (S1102) and stores them in the server operation history 124.
The CPU temperature estimation unit 113 then estimates the CPU temperature of each physical computer 200 after a fixed time has elapsed, based on the stored operation history and on the heat generation profile 122 and the CPU temperature profile 125 stored in the storage device (S1103). This estimation process is described with reference to FIG. 12.
FIG. 12 is a diagram showing the CPU temperature estimation flow in one embodiment of the present invention. This process is executed by the CPU temperature estimation unit.
The CPU temperature estimation unit 113 refers to the server operation history 124 and acquires the CPU operating rate 714 of the physical computer 200 being processed (S1201).
The amount of heat generated at that CPU operating rate is then obtained by referring to the heat generation profile 122 (S1202). To do so, the CPU type of the physical computer 200 is first looked up in the physical computer list 510 of the server configuration information 121, and the profile corresponding to that CPU type is used.
Next, the CPU temperature change for the generated amount of heat is obtained by referring to the CPU temperature profile 125 (S1203). The CPU temperature estimation unit 113 then acquires the current CPU temperature and the intake air temperature of the server through the temperature monitoring unit, and acquires the current fan rotation speed through the fan monitoring / control unit 116. Referring to the fan profile 123, it obtains the CPU temperature change (cooling effect) that results if the current fan rotation speed is maintained from the present until the fixed time has elapsed (S1204). The fan monitoring / control unit 116 connects to the BMC 207, for example by SSH (Secure Shell), executes a command that reads the fan rotation speed, and obtains its value.
Finally, the CPU temperature change due to the generated heat is added to the current CPU temperature, and the temperature change due to the fan is subtracted, giving the CPU temperature after the fixed time has elapsed (S1205).
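Steps S1201 to S1205 can be sketched as below. The profile tables are made-up sample values, and linear interpolation between table rows is an assumption added for the sketch; the specification only says that the profiles are referred to. All names are hypothetical.

```python
from bisect import bisect_left


def interpolate(table, x):
    """Linearly interpolate over sorted (x, y) pairs, clamping at both ends."""
    xs = [row[0] for row in table]
    if x <= xs[0]:
        return table[0][1]
    if x >= xs[-1]:
        return table[-1][1]
    i = bisect_left(xs, x)
    (x0, y0), (x1, y1) = table[i - 1], table[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Sample heat generation profile: CPU operating rate (%) -> heat generated (W).
HEAT_PROFILE = [(0, 20.0), (50, 60.0), (100, 100.0)]
# Sample CPU temperature profile: heat (W) -> temperature rise per interval (deg C).
CPU_TEMP_PROFILE = [(20.0, 1.0), (60.0, 3.0), (100.0, 5.0)]
# Sample fan profile per intake air temperature: rpm -> cooling per interval (deg C).
FAN_PROFILE = {
    21: [(1000, 0.5), (2000, 1.5), (3000, 2.5)],
    22: [(1000, 0.4), (2000, 1.2), (3000, 2.0)],
}


def estimate_cpu_temperature(cpu_rate, current_temp, inlet_temp, fan_rpm):
    """Estimate the CPU temperature after one interval (S1202 to S1205)."""
    heat = interpolate(HEAT_PROFILE, cpu_rate)                # S1202
    rise = interpolate(CPU_TEMP_PROFILE, heat)                # S1203
    cooling = interpolate(FAN_PROFILE[inlet_temp], fan_rpm)   # S1204
    return current_temp + rise - cooling                      # S1205
```

With these sample tables, a CPU at 60 degrees and 50% operating rate, with 21-degree intake air and the fan at 2000 rpm, gains 3.0 degrees from heat and loses 1.5 degrees to the fan.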
Once the estimated CPU temperature after the fixed time has been calculated, it is checked whether that estimate exceeds the set threshold (S1104). If it does (Y), the temperature monitoring unit acquires the intake air temperature of the physical computer, and the fan rotation speed determination unit 115 refers to the acquired intake air temperature and the fan profile 123 to calculate the fan rotation speed needed to keep the CPU temperature after the fixed time within the upper limit (S1105).
FIG. 13 shows one example of how the fan rotation speed is determined. That is, FIG. 13 is a diagram showing the fan rotation speed determination flow in one embodiment of the present invention.
When the estimated CPU temperature exceeds the threshold, the fan rotation speed determination unit 115 obtains the difference between the estimated CPU temperature after the fixed time and the CPU temperature threshold, and takes this difference as the CPU temperature change to be achieved (S1301).
Then, referring to the current intake air temperature of the physical computer, which was acquired when the CPU temperature after the fixed time was estimated, and to the fan profile 123, it obtains the fan rotation speed that achieves the required CPU temperature change after the fixed time (S1302).
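Steps S1301 and S1302 might be sketched as follows. The fan profile values are samples, and two points are assumptions of this sketch rather than statements in the specification: the lowest sufficient rotation speed in the table is chosen, and the cooling already counted in the estimate (`baseline_cooling`) is added to the excess so that the profile row is compared against the total cooling needed. Returning `None` when no row suffices is likewise an illustrative convention.

```python
# Sample fan profile per intake air temperature: (rpm, cooling per interval in deg C).
FAN_PROFILE = {
    21: [(1000, 0.5), (2000, 1.5), (3000, 2.5)],
    22: [(1000, 0.4), (2000, 1.2), (3000, 2.0)],
}


def required_fan_rpm(estimated_temp, threshold, inlet_temp, baseline_cooling=0.0):
    """Return the lowest profiled rpm that removes the excess, or None if none can."""
    excess = max(estimated_temp - threshold, 0.0)   # S1301: change to be achieved
    needed = excess + baseline_cooling              # total cooling required
    for rpm, cooling in FAN_PROFILE[inlet_temp]:    # S1302: consult the fan profile
        if cooling >= needed:
            return rpm
    return None  # no profiled speed is sufficient
```

A `None` result corresponds to the case in which the fan alone cannot achieve the required change and the room cooling has to be adjusted instead.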
If the threshold is not exceeded, the process waits for a fixed time (S1108) and returns to monitoring the operation information, temperature, and rotation speed (S1102). The threshold may be the upper limit of the CPU temperature range 128, or it may be the upper limit minus a fixed margin. Using the upper limit minus a fixed margin as the threshold allows the CPU to keep operating at or below the upper limit even if the load rises suddenly.
In this embodiment, whether to control the fan speed is decided by whether the estimate after the fixed interval exceeds the threshold, but the invention is not limited to this. In another embodiment, the temperature monitoring unit 112 acquires the intake air temperature, and the CPU optimum temperature 129, the value that minimizes the sum of CPU leakage current and fan power consumption, is determined from the acquired intake air temperature. The determination in S1104 may then be made according to whether the estimate lies within the range from the CPU optimum temperature minus a fixed amount to the CPU optimum temperature plus a fixed amount.
Next, the server program 110 checks whether the physical computer 200 can change its fan speed to the calculated value (S1106), for example by checking whether the fan speed obtained in S1105 exceeds the maximum. If the change is possible (S1106: Y), the fan monitoring/control unit 116 instructs the physical computer 200 to change the fan speed (S1107). The process then waits for a fixed interval (S1108) and returns to monitoring operation information, temperature, and fan speed (S1102).
If the fan of the physical computer 200 cannot reach the fan speed determined by the fan speed determination unit 115 (S1106: N), the cooling control unit 117 changes the output of the cooling facilities 151a and 151b installed in the computer room 400 to lower the intake air temperature of that physical computer 200 (S1520). This processing is described with reference to FIG. 21.
FIG. 21 illustrates the cooling control flow in one embodiment of the present invention.
The cooling control unit 117 refers to the fan profile 123 to obtain the target intake air temperature that achieves the required CPU temperature change (S2101). Here the fan speed is taken to be the maximum achievable value. That is, if 5000 revolutions in FIG. 10 is the maximum and the required CPU temperature change is -3.0 °C, the target intake air temperature is found to be 21 °C.
The difference between the current intake air temperature and the target intake air temperature is then taken as the intake air temperature change target (S2102).
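Steps S2101 and S2102 can be sketched in the same lookup style. Only the pairing of a -3.0 °C change with a 21 °C intake temperature at the maximum fan speed comes from the text; the other table rows, the choice of the warmest feasible intake temperature, and the names `MAX_SPEED_PROFILE` and `intake_temperature_target` are assumptions for illustration.

```python
# Hypothetical slice of the fan profile at the maximum fan speed:
# intake temperature in deg C -> CPU temperature change in deg C.
# Only the 21 -> -3.0 row mirrors the example in the text; the rest is invented.
MAX_SPEED_PROFILE = {25: -2.0, 23: -2.5, 21: -3.0, 19: -3.5}


def intake_temperature_target(needed_change, current_intake, profile=MAX_SPEED_PROFILE):
    """Return (target intake temperature, required intake temperature drop), or None."""
    # S2101: intake temperatures at which the maximum fan speed achieves the change.
    feasible = [temp for temp, change in profile.items() if change <= needed_change]
    if not feasible:
        return None
    # Assumption of this sketch: prefer the warmest feasible intake temperature,
    # since it asks the least of the room cooling.
    target = max(feasible)
    # S2102: difference between the current and target intake temperatures.
    return target, current_intake - target
```

With a current intake temperature of 25 °C and a required change of -3.0 °C, the sketch yields a target of 21 °C and an intake temperature change target of 4 °C.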
Next, the cooling control unit 117 refers to the server configuration information 121 to identify the rack that houses the physical computer 200 under control. It then refers to the rack/cooling map 126 to identify the cooling facility responsible for cooling that rack. More than one rack may be identified. The cooling control unit 117 then instructs this cooling device to change its output according to the difference.
Concretely, the output of the cooling device is determined by referring to the cooling device profile 2010 and selecting an output stage whose rack temperature change is equal to or greater than the intake air temperature change target.
However, when several cooling devices cool the same rack, as in this embodiment, the power consumption of the whole computer room depends on which device's output is changed. A cooling control pattern list 2030 is therefore created, enumerating the combinations of outputs of cooling facility 151a and cooling facility 151b that bring the intake air temperature of the managed rack to the target intake air temperature (S2103).
The combination that minimizes the sum of the power consumption of cooling facility 151a and cooling facility 151b is then selected (S2104).
For example, the combinations include a pattern with cooling facility 151a at stage 1 and cooling facility 151b at stage 3, a pattern with both outputs at stage 2, and a pattern with cooling facility 151a at stage 3 and cooling facility 151b at stage 1. The power a cooling device consumes for a given temperature change, that is, for a given amount of cooling, depends on the characteristics of the device and its distance from the rack, so the patterns differ in power consumption. The outputs of cooling facilities 151a and 151b are then changed to lower the intake air temperature of the physical computer 200 (S2105).
The process then waits for a fixed interval (S1108) and returns to monitoring operation information, temperature, and fan speed (S1102).
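The pattern enumeration and selection of S2103 and S2104 can be sketched as a brute-force search. The per-device tables, the assumption that the temperature drops of the two facilities simply add, and the names `PROFILE_151A`, `PROFILE_151B`, and `select_pattern` are hypothetical; the description does not state how the combined effect is computed.

```python
from itertools import product

# Hypothetical cooling device profiles: output stage -> (rack temperature drop in deg C, power in kW).
PROFILE_151A = {1: (1.0, 1.0), 2: (2.0, 2.5), 3: (3.0, 4.5)}
PROFILE_151B = {1: (1.0, 1.5), 2: (2.0, 3.0), 3: (3.0, 5.0)}


def select_pattern(target_drop, a=PROFILE_151A, b=PROFILE_151B):
    """Return (stage of 151a, stage of 151b, total power) with the lowest total power."""
    # S2103: list every stage pair whose combined drop meets the target.
    # Assumption of this sketch: the drops of the two facilities are additive.
    patterns = [
        (stage_a, stage_b, a[stage_a][1] + b[stage_b][1])
        for stage_a, stage_b in product(a, b)
        if a[stage_a][0] + b[stage_b][0] >= target_drop
    ]
    if not patterns:
        return None
    # S2104: pick the pattern with the smallest summed power consumption.
    return min(patterns, key=lambda pattern: pattern[2])
```

For a target drop of 4 °C the candidate patterns include stages (1, 3), (2, 2), and (3, 1), matching the three patterns named in the text; with these invented figures the (2, 2) pattern is the cheapest.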
The physical computer profile 124 of this embodiment considers only the CPU, which is the main heat-generating component, but the heat generation profile may also hold information keyed to the utilization of components other than the CPU.
This embodiment describes the case in which the IT device is a server, but the IT device may also be a storage device or a network device. When the IT device is a storage device, its heat output varies not only with CPU utilization but also with IOPS (Input Output Per Second), which indicates the number of data input/output operations on the device, so the heat output can be estimated from the controller's CPU utilization and IOPS. Similarly, when the IT device is a network device, the heat output can be estimated from port utilization.
As described above, the power-saving control server 110 comprises: an operation information monitoring unit 111 that collects operation information and power consumption information of the physical computer 200; a temperature monitoring unit 112 that collects the intake air temperature, CPU temperature, and exhaust temperature of the managed server; a fan monitoring/control unit 116 that monitors the fan speed of the managed server and changes it; a CPU temperature estimation unit 113 that reads the heat generation profile 122 and the CPU temperature profile 125 and estimates the temperature, after a fixed interval, of the CPU in the managed server; a fan speed determination unit 115 that reads the fan profile 123 and determines the fan speed that lowers that temperature to the target value; a cooling control determination unit 114 that takes the cooling device profile 127 and the rack/cooling map 126 as input and determines the output of the cooling device 151; and a cooling control unit 117 that instructs control of the cooling device 151. With this configuration, the CPU is cooled in advance and held at its optimum temperature, which minimizes the power consumed by leakage current and raises cooling efficiency. In addition, because the server fans and the air conditioning are controlled together, even when fan control alone cannot keep the CPU temperature within the specified range, the air conditioning can be adjusted in advance to lower the intake air temperature, which keeps the CPU temperature within the specified range and avoids heat-induced failures.
In this embodiment, the BMC in the server device controls the fan on instruction from the management computer, but the invention is not limited to this. The server program 110 may be stored in the memory of the physical computer, and the operation information history 124, server configuration information 121, heat generation profile 122, fan profile 123, CPU temperature profile 125, cooling device profile 127, CPU temperature range 128, CPU optimum temperature 129, rack/cooling map 126, and cooling device profile 2010 may be stored in its storage device, so that the fan control and air conditioning control of this embodiment are performed within the physical computer.
A second embodiment of the present invention is described below with reference to the drawings. Drawings and description of components identical to those of the first embodiment are omitted where appropriate.
The system configuration of this embodiment is the same as in FIG. 1, and the physical computer is the same as in FIG. 3.
FIG. 14 shows the management computer of this embodiment. It differs from the management computer of the first embodiment in that it does not hold the operation history information 124 and instead holds a job execution schedule 132 and a server/job map 131. Description of the server configuration information 121, heat generation profile 122, fan profile 123, CPU temperature profile 125, rack/cooling map 126, CPU temperature range 128, and CPU optimum temperature 129, already covered in the first embodiment, is omitted.
The operation of the management computer is outlined here; details follow with reference to the drawings. The server program 110 derives the CPU utilization over a given period from the job execution schedule (load variation) of the business tasks run by the managed server. From that utilization and the heat output per utilization level defined for each CPU type, it estimates the CPU temperature after a fixed interval. On finding that the CPU temperature will exceed the upper limit of the range in which the CPU operates stably, it controls the fan speed of the managed server and, where necessary, the output of the cooling device.
FIG. 15 shows the structure of the server/job map 131 (FIG. 15(a)) and the job execution schedule 132 (FIG. 15(b)) in this embodiment.
The server/job map 131 (FIG. 15(a)) consists of one or more records, each made up of a physical computer ID 1401, which identifies a physical computer, and a business type 1402, which indicates the type of business task. Each record shows which task runs on which physical computer.
The job execution schedule 132 (FIG. 15(b)) consists, for each physical computer, of one or more records made up of a date 1411, day of week 1412, start time 1413, end time 1414, job ID 1415, and average CPU utilization 1416. Each record gives the average CPU utilization produced by the workload for a given day of the week and time slot. Server CPU utilization is predicted from batch job execution schedules, temporal variation in business requests, server on/off schedules, and the like. The job execution schedule 132 is derived from those schedules together with the server operation history recorded when equivalent processing was executed.
FIG. 16 is a flowchart showing the control flow of the second embodiment of the present invention. First, the operation information monitoring unit 111 of the server program 110 acquires the server configuration information 121 (S1601) and thereby obtains information on the physical computers 200 it manages. It then acquires the job execution schedule information for those physical computers.
Next, the server program 110 checks the job execution schedule to see whether the end time of the job being executed by the physical computer 200 has been reached (S1603). If it has (S1603: Y), the server program refers to the server/job map 131 to identify the job the physical computer 200 will execute next, then refers to the job execution schedule 132 for that job to obtain its start and end times and its average CPU utilization (S1604).
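The schedule lookup of S1603 and S1604 might look like the sketch below. It is heavily simplified under stated assumptions: it covers a single physical computer, drops the date and day-of-week fields and the server/job map, and assumes polling lands exactly on a scheduled boundary. The records and the names `SCHEDULE` and `next_job` are invented for the example.

```python
from datetime import time

# Hypothetical job execution schedule for one physical computer.
SCHEDULE = [
    {"start": time(0, 0), "end": time(6, 0), "job_id": "batch-1", "avg_cpu": 80},
    {"start": time(6, 0), "end": time(18, 0), "job_id": "online-1", "avg_cpu": 40},
    {"start": time(18, 0), "end": time(23, 59), "job_id": "idle", "avg_cpu": 0},
]


def next_job(now, schedule=SCHEDULE):
    """Return the record of the job starting at `now` if another job ends at `now`."""
    # S1603: has the end time of a running job been reached?
    if not any(record["end"] == now for record in schedule):
        return None
    # S1604: fetch start time, end time, and average CPU utilization of the next job.
    for record in schedule:
        if record["start"] == now:
            return record
    return None
```

The returned average CPU utilization is the value that would feed the temperature estimate; a value of 0 marks a load interval in which the fan could be slowed or stopped.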
The CPU temperature estimation unit 113 then estimates the CPU temperature of each physical computer 200 after a fixed interval from the average CPU utilization, the heat generation profile 122, and the CPU temperature profile 125 (S1604). The temperature estimation is the same as in the first embodiment.
Here, for example, if the average CPU utilization of the load interval about to start is 0, power consumption can be reduced by slowing or stopping the fan.
The processing of step 1606 is also the same as in the first embodiment.
In this embodiment, the job to be executed and the CPU utilization needed to process it are obtained from the job execution schedule. The invention is not limited to this, however. For example, data such as previously processed jobs and their CPU utilization may be stored, and the CPU utilization at a given time may be predicted from the stored data and used instead.
A third embodiment of the present invention is described below with reference to the drawings. Drawings and description of components identical to those of the first embodiment are omitted where appropriate.
The system configuration of this embodiment is the same as in FIG. 1, and the physical computer is the same as in FIG. 3.
FIG. 17 shows the management computer of this embodiment. It differs from the management computer of the first embodiment in that the server program includes a rule determination unit 133 and the storage device holds a cooling control rule. Description of the server configuration information 121, heat generation profile 122, fan profile 123, operation history information 124, CPU temperature profile 125, rack/cooling map 126, CPU temperature range 128, and CPU optimum temperature 129, already covered in the first embodiment, is omitted.
FIG. 18 shows the structure of the cooling control rule 131 of the third embodiment of the present invention.
The cooling control rule 131 consists of a rule 1010 and an action 1020. The rule 1010 consists of one or more records, each made up of an evaluation item 1011 and a threshold 1012. Each record expresses one condition, which is met when the value of the evaluation item is equal to or greater than the threshold. When the conditions expressed by all records are met, the control specified by the action 1020 is executed. Alternatively, the control specified by the action 1020 may be executed when the condition for a single item is met.
The cooling control rule is a condition under which an abnormal temperature rise can be judged to have begun, and it is defined in advance by the administrator of the computer room.
For example, under the cooling control rule 131 shown in FIG. 18, if the CPU temperature of every physical computer 200 housed in a rack exceeds 60 °C, the exhaust temperature exceeds 40 °C, and the fan speed exceeds 10000 revolutions/second, a pocket of hot air is deemed to have formed at the rear of the rack, where the physical computers 200 housed in it exhaust their air. The action 1020 is then executed to remove this heat: it specifies that the output of the cooling device 151 that cools the rack be set to maximum and that the grating plate of the air outlet located behind the rack be opened 100%.
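The rule evaluation can be sketched as follows. The thresholds are those of the FIG. 18 example in the text; the dictionary layout, the `require_all` switch for the single-condition variant, and the name `rule_fires` are assumptions. The comparison uses "equal to or greater than", following the definition of a condition in the rule description.

```python
# Cooling control rule modeled on the FIG. 18 example; the data layout is hypothetical.
RULE = {
    "conditions": [("cpu_temp", 60), ("exhaust_temp", 40), ("fan_speed", 10000)],
    "action": "set cooling device output to maximum and open the rear grating plate 100%",
}


def rule_fires(servers, rule=RULE, require_all=True):
    """Decide whether the action applies to one rack.

    `servers` is a list of readings, one dict per physical computer in the rack.
    """
    if not servers:
        return False

    def condition_met(item, threshold):
        # A condition holds for the rack only if every server in it meets the threshold.
        return all(server[item] >= threshold for server in servers)

    results = [condition_met(item, threshold) for item, threshold in rule["conditions"]]
    # Default: all conditions must hold; the variant fires on any single condition.
    return all(results) if require_all else any(results)
```

A rack in which one server is still below 60 °C does not trigger the action under the default setting, but does under the single-condition variant if, for instance, every exhaust temperature is at or above 40 °C.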
As a further option, using the power information of FIG. 5(b), it is also effective to judge, when the total power consumption of the physical computers 200 housed in the same rack exceeds a given value, that the heat produced by that group of physical computers exceeds a given value, and to raise the output of the cooling device that cools the rack by one stage.
It is also effective, when the intake air temperature of every physical computer 200 housed in the same rack exceeds a given threshold, to deem that a pocket of hot air has formed at the front of the rack, set the output of the cooling device 151 that cools the rack to maximum, and open the grating plate of the air outlet located in front of the rack 100%.
FIG. 19 is a flowchart showing the control flow of the third embodiment of the present invention.
First, the operation information monitoring unit 111 of the power-saving control server 110 acquires the server configuration information 121 (S1901).
It then obtains information on the physical computers 200 it manages, collects their operation information and power consumption, and stores them in the server operation history 124. The temperature monitoring unit 112 collects the current CPU temperature and exhaust temperature, and the fan monitoring/control unit 116 collects the fan speed (S1902).
Next, based on the collected information, the rule determination unit 133 compares, for the physical computers 200 housed in each rack, the value of each item of the rule 1010 of the cooling control rule 131 with its threshold 1012 (S1903). If the values of all items are equal to or greater than their thresholds 1012, so that the rule 1010 is satisfied (S1904: Y), the unit refers to the rack/cooling map 126 to identify the cooling device 151 that cools the rack and the air outlet located behind the rack, and executes the cooling control specified by the action 1020 (S1905). The process then waits for a fixed interval (S1906) and returns to monitoring operation information and the other values (S1902).
According to this embodiment, the cost of installing sensors can be saved compared with a method that detects temperature rises by installing many sensors around the managed equipment.

Claims (15)

  1.  A control method performed by a management computer connected to a cooling device and to a server device having a processor and a fan, the method comprising:
     acquiring, from the server, the temperature and utilization of the processor, the rotational speed of the fan, and the intake air temperature of the server;
     calculating, from the temperature and the utilization of the processor, the rotational speed of the fan, and the intake air temperature, an estimated temperature of the processor after a predetermined period has elapsed;
     when the estimated temperature is equal to or higher than a first predetermined value, determining a target rotational speed of the fan at which the estimated temperature after the period has elapsed is equal to or lower than the predetermined value; and
     instructing the server device to set the fan to the target rotational speed.
  2.  The control method according to claim 1, wherein:
     the management computer has a memory that stores an operation information monitoring unit, a fan monitoring unit, a fan control unit, a temperature estimation unit, and a cooling control unit;
     the operation information monitoring unit acquires the utilization of the processor from the server;
     the fan monitoring unit acquires the rotational speed of the fan;
     the temperature estimation unit acquires the intake air temperature and the temperature of the processor, calculates a temperature rise of the processor from the temperature and the utilization of the processor, calculates a cooling temperature from the rotational speed of the fan and the intake air temperature, and calculates the estimated temperature by subtracting the cooling temperature from the temperature rise;
     the fan control unit instructs the server to change the fan to the target rotational speed; and
     when the estimated temperature after the period has elapsed does not fall to or below the predetermined value even with the rotational speed of the fan at its maximum, the cooling control unit determines the target intake air temperature at which, with the rotational speed of the fan set to a specified value, the estimated temperature after the period has elapsed is equal to or lower than the predetermined value, determines the output of the cooling device needed to reach the target intake air temperature, and instructs the cooling device to change to that output.
  3.  The control method according to claim 1, wherein:
     when the estimated temperature after the period has elapsed does not fall to or below the predetermined value even with the rotational speed of the fan at its maximum, the output of the cooling device is raised.
  4.  The control method according to claim 3, wherein, when the output of the cooling device is raised:
     the target intake air temperature at which, with the rotational speed of the fan set to a specified value, the estimated temperature after the period has elapsed is equal to or lower than the predetermined value is determined; and
     the output of the cooling device needed to reach the target intake air temperature is determined.
  5.  The control method according to claim 4, wherein, when there are a plurality of the cooling devices:
     a plurality of combinations of outputs of the plurality of cooling devices are calculated;
     the power consumption of the plurality of cooling devices is calculated for each of the plurality of combinations;
     a combination is selected from the plurality of combinations based on the power consumption; and
     the outputs of the plurality of cooling devices are determined according to the selected combination.
  6.  The control method according to claim 1, wherein:
     in calculating the estimated temperature, a temperature rise of the processor is calculated from the temperature and the utilization of the processor, a cooling temperature is calculated from the rotational speed of the fan and the intake air temperature, and the estimated temperature is calculated by subtracting the cooling temperature from the temperature rise.
  7.  The control method according to claim 6, wherein:
     the type of the processor of the server device is recognized; and
     the temperature rise is calculated using the relationship between utilization and heat output that corresponds to the type of the processor.
  8.  The control method according to claim 1, wherein:
     when the estimated temperature is equal to or higher than the first predetermined value and equal to or lower than the second predetermined value, a target rotational speed of the fan at which the estimated temperature after the period has elapsed is equal to or lower than the predetermined value is determined; and
     the first predetermined value is a processor optimum temperature plus a first temperature, and the second predetermined value is the processor optimum temperature minus a second temperature.
  9.  The control method according to claim 9, wherein:
     the processor optimum temperature is calculated from the power consumption of the processor and the power consumption of the fan.
  10.  A control method performed by a management computer connected to a cooling device and to a server device having a processor and a fan, the method comprising:
     acquiring, from the server, the temperature of the processor, the rotational speed of the fan, and the intake air temperature of the server;
     acquiring, from the server, information about a job executed by the processor;
     obtaining a utilization from the information about the job;
     calculating, from the temperature and the utilization of the processor, the rotational speed of the fan, and the intake air temperature, an estimated temperature of the processor after a predetermined period has elapsed;
     when the estimated temperature is equal to or higher than a first predetermined value, determining a target rotational speed of the fan at which the estimated temperature after the period has elapsed is equal to or lower than the predetermined value; and
     instructing the server device to set the fan to the target rotational speed.
  11.  A server system comprising:
     the server device, which has a processor and a fan and measures the temperature and operating rate of the processor, the rotational speed of the fan, and the intake air temperature; and
     a management computer that is connected to the server device and the cooling device, that calculates, from the temperature and the operating rate of the processor, the rotational speed of the fan, and the intake air temperature, an estimated temperature of the processor after a predetermined period has elapsed, and that, when the estimated temperature is equal to or higher than a first predetermined value, determines a target rotational speed of the fan at which the estimated temperature after the period has elapsed is equal to or lower than the predetermined value.
  12.  The server system according to claim 11,
     further comprising a cooling device, wherein
     the management computer issues an instruction to increase the output of the cooling device when, even with the rotational speed of the fan at its maximum, the estimated temperature after the period has elapsed does not fall to or below the predetermined value.
  13.  The server system according to claim 12, wherein
     the management computer determines the target intake air temperature at which, with the rotational speed of the fan set to a specified value, the estimated temperature after the period has elapsed is equal to or lower than the predetermined value, and determines the output of the cooling device required to reach the target intake air temperature.
  14.  The server system according to claim 11, wherein
     when calculating the estimated temperature, the management computer calculates a rise temperature of the processor from the temperature and the operating rate of the processor, calculates a cooling temperature from the rotational speed of the fan and the intake air temperature, and calculates the estimated temperature by subtracting the cooling temperature from the rise temperature.
  15.  The server system according to claim 14, wherein
     the management computer recognizes the type of the processor of the server device and, when calculating the rise temperature, uses the relationship between operating rate and heat generation that corresponds to that type of processor.
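
The estimation and escalation steps recited in claims 10 to 15 can be illustrated with a minimal Python sketch. The claims give no formulas, so everything numeric here is an assumption made only for illustration: the linear heating model, the coefficients `HEAT_COEFF` and `COOL_COEFF`, the rpm limits, the step sizes, and all function and field names are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Optional

# All constants below are illustrative assumptions, not values from the patent.
HEAT_COEFF = {"type_a": 0.2, "type_b": 0.3}  # deg C per second at 100% operating rate
COOL_COEFF = 1.0e-6  # fraction of (T - intake) removed per second per rpm
MAX_RPM = 10000
RPM_STEP = 500


@dataclass
class Reading:
    cpu_type: str          # processor type, selects the heating relationship (claim 15)
    cpu_temp: float        # deg C
    operating_rate: float  # 0.0 .. 1.0
    fan_rpm: int
    intake_temp: float     # deg C


def estimate_temp(r: Reading, rpm: int, intake_temp: float, period_s: float) -> float:
    """Claim 14 pattern: rise temperature minus cooling temperature."""
    rise = r.cpu_temp + HEAT_COEFF[r.cpu_type] * r.operating_rate * period_s
    fraction = min(1.0, COOL_COEFF * rpm * period_s)  # never cool below intake air
    cooling = fraction * max(rise - intake_temp, 0.0)
    return rise - cooling


def target_fan_rpm(r: Reading, limit: float, period_s: float) -> Optional[int]:
    """Lowest candidate rpm whose estimate is <= limit, or None if MAX_RPM is not enough."""
    candidates = list(range(r.fan_rpm, MAX_RPM, RPM_STEP)) + [MAX_RPM]
    for rpm in candidates:
        if estimate_temp(r, rpm, r.intake_temp, period_s) <= limit:
            return rpm
    return None


def target_intake_temp(r: Reading, limit: float, period_s: float, rpm: int = MAX_RPM,
                       min_intake: float = 15.0, step: float = 0.5) -> Optional[float]:
    """Claim 13 pattern: highest intake temperature that meets the limit at a fixed rpm."""
    intake = r.intake_temp
    while intake >= min_intake:
        if estimate_temp(r, rpm, intake, period_s) <= limit:
            return intake
        intake -= step
    return None


def decide(r: Reading, limit: float, period_s: float = 60.0) -> dict:
    """Fan first; ask the cooling device for cooler intake air only if the fan cannot cope."""
    if estimate_temp(r, r.fan_rpm, r.intake_temp, period_s) < limit:
        return {"fan_rpm": r.fan_rpm, "intake_temp": None}  # no action needed
    rpm = target_fan_rpm(r, limit, period_s)
    if rpm is not None:
        return {"fan_rpm": rpm, "intake_temp": None}
    return {"fan_rpm": MAX_RPM, "intake_temp": target_intake_temp(r, limit, period_s)}


if __name__ == "__main__":
    print(decide(Reading("type_a", 60.0, 0.9, 3000, 25.0), limit=60.0))
    print(decide(Reading("type_b", 75.0, 1.0, 3000, 30.0), limit=55.0))
```

In this sketch a `None` value for `intake_temp` means the cooling device is left alone. Converting a target intake air temperature into a cooling device output, which claim 13 also recites, is not modeled here because the claims do not say how that mapping is obtained.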
PCT/JP2009/000806 2008-10-31 2009-02-24 Physical computer, method for controlling cooling device, and server system WO2010050080A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2008-280698 2008-10-31
JP2008280698A JP2010108324A (en) 2008-10-31 2008-10-31 Physical computer, method for controlling cooling device, and server system

Publications (1)

Publication Number Publication Date
WO2010050080A1 (en) 2010-05-06

Family

ID=42128456

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2009/000806 WO2010050080A1 (en) 2008-10-31 2009-02-24 Physical computer, method for controlling cooling device, and server system

Country Status (2)

Country Link
JP (1) JP2010108324A (en)
WO (1) WO2010050080A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011196617A (en) * 2010-03-19 2011-10-06 Fujitsu Ltd Air conditioning system and air conditioning method
CN102314206A (en) * 2010-07-06 2012-01-11 英业达股份有限公司 Fan speed control device for server
CN102478935A (en) * 2010-11-29 2012-05-30 英业达股份有限公司 Rack-mounted server system
CN103883851A (en) * 2014-04-18 2014-06-25 沈潇 Air-cooling swinging laptop support
US9341190B2 (en) 2012-10-18 2016-05-17 International Business Machines Corporation Thermal control system based on nonlinear zonal fan operation and optimized fan power
US9354126B2 (en) 2012-11-30 2016-05-31 International Business Machines Corporation Calibrating thermal behavior of electronics
US10180665B2 (en) 2011-09-16 2019-01-15 Lenovo Enterprise Solutions (Singapore) Pte. Ltd. Fluid-cooled computer system with proactive cooling control using power consumption trend analysis
WO2023078237A1 (en) * 2021-11-02 2023-05-11 北京百度网讯科技有限公司 Method and apparatus for controlling temperature of cloud mobile phone server, and device

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5197675B2 (en) * 2010-05-14 2013-05-15 株式会社東芝 Air conditioning system
JP2012053678A (en) * 2010-09-01 2012-03-15 Fujitsu Ltd Fan control program, fan control method and information processing equipment
JP5545137B2 (en) * 2010-09-03 2014-07-09 富士通株式会社 Electronic device, control program, and shutdown control method
CN102478937A (en) * 2010-11-30 2012-05-30 英业达股份有限公司 Rack-mounted server system
JP5511698B2 (en) * 2011-01-20 2014-06-04 日本電信電話株式会社 Air conditioner linkage control system, air conditioner linkage control method, and air conditioner linkage control program
JP5672099B2 (en) * 2011-03-22 2015-02-18 富士通株式会社 Device equipped with electronic device, cooling program for device equipped with electronic device, and method for cooling device equipped with electronic device
JP5691933B2 (en) 2011-08-16 2015-04-01 富士通株式会社 Air conditioning control method, air conditioning control system, and air conditioning control device
WO2013038470A1 (en) 2011-09-12 2013-03-21 富士通株式会社 Cooling system, cooling method, and cooling control program
JP5568535B2 (en) * 2011-09-28 2014-08-06 株式会社日立製作所 Data center load allocation method and information processing system
JP5911970B2 (en) * 2011-12-29 2016-04-27 インテル コーポレイション Adaptive temperature throttling with user configuration features
JP5835465B2 (en) 2012-03-30 2015-12-24 富士通株式会社 Information processing apparatus, control method, and program
JP2013213635A (en) * 2012-04-03 2013-10-17 Nippon Telegr & Teleph Corp <Ntt> Air conditioning control method, and air conditioning control system
JP6036428B2 (en) * 2013-03-18 2016-11-30 富士通株式会社 Electronic device cooling system and electronic device cooling method
JP6020283B2 (en) * 2013-03-26 2016-11-02 富士通株式会社 Electronic equipment cooling system
JP6083305B2 (en) * 2013-04-08 2017-02-22 富士通株式会社 Electronic equipment cooling system
JP6002098B2 (en) * 2013-07-23 2016-10-05 日本電信電話株式会社 Air conditioning control method and air conditioning control system
JP6417672B2 (en) 2014-02-27 2018-11-07 富士通株式会社 Data center, data center control method and control program
JP6287434B2 (en) * 2014-03-26 2018-03-07 日本電気株式会社 Temperature control device, temperature control method, and temperature control program
JP6384321B2 (en) 2014-12-26 2018-09-05 富士通株式会社 Job allocation program, method and apparatus
JP6628311B2 (en) 2016-03-24 2020-01-08 Necプラットフォームズ株式会社 Fan control device, cooling fan system, computer device, fan control method and program
JP7037057B2 (en) * 2018-06-29 2022-03-16 富士通株式会社 Electronic devices, control programs and control methods
JP7213049B2 (en) * 2018-09-28 2023-01-26 中央電子株式会社 Fan control device, control method, control program, server rack, and air conditioning management system
JP7292908B2 (en) * 2019-03-13 2023-06-19 株式会社東芝 Abnormality detection device, abnormality detection method, and program
JP6659896B2 (en) * 2019-05-20 2020-03-04 株式会社Nttファシリティーズ Air conditioning system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002175131A (en) * 2000-12-07 2002-06-21 Nec Yonezawa Ltd Information processor
JP2006208000A (en) * 2005-01-28 2006-08-10 Hewlett-Packard Development Co Lp Heat/electric power controller
JP2007041739A (en) * 2005-08-02 2007-02-15 Nec Access Technica Ltd Fan drive control method and device
US20070067136A1 (en) * 2005-08-25 2007-03-22 Conroy David G Methods and apparatuses for dynamic thermal control

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011196617A (en) * 2010-03-19 2011-10-06 Fujitsu Ltd Air conditioning system and air conditioning method
CN102314206A (en) * 2010-07-06 2012-01-11 英业达股份有限公司 Fan speed control device for server
CN102478935A (en) * 2010-11-29 2012-05-30 英业达股份有限公司 Rack-mounted server system
US10180665B2 (en) 2011-09-16 2019-01-15 Lenovo Enterprise Solutions (Singapore) Pte. Ltd. Fluid-cooled computer system with proactive cooling control using power consumption trend analysis
US9341190B2 (en) 2012-10-18 2016-05-17 International Business Machines Corporation Thermal control system based on nonlinear zonal fan operation and optimized fan power
US9360021B2 (en) 2012-10-18 2016-06-07 International Business Machines Corporation Thermal control system based on nonlinear zonal fan operation and optimized fan power
US9354126B2 (en) 2012-11-30 2016-05-31 International Business Machines Corporation Calibrating thermal behavior of electronics
US9534967B2 (en) 2012-11-30 2017-01-03 International Business Machines Corporation Calibrating thermal behavior of electronics
US9702767B2 (en) 2012-11-30 2017-07-11 International Business Machines Corporation Calibrating thermal behavior of electronics
CN103883851A (en) * 2014-04-18 2014-06-25 沈潇 Air-cooling swinging laptop support
WO2023078237A1 (en) * 2021-11-02 2023-05-11 北京百度网讯科技有限公司 Method and apparatus for controlling temperature of cloud mobile phone server, and device

Also Published As

Publication number Publication date
JP2010108324A (en) 2010-05-13

Similar Documents

Publication Publication Date Title
WO2010050080A1 (en) Physical computer, method for controlling cooling device, and server system
JP4922255B2 (en) Information processing system and power saving control method in the system
US7991515B2 (en) Computer cooling system with preferential cooling device selection
US8341433B2 (en) Method and system for managing the power consumption of an information handling system
US8676397B2 (en) Regulating the temperature of a datacenter
US8019477B2 (en) Energy efficient CRAC unit operation
US9671839B2 (en) Information handling system dynamic acoustical management
US7584021B2 (en) Energy efficient CRAC unit operation using heat transfer levels
JP5835465B2 (en) Information processing apparatus, control method, and program
US9192076B2 (en) Methods for managing fans within information handling systems
US9723763B2 (en) Computing device, method, and computer program for controlling cooling fluid flow into a computer housing
JP2008235696A (en) Fan rotation control method, fan rotation control system, and fan rotation control program
JPWO2010050249A1 (en) Information management system operation management device
KR20140092328A (en) System and method for determining thermal management policy from leakage current measurement
US20200073456A1 (en) Adjusting a power limit in response to a temperature difference
JP5921461B2 (en) Outside air and local cooling information processing system and its load allocation method
JP5969939B2 (en) Data center air conditioning controller
CN107300278B (en) System and method for minimizing compressor usage in an HVAC system
US8630739B2 (en) Exergy based evaluation of an infrastructure
EP2575003B1 (en) Method for determining assignment of loads of data center and information processing system
JP5165104B2 (en) Information processing system and power saving control method in the system
US20210271300A1 (en) Dynamic thermal control
Chaudhry et al. Considering thermal-aware proactive and reactive scheduling and cooling for green data-centers
Noguchi et al. Shutter control for cooling air flow management in data center servers
JP2017084013A (en) Electric power management program, electric power management method, and electric power management unit

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 09823197; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 09823197; Country of ref document: EP; Kind code of ref document: A1