WO2022021240A1 - Thermal-aware scheduling method and system - Google Patents

Thermal-aware scheduling method and system

Info

Publication number
WO2022021240A1
Authority
WO
WIPO (PCT)
Prior art keywords
servers
server
server cluster
operating modes
operating mode
Prior art date
Application number
PCT/CN2020/105938
Other languages
English (en)
French (fr)
Inventor
Xu Zhao
Yijun Lu
Zhan Li
Jian Tan
Youquan FENG
Yuan Tao
Original Assignee
Alibaba Cloud Computing Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Cloud Computing Ltd.
Priority to CN202080104888.7A (publication CN116458140A)
Priority to PCT/CN2020/105938 (publication WO2022021240A1)
Publication of WO2022021240A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5094: Allocation of resources, e.g. of the central processing unit [CPU] where the allocation takes into account power or heat criteria
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 1/00: Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F 1/16: Constructional details or arrangements
    • G06F 1/20: Cooling means
    • G06F 1/206: Cooling means comprising thermal management
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 1/00: Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F 1/26: Power supply means, e.g. regulation thereof
    • G06F 1/28: Supervision thereof, e.g. detecting power-supply failure by out of limits supervision
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 1/00: Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F 1/26: Power supply means, e.g. regulation thereof
    • G06F 1/32: Means for saving power
    • G06F 1/3203: Power management, i.e. event-based initiation of a power-saving mode
    • G06F 1/3206: Monitoring of events, devices or parameters that trigger a change in power modality
    • G06F 1/3215: Monitoring of peripheral devices
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 1/00: Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F 1/26: Power supply means, e.g. regulation thereof
    • G06F 1/32: Means for saving power
    • G06F 1/3203: Power management, i.e. event-based initiation of a power-saving mode
    • G06F 1/3234: Power saving characterised by the action undertaken
    • G06F 1/329: Power saving characterised by the action undertaken by task scheduling
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/084: Backpropagation, e.g. using gradient descent
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • Task scheduling or load balancing is a process of distributing or assigning a number of incoming tasks to a plurality of available resources such as computing units or processes, to enable an overall performance of the plurality of available resources to be efficient and the number of incoming tasks to be processed in time.
  • task scheduling or load balancing has become important for ensuring the success and the efficient performance of cloud computing and data centers to fulfill the demands and requirements of the users.
  • Existing task scheduling or load balancing usually adopts strategies that randomly assign incoming tasks to a number of available resources, or assign the tasks to resources that currently have the least amount of workload or connections.
  • current task scheduling or load balancing strategies are too simple, and fail to consider physical conditions of the available resources, which may affect not only the working conditions of the resources, but also the health conditions of the resources.
  • FIG. 1 illustrates an example environment in which a thermal-aware scheduling system may be used.
  • FIG. 2 illustrates an example server in more detail.
  • FIG. 3 illustrates the example thermal-aware scheduling system in more detail.
  • FIG. 4 illustrates an example method of scheduling a task to be assigned.
  • FIG. 5 illustrates an example operating mode estimation model.
  • the thermal-aware scheduling system may consider operating conditions or modes of cooling components (e.g., fans) associated with a cluster of computing resources (such as servers, etc.) in a cloud computing or data center infrastructure to perform task scheduling or load balancing for incoming tasks or requests.
  • the power consumed by a certain type of cooling component depends on some operating parameters of the cooling component (for example, the power consumed by a fan may have approximately a cubic relationship with the speed of the fan), and may take up a large portion of the power consumed by a server to which the fan is attached and from which the fan receives power. Therefore, the thermal-aware scheduling system may perform task scheduling or load balancing based at least in part on an estimation of the operating conditions or modes of the cooling components of the computing resources. A numeric sketch of this cubic relationship is given below.
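  • For illustration, here is a minimal numeric sketch of the cubic fan law mentioned above (the nominal speed and power values are assumptions chosen for the example, not figures from this disclosure):

        def fan_power_watts(speed_rpm: float,
                            nominal_rpm: float = 12000.0,
                            nominal_power_w: float = 60.0) -> float:
            """Approximate fan power via the cubic fan affinity law: P scales with speed**3."""
            return nominal_power_w * (speed_rpm / nominal_rpm) ** 3

        # Running at half speed draws roughly one eighth of the nominal power:
        print(fan_power_watts(6000.0))  # -> 7.5
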
  • a plurality of operating modes of a cooling component may exist, and depend on a type of the cooling component and/or a setting of a manufacturer of the cooling component.
  • operating modes of a cooling component may include at least two modes, e.g., a first mode and a second mode (also called an acoustic mode and a performance mode), to save fan power or to guarantee thermal safety of an associated computing resource (such as a server, etc.) under different circumstances.
  • a first mode of a cooling component (such as a fan) associated with a computing resource may refer to an operating mode in which the cooling component operates at a low speed with a low temperature of inflow air (or a cooler airflow).
  • a second mode of the cooling component associated with the computing resource may refer to an operating mode in which the cooling component operates at a high speed to protect the computing resource from a thermal breakdown due to a high temperature of inflow air (or a warmer airflow). A minimal enumeration of such modes is sketched below.
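  • These two modes can be represented as a small enumeration; a minimal sketch (the names and comments are illustrative assumptions, not identifiers from this disclosure):

        from enum import Enum

        class CoolingMode(Enum):
            """Illustrative operating modes of a cooling component."""
            ACOUSTIC = 1      # first mode: low fan speed, cool inflow air, lower power
            PERFORMANCE = 2   # second mode: high fan speed, guards against thermal breakdown
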
  • the thermal-aware scheduling system may receive a task to be assigned from a client device.
  • the client device may send the task to be assigned to the cloud computing or data center infrastructure, and the thermal-aware scheduling system may receive the task to be assigned from an edge router through which the task to be assigned is transmitted to the cloud computing or data center infrastructure.
  • the thermal-aware scheduling system may collect, from the computing resources of the same cluster in the data center or cloud computing infrastructure, first pieces of information that are useful for estimating operating modes of the cooling components associated with those computing resources, and collect, from an environment of the computing resources of the same cluster, second pieces of information that are useful for estimating operating modes of the cooling components.
  • the first pieces of information may include, but are not limited to, respective power consumption information, respective processor utilization information, and respective inlet temperatures of the computing resources of the same cluster.
  • the second pieces of information may include, but are not limited to, operating conditions of cooling units (such as set points of air conditioners and/or fans in a facility housing the computing resources, etc.) and an ambient temperature of the environment of the computing resources (e.g., a room temperature of the facility housing the computing resources).
  • the facility may include a room housing at least a portion of the data center or cloud computing infrastructure such as the computing resources.
  • the thermal-aware scheduling system may estimate or predict the operating modes of the computing resources of the same cluster based at least in part on the first pieces of information and the second pieces of information using respective operating mode estimation models. In implementations, the thermal-aware scheduling system may select a computing resource from the computing resources based on the operating modes of the computing resources, and assign the task to be assigned to the selected computing resource.
  • the example thermal-aware scheduling system can deterministically or strategically assign or redirect a request for processing a distributed database transaction from a client device to a computing node that includes a data section of a data table involved in at least one query of the distributed database transaction, thus avoiding the use of a control or coordinate node and hence reducing the communication costs and resource waste.
  • a receiving service may receive a task to be assigned from a client device, while a collection service may collect information from a plurality of computing resources of a cluster and an environment housing the cluster, and yet another estimation service may estimate or predict operating modes of cooling components associated with the plurality of computing resources.
  • a selection service may select a computing resource from the plurality of computing resources based on the operating modes of the computing resources, and an assignment service may assign the task to be assigned to the selected computing resource.
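  • Taken together, the receiving, collection, estimation, selection, and assignment services described in the two items above can be composed into a single pipeline; a sketch under assumed service interfaces (none of the callable names below come from this disclosure):

        def handle_task(task, servers, collect_info, estimate_modes, select_server, assign_task):
            """Compose the five services; every callable argument is an assumed interface."""
            server_info, env_info = collect_info(servers)            # collection service
            modes = estimate_modes(servers, server_info, env_info)   # estimation service
            chosen = select_server(servers, modes, server_info)      # selection service
            assign_task(task, chosen)                                # assignment service
            return chosen
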
  • the thermal-aware scheduling system may be implemented as a combination of software and hardware implemented and distributed in multiple devices; in other examples, the thermal-aware scheduling system may be implemented and distributed as services provided in one or more computing devices over a network and/or in a cloud computing architecture.
  • the application describes multiple and varied embodiments and implementations.
  • the following section describes an example framework that is suitable for practicing various implementations.
  • the application describes example systems, devices, and processes for implementing a thermal-aware scheduling system.
  • FIG. 1 illustrates an example environment 100 usable to implement a thermal-aware scheduling system.
  • the environment 100 may include a thermal-aware scheduling system 102, and a plurality of servers 104-1, 104-2, 104-3, 104-4, 104-6, ..., 104-N (or called a plurality of computing resources), which are collectively called servers 104.
  • the thermal-aware scheduling system 102 and the plurality of servers 104 may communicate data with one another via a network 106.
  • the plurality of servers 104 may include or be peripheral with a plurality of cooling components 108-1, 108-2, 108-3, 108-4, 108-5, ..., 108-M, which are collectively called cooling components 108.
  • the cooling components 108 may be configured to provide cooling effects on the plurality of servers 104.
  • the thermal-aware scheduling system 102 is described as being an individual entity or device. In other instances, the thermal-aware scheduling system 102 may be located in one of the plurality of servers 104, or may be located in a dedicated server such as a task scheduling server 110 (or called a load balancing server).
  • the environment 100 may further include one or more cooling units 112 and one or more sensors 114.
  • the one or more cooling units 112 may include, but are not limited to, air conditioners, fans, etc.
  • the one or more cooling units 112 may be configured to provide cooling effects on the plurality of servers 104 and/or control an ambient temperature of a physical environment of the plurality of servers 104, in addition to the cooling components 108.
  • the physical environment of the plurality of servers 104 may include a facility including or housing the plurality of servers 104, such as a data center room, etc.
  • the one or more sensors 114 may be configured to measure the ambient temperature of the physical environment of the plurality of servers 104, measure or detect respective setpoints or operating conditions of the one or more cooling units 112, etc.
  • the thermal-aware scheduling system 102 and the plurality of servers 104 may be included in a data center or cloud computing infrastructure, or at least a portion of a data center or cloud computing infrastructure. In other words, the thermal-aware scheduling system 102 and the plurality of servers 104 may form at least a portion of a data center or cloud.
  • the plurality of servers 104 may be divided or grouped into multiple server clusters, and each server cluster may include a number of servers 104 that may be physically close to each other (e.g., servers located in a same rack, servers located in a same storage cabinet, servers located in a same room, etc. ) .
  • the data center or cloud computing infrastructure may be physically divided into multiple physical partitions, and each physical partition may include a portion of the plurality of servers 104 (i.e., a server cluster of multiple servers 104) .
  • each physical partition (or each server cluster) may have an upper limit (e.g., 30, 50, 70, etc.) on the number of servers 104 it includes.
  • the upper limit of each physical partition may depend on a physical configuration or arrangement of the servers 104 in the respective physical partition (or the respective server cluster).
  • a server cluster may include servers located in a same rack, or servers located in three adjacent racks, or servers located in a same storage cabinet, etc.
  • each of the plurality of servers 104 may be implemented as any of a variety of computing devices, including, but not limited to, a desktop computer, a notebook or portable computer, a handheld device, a netbook, an Internet appliance, a tablet or slate computer, a mobile device (e.g., a mobile phone, a personal digital assistant, a smart phone, etc.), a server computer, etc., or a combination thereof.
  • the network 106 may be a wireless or a wired network, or a combination thereof.
  • the network 106 may be a collection of individual networks interconnected with each other and functioning as a single large network (e.g., the Internet or an intranet) . Examples of such individual networks include, but are not limited to, telephone networks, cable networks, Local Area Networks (LANs) , Wide Area Networks (WANs) , and Metropolitan Area Networks (MANs) . Further, the individual networks may be wireless or wired networks, or a combination thereof.
  • Wired networks may include an electrical carrier connection (such as a communication cable, etc.) and/or an optical carrier or connection (such as an optical fiber connection, etc.).
  • Wireless networks may include, for example, a WiFi network, other radio frequency networks (e.g., Zigbee, etc. ) , etc.
  • a client device may send a request or task to the data center or cloud (e.g., an edge router of the data center or cloud) .
  • the thermal-aware scheduling system 102 may receive the request or task from the edge router, and select a server to process the request or task from among the plurality of servers 104. After selecting the server, the thermal-aware scheduling system 102 may forward the request or task to the selected server for processing.
  • FIG. 2 illustrates the server 104 in more detail.
  • the server 104 may include, but is not limited to, one or more processors 202, an input/output (I/O) interface 204, and/or a network interface 206, and memory 208.
  • some of the functions of the server 104 may be implemented using hardware, for example, an ASIC (i.e., Application-Specific Integrated Circuit), a FPGA (i.e., Field-Programmable Gate Array), and/or other hardware.
  • the processors 202 may be configured to execute instructions that are stored in the memory 208, and/or received from the I/O interface 204, and/or the network interface 206.
  • the processors 202 may be implemented as one or more hardware processors including, for example, a microprocessor, an application-specific instruction-set processor, a physics processing unit (PPU), a central processing unit (CPU), a graphics processing unit, a digital signal processor, a tensor processing unit, etc. Additionally or alternatively, the functionality described herein can be performed, at least in part, by one or more hardware logic components.
  • illustrative types of hardware logic components include field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), system-on-a-chip systems (SOCs), and complex programmable logic devices (CPLDs).
  • the memory 208 may include computer readable media in a form of volatile memory, such as Random Access Memory (RAM), and/or non-volatile memory, such as read only memory (ROM) or flash RAM.
  • the computer readable media may include volatile or non-volatile, removable or non-removable media, which may achieve storage of information using any method or technology.
  • the information may include a computer readable instruction, a data structure, a program module or other data.
  • Examples of computer readable media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random-access memory (RAM), read-only memory (ROM), electronically erasable programmable read-only memory (EEPROM), quick flash memory or other internal storage technology, compact disk read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassette tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission media, which may be used to store information that may be accessed by a computing device.
  • the computer readable media does not include any transitory media, such as modulated data signals and carrier waves.
  • the server 104 may further include other hardware components and/or other software components such as program units to execute instructions stored in the memory 208 for performing various operations, and program data 210 that stores application data and data of tasks processed by the server 104.
  • the server 104 may include system interfaces and/or platform management interfaces 212 that can be called or invoked by the thermal-aware scheduling system 102 to provide information to the thermal-aware scheduling system 102.
  • the server 104 may include or be peripheral with one or more cooling components (such as the cooling components 108 as shown in FIG. 1) .
  • FIG. 3 illustrates the thermal-aware scheduling system 102 in more detail.
  • the thermal-aware scheduling system 102 may include, but is not limited to, one or more processors 302, an input/output (I/O) interface 304, and/or a network interface 306, and memory 308.
  • some of the functions of the thermal-aware scheduling system 102 may be implemented using hardware, for example, an ASIC (i.e., Application-Specific Integrated Circuit), a FPGA (i.e., Field-Programmable Gate Array), and/or other hardware.
  • the thermal-aware scheduling system 102 is described to exist as an independent entity or device.
  • the thermal-aware scheduling system 102 may be included or located in any server of the plurality of servers 104, or may be included or located in a dedicated server (such as a task scheduling server 110, also called a load balancing server).
  • the processors 302 may be configured to execute instructions that are stored in the memory 308, and/or received from the I/O interface 304, and/or the network interface 306.
  • the processors 302 may be implemented as one or more hardware processors including, for example, a microprocessor, an application-specific instruction-set processor, a physics processing unit (PPU), a central processing unit (CPU), a graphics processing unit, a digital signal processor, a tensor processing unit, etc. Additionally or alternatively, the functionality described herein can be performed, at least in part, by one or more hardware logic components.
  • illustrative types of hardware logic components include field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), system-on-a-chip systems (SOCs), and complex programmable logic devices (CPLDs).
  • the memory 308 may include computer readable media in a form of volatile memory, such as Random Access Memory (RAM), and/or non-volatile memory, such as read only memory (ROM) or flash RAM.
  • the thermal-aware scheduling system 102 may further include other hardware components and/or other software components such as program units to execute instructions stored in the memory 308 for performing various operations and other program data 310.
  • the thermal-aware scheduling system 102 may further include a receiving module 312, a data collection module 314, an operating mode estimation module 316, and a scheduling module 318.
  • the thermal-aware scheduling system 102 may further include a model database 320 that is configured to store operating mode estimation models of the cooling components 108 of the plurality of servers 104 in the data center or cloud computing infrastructure.
  • the thermal-aware scheduling system 102 may include a task scheduling or load balancing strategy that is based on operating modes of cooling components of multiple servers 104 of a server cluster in a data center or cloud infrastructure, which is described in detail in a subsequent section.
  • the thermal-aware scheduling system 102 may further include one or more predetermined task scheduling or load balancing strategies.
  • the one or more predetermined task scheduling or load balancing strategies may include assigning a request or task (e.g., a task that is received from a client device) to a server in a random manner, assigning the request or task to a server in a round-robin manner, assigning the request or task to a server that currently has the least workload, assigning the request or task to a server based on a mapping relationship between an IP address of the client device and an IP address of the server, etc. Minimal sketches of these predetermined strategies follow.
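  • A minimal sketch of the four predetermined strategies listed above (the server records and their "workload" and "client_ip" fields are assumptions for illustration):

        import itertools
        import random

        def make_strategies(servers):
            """Return illustrative implementations of the predetermined strategies."""
            rr = itertools.cycle(servers)
            return {
                "random": lambda task: random.choice(servers),
                "round_robin": lambda task: next(rr),
                "least_workload": lambda task: min(servers, key=lambda s: s["workload"]),
                "ip_hash": lambda task: servers[hash(task["client_ip"]) % len(servers)],
            }

        servers = [{"name": "s1", "workload": 3}, {"name": "s2", "workload": 1}]
        strategies = make_strategies(servers)
        print(strategies["least_workload"]({"client_ip": "10.0.0.7"})["name"])  # -> s2
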
  • FIG. 4 illustrates an example method of scheduling a task to be assigned.
  • the method of FIG. 4 may, but need not, be implemented in the environment of FIG. 1 and using the server and the thermal-aware scheduling system of FIG. 2 and FIG. 3.
  • a method 400 is described with reference to FIGS. 1-3.
  • the method 400 may alternatively be implemented in other environments and/or using other systems.
  • the method 400 is described in the general context of computer-executable instructions.
  • computer-executable instructions can include routines, programs, objects, components, data structures, procedures, modules, functions, and the like that perform particular functions or implement particular abstract data types.
  • each of the example methods is illustrated as a collection of blocks in a logical flow graph representing a sequence of operations that can be implemented in hardware, software, firmware, or a combination thereof.
  • the order in which the method is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method, or alternate methods. Additionally, individual blocks may be omitted from the method without departing from the spirit and scope of the subject matter described herein.
  • the blocks represent computer instructions that, when executed by one or more processors, perform the recited operations.
  • some or all of the blocks may represent application specific integrated circuits (ASICs) or other physical components that perform the recited operations.
  • the thermal-aware scheduling system 102 may receive a request or task to be assigned.
  • a client device may send a request or task to be assigned to the cloud computing or data center infrastructure, for example, via an edge router of the cloud computing or data center infrastructure.
  • the thermal-aware scheduling system 102 (or the receiving module 312) may receive the request or task to be assigned from the edge router through the network 106.
  • the thermal-aware scheduling system 102 may collect power and performance status information of multiple servers of a server cluster in a data center or cloud, and environment information of the server cluster.
  • the thermal-aware scheduling system 102 may collect power and performance status information of multiple servers in a server cluster of the cloud computing or data center infrastructure, and environment information of the server cluster. In implementations, the thermal-aware scheduling system 102 (or the data collection module 314) may collect these pieces of information before or after receiving the request or task to be assigned. In implementations, the thermal-aware scheduling system 102 (or the data collection module 314) may collect these pieces of information on a periodic basis, e.g., after a predetermined time interval (such as after every one millisecond, every second, etc.).
  • an actual time length of the predetermined time interval may depend on one or more factors, which include, but are not limited to, the number of requests or tasks received per second, a time of day, the number of servers in the server cluster, etc.
  • the thermal-aware scheduling system 102 (or the data collection module 314) may include dedicated hardware and/or software components to collect these pieces of information continuously.
  • the thermal-aware scheduling system 102 may collect the power and performance status information of the multiple servers in the server cluster of the cloud computing or data center infrastructure by actively calling or invoking respective system interfaces or platform management interfaces of the multiple servers to obtain the power and performance status information directly from the multiple servers.
  • the platform management interfaces may include, for example, an IPMI (Intelligent Platform Management Interface), which is a set of standardized specifications for hardware-based platform management systems to enable controlling and monitoring servers centrally.
  • functions of an IPMI may include, for example, monitoring hardware status (which includes, but is not limited to, temperature, power consumption, voltage, etc.), logging server data, and allowing access to a server even when an operating system of the server is not installed or is malfunctioning, etc. A sketch of collecting such readings follows.
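  • One common way to pull such readings is the ipmitool command-line client; a sketch assuming ipmitool is installed, the BMCs are reachable over lanplus, and that sensor names such as "Inlet Temp" exist (sensor names vary by vendor):

        import subprocess

        def read_ipmi_sensors(host: str, user: str, password: str) -> dict:
            """Query a server's BMC for sensor readings via ipmitool (assumed deployment)."""
            out = subprocess.run(
                ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-P", password, "sensor"],
                capture_output=True, text=True, check=True,
            ).stdout
            readings = {}
            for line in out.splitlines():
                fields = [f.strip() for f in line.split("|")]
                if len(fields) >= 2 and fields[1] not in ("", "na"):
                    readings[fields[0]] = fields[1]  # e.g., readings["Inlet Temp"] -> "23.000"
            return readings
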
  • the thermal-aware scheduling system 102 may collect the environment information of the server cluster from the cooling units 112 and the one or more sensors 114 through one or more communications protocols.
  • a communications protocol is a system of rules that allows multiple entities of a communications system to transmit information via any kind of variation of a physical quantity.
  • the one or more communications protocols may include, but are not limited to, Modbus (which is an open serial protocol derived from a master/slave architecture), TCP (Transmission Control Protocol), UDP (User Datagram Protocol), etc.
  • the thermal-aware scheduling system 102 may collect the environment information of the server cluster from the cooling units 112 and the one or more sensors 114 through an IPC (Industrial Personal Computer), which is a computer intended for industrial purposes (such as production of goods and services), with a form factor between a nettop and a server rack. Additionally, the thermal-aware scheduling system 102 (or the data collection module 314) may collect the environment information of the server cluster from the cooling units 112 and the one or more sensors 114 through industrial switches, and/or any devices with hardware interfaces (e.g., serial data communication cables such as an RS485/232 adapter or RJ45 cable, etc.) that are configured or enabled to connect to the cooling units 112 and the one or more sensors 114 for collecting information from them.
  • the power and performance information of the plurality of servers may include, but is not limited to, respective power consumption information, respective processor utilization information, and respective inlet temperatures of the multiple servers in the server cluster. Additionally, the power and performance information of the plurality of servers may include respective memory utilization information, and respective input/output bandwidth information of the multiple servers in the server cluster.
  • the environment information of the server cluster may include, but is not limited to, operating conditions of cooling units, and an ambient temperature of an environment of the server cluster.
  • an operating condition of a cooling unit may include, for example, whether the cooling unit is turned on or off, a temperature setting of the cooling unit, an operating intensity of the cooling unit (such as high, medium, or low speed, etc. ) , etc.
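  • These fields can be captured in small record types; a minimal sketch with assumed field names (not taken from this disclosure):

        from dataclasses import dataclass
        from typing import List

        @dataclass
        class CoolingUnitStatus:
            """Operating condition of one cooling unit, as enumerated above."""
            powered_on: bool
            setpoint_c: float   # temperature setting of the cooling unit
            intensity: str      # operating intensity, e.g., "high", "medium", or "low"

        @dataclass
        class EnvironmentInfo:
            """Environment information of a server cluster."""
            cooling_units: List[CoolingUnitStatus]
            ambient_temperature_c: float
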
  • the thermal-aware scheduling system 102 may estimate or predict respective operating modes of cooling components of the multiple servers based at least in part on power and performance status information of the multiple servers and environment information of the server cluster using corresponding operating mode estimation models.
  • the thermal-aware scheduling system 102 may retrieve corresponding operating mode estimation models for the multiple servers or the cooling components of the multiple servers.
  • the thermal-aware scheduling system 102 may retrieve the corresponding operating mode estimation models from the model database 320.
  • each cooling component may have a corresponding operating mode estimation model for predicting or estimating an operating mode of the respective cooling component.
  • a one-to-one correspondence relationship between each cooling component and a corresponding operating mode estimation model exists.
  • servers in a same server cluster are physically close to each other, so heat dissipated or generated by the servers in the server cluster may have thermal effects on one another (e.g., increasing the temperature surrounding the servers of the server cluster), and thus may influence the operating modes of the cooling components associated with the servers.
  • the thermal-aware scheduling system 102 may estimate or predict respective operating modes of the cooling components of the multiple servers based at least in part on the power and performance status information of the multiple servers in the server cluster and the environment information of the server cluster using corresponding operating mode estimation models of the cooling components of the multiple servers.
  • a cooling component (e.g., one of the cooling components of the multiple servers within the server cluster) may be associated with an operating mode estimation model that may take the power and performance status information of the multiple servers within the server cluster and environment information of the server cluster as inputs, and produce outputs that are related to likelihoods or probabilities of the cooling component operating, or to be operating, at different operating modes.
  • the operating mode estimation model associated with the cooling component may not take as input information (such as power and performance status information, etc.) of servers that do not belong to the server cluster that includes the server associated with the cooling component.
  • the server cluster may include 10 servers, and each server may include or be peripheral with one cooling component (such as a fan, etc.).
  • an operating mode estimation model corresponding to the cooling component may take power and performance status information of the 10 servers within that server cluster and environment information of the server cluster as inputs, and produce outputs that are related to likelihoods or probabilities of the cooling component operating, or to be operating, at different operating modes (e.g., for N different operating modes, the operating mode estimation model produces N outputs, one for each operating mode).
  • different cooling components may have different or same numbers of operating modes. For example, a first cooling component may have 2 different operating modes, while a second cooling component may have 3 different operating modes.
  • the number of different operating modes of each cooling component may depend on a type of the respective cooling component, a setting of the respective cooling component configured by a manufacturer thereof, etc.
  • an operating mode estimation model may include, but is not limited to, a neural network model (for example, a deep neural network model, a backpropagation neural network model, etc. ) , a decision tree model, etc.
  • the thermal-aware scheduling system 102 may train and test each operating mode estimation model using historical input and output data.
  • historical input data for an operating mode estimation model of a cooling component of a particular server in a server cluster may include, but is not limited to, historical power and performance status information of each server in the server cluster that includes the particular server, and historical environment information of the server cluster.
  • historical output data for the operating mode estimation model of the cooling component of the particular server in the server cluster may include corresponding labeled outputs of likelihoods or probabilities of the cooling component operating, or to be operating, at L different operating modes, wherein L is an integer greater than or equal to two.
  • the corresponding labeled outputs may be obtained or labeled manually by users in advance.
  • FIG. 5 shows an example operating mode estimation model.
  • an operating mode estimation model of a cooling component included in or peripheral with a server in a server cluster (the server cluster including a plurality of servers) is described as a neural network model 500, such as a deep neural network model.
  • the neural network model 500 may include an input layer 502, a plurality of hidden layers 504-1, 504-2, ..., 504-K (where K is an integer greater than or equal to two), and an output layer 506.
  • the input layer 502 may take historical power and performance status information of the plurality of servers in the server cluster and historical environment information of the server cluster as inputs, and the output layer 506 may produce likelihoods or probabilities of the cooling component operating, or to be operating, at different operating modes as outputs. A structural sketch of such a model follows.
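  • A minimal structural sketch of such a model for the 10-server example above; the use of PyTorch, the feature counts, and the layer widths are all assumptions for illustration:

        import torch
        import torch.nn as nn

        N_SERVERS = 10        # servers in the cluster (per the example above)
        FEATS_PER_SERVER = 3  # power consumption, processor utilization, inlet temperature
        ENV_FEATS = 2         # cooling-unit setpoint, ambient temperature
        N_MODES = 2           # e.g., an acoustic mode and a performance mode

        model = nn.Sequential(
            nn.Linear(N_SERVERS * FEATS_PER_SERVER + ENV_FEATS, 64),  # input layer 502
            nn.ReLU(),
            nn.Linear(64, 64),                                        # hidden layers 504
            nn.ReLU(),
            nn.Linear(64, N_MODES),                                   # output layer 506
        )

        x = torch.randn(1, N_SERVERS * FEATS_PER_SERVER + ENV_FEATS)  # one cluster snapshot
        mode_probs = torch.softmax(model(x), dim=-1)  # one likelihood per operating mode
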
  • the thermal-aware scheduling system 102 may train and test the operating mode estimation model of the cooling component included in or peripheral with the server in the server cluster through a conventional supervised learning method using historical input and output data as described above.
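  • Such training could follow the usual cross-entropy/backpropagation recipe (cf. the G06N 3/084 classification above); a sketch assuming the model above plus historical tensors X (inputs) and y (manually labeled mode indices):

        import torch
        import torch.nn as nn

        def train(model, X, y, epochs: int = 100, lr: float = 1e-3):
            """Conventional supervised training on historical input/output data (illustrative)."""
            optimizer = torch.optim.Adam(model.parameters(), lr=lr)
            loss_fn = nn.CrossEntropyLoss()
            for _ in range(epochs):
                optimizer.zero_grad()
                loss = loss_fn(model(X), y)  # y holds the labeled operating modes
                loss.backward()              # backpropagation
                optimizer.step()
            return model
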
  • the thermal-aware scheduling system 102 may select a server from the multiple servers based on the respective operating modes of the cooling components of the multiple servers.
  • the thermal-aware scheduling system 102 may determine or select, from the multiple servers, one or more servers as server candidates for assigning the request or task to be assigned.
  • each cooling component may operate at a plurality of different operating modes, e.g., N different operating modes, where N is an integer greater than or equal to two.
  • the plurality of different operating modes of each cooling component may include operating modes of different power consumption levels (i.e., the respective cooling component consumes different amounts of power when operating at the different operating modes).
  • the one or more servers selected by the thermal-aware scheduling system 102 may include one or more servers having a respective operating mode that consumes the lowest amount of power among the respective operating modes of the cooling components that are estimated (i.e., one or more servers having a respective operating mode corresponding to the lowest power consumption level among the respective operating modes of the cooling components that are estimated).
  • the thermal-aware scheduling system 102 (or the scheduling module 318) may set a server having a lowest inlet temperature as the selected server from among the one or more server candidates. A sketch of this two-step selection follows.
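  • The two-step selection above (lowest estimated power-consumption level first, lowest inlet temperature as the tie-breaker) can be written directly; a sketch with assumed record fields:

        def select_server(servers):
            """servers: dicts with assumed keys 'mode_power_level' (estimated power
            consumption level of the cooling component's mode) and 'inlet_temp_c'."""
            lowest = min(s["mode_power_level"] for s in servers)
            candidates = [s for s in servers if s["mode_power_level"] == lowest]
            return min(candidates, key=lambda s: s["inlet_temp_c"])  # tie-break on inlet temp
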
  • the thermal-aware scheduling system 102 may assign the request or task to be assigned to the selected server.
  • the thermal-aware scheduling system 102 may include multiple job queues separately associated with the multiple servers (i.e., the multiple job queues and the multiple servers have a one-to-one correspondence relationship).
  • the thermal-aware scheduling system 102 may place the request or task to be assigned in a job queue associated with the selected server, and send the task to be assigned to the selected server from the job queue in a first-in-first-out manner.
  • the thermal-aware scheduling system 102 may send the task to be assigned to the selected server, which includes a job queue to receive requests or tasks that are assigned to the selected server. A minimal sketch of per-server FIFO queues follows.
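  • A minimal sketch of per-server FIFO job queues (the Dispatcher class and the send callable are assumptions for illustration):

        from collections import deque

        class Dispatcher:
            """One FIFO job queue per server (one-to-one, as described above)."""
            def __init__(self, servers):
                self.queues = {server: deque() for server in servers}  # server ids as keys

            def enqueue(self, task, server):
                self.queues[server].append(task)  # place the task in the selected server's queue

            def dispatch(self, server, send):
                while self.queues[server]:
                    send(server, self.queues[server].popleft())  # first-in-first-out
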
  • the thermal-aware scheduling system 102 may continue to perform the above operations of blocks 402-410 to perform task scheduling or load balancing for additional requests or tasks that are received, or to wait for a new request or task to perform task scheduling or load balancing.
  • although the data center or cloud computing infrastructure is described as being physically divided into multiple physical partitions (with one physical partition corresponding to one server cluster), in some instances, the data center or cloud computing infrastructure may be logically divided into multiple logical partitions, with each logical partition being associated with one of the plurality of servers 104.
  • a logical partition associated with a server 104 (e.g., a first server 104, for the purpose of distinction and ease of description only) of the plurality of servers 104 may include multiple servers 104 (e.g., second servers 104) located within a predetermined distance range (such as within one meter, two meters, etc.) of the first server 104. All these second servers 104 and the first server 104 may form a server cluster that is specific to the first server 104.
  • Information (such as power and performance status information) of these second servers 104 and the first server may then be used to determine an operating mode estimation model for a cooling component associated with the first server and estimate or predict an operating mode of the cooling component associated with the first server as described in the foregoing description.
  • the thermal-aware scheduling system 102 may have a moving window to collect information (such as power and performance status information) of servers within a server cluster corresponding to each server, for determining a corresponding operating mode estimation model for a cooling component associated with the respective server, and estimating or predicting an operating mode of the cooling component associated with the respective server. A sketch of such a distance-based, per-server cluster follows.
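  • A minimal sketch of the per-server, distance-based cluster described above; the positions mapping and the radius are assumptions for illustration:

        def cluster_for(first_server, all_servers, positions, radius_m: float = 2.0):
            """Return the server cluster specific to first_server: itself plus all servers
            within radius_m meters, using assumed 2-D rack coordinates in positions."""
            fx, fy = positions[first_server]
            return [
                s for s in all_servers
                if ((positions[s][0] - fx) ** 2 + (positions[s][1] - fy) ** 2) ** 0.5 <= radius_m
            ]
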
  • any of the acts of any of the methods described herein may be implemented at least partially by a processor or other electronic device based on instructions stored on one or more computer-readable media.
  • any of the acts of any of the methods described herein may be implemented under control of one or more processors configured with executable instructions that may be stored on one or more computer-readable media.
  • Clause 1 A method implemented by one or more computing devices, the method comprising: receiving a task to be assigned to a server cluster including a plurality of servers; estimating respective operating modes of cooling components of the plurality of servers based at least in part on power and performance status information of the plurality of servers and environment information of the server cluster using corresponding operating mode estimation models; selecting a server from the plurality of servers based on the respective operating modes of the cooling components of the plurality of servers; and assigning the task to be assigned to the selected server.
  • Clause 2 The method of Clause 1, further comprising collecting the power and performance status information of the plurality of servers in the server cluster, and the environment information of the server cluster.
  • Clause 3 The method of Clause 1, wherein the power and performance information of the plurality of servers comprises at least one of: respective power consumption information, respective processor utilization information, and respective inlet temperatures of the plurality of servers.
  • Clause 4 The method of Clause 1, wherein the environment information of the server cluster comprises at least one of: operating conditions of cooling units in a facility of a data center housing the server cluster, and an ambient temperature of an environment of the server cluster.
  • Clause 5 The method of Clause 1, further comprising training the corresponding operating mode estimation models based on a deep learning algorithm.
  • Clause 6 The method of Clause 1, wherein the respective operating modes of the plurality of servers have one-to-one correspondence with the corresponding operating mode estimation models.
  • Clause 7 The method of Clause 6, wherein estimating the respective operating modes of the cooling components of the plurality of servers based at least in part on the power and performance status information of the plurality of servers and the environment information of the server cluster using the corresponding operating mode estimation models comprises: providing the power and performance status information and the environment information of the server cluster to each corresponding operating mode estimation model of the corresponding operating mode estimation models to determine likelihoods of the respective operating modes of the cooling components of the plurality of servers.
  • Clause 8 The method of Clause 1, wherein an operating mode of a cooling component corresponds to one of a plurality of different operating modes having different power consumption levels of the cooling component.
  • Clause 9 The method of Clause 1, wherein selecting the server from the plurality of servers based on the respective operating modes of the cooling components of the plurality of servers comprises: determining one or more servers having a respective operating mode corresponding to a lowest power consumption level from the plurality of servers based on the respective operating modes of the cooling components of the plurality of servers; and setting a server having a lowest inlet temperature as the selected server from among the one or more servers.
  • Clause 10 The method of Clause 1, further comprising: placing the task to be assigned in a job queue associated with the selected server; and sending the task to be assigned to the selected server from the job queue in a first-in-first-out manner.
  • Clause 11 One or more computer readable media storing executable instructions that, when executed by one or more processors, cause the one or more processors to perform acts comprising: receiving a task to be assigned to a server cluster including a plurality of servers; estimating respective operating modes of cooling components of the plurality of servers based at least in part on power and performance status information of the plurality of servers and environment information of the server cluster using corresponding operating mode estimation models; selecting a server from the plurality of servers based on the respective operating modes; and assigning the task to be assigned to the selected server.
  • Clause 12 The one or more computer readable media of Clause 11, the acts further comprising collecting the power and performance status information of the plurality of servers in the server cluster, and the environment information of the server cluster.
  • Clause 13 The one or more computer readable media of Clause 11, wherein the power and performance information of the plurality of servers comprises at least one of: respective power consumption information, respective processor utilization information, and respective inlet temperatures of the plurality of servers, and the environment information of the server cluster comprises at least one of: operating conditions of cooling units in a facility of a data center housing the server cluster, and an ambient temperature of an environment of the server cluster.
  • Clause 14 The one or more computer readable media of Clause 11, the acts further comprising training the corresponding operating mode estimation models based on a deep learning algorithm.
  • Clause 15 The one or more computer readable media of Clause 11, wherein the respective operating modes of the cooling components of the plurality of servers have one-to-one correspondence with the corresponding operating mode estimation models, and wherein estimating the respective operating modes of the cooling components of the plurality of servers based at least in part on the power and performance status information of the plurality of servers and the environment information of the server cluster using the corresponding operating mode estimation models comprises: providing the power and performance status information and the environment information of the server cluster to each corresponding operating mode estimation model of the corresponding operating mode estimation models to determine likelihoods of the respective operating modes of the plurality of servers.
  • Clause 16 The one or more computer readable media of Clause 11, wherein an operating mode of a particular server corresponds to one of a plurality of different operating modes having different power consumption levels of the cooling component.
  • Clause 17 The one or more computer readable media of Clause 11, wherein selecting the server from the plurality of servers based on the respective operating modes of the cooling components of the plurality of servers comprises: determining one or more servers having a respective fan mode corresponding to a lowest power consumption level from the plurality of servers based on the respective operating modes of the cooling components of the plurality of servers; and setting a server having a lowest inlet temperature as the selected server from among the one or more servers.
  • Clause 18 The one or more computer readable media of Clause 11, the acts further comprising: placing the task to be assigned in a job queue associated with the selected server; and sending the task to be assigned to the selected server from the job queue in a first-in-first-out manner.
  • Clause 19 A system comprising: one or more processors; and memory storing executable instructions that, when executed by the one or more processors, cause the one or more processors to perform acts comprising: receiving a task to be assigned to a server cluster including a plurality of servers; estimating respective operating modes of cooling components of the plurality of servers based at least in part on power and performance status information of the plurality of servers and environment information of the server cluster using corresponding operating mode estimation models; selecting a server from the plurality of servers based on the respective operating modes; and assigning the task to be assigned to the selected server.
  • Clause 20 The system of Clause 19, the acts further comprising collecting the power and performance status information of the plurality of servers in the server cluster, and the environment information of the server cluster.
PCT/CN2020/105938 2020-07-30 2020-07-30 Thermal-aware scheduling method and system WO2022021240A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080104888.7A CN116458140A (zh) 2020-07-30 2020-07-30 Thermal-aware scheduling method and system
PCT/CN2020/105938 WO2022021240A1 (en) 2020-07-30 2020-07-30 Thermal-aware scheduling method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/105938 WO2022021240A1 (en) 2020-07-30 2020-07-30 Thermal-aware scheduling method and system

Publications (1)

Publication Number Publication Date
WO2022021240A1 true WO2022021240A1 (en) 2022-02-03

Family

ID=80037415

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/105938 WO2022021240A1 (en) 2020-07-30 2020-07-30 Thermal-aware scheduling method and system

Country Status (2)

Country Link
CN (1) CN116458140A (zh)
WO (1) WO2022021240A1 (zh)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1643476A (zh) * 2002-03-18 2005-07-20 International Business Machines Corporation Method for managing power consumption of multiple computer servers
US20100211810A1 (en) * 2009-02-13 2010-08-19 American Power Conversion Corporation Power supply and data center control
US20100228861A1 (en) * 2009-03-04 2010-09-09 International Business Machines Corporation Environmental and computing cost reduction with improved reliability in workload assignment to distributed computing nodes
US20130191676A1 (en) * 2012-01-24 2013-07-25 Hitachi, Ltd. Operation management method of information processing system
CN103777737A (zh) * 2013-08-15 2014-05-07 Chunghwa Telecom Co., Ltd. Energy-saving method for cloud server rooms based on server resource load and location awareness

Also Published As

Publication number Publication date
CN116458140A (zh) 2023-07-18

Similar Documents

Publication Publication Date Title
Ilager et al. Thermal prediction for efficient energy management of clouds using machine learning
Xia et al. Phone2Cloud: Exploiting computation offloading for energy saving on smartphones in mobile cloud computing
JP6359716B1 (ja) Diagnosing slow tasks in distributed computing
Yi et al. Toward efficient compute-intensive job allocation for green data centers: A deep reinforcement learning approach
US20200104184A1 (en) Accelerated resource allocation techniques
US7958219B2 (en) System and method for the process management of a data center
US8756441B1 (en) Data center energy manager for monitoring power usage in a data storage environment having a power monitor and a monitor module for correlating associative information associated with power consumption
Etemadi et al. A cost-efficient auto-scaling mechanism for IoT applications in fog computing environment: a deep learning-based approach
Mirmohseni et al. Using Markov learning utilization model for resource allocation in cloud of thing network
WO2021042339A1 (zh) 散热控制与模型训练方法、设备、系统及存储介质
CN109831524A (zh) 一种负载均衡处理方法及装置
EP3465966B1 (en) A node of a network and a method of operating the same for resource distribution
Jiang et al. An edge computing platform for intelligent operational monitoring in internet data centers
Khallouli et al. Cluster resource scheduling in cloud computing: literature review and research challenges
WO2021071636A1 (en) Machine learning-based power capping and virtual machine placement in cloud platforms
Nguyen et al. Modeling multi-constrained fog-cloud environment for task scheduling problem
Magotra et al. Adaptive computational solutions to energy efficiency in cloud computing environment using VM consolidation
Kumar et al. Novel Dynamic Scaling Algorithm for Energy Efficient Cloud Computing.
WO2022021240A1 (en) Thermal-aware scheduling method and system
CN111083201B (zh) 一种工业物联网中数据驱动制造服务的节能资源分配方法
US11656981B2 (en) Memory reduction in a system by oversubscribing physical memory shared by compute entities supported by the system
Acun et al. Neural network-based task scheduling with preemptive fan control
Chaudhry et al. Thermal prediction models for virtualized data center servers by using thermal-profiles
Su et al. Node capability aware resource provisioning in a heterogeneous cloud
Son et al. Stochastic distributed data stream partitioning using task locality: design, implementation, and optimization

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 202080104888.7

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20947060

Country of ref document: EP

Kind code of ref document: A1