US20170168540A1 - Reducing power consumption of a compute node experiencing a bottleneck - Google Patents
- Publication number
- US20170168540A1 (application Ser. No. 14/964,007)
- Authority
- US
- United States
- Prior art keywords
- component
- compute node
- workload
- utilization
- during
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3234—Power saving characterised by the action undertaken
- G06F1/324—Power saving characterised by the action undertaken by lowering clock frequency
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/16—Constructional details or arrangements
- G06F1/20—Cooling means
- G06F1/206—Cooling means comprising thermal management
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0876—Network utilisation, e.g. volume of load or congestion level
- H04L43/0882—Utilisation of link capacity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3206—Monitoring of events, devices or parameters that trigger a change in power modality
- G06F1/3209—Monitoring remote activity, e.g. over telephone lines or network connections
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3206—Monitoring of events, devices or parameters that trigger a change in power modality
- G06F1/3215—Monitoring of peripheral devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3206—Monitoring of events, devices or parameters that trigger a change in power modality
- G06F1/3228—Monitoring task completion, e.g. by use of idle timers, stop commands or wait commands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
- G06F9/4893—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues taking into account power or heat criteria
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5077—Logical partitioning of resources; Management or configuration of virtualized resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0805—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
- H04L43/0817—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/16—Threshold monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/5019—Workload prediction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/5022—Workload threshold
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/504—Resource capping
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/508—Monitor
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
Description
- Field of the Invention
- The present invention relates to power management within a compute node.
- Background of the Related Art
- Datacenters include large numbers of compute nodes in order to have the capacity to run large workloads or a large number of workloads, such as application programs. In order for the compute nodes and other supporting equipment to operate, the datacenter must provide a sufficient amount of electrical power and distribute that power to each of the compute nodes and other equipment. Some of the power consumed by the compute nodes produces heat, which requires a cooling system to prevent high temperatures from damaging various components of the compute nodes. The amount of electrical power consumed in order to operate the compute nodes may represent the greatest expense of owning and operating the datacenter.
- Reducing power consumption and the associated expense is a high priority for a modern datacenter. Efforts to reduce datacenter power consumption may be directed at the cooling system design, network operations, energy efficient components, and the like. Some management applications may impose caps on a compute node or an individual component in order to force a compute node to consume no more than a given threshold of electrical power. However, such caps are likely to reduce the overall performance of a workload on the relevant compute node.
- One embodiment of the present invention provides a method comprising obtaining component utilization data for multiple components of a compute node during at least one previous execution of a workload. The method further comprises using the component utilization data to identify a first component having a utilization level that is less than a threshold utilization level during the at least one previous execution of the workload, wherein the first component is one of the multiple components of the compute node. The method still further comprises, during a subsequent execution of the workload on the compute node, throttling the first component to prevent the first component from exceeding the threshold utilization level.
- Another embodiment of the present invention provides a method comprising obtaining component utilization data for multiple components of a compute node during execution of each of a plurality of workloads. The component utilization data is used to identify a first component having a utilization level that is less than a threshold utilization level during the execution of the plurality of workloads, wherein the first component is one of the multiple components of the compute node, and wherein the plurality of workloads include first and second workloads. During a subsequent execution of the first and second workloads on the compute node, the first component may be throttled to prevent the first component from exceeding the threshold utilization level.
- Yet another embodiment of the present invention provides a computer program product comprising a computer readable storage medium having non-transitory program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform a method. The method may comprise obtaining component utilization data for multiple components of a compute node during at least one previous execution of a workload; using the component utilization data to identify a first component having a utilization level that is less than a threshold utilization level during the at least one previous execution of the workload, wherein the first component is one of the multiple components of the compute node; and during a subsequent execution of the workload on the compute node, throttling the first component to prevent the first component from exceeding the threshold utilization level.
FIG. 1 is a diagram of a compute node.
FIG. 2A is a diagram of a fully connected, peer-to-peer cluster.
FIG. 2B is a diagram of a cluster with a central management node.
FIG. 3 is a graph of utilization for each of four components of a particular node while running a particular workload.
FIG. 4 is a table representing a component utilization history for a particular compute node.
FIG. 5A is a (partial) component utilization history for a first compute node.
FIG. 5B is a (partial) component utilization history for a second compute node.
FIG. 5C is a node configuration summary including entries for both the first and second compute nodes.
FIG. 6 is a flowchart of a method according to one embodiment of the present invention.
- One embodiment of the present invention provides a method comprising obtaining component utilization data for multiple components of a compute node during at least one previous execution of a workload. The method further comprises using the component utilization data to identify a first component having a utilization level that is less than a threshold utilization level during the at least one previous execution of the workload, wherein the first component is one of the multiple components of the compute node. The method still further comprises, during a subsequent execution of the workload on the compute node, throttling the first component to prevent the first component from exceeding the threshold utilization level.
- The component utilization data may be obtained or collected in various manners. In one example, the component utilization data may be collected during at least one previous execution of the workload on the same compute node where the subsequent execution of the workload is being performed. In an alternative example, the component utilization data may be collected during previous executions of the workload on another compute node having components that are equivalent to the components of the compute node where the subsequent execution of the workload is being performed. In yet another example, the component utilization data may be manually input, such as based upon an administrator's experience. In a still further example, the component utilization data may be collected during at least one previous execution of the workload on a second compute node having components that are not equivalent to the components of the compute node where the subsequent execution of the workload is being performed.
- However, where the data is collected from a second compute node having components that are not equivalent to the components of the compute node, the method may further include steps to convert the component utilization data for use by the compute node. In a specific example, the method further comprises, for each of the multiple components of the compute node, determining a ratio of a performance capacity of a component on the compute node to a performance capacity of a similar type of component on the second compute node, and using the ratio and the component utilization data collected during at least one previous execution of the workload on the second compute node to determine expected component utilization levels of the multiple components of the compute node during subsequent execution of the workload on the compute node. For example, if a data storage device on a first compute node is a hard disk drive having a performance capacity of 7200 RPM with 2 MB cache, and a second compute node has a disk drive having a performance capacity of 5400 RPM with 512 KB cache, then the same workload will cause a higher utilization of the hard disk of the second compute node than of the hard disk of the first compute node. While the effect of both speed and cache size may be considered, a simple ratio based solely on speed would be 7200/5400 ≈ 1.33, such that a workload causing a utilization of 50% on the first compute node may be estimated to cause a utilization of 67% on the second compute node. As a second example, a given workload that causes a 4-core processor to be 40% utilized on a first compute node may result in an expected utilization of about 30% on a second compute node having an 8-core processor, all other variables being the same.
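To illustrate the capacity-ratio conversion described above, here is a minimal sketch (not part of the patent itself; the function name and arguments are hypothetical):

```python
def expected_utilization(source_util, source_capacity, target_capacity):
    """Scale a utilization percentage observed on a source node to an
    expected utilization on a target node, using a simple ratio of the
    two components' performance capacities (e.g. disk RPM, core count)."""
    return source_util * (source_capacity / target_capacity)

# Disk example from the text: 50% utilization on a 7200 RPM drive maps to
# roughly 67% on a 5400 RPM drive, since 7200/5400 ≈ 1.33.
print(round(expected_utilization(50, 7200, 5400)))  # → 67
```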
- Optionally, the component utilization data is used to determine an expected utilization level for a first component of the compute node during a subsequent execution of the workload by identifying the highest utilization level for the first component during any instance of executing the workload that is identified in the component utilization data.
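That "highest observed level" rule can be sketched as follows (the data structures are hypothetical):

```python
def expected_level(history, component):
    """Expected utilization for a component during the next run of a
    workload: the highest level recorded for it in any prior instance
    of executing that workload."""
    return max(run[component] for run in history)

# Two hypothetical recorded instances of the same workload.
history = [
    {"processor": 100, "memory": 50, "disk": 15, "network": 30},
    {"processor": 100, "memory": 45, "disk": 20, "network": 25},
]
print(expected_level(history, "memory"))  # → 50
```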
- The performance capacity of each of the multiple components of each node may, for example, be obtained from each node's vital product data, found in a lookup table given a node's model number, or determined through testing. In one embodiment, a performance capacity is obtained for each of the components of the compute node(s), and the component utilization data is stated as a percentage of the performance capacity for each of the components.
- While a compute node may include any number of components representing a variety of component types, each compute node will typically include multiple components that perform certain essential functions of the compute node and are substantially determinative of the compute node's performance capacity. In one example, these multiple components include a processor, memory, a data storage device, and a network interface. Depending upon the nature of a given workload, one of the multiple components of the compute node may limit the performance of the workload on the compute node. Accordingly, the other components may have varying degrees of unused or “stranded” capacity during at least some period during the execution of the workload.
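A sketch of finding the limiting component and the stranded capacity of the others, assuming utilization is expressed as a percentage of each component's performance capacity (the component names are illustrative):

```python
def limiting_component(utilization):
    """Return the component with the highest utilization (the bottleneck)
    and the unused, 'stranded' capacity of every other component."""
    limiter = max(utilization, key=utilization.get)
    stranded = {c: 100 - u for c, u in utilization.items() if c != limiter}
    return limiter, stranded

# Utilization figures matching the FIG. 3 example.
limiter, stranded = limiting_component(
    {"processor": 100, "memory": 50, "disk": 15, "network": 30})
print(limiter)   # → processor
print(stranded)  # → {'memory': 50, 'disk': 85, 'network': 70}
```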
- In another embodiment, management software may impose a utilization cap on the performance of one or more of the components of the compute node. Such a utilization cap may be a power cap imposed for the purpose of preventing a compute node or an entire network of compute nodes from exceeding a power budget or a thermal limit. A utilization cap may prevent full use of a component's capacity. For example, a utilization cap may limit a processor to a certain percentage of its maximum speed (i.e., instructions per second) or limit memory to a certain percentage of its bus speed or available memory capacity. The present embodiment recognizes that a utilization cap may affect which component will limit performance of a workload on the compute node, and may also affect the extent to which that component limits performance. In other words, with all other factors being unchanged, if the processor was already the limiting component and a utilization cap then reduces the processor speed by some percentage, then the unused capacity of the other components will increase. In another situation, a utilization cap may cause a component to limit performance of a workload even though the compute node did not previously have a component limiting performance of the workload. In one specific embodiment, the step of using the component utilization data to identify a first component having a utilization level that is less than a threshold utilization level during the at least one previous execution of the workload may include determining, for each of the multiple components of the compute node, whether the component has a utilization cap that is greater than the utilization for that component stated in the component utilization data in association with execution of the workload.
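The cap-versus-history check in the last sentence above might be sketched like this (function and key names are hypothetical):

```python
def cap_leaves_headroom(utilization, caps):
    """For each component, True if its utilization cap is greater than the
    utilization recorded for the workload, i.e. the cap would not have
    constrained the previous execution of the workload."""
    return {c: caps[c] > utilization[c] for c in utilization}

print(cap_leaves_headroom({"processor": 100, "memory": 50},
                          {"processor": 90, "memory": 60}))
# → {'processor': False, 'memory': True}
```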
- In a further embodiment, the method may further comprise using the component utilization data to determine an expected utilization level for the first component of the compute node during the subsequent execution of the workload, wherein throttling the first component to prevent the first component from exceeding the threshold utilization level includes throttling the first component only to an extent that still allows the first component to perform at the expected utilization level. In other words, the first component is throttled (i.e., its performance is limited and power consumption is reduced) to a limited extent such that the first component still does not become the limiting component. Furthermore, where the expected utilization level for the first component varies during the execution of the workload, the extent of throttling the first component may be adjusted during performance of the workload. In one option, the extent of throttling the first component during execution of the workload is adjusted to maximize the extent of throttling without preventing the first component from reaching the expected utilization level. In an alternative option, the extent of throttling the first component during execution of the workload is adjusted to place the first component into the lowest operating state that does not prevent the first component from reaching the expected utilization level.
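Where the expected utilization varies over time, the adjustment described above might look like this sketch (the throttle levels and intervals are hypothetical; real hardware exposes discrete operating states):

```python
def throttle_schedule(expected_over_time, levels):
    """For each interval, pick the lowest available throttle level
    (a utilization ceiling) that does not fall below the expected
    utilization, maximizing throttling without creating a new bottleneck."""
    return [min(l for l in levels if l >= e) for e in expected_over_time]

# Expected memory utilization over three intervals, with the discrete
# throttle levels the hardware supports.
print(throttle_schedule([50, 30, 70], [25, 60, 80, 100]))  # → [60, 60, 80]
```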
- Depending upon the performance capacities of the components of the compute node and the nature of the workload being executed on the compute node, any one of the multiple components may be the limiting component and any of the other components may be the first component (i.e., a non-limiting component) that is to be throttled. For example, wherein the first component is a processor, throttling the processor may include setting a performance state for the processor. Where the first component is a memory module, throttling the memory module may include reducing the clock frequency of the memory module and/or powering down the memory module or another one of multiple memory modules. Where the first component is a controller for a redundant array of independent disks, throttling the controller may include reducing the clock frequency of cache memory available to the controller. Other ways of throttling a component may also be implemented.
- Another embodiment of the method further comprises obtaining a performance capacity for each of the multiple components of a plurality of compute nodes, identifying a target compute node, from among the plurality of compute nodes, having multiple components with sufficient performance capacity to perform the workload without exceeding a selected power policy, and assigning the workload to the target compute node. Accordingly, where a power policy may be enforced on a compute node, a workload may be assigned to one of the compute nodes based on a determination that the workload may be executed on the target compute node without violating the power policy.
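One hedged sketch of the target-node selection described above (the node names and headroom figures are hypothetical):

```python
def pick_target_node(nodes, required):
    """Return the first node whose every component has enough capacity
    headroom, under its power policy, to run the workload; None if no
    node qualifies."""
    for name, headroom in nodes.items():
        if all(headroom.get(c, 0) >= need for c, need in required.items()):
            return name
    return None

nodes = {
    "node1": {"processor": 20, "memory": 40},
    "node2": {"processor": 60, "memory": 50},
}
print(pick_target_node(nodes, {"processor": 50, "memory": 30}))  # → node2
```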
- A further embodiment of the present invention recognizes that a compute node may execute more than one workload at a time, and that the combined execution of multiple workloads will determine the limiting component and the extent to which other components may be throttled. Accordingly, one example of the method may comprise obtaining component utilization data for multiple components of a compute node during execution of each of a plurality of workloads. The component utilization data is used to identify a first component having a utilization level that is less than a threshold utilization level during the execution of the plurality of workloads, wherein the first component is one of the multiple components of the compute node, and wherein the plurality of workloads include first and second workloads. During a subsequent execution of the first and second workloads on the compute node, the first component may be throttled to prevent the first component from exceeding the threshold utilization level.
- Yet another embodiment of the present invention provides a computer program product comprising a computer readable storage medium having non-transitory program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform a method. The method may comprise obtaining component utilization data for multiple components of a compute node during at least one previous execution of a workload; using the component utilization data to identify a first component having a utilization level that is less than a threshold utilization level during the at least one previous execution of the workload, wherein the first component is one of the multiple components of the compute node; and during a subsequent execution of the workload on the compute node, throttling the first component to prevent the first component from exceeding the threshold utilization level.
- The foregoing computer program products may further include program instructions for implementing or initiating any one or more aspects of the methods described herein. Accordingly, a separate description of the methods will not be duplicated in the context of a computer program product.
FIG. 1 is a diagram of a compute node 10. The compute node 10 includes a processor 12, memory 14, a data storage device 16, a network interface 18, and a system bus 19 connecting the processor 12, memory 14, data storage device 16, and network interface 18. The illustrated compute node 10 may include other types of components and additional numbers of the components shown, and the illustrated compute node is just one non-limiting example of a compute node that may be used in accordance with one or more embodiments of the present invention.
FIG. 2A is a diagram of a fully connected, peer-to-peer cluster 20. In a peer-to-peer network, the compute nodes 10 (illustrated here as four nodes, Node 1 to Node 4) make resources available to other compute nodes without a central management node. Accordingly, processing power, data storage or network bandwidth may be shared among the compute nodes. The illustrated cluster is “fully connected” since each node has a direct connection to every other node in the cluster. However, the present invention is not limited to a fully connected, peer-to-peer cluster. Other network topologies may include, without limitation, a mesh, star, ring, line, or branched structure.
FIG. 2B is a diagram of a cluster 30 with a central management node 32. The cluster 30 includes four compute nodes 10 in communication with the central management node 32 through a switch 34. The central management node 32 may be responsible for coordinating any sharing of resources among the compute nodes 10.
FIG. 3 is a graph of utilization for each of four components of a particular compute node while running a particular workload. The graph includes a vertical bar for each of a processor, memory, data storage device and network interface, where the height of the vertical bar illustrates the utilization percentage. For example, the utilization percentage may be a percentage of a component's performance capacity as stated in vital product data.
- In the illustrated graph, a compute node (Compute Node 1) is executing an identified workload (Workload ABC), such that the processor is running at 100% utilization, the memory is running at 50% utilization, the data storage device is running at 15% utilization, and the network interface is running at 30% utilization. This component utilization data shows that the processor is the limiting component when
Compute Node 1 executes Workload ABC, at least at the point in time represented by the graph. Accordingly, the memory, data storage device and network interface have unused capacity, and steps taken to throttle one or more of these devices may result in an overall reduction in power consumption for the compute node. For example, memory running at 50% utilization is running below a 60% utilization threshold. If the memory has the capability of being throttled at 60%, then this level of throttling will not prevent the memory from reaching its expected utilization of 50% while executing the workload. The utilization thresholds available for a particular component may be limited by the throttling capabilities of the component. If a first component has multiple utilization threshold levels (throttling levels), then the first component is preferably throttled at the lowest utilization threshold level that will not prevent the first component from performing at its expected utilization level for executing the workload.
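The memory example above (expected utilization 50%, throttled at the 60% level) follows the rule of choosing the lowest supported threshold that does not block the expected level; a minimal sketch, with hypothetical threshold values:

```python
def lowest_safe_threshold(expected, available):
    """From a component's supported throttle thresholds, pick the lowest
    one that still lets the component reach its expected utilization;
    None if no supported threshold is high enough."""
    safe = [t for t in sorted(available) if t >= expected]
    return safe[0] if safe else None

# Memory expected at 50%; hardware supports throttling at these levels.
print(lowest_safe_threshold(50, [40, 60, 80, 100]))  # → 60
```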
FIG. 4 is a table representing a component utilization history 40 for a particular compute node (Compute Node 1). Whereas FIG. 3 provides component utilization for Compute Node 1 when executing Workload ABC, FIG. 4 provides a history of component utilization data as Compute Node 1 executes various workloads. Each row of the table identifies one instance of a workload that was executed on Compute Node 1. The first row represents the utilization data from FIG. 3. The component utilization data resulting from execution of another workload (Workload DEF) on Compute Node 1 is shown in the second row. Component utilization data for a second instance of executing Workload ABC is shown in the third row. In all, the component utilization history 40 includes four instances of executing Workload ABC.
FIG. 5A is a (partial) component utilization history 50 for a first compute node (Compute Node 1). The history is “partial” because it only includes the four instances of executing Workload ABC. This partial component utilization history highlights that the processor is consistently the limiting component and that there is variation in the utilization of the other components.
FIG. 5B is a (partial) component utilization history 60 for a second compute node (Compute Node 2) including three instances of executing the Workload ABC. While the processor in the second compute node is still the limiting component due to the nature of the workload, the memory utilization is lower than that for the first compute node, while the data storage utilization and the network interface utilization are greater than that for the first compute node.
FIG. 5C is a node configuration summary 70 including entries for both the first and second compute nodes (Compute Node 1 and Compute Node 2). The entry on the first row identifies the performance capacity of the processor, memory, data storage device and network interface for the first compute node (Compute Node 1), and the entry on the second row identifies the performance capacity of the processor, memory, data storage device and network interface for the second compute node (Compute Node 2). While Compute Node 2 has 20% greater processor capacity and 100% greater memory capacity, Compute Node 1 has 25% greater data storage speed. These differences in component configurations between the two compute nodes explain why, for execution of Workload ABC, the Compute Node 2 memory utilization is lower but the Compute Node 2 data storage utilization and network interface utilization are both greater (see FIG. 5B compared to FIG. 5A). In accordance with various embodiments of the present invention, it is possible to use the component utilization history collected from one compute node, together with the relative performance capacities of the components in both compute nodes, to estimate an expected component utilization for another compute node that is executing the same workload.
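The cross-node estimate described above follows from a simple proportionality: a workload places roughly the same absolute demand on a component regardless of which node runs it, so utilization scales inversely with component capacity. The sketch below is a hedged illustration of that arithmetic; the capacity figures are assumed, not taken from the patent.

```python
# Hedged sketch of the cross-node utilization estimate: the same workload
# demand on a larger-capacity component yields proportionally lower
# utilization. Capacity units and values here are illustrative assumptions.

def estimate_utilization(observed_util, capacity_observed, capacity_target):
    """Estimate utilization on a target node from utilization observed on
    another node: demand = util * capacity, so the estimate on the target
    is that same demand divided by the target component's capacity."""
    demand = observed_util * capacity_observed
    return demand / capacity_target

# Compute Node 2 has 100% greater (2x) memory capacity than Compute Node 1,
# so 50% memory utilization observed on Node 1 maps to 25% on Node 2.
print(estimate_utilization(0.50, 16, 32))  # 0.25
```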
FIG. 6 is a flowchart of a method 80 according to one embodiment of the present invention. In step 82, the method obtains component utilization data for multiple components of a compute node during at least one previous execution of a workload. In step 84, the method uses the component utilization data to identify a first component having a utilization level that is less than a threshold utilization level during the at least one previous execution of the workload, wherein the first component is one of the multiple components of the compute node. In step 86, during a subsequent execution of the workload on the compute node, the method throttles the first component to prevent the first component from exceeding the threshold utilization level.
- As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a "circuit," "module" or "system." Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
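The three steps of method 80 (steps 82-86) can be sketched as a short control loop. The node interface below (`get_utilization_history`, `throttle_at`) is a hypothetical abstraction; the patent does not specify an API, and the threshold and utilization figures are illustrative.

```python
# Minimal sketch of method 80. The Node class is a hypothetical stand-in
# for a managed compute node; the numbers are fabricated for illustration.

class Node:
    def __init__(self, history):
        self._history = history  # workload -> {component: [utilizations]}
        self.caps = {}           # component -> applied threshold

    def get_utilization_history(self, workload):
        return self._history[workload]

    def throttle_at(self, component, threshold):
        self.caps[component] = threshold

def run_with_throttling(node, workload, threshold=0.60):
    # Step 82: obtain component utilization data from prior executions.
    history = node.get_utilization_history(workload)
    # Step 84: identify components whose past utilization stayed below
    # the threshold; these have headroom and can be safely capped.
    for component, past in history.items():
        if max(past) < threshold:
            # Step 86: throttle so the component cannot exceed the
            # threshold during the subsequent execution.
            node.throttle_at(component, threshold)

node = Node({"ABC": {"processor": [0.95, 0.97], "memory": [0.50, 0.48]}})
run_with_throttling(node, "ABC")
print(node.caps)  # {'memory': 0.6}
```

Only the memory is capped here: its observed utilization never exceeded 60%, while the processor (the bottleneck component) is left unthrottled.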
- Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
- A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
- Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing. Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
- Aspects of the present invention may be described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, and/or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
- These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
- The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
- The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
- The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, components and/or groups, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The terms “preferably,” “preferred,” “prefer,” “optionally,” “may,” and similar terms are used to indicate that an item, condition or step being referred to is an optional (not required) feature of the invention.
- The corresponding structures, materials, acts, and equivalents of all means or steps plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but it is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiment was chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/964,007 US10819607B2 (en) | 2015-12-09 | 2015-12-09 | Reducing power consumption of a compute node experiencing a bottleneck |
Publications (2)
Publication Number | Publication Date |
---|---|
US20170168540A1 true US20170168540A1 (en) | 2017-06-15 |
US10819607B2 US10819607B2 (en) | 2020-10-27 |
Family
ID=59018696
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/964,007 Active 2037-09-17 US10819607B2 (en) | 2015-12-09 | 2015-12-09 | Reducing power consumption of a compute node experiencing a bottleneck |
Country Status (1)
Country | Link |
---|---|
US (1) | US10819607B2 (en) |
Citations (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030229821A1 (en) * | 2002-05-15 | 2003-12-11 | Kenneth Ma | Method and apparatus for adaptive power management of memory |
US6895520B1 (en) * | 2001-03-02 | 2005-05-17 | Advanced Micro Devices, Inc. | Performance and power optimization via block oriented performance measurement and control |
US7069463B2 (en) * | 2000-12-21 | 2006-06-27 | Lg Electronics Inc. | Bus clock controlling apparatus and method |
US7272517B1 (en) * | 2006-04-25 | 2007-09-18 | International Business Machines Corporation | Method and system for providing performance estimations for a specified power budget |
US20070245161A1 (en) * | 2006-04-15 | 2007-10-18 | Shaw Mark E | Power management system and method |
US20070260897A1 (en) * | 2006-05-05 | 2007-11-08 | Dell Products L.P. | Power allocation management in an information handling system |
US20080086734A1 (en) * | 2006-10-10 | 2008-04-10 | Craig Jensen | Resource-based scheduler |
US20080228959A1 (en) * | 2007-03-16 | 2008-09-18 | Dot Hill Systems Corporation | Method and apparatus for operating storage controller system in elevated temperature environment |
US20090006876A1 (en) * | 2007-06-26 | 2009-01-01 | Fukatani Takayuki | Storage system comprising function for reducing power consumption |
US7519843B1 (en) * | 2008-05-30 | 2009-04-14 | International Business Machines Corporation | Method and system for dynamic processor speed control to always maximize processor performance based on processing load and available power |
US7529949B1 (en) * | 2005-10-26 | 2009-05-05 | Hewlett-Packard Development Company, L.P. | Heterogeneous power supply management system |
US20090125737A1 (en) * | 2007-11-08 | 2009-05-14 | International Business Machines Corporation | Power Management of an Electronic System |
US20090217066A1 (en) * | 2008-02-26 | 2009-08-27 | Anbazhagan Mani | Controlling connection status of network adapters |
US20100083010A1 (en) * | 2008-10-01 | 2010-04-01 | International Business Machines Corporation | Power Management For Clusters Of Computers |
US20100250642A1 (en) * | 2009-03-31 | 2010-09-30 | International Business Machines Corporation | Adaptive Computing Using Probabilistic Measurements |
US20110099320A1 (en) * | 2009-10-23 | 2011-04-28 | International Business Machines Corporation | Solid State Drive with Adjustable Drive Life and Capacity |
US20120041749A1 (en) * | 2010-08-12 | 2012-02-16 | International Business Machines Corporation | Determining Simulation Fidelity in a Self-Optimized Simulation of a Complex System |
US20120060168A1 (en) * | 2010-09-06 | 2012-03-08 | Samsung Electronics Co. Ltd. | Virtualization system and resource allocation method thereof |
US20130080809A1 (en) * | 2011-09-28 | 2013-03-28 | Inventec Corporation | Server system and power managing method thereof |
US20130080641A1 (en) * | 2011-09-26 | 2013-03-28 | Knoa Software, Inc. | Method, system and program product for allocation and/or prioritization of electronic resources |
US20130262953A1 (en) * | 2012-03-28 | 2013-10-03 | Xinmin Deng | Methods, systems, and computer readable media for dynamically controlling a turbo decoding process in a long term evolution (lte) multi-user equipment (ue) traffic simulator |
US20140149296A1 (en) * | 2012-11-29 | 2014-05-29 | Applied Materials, Inc. | Enhanced preventative maintenance utilizing direct part marking |
US20140152278A1 (en) * | 2011-08-26 | 2014-06-05 | The Trustees Of Columbia University In The City Of New York | Systems and methods for switched-inductor integrated voltage regulators |
US20140325524A1 (en) * | 2013-04-25 | 2014-10-30 | Hewlett-Packard Development Company, L.P. | Multilevel load balancing |
US20150229295A1 (en) * | 2014-02-12 | 2015-08-13 | International Business Machines Corporation | Three-d power converter in three distinct strata |
US20150286261A1 (en) * | 2014-04-04 | 2015-10-08 | International Business Machines Corporation | Delaying execution in a processor to increase power savings |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: LENOVO ENTERPRISE SOLUTIONS (SINGAPORE) PTE. LTD., Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ANGALURI, SRIHARI V.;CUDAK, GARY D.;DHOLAKIA, AJAY;AND OTHERS;SIGNING DATES FROM 20151201 TO 20151208;REEL/FRAME:037251/0312 Owner name: LENOVO ENTERPRISE SOLUTIONS (SINGAPORE) PTE. LTD., SINGAPORE Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ANGALURI, SRIHARI V.;CUDAK, GARY D.;DHOLAKIA, AJAY;AND OTHERS;SIGNING DATES FROM 20151201 TO 20151208;REEL/FRAME:037251/0312 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
STCV | Information on status: appeal procedure |
Free format text: NOTICE OF APPEAL FILED |
|
STCV | Information on status: appeal procedure |
Free format text: APPEAL BRIEF (OR SUPPLEMENTAL BRIEF) ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
AS | Assignment |
Owner name: LENOVO GLOBAL TECHNOLOGIES INTERNATIONAL LTD, HONG KONG Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:LENOVO ENTERPRISE SOLUTIONS (SINGAPORE) PTE LTD;REEL/FRAME:060649/0865 Effective date: 20210223 |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 4 |