EP3241089A1 - Method for automatically managing the power consumption of a server cluster - Google Patents
Info
- Publication number
- EP3241089A1 (application EP15822954.2A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- consumption
- nodes
- node
- management method
- automatic management
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3206—Monitoring of events, devices or parameters that trigger a change in power modality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/16—Constructional details or arrangements
- G06F1/18—Packaging or power distribution
- G06F1/189—Power distribution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3234—Power saving characterised by the action undertaken
- G06F1/3287—Power saving characterised by the action undertaken by switching off individual functional units in the computer system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5094—Allocation of resources, e.g. of the central processing unit [CPU] where the allocation takes into account power or heat criteria
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- the invention relates to a method for managing the power consumption of a server cluster.
- Server clusters in the context of the present application will be understood to mean any set of servers managed centrally.
- High-performance computers are also referred to as HPC (high-performance computing) systems.
- The available power must be taken into account so as not to bring down the supply infrastructure and, with it, the computer; the heat dissipation capacity must be taken into account so as not to risk damaging the computer through overheating;
- the associated cost may exceed one million euros per year (current metric based on a computing power of approximately 1 MW/PFLOPS).
- Circuit breakers react very quickly and cut the power of a group of nodes. However, this is a reactive approach: the excess consumption has already begun. In addition, bringing the tripped nodes back online requires a reset, which is often manual.
- the invention aims to remedy all or part of the disadvantages of the state of the art identified above, and in particular to provide means for following a consumption setpoint without exceeding it.
- one aspect of the invention relates to a method for automatically managing the power consumption of a server cluster comprising a plurality of nodes, characterized in that the method comprises the following steps:
- the method / device according to the invention may have one or more additional characteristics among the following, considered individually or according to the technically possible combinations:
- the number of nodes selected is a function of the difference between the predicted consumption and the instantaneous consumption limit; the method is implemented before a resource allocation, the resources to be allocated being used as a parameter of the function predicting the future consumption;
- the method is implemented according to a schedule; the nodes are assigned to jobs, the jobs being classified into at least two categories, the at least one node being selected according to the category of the job that it executes;
- the nodes are pre-classified into at least two groups;
- the at least one node is selected from a predetermined group; to select the at least one node, the entirety of a predetermined group is selected;
- the at least one node is selected from among the nodes having a predetermined status.
- the invention also relates to a digital storage device comprising a file corresponding to instruction codes implementing the method according to a possible combination of the preceding features.
- the invention also relates to a device implementing the method according to a possible combination of the preceding features.
- Figure 1: an illustration of means for implementing the invention;
- FIG. 2: an illustration of steps of the method according to the invention.
- the supervision server comprises:
- storage means 120, for example a hard disk, whether local or remote, single or in an array (for example RAID); a communication interface 130, for example a communication card using the Ethernet protocol.
- Other protocols are conceivable, such as Fibre Channel or InfiniBand.
- The microprocessor 110 of the supervision server, the storage means 120 of the supervision server and the communication interface 130 of the supervision server are interconnected by a bus 150.
- When an action is attributed to a device, it is in fact performed by a microprocessor of the device controlled by instruction codes stored in a memory of the device. If an action is attributed to an application, it is in fact performed by a microprocessor of the device in whose memory the instruction codes corresponding to the application are stored. When a device or an application sends a message, this message is sent via a communication interface of said device or of said application.
- FIG. 1 shows that the storage means 120 of the supervision server 100 comprise:
- a cluster database zone 120.2, or node management database, which includes information about the nodes of the server cluster supervised by the supervision server 100;
- FIG. 1 shows a cluster 200 of servers.
- The server cluster 200 has a number Z of nodes.
- The server cluster 200 is supervised by the supervision server 100.
- Figure 1 shows a power supply block 300 corresponding to an electrical cabinet 300 from which power is distributed to the server cluster 200.
- Figure 1 shows a network 400 interconnecting the supervision server 100, the server cluster 200 and the electrical cabinet 300.
- FIG. 1 shows a calendar server 500, which is interconnected with the supervision server 100 via at least the network 400.
- When polled, the calendar server 500 delivers a power limit, that is, a value representing a maximum consumption. This value can be associated with one or more dates so as to specify the time interval during which the delivered limit is valid.
- The calendar server may be replaced by a zone in the storage means of the supervision server 100.
- Such a zone is, for example, structured as a table associating time intervals with power limits.
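As an illustration only, a table associating time intervals with power limits could be sketched as follows. The names `LimitEntry` and `limit_at`, the half-open interval convention and the wattage values are assumptions made for this example; the application does not specify a data layout.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class LimitEntry:
    start: datetime      # beginning of the validity interval (inclusive, assumed)
    end: datetime        # end of the validity interval (exclusive, assumed)
    limit_watts: float   # maximum consumption allowed during the interval


def limit_at(table: list[LimitEntry], when: datetime) -> Optional[float]:
    """Return the power limit valid at `when`, or None if no interval covers it."""
    for entry in table:
        if entry.start <= when < entry.end:
            return entry.limit_watts
    return None


# Hypothetical schedule: a lower cap during the day than at night.
table = [
    LimitEntry(datetime(2015, 1, 1, 0), datetime(2015, 1, 1, 8), 900_000.0),
    LimitEntry(datetime(2015, 1, 1, 8), datetime(2015, 1, 1, 20), 600_000.0),
]
daytime_limit = limit_at(table, datetime(2015, 1, 1, 12))
```

Passing a date other than the current one to `limit_at` corresponds to acquiring the limit for a specified date rather than for the moment of the query.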
- Figure 2 shows a step 1100 of evaluating the need to adapt the consumption of the server cluster 200. This step can occur in at least two circumstances:
- first case: the supervision server allocates resources for the execution of a new job;
- second case: the evaluation is scheduled, so as to follow changes in a power limit setpoint as closely as possible.
- FIG. 2 shows that step 1100 comprises a sub-step 1110 of measuring an instantaneous consumption of the server cluster 200.
- The supervision server 100 queries the electrical cabinet 300 to find out the power that it is delivering.
- FIG. 2 shows that step 1100 comprises a sub-step 1120 of acquiring an instantaneous consumption limit.
- The supervision server 100 queries the calendar server 500 to find out the current limit, that is, the limit at the date of the query, on the power that the server cluster 200 may consume.
- The limit acquisition mode includes the possibility of specifying a date. The limit obtained then corresponds to the specified date.
- At the end of step 1110 of measuring an instantaneous consumption and step 1120 of acquiring an instantaneous consumption limit, the supervision server 100 proceeds to a sub-step 1130 of predicting future consumption.
- Step 1130 depends on the case that triggered the execution of step 1100 of evaluating the need for a consumption adaptation.
- In the first case, the supervision server 100 is allocating resources for the execution of a new job.
- The supervision server 100 knows the characteristics of this new job, in particular the number of nodes required for its execution.
- The server is therefore able to calculate how much the cluster will consume once the new job is running: the sum of the instantaneous consumption and the estimated consumption of the new job.
- The supervision server 100 thus obtains a predicted consumption corresponding to the first case.
- The first case may be somewhat more complex, taking into account, for example, jobs that are about to end.
- In the second case, the predicted consumption is the measured instantaneous consumption.
- The limit can be acquired for a date slightly in the future.
- This offset into the future may be, for example, half the scheduling period.
- The supervision server 100 has therefore produced a consumption prediction.
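The two prediction cases can be sketched as a single function. This is a minimal sketch under stated assumptions: the function name, its parameters and the estimate of a job's consumption as a fixed additional draw per node are hypothetical, since the text only says that the server knows the number of nodes the job requires.

```python
from typing import Optional


def predict_consumption(
    instantaneous_watts: float,
    new_job_nodes: Optional[int] = None,
    extra_watts_per_node: float = 0.0,
) -> float:
    """Predict the cluster consumption for the two cases of the prediction step.

    First case (a job is about to be allocated): the measured consumption
    plus the estimated consumption of the new job. The per-node estimate
    is an assumption made for this example.
    Second case (scheduled evaluation, new_job_nodes is None): the measured
    consumption itself.
    """
    if new_job_nodes is None:
        return instantaneous_watts
    return instantaneous_watts + new_job_nodes * extra_watts_per_node
```

A refinement accounting for jobs that are about to end would subtract their estimated consumption; it is left out here because the text gives no detail on it.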
- The supervision server 100 proceeds to a sub-step 1140 of comparing the prediction with the acquired limit. If the prediction is below the acquired limit, the method proceeds to step X, the end of the power management. If the prediction is greater than the acquired limit, the method proceeds to a step 1200 of limiting the consumption of the cluster.
- Step 1200 comprises a sub-step 1210 of calculating the number of nodes to be stopped in order not to exceed the acquired limit. This number of nodes is a function of the difference between the prediction and the acquired limit.
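The text states only that the number of nodes is a function of the difference between the prediction and the limit. One possible such function, assuming that every stopped node frees the same amount of power (`watts_per_node`, a hypothetical parameter), is a ceiling division. Treating a prediction equal to the limit as requiring no action is also an assumption.

```python
import math


def nodes_to_stop(predicted_watts: float, limit_watts: float, watts_per_node: float) -> int:
    """Number of nodes to stop so that the prediction no longer exceeds the limit.

    Assumes each stopped node saves `watts_per_node`; the source does not
    give the actual function.
    """
    if watts_per_node <= 0:
        raise ValueError("watts_per_node must be positive")
    excess = predicted_watts - limit_watts
    if excess <= 0:
        return 0  # prediction within the limit: nothing to stop
    return math.ceil(excess / watts_per_node)
```

Rounding up rather than down keeps the cluster at or below the limit, at the cost of stopping at most one node more than strictly needed.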
- A step 1220 selects a number of nodes corresponding to the number calculated in the previous step. There are several strategies for this selection.
- A first strategy consists in choosing a group of nodes from among the groups of nodes described in the zone 120.3 describing groups of nodes.
- The chosen group must fulfill at least two criteria:
- A second strategy is to choose nodes among those described by the node management database as having the status "idle" (at rest or waiting), that is, waiting to be allocated.
- The nodes and their components are never put to sleep, in order to ensure the fastest possible start. This results in a significant idle consumption.
- A third strategy is to choose nodes among those executing jobs that have been identified as non-priority. This third strategy is implemented efficiently by using several job management queues, in particular a queue dedicated to non-priority jobs. The selection of the corresponding nodes is then facilitated.
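The second and third strategies can be illustrated together. In this sketch the `Node` record, the status and queue names, and above all the idea of chaining the two strategies (idle nodes first, then nodes running non-priority jobs) are assumptions made for the example; the text presents the strategies as alternatives and does not prescribe an order. The group-based first strategy is omitted because its selection criteria are not given here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    status: str                  # e.g. "idle" or "busy"; values are illustrative
    queue: Optional[str] = None  # queue of the job running on the node, if any


def select_nodes(nodes: list[Node], count: int, low_priority_queue: str = "low") -> list[Node]:
    """Pick up to `count` nodes to stop: idle nodes first, then non-priority jobs.

    Nodes running jobs from any other queue are never selected.
    """
    idle = [n for n in nodes if n.status == "idle"]
    low = [n for n in nodes if n.status != "idle" and n.queue == low_priority_queue]
    return (idle + low)[:count]
```

Because eligibility is decided by queue membership alone, a queue dedicated to non-priority jobs makes the third strategy a simple filter, which matches the remark that such a queue facilitates the selection.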
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Power Engineering (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- Human Computer Interaction (AREA)
- Power Sources (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR1463444A FR3031200B1 (fr) | 2014-12-30 | 2014-12-30 | Method for automatically managing the power consumption of a server cluster |
PCT/EP2015/081279 WO2016107840A1 (fr) | 2014-12-30 | 2015-12-28 | Method for automatically managing the power consumption of a server cluster |
Publications (1)
Publication Number | Publication Date |
---|---|
EP3241089A1 (fr) | 2017-11-08 |
Family
ID=52684523
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP15822954.2A Ceased EP3241089A1 (fr) | 2014-12-30 | 2015-12-28 | Method for automatically managing the power consumption of a server cluster |
Country Status (4)
Country | Link |
---|---|
US (1) | US20190155359A1 (fr) |
EP (1) | EP3241089A1 (fr) |
FR (1) | FR3031200B1 (fr) |
WO (1) | WO2016107840A1 (fr) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR3076005B1 (fr) * | 2017-12-22 | 2019-12-27 | Bull Sas | Commande de la consommation energetique d'une grappe de serveurs |
EP4195044A1 (fr) * | 2021-12-09 | 2023-06-14 | Bull SAS | Méthode d'optimisation de la consommation énergétique d'une infrastructure informatique par suspension de travaux |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7210048B2 (en) * | 2003-02-14 | 2007-04-24 | Intel Corporation | Enterprise power and thermal management |
CN102016748A (zh) * | 2008-04-21 | 2011-04-13 | 自适应计算企业股份有限公司 | 用于管理计算环境中的能量消耗的系统和方法 |
US8862922B2 (en) * | 2010-01-14 | 2014-10-14 | International Business Machines Corporation | Data center power adjustment |
2014
- 2014-12-30: FR application FR1463444A, patent FR3031200B1, status: Active
2015
- 2015-12-28: US application US15/540,900, publication US20190155359A1, status: Abandoned
- 2015-12-28: WO application PCT/EP2015/081279, publication WO2016107840A1, status: Application Filing
- 2015-12-28: EP application EP15822954.2A, publication EP3241089A1, status: Ceased
Non-Patent Citations (2)
Title |
---|
None * |
See also references of WO2016107840A1 * |
Also Published As
Publication number | Publication date |
---|---|
WO2016107840A1 (fr) | 2016-07-07 |
FR3031200B1 (fr) | 2017-12-29 |
US20190155359A1 (en) | 2019-05-23 |
FR3031200A1 (fr) | 2016-07-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108632365B (zh) | 服务资源调整方法、相关装置和设备 | |
CA2852367C (fr) | Procede, programme d'ordinateur et dispositif d'allocation de ressources informatiques d'un cluster pour l'execution d'un travail soumis audit cluster | |
Calheiros et al. | Virtual machine provisioning based on analytical performance and QoS in cloud computing environments | |
US9104498B2 (en) | Maximizing server utilization within a datacenter | |
EP2894872B1 (fr) | Procédé d'ordonnancement de tâches dans un réseau à courants porteurs en ligne | |
US20180254998A1 (en) | Resource allocation in a cloud environment | |
Prachitmutita et al. | Auto-scaling microservices on IaaS under SLA with cost-effective framework | |
FR2906907A1 (fr) | Procedes et dispostif de gestion de l'energie dans un systeme de traitement d'informations | |
US20170034031A1 (en) | Automatic determination of optimal time window for migration, backup or other processes | |
WO2016107840A1 (fr) | Procédé de gestion automatique de la consommation électrique d'une grappe de serveurs | |
US10171572B2 (en) | Server pool management | |
US8745125B2 (en) | Routing traffic after power failure | |
Wang et al. | Trust: Real-time request updating with elastic resource provisioning in clouds | |
WO2016198762A1 (fr) | Procédé et système de détermination d'une configuration de serveurs cible pour un déploiement d'une application logicielle | |
EP3502809B1 (fr) | Procédé de pilotage de ballons d'eau chaude sanitaire | |
FR3045972B1 (fr) | Brassage dynamique de l'alimentation electrique | |
US9052904B1 (en) | System and method for determining whether to reschedule malware scans based on power-availability information for a power grid and power-usage information for the scans | |
EP3051416A1 (fr) | Procédé de commande de déploiement d'un programme a exécuter dans un parc de machines | |
FR3067832A1 (fr) | Fourniture de services inter-groupements | |
Canon et al. | Évaluation de la consommation d’énergie nécessaire à l’exécution d’un workload dans un datacenter vert | |
Feinberg et al. | Optimizing cloud utilization via switching decisions | |
Bhattacharjee et al. | Enhancing reliability of cloud system through proactive identification of under performing components | |
Bayati | Data centers energy optimization | |
CN118175032A (zh) | 一种云服务创建方法、装置、电子设备及存储介质 | |
EP4148569A1 (fr) | Procédé d'ordonnancement d'un ensemble de taches de calcul dans un supercalculateur |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PUAI | Public reference made under Article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
| 17P | Request for examination filed | Effective date: 20170727 |
| AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
| AX | Request for extension of the European patent | Extension state: BA ME |
| DAV | Request for validation of the European patent (deleted) | |
| DAX | Request for extension of the European patent (deleted) | |
| 17Q | First examination report despatched | Effective date: 20190228 |
| STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: EXAMINATION IS IN PROGRESS |
| REG | Reference to a national code | Ref country code: DE; Ref legal event code: R003 |
| STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE APPLICATION HAS BEEN REFUSED |
| 18R | Application refused | Effective date: 20201130 |