WO2025052509A1 - Estimation device and estimation program - Google Patents

Estimation device and estimation program

Info

Publication number
WO2025052509A1
Authority
WO
WIPO (PCT)
Prior art keywords
power consumption
server
workload
estimation
model
Prior art date
Application number
PCT/JP2023/032201
Other languages
French (fr)
Japanese (ja)
Inventor
玲 佐藤
亮介 佐藤
義和 中村
智也 藤本
雄二 篠崎
慎平 広中
Original Assignee
日本電信電話株式会社
Priority date
Filing date
Publication date
Application filed by 日本電信電話株式会社 (Nippon Telegraph and Telephone Corporation)
Priority to PCT/JP2023/032201
Publication of WO2025052509A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning


Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Debugging And Monitoring (AREA)

Abstract

An estimation device 10 comprises a workload power consumption estimation unit 103 that calculates an estimated value of the power consumption when a prescribed workload operates on a destination server by linking a server power consumption estimation model, which learns at least the relationship between the resource usage state of a server and the power consumption of the server to estimate power consumption on the server, with a workload additional learning model, which learns information on the power consumption of workloads on the server.

Description

Estimation device and estimation program

This disclosure relates to an estimation device and an estimation program.

It is desired to improve the renewable-energy utilization rate of each data center by moving workloads such as VMs (Virtual Machines) running on servers in one data center to another data center.

To achieve this, it is necessary to estimate power consumption on a per-workload basis. However, unlike the power consumption of the server itself, the power consumption of a workload cannot be directly measured using a PDU (Power Distribution Unit) or the like.

Non-Patent Document 1 assumes that the power consumption of a server is proportional to its CPU usage rate, prepares, for a plurality of servers, relational expressions between each CPU usage rate and the power consumption per 1% of CPU usage for the host OS and each VM, and estimates the power consumption of the VMs on the source server by calculating power consumption from those expressions. Based on that estimation result, the power consumption of the VMs on the destination server is then estimated from the power-saving ratio between the source and destination servers.
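
To make the prior-art approach concrete, the following is a minimal sketch of one reading of the method of Non-Patent Document 1 described above; the function names, coefficient values, and the exact form of the per-server relational expressions are assumptions for illustration, not taken from that document.

```python
# Rough sketch of the prior-art idea (assumed reading): VM power is taken to be
# linear in CPU usage on the source server, then rescaled by the power-saving
# ratio between the source and destination servers.
def vm_power_on_source(vm_cpu_percent, watts_per_cpu_percent):
    # Per-server relational expression: power proportional to CPU usage.
    return vm_cpu_percent * watts_per_cpu_percent

def vm_power_on_destination(vm_power_source_w, power_saving_ratio):
    # Scale the source-side estimate by the ratio between the two servers.
    return vm_power_source_w * power_saving_ratio

p_src = vm_power_on_source(30.0, watts_per_cpu_percent=1.2)    # illustrative values
print(vm_power_on_destination(p_src, power_saving_ratio=1.0))  # same-spec servers
```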

Yoshikazu Nakamura and 6 others, "VM Power Consumption Prediction Method Considering VM Migration," 2022 General Conference, B-14-6, March 15, 2022.

However, even when a workload is moved between servers of the same model and specifications (when the power-saving ratio is 1:1), the amount of power decrease on the source server and the amount of power increase on the destination server due to the workload movement may not be the same value. Therefore, the method using the power-saving ratio between servers as shown in Non-Patent Document 1 has the problem of low estimation accuracy.

This disclosure has been made in consideration of the above circumstances, and the purpose of this disclosure is to provide a technology that can estimate the power consumption of a workload with high accuracy.

The estimation device according to one aspect of the present disclosure includes an estimation unit that calculates an estimate of the power consumption of a given workload when it is operated on a destination server, by linking a server power consumption estimation model that learns at least the relationship between the resource usage status of a server and the power consumption of the server, and an additional learning model for workloads that learns information about the power consumption of workloads on the server.

The estimation program according to one aspect of the present disclosure causes a computer to function as the above estimation device.

According to this disclosure, a technology that can estimate the power consumption of a workload with high accuracy can be provided.

FIG. 1 is a diagram showing an overview of Measures 1 to 3.
FIG. 2 is a diagram showing the load state of a workload and the like.
FIG. 3 is a diagram showing characteristics of workloads.
FIG. 4 is a diagram showing the configuration of a system according to this embodiment.
FIG. 5 is a diagram showing a process flow of the power consumption estimation method.
FIG. 6 is a diagram showing a hardware configuration of the estimation device.

Below, an embodiment of the present disclosure will be described with reference to the drawings. In the description of the drawings, the same parts are given the same reference numerals and redundant description is omitted.

In this embodiment, the power consumption of a workload is estimated using machine learning. To estimate it with high accuracy in a short time, the following Measures 1 to 3 are taken. FIG. 1 shows an overview of Measures 1 to 3.

Measure 1:
In this embodiment, the workload power consumption estimation model is regarded as a combination of a server power consumption estimation model and an additional learning model for workloads (hereinafter also referred to as the workload additional learning model).

The data required to generate the server power consumption estimation model can be obtained in advance in large quantities as actual measurement data from servers and the like. Therefore, in learning phase 1, the relationship between the server's resource usage status and the like and the server's power consumption is learned, and a server power consumption estimation model is generated to estimate the power consumption of the server (including the power consumption of the workloads on the server) using various information output from the server.
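
As one concrete illustration of learning phase 1, the sketch below fits a regression model that maps per-server measurements (resource usage, temperatures, and a spec value) to the PDU-measured server power. The feature set, the toy values, and the choice of gradient-boosted trees are assumptions for illustration; the patent does not prescribe a particular model family.

```python
# Sketch of learning phase 1 (illustrative, not the prescribed implementation):
# learn server power consumption from measured server-side features.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# One row per sampling interval:
# [CPU util %, memory util %, server temp C, ambient temp C, rated power W (spec)]
X_server = np.array([
    [12.0, 30.0, 41.0, 24.0, 450.0],
    [55.0, 48.0, 52.0, 24.5, 450.0],
    [83.0, 61.0, 63.0, 25.0, 450.0],
    [20.0, 35.0, 44.0, 23.5, 600.0],
    [70.0, 66.0, 58.0, 24.0, 600.0],
])
y_power_w = np.array([110.0, 205.0, 290.0, 150.0, 310.0])   # PDU measurements

server_power_model = GradientBoostingRegressor().fit(X_server, y_power_w)
print(server_power_model.predict([[40.0, 50.0, 50.0, 24.0, 450.0]]))
```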

The actual measurement data include, for example, resource usage status (CPU usage, memory usage, etc.), the resource usage of workloads, the server's power consumption, the server's specifications, the server's temperature, and the ambient temperature around the server.

In learning phase 2, an additional learning model for workloads is generated to learn, among the information about the server, in particular the information about the power consumption of the workloads on the server. Specifically, the workload additional learning model receives as inputs the estimated value of the power consumption of a workload on the server estimated by the server power consumption estimation model, the power consumption value of the workload calculated from the actual measured value of the power consumption of the server (hereinafter simply referred to as the actual measured value of the power consumption of the workload), and the characteristic information of the workload (Measure 3 described below), and learns the relationships between them.

For example, the workload additional learning model learns the relationship between the estimated power consumption of a workload and the actual measured power consumption of the workload, the relationship between the estimated power consumption of a workload and the characteristic values of the workload, the relationship between the estimated power consumption of a workload and both the actual measured power consumption and the characteristic values of the workload, the relationship between the estimated power consumption values of the same workload on different servers, and the relationship between the estimated power consumption values of different workloads on the same server. For example, it learns the error between the estimated value and the actual measured value caused by subtle differences between servers, such as differences in manufacturing lots.

In the inference phase, the server power consumption estimation model and the workload additional learning model are linked to estimate the power consumption of a workload on a server. Specifically, the power consumption of a given workload when it runs on the destination server is estimated.

As a method for linking the models, for example, the "transfer learning" and "fine tuning" techniques used in the field of natural language processing are used. In other words, in the server power consumption estimation model, of the learning related to the server, the learning related to workloads is entrusted to the workload additional learning model.

When there is learning data, such as resource usage, that can be obtained in large quantities from servers, and learning data that cannot be obtained in such quantities per workload, using the technique illustrated above narrows the range of model parameters adjusted by teacher data down to the workload additional learning model. As a result, the power consumption of a workload can be estimated with high accuracy in a short time.
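
One simple way to realize this narrowing, sketched below under the assumption of a stacking-style linkage, is to keep the server power consumption estimation model frozen and fit only a small workload additional learning model on top of its output plus workload features; the Ridge correction model and the feature layout are assumptions for illustration, not the patent's prescribed linkage method.

```python
# Sketch of the model linkage (assumed stacking / fine-tuning style approach):
# the pre-trained server model stays frozen, and teacher data only adjusts the
# parameters of the small workload additional learning model.
import numpy as np
from sklearn.linear_model import Ridge

def fit_workload_additional_model(server_power_model, X_workload_usage,
                                  workload_features, measured_workload_power_w):
    # X_workload_usage: resource-usage rows fed to the frozen server model.
    # workload_features: e.g. [thread count, GPU offload flag] per row.
    base_estimate = server_power_model.predict(X_workload_usage)   # frozen model
    X_additional = np.column_stack([base_estimate, workload_features])
    return Ridge(alpha=1.0).fit(X_additional, measured_workload_power_w)

def estimate_workload_power(server_power_model, additional_model,
                            usage_row, feature_row):
    base = server_power_model.predict([usage_row])
    x = np.column_stack([base, [feature_row]])
    return float(additional_model.predict(x)[0])
```

Here `server_power_model` is assumed to be a model such as the one fitted in the learning-phase-1 sketch above; only the additional model's parameters are refit when new workload teacher data arrives.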

Measure 2:
As described in Measure 1, the workload additional learning model receives the actual measured power consumption of the workload as teacher data.

However, as shown in Figure 2(a), the load state of a workload generally changes from moment to moment. Also, as shown in Figure 2(b), it takes a certain amount of time for the power of servers A and B to stabilize before and after a workload move. It is therefore difficult to determine at what point the actual measured power consumption of the servers before and after the move has become stable.

Therefore, to obtain effective teacher data (appropriate actual measured values of power consumption), the server power consumption estimation model estimates the power consumption of the workload on the server and also estimates the time it takes for the server's power to stabilize (see Figure 1). This time can be estimated, for example, based on measured data on the server's power consumption and measured data on the server's resource usage.

The workload additional learning model uses, as teacher data, the actual measured power consumption values obtained after the estimated power stabilization time has elapsed. This increases the value of the actual measured power consumption values as teacher data, making it possible to estimate the power consumption of a workload with higher accuracy in a short period of time.
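
A small sketch of Measure 2 follows. The rolling-standard-deviation rule used to detect when power has settled is only an assumed criterion (the patent says the stabilization time can be estimated from measured power and resource data without fixing a method), and the numbers are illustrative.

```python
# Sketch of Measure 2 (assumed detection rule): estimate when server power has
# stabilized after a migration and keep only later samples as teacher data.
import numpy as np

def stabilization_index(power_w, window=5, tol_w=2.0):
    # First index at which the rolling std over `window` samples drops below
    # `tol_w` watts (assumed stability criterion).
    for i in range(window, len(power_w) + 1):
        if np.std(power_w[i - window:i]) < tol_w:
            return i
    return len(power_w)   # never stabilized within the observed trace

power_after_move = [180, 210, 230, 238, 241, 242, 242, 243, 242, 243]   # watts
i_stable = stabilization_index(np.array(power_after_move, dtype=float))
teacher_samples = power_after_move[i_stable:]   # used as teacher data
print(i_stable, teacher_samples)
```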

Measure 3:
Each workload has its own characteristics, which also cause differences in power consumption. Such characteristics include, for example, the number of threads and whether GPU acceleration can be used.

For example, workloads that can run on multiple threads will have different power consumption because the number of parallel processes will differ depending on the server's operating environment. For example, the number of threads cannot be increased for workload WL1 shown in Figure 3(a) due to the application's design, but workload WL2 shown in Figure 3(b) can use four threads. Therefore, even if the destination server is the same, if that server is capable of simultaneously executing four or more threads, workload WL2 will consume more power.

In addition, workloads that can be offloaded to a GPU can be processed in parallel in an environment where a GPU is available in addition to the CPU, which results in differences in power consumption. Depending on how the application is designed, the GPU may not be usable or only specific GPUs may be usable, which likewise causes differences in power consumption.

The workload additional learning model therefore uses workload characteristic information as teacher data (see Figure 1). This makes it possible to absorb differences in power consumption due to workload-specific characteristics, and to estimate the power consumption of a workload with even greater accuracy in a short period of time.
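
As an illustration of Measure 3, the sketch below encodes two of the workload characteristics mentioned above (maximum thread count and GPU-offload capability) together with the destination server's capabilities into numeric features for the workload additional learning model; the encoding itself is an assumption, since the patent does not fix a feature format.

```python
# Sketch of Measure 3 (assumed feature encoding): turn workload-specific
# characteristics into numeric features for the workload additional learning model.
def workload_feature_row(max_threads, gpu_offload_capable,
                         server_thread_slots, server_has_gpu):
    effective_threads = min(max_threads, server_thread_slots)
    uses_gpu = 1.0 if (gpu_offload_capable and server_has_gpu) else 0.0
    return [float(effective_threads), uses_gpu]

# WL1 cannot use more than one thread; WL2 can use four threads (cf. Figure 3).
print(workload_feature_row(1, False, server_thread_slots=8, server_has_gpu=True))
print(workload_feature_row(4, True,  server_thread_slots=8, server_has_gpu=True))
```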

FIG. 4 is a diagram showing the configuration of the system 1 according to this embodiment.

The system 1 includes an estimation device 10 that estimates the power consumption of workloads (VMs, containers, etc.) on each of two servers A and B, and an execution device 20 that determines and executes the movement of workloads between servers A and B based on the estimation results. The number of servers may be two or more.

The estimation device 10 includes a resource and power consumption collection unit 101, a server power consumption estimation model generation unit 102, a workload power consumption estimation unit 103, a weighting factor preprocessing unit 104, and a weighting factor adjustment unit 105.

The resource and power consumption collection unit 101 has a function for collecting information on the resource usage status (CPU usage, memory usage, etc.) from servers A and B. For example, the resource and power consumption collection unit 101 collects from server A the CPU usage of the host OS running on server A, and the CPU usage of the workloads running on server A (CPU usage of VM1, CPU usage of VM2, CPU usage of container 1, etc.).

The resource and power consumption collection unit 101 also has a function of collecting information on the power consumption of each server body from each PDU of servers A and B. For example, the resource and power consumption collection unit 101 collects the power consumption of server A from the PDU of server A.

The resource and power consumption collection unit 101 also has a function of collecting the specification information of each of servers A and B from the specification information storage DB.

The server power consumption estimation model generation unit 102 has a function of inputting the resource usage status of servers A and B, the power consumption of each server body, each server's specification information, the temperature of each server, the ambient temperature of each server, and the like as learning data, learning the relationships between them, and generating a server power consumption estimation model for estimating the power consumption of the server.

Specifically, the server power consumption estimation model estimates the power consumption of the workloads on servers A and B from the resource usage of the workloads on servers A and B. For example, the server power consumption estimation model estimates the power consumption when a specific workload runs on the destination server. The server power consumption estimation model also estimates the time it will take for the power to stabilize on the destination server after the workload has been moved.
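
One assumed way to give the server power consumption estimation model these two outputs is to train two regression heads on the same inputs, one for workload power and one for the stabilization time; the sketch below uses toy data and is not the patent's prescribed construction.

```python
# Sketch (assumed two-target setup): predict both workload power on the
# destination server and the time until the server's power stabilizes.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

X = np.array([[12.0, 30.0], [55.0, 48.0], [83.0, 61.0], [40.0, 52.0]])  # CPU %, mem %
y_workload_power_w = np.array([28.0, 70.0, 105.0, 55.0])   # derived from PDU data
y_stabilize_s = np.array([40.0, 90.0, 150.0, 75.0])        # seconds until stable

power_head = GradientBoostingRegressor().fit(X, y_workload_power_w)
stabilize_head = GradientBoostingRegressor().fit(X, y_stabilize_s)

x_new = [[60.0, 50.0]]
print(power_head.predict(x_new), stabilize_head.predict(x_new))
```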

The workload power consumption estimation unit (estimation unit) 103 has a function of inputting the resource usage status of the workloads on servers A and B as inference data, linking the server power consumption estimation model with the workload additional learning model, estimating from that inference data the power consumption of the workloads on servers A and B (for example, the power consumption when a specific workload runs on the destination server), and estimating the time until the power stabilizes on the destination server after the workload is moved.

Here, the server power consumption estimation model and the workload additional learning model will be explained again.

The server power consumption estimation model works in conjunction with the workload additional learning model, which is used to learn information about the power consumption of workloads on a server, to estimate the power consumption of the workloads running on the server and the time it takes for the server's power to stabilize. As mentioned above, the two models can be linked using techniques such as "transfer learning" and "fine tuning."

The workload additional learning model receives as inputs the estimated value of the workload power consumption on the server estimated by the server power consumption estimation model, the workload power consumption value calculated from the actual measured value of the server power consumption (the actual measured value of the workload power consumption), which is the measured value obtained after the time until the power on the server stabilizes following the workload move has elapsed, and the workload characteristic information stored in the workload characteristic information storage DB, and learns the relationships between them.

When the server power consumption estimation model is not linked to the workload additional learning model, it simply estimates the power consumption of the workload on the server based on the resource usage status of the workload and the like, but when linked to the workload additional learning model, it estimates the power consumption of the workload taking into account the characteristics of the workload, the actual measured power consumption of the moved workload, and the like.

Because the linked workload additional learning model has been trained using, as teacher data, parameters related to the characteristics of the workload and the actual measured power consumption of the moved workload, it is possible to estimate the power consumption of the workload with high accuracy in a short time.

The weighting factor preprocessing unit 104 has a function of obtaining from the workload power consumption estimation unit 103 an estimated time until power becomes stable on the server to which the workload is to be moved, and instructing the resource and power consumption collection unit 101 to collect the resource usage status and the like after that power stabilization time has elapsed.

The weighting factor adjustment unit 105 has a function of adjusting the weighting factors of the workload additional learning model using, as teacher data, the resource usage status and the like collected by the resource and power consumption collection unit 101 based on the instructions of the weighting factor preprocessing unit 104. For example, the weighting factor adjustment unit 105 changes the weighting factors of the workload additional learning model so as to reduce the error between the estimated value and the actual measured value of the power consumption of the workload.
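
For illustration, the sketch below adjusts a single scalar weighting factor by gradient descent so that the squared error between estimated and measured workload power shrinks; treating the additional model's weighting factors as one scalar and using plain gradient descent are assumptions, since the patent does not prescribe an optimizer.

```python
# Sketch of the weighting-factor adjustment (assumed gradient-descent update):
# reduce the error between estimated and actually measured workload power.
import numpy as np

def adjust_weight(w, base_estimates_w, measured_power_w, lr=1e-4, steps=200):
    x = np.asarray(base_estimates_w, dtype=float)
    y = np.asarray(measured_power_w, dtype=float)
    for _ in range(steps):
        grad = 2.0 * np.mean((w * x - y) * x)   # d/dw of the mean squared error
        w -= lr * grad
    return w

w_new = adjust_weight(1.0, base_estimates_w=[30.0, 52.0, 75.0],
                      measured_power_w=[33.0, 56.0, 80.0])
print(w_new)   # moves toward the least-squares scaling of the estimates
```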

The execution device 20 has a function for checking the upper limit of the power consumption of the destination server B, and determining whether the power consumption of server B, including the estimated power consumption of the workload to be moved, exceeds that upper limit.

The execution device 20 also has a function for determining whether or not to move the workload based on the result of this determination. If the upper limit is not exceeded, it determines that the workload can be moved and moves it from server A to server B; if the upper limit is exceeded, it determines that the workload cannot be moved. Note that this determination method is an example, and other determination methods may be used.
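
The example decision described above can be written as a one-line check; the sketch below uses assumed variable names and the simple power-cap test, and, as noted, other decision methods are possible.

```python
# Sketch of the execution device's example decision (assumed names): allow the
# move only if the destination server stays under its power consumption cap.
def can_move(estimated_workload_power_w, dest_current_power_w, dest_power_cap_w):
    return dest_current_power_w + estimated_workload_power_w <= dest_power_cap_w

if can_move(35.0, dest_current_power_w=520.0, dest_power_cap_w=600.0):
    print("move the workload from server A to server B")
else:
    print("keep the workload on server A")
```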

FIG. 5 is a diagram showing the process flow of the power consumption estimation method.

Step S1:
The resource and power consumption collection unit 101 collects the resource usage status (CPU usage, memory usage, etc.) of each of servers A and B, collects the power consumption of each server body from each PDU of servers A and B, and collects the specification information of each of servers A and B from the specification information storage DB.

Step S2:
The server power consumption estimation model generation unit 102 learns the relationships between the resource usage status of each of servers A and B, the power consumption of each server body, the specification information of each server, the temperature of each server, the ambient temperature of each server, and the like, and generates a server power consumption estimation model for estimating the power consumption of the workloads on servers A and B from the resource usage status of the workloads on servers A and B, and for estimating the time it will take for the power to stabilize on the server to which a workload has been moved.

Step S3:
The workload power consumption estimation unit 103 collects the resource usage status of each of servers A and B, collects the power consumption of each server body from each PDU of servers A and B, and collects the specification information of each of servers A and B from the specification information storage DB.

Thereafter, the workload power consumption estimation unit 103 links the server power consumption estimation model with the workload additional learning model, which learns the relationship between the estimated and actual measured power consumption values of a workload and the workload's characteristic information, calculates an estimate of the power consumption of the workload to be moved when it runs on the destination server B from the CPU utilization of that workload running on the source server A, and calculates an estimated time until the power on server B stabilizes after the workload is moved.

Step S4:
The workload power consumption estimation unit 103 transmits the estimate of the power consumption of the workload to be moved when it operates on the destination server B to the execution device 20. The execution device 20 determines whether the workload to be moved can be moved to server B, and if it determines that the move is possible, moves the workload from server A to server B.

Step S5:
The workload power consumption estimation unit 103 transmits to the weighting factor preprocessing unit 104 the estimated time until the power of server B stabilizes after the workload to be moved is moved. The weighting factor preprocessing unit 104 instructs the resource and power consumption collection unit 101 to collect the resource usage status of server B and the power consumption of servers A and B after that estimated time (power stabilization time) has elapsed.

Step S6:
After the workload to be moved has been moved from server A to server B and the above estimated time has elapsed, the resource and power consumption collection unit 101 collects the resource usage status of the moved workload from server B, and collects the power consumption of each server body from each PDU of servers A and B.

Step S7:
The weighting factor adjustment unit 105 adjusts the weighting factors of the workload additional learning model using, as teacher data, the resource usage status of the moved workload on server B collected in step S6 and the actual measured power consumption values of servers A and B.

For example, the weighting factor adjustment unit 105 calculates an optimal value of the weighting factor used in the machine learning of the workload additional learning model such that the difference (error) between the power consumption estimate obtained in step S3 and the actual power consumption value collected in step S6 becomes small, and changes the value of the weighting factor to that optimal value. At this time, the workload additional learning model is also re-trained using the actual power consumption values of servers A and B after the estimated time has elapsed, which were collected in step S6.

By repeating steps S1 to S7, the estimation results for the power consumption of the workload can be improved.

As described above, according to this embodiment, the estimation device 10 includes the workload power consumption estimation unit 103, which calculates an estimate of the power consumption of a given workload when it is operated on a destination server by linking a server power consumption estimation model that learns at least the relationship between the resource usage status of a server and the power consumption of the server to estimate the power consumption of the server, and an additional learning model for workloads that learns information about the power consumption of workloads on the server. This provides a technology that can estimate the power consumption of a workload with high accuracy.

The present disclosure is not limited to the above embodiment. Many modifications are possible within the scope of the gist of the present disclosure.

The estimation device 10 of the present embodiment described above can be realized, for example, as shown in FIG. 6, by using a general-purpose computer system including a CPU 901, a memory 902, a storage 903, a communication device 904, an input device 905, and an output device 906. The memory 902 and the storage 903 are storage devices. In this computer system, the CPU 901 executes a predetermined program loaded onto the memory 902, thereby realizing each function of the estimation device 10.

The estimation device 10 may be implemented in one computer. The estimation device 10 may be implemented in multiple computers. The estimation device 10 may be a virtual machine implemented in a computer.

The program for the estimation device 10 can be stored in a computer-readable recording medium such as an HDD, SSD, USB memory, CD, or DVD. The computer-readable recording medium is, for example, a non-transitory recording medium. The program for the estimation device 10 can also be distributed via a communication network.

REFERENCE SIGNS LIST
1 System
10 Estimation device
20 Execution device
101 Resource and power consumption collection unit
102 Server power consumption estimation model generation unit
103 Workload power consumption estimation unit
104 Weighting factor preprocessing unit
105 Weighting factor adjustment unit
901 CPU
902 Memory
903 Storage
904 Communication device
905 Input device
906 Output device

Claims (4)

1. An estimation device comprising:
an estimation unit that calculates an estimate of power consumption when a predetermined workload operates on a destination server by linking a server power consumption estimation model for estimating power consumption of a server by learning at least a relationship between a resource usage status of the server and the power consumption of the server, and an additional learning model for workloads for learning information on power consumption of a workload on the server.

2. The estimation device according to claim 1, wherein
the server power consumption estimation model estimates a time until power becomes stable on a destination server to which a workload has been moved, and
the additional learning model for workloads learns using an actual measured value of power consumption on the destination server after the estimated power stabilization time has elapsed.

3. The estimation device according to claim 1, wherein
the additional learning model for workloads learns using characteristic information of a workload, and
the estimation unit calculates an estimate of power consumption when a predetermined workload operates on a destination server by linking the server power consumption estimation model and the additional learning model for the workload.

4. An estimation program that causes a computer to function as the estimation device according to claim 1.
PCT/JP2023/032201 2023-09-04 2023-09-04 Estimation device and estimation program WO2025052509A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2023/032201 WO2025052509A1 (en) 2023-09-04 2023-09-04 Estimation device and estimation program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2023/032201 WO2025052509A1 (en) 2023-09-04 2023-09-04 Estimation device and estimation program

Publications (1)

Publication Number Publication Date
WO2025052509A1 true WO2025052509A1 (en) 2025-03-13

Family

ID=94923176

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/032201 WO2025052509A1 (en) 2023-09-04 2023-09-04 Estimation device and estimation program

Country Status (1)

Country Link
WO (1) WO2025052509A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110213997A1 (en) * 2010-02-26 2011-09-01 Microsoft Corporation Virtual machine power consumption measurement and management
WO2013132735A1 (en) * 2012-03-08 2013-09-12 日本電気株式会社 Virtual machine managing device and virtual machine managing method
CN106775933A (en) * 2016-11-29 2017-05-31 深圳大学 A kind of virtual machine on server cluster dynamically places optimization method and system
US20210042140A1 (en) * 2019-08-08 2021-02-11 Vigyanlabs Innovations Private Limited System and method to analyze and optimize application level resource and energy consumption by data center servers
JP2021521518A (en) * 2018-04-09 2021-08-26 アリババ・グループ・ホールディング・リミテッドAlibaba Group Holding Limited Virtual machine scheduling method and equipment
WO2023152896A1 (en) * 2022-02-10 2023-08-17 日本電信電話株式会社 Virtual machine power consumption predicting device, virtual machine power consumption predicting method, and program



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23951414

Country of ref document: EP

Kind code of ref document: A1