JP5151203B2 - Job scheduling apparatus and job scheduling method - Google Patents

Job scheduling apparatus and job scheduling method

Info

Publication number
JP5151203B2
Authority
JP
Japan
Prior art keywords
computer
job
temperature
power consumption
computers
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
JP2007079429A
Other languages
Japanese (ja)
Other versions
JP2008242614A (en)
Inventor
俊祐 秋元
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Priority to JP2007079429A
Publication of JP2008242614A
Application granted
Publication of JP5151203B2
Legal status: Expired - Fee Related (current)
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Power Sources (AREA)

Description

The present invention relates to a job scheduling apparatus and a job scheduling method, and more particularly to a job scheduling apparatus and a job scheduling method for assigning jobs to computers.

In recent years, as the processing capacity of computers per unit volume has advanced, power consumption and heat generation per unit volume have also continued to increase. Recently, the processing performance of a computer has come to be limited by the heat generated by its own power consumption. The same tendency applies to parallel computer systems composed of a plurality of computers. In such systems, however, the computers are often installed close to one another, so the heat problem demands more careful handling than the thermal countermeasures provided for individually installed computers.

As a thermal countermeasure in a parallel computer system, the safest approach is considered to be installing the computers some distance apart in advance, so that no computer stops or loses performance due to overheating even when every computer is operated at maximum output. However, considering how computers are actually operated in recent years, situations in which all computers run at maximum output are in many cases rare. If servers are placed apart to allow for such operation, the computing capacity per unit volume of the whole system drops. When performance is required within a limited volume, the computers must be placed fairly close together and thermal countermeasures must be taken through operation. In addition, capital investment in machine rooms has often not kept pace with the increase in computer power consumption, and there is frequently a demand to operate computers that generate a lot of heat in a machine room with insufficient cooling capacity. In such cases too, the heat problem needs to be solved through operation.

As one form of such operation, there is a technique of devising job placement so that heat-generating computers are dispersed and overheating of a particular computer is prevented. However, no job scheduling technique has yet been proposed that places jobs precisely while taking into account the thermal effect that placing a job on one computer has on other computers.

As one flexible thermal countermeasure in operation, job scheduling can be devised so that heat generation is dispersed and the temperature at any one location is kept from rising excessively. As one such technique, a job scheduling apparatus has already been proposed that monitors the temperature of each computer and always assigns a job to the computer with the lowest temperature (see, for example, Patent Document 1).

Also known is a multiprocessor system in which a control IC checks whether there is any processor that may overheat. If there is no possibility of overheating, threads are dispatched according to the processor assignment. If there is a possibility of overheating, job assignment waits until a new free processor becomes available in addition to the currently free processors, thereby suppressing processing stoppages caused by overheating (see, for example, Patent Document 2).

Further, a processor system is also known that calculates a predicted temperature of a processor and assigns a task to the processor if the predicted temperature is within a specified range (see, for example, Patent Document 3).

Patent Document 1: JP 2004-126968 A (page 6)
Patent Document 2: JP 2006-099624 A (page 9)
Patent Document 3: JP 2006-133995 A (pages 12-13)

However, the conventional job scheduling apparatus described in Patent Document 1 does not consider the change in ambient temperature caused by job assignment. For example, assigning a job with high power consumption to computer B, located near computer A that has been operating just below its limit temperature, may raise the ambient temperature around computer A and cause computer A to go down.

In the inventions described in Patent Documents 2 and 3, temperature is predicted with reference only to the temporary heat generated by assigning a task, so the prediction is not accurate and the risk of a computer going down is high. Moreover, none of the inventions described in Patent Documents 1 to 3 predicts temperature using the secondary temperature change that the heat generated by a job assignment causes in surrounding computers, nor do they take into account the heat generated by the surrounding computers. Consequently, they cannot draw out higher performance within the range in which no computer goes down.

The present invention has been made in view of the above points. Its object is to provide a job scheduling apparatus and a job scheduling method that, by predicting temperature with the heat generated by surrounding computers also taken into account, can prevent computer performance degradation and system failure caused by overheating of the system.

Another object of the present invention is to provide a job scheduling apparatus and a job scheduling method that can draw out the maximum capability within a specified temperature limit even when computers are operated in a machine room with insufficient cooling capacity.

To achieve the above object, the job scheduling apparatus of the present invention is a job scheduling apparatus that is connected to a plurality of computers via a network and assigns jobs to the individual computers, comprising: job classification means that classifies all assignable jobs into a plurality of types in advance, has a job history table storing, for each type and for each of the plurality of computers, a history associating the power consumption of the computer to which a job was assigned with the execution time of the job, and outputs a classification result indicating the type of a desired job to be assigned; assignment determination means that, from the average power consumption and average job execution time for the type of the desired job, obtained by referring to the job history table of each computer based on the classification result of the job classification means, predicts the increase in energy consumption of each computer when the desired job is assigned to each of the plurality of computers separately, and, from the predicted increase in energy consumption and at least the internal temperature information and power consumption information detected for and input from each of the plurality of computers, predicts, for each of the plurality of computers in turn, the temperature change of one computer and of the computers around that one computer when the desired job is assigned to that one computer, and determines the computer to which the desired job is to be assigned based on the resulting predicted temperature changes; job assignment means that assigns the desired job to the computer determined by the assignment determination means; and job history table management means that acquires the power consumption information of the computer to which the job was assigned after a fixed time and, after the job completes, adds to the job history table the time taken to complete the job and the power consumption during job execution, using the job type as an index.

To achieve the above object, the job scheduling apparatus of the present invention also has an environment information storage device that stores environment information for each computer based on size information of each of the plurality of computers, computer installation position information, and cooling capacity information of the installation location, and each of the plurality of computers further has environmental temperature measuring means for measuring the environmental temperature.
The assignment determination means takes, as the predicted temperature of each computer, the sum of the predicted temperature change of that computer (the job-assigned computer, which is the one computer to which the desired job is assigned, or a computer around it) and the environmental temperature from the environmental temperature measuring means. It determines as the computer to which the job is assigned the job-assigned computer for which (i) no predicted temperature exceeds the maximum allowable environmental temperature set for the job-assigned computer and each of its surrounding computers based on the environment information read from the environment information storage device, and (ii) the smallest difference between predicted temperature and maximum allowable environmental temperature is the largest among all candidates. If no such computer exists, the job assignment is deferred.

In this invention, when the cooling capacity of the place where the computers are installed (the machine room) is insufficient, a computer suitable for job assignment can be found within the specified temperature limit (the maximum allowable environmental temperature). In this case the maximum performance of the computer system cannot be drawn out, but it becomes possible to deliberately hold down investment in the machine room in view of the cost-performance balance, or to run the latest high-performance computers before the machine room facilities are ready and upgrade the facilities later.

Also, in this invention, when the prediction finds no computer that satisfies the conditions, the job assignment is deferred until a currently assigned job finishes. This makes it possible to avoid system failures and performance degradation caused by temperature problems.

To achieve the above object, in the job scheduling apparatus of the present invention, the assignment determination means is means that, based on at least the internal temperature information and power consumption information detected for and input from each of the plurality of computers, measures the difference in each computer's temperature before and after the desired job is assigned to each computer; obtains, for each computer, a correlation function from the correlation between the rise in energy consumption indicated by the power consumption information and the rise in temperature indicated by the measured temperature difference; uses the correlation function to estimate the temperature rise in surrounding computers from the rise in energy consumption of each computer; and determines the computer to which the desired job is to be assigned based on the estimate.
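One way to picture the correlation function described above is a least-squares line relating observed energy-consumption rises to observed temperature rises. The sketch below assumes a linear relation, which the text does not specify; the function names and the sample observations are hypothetical and serve only as illustration.

```python
from statistics import mean


def fit_correlation(energy_rises, temp_rises):
    """Fit temp_rise ~ slope * energy_rise + intercept by least squares.

    Assumption: the correlation function is linear. The source only says
    that a correlation function is derived from the observed pairs.
    """
    if len(energy_rises) != len(temp_rises) or len(energy_rises) < 2:
        raise ValueError("need at least two paired observations")
    mx, my = mean(energy_rises), mean(temp_rises)
    sxx = sum((x - mx) ** 2 for x in energy_rises)
    if sxx == 0:
        raise ValueError("energy rises must not all be equal")
    sxy = sum((x - mx) * (y - my) for x, y in zip(energy_rises, temp_rises))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


def estimate_temp_rise(model, energy_rise):
    """Estimate the temperature rise for a predicted energy rise."""
    slope, intercept = model
    return slope * energy_rise + intercept


# Hypothetical observations for one (assigned computer, observed computer) pair.
observed_energy = [10.0, 20.0, 30.0]
observed_temp = [0.1, 0.2, 0.3]
model = fit_correlation(observed_energy, observed_temp)
```

In such a sketch, one model would be kept per pair of computers, so that the effect of a job on a neighbor can differ from its effect on the computer running it.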

To achieve the above object, the job scheduling method of the present invention is a job scheduling method for an apparatus that is connected to a plurality of computers via a network and assigns jobs to the individual computers, comprising: a first step of classifying all assignable jobs into a plurality of types in advance, holding a job history table storing, for each type and for each of the plurality of computers, a history associating the power consumption of the computer to which a job was assigned with the execution time of the job, and outputting a classification result indicating the type of a desired job to be assigned; a second step of predicting, from the average power consumption and average job execution time for the type of the desired job, obtained by referring to the job history table of each computer based on the classification result of the first step, the increase in energy consumption of each computer when the desired job is assigned to each of the plurality of computers separately; a third step of predicting, from the increase in energy consumption predicted in the second step and at least the internal temperature information and power consumption information detected for and input from each of the plurality of computers, the temperature change of one computer and of the computers around that one computer when the desired job is assigned to that one computer, doing so for each of the plurality of computers in turn, and determining the computer to which the desired job is to be assigned based on the resulting predicted temperature changes; a fourth step of assigning the desired job to the computer determined in the third step; and a fifth step of acquiring the power consumption information of the computer to which the job was assigned after a fixed time and, after the job completes, adding to the job history table the time taken to complete the job and the power consumption during job execution, using the job type as an index.

In the present invention, the job scheduling apparatus holds information such as the installation position and surrounding environment of each computer, predicts by numerical calculation the temperature change around each computer caused by submitting or moving a job, and can determine the computer to which the job is assigned based on that prediction. Predicting the temperature change around each computer requires predicting in advance the amount of heat generated when a job is assigned. The present invention keeps a history of the power consumption increases observed when jobs were actually assigned and infers from it. Jobs are classified into several types in advance, and a history is kept for each type. The predicted power consumption can be obtained with reasonable accuracy from the power consumption before the job is assigned and, for example, the average of the history information.

According to the present invention, the temperature changes of the computer itself and of all surrounding computers are predicted for the case where the job is assigned to each computer, and the computer to which the desired job is assigned is determined based on the predicted values. Job placement can thus be decided from a temperature prediction that includes not only the heat generated by the assigned job but also the heat generated by the surrounding computers. This prevents a particular computer among those making up the system from overheating, and so prevents system failure and degradation of overall performance.

Also according to the present invention, the maximum capability can be drawn out within a specified temperature limit even when computers are operated in a machine room with insufficient cooling capacity. When the prediction finds no computer that satisfies the conditions, the job assignment is deferred until a currently assigned job finishes, which makes it possible to avoid system failures and performance degradation caused by temperature problems.

Next, the best mode for carrying out the present invention will be described with reference to the drawings. FIG. 1 is a configuration diagram of an embodiment of the job scheduling apparatus according to the present invention. In the figure, a parallel computer system 100 is a computer system in which computers 111, 112, 113, and so on are connected by a network 101. Jobs in the parallel computer system 100 are assigned by a scheduling device 103 connected to the same network 101. Each of the computers 111, 112, and 113 has an environmental temperature sensor 121, a CPU temperature sensor 122, and a power consumption sensor 123, and the scheduling device 103 can read their values through the network 101. A maximum allowable environmental temperature 124 is also set in each of the computers 111, 112, and 113 and can likewise be read by the scheduling device 103. This maximum allowable environmental temperature information is assumed to be set in advance in, for example, a nonvolatile memory. As an alternative implementation, the maximum allowable environmental temperature 124 may be rewritable from outside and set according to the cooling capacity of the machine room.

In the present embodiment, each of the computers 111, 112, and 113 automatically adjusts the cooling temperature inside the computer according to the values of the environmental temperature sensor 121 and the CPU temperature sensor 122. This cooling control is performed by, for example, a BMC (Baseboard Management Controller). Because of this function, the present embodiment does not consider cooling inside each of the computers 111, 112, and 113.

An environment information storage device 102 for storing information on the positions where the computers 111, 112, and 113 are installed is also connected to the network 101. This environment information can also be read by the scheduling device 103. As an alternative implementation, the position information may be stored in each of the computers 111, 112, and 113. The environment information storage device 102 also stores, as data for more accurate temperature prediction, size information (width, depth, height) of each of the computers 111, 112, and 113, structural information of the machine room in which they are installed, and the cooling capacity of the machine room. By reading this information and using it as boundary conditions in the calculation, the scheduling device 103 can predict temperatures accurately.

FIG. 2 is a configuration diagram of an example of the scheduling device 103. In the figure, an assignment determination unit 203 predicts the temperature change that results when a job is assigned among the computers, based on the temperature sensor values and power consumption sensor values of the computers 111, 112, and 113, the predicted power consumption when the job is assigned, environment parameters, and the like, and determines the computer to which the job is assigned based on the predicted values. A job classifier 201, a job history table 202, and a history table management unit 204 are provided for predicting the power consumption of a job.

Because it is difficult to predict the increase in power consumption for every individual job, jobs are classified in advance into a plurality of types, and the job history table 202 keeps a history for each type. The job types here are, for example, database search transactions, Web requests, and data processing, and the scheduling device 103 performs the classification.

FIG. 3 shows an example of the job history table 202. As shown in the figure, the job history table 202 stores a history, power consumption increase, and job time in association with each job type: Web request, database search, data processing 1, and data processing 2. A job history table 202 is provided for each of the plurality of computers 111, 112, and 113 for which the scheduling device 103 connected to the network 101 in FIG. 1 determines job assignments. The job history table 202 shown in FIG. 3 is therefore the table for one particular computer connected to the network, and the "history" in it is the history of that one computer.
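The per-computer, per-type layout of the job history table can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class and field names, the units (watts, seconds), and the sample values are all assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class JobRecord:
    power_increase_w: float  # power consumption increase while the job ran (unit assumed)
    job_time_s: float        # time until the job completed (unit assumed)


@dataclass
class JobHistoryTable:
    """History for ONE computer, keyed by job type, as FIG. 3 is described."""
    records: dict = field(default_factory=lambda: defaultdict(list))

    def add(self, job_type, power_increase_w, job_time_s):
        # Called after a job completes, using the job type as the index.
        self.records[job_type].append(JobRecord(power_increase_w, job_time_s))


# One table per computer (computer numbers taken from FIG. 1).
tables = {cid: JobHistoryTable() for cid in (111, 112, 113)}
tables[111].add("web_request", 50.0, 2.0)  # hypothetical values
tables[111].add("web_request", 70.0, 4.0)  # hypothetical values
```

Keeping a separate table per computer matches the statement that the "history" in FIG. 3 belongs to one computer only.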

Next, the operation of the present embodiment will be described in detail. First, the installation position information of each of the computers 111, 112, and 113 is entered in advance into the environment information storage device 102 in FIG. 1. This installation position information may be simple information such as the position in the vertical, horizontal, and height directions within the machine room. The installation position information is assumed to be entered manually, but means may be provided for entering it automatically using a known technique such as RFID. Similarly, the various parameters needed for calculating the environmental temperature are set in the environment information storage device 102 in FIG. 1. These include, for example, the size information of each of the computers 111, 112, and 113, the structural information of the machine room, the heat exhaust direction of each computer, and the heat exhaust capacity of the machine room as a whole.

FIGS. 4(A) and 4(B) are a configuration diagram and a flowchart for explaining how the computer to which a job is assigned is determined. As shown in FIGS. 4(A) and 4(B), the job classifier 201 classifies the job (step S1), and the assignment determination unit 203 obtains from the job history table 202 the predicted increase in power consumption when the job is assigned. At the same time, it acquires the value of the environmental temperature sensor 121 and the environment parameters of each of the computers 111, 112, and 113 (step S2).

Next, the assignment determination unit 203 predicts the environmental temperature of all the computers 111, 112, and 113 after a fixed time for the case where the job is assigned to each computer, and determines the assignment destination (step S3). The job classifier 201 then assigns the job to the assignment destination (step S4).

The above job assignment will now be described concretely. The scheduling device 103 estimates the increase in power consumption W_N when a given job is assigned to a given computer N, for example from the average of the history for each job type in FIG. 3. If FIG. 3 is, for example, the job history table of the computer 111 in FIG. 1, the increase in average energy consumption when each job is assigned to the computer 111 is obtained from the average power consumption and the average time, as shown in Table 1 below.

[Table 1 (image not reproduced): increase in average energy consumption for each job type when assigned to the computer 111]
This is calculated for each computer, and the computer to which the job is assigned is determined based on the energy consumption.
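The estimate that Table 1 summarizes can be sketched as below. The text only says the value is obtained "from" the average power consumption and the average time; treating it as the product of the two averages is an assumption, as are the function name, the units, and the sample history.

```python
from statistics import mean


def estimate_energy_increase(history):
    """Estimate the energy increase for one job type on one computer.

    history: list of (power_increase, job_time) pairs.
    Assumption: energy = average power increase x average job time.
    """
    if not history:
        raise ValueError("no history for this job type")
    avg_power = mean(p for p, _ in history)
    avg_time = mean(t for _, t in history)
    return avg_power * avg_time


# Hypothetical history for a Web request job on one computer (W, s).
web_history = [(50.0, 2.0), (70.0, 4.0)]
energy = estimate_energy_increase(web_history)  # 60.0 W x 3.0 s
```

Repeating this for every computer's own history gives one estimate per candidate assignment, which is the input to the temperature prediction that follows.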

Next, the method of determining the job assignment destination in step S4 will be described more concretely. First, based on the increase in power consumption Ws(N,K) when job K is assigned to a computer N, the temperature rise of the computer itself (here computer N, the one computer to which job K is assigned) and of the surrounding computers is estimated. Let Tup(M,N,K) be the estimated temperature rise of computer N or of a surrounding computer M when job K is assigned to computer N. A combination of N and K for which the upper-limit environmental temperature Tlim(M) is not exceeded for any surrounding computer M is searched for, and the assignment is made.

For example, let the computers 111, 112, and 113 in FIG. 1 be denoted computers 1, 2, and 3, and let their environmental temperatures before the job is assigned be as shown in Table 2. Let the upper-limit environmental temperature of each computer be 40°C, and let the average energy consumption of each computer when a Web request job is assigned to it be as shown in Table 3.

[Table 2: environmental temperature of each computer before job assignment; image not reproduced]
[Table 3: average energy consumption of each computer when a Web request job is assigned to it; image not reproduced]

Next, using this average energy consumption, the estimated temperature rise of each computer y (the own computer and the other computers) when the job is assigned to computer x is obtained as shown in Table 4, and the estimated temperature of each computer y when the job is assigned to computer x is obtained as shown in Table 5.

[Table 4: estimated temperature rise of computer y when the job is assigned to computer x; image not reproduced]
[Table 5: estimated temperature of computer y when the job is assigned to computer x; image not reproduced]

In Tables 4 and 5, the rows (vertical axis) correspond to computer x and the columns (horizontal axis) correspond to computer y. Thus, for example, "0.3" in row 2, column 1 of Table 4 is the estimated temperature rise of computer 1 (111) when the job is assigned to computer 2 (112), and "37.5" in row 2, column 1 of Table 5 is the estimated temperature of computer 1 (111) when the job is assigned to computer 2 (112).

As a result, Table 5 shows the following. When the job is assigned to computer 1 (111), the computer y with the smallest difference between the upper-limit environmental temperature (40°C) and the estimated temperature is computer 2 (112), at 1.2°C. When the job is assigned to computer 2 (112), it is computer 2 (112), at 0.2°C. When the job is assigned to computer 3 (113), it is computer 2 (112), at 1.25°C. The largest of these differences, 1.25°C, occurs when the job is assigned to computer 3 (113), so that assignment is estimated to leave the largest margin below the upper-limit environmental temperature (40°C). The scheduling apparatus 103 therefore assigns the job to computer 3 (113).
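
The final comparison in this example can be checked with a few lines of code. The three margins are the values quoted in the paragraph above; the variable names are illustrative.

```python
# Smallest margin (upper limit minus estimated temperature, in deg C) over all
# computers y, for each candidate computer x, as quoted in the text.
min_margin = {1: 1.2, 2: 0.2, 3: 1.25}

# Pick the candidate whose worst-case margin is the largest.
best = max(min_margin, key=min_margin.get)
```

Here `best` is computer 3, matching the assignment chosen in the text.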

Next, management of the job management table (the job history table 202) is described with reference to the configuration diagram of the scheduling apparatus 103 in FIG. 5(A) and the flowchart in FIG. 5(B). First, the job classifier 201 classifies jobs in advance into a number of types (step S11). The job classifier 201 then determines the computer to which the job is assigned (step S12) and submits the job to that computer (step S13), after which the history table management unit 204 reads the value of the power consumption sensor 123 of that computer once a fixed time has elapsed (step S14).

Next, upon confirming that the computer has completed the job, the history table management unit 204 appends to the job history table 202, indexed by job type, the time taken to complete the job and the power consumption value detected by the power consumption sensor 123 during job execution (step S15). Because the job history table 202 can hold only a finite number of entries for each job type, entries are overwritten in round-robin fashion.
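
A bounded, per-job-type history with round-robin overwriting could be sketched as below. The class name `JobHistory`, its fields, and the capacity parameter are assumptions made for illustration; the patent does not specify the table layout at this level.

```python
class JobHistory:
    """Fixed-size history per job type, overwritten in round-robin order."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}    # job_type -> list of (duration_s, power_w)
        self.next_slot = {}  # job_type -> index of the slot to overwrite next

    def record(self, job_type, duration_s, power_w):
        slots = self.entries.setdefault(job_type, [])
        if len(slots) < self.capacity:
            # Table not yet full for this job type: append a new entry.
            slots.append((duration_s, power_w))
        else:
            # Table full: overwrite the oldest slot and advance the cursor.
            i = self.next_slot.get(job_type, 0)
            slots[i] = (duration_s, power_w)
            self.next_slot[job_type] = (i + 1) % self.capacity
```

With a capacity of 2, a third record for the same job type replaces the first one, and a fourth replaces the second.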

The estimated temperature rise of another computer can be obtained from an increase in power consumption as follows. Each time it assigns a job, the scheduling apparatus 103 first records a power consumption history like that of FIG. 3, derives the energy consumption from it, and then takes the difference in the temperature of every computer between the time before the job was assigned and the time the job finished. The scheduling apparatus 103 then builds, for each computer, a correlation diagram like that of FIG. 6 between the increase in energy consumption and the temperature rise, and derives a correlation function from it (for example, a linear expression). Using this correlation function, the scheduling apparatus 103 estimates, for each computer, the temperature rise at the other computers from the increase in that computer's energy consumption.
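
One way to obtain a linear correlation function of this kind is an ordinary least-squares fit, sketched below. The patent only says the function may be, for example, a linear expression; the choice of least squares, the function names, and the sample data in the usage note are assumptions.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a * x + b; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def predict_rise(coeffs, energy_increase):
    """Estimated temperature rise for a given increase in energy consumption."""
    a, b = coeffs
    return a * energy_increase + b
```

For instance, fitting hypothetical observations `xs = [0, 100, 200]` (energy increase on one computer) against `ys = [0.0, 0.5, 1.0]` (temperature rise at a neighbor) gives a slope of about 0.005 and an intercept of about 0. One such fit would be kept for each pair of source computer and affected computer.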

The environmental temperature can also be predicted by solving the thermal diffusion numerically. In this case, the scheduling apparatus 103 reads the necessary information from the environmental information storage device 102, sets up a grid of suitable size, and numerically computes the time evolution of the temperature change due to heat conduction and convection. When it judges that the system has come reasonably close to equilibrium, it stops the calculation and stores the predicted temperature change at each of the other computers at that point.
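
As a rough illustration of this kind of calculation, the sketch below relaxes a one-dimensional grid with an explicit finite-difference scheme until the temperatures stop changing. It is a deliberately simplified assumption: it models conduction only (no convection), uses a 1-D grid with fixed end temperatures, and the function name and parameters are hypothetical.

```python
def relax_temperature(temps, alpha=0.25, tol=1e-6, max_steps=100_000):
    """Explicit finite-difference heat diffusion on a 1-D grid.

    temps -- initial cell temperatures; the two end cells are held fixed
    alpha -- diffusivity * dt / dx**2 (must be <= 0.5 for stability)
    tol   -- stop once no cell changes by more than this in one step
    """
    if not 0 < alpha <= 0.5:
        raise ValueError("alpha must be in (0, 0.5] for a stable explicit scheme")
    t = list(temps)
    for _ in range(max_steps):
        new = t[:]
        for i in range(1, len(t) - 1):
            new[i] = t[i] + alpha * (t[i - 1] - 2 * t[i] + t[i + 1])
        # Treat the system as near equilibrium once the update is tiny.
        if max(abs(a - b) for a, b in zip(new, t)) < tol:
            return new
        t = new
    return t
```

Starting from `[20.0, 20.0, 20.0, 20.0, 40.0]`, the interior cells settle close to the linear profile between the two fixed ends. A realistic machine-room model would need a 2-D or 3-D grid, heat sources at the computers, and airflow.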

The scheduling apparatus 103 makes this prediction for all of the computers 111, 112, and 113 and adopts the job assignment that leaves the most relaxed temperature condition. The assignment with the most relaxed temperature condition is the one that maximizes the difference between the maximum allowable environmental temperature and the environmental temperature for the computer under the most severe temperature condition (the computer for which that difference is smallest). That is, let Tij be the predicted environmental temperature of computer j (including computer i itself) when the job is assigned to computer i, and let TCj be the maximum allowable environmental temperature of computer j. For each candidate computer i, take the smallest difference TCj-Tij over all computers j; then select the computer i for which this smallest difference is largest, that is, the computer i that attains max_i(min_j(TCj-Tij)). This computer i is the assignment destination with the most relaxed temperature condition.

If no such computer i exists, the scheduling apparatus 103 holds the job assignment pending, waits until at least one of the currently running jobs finishes, and then determines the assignment destination again. The maximum allowable environmental temperature of each computer can also be made rewritable, so that when the cooling capacity of the machine room is insufficient, this value can be adjusted to prevent overheating beyond that cooling capacity. The setting value TC_limit used in this case is determined in advance when the computer is installed.
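
The max_i(min_j(TCj-Tij)) selection, together with the pending case, can be sketched as follows. The sketch assumes that "no such computer i exists" means every candidate would push at least one computer above its maximum allowable environmental temperature; the function name and data layout are hypothetical.

```python
def choose_computer(predicted, tc):
    """Return the computer i that maximizes min_j(tc[j] - predicted[i][j]).

    predicted[i][j] -- predicted temperature of computer j if the job goes to i
    tc[j]           -- maximum allowable environmental temperature of computer j
    Returns None (the assignment stays pending) when every candidate would
    push some computer above its limit.
    """
    best, best_margin = None, None
    for i, row in predicted.items():
        margin = min(tc[j] - row[j] for j in row)
        if margin < 0:
            continue  # assumption: a negative margin disqualifies candidate i
        if best_margin is None or margin > best_margin:
            best, best_margin = i, margin
    return best
```

A caller that receives `None` would wait for a running job to finish and then call the function again with fresh predictions.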

FIG. 1 is a configuration diagram of one embodiment of the job scheduling apparatus of the present invention.
FIG. 2 is a configuration diagram of an example of the scheduling apparatus in FIG. 1.
FIG. 3 is a diagram showing an example of the job history table in FIG. 2.
FIG. 4 is a configuration diagram of the scheduling apparatus and an operation flowchart for explaining how the computer to which a job is assigned is determined.
FIG. 5 is a configuration diagram of the scheduling apparatus and an operation flowchart for explaining how the job management table is managed.
FIG. 6 is a diagram showing an example of the correlation between the increase in a computer's energy consumption and its temperature rise.

Explanation of symbols

100 Parallel computer system
101 Network
102 Environmental information storage device
103 Scheduling apparatus
111, 112, 113 Computer
121 Environmental temperature sensor
122 CPU temperature sensor
123 Power consumption sensor
124 Maximum allowable environmental temperature
201 Job classifier
202 Job history table
203 Assignment determination unit
204 History table management unit

Claims (6)

A job scheduling apparatus that is connected to a plurality of computers via a network and assigns a job to each computer, the apparatus comprising:
job classification means that classifies all assignable jobs into a plurality of types in advance, has a job history table storing, for each type and for each of the plurality of computers, a history associating the power consumption of the computer to which a job was assigned with the execution time of the job, and outputs a classification result giving the type corresponding to a desired job to be assigned;
assignment determination means that predicts, from the average power consumption and average job execution time for the type of the desired job obtained by referring to the job history table of each computer based on the classification result of the job classification means, the increase in energy consumption of each computer when the desired job is assigned separately to each of the plurality of computers; that predicts, from the predicted increase in energy consumption and from at least internal temperature information and power consumption information detected for each computer and input from each of the plurality of computers, the temperature change of each of one computer and the computers surrounding the one computer when the desired job is assigned to the one computer among the plurality of computers, doing so for each of the plurality of computers; and that determines the computer to which the desired job is assigned based on the obtained predicted temperature changes of the computers;
job assignment means that assigns the desired job to the computer determined by the assignment determination means; and
job history table management means that acquires power consumption information of the computer to which the job was assigned after a fixed time and, after completion of the job, adds to the job history table, with the job type as an index, the time until job completion and the power consumption value during job execution.
The job scheduling apparatus according to claim 1, further comprising an environmental information storage device that stores environmental information for each computer based on size information of each of the plurality of computers, computer installation position information, and cooling capacity information of the installation location, wherein each of the plurality of computers further has environmental temperature measurement means that measures an environmental temperature, and
wherein the assignment determination means
determines, as the computer to which the job is assigned, the job assignment computer (the one computer to which the desired job is assigned) for which the predicted temperature of each computer, being the sum of the predicted temperature change of each of the job assignment computer and the computers surrounding it and the environmental temperature from the environmental temperature measurement means, does not exceed the maximum allowable environmental temperature set for each of the job assignment computer and its surrounding computers based on the environmental information read from the environmental information storage device, and for which the minimum difference between the predicted temperature and the maximum allowable environmental temperature is the largest among those minimum differences, and holds the job assignment pending when no computer showing the maximum difference exists.
The job scheduling apparatus according to claim 1, wherein the assignment determination means measures, based on at least the internal temperature information and power consumption information detected for each computer and input from each of the plurality of computers, the difference in the temperature of each computer before and after the desired job is assigned to each computer; obtains, for each computer, a correlation function from the correlation between the increase in energy consumption indicated by the power consumption information and the temperature rise indicated by the measured temperature difference; uses the correlation function to estimate the temperature rise at surrounding computers from the increase in energy consumption of each computer; and determines the computer to which the desired job is assigned based on the estimate.
A job scheduling method for assigning a job to each of a plurality of computers connected via a network, the method comprising:
a first step of classifying all assignable jobs into a plurality of types in advance, having a job history table storing, for each type and for each of the plurality of computers, a history associating the power consumption of the computer to which a job was assigned with the execution time of the job, and outputting a classification result giving the type corresponding to a desired job to be assigned;
a second step of predicting, from the average power consumption and average job execution time for the type of the desired job obtained by referring to the job history table of each computer based on the classification result of the first step, the increase in energy consumption of each computer when the desired job is assigned separately to each of the plurality of computers;
a third step of predicting, from the increase in energy consumption predicted in the second step and from at least internal temperature information and power consumption information detected for each computer and input from each of the plurality of computers, the temperature change of each of one computer and the computers surrounding the one computer when the desired job is assigned to the one computer among the plurality of computers, doing so for each of the plurality of computers, and determining the computer to which the desired job is assigned based on the obtained predicted temperature changes of the computers;
a fourth step of assigning the desired job to the computer determined in the third step; and
a fifth step of acquiring power consumption information of the computer to which the job was assigned after a fixed time and, after completion of the job, adding to the job history table, with the job type as an index, the time until job completion and the power consumption value during job execution.
The job scheduling method according to claim 4, wherein an environmental information storage device is provided that stores environmental information for each computer based on size information of each of the plurality of computers, computer installation position information, and cooling capacity information of the installation location, and each of the plurality of computers further has environmental temperature measurement means that measures an environmental temperature, and
wherein the third step
determines, as the computer to which the job is assigned, the job assignment computer (the one computer to which the desired job is assigned) for which the predicted temperature of each computer, being the sum of the predicted temperature change of each of the job assignment computer and the computers surrounding it and the environmental temperature from the environmental temperature measurement means, does not exceed the maximum allowable environmental temperature set for each of the job assignment computer and its surrounding computers based on the environmental information read from the environmental information storage device, and for which the minimum difference between the predicted temperature and the maximum allowable environmental temperature is the largest among those minimum differences, and holds the job assignment pending when no computer showing the maximum difference exists.
The job scheduling method according to claim 4, wherein the third step comprises: a temperature rise estimation step of measuring, based on at least the internal temperature information and power consumption information detected for each computer and input from each of the plurality of computers, the difference in the temperature of each computer before and after the desired job is assigned to each computer, obtaining, for each computer, a correlation function from the correlation between the increase in energy consumption indicated by the power consumption information and the temperature rise indicated by the measured temperature difference, and using the correlation function to estimate the temperature rise at surrounding computers from the increase in energy consumption of each computer; and a determination step of determining the computer to which the desired job is assigned based on the estimated temperature rise.
JP2007079429A 2007-03-26 2007-03-26 Job scheduling apparatus and job scheduling method Expired - Fee Related JP5151203B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2007079429A JP5151203B2 (en) 2007-03-26 2007-03-26 Job scheduling apparatus and job scheduling method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2007079429A JP5151203B2 (en) 2007-03-26 2007-03-26 Job scheduling apparatus and job scheduling method

Publications (2)

Publication Number Publication Date
JP2008242614A JP2008242614A (en) 2008-10-09
JP5151203B2 true JP5151203B2 (en) 2013-02-27

Family

ID=39913919

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2007079429A Expired - Fee Related JP5151203B2 (en) 2007-03-26 2007-03-26 Job scheduling apparatus and job scheduling method

Country Status (1)

Country Link
JP (1) JP5151203B2 (en)

Also Published As

Publication number Publication date
JP2008242614A (en) 2008-10-09

Similar Documents

Publication Publication Date Title
JP5151203B2 (en) Job scheduling apparatus and job scheduling method
JP4895266B2 (en) Management system, management program, and management method
JP5207193B2 (en) Method and apparatus for dynamically allocating power in a data center
US9176483B2 (en) Unified and flexible control of multiple data center cooling mechanisms
US7877751B2 (en) Maintaining level heat emission in multiprocessor by rectifying dispatch table assigned with static tasks scheduling using assigned task parameters
Kaushik et al. T*: A data-centric cooling energy costs reduction approach for Big Data analytics cloud
US20120005505A1 (en) Determining Status Assignments That Optimize Entity Utilization And Resource Power Consumption
CA2723908A1 (en) Methods to optimally allocate the computing server load based on the suitability of environmental conditions
JP5891680B2 (en) Power control apparatus, power control method, and power control program
TWI533146B (en) Virtual resource adjusting method, device and computer readable storage medium for storing thereof
JP5098978B2 (en) Power consumption reduction support program, information processing apparatus, and power consumption reduction support method
US10095204B2 (en) Method, medium, and system
JP4930909B2 (en) Computer environment optimization system, computer environment optimization method, and computer environment optimization program
JP2017041191A (en) Resource management apparatus, resource management program, and resource management method
KR20190042465A (en) Apparatus for managing disaggregated memory and method for the same
JP5853109B2 (en) Computer, computer system controller and recording medium
JP2010072733A (en) Server management device, server management method and program
EP2277094A1 (en) Arrangement for operating a data center using building automation system interface
JP2016115213A (en) Information processor, data processing method and data processing program
JP4749380B2 (en) Installation management method, installation management program, and installation management apparatus
JP6960491B2 (en) Management system and infrastructure system management method
JP2018013971A (en) Management device, information processing method, and program
JP2022121124A (en) Job assignment control device, job assignment control method, and job assignment control program
JP5648397B2 (en) COMPUTER PROCESSING SYSTEM, JOB DISTRIBUTION AND DISTRIBUTION METHOD, AND JOB DISTRIBUTION AND DISTRIBUTION PROGRAM
KR101212407B1 (en) System and method for ontology reasoning based task distribution scheduling for distributed parallel biometric authentication system

Legal Events

Date Code Title Description
A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20100302

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20111128

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20111206

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20120203

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20120605

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20120803

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20121106

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20121119

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20151214

Year of fee payment: 3

R150 Certificate of patent or registration of utility model

Ref document number: 5151203

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150

LAPS Cancellation because of no payment of annual fees