JPH1185707A

JPH1185707A - Selection method/device for job input computer for parallel computer

Info

Publication number: JPH1185707A
Application number: JP23917897A
Authority: JP
Inventors: Kazuhiko Watanabe; 和彦渡辺
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1997-09-04
Filing date: 1997-09-04
Publication date: 1999-03-30

Abstract

PROBLEM TO BE SOLVED: To take the reliability of each node into account when scheduling a job, at the time the node to which the job is submitted is selected from among the plural nodes (computers) constituting a parallel computer. SOLUTION: A reliability data registration part 15 collects reliability data of the respective nodes, and a node rank registration part 11 classifies the nodes into reliability ranks according to their degree of reliability and registers the ranks in a node rank table 12. A job rank decision part 10 decides the rank of a job from the priority of the submitted job and the job ranks registered in a job rank table 16. A node selection part 13 refers to the node rank table 12 and selects a node having the same reliability rank as the priority rank of the job as the computer to which the job is submitted. A job supply part 14 submits the job to the selected node.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a method for selecting the computer to which a job is submitted in a parallel computer, and more particularly to a method for determining the job submission computer based on the reliability of the computers.

[0002]

[Prior art] As a conventional scheduling method for selecting the computer to which a job is submitted from among the plural computers constituting a parallel computer, a method of selecting the computer with the smallest load is known, as described in, for example, Japanese Unexamined Patent Publication No. H5-120243.

[0003]

[Problem to be solved by the invention] A parallel computer system consists of a plurality of computers called nodes; each node has a processor, and processing is performed in parallel. The capability of a parallel computer can be raised by increasing the number of nodes, and because nodes are relatively easy to add, nodes are added as processing needs grow. As a result, nodes (and in particular processors) manufactured long ago may coexist with newer ones. Nodes manufactured at different times differ in reliability. Node reliability therefore needs to be taken into account when scheduling job submission.

[0004] An object of the present invention is to provide a job scheduling method that takes node reliability into account.

[0005]

[Means for solving the problem] The present invention is characterized by a method for selecting a job submission computer in which reliability data is collected for each of the computers constituting a parallel computer and each computer is classified into one of a plurality of reliability ranks according to its degree of reliability; information about a job is input and the job is classified into one of a set of priority ranks, equal in number to the reliability ranks, according to its degree of priority; and a computer whose reliability rank is the same as the priority rank of the job is selected as the computer to which the job is submitted.

[0006]

[Embodiments of the invention] One embodiment of the present invention is described below with reference to the drawings.

[0007] FIG. 1 is a configuration diagram of the parallel computer system of this embodiment. The system consists of node 1, a computer that controls job submission, and nodes 31, 32, ... 33, a plurality of computers that are connected to node 1 via a transmission line 20 and execute jobs. Here, a node is a computer that includes a processor together with the input/output devices, storage devices, communication control devices, and the like connected to the processor. Each of nodes 31, 32, ... 33 is a computer constituting the parallel computer. Node 1 may be one of the computers constituting the parallel computer, or it may be a separate, independent computer. The input device 2 is connected to node 1 and is, for example, an external storage device that stores information about jobs (JCL). The storage device of node 1 stores a node rank table 12 and a job rank table 16. The node rank table 12 ranks each of nodes 31, 32, ... 33 from the viewpoint of reliability. The job rank table 16 ranks jobs from the viewpoint of priority. The main storage device of node 1 stores the programs of a node rank registration unit 11, a reliability data registration unit 15, a job rank determination unit 10, a node selection unit 13, and a job submission unit 14, and these programs are executed there. They are generally included in a job management program that is part of the operating system (OS). The reliability data registration unit 15 periodically collects reliability data from each of nodes 31, 32, ... 33, calculates the MTBF (mean time between failures as defined in JIS X0014) or the failure rate of each node from the reliability data, and registers it in the node rank table 12. The node rank registration unit 11 is started by the reliability data registration unit 15, reads the MTBF or failure rate of each node from the node rank table 12, determines the rank of the node, and registers the rank in the node rank table 12. The job rank determination unit 10 reads job information from the input device 2 and determines the rank of the job from the specified priority. If the job information specifies no priority and a job rank is registered in the job rank table 16, the registered rank is adopted. The node selection unit 13 refers to the node rank table 12 and selects a node whose reliability rank equals the determined priority rank of the job. The job submission unit 14 submits the job to the selected node. Each of nodes 31, 32, ... 33 has an OS, whose job management program receives the JCL of the submitted job and starts executing the job. In the following description, for simplicity, all of nodes 31, 32, ... 33 are assumed to have the same processor performance. Each node is also assumed to have the resources needed to execute jobs, such as programs, storage devices, and input/output devices. The above programs in node 1 can also be stored on a storage medium, read into the main storage device of node 1 via a drive connected to node 1, and executed.

[0008] FIG. 2 shows the data structure of the node rank table 12 and the job rank table 16. For each node, the node rank table 12 stores the node identifier, the MTBF, the reliability rank, and a busy flag. The MTBF is the latest MTBF value of the node (or of the processor constituting the node). The rank is the rank of the node determined from the MTBF and is one of A to C. A node that cannot be used is ranked D. The failure rate of the node may be used as the reliability index instead of the MTBF, in which case the rank is the reliability rank determined from the failure rate. The busy flag indicates whether the node is in use.

[0009] For each job, the job rank table 16 stores the job name and the corresponding priority rank. The rank is one of A to C. Instead of registering the rank of a job, only a job name, user name, or program name may be registered.
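The two tables described above can be pictured with a minimal Python sketch. The field names, the choice of hours as the MTBF unit, and all sample values are hypothetical and serve only as illustration; the description fixes only which items each table holds.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class NodeEntry:
    """One row of the node rank table 12 (field names are assumptions)."""
    node_id: str
    mtbf_hours: Optional[float] = None  # latest MTBF; None until it is computed
    rank: str = "D"                     # "A" to "C", or "D" for an unusable node
    busy: bool = False                  # busy flag: True while a job runs here


@dataclass
class JobEntry:
    """One row of the job rank table 16 (field names are assumptions)."""
    job_name: str
    rank: str                           # priority rank, "A" to "C"


# Illustrative contents only; none of these values come from the description.
node_rank_table: Dict[str, NodeEntry] = {
    "node31": NodeEntry("node31", mtbf_hours=9000.0, rank="A"),
    "node32": NodeEntry("node32", mtbf_hours=5000.0, rank="B"),
    "node33": NodeEntry("node33"),      # no data collected yet, so rank D
}

job_rank_table: Dict[str, JobEntry] = {
    "PAYROLL": JobEntry("PAYROLL", rank="A"),
}
```

Keying both tables by identifier keeps the lookups in the later steps (rank registration, job rank determination, node selection) to a single dictionary access.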

[0010] FIG. 3 is a flowchart showing the processing flow of the reliability data registration unit 15 and the node rank registration unit 11. Both units are started and executed periodically. The reliability data registration unit 15 queries nodes 31, 32, ... 33 and collects the operating time data of each node (step 41). An operating time is a period during which the processor constituting the node ran continuously without failure, so each node has as many continuous operating times as it has had failures. A node from which the reliability data registration unit 15 cannot collect operating times is regarded as unusable. Alternatively, the usability of the node may be checked by a separate query. Next, for each node, the MTBF is calculated from at least one continuous operating time (step 42) and stored in the entry for that node in the node rank table 12 (step 43). If each node records its accumulated operating time and number of failures instead of continuous operating times, these data are collected, and the failure rate (the probability of failure per unit time) is calculated instead of the MTBF and stored in the node rank table 12. For a node with zero failures, for example, the average MTBF or failure rate can be assumed.
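The calculations in steps 41 to 43 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the function names are hypothetical, the MTBF is taken as the arithmetic mean of the recorded failure-free run lengths, and the failure rate is estimated as the failure count divided by the accumulated operating time.

```python
from statistics import mean
from typing import Dict, Optional, Sequence


def mtbf_from_runs(run_hours: Sequence[float]) -> Optional[float]:
    """MTBF as the mean of the continuous failure-free operating times."""
    if not run_hours:
        return None  # nothing recorded, e.g. a node that has never failed
    return mean(run_hours)


def failure_rate(total_hours: float, failures: int) -> Optional[float]:
    """Failures per unit of operating time, or None if it cannot be estimated."""
    if failures == 0 or total_hours <= 0:
        return None
    return failures / total_hours


def fill_missing(values: Dict[str, Optional[float]]) -> Dict[str, Optional[float]]:
    """Give nodes without an estimate the average of the known values."""
    known = [v for v in values.values() if v is not None]
    default = mean(known) if known else None
    return {node: (default if v is None else v) for node, v in values.items()}
```

For example, `mtbf_from_runs([100.0, 200.0, 300.0])` yields `200.0`, and `fill_missing` is one way to apply the suggestion that a node with zero failures can be assigned the average value.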

[0011] Next, if the end of the node rank table 12 has not been reached (step 44 NO), the node rank registration unit 11 selects the next node (step 45) and determines, from the report of the reliability data registration unit 15, whether the node is usable (step 46). If the node is usable (step 46 YES), it is ranked as one of A to C based on its MTBF (step 47). As one ranking method, for example, assuming that the MTBFs of the nodes follow a normal distribution, the range of the mean ± variance of the MTBF is rank B, MTBFs above this range are rank A, and MTBFs below this range are rank C. For ranking by failure rate, ranks A to C can be assigned in the same way using the reciprocal of the failure rate. The determined rank is then stored in the entry for that node in the node rank table 12 (step 48). If the node is currently not usable (step 46 NO), it is given rank D, which is stored in the entry for that node in the node rank table 12 (step 48). A node whose MTBF or failure rate is worse than a certain threshold may also be ranked D. When the end of the node rank table 12 is reached and all nodes have been processed (step 44 YES), the processing ends.
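The ranking rule of step 47 can be sketched as below. The text defines the rank B band as the mean ± variance of the MTBF. Because a variance carries squared units, this sketch substitutes the population standard deviation as the default half-width of the band; that substitution is an assumption made here, and the `spread` parameter (a hypothetical name, like `rank_nodes`) lets the literal variance be passed instead, for example `statistics.pvariance`.

```python
from statistics import mean, pstdev
from typing import Callable, Dict, Optional, Sequence


def rank_nodes(
    mtbf: Dict[str, Optional[float]],
    spread: Callable[[Sequence[float]], float] = pstdev,
) -> Dict[str, str]:
    """Rank nodes A to C by MTBF; a node with no value (unusable) gets D."""
    usable = [v for v in mtbf.values() if v is not None]
    if not usable:
        return {node: "D" for node in mtbf}
    centre = mean(usable)
    half_width = spread(usable)  # assumption: standard deviation by default
    ranks: Dict[str, str] = {}
    for node, value in mtbf.items():
        if value is None:
            ranks[node] = "D"    # step 46 NO: unusable node
        elif value > centre + half_width:
            ranks[node] = "A"    # above the band
        elif value < centre - half_width:
            ranks[node] = "C"    # below the band
        else:
            ranks[node] = "B"    # inside the band
    return ranks
```

To rank by failure rate instead, the same function can be fed the reciprocal of each failure rate, as the paragraph above suggests.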

[0012] FIG. 4 is a flowchart showing the processing flow of the job rank determination unit 10. The job rank determination unit 10 reads each job, that is, the JCL (job control language) of the job, from the input device 2. First, the priority rank of the job is set to C (step 51). If the JCL of the job specifies a priority (step 52 YES), the rank is changed according to the specified priority (step 54). If the priority levels used in the JCL do not coincide with ranks A to C, the priority of the job is converted to one of ranks A to C. If the job is registered in the job rank table 16 (step 53 YES), the registered rank is used as the rank of the job (step 54). If no priority is specified but the user name or job name is registered as an important user or job (step 55 YES), rank C is raised by one to B (step 56). In addition, if the program name specified for execution in the job information is a registered program (step 57 YES), the rank of the job is raised by one (step 58). A registered program is, for example, a program that cannot be re-executed if its execution is interrupted partway through.
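One possible reading of steps 51 to 58 is sketched below. The order in which the checks combine is an interpretation: the sketch treats steps 52, 53, and 55 as alternatives and then applies the program check of step 57 in every case, capping the result at rank A. The names `JobInfo`, `decide_job_rank`, `priority_to_rank`, `important_names`, and `protected_programs` are hypothetical, as is the representation of a JCL priority as a string.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

RANKS = ["C", "B", "A"]  # lowest to highest priority rank


def raise_rank(rank: str) -> str:
    """Raise a rank by one step, never going above A."""
    return RANKS[min(RANKS.index(rank) + 1, len(RANKS) - 1)]


@dataclass
class JobInfo:
    job_name: str
    user_name: str
    program_name: str
    priority: Optional[str] = None  # None when the JCL specifies no priority


def decide_job_rank(
    job: JobInfo,
    priority_to_rank: Dict[str, str],  # converts JCL priority levels to A-C
    job_rank_table: Dict[str, str],    # job name -> registered rank
    important_names: Set[str],         # registered user names or job names
    protected_programs: Set[str],      # registered program names
) -> str:
    rank = "C"                                        # step 51
    if job.priority is not None:                      # step 52
        rank = priority_to_rank[job.priority]         # step 54
    elif job.job_name in job_rank_table:              # step 53
        rank = job_rank_table[job.job_name]           # step 54
    elif (job.user_name in important_names
          or job.job_name in important_names):        # step 55
        rank = raise_rank(rank)                       # step 56: C becomes B
    if job.program_name in protected_programs:        # step 57
        rank = raise_rank(rank)                       # step 58
    return rank
```

Under this reading, a job with no specified priority that is submitted by a registered user and runs a registered program ends up at rank A, because both raises apply.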

[0013] FIG. 5 is a flowchart showing the processing flow of the node selection unit 13 and the job submission unit 14. The node selection unit 13 refers to the node rank table 12 and searches for a node whose busy flag is off and whose rank equals the rank of the job (step 61). If there is such a node (step 62 YES), processing goes to step 64. If there is no such node (step 62 NO), a node of a higher rank is selected (step 63). The busy flag of the selected node in the node rank table 12 is then turned on (step 64). The job submission unit 14 submits the job to the selected node (step 65). If there is no suitable node of a higher rank either, submission of the job is suspended. The job is then executed on the selected node among nodes 31 to 33. When the node reports the end of the job, the node selection unit 13 turns off the busy flag of that node in the node rank table 12. In the above embodiment a node that is not in use is selected, but the number of jobs using each node may be counted instead of keeping a busy flag, so that, among the nodes whose rank equals the rank of the job, the node with the smallest job count is selected. In this case, a node of a higher rank is selected when the job counts have reached a predetermined number and no node with a minimum job count is available.
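The busy-flag variant of steps 61 to 65 can be sketched as follows. The names `Node`, `select_node`, and `release_node` are hypothetical. The text says only that a node of a higher rank is selected when no node of the matching rank is free; trying the higher ranks one at a time in ascending order is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional

RANK_ORDER = ["C", "B", "A"]  # lowest to highest; rank D nodes are never chosen


@dataclass
class Node:
    rank: str
    busy: bool = False


def select_node(table: Dict[str, Node], job_rank: str) -> Optional[str]:
    """Return the id of a free node of the job's rank, or of a higher rank.

    Returns None when no suitable node is free, in which case the caller
    holds the job back instead of submitting it.
    """
    start = RANK_ORDER.index(job_rank)
    for rank in RANK_ORDER[start:]:          # step 61, then step 63
        for node_id, node in table.items():
            if node.rank == rank and not node.busy:
                node.busy = True             # step 64: turn the busy flag on
                return node_id               # the caller submits here (step 65)
    return None


def release_node(table: Dict[str, Node], node_id: str) -> None:
    """Turn the busy flag off when the node reports that the job has ended."""
    table[node_id].busy = False
```

A job of rank B therefore goes to a free rank B node if one exists, falls back to a free rank A node otherwise, and is never placed on a rank C or rank D node.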

[0014]

[Effect of the invention] According to the present invention, the node to which a job is submitted is determined by matching the reliability rank of each node against the priority rank of the job, so the reliability of the job execution environment can be secured in accordance with the priority of the job.

[Brief description of the drawings]

FIG. 1 is a configuration diagram of the parallel computer system of the embodiment.

FIG. 2 is a diagram showing the data structure of the node rank table 12 and the job rank table 16 of the embodiment.

FIG. 3 is a flowchart showing the processing flow of the reliability data registration unit 15 and the node rank registration unit 11 of the embodiment.

FIG. 4 is a flowchart showing the processing procedure of the job rank determination unit 10 of the embodiment.

FIG. 5 is a flowchart showing the processing flow of the node selection unit 13 and the job submission unit 14 of the embodiment.

[Explanation of symbols]

1: node, 10: job rank determination unit, 11: node rank registration unit, 12: node rank table, 13: node selection unit, 14: job submission unit, 15: reliability data registration unit, 16: job rank table, 31, 32, 33: nodes

Claims

[Claims]

1. A method for selecting a job submission computer, in which a computer to which a job is submitted is selected from among a plurality of computers constituting a parallel computer, the method comprising: collecting reliability data for each of the computers constituting the parallel computer and classifying each computer into one of a plurality of reliability ranks according to its degree of reliability; inputting information about the job and classifying the job into one of a set of priority ranks, equal in number to the reliability ranks, according to the degree of priority of the job; and selecting a computer whose reliability rank is the same as the priority rank of the job as the computer to which the job is submitted.

2. The method for selecting a job submission computer according to claim 1, wherein the reliability rank is assigned based on the failure rate of each computer.

3. The method for selecting a job submission computer according to claim 1, wherein the reliability rank is assigned based on the MTBF (mean time between failures) of each computer.

4. A computer that selects a computer to which a job is submitted from among a plurality of computers constituting a parallel computer, comprising: means for collecting reliability data for each of the computers constituting the parallel computer and classifying each computer into one of a plurality of reliability ranks according to its degree of reliability; means for inputting information about the job and classifying the job into one of a set of priority ranks, equal in number to the reliability ranks, according to the degree of priority of the job; and means for selecting a computer whose reliability rank is the same as the priority rank of the job as the computer to which the job is submitted.

5. A computer program embodied on a computer-readable storage medium, which selects a computer to which a job is submitted from among a plurality of computers constituting a parallel computer, the program comprising the following steps: (a) collecting reliability data for each of the computers constituting the parallel computer and classifying each computer into one of a plurality of reliability ranks according to its degree of reliability; (b) inputting information about the job and classifying the job into one of a set of priority ranks, equal in number to the reliability ranks, according to the degree of priority of the job; and (c) selecting a computer whose reliability rank is the same as the priority rank of the job as the computer to which the job is submitted.