JPH07175766A

JPH07175766A - Job reexecution control method for loosely-coupled multiplexing system

Info

Publication number: JPH07175766A
Application number: JP5318812A
Authority: JP
Inventors: Mitsuru Ando; 充安藤; Takeshi Sasaki; 猛佐々木
Original assignee: TOHOKU NIPPON DENKI SOFTWARE KK; NEC Corp; NEC Software Tohoku Ltd
Current assignee: TOHOKU NIPPON DENKI SOFTWARE KK; NEC Corp; NEC Solution Innovators Ltd
Priority date: 1993-12-20
Filing date: 1993-12-20
Publication date: 1995-07-14

Abstract

PURPOSE: To promptly re-execute, on a designated host computer Hm, a job that was executing on a host computer Hn when a failure occurs in Hn. CONSTITUTION: In a host computer H1, a job control language translation means B2 translates a job control language B1 and registers, in a job management information holding means A4, the designation of the host computer Hm that is to re-execute the jobs of host computer Hn. A host failure recognition means A1 recognizes a failure notification from a host monitoring device G1, and a job re-execution preparation means A2 updates the job control information in the job management information holding means A4 and requests re-execution of the job that was executing on the failed host computer Hn. A job scheduling means A3 reschedules the job and requests host computer Hm to re-execute it. A job start means A5 starts the execution program A6 of the job whose execution is requested.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a job re-execution control method for a loosely coupled multiplex system, and more particularly to such a method for the case where a failure occurs in a host computer that is executing a job.

[0002]

[Prior Art] In a conventional job re-execution control method for a loosely coupled multiplex system, the job control information needed to re-execute a running job is held in the secondary storage area of the host computer that is executing the job. If that host computer fails for any reason, then after it recovers from the failure, the job control information in its secondary storage area is used to automatically re-execute the job that the failure interrupted.

[0003] One example of such a conventional job re-execution control method for a loosely coupled multiplex system is Japanese Patent Application Laid-Open No. 62-79531, "Job Step Restart Method".

[0004]

[Problems to Be Solved by the Invention] In the conventional job re-execution control method described above, when a failure occurs in the host computer executing a job, the interrupted job is re-executed only after that host computer recovers, using the job control information held in its secondary storage area. The interrupted job therefore cannot be re-executed until the host computer recovers, and after recovery an operator must issue the instruction to re-execute the job.

[0005] A further drawback of the conventional method is that, when the failure is caused by, for example, a hardware fault in the failed host computer, recovery of that host computer may be difficult and the job may not be re-executable at all.

[0006] An object of the present invention is to provide a job re-execution control method for a loosely coupled multiplex system in which, when a failure occurs in a host computer executing a job, the job can be promptly re-executed on another host computer without waiting for the failed host computer to recover.

[0007]

[Means for Solving the Problems] The job re-execution control method of the first invention applies to a loosely coupled multiplex system having a plurality of host computers and a host monitoring device that is connected to each of the host computers and, when a failure occurs in any of them, notifies the other host computers of the failure. A first host computer comprises:
(A) job management information holding means for holding job control information for jobs submitted to the host computers;
(B) job control language translation means for reading and translating a job control language and, when the job control language designates a third host computer that is to re-execute a job if a failure occurs in a second host computer, registering that designation as job control information in the job management information holding means;
(C) host failure recognition means which, when a failure occurs, is notified by the host monitoring device of the failed second host computer, recognizes it, refers to the job control information in the job management information holding means, and issues a re-execution request for the job that was executing on the second host computer;
(D) job re-execution preparation means which, on receiving the job re-execution request from the host failure recognition means, refers to the job control information in the job management information holding means and issues a request to reschedule the job onto the third host computer indicated by that job control information; and
(E) job scheduling means which schedules the jobs to be executed on the plurality of host computers on receiving job control information from the job control language translation means, and which, on receiving from the job re-execution preparation means the request to reschedule a job onto the third host computer, reschedules the job and thereby requests the third host computer to re-execute the job of the second host computer.
Each of the plurality of host computers comprises:
(F) job start means which, when the job scheduling means issues a job execution request based on the schedule, starts the execution program of the job whose execution is requested.

[0008] The job re-execution control method of the second invention is the method of the first invention in which the first host computer further comprises:
(A) a terminal for entering the designation of the third host computer that is to re-execute a job if a failure occurs in the second host computer, used when the host computer designation of claim 1 for re-executing a job on failure has not been made or when an existing designation is to be changed; and
(B) job re-execution host registration means which, when the third host computer that is to re-execute a job on failure of the second host computer is designated from the terminal, registers that designation as new job control information in the job management information holding means.

[0009] The job re-execution control method of the third invention is the method of the first invention in which the first host computer further comprises:
(A) user management information storage means for storing, for each user, the host computer designation of claim 1 for re-executing a job when a failure occurs;
(B) user management information acquisition means for obtaining from the user management information storage means, for the user of a failed job, the designation of the host computer that is to re-execute the job; and
(C) the host failure recognition means of claim 1, which, when a failure occurs in the second host computer and no host computer for re-executing the job is designated in the job management information holding means of claim 1, passes the user of the failed job to the user management information acquisition means to obtain the host computer designation for that user, registers that designation as new job control information in the job management information holding means, and issues a re-execution request for the failed job.

[0010] The job re-execution control method of the fourth invention is the method of the first invention in which the first host computer further comprises:
(A) system definition information holding means for holding, for each host computer, the host computer designation of claim 1 for re-executing a job when a failure occurs;
(B) host information acquisition means for obtaining from the system definition information holding means, for the failed second host computer, the designation of the host computer that is to re-execute the job; and
(C) the host failure recognition means of claim 1, which, when a failure occurs in the second host computer and no host computer for re-executing the job is designated in the job management information holding means of claim 1, passes the identity of the second host computer to the host information acquisition means to obtain the designation of the third host computer for the second host computer, registers that designation as new job control information in the job management information holding means, and issues a re-execution request for the failed job.

[0011] Further, the job re-execution control method of the fifth invention is the method of the first invention in which the first host computer further comprises:
(A) system definition information holding means for holding, for each job class, the host computer designation of claim 1 for re-executing a job when a failure occurs;
(B) job class information acquisition means for obtaining from the system definition information holding means, for the job class of a failed job, the designation of the host computer that is to re-execute the job; and
(C) the host failure recognition means of claim 1, which, when a failure occurs in the second host computer and no host computer for re-executing the job is designated in the job management information holding means of claim 1, passes the job class of the failed job to the job class information acquisition means to obtain the designation of the third host computer for that job class, registers that designation as new job control information in the job management information holding means, and issues a re-execution request for the failed job.

[0012]

[Embodiments] Embodiments of the present invention will now be described with reference to the drawings. FIG. 1 is a block diagram showing a first embodiment of the job re-execution control method for a loosely coupled multiplex system according to the present invention. As shown in FIG. 1, the first embodiment comprises a plurality of host computers H1, ... Hm, ... Hn, ... and a host monitoring device G1 connected to each of the host computers H1, ... Hm, ... Hn, ....

[0013] In the following description, host computer H1 is the host that schedules jobs, host computer Hn is the host on which a failure occurs, and host computer Hm is the host that re-executes the job.

[0014] The job control language B1 designates the host computer Hm that is to re-execute a job when a failure occurs in host computer Hn, and has been submitted to host computer H1 in advance.

[0015] Host computer H1 comprises host failure recognition means A1, job re-execution preparation means A2, job scheduling means A3, job management information holding means A4, and job control language translation means B2.

[0016] In addition, each of the host computers H1, ... Hm, ... Hn, ... is provided with job start means A5. On instruction from the job scheduling means A3, the job start means A5 starts the execution program A6 of the designated job.

[0017] FIG. 2 is a flowchart showing an example of the operation during job submission processing in the first embodiment. FIG. 3 is a flowchart showing an example of the operation when a host computer failure occurs in the first embodiment.

[0018] When jobs are submitted in the first embodiment, the job control language translation means B2 reads and translates the job control language B1 and other inputs in sequence, determines for each job its submitter, its execution class, and the host computer H1, ... Hm, ... Hn, ... on which it is to run, registers this information in the job management information holding means A4, and then notifies the job scheduling means A3.

[0019] On receiving the notification from the job control language translation means B2, the job scheduling means A3 searches the job management information holding means A4 and schedules the jobs to be executed on the host computers H1, ... Hm, ... Hn, .... If there is an executable job, it issues a start request to whichever of the host computers H1, ... Hm, ... Hn, ... is to run that job. The job start means A5 that receives the start request then starts the execution program A6 of the designated job as instructed by the job scheduling means A3.

[0020] As shown in FIG. 2, if it is found in step 201 that the job control language B1 designates the host computer Hm that is to re-execute the running job when a failure occurs in host computer Hn, that designation is registered in step 202 as job control information in the job management information holding means A4. In step 203, the job scheduling means A3 receives the job control information from the job control language translation means B2 and schedules each job to be executed on each of the host computers H1, ... Hm, ... Hn, ....
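
The submission flow of steps 201 to 203 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the source does not give the job control language syntax, so the KEY=VALUE format, the `REEXEC_HOST` keyword, the `JobControlInfo` record, and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JobControlInfo:
    """One record in the job management information holding means (A4)."""
    job_id: str
    user: str
    job_class: str
    exec_host: str
    reexec_host: Optional[str] = None  # host designated for re-execution


def translate_jcl(jcl_text: str) -> JobControlInfo:
    """Stand-in for the job control language translation means (B2).

    Parses a hypothetical KEY=VALUE job control language, one item per line.
    """
    fields = {}
    for line in jcl_text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            fields[key.strip()] = value.strip()
    return JobControlInfo(
        job_id=fields["JOB"],
        user=fields["USER"],
        job_class=fields["CLASS"],
        exec_host=fields["HOST"],
        # Steps 201-202: record the designation only when the JCL gives one.
        reexec_host=fields.get("REEXEC_HOST"),
    )


def submit_job(jcl_text: str, job_table: dict, schedule_queue: list) -> JobControlInfo:
    info = translate_jcl(jcl_text)
    job_table[info.job_id] = info        # register in A4
    schedule_queue.append(info.job_id)   # step 203: hand over to the scheduler (A3)
    return info


job_table, schedule_queue = {}, []
job = submit_job("JOB=J1\nUSER=U1\nCLASS=A\nHOST=Hn\nREEXEC_HOST=Hm",
                 job_table, schedule_queue)
```

Keeping `reexec_host` optional mirrors the conditional in step 201: a job whose control language names no re-execution host is still registered and scheduled, only without a designation.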

[0021] When a failure occurs in host computer Hn, the host failure recognition means A1 is notified of it by the host monitoring device G1 in step 301. In step 302, it examines the job control information registered in the job management information holding means A4. If a job was executing on the failed host computer Hn, it further checks in step 303 whether a re-execution host computer Hm is designated; on detecting that one is, it requests the job re-execution preparation means A2 to perform the processing needed to re-execute the job, and the flow proceeds to step 304.
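
The checks in steps 301 to 303 amount to a filter over the registered jobs. The following Python sketch shows one way to express it; the `Job` record, the state names, and `recognize_host_failure` are hypothetical names introduced for illustration, and the handling of jobs without a designation (simply skipped here) is an assumption, since the first embodiment does not describe that case.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Job:
    job_id: str
    exec_host: str
    state: str                         # assumed states: "queued" or "running"
    reexec_host: Optional[str] = None  # designation registered at submission


def recognize_host_failure(failed_host: str, job_table: Dict[str, Job]) -> List[Job]:
    """Stand-in for the host failure recognition means (A1).

    Called when the monitoring device reports failed_host (step 301).
    Returns the jobs to pass to the re-execution preparation means (A2).
    """
    requests = []
    for job in job_table.values():
        # Step 302: only jobs that were executing on the failed host matter.
        if job.exec_host != failed_host or job.state != "running":
            continue
        # Step 303: a re-execution host must be designated.
        if job.reexec_host is not None:
            requests.append(job)
    return requests


table = {
    "J1": Job("J1", "Hn", "running", "Hm"),
    "J2": Job("J2", "Hn", "running"),        # no designation
    "J3": Job("J3", "Hn", "queued", "Hm"),   # not yet executing
    "J4": Job("J4", "H1", "running", "Hm"),  # on a healthy host
}
to_reexecute = recognize_host_failure("Hn", table)
```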

[0022] To prepare the job for re-execution, the job re-execution preparation means A2 releases, in step 304, the resources that were reserved for the failed job; restores, in step 305, the contents of the job management information holding means A4 to the state before the job began executing; and, in step 306, changes the job's execution host to the re-execution host computer Hm and requests the job scheduling means A3 to re-execute the job.
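
Steps 304 to 306 can be sketched as three updates to the job's record followed by a request to the scheduler. This Python sketch makes several assumptions: resources are modeled as a plain list of names, "the state before execution started" is modeled as a `"queued"` state, and the request to the scheduling means is modeled as appending to a list. None of these details come from the source.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    job_id: str
    exec_host: str
    reexec_host: str
    state: str = "running"
    resources: List[str] = field(default_factory=list)


def prepare_reexecution(job: Job, reschedule_requests: List[str]) -> None:
    """Stand-in for the job re-execution preparation means (A2)."""
    job.resources.clear()                    # step 304: release reserved resources
    job.state = "queued"                     # step 305: back to the pre-execution state
    job.exec_host = job.reexec_host          # step 306: switch the execution host
    reschedule_requests.append(job.job_id)   # step 306: ask the scheduler (A3)


job = Job("J1", exec_host="Hn", reexec_host="Hm", resources=["work-file", "device"])
requests: List[str] = []
prepare_reexecution(job, requests)
```

Resetting the record before issuing the request means the scheduler can treat the rescheduled job like any newly submitted one.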

[0023] In step 307, the job scheduling means A3 receives from the job re-execution preparation means A2 the request to reschedule the job onto host computer Hm and performs rescheduling based on the contents of the job management information holding means A4. In step 308, it requests host computer Hm to re-execute the job that host computer Hn had been executing, and the job start means A5 of host computer Hm starts the execution program A6 of the job for which re-execution was requested.
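
The dispatch in steps 307 and 308 can be sketched as follows. This is an illustrative Python sketch under stated assumptions: `JobStarter` stands in for the job start means A5 on each host and only records which jobs it was asked to start, and the request queue and dictionary-based job table are hypothetical representations.

```python
from collections import deque


class JobStarter:
    """Stand-in for the job start means (A5) present on every host."""

    def __init__(self, host):
        self.host = host
        self.started = []

    def start(self, job_id):
        # A real system would launch the job's execution program (A6) here.
        self.started.append(job_id)


def reschedule(requests, job_table, starters):
    """Stand-in for the rescheduling path of the job scheduling means (A3)."""
    while requests:
        job_id = requests.popleft()
        job = job_table[job_id]                    # step 307: consult A4
        starters[job["exec_host"]].start(job_id)   # step 308: request re-execution
        job["state"] = "running"


starters = {h: JobStarter(h) for h in ("H1", "Hm", "Hn")}
job_table = {"J1": {"exec_host": "Hm", "state": "queued"}}
requests = deque(["J1"])
reschedule(requests, job_table, starters)
```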

[0024] FIG. 4 is a block diagram showing a second embodiment of the job re-execution control method for a loosely coupled multiplex system according to the present invention. As shown in FIG. 4, the second embodiment comprises a plurality of host computers H11, ... H1m, ... H1n, ... and a host monitoring device G11 that is connected to the host computers H11, ... H1m, ... H1n, ... and, when a failure occurs in any of them, gives notification of the failure.

[0025] The job control language translation means B12 of host computer H11 reads and translates the job control language B11 and other inputs and registers the result in the job management information holding means A14, so the job management information holding means A14 holds the job control information for the jobs submitted to the host computers H11, ... H1m, ... H1n, .... When the job control language B11 designates the host computer H1m that is to re-execute a job if a failure occurs in host computer H1n, the job control language translation means B12 registers that designation as job control information in the job management information holding means A14.

[0026] The terminal C11 of host computer H11 is used to enter the designation of the host computer H1m that is to re-execute a job if a failure occurs in host computer H1n, either when no such designation has been made in the job management information holding means A14 or when an existing designation is to be changed.

[0027] When the host computer H1m that is to re-execute a job on failure of host computer H1n is designated from the terminal C11 of host computer H11, the job re-execution host registration means C12 registers that designation as job control information in the job management information holding means A14.
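
The terminal path of the second embodiment covers two cases: adding a designation that was never made, and changing one that exists. A minimal Python sketch, assuming a dictionary-based job table and a hypothetical `register_reexec_host` function standing in for the registration means C12:

```python
def register_reexec_host(job_table, job_id, reexec_host):
    """Stand-in for the job re-execution host registration means (C12).

    Stores the designation entered at the terminal (C11) and returns the
    previous designation, or None if there was none.
    """
    entry = job_table[job_id]
    previous = entry.get("reexec_host")
    entry["reexec_host"] = reexec_host
    return previous


job_table = {"J1": {"exec_host": "H1n", "reexec_host": None}}

# Case 1: no designation was made in the job control language.
first = register_reexec_host(job_table, "J1", "H1m")

# Case 2: the operator changes the existing designation.
second = register_reexec_host(job_table, "J1", "H11")
```

Returning the previous value is a convenience of this sketch, not something the source describes.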

[0028] When a failure occurs, the host failure recognition means A11 of host computer H11 is notified by the host monitoring device G11 of the failed host computer H1n, recognizes it, refers to the job control information in the job management information holding means A14, and issues a re-execution request for the job that was executing on host computer H1n.

[0029] On receiving the job re-execution request from the host failure recognition means A11, the job re-execution preparation means A12 of host computer H11 refers to the job control information in the job management information holding means A14 and issues a request to reschedule the job onto the host computer H1m indicated by that job control information.

[0030] The job scheduling means A13 of host computer H11 retrieves the job control information from the job control language translation means B12 and schedules the jobs to be executed on the host computers H11, ... H1m, ... H1n, .... When it receives from the job re-execution preparation means A12 a request to reschedule a job onto host computer H1m, it reschedules the job and thereby requests host computer H1m to re-execute the job of host computer H1n.

[0031] When the job scheduling means A13 issues a job execution request based on the schedule, the job start means A15 of host computer H1m starts the execution program A16 of the job whose execution is requested.

[0032] FIG. 5 is a block diagram showing a third embodiment of the job re-execution control method for a loosely coupled multiplex system according to the present invention. As shown in FIG. 5, the third embodiment comprises a plurality of host computers H21, ... H2m, ... H2n, ... and a host monitoring device G21 that is connected to the host computers H21, ... H2m, ... H2n, ... and, when a failure occurs in any of them, gives notification of the failure.

[0033] The job control language translation means B22 of host computer H21 reads and translates the job control language B21 and other inputs and registers the result in the job management information holding means A24, so the job management information holding means A24 holds the job control information for the jobs submitted to the host computers H21, ... H2m, ... H2n, .... When the job control language B21 designates the host computer H2m that is to re-execute a job if a failure occurs in host computer H2n, the job control language translation means B22 registers that designation as job control information in the job management information holding means A24.

[0034] The user management information storage means D21 of host computer H21 stores, for each user, the designation of the host computer H21, ... H2m, ... H2n, ... that is to re-execute that user's jobs when a failure occurs. The user management information acquisition means D22 can obtain from the user management information storage means D21, for the user of a failed job, the designation of the host computer H21, ... H2m, ... H2n, ... that is to re-execute the job.

[0035] When a failure occurs, the host failure recognition means A21 of host computer H21 is notified by the host monitoring device G21 of the failed host computer H2n, recognizes it, and refers to the job control information in the job management information holding means A24. If no host computer H21, ... H2m, ... H2n, ... for re-executing the job is designated in the job management information holding means A24, it passes the user of the failed job to the user management information acquisition means D22 to obtain the designation of the host computer H2m that is to re-execute jobs for that user, registers that designation as new job control information in the job management information holding means A24, and issues a re-execution request for the failed job that was executing on host computer H2n.
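
The fallback described for the third embodiment (use the job's own designation if present, otherwise look one up by user and register it) can be sketched in Python as follows. The dictionary layouts and the function name `resolve_reexec_host` are hypothetical, and returning `None` when the user has no entry either is an assumption, since the source does not describe that case.

```python
def resolve_reexec_host(job, user_table):
    """Sketch of the lookup done by the host failure recognition means (A21).

    job        -- one job control information record from A24
    user_table -- per-user designations, standing in for D21 read through D22
    Returns the re-execution host, or None if none can be determined.
    """
    if job.get("reexec_host") is None:
        host = user_table.get(job["user"])   # D22 obtains the designation from D21
        if host is None:
            return None                      # assumed behavior, not in the source
        job["reexec_host"] = host            # register as new job control information
    return job["reexec_host"]


user_table = {"U1": "H2m"}

undesignated = {"job_id": "J1", "user": "U1", "exec_host": "H2n", "reexec_host": None}
designated = {"job_id": "J2", "user": "U1", "exec_host": "H2n", "reexec_host": "H21"}
unknown_user = {"job_id": "J3", "user": "U9", "exec_host": "H2n", "reexec_host": None}

host_1 = resolve_reexec_host(undesignated, user_table)
host_2 = resolve_reexec_host(designated, user_table)
host_3 = resolve_reexec_host(unknown_user, user_table)
```

Because the per-user value is written back into the job record, the preparation and scheduling steps that follow can read the designation from the job control information exactly as in the first embodiment.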

[0036] On receiving the job re-execution request from the host failure recognition means A21, the job re-execution preparation means A22 of host computer H21 refers to the job control information in the job management information holding means A24 and issues a request to reschedule the job onto the host computer H2m indicated by that job control information.

[0037] The job scheduling means A23 of host computer H21 retrieves the job control information from the job control language translation means B22 and schedules the jobs to be executed on the host computers H21, ... H2m, ... H2n, .... When it receives from the job re-execution preparation means A22 a request to reschedule a job onto host computer H2m, it reschedules the job and thereby requests host computer H2m to re-execute the job of host computer H2n.

[0038] When the job scheduling means A23 issues a job execution request based on the schedule, the job start means A25 of host computer H2m starts the execution program A26 of the job whose execution is requested.

[0039] FIG. 6 is a block diagram showing a fourth embodiment of the job re-execution control method for a loosely coupled multiplex system according to the present invention. As shown in FIG. 6, the fourth embodiment comprises a plurality of host computers H31, ... H3m, ... H3n, ... and a host monitoring device G31 that is connected to the host computers H31, ... H3m, ... H3n, ... and, when a failure occurs in any of them, gives notification of the failure.

[0040] The job control language translation means B32 of host computer H31 reads and translates the job control language B31 and other inputs and registers the result in the job management information holding means A34, so the job management information holding means A34 holds the job control information for the jobs submitted to the host computers H31, ... H3m, ... H3n, .... When the job control language B31 designates the host computer H3m that is to re-execute a job if a failure occurs in host computer H3n, the job control language translation means B32 registers that designation as job control information in the job management information holding means A34.

[0041] Meanwhile, the system definition information holding means F31 of the host computer H31 holds, for each host computer and for each job class, the designation of the host computer that is to re-execute jobs when a failure occurs. The host information acquisition means E31 acquires from the system definition information holding means F31 the designation of the host computer H3m that should re-execute jobs for the failed host computer H3n. The job class information acquisition means E32 acquires from the system definition information holding means F31 the designation of the host computer H3m that is to re-execute jobs for the job class of the failed job.
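As a minimal sketch of the two lookups described in this paragraph, the system definition information could be modelled as two tables, one keyed by failed host and one keyed by job class. The class name `SystemDefinitionStore`, its method names, and the job class label `"BATCH"` are hypothetical and do not appear in the specification; the in-memory dictionaries are an assumption made only for illustration.

```python
from typing import Dict, Optional


class SystemDefinitionStore:
    """Hypothetical stand-in for the system definition information holding means F31."""

    def __init__(self, by_host: Dict[str, str], by_job_class: Dict[str, str]) -> None:
        # failed host -> host designated to re-execute its jobs
        self._by_host = dict(by_host)
        # job class -> host designated to re-execute jobs of that class
        self._by_job_class = dict(by_job_class)

    def host_for_failed_host(self, failed_host: str) -> Optional[str]:
        """Role of the host information acquisition means E31."""
        return self._by_host.get(failed_host)

    def host_for_job_class(self, job_class: str) -> Optional[str]:
        """Role of the job class information acquisition means E32."""
        return self._by_job_class.get(job_class)


# Example contents (assumed): H3n fails over to H3m; class "BATCH" also goes to H3m.
store = SystemDefinitionStore(by_host={"H3n": "H3m"}, by_job_class={"BATCH": "H3m"})
```

Both lookups return `None` when no designation is held, which leaves the caller free to decide what to do in that case.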

[0042] When a failure occurs, the host failure recognition means A31 of the host computer H31 is notified of the failed host computer H3n by the host monitoring device G31 and recognizes it, and then refers to the job control information in the job management information holding means A34. If no host computer for re-executing the job is designated in the job management information holding means A34, it either passes the failed host computer H3n to the host information acquisition means E31 and acquires the designation of the host computer H3m corresponding to the host computer H3n, or passes the job class of the failed job to the job class information acquisition means E32 and acquires the designation of the host computer H3m for that job class. It registers the acquired designation in the job management information holding means A34 as new job control information and issues a re-execution request for the failed job that was running on the host computer H3n.
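The failure handling in this paragraph can be sketched as follows. The `Job` record and the function `on_host_failure` are hypothetical names, not from the specification. The sketch tries the per-host table before the per-job-class table; the specification says only that one or the other is used, so that order is an assumption. Skipping a job for which no designation can be found is also an assumption, since the source does not cover that case.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Job:
    name: str
    job_class: str
    running_on: str
    reexec_host: Optional[str] = None  # designation already held as job control information


def on_host_failure(
    failed_host: str,
    jobs: List[Job],
    host_table: Dict[str, str],
    class_table: Dict[str, str],
) -> List[Tuple[str, str]]:
    """Return (job name, designated host) re-execution requests for jobs on the failed host."""
    requests = []
    for job in jobs:
        if job.running_on != failed_host:
            continue
        if job.reexec_host is None:
            # No designation registered: fall back to the system definition tables.
            # Per-host first, then per-job-class (assumed order).
            designation = host_table.get(failed_host) or class_table.get(job.job_class)
            if designation is None:
                continue  # assumed behaviour: nothing to request
            job.reexec_host = designation  # register as new job control information
        requests.append((job.name, job.reexec_host))
    return requests


jobs = [
    Job("J1", "BATCH", "H3n"),
    Job("J2", "TSS", "H3n", reexec_host="H31"),
    Job("J3", "BATCH", "H3m"),
]
requests = on_host_failure("H3n", jobs, {"H3n": "H3m"}, {"BATCH": "H31"})
```

A designation that is already registered, as for `J2`, is left untouched; the tables are consulted only when none exists.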

[0043] The job re-execution preparation means A32 of the host computer H31 receives the job re-execution request from the host failure recognition means A31, refers to the job control information in the job management information holding means A34, and issues a job rescheduling request for the host computer H3m designated by that job control information.

[0044] Meanwhile, the job scheduling means A33 of the host computer H31 retrieves the job control information from the job control language translation means B32 and schedules the jobs to be executed by the host computers H31, ... H3m, ... H3n, .... When it receives a job rescheduling request for the host computer H3m from the job re-execution preparation means A32, it reschedules the job and thereby requests the host computer H3m to re-execute the job of the host computer H3n.

[0045] Then, when there is a job execution request based on the schedule from the job scheduling means A33, the job starting means A35 of the host computer H3m starts the execution program A36 of the job whose execution was requested.
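The rescheduling and start-up steps of the preceding paragraphs might be sketched as below. The per-host queue representation, the class `JobScheduler`, and the function `start_jobs` are assumptions made for illustration; the specification does not say how the schedule is stored or how execution requests are delivered.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List


class JobScheduler:
    """Hypothetical stand-in for the job scheduling means (A33)."""

    def __init__(self) -> None:
        # host name -> jobs waiting to be executed on that host
        self.queues: Dict[str, Deque[str]] = defaultdict(deque)

    def schedule(self, job: str, host: str) -> None:
        self.queues[host].append(job)

    def reschedule(self, job: str, failed_host: str, target_host: str) -> None:
        """Move a job from the failed host to the designated re-execution host."""
        if job in self.queues[failed_host]:
            self.queues[failed_host].remove(job)
        self.queues[target_host].append(job)


def start_jobs(scheduler: JobScheduler, host: str) -> List[str]:
    """Role of the job starting means (A35): start every job requested for this host."""
    started = []
    while scheduler.queues[host]:
        started.append(scheduler.queues[host].popleft())
    return started


scheduler = JobScheduler()
scheduler.schedule("J1", "H3n")
scheduler.reschedule("J1", "H3n", "H3m")  # H3n has failed; H3m is the designated host
started_on_h3m = start_jobs(scheduler, "H3m")
```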

[0046] In the embodiments described above, the host computer that re-executes a job when a host computer fails is determined in various ways. By combining these methods and holding their order of priority in the job management information holding means or the system definition information holding means, the host computer that should perform the re-execution can be determined appropriately.
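One way to picture the combination described in this paragraph is a list of lookup methods tried in a stored priority order. The function `resolve_reexec_host`, the method names, the dictionary-based job record, and all table contents below are hypothetical; the specification states only that a priority order among the methods is held, not how it is represented.

```python
from typing import Callable, Dict, List, Optional

Resolver = Callable[[Dict[str, str]], Optional[str]]


def resolve_reexec_host(
    job: Dict[str, str],
    resolvers: Dict[str, Resolver],
    priority: List[str],
) -> Optional[str]:
    """Try each designation method in priority order; return the first host found."""
    for method in priority:
        host = resolvers[method](job)
        if host is not None:
            return host
    return None


# One resolver per designation method used in the embodiments (table contents assumed).
resolvers: Dict[str, Resolver] = {
    "job_control_language": lambda job: job.get("jcl_host"),
    "user": lambda job: {"user_a": "H2m"}.get(job["user"]),
    "failed_host": lambda job: {"H3n": "H3m"}.get(job["running_on"]),
    "job_class": lambda job: {"BATCH": "H31"}.get(job["job_class"]),
}
priority = ["job_control_language", "user", "failed_host", "job_class"]

job = {"name": "J1", "user": "user_b", "job_class": "BATCH", "running_on": "H3n"}
chosen = resolve_reexec_host(job, resolvers, priority)
```

Changing only the `priority` list changes which designation wins, without touching the individual lookup methods.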

[0047]

[Effects of the Invention] As described above, the job re-execution control method for a loosely coupled multiplex system according to the present invention has the effect that, when a failure occurs in a host computer that is executing a job, the job can be promptly re-executed by another host computer without waiting for the failed host computer to recover.

[Brief description of drawings]

FIG. 1 is a block diagram showing a first embodiment of the job re-execution control method for a loosely coupled multiplex system according to the present invention.

FIG. 2 is a flowchart showing an example of the operation during job submission processing in the first embodiment.

FIG. 3 is a flowchart showing an example of the operation when a failure occurs in a host computer in the first embodiment.

FIG. 4 is a block diagram showing a second embodiment of the job re-execution control method for a loosely coupled multiplex system according to the present invention.

FIG. 5 is a block diagram showing a third embodiment of the job re-execution control method for a loosely coupled multiplex system according to the present invention.

FIG. 6 is a block diagram showing a fourth embodiment of the job re-execution control method for a loosely coupled multiplex system according to the present invention.

[Explanation of symbols]

A1, A11, A21, A31: Host failure recognition means
A2, A12, A22, A32: Job re-execution preparation means
A3, A13, A23, A33: Job scheduling means
A4, A14, A24, A34: Job management information holding means
A5, A15, A25, A35: Job starting means
A6, A16, A26, A36: Execution program
B1, B11, B21, B31: Job control language
B2, B12, B22, B32: Job control language translation means
C11: Terminal
C12: Job execution host registration means
D21: User management information storage means
D22: User management information acquisition means
E31: Host information acquisition means
E32: Job class information acquisition means
F31: System definition information holding means
G1, G11, G21, G31: Host monitoring device
H1, ... Hm, ... Hn, ..., H11, ... H1m, ... H1n, ..., H21, ... H2m, ... H2n, ..., H31, ... H3m, ... H3n, ...: Host computer

Claims

[Claims]

1. A job re-execution control method for a loosely coupled multiplex system having a plurality of host computers and a host monitoring device that is connected to each of the host computers and, when a failure occurs in any one of the host computers, notifies the other host computers of the failure, wherein a first host computer comprises: (A) job management information holding means for holding job control information for jobs submitted to the host computers; (B) job control language translation means for receiving and translating a job control language and, when the job control language designates a third host computer that should re-execute a job if a failure occurs in a second host computer, registering that designation in the job management information holding means as job control information; (C) host failure recognition means for, when the failure occurs, being notified of the failed second host computer by the host monitoring device and recognizing it, and thereby referring to the job control information in the job management information holding means and issuing a re-execution request for the failed job; (D) job re-execution preparation means for, on receiving the job re-execution request from the host failure recognition means, referring to the job control information in the job management information holding means and issuing a job rescheduling request for the third host computer designated by that job control information; and (E) job scheduling means for receiving the job control information from the job control language translation means and scheduling the jobs to be executed by the plurality of host computers and, on receiving the job rescheduling request for the third host computer from the job re-execution preparation means, rescheduling the job and thereby requesting the third host computer to re-execute the job of the second host computer; and wherein each of the plurality of host computers comprises (F) job starting means for, when there is a job execution request based on the schedule from the job scheduling means, starting the execution program of the job whose execution was requested.

2. The job re-execution control method for a loosely coupled multiplex system according to claim 1, wherein the following are provided for the host computer: (A) a terminal for inputting, when the host computer according to claim 1 that is to re-execute a job upon a failure has not been designated or when an existing designation is to be changed, a designation of the third host computer that should re-execute the job if a failure occurs in the second host computer; and (B) job re-execution host registration means for, when the third host computer that should re-execute the job if a failure occurs in the second host computer is designated from the terminal, registering that designation in the job management information holding means as new job control information.

3. The job re-execution control method for a loosely coupled multiplex system according to claim 1, wherein the first host computer comprises: (A) user management information storage means for storing the designation of the host computer according to claim 1 that is to re-execute a job when a failure occurs; (B) user management information acquisition means for acquiring, for the user of the failed job, the designation of the host computer that is to re-execute the job from the user management information storage means; and (C) the host failure recognition means according to claim 1, which, when a failure occurs in the second host computer and no host computer for re-executing the job is designated in the job management information holding means according to claim 1, passes the user of the failed job to the user management information acquisition means to acquire the designation of the host computer for that user, registers the designation in the job management information holding means as new job control information, and issues a re-execution request for the failed job.

4. The job re-execution control method for a loosely coupled multiplex system according to claim 1, wherein the first host computer comprises: (A) system definition information holding means for holding the designation of the host computer according to claim 1 that is to re-execute a job when a failure occurs; (B) host information acquisition means for acquiring, for the second host computer in which the failure has occurred, the designation of the host computer that is to re-execute the job from the system definition information holding means; and (C) the host failure recognition means according to claim 1, which, when a failure occurs in the second host computer and no host computer for re-executing the job is designated in the job management information holding means according to claim 1, passes the second host computer to the host information acquisition means to acquire the designation of the third host computer for the second host computer, registers the designation in the job management information holding means as new job control information, and issues a re-execution request for the failed job.

5. The job re-execution control method for a loosely coupled multiplex system according to claim 1, wherein the first host computer comprises: (A) system definition information holding means for holding the designation of the host computer according to claim 1 that is to re-execute a job when a failure occurs; (B) job class information acquisition means for acquiring, for the job class of the failed job, the designation of the host computer that is to re-execute the job from the system definition information holding means; and (C) the host failure recognition means according to claim 1, which, when a failure occurs in the second host computer and no host computer for re-executing the job is designated in the job management information holding means according to claim 1, passes the job class of the failed job to the job class information acquisition means to acquire the designation of the third host computer for that job class, registers the designation in the job management information holding means as new job control information, and issues a re-execution request for the failed job.