JP2009098715A - Redundant system device, job execution method in redundant system device, and execution program

Info

Publication number: JP2009098715A
Application number: JP2007266635A
Authority: JP
Inventors: Kosuke Hideshima; 功介秀島
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2007-10-12
Filing date: 2007-10-12
Publication date: 2009-05-07

Abstract

PROBLEM TO BE SOLVED: To switch systems in a short time when a failure occurs in a redundant system.

SOLUTION: A primary server 10 and a standby server 20 are each provided with an execution time DB 11, 21 storing the times at which jobs are to be executed and an execution order DB 13, 23 storing the execution order of mutually dependent jobs. The standby server 20 is further provided with an unexecuted job DB 22 storing jobs that have not been executed by the primary server 10. In normal operation, the primary server 10 executes jobs by referring to the execution time DB 11 and the execution order DB 13 and notifies the standby server 20 of executed jobs, and the standby server 20 updates the unexecuted job DB 22 on the basis of the notification. In abnormal operation, the standby server 20 refers to the execution time DB 21, the execution order DB 23, and the unexecuted job DB 22 and executes those jobs that have not been executed by the primary server 10 and whose dependency-related jobs have already been executed.

COPYRIGHT: (C) 2009, JPO & INPIT

Description

The present invention relates to a redundant system device and to a job execution method and execution program for the redundant system device, and more particularly to a redundant system device that improves reliability through multiplexing with a primary server and a standby server, and to a job execution method and execution program for that device.

Today, a wide range of information systems form the backbone of business. Consequently, even when a system goes down because of a disaster or similar cause, it must be restored in a short time.

For this reason, techniques are known that improve tolerance to system failure by making the system redundant. For example, in a system with a primary server (production system) and a standby server (backup system), when the primary server fails, job execution is switched from the primary server to the standby server so that jobs continue to run.

Typical systems with a primary server and a standby server use either the hot standby method (hot spare method) or the cold standby method (cold spare method).

In the hot standby method, the standby server always performs the same operations as the primary server that is running in normal operation (that is, it is in a mirrored state). Therefore, when an abnormality occurs in the primary server, the standby server can take over processing immediately.

In the cold standby method, by contrast, the standby server is started only after a failure of the primary server has been confirmed, and it then takes over job execution. A variant (the warm standby method) is also known in which only the OS is kept running on the standby server, which allows faster switchover than the cold standby method. The cold standby method costs less than the hot standby method but is less tolerant of system failure.

Patent Document 1 discloses, as an example of the hot standby method, a scheme (parallel redundant scheme, dual operation scheme) in which two computers perform online business processing simultaneously and cross-check each other's results to raise the reliability of the system as a whole.

Patent Document 2 likewise discloses a hot standby system, made up of an automatically operated active server and a spare server, that can quickly continue operation even if a failure occurs in the active server.

Patent Document 3 discloses a scheme that, when a failure occurs in an offline system, allows re-execution from an already processed job step without modifying the job control language description.

Patent Document 1: JP-A-60-043772
Patent Document 2: JP-A-11-259326
Patent Document 3: JP-A-06-103078

The following analysis was made by the present inventor.

In the conventional cold standby method, it has been difficult to switch between the primary server and the standby server in a short time. This is because time was needed both to check the execution status of the scheduled jobs as of just before the system went down and to set the point from which job execution continues after the execution system is switched.

In addition, in a conventional cold standby system, synchronizing job execution status requires data to be exchanged between the primary server and the standby server, and this exchange was laborious when the two systems were installed at sites remote from each other.

In the hot standby systems disclosed in Patent Documents 1 and 2, on the other hand, the standby server executes jobs in the same way as the primary server even in normal operation, which makes the system expensive.

The problem to be addressed is therefore to provide a redundant system device in which the standby server does not execute jobs in normal operation and yet the switchover between systems can be completed in a short time when a failure occurs.

A redundant system device according to a first aspect of the present invention comprises a primary server having an execution time database (hereinafter, DB) that stores the times at which jobs are to be executed and an execution order DB that stores the execution order of a plurality of mutually dependent jobs, and a standby server having the execution time DB, the execution order DB, and an unexecuted job DB that stores jobs not executed on the primary server. In normal operation, the primary server executes jobs with reference to the execution time DB and the execution order DB and notifies the standby server of executed jobs, and the standby server updates the unexecuted job DB based on the notified executed jobs. In abnormal operation, the standby server refers to the execution time DB, the execution order DB, and the unexecuted job DB and executes those jobs that have not been executed on the primary server and whose dependency-related jobs have already been executed.

A job execution method according to a second aspect of the present invention is a method for a redundant system device comprising a primary server having an execution time database (hereinafter, DB) that stores the times at which jobs are to be executed and an execution order DB that stores the execution order of a plurality of mutually dependent jobs, and a standby server having the execution time DB, the execution order DB, and an unexecuted job DB that stores jobs not executed on the primary server. In normal operation, the method includes a step in which the primary server executes jobs with reference to the execution time DB and the execution order DB, a step of notifying the standby server of executed jobs, and a step in which the standby server updates the unexecuted job DB based on the notified executed jobs. In abnormal operation, the method includes a step in which the standby server refers to the execution time DB, the execution order DB, and the unexecuted job DB and executes those jobs that have not been executed on the primary server and whose dependency-related jobs have already been executed.

A job execution program according to a third aspect of the present invention is a program for a redundant system device comprising a primary server having an execution time database (hereinafter, DB) that stores the times at which jobs are to be executed and an execution order DB that stores the execution order of a plurality of mutually dependent jobs, and a standby server having the execution time DB, the execution order DB, and an unexecuted job DB that stores jobs not executed on the primary server. In normal operation, the program causes a computer to execute a process of having the primary server execute jobs with reference to the execution time DB and the execution order DB, a process of having the primary server notify the standby server of executed jobs, and a process of having the standby server update the unexecuted job DB based on the notified executed jobs. In abnormal operation, the program causes a computer to execute a process of having the standby server refer to the execution time DB, the execution order DB, and the unexecuted job DB and execute those jobs that have not been executed on the primary server and whose dependency-related jobs have already been executed.

In a redundant system device according to a first development mode, it is preferable that, in abnormal operation, the standby server refers to the execution time DB, the execution order DB, and the unexecuted job DB and, for a job that has not been executed on the primary server and whose dependency-related job has not yet been executed, executes the dependency-related job first and then executes the job.

In a job execution method for a redundant system device according to a second development mode, it is preferable to include a step in which, in abnormal operation, the standby server refers to the execution time DB, the execution order DB, and the unexecuted job DB and, for a job that has not been executed on the primary server and whose dependency-related job has not yet been executed, executes the dependency-related job first and then executes the job.

The redundant system device according to the present invention makes it possible to switch from the primary server to the standby server in a short time. This is because the standby server makes the same schedule determinations as the primary server, and finished jobs can be identified from the unexecuted job DB and the job end information, so the point from which job execution continues is easy to set.

The redundant system device according to the present invention also allows the standby server to determine job dependencies automatically and execute only those jobs that can be submitted. This is because the operating status of preceding jobs with dependencies is accumulated, and the jobs that can be submitted are determined from that status.

Furthermore, with the redundant system device according to the present invention, job execution status can be synchronized without placing a load on the primary server. This is because a message is sent only when a job ends, so synchronization requires only a small amount of data.

A redundant system device according to an embodiment of the present invention will be described with reference to the drawings.

Referring to FIG. 1, the redundant system device includes a primary server 10 and a standby server 20.

The primary server 10 includes an execution time DB 11 and an execution order DB 13, and further includes a schedule determination unit 15, a server determination unit 16, a dependency determination unit 17, and a job execution unit 18.

The standby server 20 includes an execution time DB 21, an unexecuted job DB 22, an execution order DB 23, a schedule determination unit 25, a server determination unit 26, a dependency determination unit 27, a job execution unit 28, and a job execution status synchronization unit 29.

In normal operation, the primary server 10 operates as follows. A job submitted on the basis of the execution time DB 11 is checked for dependencies against the execution order DB 13 and is then executed.

As information indicating job execution status, only the end message of each job is sent from the job execution unit 18 of the primary server 10 to the job execution status synchronization unit 29 of the standby server 20.

The standby server 20 operates as follows. Job submission is requested on the basis of the execution time DB 21, which is identical to the execution time DB 11 of the primary server 10. In normal operation, however, the standby server 20 does not actually submit the jobs; instead, the jobs that were not submitted are recorded in the unexecuted job DB 22. The job execution status synchronization unit 29 of the standby server 20 periodically acquires the job end status from the job execution unit 18 of the primary server 10 and synchronizes the job execution status.

In abnormal operation, when a switchover occurs between the primary server 10 and the standby server 20, the status from the point of the last synchronization up to the present is synchronized, and jobs are then submitted on the standby server 20. At this point, the unexecuted job DB 22 and the execution order DB 23 are used to determine whether any preceding job on which the requested job depends has not been executed. If such an unexecuted preceding job exists, the requested job is not executed. Consequently, the standby server 20 executes only jobs that do not depend on an unexecuted preceding job.
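As an illustrative sketch only, and not part of the disclosed embodiment, the catch-up synchronization at switchover and the subsequent dependency check might look as follows in Python. The class `StandbyState`, the `synced_upto` checkpoint, the list-based job end log, and the dictionary layout of the execution order DB are all assumptions introduced here for illustration.

```python
class StandbyState:
    """Hypothetical in-memory model of the standby server's bookkeeping."""

    def __init__(self, execution_order_db):
        # Assumed layout: job name -> list of jobs that must run before it.
        self.execution_order_db = execution_order_db
        self.unexecuted = set()   # stands in for the unexecuted job DB 22
        self.synced_upto = 0      # number of end-log entries already applied

    def record_unsubmitted(self, job):
        """Normal operation: note a job the standby did not submit."""
        self.unexecuted.add(job)

    def catch_up(self, job_end_log):
        """Apply only the end messages that arrived since the last sync point."""
        for job in job_end_log[self.synced_upto:]:
            self.unexecuted.discard(job)
        self.synced_upto = len(job_end_log)

    def can_run(self, job):
        """A job may run if none of its preceding jobs is still unexecuted."""
        preceding = self.execution_order_db.get(job, [])
        return not any(p in self.unexecuted for p in preceding)
```

In this sketch, calling `catch_up` again at switchover touches only the log entries added since the previous call, which mirrors the idea of synchronizing "from the point of the last synchronization up to the present".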

Next, a first embodiment of the present invention will be described in detail with reference to the drawings.

FIG. 2 is a block diagram of the redundant system device according to the first embodiment of the present invention.

Referring to FIG. 2, the redundant system device according to this embodiment includes a primary server 10, a standby server 20, and an execution status storage server 40.

The primary server 10 includes an execution time DB 11 describing the dates and times at which jobs are to be executed; an unexecuted job DB 12 recording whether each job was actually executed according to the execution time DB; an execution order DB 13 storing the job dependencies needed for flow control; job definition information 14 storing the information needed to execute jobs; a schedule determination unit 15; a server determination unit 16; a dependency determination unit 17; and a job execution unit 18.

The standby server 20 includes an execution time DB 21 describing the dates and times at which jobs are to be executed; an unexecuted job DB 22 recording whether each job was actually executed according to the execution time DB; an execution order DB 23 storing the job dependencies needed for flow control; job definition information 24 storing the information needed to execute jobs; a schedule determination unit 25; a server determination unit 26; a dependency determination unit 27; a job execution unit 28; and a job execution status synchronization unit 29.

The execution status storage server 40 holds job end information 41.

Each of these units operates as follows.

The schedule determination unit 15 of the primary server 10 reads the execution time DB 11 and, when the date and time at which a job is to be executed arrives, issues a job submission request to the server determination unit 16. Likewise, the schedule determination unit 25 of the standby server 20 reads the execution time DB 21 and, when the date and time at which a job is to be executed arrives, issues a job submission request to the server determination unit 26.

The server determination unit 16 of the primary server 10 sends the job submission request to the dependency determination unit 17. The server determination unit 26 of the standby server 20 discards the job submission request and records in the unexecuted job DB 22 that the job was not executed.

On the primary server 10, no data exists in the unexecuted job DB 12, so the dependency determination unit 17 sends the job submission request to the job execution unit 18.

When data exists in the unexecuted job DB 22, the dependency determination unit 27 of the standby server 20 evaluates the dependencies of the requested job using the information in the execution order DB 23 and, if there is no problem, sends the job submission request to the job execution unit 28.

The job execution unit 18 of the primary server 10 executes the job in response to the job submission request and, when it detects that the job has ended, records job end information 41 in the execution status storage server 40.

The job execution status synchronization unit 29 of the standby server 20 acquires the job end information 41 from the execution status storage server 40 and deletes from the unexecuted job DB 22 those jobs that appear in the job end information 41.

When the server determination unit 26 of the standby server 20 receives a system switchover notification issued by a system administrator or the like, the standby server 20 takes over job execution in place of the primary server 10.

In normal operation, the server determination unit 26 discards job submission requests. After receiving the switchover notification, however, it sends job submission requests to the dependency determination unit 27.

Next, the operation of the redundant system device according to this embodiment will be described in detail with reference to the drawings.

FIG. 3 is a flowchart of the operation of the schedule determination units 15 and 25.

The schedule determination units 15 and 25 read the execution time DBs 11 and 21 (step S10) and determine, based on the defined date and time information, whether there is a job to be started (step S11). If there is a job to be started, they request job submission (step S12).
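As an illustrative sketch only, steps S10 to S12 might be expressed in Python as follows. The dictionary layout of the execution time DB, the sample job names and times, the function names, and the `already_requested` set used to avoid issuing the same request twice are assumptions introduced here; the source does not specify them.

```python
from datetime import datetime

# Hypothetical stand-in for the execution time DB: job name -> scheduled start.
EXECUTION_TIME_DB = {
    "backup": datetime(2007, 10, 12, 1, 0),
    "report": datetime(2007, 10, 12, 6, 0),
}


def due_jobs(execution_time_db, now, already_requested):
    """Steps S10-S11: list jobs whose scheduled time has arrived."""
    return sorted(
        job
        for job, when in execution_time_db.items()
        if when <= now and job not in already_requested
    )


def schedule_tick(execution_time_db, now, already_requested, request_submission):
    """Step S12: issue one submission request per due job."""
    for job in due_jobs(execution_time_db, now, already_requested):
        request_submission(job)
        already_requested.add(job)  # assumed bookkeeping, not in the source
```

The same function would run on both servers in this sketch; what happens to the request afterwards is decided by the server determination unit.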

FIG. 4 is a flowchart of the operation of the server determination units 16 and 26.

The server determination units 16 and 26 read information indicating whether the server is the primary system or the standby system (step S20) and determine whether it is the primary server 10 (step S21). If it is the primary server 10, the job submission request is sent to the dependency determination unit 17 (step S22). If it is the standby server 20, the fact that the job was not executed is recorded in the unexecuted job DB 22 (step S23).
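As an illustrative sketch only, the branch in steps S20 to S23 might be written in Python as follows. The function name, the boolean `acting_as_primary`, and the use of a set for the unexecuted job DB are assumptions introduced here. Treating a standby server that has received the switchover notification as `acting_as_primary=True` is one possible way to model the behavior described earlier for the server determination unit 26.

```python
def server_determination(job, acting_as_primary, unexecuted_job_db,
                         forward_to_dependency_check):
    """Route a job submission request according to the server's role."""
    if acting_as_primary:                  # S21: primary system?
        forward_to_dependency_check(job)   # S22: pass the request on
    else:
        unexecuted_job_db.add(job)         # S23: record as not executed
```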

FIG. 5 is a flowchart of the operation of the dependency determination units 17 and 27 and the job execution units 18 and 28.

The dependency determination units 17 and 27 determine whether there are any jobs in the unexecuted job DB (step S30). If there are none (No in step S30), the job execution units 18 and 28 execute the job (step S34). If there are (Yes in step S30), the dependency determination unit 27 reads the execution order DB 23 (step S31) and checks the execution status of the dependency-related jobs that must be executed first (step S32). It then determines whether those jobs have been executed (step S33). If they have (Yes in step S33), the job execution unit 28 executes the job (step S34); if not (No in step S33), processing ends without executing the job.

When the job execution unit 18 detects that the executed job has ended (Yes in step S35), it sends job end information to the execution status storage server 40, and the execution status storage server 40 records the received job end information 41 (step S36).
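As an illustrative sketch only, the flow of FIG. 5 (steps S30 to S36) might be expressed in Python as follows. The function name, the dictionary layout of the execution order DB, and the list standing in for the job end information are assumptions introduced here. The sketch also assumes that a preceding job counts as "executed" when it is absent from the unexecuted job DB, and it records end information for every job it runs, whereas the text describes this step only for the job execution unit 18.

```python
def dependency_check_and_run(job, unexecuted_job_db, execution_order_db,
                             run_job, job_end_log):
    """Return True if the job was executed, False if it was held back."""
    if unexecuted_job_db:                                  # S30: any entries?
        preceding = execution_order_db.get(job, [])        # S31: read order DB
        # S32-S33: is every preceding job already executed?
        if any(p in unexecuted_job_db for p in preceding):
            return False                                   # end without running
    run_job(job)                                           # S34: execute
    job_end_log.append(job)                                # S35-S36: record end
    return True
```

With an empty unexecuted job DB the dependency lookup is skipped entirely, which corresponds to the "No" branch of step S30.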

FIG. 6 is a flowchart of the operation of the job execution status synchronization unit 29.

The job execution status synchronization unit 29 acquires the job end information 41 from the execution status storage server 40 (step S40) and determines whether the jobs recorded in the unexecuted job DB 22 include any job that has ended (step S41). If they do (Yes in step S41), it deletes the ended jobs from the unexecuted job DB 22 (step S42).
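As an illustrative sketch only, steps S40 to S42 amount to a set difference and might be written in Python as follows. The function name and the use of a set and a list for the unexecuted job DB and the job end information are assumptions introduced here.

```python
def synchronize(unexecuted_job_db, job_end_info):
    """Remove ended jobs from the unexecuted job DB and return them."""
    ended = unexecuted_job_db & set(job_end_info)   # S41: overlap, if any
    unexecuted_job_db.difference_update(ended)      # S42: delete in place
    return ended
```

End information for jobs that were never recorded as unexecuted is simply ignored in this sketch.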

A redundant system device according to a second embodiment of the present invention will be described with reference to the drawings.

Referring to FIG. 7, the standby server 50 of this embodiment is the standby server 20 of the redundant system device according to the first embodiment, further provided with an unexecuted job determination unit 30.

The redundant system device according to the first embodiment is configured not to execute a requested job when the jobs on which it depends have no record of execution.

Based on the information extracted by the dependency determination unit 27, the unexecuted job determination unit 30 of this embodiment identifies unexecuted jobs on which the requested job depends and, if any exist, has the job execution unit 28 execute them. The job execution unit 28 executes the requested job after those unexecuted jobs have finished.
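As an illustrative sketch only, the behavior of the second embodiment might be expressed in Python as follows. The function name and data layouts are assumptions introduced here. The text describes running the unexecuted jobs on which the requested job depends; following chains of such jobs recursively, removing each job from the unexecuted job DB once it has run, and rejecting cyclic dependencies are additional assumptions of this sketch.

```python
def run_with_preceding_jobs(job, unexecuted_job_db, execution_order_db,
                            run_job, _in_progress=None):
    """Run unexecuted preceding jobs first, then the requested job."""
    in_progress = set() if _in_progress is None else _in_progress
    if job in in_progress:
        raise ValueError(f"cyclic dependency involving {job!r}")
    in_progress.add(job)
    for preceding in execution_order_db.get(job, []):
        if preceding in unexecuted_job_db:
            # Assumed extension: resolve the preceding job's own dependencies too.
            run_with_preceding_jobs(preceding, unexecuted_job_db,
                                    execution_order_db, run_job, in_progress)
    run_job(job)
    unexecuted_job_db.discard(job)   # assumed bookkeeping after execution
    in_progress.discard(job)
```

For example, with `report` depending on `load` and `load` depending on `extract`, and both `extract` and `load` unexecuted, requesting `report` runs the three jobs in the order `extract`, `load`, `report`.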

The above description is based on embodiments, but the present invention is not limited to those embodiments.

The present invention can be applied to uses such as system switchover in a batch job management system when a disaster occurs, and it also makes a backup center easier to build.

FIG. 1 is a block diagram of a redundant system device according to an embodiment of the present invention.
FIG. 2 is a block diagram of a redundant system device according to a first embodiment of the present invention.
FIG. 3 is a flowchart of the operation of the schedule determination unit in the redundant system device according to the first embodiment of the present invention.
FIG. 4 is a flowchart of the operation of the server determination unit in the redundant system device according to the first embodiment of the present invention.
FIG. 5 is a flowchart of the operation of the dependency determination unit and the job execution unit in the redundant system device according to the first embodiment of the present invention.
FIG. 6 is a flowchart of the operation of the job execution status synchronization unit in the redundant system device according to the first embodiment of the present invention.
FIG. 7 is a block diagram of a redundant system device according to a second embodiment of the present invention.

Explanation of symbols

10 Primary server
11, 21 Execution time database (DB)
12, 22 Unexecuted job DB
13, 23 Execution order DB
14, 24 Job definition information
15, 25 Schedule determination unit
16, 26 Server determination unit
17, 27 Dependency determination unit
18, 28 Job execution unit
20, 50 Standby server
29 Job execution status synchronization unit
30 Unexecuted job determination unit
40 Execution status storage server
41 Job end information

Claims

A redundant system device comprising:
a primary server having an execution time database (hereinafter referred to as DB) storing the times at which jobs are to be executed and an execution order DB storing the execution order of a plurality of mutually dependent jobs; and
a standby server having the execution time DB, the execution order DB, and an unexecuted job DB storing jobs that have not been executed on the primary server,
wherein, in normal operation, the primary server executes jobs with reference to the execution time DB and the execution order DB and notifies the standby server of executed jobs, and the standby server updates the unexecuted job DB based on the notified executed jobs, and
wherein, in abnormal operation, the standby server refers to the execution time DB, the execution order DB, and the unexecuted job DB and executes a job that has not been executed on the primary server and for which the job having a dependency relationship with it has already been executed.

The redundant system device according to claim 1, wherein, in abnormal operation, the standby server refers to the execution time DB, the execution order DB, and the unexecuted job DB and, when a job has not been executed on the primary server and the job having a dependency relationship with it has not been executed, executes the job having the dependency relationship and then executes the job.

A job execution method in a redundant system device that comprises a primary server having an execution time database (hereinafter referred to as DB) storing the times at which jobs are to be executed and an execution order DB storing the execution order of a plurality of mutually dependent jobs, and a standby server having the execution time DB, the execution order DB, and an unexecuted job DB storing jobs that have not been executed on the primary server, the method comprising:
in normal operation, a step in which the primary server executes jobs with reference to the execution time DB and the execution order DB;
a step of notifying the standby server of executed jobs;
a step in which the standby server updates the unexecuted job DB based on the notified executed jobs; and
in abnormal operation, a step in which the standby server refers to the execution time DB, the execution order DB, and the unexecuted job DB and executes a job that has not been executed on the primary server and for which the job having a dependency relationship with it has already been executed.

The job execution method in a redundant system device according to claim 3, further comprising a step in which, in abnormal operation, the standby server refers to the execution time DB, the execution order DB, and the unexecuted job DB and, when a job has not been executed on the primary server and the job having a dependency relationship with it has not been executed, executes the job having the dependency relationship and then executes the job.

A job execution program in a redundant system device that comprises a primary server having an execution time database (hereinafter referred to as DB) storing the times at which jobs are to be executed and an execution order DB storing the execution order of a plurality of mutually dependent jobs, and a standby server having the execution time DB, the execution order DB, and an unexecuted job DB storing jobs that have not been executed on the primary server, the program causing a computer to execute:
in normal operation, a process of causing the primary server to execute jobs with reference to the execution time DB and the execution order DB;
a process of causing the primary server to notify the standby server of executed jobs;
a process of causing the standby server to update the unexecuted job DB based on the notified executed jobs; and
in abnormal operation, a process of causing the standby server to refer to the execution time DB, the execution order DB, and the unexecuted job DB and to execute a job that has not been executed on the primary server and for which the job having a dependency relationship with it has already been executed.