JPH11126195A

JPH11126195A - Distributed system

Info

Publication number: JPH11126195A
Application number: JP9289368A
Authority: JP
Inventors: Satoshi Kobayashi; 智小林
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1997-10-22
Filing date: 1997-10-22
Publication date: 1999-05-11

Abstract

PROBLEM TO BE SOLVED: To enable a system configuration that is optimal for the processing capability and reliability of each server performing business processing. SOLUTION: Server management devices 2a-2n manage and control the operation of their respective PC servers 1a-1n. By exchanging information with one another over a communication path 3, they share the latest configuration information expressing the processing capability of the PC servers 1a-1n, fault information expressing their reliability, and the configuration information of external input/output devices 5a-5m. Taking the processing capability and reliability of the PC servers 1a-1n into account, each server management device 2a-2n determines a suitable connection destination by autonomous judgment, connects through an external input/output switching mechanism 4, and starts job processing by starting an OS 6a-6m.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a distributed system composed of a plurality of processing devices having different business processing capabilities and reliability, and in particular to system configuration control in a distributed system that has no computer for controlling the entire system in an integrated manner.

[0002]

[Description of the Related Art] System configuration control methods for distributed systems that use a plurality of PC servers or their data processing units fall into two types: a method that uses a dedicated system control device to manage and control the entire distributed system in an integrated manner, and a method in which the components of the distributed system monitor one another's operating status by means of the operating system (OS) or an extension of it, and control the system configuration on that basis. FIG. 14 is a block diagram showing a conventional distributed system of the latter type, in which the components monitor one another's operating status.

[0003] This distributed system comprises: server management devices 102a to 102n, which are installed in a plurality (n) of PC servers 101a to 101n that execute business processing and which individually manage the operation of the PC servers 101a to 101n; operating systems (abbreviated as OS) 106a to 106n, which run on the PC servers 101a to 101n and perform business processing; a plurality (m) of external input/output devices 105a to 105m, which store the data required for business processing and communicate between systems; shared disks 108a to 108m, which store shared data within the external input/output devices 105a to 105m; duplexed Local Area Networks (abbreviated as LAN) 107a and 107b, which carry communication among the external input/output devices 105a to 105m and with other external systems; and an external input/output switching mechanism 104, which connects the PC servers 101a to 101n to the external input/output devices 105a to 105m so that data can be shared among the PC servers 101a to 101n.

[0004] Next, the operation is described for the case in which the PC server 101c becomes inoperable because of a hardware failure when the system is started by power-on.

[0005] Under the control of the server management devices 102a to 102n, each of the PC servers 101a to 101n initializes its internal hardware, performs self-diagnosis, and starts up its OS 106a to 106n. During this process, the PC server 101c stops without completing the startup of the OS 106c because of the hardware failure. The remaining OSs 106a, 106b, and 106d to 106n start up normally, connect to the corresponding external input/output devices 105a to 105m via the external input/output switching mechanism 104, and confirm one another's operating status through the LANs 107a and 107b. As a result, the running OSs 106a, 106b, and 106d to 106n learn that the OS 106c has not started. The service for the shared disk 108c on the external input/output device 105c, which the OS 106c was to handle, is then taken over by, for example, the PC server 101b, which accesses the disk via the external input/output switching mechanism 104. Thereafter, the PC servers 101a, 101b, and 101d to 101n begin business processing as a distributed system using the shared disks 108a to 108m and the LANs 107a and 107b.

[0006]

[Problems to be Solved by the Invention] However, because the conventional distributed system is configured as described above, it requires a special OS or OS extension for server switching, and is therefore unsuited to building a high-performance, low-cost system from open products based on PC servers.

[0007] In addition, PC servers are improving rapidly in processing capability and reliability. When the capability of part of the system is raised to keep pace in a timely manner, for example by replacing some of the PC servers or adding processors, with the aim of improving the business processing capability of the system as a whole, the OS cannot necessarily allocate work appropriately according to the processing capability of each PC server.

[0008] Moreover, PC servers have increasingly become multiprocessor machines in recent years, which makes degraded operation on a per-processor basis possible. When degraded operation is performed, however, the resulting drop in processing capability and reliability could not be handled appropriately, so the improvement in overall system availability that degraded operation offers was not fully exploited.

[0009] Furthermore, in substitute operation by another PC server when a PC server stops because of a failure, the processing capability and reliability of each PC server cannot be properly taken into account when selecting the substitute PC server, so a well-balanced configuration of the distributed system as a whole could not always be achieved.

[0010] The present invention has been made to solve the problems described above. Its object is to provide a distributed system that enables a system configuration optimal for the processing capability and reliability of each server performing business processing, without depending on special OS control functions or on a system management function that integrates the entire system, and that also enables an optimal system configuration from the remaining components when a failure occurs.

[0011]

[Means for Solving the Problems] To achieve the above object, the distributed system according to the first invention has a plurality of processing devices, a plurality of external input/output devices each storing an operating system and business applications, and external input/output switching means for connecting the processing devices and the external input/output devices, and executes predetermined business processing by having a processing device start the operating system of an external input/output device via the external input/output switching means. The distributed system further comprises: a plurality of processing device management devices, one mounted on each processing device, that manage the operation of the processing devices; a communication path for communication among the processing device management devices; and device switching means, provided in each processing device, that connects and disconnects the processing device and the external input/output devices in accordance with instructions from the processing device management device. Each processing device management device shares the configuration information and fault information of the processing devices by exchanging them over the communication path, and determines the connection destination of the external input/output devices by autonomous judgment based on the shared configuration information and fault information. The device switching means connects and disconnects the processing device and the external input/output device based on the connection destination determined by the processing device management device.
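As an illustration only, the sharing of configuration information and fault information among the processing device management devices can be sketched in Python. The record fields, the `version` counter used to discard stale reports, and the names `ServerInfo`, `Manager` and `broadcast` are assumptions made for this sketch; the specification does not define them.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServerInfo:
    server_id: str
    processors_ok: int   # hypothetical configuration item
    fault_count: int     # hypothetical fault-information item
    version: int         # hypothetical counter: a newer report wins


@dataclass
class Manager:
    server_id: str
    table: dict = field(default_factory=dict)

    def receive(self, info: ServerInfo) -> None:
        # Keep only the most recent report for each server.
        current = self.table.get(info.server_id)
        if current is None or info.version > current.version:
            self.table[info.server_id] = info


def broadcast(managers, info: ServerInfo) -> None:
    # Stand-in for the communication path: every manager gets the report.
    for manager in managers:
        manager.receive(info)
```

Because every manager ends up holding the same table, each one can apply the same deterministic rule locally and arrive at the same connection decision without a central controller.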

[0012] In the distributed system according to the second invention, in the first invention, each processing device management device determines, based on the shared configuration information and fault information, the connection destination of the external input/output devices that is optimal in terms of processing capability and reliability.

[0013] In the distributed system according to the third invention, in the first invention, each processing device management device automatically generates the configuration information and fault information of its processing device by performing initialization processing with reference to the shared configuration information and fault information when the processing device is started up, and reports the automatically generated configuration information and fault information to the other processing device management devices.

[0014] In the distributed system according to the fourth invention, in the first invention, the configuration information of each external input/output device includes indices representing the processing capability and reliability that a processing device needs in order to execute the business applications stored in that external input/output device.

[0015] In the distributed system according to the fifth invention, in the first invention, the configuration information of the processing devices includes information on the connection destination of each processing device.

[0016] In the distributed system according to the sixth invention, in the first invention, each processing device management device calculates, based on the shared configuration information and fault information, an effective performance score representing the effective processing capability of each processing device and a reliability score representing the reliability of each processing device, and determines the external input/output device to which the processing device is to be connected based on the calculated effective performance score, the calculated reliability score, and the configuration information of the external input/output devices.
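The text at this point does not say how the two scores are computed. Purely as a hypothetical illustration, the Python sketch below assumes that the effective performance score scales with the number of processors still in operation and that the reliability score drops by a fixed penalty per recorded fault. Both formulas, the parameter names and the default penalty are assumptions, not taken from the specification.

```python
def effective_performance(points_per_processor: int, active_processors: int) -> int:
    # Assumed formula: capability scales with the processors still running.
    return points_per_processor * active_processors


def reliability_score(base: int, fault_count: int, penalty: int = 10) -> int:
    # Assumed formula: each recorded fault lowers the score, never below zero.
    return max(0, base - penalty * fault_count)
```

Under these assumptions, a server that loses one of four processors to degraded operation keeps three quarters of its performance score, which is the kind of change the management devices would need to share before choosing connection destinations.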

[0017] In the distributed system according to the seventh invention, in the first invention, each processing device management device determines a connection rank for all the processing devices based on their effective performance scores and reliability scores, determines a connection rank for the external input/output devices based on their configuration information, and selects as the connection destination the external input/output device whose connection rank is the same as that of the processing device.
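The rank matching of the seventh invention can be sketched as follows. This is a minimal Python sketch under stated assumptions: how the two scores are combined into one rank is not given here, so the code simply sorts by performance and then by reliability, and breaks ties by identifier so that every management device computes the same order. The function name and the tuple layout are hypothetical.

```python
def assign_by_rank(servers, devices):
    """Pair servers and external I/O devices that share a connection rank.

    servers: {server_id: (performance_score, reliability_score)}
    devices: {device_id: (required_performance, required_reliability)}
    """
    # Highest scores first; the identifier breaks ties deterministically.
    ranked_servers = sorted(servers, key=lambda s: (-servers[s][0], -servers[s][1], s))
    ranked_devices = sorted(devices, key=lambda d: (-devices[d][0], -devices[d][1], d))
    pairs = dict(zip(ranked_servers, ranked_devices))
    # Servers ranked below the last device have nothing to connect to.
    unconnected = ranked_servers[len(ranked_devices):]
    return pairs, unconnected
```

With three servers and two devices, the lowest-ranked server is left unconnected, which corresponds to the case in which a processing device is connected to no external input/output device.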

[0018] In the distributed system according to the eighth invention, in the first or third invention, each processing device management device updates the configuration information or fault information it holds in response to a report from the processing device management device mounted on another processing device.

[0019] In the distributed system according to the ninth invention, in the eighth invention, when its processing device has not been connected to any external input/output device as a result of connecting the processing devices and the external input/output devices, each processing device management device places that processing device on standby as a substitute processing device. When a substitute-operation report is received from another processing device that has been executing business processing, the management device reconnects to its own processing device the external input/output device that was connected to the reporting processing device, so that its processing device takes over the business processing that the reporting processing device was executing.

[0020] In the distributed system according to the tenth invention, in the ninth invention, when there are a plurality of substitute processing devices, each processing device management device has the business processing taken over by the substitute processing device whose processing capability and reliability are closest to those of the reporting processing device at the time of the report.

[0021] In the distributed system according to the eleventh invention, in the ninth invention, when there are a plurality of substitute processing devices, each processing device management device has the business processing taken over by the substitute processing device whose processing capability and reliability are closest to those that the reporting processing device inherently has.
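The tenth and eleventh inventions differ only in the reference point used to pick a substitute: the reporting device's capability at the time of the report, or its inherent capability. A minimal Python sketch of that selection follows. The distance measure (sum of absolute differences of the two scores) and all names are assumptions; the specification only says "closest".

```python
def pick_substitute(reference, standbys):
    """Choose the standby server closest to a reference capability.

    reference: (performance_score, reliability_score) of the reporting server
    standbys:  {server_id: (performance_score, reliability_score)}
    """
    def distance(server_id):
        performance, reliability = standbys[server_id]
        # Assumed metric: sum of absolute score differences.
        return abs(performance - reference[0]) + abs(reliability - reference[1])

    # Iterating in sorted order makes ties resolve to the lowest identifier.
    return min(sorted(standbys), key=distance)
```

Passing the degraded scores as `reference` models the tenth invention, while passing the original scores models the eleventh; the two can select different substitutes for the same failure.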

[0022] In the distributed system according to the twelfth invention, in the eighth invention, when its processing device has not been connected to any external input/output device as a result of connecting the processing devices and the external input/output devices, each processing device management device places that processing device on standby as a substitute processing device. When there are a plurality of substitute processing devices, the substitute processing device with the highest connection rank takes over the business processing in place of another processing device that is executing business processing.

[0023] In the distributed system according to the thirteenth invention, in any of the ninth to twelfth inventions, the processing device management device mounted on the processing device that handed over its business processing causes that processing device, once it has been restored, to take the business processing back from the takeover processing device and resume it.

[0024] In the distributed system according to the fourteenth invention, in the first invention, each processing device management device follows the constraints on connection between the processing devices and the external input/output devices, which are included in the configuration information of both, and determines by autonomous judgment the external input/output device to which a processing device is to be connected, separately for each group of processing devices and external input/output devices that satisfies the constraints.
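The group-wise decision of the fourteenth invention can be illustrated with a short Python sketch. It assumes, hypothetically, that the connection constraint is encoded as a group label carried in the configuration information of both servers and devices, and that ranking inside a group uses a single score; neither encoding is specified in the text.

```python
from collections import defaultdict


def assign_within_groups(servers, devices):
    """Match servers to external I/O devices separately inside each group.

    servers: {server_id: (group_label, score)}
    devices: {device_id: (group_label, required_score)}
    """
    groups = defaultdict(lambda: ([], []))
    for server_id, (group, _) in servers.items():
        groups[group][0].append(server_id)
    for device_id, (group, _) in devices.items():
        groups[group][1].append(device_id)

    result = {}
    for server_ids, device_ids in groups.values():
        # Rank each side inside the group, highest score first.
        server_ids.sort(key=lambda s: (-servers[s][1], s))
        device_ids.sort(key=lambda d: (-devices[d][1], d))
        result.update(zip(server_ids, device_ids))
    return result
```

A server is therefore never paired with a device outside its own group, even when that device would otherwise share its rank.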

[0025] In the distributed system according to the fifteenth invention, in the first invention, the configuration information of the processing devices includes operation control information for placing a processing device on standby as a substitute processing device at startup. When a standby instruction is set in the operation control information of a processing device, each processing device management device places that processing device on standby as a substitute processing device at startup, regardless of how high or low its processing capability and reliability are.

[0026] In the distributed system according to the sixteenth invention, in the fifteenth invention, a standby instruction is set in advance in the operation control information of a processing device that has high processing capability and reliability.
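A minimal Python sketch of the standby instruction in the fifteenth and sixteenth inventions follows. The dictionary layout, the `standby` key and the single `score` value are assumptions made for illustration.

```python
def split_standby(servers):
    """Separate servers reserved as substitutes from those to be ranked.

    servers: {server_id: {"score": int, "standby": bool}}
    """
    # Servers flagged for standby are held back whatever their score.
    reserved = sorted(s for s in servers if servers[s]["standby"])
    # Only the remaining servers take part in connection ranking.
    active = sorted(
        (s for s in servers if not servers[s]["standby"]),
        key=lambda s: (-servers[s]["score"], s),
    )
    return active, reserved
```

Setting the flag on the most capable server, as the sixteenth invention suggests, keeps a strong substitute in reserve instead of assigning it to the most demanding device at startup.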

[0027]

[Embodiments of the Invention] Preferred embodiments of the present invention are described below with reference to the drawings.

[0028] Embodiment 1. FIG. 1 is a block diagram showing Embodiment 1 of the distributed system according to the present invention. FIG. 1 shows a plurality (n) of PC servers 1a to 1n, an external input/output switching mechanism 4, and a plurality (m) of external input/output devices 5a to 5m. The external input/output devices 5a to 5m store the data required for business processing, such as an OS and business applications. Each of the PC servers 1a to 1n is a processing device that executes predetermined business processing by starting the OS of one of the connected external input/output devices 5a to 5m. The external input/output switching mechanism 4 is the external input/output switching means that connects all the PC servers 1a to 1n with the external input/output devices 5a to 5m.

[0029] Each of the PC servers 1a to 1n carries a server management device 2a to 2n as its processing device management device. Each server management device 2a to 2n individually manages and controls the operation of the PC server 1a to 1n on which it is mounted (hereinafter also called the "managed server") and exchanges information with the other server management devices via the communication path 3. The external input/output devices 5a to 5m store the operating systems (OS) 6a to 6m used to execute business processing, and communicate with one another and with other external systems via the duplexed Local Area Networks (LAN) 7a and 7b. Although the external input/output switching mechanism 4 is drawn as a single switching device in FIG. 1, in a large-scale distributed system it may be built from a plurality of small switching devices in a multi-stage, multiplexed arrangement, in order to cope with a larger switching range and with the need for faster and more reliable data transfer paths.

[0030] The PC servers 1a to 1n connect to the external input/output devices 5a to 5m via the external input/output switching mechanism 4 and start one of the OSs 6a to 6m stored in the external input/output devices 5a to 5m to perform business processing. The processing data for the business processing is stored in the connected external input/output device 5a to 5m, and communication with external systems is also carried out over the LANs 7a and 7b using the connected external input/output device 5a to 5m. To start an OS 6a to 6m that runs on a PC server 1a to 1n, the server must be connected to one uniquely selected external input/output device 5a to 5m. For purposes such as sharing business processing data or expanding data storage, however, one PC server 1a to 1n may be connected to a plurality of external input/output devices 5a to 5m, and a plurality of PC servers 1a to 1n may be connected to one external input/output device 5a to 5m.

[0031] The server management devices 2a to 2n monitor and record the operating status of the PC servers 1a to 1n and perform operation control, such as powering the PC servers 1a to 1n on and off, starting the initialization processing (which includes initialization and self-diagnosis), connecting to the external input/output devices 5a to 5m, and starting and ending business processing. At system startup, the server management devices 2a to 2n determine which PC server 1a to 1n is to be connected to which external input/output device 5a to 5m, as required to start the OSs 6a to 6m of the external input/output devices 5a to 5m. By exchanging them over the communication path 3, the server management devices 2a to 2n share configuration information based on the specifications, environment, and so on of the PC servers 1a to 1n and fault information based on their operating state, and so always hold the latest configuration information and fault information for all the PC servers 1a to 1n. The server management devices 2a to 2n also hold configuration information based on the specifications, environment, and so on of the external input/output devices 5a to 5m, and each thereby determines by autonomous judgment the external input/output device 5a to 5m to which the PC server 1a to 1n it manages should connect.

[0032] FIG. 2 is a block diagram showing a PC server 1i, which is one of the plurality of PC servers 1a to 1n in FIG. 1. In the PC server 1i, a plurality (n) of processors 8a to 8n that perform data processing, a memory 9 formed of RAM or the like that temporarily stores processing data, and a bus bridge 10 are connected by a processor bus 24. The bus bridge 10 absorbs the difference in operating speed between the fast processor bus 24 and the slower input/output bus 23, so that data can be transferred smoothly between any devices connected to the two buses 23 and 24. Connected to the input/output bus 23 are: an external input/output device connection device 11 for connecting the PC server 1i to the external input/output devices 5a to 5m via the external input/output switching mechanism 4; a ROM 12 that stores programs for initializing and self-diagnosing the PC server 1i; control devices 13, 15, 17, and 19 for a keyboard 14, a mouse 16, a display 18, and a floppy disk drive (FDD) 20, which provide direct input/output between the PC server 1i and the outside; and a control device 21 for a hard disk drive (HDD) 22 used to hold the diagnostic programs of the PC server 1i and temporary data. The input/output bus 23 is a bus based on an industry standard such as the PCI bus or the EISA bus, and the external input/output device connection device 11 and the control devices 13, 15, 17, 19, and 21 are implemented as PCI-compliant expansion boards or the like. The server management device 2i, which manages and controls the operation of the PC server 1i, has a built-in nonvolatile memory 25 that records the management information of the server management device 2i. The nonvolatile memory 25 need not be built into the server management device 2i; it may instead be an external storage device dedicated to the server management device 2i, such as an HDD or a memory card.

[0033] The external input/output device connection device 11 is the device switching means that connects and disconnects the PC server 1i and the external input/output devices 5a to 5m. By connecting the external input/output device 5a to 5m selected on the instruction of the server management device 2i to the input/output bus 23 via the external input/output switching mechanism 4, it enables data transfer between the selected external input/output device 5a to 5m and the processors 8a to 8n, the memory 9, and the control devices 13, 15, 17, 19, and 21. In addition, the external input/output device connection device 11 is defined as the load device for the OSs 6a to 6m of the external input/output devices 5a to 5m. As a result, after the initialization processing of the PC server 1i is complete, the external input/output device connection device 11 remains in the load-device-offline state until the server management device 2i determines the connection destination and issues a connection instruction to it, which holds the processors 8a to 8n in a state of waiting for system startup.
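The way the connection device holds back system startup can be pictured as a small state machine. The Python sketch below is illustrative only: the class, the state names and the methods are hypothetical, and the real device is hardware on the input/output bus rather than software.

```python
from enum import Enum


class LoadState(Enum):
    OFFLINE = "offline"   # no connection destination chosen yet
    ONLINE = "online"     # an external I/O device is connected


class ConnectionDevice:
    def __init__(self):
        # After initialization the load device starts out offline.
        self.state = LoadState.OFFLINE
        self.target = None

    def connect(self, device_id):
        # Issued by the server management device once it has decided.
        self.target = device_id
        self.state = LoadState.ONLINE

    def disconnect(self):
        self.target = None
        self.state = LoadState.OFFLINE

    def boot_source(self):
        # The processors can load an OS only while the device is online.
        return self.target if self.state is LoadState.ONLINE else None
```

While `boot_source()` returns `None`, the processors have nothing to load and simply wait, which mirrors the system-startup wait state described above.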

[0034] When the PC server 1i is started up, the processors 8a to 8n execute the initialization and self-diagnosis programs of the PC server 1i stored in the ROM 12, and initialize and diagnose the memory 9, the external input/output device connection device 11, the keyboard 14, the mouse 16, the display 18, the FDD 20, the HDD 22, and their control devices 13, 15, 17, 19, and 21. After initialization and self-diagnosis are complete, the processors 8a to 8n start business processing by reading the OS 6a to 6m stored in the external input/output device 5a to 5m into the memory 9 via the external input/output device connection device 11. The operator gives instructions to the OS 6a to 6m using the keyboard 14, the mouse 16, and the display 18. The processors 8a to 8n also have a degraded-operation function: even if self-diagnosis reveals a failure in, for example, the processor 8c, the remaining processors 8a, 8b, and 8d to 8n carry out data processing while the processor 8c stays stopped.

【００３５】サーバ管理装置２ｉは、ＰＣサーバ１ｉの
運転管理・制御を行うと共に、通信路３を使用して外部
のサーバ管理装置２ａ〜２ｎとの間で、構成情報と障害
情報の授受を行うことにより、全ＰＣサーバ１ａ〜１ｎ
の最新の構成情報と障害情報を不揮発性メモリ２５に記
録する。ＰＣサーバ１ｉの初期化及び自己診断時におい
て、サーバ管理装置２ｉは、障害が発生したプロセッサ
８ａ〜８ｎを自動的に切り離したり、障害履歴や操作員
の指示等により特定のプロセッサ８ａ〜８ｎやメモリ９
の一部を切り離す等の縮退運転を実施する。この場合、
サーバ管理装置２ｉは、初期化処理完了時点で通信路３
を介して他の全てのサーバ管理装置２ａ〜２ｎに当該縮
退情報を通報する。サーバ管理装置２ｉは、ＰＣサーバ
１ｉの初期化処理が完了した時点で構成情報と障害情報
に基づいて適切な外部入出力装置５ａ〜５ｍを選択し、
外部入出力装置接続装置１１に接続先を指示することに
よってＰＣサーバ１ｉと選択された外部入出力装置５ａ
〜５ｍを接続する。
The server management device 2i manages and controls the operation of the PC server 1i and, by exchanging configuration information and failure information with the other server management devices 2a to 2n over the communication path 3, records the latest configuration information and failure information of all the PC servers 1a to 1n in the nonvolatile memory 25. During initialization and self-diagnosis of the PC server 1i, the server management device 2i carries out degraded operation, for example by automatically disconnecting any of the processors 8a to 8n in which a failure has occurred, or by disconnecting specific processors 8a to 8n or part of the memory 9 according to the failure history or an operator's instruction. In this case, the server management device 2i reports the degradation information to all the other server management devices 2a to 2n via the communication path 3 when the initialization process is completed. When the initialization process of the PC server 1i is completed, the server management device 2i selects an appropriate external input/output device from 5a to 5m based on the configuration information and the failure information, and connects the PC server 1i to the selected external input/output device by instructing the external input/output device connection device 11 of the connection destination.

【００３６】このように、各サーバ管理装置２ａ〜２ｎ
は、構成情報管理機能、障害情報管理機能及び構成制御
機能を有している。各サーバ管理装置２ａ〜２ｎは、構
成情報管理機能を発揮することによって、図示しない通
信手段によって通信路３を介して相互に構成情報を授受
しあうことで、全てのＰＣサーバ１ａ〜１ｎに関する最
新の構成情報を共有する。また、各サーバ管理装置２ａ
〜２ｎは、障害情報管理機能を発揮することによって、
全てのＰＣサーバ１ａ〜１ｎに関する障害発生履歴に基
づいて求めた信頼度点数、障害による停止や縮退運転中
等の障害情報を図示しない通信手段によって通信路３を
介して相互に授受しあうことで、全てのＰＣサーバ１ａ
〜１ｎに関する最新の障害情報を共有する。また、管理
対象サーバにおける障害発生を監視し、障害発生を検知
した場合には、他のサーバ管理装置２ａ〜２ｎへ障害情
報を通報することによって、他のＰＣサーバ１ａ〜１ｎ
が保有する当該管理対象サーバに関する障害情報を更新
する。更に、各サーバ管理装置２ａ〜２ｎは、構成制御
機能を発揮することによって、共有している構成情報及
び障害情報に基づいて、最適な外部入出力装置５ａ〜５
ｍを自律判断で選択し、外部入出力切り換え機構４によ
って管理対象サーバと選択されたいずれかの外部入出力
装置５ａ〜５ｍとを結合させ、当該外部入出力装置５ａ
〜５ｍに保持されているＯＳを起動して、業務処理を開
始させる。
As described above, each of the server management devices 2a to 2n has a configuration information management function, a failure information management function, and a configuration control function. Through the configuration information management function, the server management devices 2a to 2n exchange configuration information with one another over the communication path 3 by communication means (not shown), and thereby share the latest configuration information on all the PC servers 1a to 1n. Through the failure information management function, the server management devices 2a to 2n exchange failure information with one another over the communication path 3 by communication means (not shown), such as the reliability score derived from the failure history of each PC server and whether a server is stopped due to a failure or is in degraded operation, and thereby share the latest failure information on all the PC servers 1a to 1n. Each server management device also monitors its managed server for failures and, when it detects one, reports the failure information to the other server management devices 2a to 2n so that the failure information about that managed server held for the other PC servers 1a to 1n is updated. Further, through the configuration control function, each of the server management devices 2a to 2n autonomously selects the optimum external input/output device from 5a to 5m based on the shared configuration information and failure information, couples its managed server to the selected external input/output device by means of the external input/output switching mechanism 4, and starts the OS held in that external input/output device to begin business processing.

【００３７】図３は、図１の複数の外部入出力装置５ａ
〜５ｍの構成を示すブロック図である。外部入出力装置
５ｉは、複数のディスク制御装置２６ａ〜２６ｎ、複数
（ｎ）台のディスク制御装置２６ａ〜２６ｎによってデ
ータの記録や読出しをする複数（ｎ）台のディスク装置
２７ａ〜２７ｎ，２８ａ〜２８ｎ、ＬＡＮ７との通信制
御を行う複数（ｎ）台の回線制御装置２９ａ〜２９ｎ、
外部入出力切り換え機構４を経由してＰＣサーバ１ａ〜
１ｎと接続するサーバ接続装置３０を有している。ここ
では、ディスク装置２７ａに業務処理を実行するための
ＯＳ６ｉが格納されているものとしている。
FIG. 3 is a block diagram showing the configuration of the external input/output devices 5a to 5m of FIG. 1. The external input/output device 5i has a plurality (n) of disk controllers 26a to 26n, a plurality (n) of disk devices 27a to 27n and 28a to 28n on which data is recorded and read under the disk controllers 26a to 26n, a plurality (n) of line controllers 29a to 29n that control communication with the LAN 7, and a server connection device 30 that connects to the PC servers 1a to 1n via the external input/output switching mechanism 4. Here, it is assumed that the OS 6i for executing business processing is stored in the disk device 27a.

【００３８】システム立上げ時、外部入出力装置５ｉの
サーバ接続装置３０は、ＰＣサーバ１ａ〜１ｎのいずれ
かと接続し、接続先のＰＣサーバ１ｉの指示を受けて、
ディスク制御装置２６ａの制御によりディスク装置２７
ａに格納されているＯＳｉ６ｉを読み出し、ＰＣサーバ
１ｉによって業務処理が開始される。業務処理が開始さ
れると、接続先のＰＣサーバ１ｉの指示のもと、複数の
ディスク装置２７ａ〜２７ｎ，２８ａ〜２８ｎが業務処
理用のデータ保存に利用されると共に、接続先のＰＣサ
ーバ１ｉは、回線制御装置２９ａ〜２９ｎを利用してＬ
ＡＮ７により他のＰＣサーバ１ａ〜１ｎや外部システム
とのデータ授受を行う。
At system startup, the server connection device 30 of the external input/output device 5i connects to one of the PC servers 1a to 1n. On instruction from the connected PC server 1i, the OS 6i stored in the disk device 27a is read out under the control of the disk controller 26a, and the PC server 1i starts business processing. Once business processing has started, the disk devices 27a to 27n and 28a to 28n are used to store business data under the direction of the connected PC server 1i, and the connected PC server 1i uses the line controllers 29a to 29n to exchange data over the LAN 7 with the other PC servers 1a to 1n and with external systems.

【００３９】ＰＣサーバ１ｉは、ＯＳｉ６ｉをロードす
ることによってＯＳｉ６ｉが格納されている外部入出力
装置５ｉ内のディスク装置２７ａ〜２７ｎ，２８ａ〜２
８ｎと回線制御装置２９ａ〜２９ｎのみで業務処理を開
始できるように設計されている。当該ＯＳｉ６ｉが具備
する業務処理に必要な処理能力と信頼性は、全てのＰＣ
サーバ１ａ〜１ｎ内のサーバ管理装置２ａ〜２ｎの不揮
発性メモリ２５に外部入出力装置５ａ〜５ｍの構成情報
として記憶されており、システム立上げ時においてＰＣ
サーバ１ａ〜１ｎが接続先の外部入出力装置５ａ〜５ｍ
を選択する際の判定基準となる。なお、業務処理開始
後、外部入出力装置５ａ〜５ｍ内のディスク装置２７ａ
〜２７ｎ，２８ａ〜２８ｎを、ＯＳ６ａ〜６ｍの制御に
よりＰＣサーバ１ａ〜１ｎで共用させることもできる。
The PC server 1i is designed so that, by loading the OS 6i, it can start business processing using only the disk devices 27a to 27n and 28a to 28n and the line controllers 29a to 29n in the external input/output device 5i in which the OS 6i is stored. The processing capability and reliability required for the business processing provided by the OS 6i are stored, as configuration information of the external input/output devices 5a to 5m, in the nonvolatile memory 25 of the server management devices 2a to 2n in all the PC servers 1a to 1n, and serve as the criteria by which the PC servers 1a to 1n select their destination external input/output devices 5a to 5m at system startup. After business processing has started, the disk devices 27a to 27n and 28a to 28n in the external input/output devices 5a to 5m can also be shared among the PC servers 1a to 1n under the control of the OS 6a to 6m.

【００４０】図４は、サーバ管理装置２ａ〜２ｎが不揮
発性メモリ２５に記憶しているＰＣサーバ１ａ〜１ｎの
構成情報と障害情報の例を表形式で示した図である。更
に、ＰＣサーバ１ａ〜１ｎの初期化処理完了時、サーバ
管理装置２ａ〜２ｎが接続先の外部入出力装置５ａ〜５
ｍを決定する際に算出するサーバ接続順位も示してい
る。構成情報には、ＰＣサーバ１ａ〜１ｎと外部入出力
装置５ａ〜５ｍを接続する際の制約条件となるサーバグ
ループ、性能を左右する性能指標となるプロセッサモデ
ル、動作周波数、プロセッサ数、キャッシュ容量、メモ
リ容量、プロセッサモデル等、更にはこれらの構成情報
に基づいて後述する既定の算式によりシステム設計者が
算出する基本性能点数及びその基本性能点数に後述する
縮退条件を加味してサーバ管理装置２ａ〜２ｎにより算
出される実効的な処理能力を表す実効性能点数が含まれ
る。障害情報には、処理能力を左右する縮退状況を表す
縮退プロセッサ数、信頼性指標としての障害回数、ＰＣ
サーバ１ａ〜１ｎの初期化処理の結果を表す稼動状況
等、更にはこれらの障害情報に基づいて後述する既定の
算式によりサーバ管理装置２ａ〜２ｎが算出した信頼度
点数が含まれている。
FIG. 4 shows, in table form, an example of the configuration information and failure information of the PC servers 1a to 1n that the server management devices 2a to 2n store in the nonvolatile memory 25. It also shows the server connection rank that the server management devices 2a to 2n calculate when determining the destination external input/output devices 5a to 5m on completion of the initialization process of the PC servers 1a to 1n. The configuration information includes the server group, which is a constraint on connecting the PC servers 1a to 1n to the external input/output devices 5a to 5m; the performance indices that affect performance, such as the processor model, operating frequency, number of processors, cache capacity, and memory capacity; the basic performance score, which the system designer calculates from this configuration information using a predetermined formula described later; and the effective performance score, which represents the effective processing capability and is calculated by the server management devices 2a to 2n by applying the degradation conditions described later to the basic performance score. The failure information includes the number of degraded processors, which represents the degradation status that affects processing capability; the number of failures, which serves as a reliability index; the operating status, which represents the result of the initialization process of the PC servers 1a to 1n; and the reliability score, which the server management devices 2a to 2n calculate from this failure information using a predetermined formula described later.

【００４１】各ＰＣサーバ１ａ〜１ｎの構成情報と障害
情報は、システム構築時にシステム設計者が各値を決定
してサーバ管理装置２ａ〜２ｎに設定する。但し、ある
サーバ管理装置２ｉは、自身が管理するＰＣサーバ１ｉ
に関する詳細な構成情報と障害情報を保持するが、他の
ＰＣサーバ１ａ〜１ｎに関する構成情報と障害情報につ
いては、サーバグループ、実効性能点数及び信頼度点数
のみを保持する。例えば、ＰＣサーバ１ｊの構成変更や
増設時には、ＰＣサーバ１ｊを管理するサーバ管理装置
２ｊが、通信路３を介して他の全てのサーバ管理装置２
ａ〜２ｎに新たな構成情報と障害情報を通報することに
より全てのサーバ管理装置２ａ〜２ｎの構成情報及び障
害情報を更新する。サーバ管理装置２ｊが通報する新た
な構成情報と障害情報には、サーバグループ、実効性能
点数及び信頼度点数が含まれている。これによって、各
ＰＣサーバ１ａ〜１ｎに固有の構成情報や障害情報を個
々のサーバ管理装置２ａ〜２ｎ内に閉じて管理できると
共に、実効性能点数及び信頼度点数の算式についても各
ＰＣサーバ１ａ〜１ｎに固有の算式を設けることができ
る。以下に構成情報及び障害情報を構成する各情報の詳
細について説明する。
The values of the configuration information and failure information of each of the PC servers 1a to 1n are determined by the system designer at system construction and set in the server management devices 2a to 2n. However, a given server management device 2i holds detailed configuration information and failure information only for the PC server 1i that it manages; for the other PC servers 1a to 1n, it holds only the server group, the effective performance score, and the reliability score. For example, when the configuration of the PC server 1j is changed or the PC server 1j is newly added, the server management device 2j that manages the PC server 1j reports the new configuration information and failure information to all the other server management devices 2a to 2n via the communication path 3, thereby updating the configuration information and failure information in all the server management devices 2a to 2n. The new configuration information and failure information reported by the server management device 2j comprise the server group, the effective performance score, and the reliability score. As a result, the configuration information and failure information specific to each of the PC servers 1a to 1n can be managed in a closed manner within the individual server management devices 2a to 2n, and the formulas for the effective performance score and the reliability score can also be defined individually for each of the PC servers 1a to 1n. The individual items that make up the configuration information and failure information are described in detail below.
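
The split between detailed local information and the three-item summary shared with peers can be illustrated with a minimal Python sketch. The names `ServerSummary`, `ManagerState`, and `broadcast` are hypothetical and do not appear in the specification; the in-memory loop merely stands in for the communication path 3.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ServerSummary:
    """The only items a manager keeps about PC servers it does not manage."""
    server_id: str
    server_group: str
    effective_score: int
    reliability_score: int


@dataclass
class ManagerState:
    """Hypothetical state of one server management device."""
    own_id: str
    peers: Dict[str, ServerSummary] = field(default_factory=dict)

    def apply_report(self, report: ServerSummary) -> None:
        # Overwrite whatever was previously known about the reporting server.
        self.peers[report.server_id] = report


def broadcast(report: ServerSummary, managers: List[ManagerState]) -> None:
    """Deliver a report to every manager except the one that sent it."""
    for manager in managers:
        if manager.own_id != report.server_id:
            manager.apply_report(report)
```

Because only the summary crosses the communication path, each manager is free to compute its scores with its own formula, as the paragraph above notes.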

【００４２】サーバグループは、各ＰＣサーバ１ａ〜１
ｎをいずれかの外部入出力装置５ａ〜５ｍへ接続する際
の制約条件の一つであり、ここでは、例としてＩによっ
てインテル社のマイクロプロセッサアーキテクチャであ
るｘ８６アーキテクチャを示し、Ｍによってマルチプロ
セッサ構成であることを、Ｓによってシングルプロセッ
サ構成であることを示す。この情報によって、ｘ８６ア
ーキテクチャのマルチプロセッサシステムとｘ８６アー
キテクチャのシングルプロセッサシステムを異なるサー
バグループとして定義し、外部入出力装置５ａ〜５ｍに
格納されたＯＳ６ａ〜６ｍの仕様にあったＰＣサーバ１
ａ〜１ｎを選択できるようにしている。すなわち、この
制約条件を満たすＰＣサーバ１ａ〜１ｎ及び外部入出力
装置５ａ〜５ｍ単位でＰＣサーバ１ａ〜１ｎの接続先と
なる外部入出力装置５ａ〜５ｍが選択され、決定され
る。なお、ここでは、ｘ８６アーキテクチャのみを示し
ているが、他のアーキテクチャのサーバを定義すること
によって、例えば、ＤＥＣ社のアルファアーキテクチャ
のＰＣサーバをＰＣサーバ１ａ〜１ｎに混在させても、
対応したＯＳの格納された外部入出力装置５ａ〜５ｍに
接続させることができる。
The server group is one of the constraints on connecting each of the PC servers 1a to 1n to one of the external input/output devices 5a to 5m. In this example, I denotes the x86 architecture, which is Intel's microprocessor architecture, M denotes a multiprocessor configuration, and S denotes a single-processor configuration. With this information, x86 multiprocessor systems and x86 single-processor systems are defined as different server groups, so that a PC server 1a to 1n matching the specifications of the OS 6a to 6m stored in an external input/output device 5a to 5m can be selected. That is, the external input/output device to which a PC server is connected is selected and determined among the PC servers 1a to 1n and external input/output devices 5a to 5m that satisfy this constraint. Although only the x86 architecture is shown here, servers of other architectures can also be defined; for example, even if PC servers based on DEC's Alpha architecture are mixed among the PC servers 1a to 1n, they can be connected to the external input/output devices 5a to 5m that store the corresponding OS.
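
As a minimal sketch of the server group constraint, the Python function below filters candidate servers by group code. The function name and the sample data are illustrative assumptions; only the group codes `IM` and `IS` follow the notation described above.

```python
def compatible_servers(servers, device_group):
    """Return the IDs of servers whose group equals the device's group.

    servers: iterable of (server_id, server_group) pairs.
    device_group: group code required by the OS on the device, e.g. "IM".
    """
    return [server_id for server_id, group in servers if group == device_group]


# Illustrative data only: group assignments here are assumptions.
example_servers = [("1a", "IM"), ("1b", "IM"), ("1d", "IS")]
multiprocessor_candidates = compatible_servers(example_servers, "IM")
```

Supporting another architecture would, under this sketch, only require introducing a new group code; the comparison itself stays unchanged.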

【００４３】基本性能点数は、各ＰＣサーバ１ａ〜１ｎ
が本来有している処理能力を示す点数であり、構成の縮
退がされていない状態における処理能力を示す点数であ
る。基本性能点数は、プロセッサモデル、動作周波数、
プロセッサ数、キャッシュ容量及びメモリ容量等の処理
能力を左右する性能指標に基づいてシステム設計者がシ
ステム構築時に決定される。基本性能点数は、各性能指
標毎に重み付けを行い、例えば、（プロセッサモデル点
数）＋（動作周波数×重み値Ｆ）＋（プロセッサ数×重
み値Ｐ）＋（キャッシュ容量×重み値Ｃ）＋（メモリ容
量×重み値Ｍ）のような計算式によって算出する。従っ
て、基本性能点数は、値が大きいほど処理能力が高いこ
とを意味している。
The basic performance score indicates the processing capability that each of the PC servers 1a to 1n inherently has, that is, its processing capability when the configuration is not degraded. The basic performance score is determined by the system designer at system construction on the basis of the performance indices that affect processing capability, such as the processor model, operating frequency, number of processors, cache capacity, and memory capacity. Each performance index is weighted, and the score is calculated by a formula such as (processor model score) + (operating frequency × weight F) + (number of processors × weight P) + (cache capacity × weight C) + (memory capacity × weight M). Accordingly, a larger basic performance score means a higher processing capability.
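
The example formula can be written as a short Python function. This is a sketch only: the parameter names are hypothetical, and the specification gives no concrete weight values or units, so any numbers passed in are the caller's assumptions.

```python
def basic_performance_score(model_points, frequency, processors, cache, memory,
                            *, weight_f, weight_p, weight_c, weight_m):
    """Weighted sum of the performance indices, following the example formula.

    All weights are chosen by the system designer; none are fixed by the text.
    """
    return (model_points
            + frequency * weight_f
            + processors * weight_p
            + cache * weight_c
            + memory * weight_m)
```

For instance, with a model score of 1, a frequency term of 2, 4 processors, and unit weights for F, P and M but a zero weight for the cache term, the score is 1 + 2 + 4 + 0 + 1 = 8.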

【００４４】実効性能点数は、構成の縮退がされた状態
での運転の結果、処理能力が低下したことにより補正さ
れた処理能力の点数であり、ＰＣサーバ１ａ〜１ｎの初
期化処理完了時に、サーバ管理装置２ａ〜２ｎが、基本
性能点数と縮退した部位の縮退の形態及び重み値を元に
算出する。例えば、基本性能点数−（縮退プロセッサ数
×重みＰ）のような計算式によって算出する。図４の例
によると、ＰＣサーバ１ｂの縮退プロセッサ数は２であ
ることから、重みＰを１とするとＰＣサーバ１ｂの実効
性能点数は、基本性能点数８に対して（８−２×１＝）
６となる。
The effective performance score is the processing capability score corrected for the loss of capability that results from operating with a degraded configuration. On completion of the initialization process of the PC servers 1a to 1n, the server management devices 2a to 2n calculate it from the basic performance score, the form of degradation of the degraded parts, and the weights. For example, it is calculated by a formula such as (basic performance score) - (number of degraded processors × weight P). In the example of FIG. 4, the number of degraded processors of the PC server 1b is 2, so with weight P set to 1 the effective performance score of the PC server 1b is 8 - 2 × 1 = 6 against a basic performance score of 8.
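
The worked example for the PC server 1b can be reproduced with a one-line Python function. The function and parameter names are hypothetical, and only processor degradation is modeled, as in the example formula.

```python
def effective_performance_score(basic_score, degraded_processors, weight_p=1):
    """Basic score reduced by the weighted number of degraded processors."""
    return basic_score - degraded_processors * weight_p


# PC server 1b in FIG. 4: basic score 8, two degraded processors, weight P = 1.
score_1b = effective_performance_score(8, 2, weight_p=1)
```

A server with no degraded processors keeps its basic performance score unchanged.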

【００４５】縮退プロセッサ数は、ＰＣサーバ１ａ〜１
ｎの初期化処理完了時点で更新される値であり、縮退さ
れているプロセッサ８ａ〜８ｎの数を表す。
The number of degraded processors is a value updated on completion of the initialization process of the PC servers 1a to 1n, and represents the number of processors 8a to 8n that have been disconnected.

【００４６】障害回数は、ＰＣサーバ１ａ〜１ｎの初期
化処理や業務処理中に発生したＰＣサーバ１ａ〜１ｎの
障害発生回数を表し、サーバ管理装置２ａ〜２ｎがＰＣ
サーバ１ａ〜１ｎの信頼性を表す指標の一つとして使用
する。本実施の形態では、システム構築以来の障害発生
回数を累積している。障害発生回数が既定の回数を越え
る毎に信頼度点数に加算することで信頼度の低下を表す
ことができるようにしている。
The number of failures represents the number of failures of the PC servers 1a to 1n that have occurred during initialization or business processing, and the server management devices 2a to 2n use it as one of the indices of the reliability of the PC servers 1a to 1n. In the present embodiment, the number of failures is accumulated from the time the system was constructed. Each time the number of failures exceeds a predetermined count, a value is added to the reliability score, so that a decline in reliability can be represented.

【００４７】稼動状況は、ＰＣサーバ１ａ〜１ｎの初期
化処理完了時点と障害発生により稼動状況が変化した際
に更新される。例えば、初期化処理完了時点の更新の結
果、ＰＣサーバ１ｉが障害のため停止した場合、稼動状
況を停止とする。このように更新することで、サーバ管
理装置２ａ〜２ｎが外部入出力装置５ａ〜５ｍの接続先
決定に際し、ＰＣサーバ１ｉを接続先の候補から外すた
めの制御に使用する。
The operating status is updated on completion of the initialization process of the PC servers 1a to 1n and whenever the operating status changes because of a failure. For example, if the update on completion of the initialization process shows that the PC server 1i has stopped because of a failure, the operating status is set to stopped. The server management devices 2a to 2n use the status updated in this way to exclude the PC server 1i from the candidates when determining the connections to the external input/output devices 5a to 5m.

【００４８】信頼度点数は、ＰＣサーバ１ａ〜１ｎの初
期化処理完了時点に、サーバ管理装置２ａ〜２ｎが、縮
退プロセッサ数、障害回数、稼動状況等から既定の算式
によって算出するＰＣサーバ１ａ〜１ｎの信頼性を表す
総合点数である。本実施の形態では、稼動状況が稼動の
場合に障害回数をそのまま信頼度点数としており、稼動
状況が停止の場合には障害回数に更に１００を加算して
いる。従って、信頼度点数が高いほど、信頼性が低いこ
とを意味している。なお、稼動状況が停止の場合の加算
値は、１０００等のより大きな値を利用してもよい。
The reliability score is an overall score representing the reliability of the PC servers 1a to 1n, which the server management devices 2a to 2n calculate on completion of the initialization process from the number of degraded processors, the number of failures, the operating status, and so on, using a predetermined formula. In the present embodiment, when the operating status is operating, the number of failures is used directly as the reliability score; when the operating status is stopped, 100 is added to the number of failures. Accordingly, a higher reliability score means lower reliability. A larger value, such as 1000, may be used as the amount added when the operating status is stopped.
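
The rule used in this embodiment can be sketched in Python as follows. The names `reliability_score` and `STOP_PENALTY` are hypothetical; the penalty of 100 is the value stated in the embodiment, and the text allows a larger value such as 1000.

```python
STOP_PENALTY = 100  # value used in the embodiment; may be larger, e.g. 1000


def reliability_score(failure_count, stopped, stop_penalty=STOP_PENALTY):
    """Higher score means lower reliability.

    An operating server scores its accumulated failure count; a stopped
    server scores the failure count plus the stop penalty.
    """
    return failure_count + (stop_penalty if stopped else 0)
```

Choosing a penalty well above any realistic failure count places stopped servers behind every operating server when scores are compared.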

【００４９】サーバ接続順位は、ＰＣサーバ１ａ〜１ｎ
の初期化処理完了時点に、サーバ管理装置２ａ〜２ｎ
が、サーバグループ毎に、実効性能点数と信頼度点数に
基づいて算出する順位で、本順位に基づいてサーバ管理
装置２ａ〜２ｎがＰＣサーバ１ａ〜１ｎの接続先となる
外部入出力装置５ａ〜５ｍを決定する。本実施の形態で
は、実効性能点数で第１次の順序付けを行い、実効性能
点数が同一のものに対しては信頼度点数で第２次の順序
付けを行っている。更に、最終的な順序付けとしては、
ＰＣサーバ１ａ〜１ｎ固有のサーバ番号の若い順に順序
付けを行なう。なお、順位０は、外部入出力装置５ａ〜
５ｍとの接続を行わないことを表す。図４に示した例に
よると、ＰＣサーバ１ｃは、外部入出力装置５ａ〜５ｍ
との接続候補外である。また、サーバグループＩＳには
ＰＣサーバ１ｄのみが存在するので、ＰＣサーバ１ｄの
サーバ接続順位は１となる。
The server connection rank is a rank that the server management devices 2a to 2n calculate for each server group, on completion of the initialization process of the PC servers 1a to 1n, from the effective performance score and the reliability score. Based on this rank, the server management devices 2a to 2n determine the external input/output devices 5a to 5m to which the PC servers 1a to 1n are connected. In the present embodiment, servers are ordered first by effective performance score, and servers with the same effective performance score are ordered second by reliability score. Any remaining ties are resolved in ascending order of the server number unique to each of the PC servers 1a to 1n. Rank 0 indicates that the server is not connected to any of the external input/output devices 5a to 5m. In the example shown in FIG. 4, the PC server 1c is excluded from the candidates for connection to the external input/output devices 5a to 5m. Also, since the PC server 1d is the only member of server group IS, the server connection rank of the PC server 1d is 1.
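
The three-level ordering can be sketched in Python with a tuple sort key. Several points are assumptions rather than statements of the specification: that a higher effective performance score ranks first, that a lower reliability score (a more reliable server) ranks first, and that a stopped server is given rank 0. The dictionary keys are hypothetical.

```python
from collections import defaultdict


def server_connection_ranks(servers):
    """Map server ID to connection rank within its server group.

    servers: list of dicts with keys "id", "number", "group",
             "effective", "reliability", "stopped".
    """
    ranks = {}
    by_group = defaultdict(list)
    for server in servers:
        if server["stopped"]:
            ranks[server["id"]] = 0  # assumed: stopped servers are not connected
        else:
            by_group[server["group"]].append(server)
    for members in by_group.values():
        # 1st key: effective score (descending); 2nd: reliability score
        # (ascending); 3rd: server number (ascending).
        ordered = sorted(members, key=lambda s: (-s["effective"],
                                                 s["reliability"],
                                                 s["number"]))
        for rank, server in enumerate(ordered, start=1):
            ranks[server["id"]] = rank
    return ranks
```

Because ranks are computed per group, a server that is alone in its group, such as the PC server 1d in group IS, always receives rank 1 while it is operating.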

【００５０】図５は全てのサーバ管理装置２ａ〜２ｎが
不揮発性メモリ２５に記憶している外部入出力装置５ａ
〜５ｍの構成情報の例を表形式で示した図である。この
構成情報は、外部入出力装置５ａ〜５ｍに格納されてい
るＯＳ６ａ〜６ｍの種類を表すＯＳ名、搭載されている
ＯＳ６ａ〜６ｍによる業務処理に必要な処理能力と信頼
性を表す性能指標と信頼度指標、接続元のＰＣサーバ１
ａ〜１ｎに対する制約条件としてのサーバグループ、前
記性能指標と信頼度指標に基づいてサーバグループ単位
で決定され、ＰＣサーバ１ａ〜１ｎとの接続順序を表す
装置接続順位を含む。外部入出力装置５ａ〜５ｍの構成
情報は、装置接続順位を除き全てシステム設計者によっ
て決定され、システム構築時やシステム構成変更時に、
全てのサーバ管理装置２ａ〜２ｎに記録される。装置接
続順位は、ＰＣサーバ１ａ〜１ｎの初期化処理完了時点
で、サーバ管理装置２ａ〜２ｎが、性能指標と信頼度指
標に基づいて算出する。以下に外部入出力装置５ａ〜５
ｍの構成情報を構成する各情報の詳細について説明す
る。
FIG. 5 shows, in table form, an example of the configuration information of the external input/output devices 5a to 5m that all the server management devices 2a to 2n store in the nonvolatile memory 25. This configuration information includes the OS name, which indicates the type of the OS 6a to 6m stored in the external input/output devices 5a to 5m; the performance index and reliability index, which represent the processing capability and reliability required for business processing by the installed OS 6a to 6m; the server group, which is a constraint on the PC servers 1a to 1n that may connect; and the device connection rank, which is determined per server group from the performance index and reliability index and represents the order of connection to the PC servers 1a to 1n. All of the configuration information of the external input/output devices 5a to 5m except the device connection rank is determined by the system designer and recorded in all the server management devices 2a to 2n at system construction or when the system configuration is changed. The device connection rank is calculated by the server management devices 2a to 2n from the performance index and the reliability index on completion of the initialization process of the PC servers 1a to 1n. The individual items that make up the configuration information of the external input/output devices 5a to 5m are described in detail below.

【００５１】サーバグループは、ＰＣサーバ１ａ〜１ｎ
の構成情報に含まれるサーバグループと同じ意味を持
ち、同一のサーバグループ間での接続のみを許可するも
のである。ここでは、ＯＳ６ａ〜６ｍの種類によって、
シングルプロセッサ専用のものやマルチプロセッサに対
応したものによってグループ分けをし、ＰＣサーバ１ａ
〜１ｎとの接続時の制限事項としている。
The server group has the same meaning as the server group included in the configuration information of the PC servers 1a to 1n, and permits connection only between members of the same server group. Here, the devices are grouped according to the type of the OS 6a to 6m, for example an OS dedicated to a single processor or an OS that supports multiprocessors, and this grouping restricts which of the PC servers 1a to 1n may be connected.

【００５２】性能指標と信頼度指標は、この例では、外
部入出力装置５ａ〜５ｍに搭載されたＯＳ６ａ〜６ｍに
よる業務処理に必要な処理能力と信頼性を順位として表
した相対的な指標である。サーバグループ毎に性能指標
で第１次の順序付けを行い、同一性能指標の外部入出力
装置５ａ〜５ｍを信頼性指標で更に順序付けしている。
例えば、外部入出力装置を新たに追加した場合には、シ
ステム設計者が、まず、性能指標で性能面から既存順位
のいずれかに割り付け、信頼度指標によって同一性能指
標内での順序付けを変更する。この例では、性能指標と
信頼度指標を元に、ＰＣサーバ１ａ〜１ｎへの接続順序
を一元的に表した装置接続順位を、サーバ管理装置２ａ
〜２ｎがＰＣサーバ１ａ〜１ｎの初期化処理完了時点で
求めている。サーバ管理装置２ａ〜２ｎは、装置接続順
位とサーバ接続順位をつき合わせることによって、ＰＣ
サーバ１ａ〜１ｎと外部入出力装置５ａ〜５ｍの接続関
係を決定する。
In this example, the performance index and the reliability index are relative indices that express, as ranks, the processing capability and reliability required for business processing by the OS 6a to 6m installed in the external input/output devices 5a to 5m. Within each server group, the devices are ordered first by performance index, and external input/output devices 5a to 5m with the same performance index are further ordered by reliability index. For example, when an external input/output device is newly added, the system designer first assigns it to one of the existing ranks in terms of performance using the performance index, and then adjusts the ordering within that performance index using the reliability index. In this example, the server management devices 2a to 2n derive the device connection rank, which expresses the order of connection to the PC servers 1a to 1n as a single value, from the performance index and the reliability index on completion of the initialization process of the PC servers 1a to 1n. The server management devices 2a to 2n determine the connections between the PC servers 1a to 1n and the external input/output devices 5a to 5m by matching the device connection rank against the server connection rank.
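
The matching of device connection rank against server connection rank can be sketched in Python as below. This assumes that a smaller index value means a more demanding device (rank 1 first) and that a server of rank k is paired with the device of rank k in the same server group; the specification does not spell out either detail. All names and dictionary keys are hypothetical.

```python
from collections import defaultdict


def device_connection_ranks(devices):
    """Rank devices per group by (performance index, reliability index)."""
    by_group = defaultdict(list)
    for device in devices:
        by_group[device["group"]].append(device)
    ranks = {}
    for members in by_group.values():
        ordered = sorted(members, key=lambda d: (d["perf_index"], d["rel_index"]))
        for rank, device in enumerate(ordered, start=1):
            ranks[device["id"]] = rank
    return ranks


def match_connections(servers, devices):
    """Pair each ranked server with the same-rank device of its group.

    servers: list of dicts with keys "id", "group", "rank" (0 = no connection).
    Returns {server_id: device_id or None}; rank-0 servers are omitted.
    """
    device_ranks = device_connection_ranks(devices)
    slot = {(d["group"], device_ranks[d["id"]]): d["id"] for d in devices}
    return {s["id"]: slot.get((s["group"], s["rank"]))
            for s in servers if s["rank"] > 0}
```

Since every manager holds the same shared tables, each one can run this matching independently and arrive at the same assignment without a central controller.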

【００５３】図６は、ＰＣサーバ１ａ〜１ｎと外部入出
力装置５ａ〜５ｍの接続先を決定して業務処理のための
システム立上げを開始する迄のサーバ管理装置２ａ〜２
ｎの処理手順を示したフローチャートである。以下、図
１〜図６を用いて本実施の形態における動作について説
明する。ここでは、ＰＣサーバ１ａ〜１ｎ、サーバ管理
装置２ａ〜２ｎ、外部入出力装置５ａ〜５ｍ及びＯＳ６
ａ〜６ｍのうち、ＰＣサーバ１ｉ、サーバ管理装置２
ｉ、外部入出力装置５ｉ及びＯＳ６ｉに着目して動作を
説明する。分散システム全体の電源投入時には、他のＰ
Ｃサーバ１ａ〜１ｎ及びサーバ管理装置２ａ〜２ｎも同
一の手順によって一斉にシステム立上げが開始され以下
の処理が実行される。
FIG. 6 is a flowchart showing the processing procedure of the server management devices 2a to 2n up to the point where the connections between the PC servers 1a to 1n and the external input/output devices 5a to 5m are determined and system startup for business processing begins. The operation of the present embodiment is described below with reference to FIGS. 1 to 6. The description focuses on the PC server 1i, the server management device 2i, the external input/output device 5i, and the OS 6i among the PC servers 1a to 1n, the server management devices 2a to 2n, the external input/output devices 5a to 5m, and the OS 6a to 6m. When the entire distributed system is powered on, the other PC servers 1a to 1n and server management devices 2a to 2n begin system startup simultaneously by the same procedure and execute the following processing.

【００５４】まず、分散システム全体の電源投入によっ
て、サーバ管理装置２ｉは、ＰＣサーバ１ｉの障害情報
のうち、縮退プロセッサ数を０に、稼動状況を初期化状
態に設定する。メモリやキャッシュ等、縮退情報が他に
も存在する場合には、それらも非縮退時の値に設定する
（ステップ１００）。
First, when the entire distributed system is powered on, the server management device 2i sets, in the failure information of the PC server 1i, the number of degraded processors to 0 and the operating status to initializing. If other degradation information exists, such as for memory or cache, it is likewise set to its non-degraded value (step 100).

【００５５】サーバ管理装置２ｉは、ＰＣサーバ１ｉの
初期化及び自己診断を開始する（ステップ１１０）。Ｐ
Ｃサーバ１ｉのプロセッサ８ａ〜８ｎは、電源投入後、
初期化処理開始がサーバ管理装置２ｉから指示されるま
での間、ハードウェアのリセット状態を維持する等によ
り動作を停止する。サーバ管理装置２ｉが初期化処理開
始を指示すると、プロセッサ８ａ〜８ｎは、ＲＯＭ１２
内の初期化及び自己診断プログラムによって初期化処理
を開始する。これによって、メモリ９、外部入出力装置
接続装置１１、制御装置１３，１５，１７，１９，２
１、キーボード１４、マウス１６、ディスプレイ１８、
ＦＤＤ２０及びＨＤＤ２２が初期化及び診断され、稼動
可能な状態になる。なお、外部入出力装置接続装置１１
は、サーバ管理装置２ｉからの接続先指示が出されるま
では、接続先オフラインの状態を維持する。外部入出力
装置接続装置１１は、ＯＳ６ａ〜６ｍのロード装置とし
て定義されていることから、初期化処理完了後、プロセ
ッサ８ａ〜８ｎはＯＳ６ａ〜６ｍによるシステム立上げ
待ちの状態に留まる。
The server management device 2i starts the initialization and self-diagnosis of the PC server 1i (step 110). After power-on, the processors 8a to 8n of the PC server 1i remain halted, for example by being held in the hardware reset state, until the server management device 2i instructs them to start the initialization process. When the server management device 2i issues that instruction, the processors 8a to 8n start the initialization process by running the initialization and self-diagnosis program in the ROM 12. As a result, the memory 9, the external input/output device connection device 11, the control devices 13, 15, 17, 19, 21, the keyboard 14, the mouse 16, the display 18, the FDD 20, and the HDD 22 are initialized and diagnosed and become operable. The external input/output device connection device 11 keeps its connection destination offline until a connection destination instruction is issued by the server management device 2i. Because the external input/output device connection device 11 is defined as the load device for the OS 6a to 6m, the processors 8a to 8n remain waiting for system startup by the OS 6a to 6m after the initialization process is completed.

【００５６】ここで、ＰＣサーバ１ｉが初期化処理に失
敗して稼動不能であることが確認された場合、サーバ管
理装置２ｉは、ＰＣサーバ１ｉが停止状態になった旨の
障害通報を通信路３を介して他の全てのサーバ管理装置
２ａ〜２ｎに発信する（ステップ１２０，１３０）。こ
の後、サーバ管理装置２ｉは、ＰＣサーバ１ｉを停止状
態に保ち、保守要員による復旧を待つ（ステップ１４
０）。
If it is confirmed at this point that the PC server 1i has failed the initialization process and cannot operate, the server management device 2i sends a failure report, stating that the PC server 1i has entered the stopped state, to all the other server management devices 2a to 2n via the communication path 3 (steps 120 and 130). The server management device 2i then keeps the PC server 1i in the stopped state and waits for recovery by maintenance personnel (step 140).

【００５７】一方、サーバ管理装置２ｉは、プロセッサ
８ａ〜８ｎの初期化処理完了を確認すると、ＰＣサーバ
１ｉの縮退状況を確認する（ステップ１２０，１５
０）。ＰＣサーバ１ｉは、例えばプロセッサ８ａ〜８ｎ
の自己診断の結果、プロセッサ８ｃに障害を発見した場
合、プロセッサ８ｃを切り離して初期化処理を完了させ
る。サーバ管理装置２ｉは、プロセッサ８ｃの切離しを
認識すると、自身の障害情報にその旨（障害プロセッサ
数）を記録する（ステップ１６０）。
On the other hand, when the server management device 2i confirms that the initialization process of the processors 8a to 8n has completed, it checks the degradation status of the PC server 1i (steps 120 and 150). If, for example, the self-diagnosis of the processors 8a to 8n finds a failure in the processor 8c, the PC server 1i disconnects the processor 8c and completes the initialization process. When the server management device 2i recognizes that the processor 8c has been disconnected, it records this fact (the number of failed processors) in its own failure information (step 160).

【００５８】サーバ管理装置２ｉは、更新されたＰＣサ
ーバ１ｉに関する構成情報と障害情報に基づき、ＰＣサ
ーバ１ｉの実効性能点数を計算し、ＰＣサーバ１ｉの構
成情報に記録する（ステップ１７０）。図４に示した例
によると、ＰＣサーバ１ｂについては縮退プロセッサ数
が２、重みが１であることから、実効性能点数を（基本
性能点数−２）として計算している。他のＰＣサーバに
ついては縮退プロセッサ数が０であることから、基本性
能点数をそのまま実効性能点数としている。
The server management device 2i calculates the effective performance score of the PC server 1i from the updated configuration information and failure information of the PC server 1i, and records it in the configuration information of the PC server 1i (step 170). In the example shown in FIG. 4, the PC server 1b has 2 degraded processors and a weight of 1, so its effective performance score is calculated as (basic performance score - 2). The other PC servers have 0 degraded processors, so their basic performance scores are used unchanged as their effective performance scores.

【００５９】サーバ管理装置２ｉは、更新されたＰＣサ
ーバ１ｉに関する障害情報に基づき、ＰＣサーバ１ｉの
信頼度点数を計算し、ＰＣサーバ１ｉの障害情報に記録
する（ステップ１８０）。図４に示した例によると、Ｐ
Ｃサーバ１ａ、ＰＣサーバ１ｂ、ＰＣサーバ１ｄ及びＰ
Ｃサーバ１ｎについては、稼動状況が稼動状態にあるた
め、障害回数をそのまま信頼度点数として記録している
が、ＰＣサーバ１ｃについては、稼動状況が停止状態に
あるため、信頼度点数を障害回数＋１００として記録し
ている。
The server management device 2i calculates the reliability score of the PC server 1i from the updated failure information of the PC server 1i, and records it in the failure information of the PC server 1i (step 180). In the example shown in FIG. 4, the PC servers 1a, 1b, 1d, and 1n are operating, so their numbers of failures are recorded unchanged as their reliability scores, whereas the PC server 1c is stopped, so its reliability score is recorded as the number of failures + 100.

【００６０】サーバ管理装置２ｉは、このようにしてＰ
Ｃサーバ１ｉの構成情報と障害情報を自動生成すると、
不揮発性メモリ２５の内容を更新し、更に当該構成情報
及び障害情報を含んだ初期化完了通報を通信路３を介し
て他の全てのサーバ管理装置２ａ〜２ｎに発信する（ス
テップ１９０）。その一方、他のサーバ管理装置２ａ〜
２ｎからの初期化完了通報あるいは障害通報を受信す
る。この段階で、サーバ管理装置２ｉは、全てのＰＣサ
ーバ１ａ〜１ｎに関する構成情報と障害情報を他のサー
バ管理装置２ａ〜２ｎから通信路３を介して取得し、不
揮発性メモリ２５に逐次記録する。
Having automatically generated the configuration information and failure information of the PC server 1i in this way, the server management device 2i updates the contents of the nonvolatile memory 25 and sends an initialization completion report containing that configuration information and failure information to all the other server management devices 2a to 2n via the communication path 3 (step 190). Meanwhile, it receives initialization completion reports or failure reports from the other server management devices 2a to 2n. At this stage, the server management device 2i obtains the configuration information and failure information on all the PC servers 1a to 1n from the other server management devices 2a to 2n via the communication path 3 and records it in the nonvolatile memory 25 as it arrives.

【００６１】サーバ管理装置２ｉは、全ＰＣサーバ１ａ
〜１ｎの構成情報と障害情報及び全外部入出力装置５ａ
〜５ｍの構成情報に基づいて全ＰＣサーバ１ａ〜１ｎと
全外部入出力装置５ａ〜５ｍの接続先を決定する（ステ
ップ２００）。この結果、ＰＣサーバ１ｉの接続先を外
部入出力装置５ｉに決定したものとする。サーバ管理装
置２ｉは、この決定結果からＰＣサーバ１ｉの外部入出
力装置接続装置１１に、ＰＣサーバ１ｉの接続先となる
外部入出力装置５ｉを指示して、ＰＣサーバ１ｉと外部
入出力装置５ｉを接続する（ステップ２１０）。なお、
各サーバ管理装置２ａ〜２ｎは、個々に同様の処理を行
うことによりそれぞれを搭載したＰＣサーバ１ａ〜１ｎ
の接続先を決定した結果、ＰＣサーバ数が外部入出力装
置数以下であれば基本的には全ＰＣサーバ１ａ〜１ｎの
接続先が決定されることになる。The server management device 2i determines the connection destinations of all the PC servers 1a to 1n and all the external input/output devices 5a to 5m based on the configuration information and failure information of all the PC servers 1a to 1n and the configuration information of all the external input/output devices 5a to 5m (step 200). Assume that, as a result, the connection destination of the PC server 1i is determined to be the external input/output device 5i. Based on this result, the server management device 2i instructs the external input/output device connection device 11 of the PC server 1i to use the external input/output device 5i as the connection destination, and the PC server 1i is connected to the external input/output device 5i (step 210). Each of the server management devices 2a to 2n individually performs the same processing to determine the connection destination of the PC server on which it is mounted, so if the number of PC servers is equal to or less than the number of external input/output devices, the connection destinations of all the PC servers 1a to 1n are basically determined.

【００６２】外部入出力装置接続装置１１は、サーバ管
理装置２ｉの指示により選択された外部入出力装置５ｉ
に外部入出力切り換え機構４を経由して接続し、接続先
オンラインの状態とする。これによって、ＰＣサーバ１
ｉの初期化処理完了後、ＯＳ６ｉのロード装置のオンラ
イン待ち状態にあったプロセッサ８ａ〜８ｎは、接続さ
れた外部入出力装置５ｉに搭載されたＯＳ６ｉをメモリ
９へロードしてシステム立上げを開始する（ステップ２
２０）。The external input/output device connection device 11 connects, via the external input/output switching mechanism 4, to the external input/output device 5i selected by the instruction from the server management device 2i, and brings the connection destination online. As a result, the processors 8a to 8n, which have been waiting for the loading device of the OS 6i to come online since the initialization of the PC server 1i was completed, load the OS 6i installed on the connected external input/output device 5i into the memory 9 and start system startup (step 220).

【００６３】図７は、図６における初期化完了通報処理
（ステップ１９０）の詳細手順を示したフローチャート
である。本手順によって、サーバ管理装置２ｉは、ＰＣ
サーバ１ｉの構成情報と障害情報を他の全てのサーバ管
理装置２ａ〜２ｎに通報すると共に、他の全てのサーバ
管理装置２ａ〜２ｎの構成情報と障害情報を取得する。
ここでは、既定時間に受信した初期化完了通報及び障害
通報の数によって、全てのＰＣサーバ１ａ〜１ｎの台数
を求めており、電源投入前のシステム構成の変更にも柔
軟に対応することができる。他のサーバ管理装置２ａ〜
２ｎも同時に同じ処理を行うことで、全てのサーバ管理
装置２ａ〜２ｎは全てのＰＣサーバ１ａ〜１ｎの構成情
報と障害情報を共有する。FIG. 7 is a flowchart showing the detailed procedure of the initialization completion report processing (step 190) in FIG. 6. Through this procedure, the server management device 2i reports the configuration information and failure information of the PC server 1i to all the other server management devices 2a to 2n and acquires the configuration information and failure information held by all the other server management devices 2a to 2n. Here, the number of PC servers 1a to 1n is obtained from the number of initialization completion reports and failure reports received within a predetermined time, so a change in the system configuration made before power-on can be handled flexibly. Because the other server management devices 2a to 2n perform the same processing at the same time, all the server management devices 2a to 2n share the configuration information and failure information of all the PC servers 1a to 1n.

【００６４】サーバ管理装置２ｉは、ＰＣサーバ１ｉの
構成情報と障害情報を含んだ初期化完了通報を、他の全
てのサーバ管理装置２ａ〜２ｎに発信する（ステップ１
９１）。ここで発信される構成情報と障害情報は、サー
バグループ、実効性能点数、信頼度点数等の外部入出力
装置５ａ〜５ｍを決定するために最低限必要な情報のみ
である。サーバ管理装置２ｉは、全てのＰＣサーバ１ａ
〜１ｎの台数を求めるために初期値ｎ＝１を設定する
（ステップ１９２）。なお、自己のＰＣサーバ１ｉの分
を考慮してｎ＝１としている。次に、サーバ管理装置２
ｉは、他のＰＣサーバ１ａ〜１ｎからの初期化処理完了
通報あるいは障害通報待ち時間を設定する（ステップ１
９３）。この待ち時間は、全てのＰＣサーバ１ａ〜１ｎ
の初期化処理が完了するに十分な値とする。そして、サ
ーバ管理装置２ｉは、待ち時間が終了するまでの間、以
下の処理を繰り返し行う。The server management device 2i transmits an initialization completion report containing the configuration information and failure information of the PC server 1i to all the other server management devices 2a to 2n (step 191). The configuration information and failure information transmitted here are limited to the minimum needed to determine the external input/output devices 5a to 5m, such as the server group, the effective performance score, and the reliability score. To obtain the number of PC servers 1a to 1n, the server management device 2i sets the initial value n = 1 (step 192); n starts at 1 to account for its own PC server 1i. Next, the server management device 2i sets a waiting time for initialization completion reports or failure reports from the other PC servers 1a to 1n (step 193). This waiting time is set long enough for the initialization of all the PC servers 1a to 1n to complete. The server management device 2i then repeats the following processing until the waiting time expires.

【００６５】まず、受信した情報が初期化完了通報の場
合、受信した構成情報と障害情報を不揮発性メモリ２５
に記録し（ステップ１９６）、サーバ台数ｎに１を加算
する（ステップ１９７）。一方、受信した情報が障害通
報の場合、受信した障害情報を不揮発性メモリ２５に記
録し（ステップ１９９）、サーバ台数ｎに１を加算する
（ステップ１９７）。この処理の結果、各情報を不揮発
性メモリ２５に記録しながらＰＣサーバ１ａ〜１ｎの台
数を求めることができる。First, if the received information is an initialization completion report, the received configuration information and failure information are recorded in the nonvolatile memory 25 (step 196), and 1 is added to the server count n (step 197). If the received information is a failure report, the received failure information is recorded in the nonvolatile memory 25 (step 199), and 1 is added to the server count n (step 197). As a result, the number of PC servers 1a to 1n is obtained while each piece of information is recorded in the nonvolatile memory 25.
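The counting loop of steps 192 to 199 can be sketched as below. This is an illustration under stated assumptions, not the patented implementation: the communication path 3 is modelled as an in-process `queue.Queue`, the nonvolatile memory 25 as a plain dictionary, and the message layout `(kind, server_id, payload)` is hypothetical.

```python
import queue
import time


def collect_reports(inbox: queue.Queue, wait_seconds: float, store: dict) -> int:
    """Count PC servers from the reports received within the waiting time.

    Each message is a hypothetical tuple (kind, server_id, payload), where
    kind is "init_complete" or "failure".
    """
    n = 1  # step 192: start at 1 to count this device's own PC server
    deadline = time.monotonic() + wait_seconds  # step 193
    while (remaining := deadline - time.monotonic()) > 0:
        try:
            kind, server_id, payload = inbox.get(timeout=remaining)
        except queue.Empty:
            break  # waiting time expired with no further report
        if kind in ("init_complete", "failure"):
            store[server_id] = payload  # steps 196 / 199: record the report
            n += 1                      # step 197: one more server seen
    return n


inbox = queue.Queue()
inbox.put(("init_complete", "1a", {"group": "A"}))
inbox.put(("failure", "1c", {"failures": 3}))
memory = {}
print(collect_reports(inbox, 0.05, memory))  # 3
```

Because the count comes from the reports actually received, a server added or removed before power-on changes n without any reconfiguration, which is the flexibility the text points to.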

【００６６】図８は、図６における外部入出力装置接続
先決定（ステップ２００）の詳細手順を示したフローチ
ャートである。本実施の形態におけるＰＣサーバ１ａ〜
１ｎの接続先となる外部入出力装置５ａ〜５ｍは、全て
のサーバ管理装置２ａ〜２ｎが全てのＰＣサーバ１ａ〜
１ｎの構成情報と障害情報及び全ての外部入出力装置５
ａ〜５ｍの構成情報を共有することによってそれぞれに
自律判断で決定される。以下、図４、図５及び図８によ
り、接続先の決定手順をＰＣサーバ１ｉにおける動作と
して説明する。他のＰＣサーバ１ａ〜１ｎにおいても、
同一の手順によって、接続先を決定することになる。FIG. 8 is a flowchart showing the detailed procedure for determining the external input/output device connection destination (step 200) in FIG. 6. In this embodiment, the external input/output devices 5a to 5m to which the PC servers 1a to 1n connect are determined autonomously by each of the server management devices 2a to 2n, all of which share the configuration information and failure information of all the PC servers 1a to 1n and the configuration information of all the external input/output devices 5a to 5m. The procedure for determining the connection destination is described below as the operation in the PC server 1i, with reference to FIGS. 4, 5, and 8. The other PC servers 1a to 1n determine their connection destinations by the same procedure.

【００６７】図８の手順開始時点では、全てのＰＣサー
バ１ａ〜１ｎの構成情報と障害情報及び全ての外部入出
力装置５ａ〜５ｍの構成情報は、初期化処理完了時点の
状態に更新されており、分散システム全体としては、こ
れらの情報を元に接続先を決定することによって業務処
理が開始できる状態にある。At the start of the procedure in FIG. 8, the configuration information and fault information of all the PC servers 1a to 1n and the configuration information of all the external input / output devices 5a to 5m are updated to the state at the time of completion of the initialization processing. Thus, the entire distributed system is in a state where business processing can be started by determining a connection destination based on such information.

【００６８】サーバ管理装置２ｉは、図４における実効
性能点数と信頼度点数から、全てのＰＣサーバ１ａ〜１
ｎの接続順位を決定する（ステップ２０１）。サーバ接
続順位は、まず実効性能点数の高いものから高順位と
し、実効性能点数が同じものについては、信頼度点数の
少ないものを高順位とする。この結果、依然として同一
順位に並ぶものについては、図４においてＰＣサーバ１
ａ〜１ｎのサーバ番号の若いものを高順位とする。図４
に示した例では、サーバグループＩＭについて実効性能
点数の最も高いＰＣサーバ１ｎをサーバ接続順位１と
し、実効性能点数が等しいＰＣサーバ１ａ〜１ｃについ
ては、信頼度点数の小さいものから順に、すなわちＰＣ
サーバ１ａをサーバ接続順位２、ＰＣサーバ１ｂをサー
バ接続順位３としている。ＰＣサーバ１ｃは、信頼度点
数が１００を越えていることから接続不能と判断し、サ
ーバ接続順位０に設定している。The server management device 2i determines the connection order of all the PC servers 1a to 1n from the effective performance scores and reliability scores in FIG. 4 (step 201). For the server connection order, servers with higher effective performance scores are ranked higher, and among servers with the same effective performance score, those with lower reliability scores are ranked higher. Servers that are still tied are ranked by server number in FIG. 4, with the lower number ranked higher. In the example shown in FIG. 4, for the server group IM, the PC server 1n, which has the highest effective performance score, is given server connection order 1, and the PC servers 1a to 1c, which have equal effective performance scores, are ranked in ascending order of reliability score: the PC server 1a is given server connection order 2 and the PC server 1b server connection order 3. The PC server 1c is judged to be unconnectable because its reliability score exceeds 100, and its server connection order is set to 0.
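The three-level ordering of step 201 maps naturally onto a sort with a composite key. The sketch below is an assumption-laden illustration: the `Server` record and its field names are hypothetical, the scores are made-up values chosen to reproduce the FIG. 4 outcome described in the text, and the "exceeds 100" test is implemented literally as a strict comparison.

```python
from dataclasses import dataclass

UNCONNECTABLE_ABOVE = 100  # a reliability score above this means "unconnectable"


@dataclass
class Server:
    number: int            # position in the server table (lower wins ties)
    name: str
    effective_score: int   # higher is better
    reliability_score: int  # lower is better


def server_connection_order(servers: list[Server]) -> dict[str, int]:
    """Map server name to connection order within one server group (0 = unconnectable)."""
    usable = [s for s in servers if not s.reliability_score > UNCONNECTABLE_ABOVE]
    ranked = sorted(
        usable,
        key=lambda s: (-s.effective_score, s.reliability_score, s.number),
    )
    order = {s.name: rank for rank, s in enumerate(ranked, start=1)}
    for s in servers:
        order.setdefault(s.name, 0)  # servers filtered out above get order 0
    return order


group = [
    Server(1, "1a", 8, 1),
    Server(2, "1b", 8, 2),
    Server(3, "1c", 8, 103),  # stopped: failure count 3 + 100
    Server(4, "1n", 10, 0),
]
print(server_connection_order(group))
# {'1n': 1, '1a': 2, '1b': 3, '1c': 0}
```

Every server management device runs the same deterministic sort over the same shared data, so all of them arrive at the same order without a central coordinator.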

【００６９】続いて、サーバ管理装置２ｉは、図５にお
ける外部入出力装置５ａ〜５ｍの構成情報から、外部入
出力装置５ａ〜５ｍの装置接続順位を決定する（ステッ
プ２０２）。装置接続順位は、図５に示した性能指標の
順に順位付けを行い、同一順位のものについては、信頼
度指標によって順位付けを行う。性能指標と信頼度指標
は、共にサーバグループ単位に設定した順位を表してお
り、値の小さいものが高順位であり、より高性能で高信
頼度のＰＣサーバを必要としていることを表す。なお、
本実施の形態においては、性能指標と信頼度指標によっ
て間接的に装置接続順位を表現すると共に、システム立
上げ時に毎回サーバ管理装置２ｉによって装置接続順位
を算出しなおす手順としているが、システム構築時にシ
ステム設計者によって一意に決定するようにしてもよ
い。Next, the server management device 2i determines the device connection order of the external input/output devices 5a to 5m from their configuration information in FIG. 5 (step 202). The device connection order is assigned in order of the performance index shown in FIG. 5, and devices with the same rank are ordered by the reliability index. The performance index and the reliability index both represent ranks set per server group; a smaller value means a higher rank and indicates that the device requires a PC server with higher performance and higher reliability. In this embodiment, the device connection order is expressed indirectly through the performance index and the reliability index, and the server management device 2i recalculates the device connection order each time the system is started up; alternatively, the system designer may fix the order when the system is built.
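The device-side ordering of step 202 can be sketched the same way as a two-key sort. The tuple layout and the index values below are hypothetical; the only rule taken from the text is that smaller index values rank higher, with the reliability index breaking ties in the performance index.

```python
def device_connection_order(devices: list[tuple[str, int, int]]) -> dict[str, int]:
    """Rank the devices of one server group.

    devices: (device_id, performance_index, reliability_index);
    smaller index values mean a higher rank.
    """
    ranked = sorted(devices, key=lambda d: (d[1], d[2]))
    return {dev_id: rank for rank, (dev_id, _, _) in enumerate(ranked, start=1)}


print(device_connection_order([("5b", 2, 1), ("5a", 1, 2), ("5c", 1, 1)]))
# {'5c': 1, '5a': 2, '5b': 3}
```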

【００７０】サーバ管理装置２ｉは、自身が管理するＰ
Ｃサーバ１ｉのサーバ接続順位に基づいて外部入出力装
置５ａ〜５ｍの構成情報を取得し（ステップ２０３）、
ＰＣサーバ１ｉと同一のサーバグループでかつ同じ装置
接続順位の外部入出力装置５ｉを検索する（ステップ２
０４，２０５）。これを該当する外部入出力装置５ｉが
見つかるまで繰り返し行い（ステップ２０６）、該当す
る外部入出力装置５ｉが見つかれば、それをＰＣサーバ
１ｉの接続先とする（ステップ２０７）。The server management device 2i acquires the configuration information of the external input/output devices 5a to 5m based on the server connection order of the PC server 1i that it manages (step 203), and searches for an external input/output device 5i that belongs to the same server group as the PC server 1i and has the same device connection order (steps 204 and 205). This is repeated until a matching external input/output device 5i is found (step 206); when one is found, it becomes the connection destination of the PC server 1i (step 207).

【００７１】一方、上記検索の結果、該当する外部入出
力装置５ｉがなかった場合には、接続先無しと判断し、
サーバ管理装置２ｉは、ＰＣサーバ１ｉの接続待ち状態
を保持することで代替サーバとして待機させる（ステッ
プ２０８）。業務処理開始後、業務処理実行中のＰＣサ
ーバ１ａ〜１ｎに障害が発生した場合には、待機状態に
あるＰＣサーバ１ｉが代替運転を行う。If, on the other hand, the search finds no matching external input/output device 5i, the server management device 2i judges that there is no connection destination and keeps the PC server 1i in the connection waiting state so that it stands by as a substitute server (step 208). If a failure occurs in one of the PC servers 1a to 1n executing business processing after business processing has started, the PC server 1i in the standby state performs the alternative operation.
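Steps 203 to 208 amount to a lookup by (server group, rank). The sketch below is illustrative only: the tuple layout is hypothetical, and treating server connection order 0 as never matching is an assumption consistent with the "unconnectable" meaning given to order 0.

```python
from typing import Optional


def choose_device(server_group: str, server_rank: int,
                  devices: list[tuple[str, str, int]]) -> Optional[str]:
    """Return the device id to connect to, or None to stand by.

    devices: (device_id, server_group, device_rank).
    """
    if server_rank == 0:
        return None  # assumption: an unconnectable server never matches
    for dev_id, group, rank in devices:  # steps 204 to 206: scan the devices
        if group == server_group and rank == server_rank:
            return dev_id                # step 207: matching device found
    return None                          # step 208: no match, stand by


devices = [("5a", "A", 1), ("5b", "A", 2), ("5c", "B", 1)]
print(choose_device("A", 2, devices))  # 5b
print(choose_device("A", 3, devices))  # None
```

A server whose rank exceeds the number of devices in its group falls through to `None`, which is how surplus servers end up as standby substitutes.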

【００７２】以上のように、本実施の形態によれば、Ｐ
Ｃサーバ１ａ〜１ｎと外部入出力装置５ａ〜５ｍとをそ
れぞれに接続順位を決定し、各ＰＣサーバ１ａ〜１ｎを
同一のサーバグループでかつ同じ装置接続順位の外部入
出力装置５ａ〜５ｍとを接続するようにしたので、各Ｐ
Ｃサーバ１ａ〜１ｎにそれぞれの処理能力に応じた適切
な外部入出力装置５ａ〜５ｍを接続して業務処理を行わ
せることができる。特に、本実施の形態によれば、基本
性能点数で表したＰＣサーバ１ａ〜１ｎが諸元等による
本来的に持つ処理能力ではなく、プロセッサの縮退等構
成の変更を考慮した現実の構成に基づく実効性能点数を
用いてサーバ接続順位を求めているので、システム立上
げ時点における処理能力に応じて各ＰＣサーバ１ａ〜１
ｎを最適な外部入出力装置５ａ〜５ｍと接続させること
ができる。As described above, according to this embodiment, a connection order is determined separately for the PC servers 1a to 1n and for the external input/output devices 5a to 5m, and each of the PC servers 1a to 1n is connected to the external input/output device in the same server group with the same device connection order. Each of the PC servers 1a to 1n can therefore be connected to an external input/output device 5a to 5m suited to its processing capacity and made to perform business processing. In particular, the server connection order is obtained not from the basic performance score, which expresses the processing capacity a PC server inherently has by its specifications, but from the effective performance score, which reflects the actual configuration including changes such as processor degeneration. Each of the PC servers 1a to 1n can therefore be connected to the optimal external input/output device 5a to 5m according to its processing capacity at the time of system startup.

【００７３】次に、待機中のＰＣサーバ１ｉが代替運転
を開始する場合の動作を、図９と図１０を用いて説明す
る。図９は、システム立上げ時のＰＣサーバ１ａ〜１ｎ
と外部入出力装置５ａ〜５ｍの接続状況、代替運転開始
後の接続状況、ＰＣサーバ１ｊ復旧後の接続状況及びＰ
Ｃサーバ１ｉ再立上げ時の接続状況を表形式で示した図
である。サーバ管理装置２ａ〜２ｎが管理する構成情報
と障害情報のうちサーバグループ、サーバ接続順位と稼
動状況を示すとともに、新たに、外部入出力装置接続状
態も示した。外部入出力装置接続状態というのは、各Ｐ
Ｃサーバ１ａ〜１ｎの接続先に関する情報であり、いず
れかの外部入出力装置５ａ〜５ｍと接続されている場合
にはその外部入出力装置５ａ〜５ｍの識別情報が、いず
れの外部入出力装置５ａ〜５ｍとも接続されていない場
合にはその旨「なし」が設定される。全ＰＣサーバ１ａ
〜１ｎにおけるシステム立上げ時には、各ＰＣサーバ１
ａ〜１ｎ間で情報の交換を行い、サーバ接続順位と装置
接続順位を一対一に対応させるため、外部入出力装置接
続状態をサーバ接続順位と装置接続順位から直接知るこ
とができる。しかし、稼動中のＰＣサーバ１ａ〜１ｎの
状態をそのままにして代替運転をする場合には、サーバ
接続順位と装置接続順位の対応関係を一部崩すため、外
部入出力装置接続状態をサーバ接続順位と装置接続順位
から直接知ることができなくなる。すなわち、代替運転
をする場合、その代替運転をするＰＣサーバ１ｉは、シ
ステム立上げ時に他のＰＣサーバ１ａ〜１ｎに対してサ
ーバグループ等の情報を送出することになるが、稼動中
の各ＰＣサーバ１ａ〜１ｎは、ＰＣサーバ１ｉに対して
接続先に関する情報を送信しないので、対応関係が取れ
なくなるからである。図９（ａ）では、外部入出力装置
接続状態を直接示すことで、接続状態の遷移状況を表し
た。代替運転を制御するため、サーバ管理装置２ａ〜２
ｎは、外部入出力装置接続状態を構成情報の中に記録す
る。Next, the operation when the standby PC server 1i starts the alternative operation is described with reference to FIGS. 9 and 10. FIG. 9 shows, in table form, the connection status between the PC servers 1a to 1n and the external input/output devices 5a to 5m at system startup, after the start of the alternative operation, after the PC server 1j is restored, and when the PC server 1i is restarted. Of the configuration information and failure information managed by the server management devices 2a to 2n, it shows the server group, the server connection order, and the operating status, and it additionally shows the external input/output device connection state. The external input/output device connection state is information on the connection destination of each of the PC servers 1a to 1n: if the server is connected to one of the external input/output devices 5a to 5m, it holds the identification information of that device; if the server is connected to none of them, it is set to "none". When the system is started up on all the PC servers 1a to 1n, the PC servers exchange information with each other and the server connection order and the device connection order are put in one-to-one correspondence, so the external input/output device connection state can be derived directly from the two orders. When the alternative operation is performed while the operating PC servers 1a to 1n are left as they are, however, the correspondence between the server connection order and the device connection order is partly broken, and the connection state can no longer be derived directly from the two orders. This is because the PC server 1i that performs the alternative operation sends information such as its server group to the other PC servers 1a to 1n at its startup, but the operating PC servers 1a to 1n do not send information on their connection destinations to the PC server 1i, so the correspondence cannot be established. In FIG. 9(a), the external input/output device connection state is shown directly to represent how the connection state changes. To control the alternative operation, the server management devices 2a to 2n record the external input/output device connection state in the configuration information.

【００７４】図９（ｂ）では、ＰＣサーバ１ｊが障害を
起こして停止し、待機中のＰＣサーバ１ｉがＰＣサーバ
１ｊが接続していた外部入出力装置５ｄに接続し、代替
運転を開始した状況を示している。図９（ｃ）では、Ｐ
Ｃサーバ１ｊの復旧時にＰＣサーバ１ｊの信頼性が向上
したと仮定し、外部入出力装置接続状態はそのままでサ
ーバ接続順位のみが変更された状況を示している。ま
た、図９（ｄ）では、ＰＣサーバ１ｉ再立ち上げ時にサ
ーバ接続順位の見直しによって、再び、ＰＣサーバ１ｊ
が外部入出力装置５ａと接続して業務処理を開始した様
子を示している。FIG. 9(b) shows the situation in which the PC server 1j has failed and stopped, and the standby PC server 1i has connected to the external input/output device 5d that the PC server 1j had been connected to and started the alternative operation. FIG. 9(c) assumes that the reliability of the PC server 1j improved when it was restored, and shows the situation in which only the server connection order has changed while the external input/output device connection state is unchanged. FIG. 9(d) shows the PC server 1j once again connected to the external input/output device 5a and starting business processing after the server connection order was reviewed when the PC server 1i was restarted.

【００７５】図１０は、障害を起こしたＰＣサーバ１ｊ
を管理するサーバ管理装置２ｊが障害発生を他のサーバ
管理装置２ａ〜２ｎへ通報した際の、サーバ管理装置２
ａ〜２ｎの動作を示したフローチャートである。以下、
このフローチャートに基づき待機中のＰＣサーバ１ｉが
代替運転を開始する動作について説明する。FIG. 10 is a flowchart showing the operation of the server management devices 2a to 2n when the server management device 2j, which manages the failed PC server 1j, reports the failure to the other server management devices 2a to 2n. The operation in which the standby PC server 1i starts the alternative operation is described below based on this flowchart.

【００７６】業務処理開始後、ＰＣサーバ１ｊに障害が
発生し動作不能になると、ＰＣサーバ１ｊを管理してい
るサーバ管理装置２ｊは、外部入出力装置５ａとの接続
を切断した上で、他の全てのサーバ管理装置２ａ〜２ｎ
に障害発生を通信路３を介して通報する。障害発生の通
報を受信した各サーバ管理装置２ａ〜２ｎは、ＰＣサー
バ１ｊの障害情報に稼動状況として停止を記録すると共
に、接続先を外部入出力装置５ａから無しに変更する
（ステップ３０１）。When a failure occurs in the PC server 1j after business processing has started and the server becomes inoperable, the server management device 2j that manages the PC server 1j disconnects it from the external input/output device 5a and then reports the failure to all the other server management devices 2a to 2n via the communication path 3. Each of the server management devices 2a to 2n that receives the failure report records "stopped" as the operating status in the failure information of the PC server 1j and changes its connection destination from the external input/output device 5a to "none" (step 301).

【００７７】各サーバ管理装置２ａ〜２ｎは、ＰＣサー
バ１ｊの障害情報と構成情報を変更した後、障害情報の
稼動状況から待機ＰＣサーバの有無を調べる（ステップ
３０２）。待機ＰＣサーバがない場合は、サーバ管理装
置２ｊ以外は処理を終了する。サーバ管理装置２ｊのみ
は、代替運転ができないことから、保守員による復旧作
業が必要なため、ＰＣサーバ１ｊ停止の警報を出す（ス
テップ３０３）。After changing the failure information and configuration information of the PC server 1j, each of the server management devices 2a to 2n checks the operating status in the failure information to see whether a standby PC server exists (step 302). If there is no standby PC server, all the server management devices other than 2j end the processing. Only the server management device 2j issues an alarm that the PC server 1j has stopped, because the alternative operation is not possible and recovery work by maintenance staff is required (step 303).

【００７８】待機ＰＣサーバが存在する場合、各サーバ
管理装置２ａ〜２ｎは、待機ＰＣサーバの中から最もサ
ーバ接続順位の高いＰＣサーバ１ｉを代替ＰＣサーバに
決定する（ステップ３０４）。この結果、サーバ管理装
置２ａ〜２ｎは、ＰＣサーバ１ｉの稼動状況を稼動に変
更すると共に、外部入出力装置接続状況を外部入出力装
置５ａとする。If there is a standby PC server, each of the server management apparatuses 2a to 2n determines the PC server 1i having the highest server connection order among the standby PC servers as the substitute PC server (step 304). As a result, the server management devices 2a to 2n change the operating status of the PC server 1i to active and set the external input / output device connection status to the external input / output device 5a.

【００７９】サーバ管理装置２ａ〜２ｎは、自己が代替
運転を開始するＰＣサーバ１ｉを管理するサーバ管理装
置２ｉであった場合には、外部入出力装置５ａとの接続
を行い（ステップ３０５，３０６）、システム立上げを
行って代替運転を開始する（ステップ３０７）。それ以
外のサーバ管理装置２ａ〜２ｎは、障害情報と構成情報
の更新のみで処理を終了する。これによって、全てのサ
ーバ管理装置２ａ〜２ｎは、同一の構成情報と障害情報
を共有する。Among the server management devices 2a to 2n, the one that is the server management device 2i managing the PC server 1i that is to start the alternative operation connects to the external input/output device 5a (steps 305 and 306), starts up the system, and begins the alternative operation (step 307). The other server management devices 2a to 2n end the processing after only updating the failure information and configuration information. As a result, all the server management devices 2a to 2n share the same configuration information and failure information.
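The failure handling of steps 301 to 304 can be sketched as one update on the shared table that every server management device applies identically. The table layout, field names, and status strings below are hypothetical. The sketch also assumes that standby servers carry a connection order of 1 or more and that a numerically smaller order means a higher rank, as in step 201.

```python
from typing import Optional


def handle_failure_report(table: dict[str, dict], failed: str) -> Optional[str]:
    """Apply a failure report to the shared table; return the substitute or None.

    table maps server name to a record with keys
    "status" ("running", "standby" or "stopped"), "rank" and "device".
    """
    freed_device = table[failed]["device"]
    table[failed]["status"] = "stopped"  # step 301: record the stop
    table[failed]["device"] = None       # step 301: connection becomes "none"

    # step 302: look for standby PC servers
    standby = [name for name, rec in table.items() if rec["status"] == "standby"]
    if not standby:
        return None  # step 303: no substitute; device 2j raises the alarm

    # step 304: the standby server with the highest connection order takes over
    substitute = min(standby, key=lambda name: table[name]["rank"])
    table[substitute]["status"] = "running"
    table[substitute]["device"] = freed_device
    return substitute


table = {
    "1j": {"status": "running", "rank": 1, "device": "5a"},
    "1i": {"status": "standby", "rank": 4, "device": None},
    "1k": {"status": "standby", "rank": 5, "device": None},
}
print(handle_failure_report(table, "1j"))  # 1i
```

Only the device whose own server is returned as the substitute goes on to connect and boot (steps 305 to 307); the others stop after this table update, which keeps every copy of the table identical.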

【００８０】なお、障害を起こしたＰＣサーバ１ｊは、
外部入出力装置５ａ〜５ｍとの接続を全て切断した状態
であるため、他のＰＣサーバ１ａ〜１ｎによる業務処理
を続けたままで、電源の切断、故障箇所の復旧及び診断
等を単独で実行することができる。ＰＣサーバ１ｊは、
復旧後、待機状態に留めることによって代替ＰＣサーバ
として利用することができる。また、待機ＰＣサーバが
１台だけ存在した場合は、その待機ＰＣサーバが無条件
に代替運転を行うことになるが、この場合も上記処理手
順に従うことで当該待機ＰＣサーバが常に代替ＰＣサー
バとして決定されることになる。Because the failed PC server 1j has been disconnected from all of the external input/output devices 5a to 5m, it can be powered off, repaired, and diagnosed on its own while the other PC servers 1a to 1n continue business processing. After it is restored, the PC server 1j can be kept in the standby state and used as a substitute PC server. If only one standby PC server exists, that server performs the alternative operation unconditionally; in this case too, following the above procedure always results in that standby PC server being chosen as the substitute PC server.

【００８１】ＰＣサーバ１ｉによる代替運転は、ＰＣサ
ーバ１ｊ復旧後におけるシステム全体の再立上げによっ
て解除されるが、システム全体が稼働中であっても、業
務処理の都合でＰＣサーバ１ｉによる代替運転を停止す
ることができる場合には、その時点で引継先のＰＣサー
バ１ｉから再度その業務処理を引き継ぎ、ＰＣサーバ１
ｊで再開させることもできる。この場合には、ＰＣサー
バ１ｉとＰＣサーバ１ｊを一旦共に待機状態にし、再
度、サーバ接続順位を元に戻すことでＰＣサーバ１ｊが
代替ＰＣサーバとして決定されるようにする。この結
果、ＰＣサーバ１ｊを本来の業務処理に復帰させること
ができる。The alternative operation by the PC server 1i is ended by restarting the entire system after the PC server 1j is restored. Even while the entire system is running, however, if business processing allows the alternative operation by the PC server 1i to be stopped, the business processing can at that point be taken back from the PC server 1i and resumed on the PC server 1j. In this case, the PC server 1i and the PC server 1j are both first put in the standby state and the server connection order is restored to its original state, so that the PC server 1j is chosen as the substitute PC server. As a result, the PC server 1j can be returned to its original business processing.

【００８２】次に、ＰＣサーバ１ｊの復旧時の動作を、
図６、図７、図９及び図１１を用いて説明する。図１１
は、ＰＣサーバ１ｊが復旧する際のＰＣサーバ１ｊ以外
のＰＣサーバ１ａ〜１ｎの動作を示したフローチャート
である。他のＰＣサーバ１ａ〜１ｎには、業務処理中の
ＰＣサーバと待機中のＰＣサーバを含む。Next, the operation when the PC server 1j is restored is described with reference to FIGS. 6, 7, 9, and 11. FIG. 11 is a flowchart showing the operation of the PC servers 1a to 1n other than the PC server 1j when the PC server 1j is restored. The other PC servers 1a to 1n include both PC servers performing business processing and PC servers on standby.

【００８３】ＰＣサーバ１ｊは、保守員による復旧作業
の後、電源投入され図６に示した手順に従って初期化処
理を実行する。ここで、システム全体の電源投入時と異
なる点は、既に他のＰＣサーバ１ａ〜１ｎが業務処理を
実行中であるため、他のサーバ管理装置２ａ〜２ｎから
の初期化処理完了報告がそのままでは受信できないこと
と、サーバ接続順位決定後、サーバ接続順位と装置接続
順位に従って外部入出力装置５ａ〜５ｍを勝手に接続で
きない点である。After the recovery work by maintenance staff, the PC server 1j is powered on and executes initialization processing according to the procedure shown in FIG. 6. This differs from power-on of the entire system in two respects: because the other PC servers 1a to 1n are already executing business processing, the initialization completion reports from the other server management devices 2a to 2n cannot be received as they are, and after the server connection order is determined, the PC server 1j cannot simply connect to one of the external input/output devices 5a to 5m according to the server connection order and device connection order.

【００８４】図６において、サーバ管理装置２ｊは、電
源投入によるＰＣサーバ１ｊの初期化処理完了を確認
し、ステップ１９０で初期化処理完了を通報して、他の
サーバ管理装置２ａ〜２ｎからの初期化完了報告を待
つ。これに対応するため、業務処理中のＰＣサーバ１ａ
〜１ｎの各サーバ管理装置２ａ〜２ｎは、サーバ管理装
置２ｊからの初期化処理完了通報を受信すると、図１１
に示した処理を行うことによって、サーバ管理装置２ｊ
が他のサーバ管理装置２ａ〜２ｎと構成情報及び障害情
報を共有できるようにすると共に、ＰＣサーバ１ｊが外
部入出力装置５ａ〜５ｍとの接続処理を行えるようにす
る。In FIG. 6, the server management device 2j confirms that the initialization of the PC server 1j following power-on is complete, reports the completion of initialization in step 190, and waits for initialization completion reports from the other server management devices 2a to 2n. To handle this, when the server management devices 2a to 2n of the PC servers 1a to 1n that are performing business processing receive the initialization completion report from the server management device 2j, they perform the processing shown in FIG. 11. This allows the server management device 2j to share configuration information and failure information with the other server management devices 2a to 2n and allows the PC server 1j to perform connection processing with the external input/output devices 5a to 5m.

【００８５】すなわち、各サーバ管理装置２ａ〜２ｎ
は、復旧したＰＣサーバ１ｊのサーバ管理装置２ｊから
初期化完了報告を受信すると（ステップ３１１）、自Ｐ
Ｃサーバ１ａ〜１ｎの構成情報、障害情報及び外部入出
力装置接続状態を初期化処理完了通報としてサーバ管理
装置２ｊへ発信する（ステップ３１２）。この障害情報
には縮退情報と稼動状況とが含まれている。That is, when each of the server management devices 2a to 2n receives the initialization completion report from the server management device 2j of the restored PC server 1j (step 311), it transmits the configuration information, failure information, and external input/output device connection state of its own PC server 1a to 1n to the server management device 2j as an initialization completion report (step 312). This failure information includes degeneration information and the operating status.

【００８６】図７において、これらの情報を受信したＰ
Ｃサーバ１ｊのサーバ管理装置２ｊは、対応するＰＣサ
ーバ１ａ〜１ｎの構成情報、障害情報と外部入出力装置
接続状態をサーバ管理情報として記録する（ステップ１
９６）。その後、サーバ管理装置２ｊは、通報待ち時間
を終了した時点で、図６の外部入出力装置５ａ〜５ｍの
接続先決定処理（ステップ２１０）へ進む。In FIG. 7, the server management device 2j of the PC server 1j, on receiving this information, records the configuration information, failure information, and external input/output device connection state of the corresponding PC servers 1a to 1n as server management information (step 196). When the report waiting time expires, the server management device 2j proceeds to the connection destination determination processing for the external input/output devices 5a to 5m in FIG. 6 (step 210).

【００８７】図６におけるステップ２１０の外部入出力
装置５ａ〜５ｍの接続先決定処理において、サーバ管理
装置２ｊは、サーバ接続順位と装置接続順位の対応付け
に際し、他のサーバ管理装置２ａ〜２ｎから受信した外
部入出力装置接続状態を優先することによって、業務処
理中のＰＣサーバ１ａ〜１ｎが接続している外部入出力
装置５ａ〜５ｍへの接続を避ける。すなわち、サーバ管
理装置２ｊは、現在メモリ９で保持している外部入出力
装置接続状態を受信した外部入出力装置接続状態で上書
き更新を行う。In the connection destination determination processing for the external input/output devices 5a to 5m in step 210 of FIG. 6, when associating the server connection order with the device connection order, the server management device 2j gives priority to the external input/output device connection states received from the other server management devices 2a to 2n, and so avoids connecting to an external input/output device 5a to 5m that a PC server 1a to 1n performing business processing is already connected to. That is, the server management device 2j overwrites the external input/output device connection state that it currently holds in the memory 9 with the received external input/output device connection state.
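The overwrite described here can be sketched as a dictionary merge in which received entries win, followed by collecting the devices that are already taken. The function name and data layout are hypothetical, and how the restored server then picks among the remaining devices is not specified in this passage, so the sketch stops at computing the occupied set.

```python
from typing import Optional


def merge_connection_state(local: dict[str, Optional[str]],
                           received: dict[str, Optional[str]]) -> set[str]:
    """Overwrite the local connection state with the received one.

    Both dictionaries map server name to the connected device id, or None.
    Returns the set of devices that are already in use.
    """
    local.update(received)  # received state takes priority over local state
    return {dev for dev in local.values() if dev is not None}


local_state = {"1a": None, "1i": None, "1j": None}
in_use = merge_connection_state(local_state, {"1a": "5b", "1i": "5a"})
print(sorted(in_use))  # ['5a', '5b']
```

A device in the returned set is skipped even if the server and device connection orders would otherwise pair it with the restored server.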

【００８８】一方、図１１において、業務処理中のＰＣ
サーバ１ａ〜１ｎの各サーバ管理装置２ａ〜２ｎは、サ
ーバ管理装置２ｊから受信した構成情報と障害情報に基
づいてサーバ管理情報を更新し（ステップ３１３）、そ
の更新結果に基づいてサーバ接続順位を見直す。（ステ
ップ３１４）。但し、これらのＰＣサーバ１ａ〜１ｎは
業務処理中であるため、サーバ管理装置２ａ〜２ｎはそ
のまま処理を終了する（ステップ３１５）。待機中のＰ
Ｃサーバ１ｋについても、外部入出力装置５ａ〜５ｍに
空きが無いことから、待機状態のままとする（ステップ
３１６）。In FIG. 11, meanwhile, the server management devices 2a to 2n of the PC servers 1a to 1n that are performing business processing update their server management information based on the configuration information and failure information received from the server management device 2j (step 313) and review the server connection order based on the result (step 314). Because these PC servers 1a to 1n are performing business processing, however, the server management devices 2a to 2n end the processing there (step 315). The standby PC server 1k also remains in the standby state, because none of the external input/output devices 5a to 5m is free (step 316).

【００８９】ＰＣサーバ１ｊ故障時を示した図９（ｂ）
では、ＰＣサーバ１ｉが、ＰＣサーバ１ｊの代替運転を
行っている様子を示しているが、ＰＣサーバ１ｊの復旧
時を示した図９（ｃ）では、サーバ接続順位を見直した
結果、信頼性が向上してＰＣサーバ１ｊが第４順位から
第３順位になっため、システム構成を再構築する必要が
あることを示している。但し、この時点では、業務処理
が継続中であるため、サーバ接続順位にかかわらず現状
の外部入出力装置接続状態のままとしている。FIG. 9(b), which represents the time of failure of the PC server 1j, shows the PC server 1i performing the alternative operation for the PC server 1j. FIG. 9(c), which represents the time of restoration of the PC server 1j, shows that, as a result of reviewing the server connection order, the PC server 1j has moved from fourth to third rank because its reliability improved, so the system configuration needs to be rebuilt. At this point, however, business processing is still in progress, so the current external input/output device connection state is kept regardless of the server connection order.

【００９０】ＰＣサーバ１ｉの再立上げをすることで、
ＰＣサーバ１ｊを業務処理に復旧させる手順を図６、図
９及び図１１を用いて説明する。この処理は、ＰＣサー
バ１ｉによる業務処理が一旦停止できる状態になること
が前提である。The procedure for returning the PC server 1j to business processing by restarting the PC server 1i is described with reference to FIGS. 6, 9, and 11. This processing presupposes that the business processing by the PC server 1i can be temporarily stopped.

【００９１】ＰＣサーバ１ｉは、業務処理停止後、図６
の手順に従って、再度、初期化処理を実行する。処理手
順は、ＰＣサーバ１ｊの復旧時の手順と同様で、ＰＣサ
ーバ１ｉのサーバ管理装置２ｉの初期化処理完了通報
（ステップ１９０）により、他のサーバ管理装置２ａ〜
２ｎから構成情報、障害情報及び外部入出力装置接続状
況が発信され（ステップ３１１，３１２）、サーバ管理
装置２ｉは、外部入出力装置接続先決定の処理（ステッ
プ２００）を開始する。この場合、ＰＣサーバ１ｉの状
態は変更されていないため、サーバ接続順位には変化は
なく、ステップ２００及びステップ３１４でのサーバ接
続順位見直しの結果は、図９のＰＣサーバ１ｊ復旧時の
状態と同じとなる。但し、ＰＣサーバ１ｉのサーバ接続
順位は待機中のＰＣサーバ１ｊより低いため、外部入出
力装置５ａとの接続はできず、ＰＣサーバ１ｉは待機状
態となる。同時に、全てのサーバ管理装置２ａ〜２ｎ
は、ＰＣサーバ１ｉの外部入出力装置接続状況を外部入
出力装置５ａからなしに変更する。After stopping business processing, the PC server 1i executes initialization processing again according to the procedure in FIG. 6. The procedure is the same as when the PC server 1j was restored: in response to the initialization completion report from the server management device 2i of the PC server 1i (step 190), the other server management devices 2a to 2n transmit their configuration information, failure information, and external input/output device connection state (steps 311 and 312), and the server management device 2i starts the processing for determining the external input/output device connection destination (step 200). Because the state of the PC server 1i has not changed, the server connection order does not change, and the result of reviewing the server connection order in step 200 and step 314 is the same as the state in FIG. 9 when the PC server 1j was restored. However, because the server connection order of the PC server 1i is lower than that of the standby PC server 1j, it cannot connect to the external input/output device 5a, and the PC server 1i enters the standby state. At the same time, all the server management devices 2a to 2n change the external input/output device connection state of the PC server 1i from the external input/output device 5a to "none".

【００９２】At this time, the standby server management device 2j also receives the initialization-complete notification from server management device 2i and performs response processing according to the procedure of FIG. 11. As a result of reviewing the server connection order in step 314, server management device 2j determines external input/output device 5a as its connection destination. Because it is on standby at step 315, server management device 2j proceeds to step 316, judges from the external input/output device connection status determined in step 314 that external input/output device 5a is available for connection, and proceeds to step 317. As a result, PC server 1j connects to external input/output device 5a and begins system startup (step 318).

【００９３】As described above, in this embodiment PC server 1j is returned to business processing by reinitializing PC server 1i. The same effect can also be obtained by notifying all server management devices 2a to 2n that business processing on PC server 1i has stopped, so that the standby PC server 1j is connected to external input/output device 5a and starts business processing.

【００９４】This embodiment has described alternative operation for a PC server that has become inoperable and the return process after its restoration. The same approach also applies to a PC server that is not inoperable but whose processing capacity has dropped, for example because of processor degradation, and that sends an alternative-operation notification so that another PC server takes over.

【００９５】In this embodiment, each of server management devices 2a to 2n selects only one optimal external input/output device from 5a to 5m based on the configuration information and failure information. Alternatively, it may select several devices, or, for example, select only the second-best external input/output device.

【００９６】Embodiment 2. In Embodiment 1, as a rule, PC servers are connected in descending order of processing capacity and reliability, the remaining PC servers with a low server connection order are kept on standby, and when a PC server fails, a highly reliable server among the standby servers takes over. However, when a PC server with low processing capacity is kept on standby and made to substitute for a PC server with high processing capacity, the alternative PC server may become a bottleneck and reduce the processing capacity of the whole system.

【００９７】In this embodiment, therefore, a specific PC server with relatively high processing capacity and reliability is kept on standby as the alternative server from system startup, which prevents a drop in the processing capacity of the whole system. This is achieved by adding, to the configuration information of each PC server, operation control information that makes the server stand by as an alternative server when it starts up. The operation of this embodiment is described below using the example configuration information and failure information shown in FIG. 12.

【００９８】FIG. 12(a) shows, from the configuration information and failure information held by server management devices 2a to 2n, the server group, effective performance score, reliability score, server connection order, operating status, and external input/output connection status, together with a newly added operation control field. In Embodiment 1, the server connection order was determined per server group. Here, even within the same server group, the connection order is determined separately for PC servers 1c and 1e, whose operation control is set to standby, and for PC servers 1a, 1b, and 1n, whose operation control is set to operation. To determine connections to external input/output devices 5a to 5m, the server connection order of PC servers 1a, 1b, and 1n is first compared with the device connection order, and PC servers 1n, 1a, and 1b are connected, in descending server connection order, to external input/output devices 5a, 5b, and 5c respectively. As a result, all external input/output devices 5a, 5b, and 5c of server group IM are connected to one of PC servers 1n, 1a, and 1b, and PC servers 1c and 1e, whose operation control is set to standby, have no connection destination. PC server 1c, which has the highest effective performance score and reliability score, therefore stands by with the first server connection order. In this way, even a PC server with high processing capacity and reliability becomes an alternative PC server if a standby instruction is set for it.
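
The separate ranking and assignment described in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the score values are invented because FIG. 12 is not reproduced here, and ranking by the pair (effective performance score, reliability score) is an assumed ordering rule.

```python
def rank(servers):
    # Assumed rule: higher effective performance first, ties broken by reliability.
    return sorted(servers, key=lambda s: (s["perf"], s["rel"]), reverse=True)

def assign(servers, devices):
    """Rank the "operation" and "standby" groups separately, then connect the
    "operation" group to the devices, which are given in device connection order."""
    active = rank([s for s in servers if s["control"] == "operation"])
    waiting = rank([s for s in servers if s["control"] == "standby"])
    connections = {s["name"]: d for s, d in zip(active, devices)}
    return connections, [s["name"] for s in waiting]

# Hypothetical scores; only the relative order matters for the example.
servers = [
    {"name": "1a", "perf": 80, "rel": 70, "control": "operation"},
    {"name": "1b", "perf": 60, "rel": 60, "control": "operation"},
    {"name": "1n", "perf": 90, "rel": 80, "control": "operation"},
    {"name": "1c", "perf": 100, "rel": 90, "control": "standby"},
    {"name": "1e", "perf": 50, "rel": 50, "control": "standby"},
]
connections, standby = assign(servers, ["5a", "5b", "5c"])
print(connections)  # {'1n': '5a', '1a': '5b', '1b': '5c'}
print(standby)      # ['1c', '1e']
```

With these assumed scores, PC server 1c has the highest scores of all five servers yet remains unconnected, which is the behavior the standby instruction is meant to produce.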

【００９９】If any of PC servers 1a, 1b, and 1n has failed and stopped by this stage, then when the connection destinations of external input/output devices 5a to 5c are determined, external input/output device 5c, which has the lowest device connection order, is left unconnected. As a result, of PC servers 1c and 1e designated as standby, the first-ranked PC server 1c connects to external input/output device 5c and starts business processing.

【０１００】If, for example, PC server 1n stops because of a failure during business processing after the connection destinations of external input/output devices 5a to 5c have been determined, then, as shown in FIG. 12(b), PC server 1c, which has the first connection order in the same server group, is selected as the alternative PC server. External input/output device 5a, to which PC server 1n was connected, is reconnected to PC server 1c, and business processing continues. Because PC server 1c scores higher than PC server 1n in both effective performance and reliability, the processing capacity and reliability of the business processing as a whole do not decrease during alternative operation. In other words, because a standby instruction is set in advance in the operation control information of PC server 1c, which has higher processing capacity, so that it stands by at system startup, processing capacity does not decrease even when PC server 1c executes business processing in place of PC server 1n, whose processing capacity is relatively low.
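
The takeover step can be sketched as a small function. This is an illustrative sketch only: the function name `fail_over` and the dictionary and list representations are assumptions, and the standby list is taken to be already sorted by server connection order.

```python
def fail_over(connections, standby, failed):
    """Return the connection table and standby list after `failed` stops
    during business processing. Inputs are not modified."""
    connections = dict(connections)
    standby = list(standby)
    device = connections.pop(failed)   # device the failed server was using
    if standby:
        substitute = standby.pop(0)    # first-ranked standby server takes over
        connections[substitute] = device
    return connections, standby

connections = {"1n": "5a", "1a": "5b", "1b": "5c"}
standby = ["1c", "1e"]                 # in server connection order
after, waiting = fail_over(connections, standby, "1n")
print(after)    # {'1a': '5b', '1b': '5c', '1c': '5a'}
print(waiting)  # ['1e']
```

The example mirrors the transition from FIG. 12(a) to FIG. 12(b): external input/output device 5a moves from PC server 1n to PC server 1c, and PC server 1e remains on standby.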

【０１０１】When PC server 1n is restored, as shown in FIG. 12(c), PC server 1c is in alternative operation and no external input/output device among 5a to 5m is available to assign to PC server 1n, so PC server 1n enters the standby state.

【０１０２】Business processing on PC server 1c is then temporarily stopped, and PC server 1c is reinitialized and restarted. As shown in FIG. 12(d), this returns PC server 1n to business processing and puts PC server 1c back on standby. The initialization of PC server 1c disconnects external input/output device 5a from PC server 1c. The connection destination of external input/output device 5a is then determined by reviewing the operation control information of PC server 1c, whose initialization has completed, and of the standby PC servers 1e and 1n, and PC server 1n, whose operation control information specifies operation, is chosen as the connection destination.
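
The return step can be sketched as follows. This is a hedged illustration: the function name `redetermine`, the tuple layout, and the scores are hypothetical, and the fallback to standby-designated servers when no "operation" server is waiting is an assumption drawn from the failure case described earlier rather than from this paragraph.

```python
def redetermine(free_device, candidates, connections):
    """Pick the connection destination for a freed device.

    candidates: (name, perf, rel, control) tuples for all unconnected servers.
    Servers whose operation control is "operation" are preferred; the fallback
    to the full candidate list is an assumption.
    """
    pool = [c for c in candidates if c[3] == "operation"] or list(candidates)
    updated = dict(connections)
    if pool:
        best = max(pool, key=lambda c: (c[1], c[2]))  # assumed ranking rule
        updated[best[0]] = free_device
    return updated

# State of FIG. 12(c): 1c is substituting for 1n, which is restored and waiting.
connections = {"1a": "5b", "1b": "5c", "1c": "5a"}
freed = connections.pop("1c")          # reinitializing 1c disconnects 5a
candidates = [
    ("1c", 100, 90, "standby"),        # hypothetical scores
    ("1e", 50, 50, "standby"),
    ("1n", 90, 80, "operation"),
]
result = redetermine(freed, candidates, connections)
print(result)  # {'1a': '5b', '1b': '5c', '1n': '5a'}
```

Although PC server 1c has the higher assumed scores, PC server 1n is chosen because its operation control is set to operation, which matches the state shown in FIG. 12(d).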

【０１０３】The same effect can be obtained by numbering the server connection order of PC servers 1c and 1e, whose operation control is set to standby, so that it continues from the last server connection order of PC servers 1a, 1b, and 1n, whose operation control is set to operation. FIG. 13 shows an example of this setting. In the example of FIG. 13, the server connection order is first determined for PC servers 1a, 1b, and 1n, which are designated for operation, and PC servers 1c and 1e, which are designated for standby, are then assigned orders starting from server connection order 4. At system startup, external input/output devices 5a to 5c are connected to PC servers 1n, 1a, and 1b in that order based on this server connection order, so PC servers 1c and 1e have no connection destination and automatically enter the standby state.
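
The continuous numbering of FIG. 13 can be sketched as a single ranking pass. This is a minimal sketch under assumptions: the scores are invented for illustration and the ordering rule (performance, then reliability) is assumed.

```python
def continuous_order(servers):
    """servers: (name, perf, rel, control) tuples.
    Returns {name: server connection order}, numbering the "operation" group
    first and continuing the numbering through the "standby" group."""
    by_score = lambda s: (s[1], s[2])  # assumed ranking rule
    running = sorted((s for s in servers if s[3] == "operation"), key=by_score, reverse=True)
    waiting = sorted((s for s in servers if s[3] == "standby"), key=by_score, reverse=True)
    return {s[0]: i for i, s in enumerate(running + waiting, start=1)}

servers = [
    ("1a", 80, 70, "operation"), ("1b", 60, 60, "operation"),
    ("1n", 90, 80, "operation"), ("1c", 100, 90, "standby"),
    ("1e", 50, 50, "standby"),
]
order = continuous_order(servers)
devices = ["5a", "5b", "5c"]           # in device connection order
ranked = sorted(order, key=order.get)
connections = dict(zip(ranked, devices))
idle = ranked[len(devices):]
print(order)        # {'1n': 1, '1a': 2, '1b': 3, '1c': 4, '1e': 5}
print(connections)  # {'1n': '5a', '1a': '5b', '1b': '5c'}
print(idle)         # ['1c', '1e']
```

Because the order is one continuous sequence, the matching step needs no knowledge of the operation control field: servers ranked beyond the number of devices simply have nothing to connect to.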

【０１０４】If any of PC servers 1a, 1b, and 1n has failed and stopped by this stage, then when the connection destinations of external input/output devices 5a to 5c are determined, external input/output device 5c, which has the lowest device connection order, is left unconnected. As a result, the fourth-ranked PC server 1c connects to external input/output device 5c and starts business processing.

【０１０５】Because the server connection order is assigned as one continuous sequence, the alternative operation procedure and the recovery procedure used when any of PC servers 1a, 1b, and 1n fails and stops during operation are the same as in Embodiment 1. For example, when PC server 1n stops because of a failure, the fourth-ranked PC server 1c, which is on standby, becomes the alternative PC server and continues business processing.

【０１０６】According to this embodiment, keeping a specific PC server with higher processing capacity and reliability on standby as the alternative server prevents a drop in the processing capacity of the whole system.

【０１０７】In each of the above embodiments, the standby PC server with the highest server connection order is chosen as the alternative PC server. Various other methods of choosing the alternative PC server from among several standby PC servers are also conceivable.

【０１０８】For example, the standby PC server whose processing capacity and reliability are closest to those of the PC server that became inoperable and sent the notification may be chosen as the alternative PC server and made to take over the business processing. This keeps the processing capacity of the whole system nearly the same before and after alternative operation.

【０１０９】When a PC server wants another PC server to take over because its processing capacity has dropped, for example through processor degradation, or when the PC server that became inoperable had been running with degraded processors, the standby PC server whose processing capacity and reliability are closest to those of the PC server at the time of the notification is chosen as the alternative PC server and made to take over the business processing. The outcome is the same as above, so the processing capacity of the whole system can be kept nearly the same before and after alternative operation. Alternatively, the standby PC server closest to the processing capacity and reliability that the PC server originally has, rather than those at the time of the notification, may be chosen as the alternative PC server.
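
Closest-match selection can be sketched as follows. The text does not define how closeness is measured, so the sum of absolute score differences used here is an assumption, as are the function name and all score values.

```python
def closest_standby(reference, standby):
    """reference: (perf, rel) of the notifying server, either at the time of
    the notification or as originally rated.
    standby: {name: (perf, rel)} for the standby servers.
    Closeness is measured as the sum of absolute differences (an assumption)."""
    def distance(name):
        perf, rel = standby[name]
        return abs(perf - reference[0]) + abs(rel - reference[1])
    return min(standby, key=distance)

standby = {"1c": (100, 90), "1e": (50, 50)}   # hypothetical scores
original = (90, 80)    # scores the failed server originally had
degraded = (55, 50)    # scores at notification time, after processor degradation
print(closest_standby(original, standby))   # 1c
print(closest_standby(degraded, standby))   # 1e
```

The two calls show why the choice of reference matters: matching the scores at the time of the notification preserves the capacity the system actually had just before takeover, whereas matching the original scores restores the capacity the system was designed for.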

【０１１０】In Embodiment 2, operation control information is set, so each of the above selection methods can be adopted easily. In Embodiment 1, servers are connected in descending connection order and no operation control information is used, so at system startup the PC servers with low processing capacity end up as the alternatives. However, when the system has run for a long time without going down and alternative operation has occurred many times, a PC server with high processing capacity may have become a standby PC server, so adopting the above selection methods is effective in that case as well.

【０１１１】[0111]

【発明の効果】[Effects of the Invention] According to the present invention, the configuration information of each processing device and external input/output device is shared among the processing devices, and the external input/output device to be connected to each processing device is determined from this information. An external input/output device suited to the processing capacity and reliability of each processing device can therefore be connected to it, and business processing suited to its processing capacity can be assigned. Because failure information such as processor degradation in each processing device is also considered when the connection destination is determined, a drop in processing capacity and reliability caused by a degraded configuration can also be handled appropriately. In other words, because the system configuration is determined from the latest information on the processing capacity and reliability of the processing devices and the OS, an economical and highly reliable distributed system can be built that keeps the balance between the processing capacity and reliability of the whole system in an optimal state at all times, in response to partial stoppage or functional degradation of processing devices caused by failures, improvements in processing capacity or reliability from processing device upgrades, and the addition of processing devices.

【０１１２】Furthermore, according to the present invention, when a processing device fails, the optimal processing device can be selected from the remaining processing devices to carry out business processing without relying on a special OS or OS extensions, so a high-performance, low-cost system can easily be built from open products.

【０１１３】In addition, because the processing device management devices distributed across the processing devices cooperate so that each processing device can determine the system configuration by autonomous judgment, the invention adapts flexibly to growth in the scale of the distributed system. A highly reliable distributed system can be built that performs business processing with an optimal system configuration even when some processing devices or processing device management devices fail.

【０１１４】Even when a processing device performing business processing becomes inoperable, the processing device best suited to alternative operation is selected from the standby processing devices in consideration of processing capacity and reliability and is made to take over the business processing. A drop in the processing capacity of the whole system after alternative operation can therefore be prevented as far as possible.

【０１１５】After a processing device that became inoperable is restored, it can again be made to perform the business processing it had been executing, so the system can be returned to the system environment, with its favorable processing capacity, that it had at system startup.

【０１１６】Because constraint conditions on connection can be defined and only external input/output devices that satisfy them are treated as connection candidates, appropriate connections can be made even when processing devices and external input/output devices of different types are mixed.

【０１１７】Because a specific processing device can be designated as a standby device at system startup, the system configuration can be controlled in fine detail. In particular, if a specific processing device with relatively high processing capacity and reliability is kept on standby as the alternative processing device from system startup, a drop in the processing capacity of the whole system after alternative operation can be prevented.

[Brief description of the drawings]

【図１】FIG. 1 is a block diagram showing Embodiment 1 of a distributed system according to the present invention.

【図２】FIG. 2 is a block diagram showing a PC server in Embodiment 1.

【図３】FIG. 3 is a block diagram showing an external input/output device in Embodiment 1.

【図４】FIG. 4 is a diagram showing an example of the PC server configuration information and failure information managed by a server management device in Embodiment 1.

【図５】FIG. 5 is a diagram showing an example of the external input/output device configuration information managed by a server management device in Embodiment 1.

【図６】FIG. 6 is a flowchart showing the system startup procedure in Embodiment 1.

【図７】FIG. 7 is a flowchart showing the procedure in Embodiment 1 for notifying completion of initialization and obtaining the server management information managed by the other server management devices.

【図８】FIG. 8 is a flowchart showing the procedure in Embodiment 1 for determining the connection destinations between PC servers and external input/output devices.

【図９】FIG. 9 is a diagram showing changes in the state of the PC servers and the connection state of the external input/output devices in Embodiment 1.

【図１０】FIG. 10 is a flowchart showing the processing performed when a PC server failure occurs in Embodiment 1.

【図１１】FIG. 11 is a flowchart showing the processing performed when a PC server is restored in Embodiment 1.

【図１２】FIG. 12 is a diagram showing changes in the state of the PC servers and the connection state of the external input/output devices in Embodiment 2.

【図１３】FIG. 13 is a diagram showing changes in the state of the PC servers and the connection state of the external input/output devices in Embodiment 2.

【図１４】FIG. 14 is a block diagram showing a conventional distributed system.

[Explanation of symbols]

1a to 1n: PC servers; 2a to 2n: server management devices; 3: communication path; 4: external input/output switching mechanism; 5a to 5m: external input/output devices; 6a to 6m: operating systems (OS); 7, 7a, 7b: local area network (LAN); 8a to 8n: processors; 9: memory; 10: bus bridge; 11: external input/output device connection unit; 12: ROM; 13, 15, 17, 19, 21: controllers; 14: keyboard; 16: mouse; 18: display; 20: floppy disk drive (FDD); 22: HDD; 23: input/output bus; 24: processor bus; 25: non-volatile memory; 26a to 26n: disk controllers; 27a to 27n, 28a to 28n: disk devices; 29a to 29n: line controllers; 30: server connection unit.

Claims

1. A distributed system comprising a plurality of processing devices, a plurality of external input/output devices each storing an operating system and a business application, and external input/output switching means for connecting the processing devices and the external input/output devices, in which a processing device executes predetermined business processing by starting the operating system of an external input/output device via the external input/output switching means, the distributed system further comprising: a plurality of processing device management devices for performing operation management of the processing devices; a communication path for communication among the plurality of processing device management devices; and device switching means, provided in each processing device, for connecting and disconnecting the processing device to and from the external input/output device in accordance with an instruction from the processing device management device, wherein the processing device management devices share the configuration information and failure information of the respective processing devices by exchanging them via the communication path and autonomously determine the connection destination of the external input/output device based on the shared configuration information and failure information, and the device switching means connects and disconnects the processing device and the external input/output device based on the connection destination of the external input/output device determined by the processing device management device.

2. The distributed system according to claim 1, wherein each of the processing device management devices determines the connection destination of the external input/output device that is optimal in terms of processing capacity and reliability, based on the shared configuration information and failure information.

3. The distributed system according to claim 1, wherein the processing device management device automatically generates the configuration information and failure information of the processing device by performing initialization processing, with reference to the shared configuration information and failure information, when the processing device is started up, and reports the automatically generated configuration information and failure information to the other processing device management devices.

4. The distributed system according to claim 1, wherein the configuration information of the external input/output device includes an index indicating the processing capability and reliability of the processing device required to execute the business application stored in that external input/output device.

5. The distributed system according to claim 1, wherein the configuration information of the processing device includes information on a connection destination of each of the processing devices.

6. The distributed system according to claim 1, wherein the processing device management device calculates, based on the shared configuration information and failure information, an effective performance score indicating the effective processing capability of the processing device and a reliability score indicating the reliability of the processing device, and determines the external input/output device to which the processing device is to be connected based on the calculated effective performance score, the calculated reliability score, and the configuration information of the external input/output device.

7. The distributed system according to claim 6, wherein each of the processing device management devices determines a connection order of all the processing devices based on the effective performance score and the reliability score, determines a connection order of the external input/output devices based on the configuration information of the external input/output devices, and determines as the connection destination the external input/output device having the same connection order as the processing device.

8. The distributed system according to claim 1, wherein the processing device management device updates the configuration information or failure information it holds in response to a report from the processing device management device mounted on another processing device.

9. The distributed system according to claim 8, wherein, when a processing device is not connected to any of the external input/output devices as a result of connecting the processing devices and the external input/output devices, the processing device management device keeps that processing device on standby as an alternative processing device and, upon receiving a notification of alternative operation from another processing device that is performing business processing, reconnects the external input/output device that was connected to the processing device that sent the notification to the alternative processing device, so that the alternative processing device takes over the business processing executed by the processing device that sent the notification.

10. The distributed system according to claim 9, wherein, when there are a plurality of alternative processing devices, each of the processing device management devices causes the alternative processing device whose processing capability and reliability are closest to those of the processing device that sent the notification, as of the time of the notification, to take over the business processing.

11. The distributed system according to claim 9, wherein, when there are a plurality of alternative processing devices, each of the processing device management devices causes the alternative processing device whose processing capability and reliability are closest to those inherent in the processing device that sent the notification to take over the business processing.

12. The distributed system according to claim 8, wherein, when a processing device is not connected to any of the external input/output devices as a result of connecting the processing devices and the external input/output devices, the processing device management device keeps that processing device on standby as an alternative processing device and, when a plurality of alternative processing devices are present, causes the alternative processing device having the highest connection order to take over the business processing in place of another processing device that is performing business processing.

13. The distributed system according to any one of claims 9 to 12, wherein the processing device management device mounted on the processing device whose business processing was taken over causes, after the processing device it manages is restored, that processing device to resume the business processing from the processing device that took it over.

14. The distributed system according to claim 1, wherein the configuration information of the processing devices and of the external input/output devices includes constraint conditions on connection between the processing devices and the external input/output devices, and the processing device management device autonomously determines the external input/output device to be connected to the processing device for each group of processing devices and external input/output devices that satisfies the constraint conditions.

15. The distributed system according to claim 1, wherein the configuration information of the processing device includes operation control information for causing the processing device to stand by as an alternative processing device when it is started up, and, when a standby instruction is set in the operation control information, the processing device management device causes the processing device to stand by as an alternative processing device at startup regardless of its level of processing capacity and reliability.

16. The distributed system according to claim 15, wherein a standby instruction is set in advance in the operation control information of the processing device having high processing capability and high reliability.