JP2002259355A

JP2002259355A - Multiplex system

Info

Publication number: JP2002259355A
Application number: JP2001054168A
Authority: JP
Inventors: Hiroshi Ono; 大野　　洋; Ryokichi Yoshizawa; 亮吉吉澤; Takao Nouchi; 隆夫野内; Hiromichi Endo; 浩通遠藤; Eiko Naya; 英光納谷; Masahiko Saito; 雅彦齊藤; Tetsuaki Nakamigawa; 哲明中三川
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2001-02-28
Filing date: 2001-02-28
Publication date: 2002-09-13

Abstract

PROBLEM TO BE SOLVED: To provide a multiplex system in which the computers can be located freely and configuration control processing is simplified. SOLUTION: In a multiplex system in which a plurality of computers 101 and 102 are connected to a network and share an external storage device, a SAN 201 is used as the network, and the management communication exchanged between the computers, such as confirming or instructing a change of an operating state or execution state, is carried over the SAN 201. The configuration control procedure in each computer is changed accordingly, and zones and virtual communication paths can be set on the SAN 201.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of the Invention] The present invention relates to a multiplex system in which a plurality of computers operate in cooperation, and more particularly to a multiplex system in which a storage device for sharing data among the computers constituting the system is connected over a network.

[0002]

[Description of the Related Art] Conventionally, in a multiplex system in which a plurality of computers share data stored on a storage device and one of the computers provides a service to a service target device, a configuration in which the computers share a storage device connected to a shared SCSI bus has been common, as described in Japanese Patent Laid-Open No. Hei 10-207855.

[0003] Meanwhile, in computer systems, demand has emerged for sharing storage devices among many computers and for consolidating storage devices in a location separate from the computers themselves. This led to the development and commercialization of storage area network (hereinafter "SAN") technology, which connects storage devices and computers through a dedicated network. FIG. 18 shows an example of a multiplex system using such a SAN.

[0004] In this example, two multiplex systems 911 and 912 are built on a single SAN 951. In the multiplex system 911, computers 901 and 902 and a shared disk device 961 are connected via the SAN 951. The computers 901 and 902 are also connected by an inter-computer link 921, over which the management communication needed for them to cooperate as a multiplex system is carried.

[0005] The management communication carries state information such as the operating state of each computer and the load state of the computation running on each computer. Based on this state information, the computers 901 and 902 change state: one detects a failure of the partner computer, stops that computer, and takes over the computation it was executing, or determines that the partner's computation load is excessive and takes on part of its processing. The processing of grasping the operating state of the computers and the state of the processing executed on them, and of changing these states, is hereinafter referred to as "configuration control".
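The two state changes described in paragraph [0005] can be pictured with a minimal sketch. It assumes that each computer's state arrives as a simple record; the names `Computer`, `configuration_control`, and the `overload` threshold are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass, field


@dataclass
class Computer:
    name: str
    alive: bool = True
    load: float = 0.0  # assumed scale: fraction of capacity in use, 0..1
    tasks: list = field(default_factory=list)


def configuration_control(me: Computer, partner: Computer, overload: float = 0.8) -> str:
    """Decide how `me` reacts to the partner's reported state (hypothetical policy)."""
    if not partner.alive:
        # Failure detected: take over everything the partner was executing.
        me.tasks.extend(partner.tasks)
        partner.tasks.clear()
        return "takeover"
    if partner.load > overload and partner.tasks:
        # Partner overloaded: take on one part of its processing.
        me.tasks.append(partner.tasks.pop())
        return "share"
    return "none"
```

In this sketch a failed partner hands over all of its tasks, while an overloaded partner hands over only one; a real system would also have to stop the failed computer, which is omitted here.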

[0006] When, as part of this configuration control processing, it becomes necessary to change the operating state of another computer or of a computation running on another computer, the change is conveyed to that computer as one kind of management communication. If the management communication is interrupted, each computer misjudges the state of the others (a condition called split brain) and the computers can no longer cooperate. The inter-computer link 921 is therefore made redundant, either by duplicating it or by using the service network 971 as a backup communication path. Serial communication or a LAN such as Ethernet (a trademark of Xerox Corporation, USA) is used for the inter-computer link.

[0007] The multiplex system 911 provides computation functions to a service target device 981 through an Ethernet 971. The service target device 981 may be, for example, an information processing client or a controlled device, for which the multiplex system 911 performs processing such as database searches or computational control. The other multiplex system 912 is likewise composed of computers 903 and 904, a shared disk device 962, and an inter-computer link 922, and provides computation functions to a service target device 982 through an Ethernet 972.

[0008]

[Problems to Be Solved by the Invention] In a multiplex system using the above prior art, an inter-computer link must be installed between the computers constituting the system, so allowing freedom in where the computers are placed raises cost because the inter-computer link must be extended. A first object of the present invention is to provide a multiplex system in which the installation location of the computers can be chosen freely.

[0009] Furthermore, in a multiplex system using the prior art, a plurality of multiplex systems share a single SAN, so the normal operation or a malfunction of one multiplex system may affect another multiplex system in ways that the latter cannot anticipate. A second object of the present invention is to provide a multiplex system configured so that the operation of one multiplex system does not affect other multiplex systems.

[0010] When a failure occurs in a computer constituting a multiplex system, the redundancy of the system can be restored by swapping in a backup computer in place of the failed one. In particular, when many multiplex systems are connected to a single SAN, preparing a common backup computer raises system availability with a small number of computers. In these cases, however, each multiplex system must also keep track of the state of the backup computer, which complicates management. A third object of the present invention is to provide a multiplex system that makes this management easy.

[0011] Moreover, in a multiplex system using the prior art, the information required for configuration control (hereinafter "configuration control information") is managed on each computer constituting the system. Processing must therefore always take synchronization into account so that the configuration control information does not diverge between computers, which complicates the processing. A fourth object of the present invention is to provide a multiplex system that simplifies the management of configuration control information on each computer.

[0012]

[Means for Solving the Problems] First, the term "management communication" as used in the present invention is defined. The nodes of a multiplex system operate cooperatively as a single system through the management programs running on them, and to do so the management programs communicate with one another. This communication includes reports of the operating state of the management program and applications on the sending node, issued periodically or when the state changes, and instructions to change the operating state of the management program or applications on the receiving node. It also includes exchanging the information that the sending node's management program holds about other nodes, so that inconsistencies in state recognition between management programs can be detected. Finally, it includes a liveness check: the receiving node judges whether the sending node is operating correctly from the fact that communication is possible. In this specification, these communications are referred to as "management communication".
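The kinds of management communication defined in paragraph [0012] can be sketched as follows. The specification gives no message layout, so the sketch assumes a small tagged record; `Kind`, `ManagementMessage`, `LivenessTracker`, `view_conflicts`, and the timeout value are all hypothetical names introduced for illustration.

```python
import enum
from dataclasses import dataclass


class Kind(enum.Enum):
    STATE_REPORT = 1        # periodic or on-change report of the sender's state
    CHANGE_INSTRUCTION = 2  # request to change state on the receiver
    PEER_VIEW = 3           # sender's view of other nodes, for consistency checks


@dataclass(frozen=True)
class ManagementMessage:
    sender: str
    kind: Kind
    body: dict


class LivenessTracker:
    """Treats any received message as proof that the sender is operating."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self.last_seen = {}

    def on_message(self, msg: ManagementMessage, now: float) -> None:
        self.last_seen[msg.sender] = now

    def is_alive(self, node: str, now: float) -> bool:
        seen = self.last_seen.get(node)
        return seen is not None and now - seen <= self.timeout


def view_conflicts(mine: dict, msg: ManagementMessage) -> set:
    """Nodes whose recorded state differs between my view and a PEER_VIEW body."""
    theirs = msg.body
    return {n for n in mine.keys() & theirs.keys() if mine[n] != theirs[n]}
```

Here a node counts as alive only while its most recent message is no older than the timeout, which is one simple way to realize the liveness check; the comparison in `view_conflicts` corresponds to detecting inconsistent state recognition between management programs.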

[0013] Next, the configuration of the present invention that achieves the above objects is described. The present invention is a multiplex system that has a plurality of computers and a shared external storage device connected to the computers via a network and shared by them, and in which processing performed on any one of the computers is taken over by another computer, or part of that processing is shared with and executed by another computer. It is characterized in that the management communication exchanged among the computers, to confirm or instruct a change of the operating state of a computer or to confirm or instruct a change of the execution state of processing running on the computers, is carried over the network that connects the computers and the shared external storage device.

[0014] The network is a storage area network (SAN) comprising a plurality of switches that have ports and inter-switch links between the switches. The SAN has a function of carrying data of the data transfer protocol used between the computers and the shared external storage device within the transfer content data portion of packets transferred on the network.

[0015] The invention also has network management means that configures the network so that the management communication among the computers is executed with priority over communication between the computers and the shared external storage device. One such network management means is virtual communication path management means (the virtual communication path management server in the embodiment). By setting up virtual communication paths on the SAN and always serving the virtual communication path used by management communication ahead of other communication, the management communication can be given priority. In addition, setting a maximum delay time for the virtual communication path used by management communication ensures that the management communication is real-time. Another network management means is packet priority management means. When management communication is sent, a higher value is set in the priority field of the packet header; relay devices on the network, such as switches, then hold or discard lower-priority packets waiting to be relayed inside the device and relay the higher-priority packets first, so the management communication is given priority.
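The packet priority behaviour described in paragraph [0015] can be illustrated with a small relay queue. This is a sketch under stated assumptions, not the switch design of the specification: the class `RelayQueue`, the fixed `capacity`, and the rule of discarding the newest packet among those with the lowest priority are all assumptions made for the example.

```python
import heapq
import itertools


class RelayQueue:
    """Relays higher-priority packets first; discards the lowest priority when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._heap = []                # entries: (-priority, arrival_seq, packet)
        self._seq = itertools.count()  # arrival order, also breaks priority ties

    def enqueue(self, priority: int, packet):
        """Queue a packet; return the discarded packet if the queue overflowed."""
        heapq.heappush(self._heap, (-priority, next(self._seq), packet))
        if len(self._heap) > self.capacity:
            victim = max(self._heap)   # lowest priority, newest arrival among ties
            self._heap.remove(victim)
            heapq.heapify(self._heap)
            return victim[2]
        return None

    def relay_next(self):
        """Return the next packet to relay, or None if nothing is waiting."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

With a capacity of two, queuing two storage packets at priority 0 and then a management packet at priority 7 discards the second storage packet and relays the management packet first, which matches the "hold or discard, then relay the higher priority first" behaviour described above.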

[0016] The present invention is also characterized by having zone management means (the zone management server in the embodiment) that configures the network so as to prohibit relaying, to the computers and the shared external storage device of the multiplex system, communication data sent from other multiplex systems or other devices connected to the network. Alternatively, it is characterized in that, when a failure occurs in one of the computers, the network is configured so as to prohibit relaying communication data sent from the failed computer to the remaining computers and the shared external storage device.

[0017] In other words, the components of the multiplex system are defined as a zone on the SAN, that is, a group within which communication is permitted, and the SAN is configured not to relay communication from outside the zone.

[0018] The present invention also concerns a multiplex system that has one or more computers and a shared external storage device connected to the computers via a network, to which another computer connected to the network can be added as a computer constituting the system, and in which processing performed on one of the pre-existing computers is taken over by another computer, or part of that processing is shared with and executed by another computer. It is characterized by having zone management means that, when a computer is added, configures the network so as to permit relaying of communication data sent from that computer to all the computers constituting the multiplex system after the addition and to the shared external storage device.

[0019] Alternatively, in a multiplex system in which processing performed on one of the plurality of computers is taken over by another computer, or part of that processing is shared with and executed by another computer, the invention is characterized by having zone management means that, when a computer is removed, configures the network so as to prohibit relaying of communication data sent from that computer to all the computers constituting the multiplex system after the removal and to the shared external storage device.

[0020] With this arrangement, information on the operating state of the computers constituting the multiplex system is obtained from the SAN zone settings. Each computer therefore need not track the state of all the computers; it only has to consult the zone setting information when the need arises.

[0021] Further, the present invention concerns a multiplex system that has a plurality of computers and a shared external storage device connected to the computers via a network and shared by them, and in which processing performed on one of the computers is taken over by another computer, or part of that processing is shared with and executed by another computer. It is characterized in that a device constituting the network, or the shared external storage device, has monitoring means (the main management program in the embodiment) for monitoring the operating state of the computers.

[0022] Based on its monitoring results, the monitoring means issues an instruction to change the operating state of the computers, or an instruction to change the execution state of processing being performed on any of the computers.

[0023] With this arrangement, a device constituting the SAN, or a shared storage device connected to the SAN, holds the information on configuration control of the multiplex system internally and executes part or all of the configuration control processing, so the procedures for holding configuration control information and for executing configuration control can be simplified. Because these devices are generally already made redundant, high availability can be achieved without additional redundancy cost.

[0024]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings. The present invention is a multiplex system that has a plurality of computers and a shared external storage device connected to the computers via a network and shared by them, and in which processing performed on any one of the computers is taken over by another computer, or shared with and executed by another computer. The management communication exchanged among the computers, to confirm the operating state or execution state of a computer or to instruct a change, is carried over that network.

[0025] FIG. 1 shows the configuration of a multiplex system according to a first embodiment of the present invention. In this multiplex system, two computers 101 and 102 are connected to a shared disk device 301 via a SAN 201. There may be three or more computers, and there may be two or more shared disk devices.

[0026] Each of the computers 101 and 102 includes a central processing unit 111 (hereinafter "CPU"), a main storage device 112, and an input/output control device 113, which are connected to one another by a processor bus 110. An expansion bus 114 is connected to the input/output control device 113, and devices that extend the functions of the computer are connected to the expansion bus 114. In the computers 101 and 102 of this embodiment, a disk device 115, a host bus adapter 116 (hereinafter "HBA") for connecting to the SAN, and an Ethernet adapter 117 are connected to the expansion bus 114.

[0027] The Ethernet adapter 117 of each of the computers 101 and 102 is connected to an Ethernet 401 and communicates with a service target device 402 connected to that network. The service target device 402 may be any of various devices, such as an information processing client or a controlled device, to which the computers 101 and 102 provide functions such as database search processing and computational control processing. Another network, such as a token ring or a real-time control network, may be used instead of the Ethernet 401.

[0028] The HBA 116 of each of the computers 101 and 102 is connected to the SAN 201 and communicates with other devices connected to the SAN, such as the shared disk device 301.

[0029] In such a multiplex system, when the computers 101 and 102 are in a normal state, an OS 131 and a management program 132 are loaded into the main storage device 112 and are executing. An application 133 is also loaded into the main storage device 112.

[0030] The management program 132 is a program that recognizes the state of the computers 101 and 102 constituting the multiplex system and the execution state of the application 133 on each computer, and changes these states. The management program 132 creates, in the main storage device 112, a state table 135 for managing the state of the multiplex system.

[0031] The application 133 is a program that implements the functions provided to the service target device 402, which is the main purpose of the multiplex system. There may be more than one application 133. For the execution of a given application 133, there are two possible configurations: one that restricts execution to a single computer at any point in time and improves availability by switching to another computer when, for example, that computer fails; and one that improves processing performance by dividing the overall processing among several computers. The description below is based on the former configuration.

[0032] In the latter configuration, the management program 132 must recognize the execution state of the application on each computer in order to divide the processing, and, based on the result, instruct the application on each computer how to operate. Accordingly, within the processing of the former configuration, the algorithm that selects one particular computer to run the application is replaced with an algorithm that selects several computers to run the application and tells the application on each of them which part of the overall processing to handle.
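The difference between the two selection algorithms in paragraphs [0031] and [0032] can be sketched as follows. The specification does not say how a computer is chosen or how work is divided, so the least-loaded choice and the round-robin split below are assumptions, and the function names and record fields are hypothetical.

```python
def select_single(computers):
    """Former configuration: pick one healthy computer to run the application.

    Assumption: the least-loaded healthy computer is chosen.
    """
    healthy = [c for c in computers if c["healthy"]]
    if not healthy:
        raise RuntimeError("no healthy computer available")
    return min(healthy, key=lambda c: c["load"])["name"]


def select_partitioned(computers, work_items):
    """Latter configuration: tell each healthy computer which part to process.

    Assumption: work items are dealt out round-robin.
    """
    healthy = [c["name"] for c in computers if c["healthy"]]
    if not healthy:
        raise RuntimeError("no healthy computer available")
    plan = {name: [] for name in healthy}
    for i, item in enumerate(work_items):
        plan[healthy[i % len(healthy)]].append(item)
    return plan
```

The first function returns a single computer name, while the second returns a per-computer assignment, which is the additional "which part of the overall processing" instruction that the latter configuration requires.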

[0033] A zone management server 211 is connected to the SAN 201. The zone management server 211 has a storage device 212 in which it stores zone definition information 213. The zone definition information 213 defines groups of devices that are permitted to communicate with one another on the SAN 201.

[0034] In response to requests from devices connected to the SAN 201, the zone management server 211 creates, changes, or deletes definitions, or provides the currently set zone definition information 213. When a definition is created, changed, or deleted, it instructs the components of the SAN 201 to permit or prohibit relaying of communication data in accordance with the zone definitions.

[0035] A virtual communication path management server 221 is also connected to the SAN 201. The virtual communication path management server 221 has a storage device 222 in which it stores virtual communication path definition information 223. The virtual communication path definition information 223 defines dedicated logical communication paths established between specific nodes connected to the SAN 201.

[0036] In response to requests from devices connected to the SAN 201, the virtual communication path management server 221 creates, changes, or deletes definitions, or provides the currently set virtual communication path definition information 223. When a definition is created, changed, or deleted, it instructs the components of the SAN 201 to set up logical communication paths for relaying communication data in accordance with the virtual communication path definition information, and to reserve the resources needed to guarantee communication bandwidth and maximum communication delay.

[0037] Each part is described in detail below. The SAN 201 assumed in this multiplex system has a function of carrying data of the data transfer protocol used between the computers 101 and 102 and the shared disk device 301 within the transfer content data portion of packets transferred on the network. That is, the data of the data transfer protocol is encapsulated in SAN packets as the payload (the body of the transfer content).
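The encapsulation described in paragraph [0037] can be sketched with a toy packet format. The header layout below (destination address, source address, priority, payload length) is purely an assumption for illustration; the specification does not define a packet format, and `encapsulate` and `decapsulate` are hypothetical names.

```python
import struct

# Assumed header: 2-byte destination, 2-byte source, 1-byte priority,
# 4-byte payload length, all in network byte order (9 bytes in total).
HEADER = struct.Struct("!HHBI")


def encapsulate(dst: int, src: int, priority: int, payload: bytes) -> bytes:
    """Wrap data-transfer-protocol bytes as the payload of a SAN packet."""
    return HEADER.pack(dst, src, priority, len(payload)) + payload


def decapsulate(packet: bytes):
    """Recover the header fields and the original protocol bytes."""
    dst, src, priority, length = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated packet")
    return dst, src, priority, payload
```

Because the SAN only inspects the header, the same packet format can carry either storage protocol data or management communication, which is what allows both kinds of traffic to share one network.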

[0038] FIG. 2 shows the internal configuration of the SAN 201, and its operation is described below. The SAN 201 is composed of one or more switches 231 and inter-switch links 232 that connect the switches. Each switch has one or more ports 233, and data can be sent and received within a switch 231 and between switches 231 so that data can be transferred to and from any port 233 in the SAN 201.

[0039] A device connected to a port 233 is called a node and is connected by a node link 234. The SAN 201 provides its data transfer function without distinguishing whether a node is a computer, a disk device, or some other device. Nodes are distinguished by an address 235, and data sent from a node to a port 233 is forwarded to the node that has the address 235 specified in the data.

[0040] In FIG. 2, the computers 101 and 102 and the shared disk device 301 shown in FIG. 1 are connected with address a, address b, and address c, respectively. FIG. 2 also shows other nodes 305a and 305b, not shown in FIG. 1, connected to the same SAN 201.

[0041] The zone management server 211 and the virtual communication path management server 221 shown in FIG. 1 are connected at ports having reserved addresses 236, namely address x and address y, respectively. The reserved addresses 236 are fixed addresses assigned for zone management and virtual communication path management, respectively. When a node requests processing related to one of these functions, it only has to specify the reserved address 236, without being aware of the actual zone management server 211 or virtual communication path management server 221.

[0042] In the configuration of FIG. 2, the SAN 201 forwards data addressed to these reserved addresses 236 to the corresponding server in the same way as for ordinary addresses 235. Alternatively, functions such as zone management and virtual communication path management can be implemented as part of the internal equipment of the SAN; in that case, request data for those functions is processed inside the SAN rather than forwarded to a server.

[0043] The servers that perform zone management and virtual communication path management can also be made redundant to raise the availability of the SAN. In that case, at any given time the server currently providing the function uses the reserved address, and when a redundant standby server takes over the function, it also takes over the reserved address on the SAN. In all of these cases, each node keeps exactly the same procedure: it sends its processing request to the reserved address.

[0044] The zone management function is described next. A zone is a group of nodes, among those connected to the SAN 201, that are mutually permitted to transfer data. The SAN 201 carries out data transfer between nodes belonging to the same configured zone and rejects data transfer between other nodes. Nodes that belong to no zone, however, may transfer data among themselves. In addition, because a node with a reserved address 236 provides a management function of the SAN 201, communication between that reserved address and any node connected to the SAN 201 is always carried out.
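The zone rules of this paragraph can be sketched as a single predicate. This is a minimal illustration only: the function name `transfer_allowed`, the representation of zones as Python sets, and the use of the letters x and y for the reserved addresses are assumptions made for the example, not part of the disclosure.

```python
RESERVED = {"x", "y"}  # reserved addresses 236 (labels borrowed from FIG. 2)

def transfer_allowed(src, dst, zones):
    """Return True if the SAN would carry out a transfer from src to dst.

    zones is a list of sets of node addresses (hypothetical representation).
    """
    # Communication with a reserved address is always carried out.
    if src in RESERVED or dst in RESERVED:
        return True
    src_zones = [z for z in zones if src in z]
    dst_in_any_zone = any(dst in z for z in zones)
    # Nodes that belong to no zone may transfer data among themselves.
    if not src_zones and not dst_in_any_zone:
        return True
    # Otherwise both nodes must share at least one zone.
    return any(dst in z for z in src_zones)
```

With a single zone containing addresses a, b, and c, a transfer from a to c is allowed, a transfer from a to an unzoned node d is rejected, and two unzoned nodes d and e can still reach each other.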

[0045] Any node connected to the SAN 201 can request the zone management server 211 to create, modify, or delete a zone. After confirming that the request parameters are consistent, the zone management server 211 stores the zone definition information and also transfers it to the switches 231, instructing them to apply it. When any node of the SAN 201 so requests, the zone management server 211 also reports the contents of the zone definition information 213 that it holds.

[0046] The virtual communication path management function is described next. A virtual communication path is a communication path set up logically between nodes connected to the SAN 201, and each path is given two parameters: a communication bandwidth and a maximum delay time. For data communication designated to use a virtual communication path, the SAN 201 reserves the required amount of resources, such as processing time of the switches 231 and the share of the relaying inter-switch links 232, so that the configured bandwidth can be guaranteed, that is, so that a fixed amount of data can be transferred within a fixed period such as one second. At the same time, so that data exceeding the configured bandwidth does not flow into the SAN 201 and consume more resources than were reserved, each switch 231 has a function for requesting a connected node to stop sending data.

[0047] The SAN 201 likewise reserves resources so that the configured maximum delay time is respected for data communication designated to use a virtual communication path. The lower limit of the maximum delay time that can be configured is constrained by the smallest unit of time for which resources can be reserved. Data communication that does not use a virtual communication path is carried out when the resources it needs are available, and is postponed or cancelled when resources run short because of communication on configured virtual communication paths.

[0048] Any node of the SAN 201 can request the virtual communication path management server 221 to set up, modify, or delete a virtual communication path. On receiving a setup or modification request, the virtual communication path management server 221 calculates the resources needed for the requested virtual communication path, assigns an ID number that identifies the path, and asks each component, such as the switches 231 and the inter-switch links 232, to reserve those resources.

[0049] If every request succeeds, the virtual communication path has been set up successfully; the virtual communication path management server 221 stores the new settings in the virtual communication path definition information 223 and returns the virtual communication path ID number to the requester. If any resource reservation fails, the server cancels the reservations newly made for the other resources, leaves the virtual communication path definition information 223 unchanged, and reports the failure to the requester. When any node of the SAN 201 so requests, the virtual communication path management server 221 also returns the contents of the virtual communication path definition information 223 that it holds.
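The all-or-nothing reservation described here might be sketched as follows. The class names `Component` and `PathManager`, the single "bandwidth" number standing in for all reserved resources, and the capacity check are assumptions for illustration; the text does not specify how resources are computed.

```python
class Component:
    """A switch 231 or inter-switch link 232 with a finite budget (assumed model)."""

    def __init__(self, name, capacity):
        self.name, self.capacity, self.reserved = name, capacity, {}

    def reserve(self, path_id, amount):
        if sum(self.reserved.values()) + amount > self.capacity:
            return False
        self.reserved[path_id] = amount
        return True

    def cancel(self, path_id):
        self.reserved.pop(path_id, None)


class PathManager:
    """Plays the role of the virtual communication path management server 221."""

    def __init__(self):
        self.definitions = {}  # stands in for definition information 223
        self.next_id = 1       # 0 is left free to mean "no virtual path"

    def set_up(self, components, bandwidth):
        path_id = self.next_id
        done = []
        for comp in components:
            if not comp.reserve(path_id, bandwidth):
                for c in done:           # cancel the reservations just made
                    c.cancel(path_id)
                return None              # failure; definitions stay unchanged
            done.append(comp)
        self.next_id += 1
        self.definitions[path_id] = (components, bandwidth)
        return path_id
```

A request that fits on every component returns a new ID; a request that fails on any one component leaves both the components and the stored definitions exactly as they were.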

[0050] FIG. 3 shows the structure of data transferred on the SAN 201. Every piece of data carries, as common information, a source address 241, a destination address 242, a virtual communication path ID 243, a protocol ID 244, a data size 245, and check data 247. The data body 246 being communicated is variable-length, and its length is stored as the data size 245.

[0051] The source address 241 is the node address 235 of the sender on the SAN. The destination address 242 is the address 235 of the destination node, or a reserved address 236, as described above. When a virtual communication path is used, the virtual communication path ID 243 is set to the ID number returned by the virtual communication path management server 221 when the path was set up; when no virtual communication path is used, it is set to 0. The protocol ID 244 indicates the convention for interpreting the data held in the data body 246; this multiplex system uses the SCSI protocol, the IP protocol, and a protocol for SAN management communication. The check data 247 is used to confirm that the communication data contains no transmission errors: the sending node computes it with a predetermined formula and appends it, and the receiving node recomputes it and compares.
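As an illustration of the data structure of FIG. 3, the following sketch packs and verifies one unit of communication data. The field widths, the byte order, and the choice of CRC-32 for the check data 247 are assumptions; the text says only that a predetermined formula is used.

```python
import struct
import zlib

# source 241, destination 242, path ID 243, protocol ID 244, data size 245
# (field widths are assumed for the example)
HEADER = struct.Struct(">HHIHI")

def pack_frame(src, dst, path_id, protocol_id, body):
    head = HEADER.pack(src, dst, path_id, protocol_id, len(body))
    check = zlib.crc32(head + body)        # check data 247 (CRC-32 assumed)
    return head + body + struct.pack(">I", check)

def unpack_frame(frame):
    head, rest = frame[:HEADER.size], frame[HEADER.size:]
    src, dst, path_id, protocol_id, size = HEADER.unpack(head)
    body, (check,) = rest[:size], struct.unpack(">I", rest[size:size + 4])
    if zlib.crc32(head + body) != check:
        return None                        # the receiving node discards the data
    return src, dst, path_id, protocol_id, body
```

A path ID of 0 in `pack_frame` corresponds to communication that does not use a virtual communication path.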

[0052] FIG. 4 shows the processing flow 501 performed when a switch 231 in the SAN 201 receives communication data from a node.

[0053] First, the switch compares the combination of source address 241 and destination address 242 against the zone information provided by the zone management server 211, and confirms either that both addresses are in the same zone or that neither address belongs to any zone (502). If this condition is not met, the communication data is discarded (503) and the communication does not take place.

[0054] Next, the switch checks the virtual communication path ID 243 and determines whether data with that ID has been sent, within a fixed period, in excess of the resources reserved for it on this switch (504). If so, the resources reserved on the switch have been exceeded, so the switch discards the communication data (505), notifies the sender of the overrun and has it stop sending for a fixed time (506), and then permits sending again (507). Although omitted from FIG. 4, the switch 231 continues to accept communication data carrying other virtual communication path IDs while waiting to permit sending again.

[0055] Communication data not discarded so far is handled according to whether the destination node is connected to this switch (508). If it is, the data is delivered to the destination node (509); otherwise it is forwarded over an inter-switch link 232 to the switch to which the destination node is connected (510). In the latter case the resources reserved on an inter-switch link 232 along the way may be exceeded. When that happens, an error report is returned to the originating switch (511), which discards the communication data and temporarily suspends sending, as above (505 to 507).
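The switch-side flow of FIG. 4 might be sketched as below. The class name `Switch`, the byte-count budget per time window, and the returned strings are hypothetical; the zone check of step 502 is passed in as a flag, and the error report from a downstream link (step 511) is not modelled.

```python
class Switch:
    def __init__(self, local_nodes, reserved_bytes):
        self.local_nodes = set(local_nodes)
        self.reserved = dict(reserved_bytes)  # path ID -> bytes allowed per window
        self.used = {}                        # path ID -> bytes seen in this window

    def new_window(self):
        """Start the next fixed period; usage counters start again from zero."""
        self.used.clear()

    def handle(self, dst, path_id, size, zone_ok=True):
        if not zone_ok:
            return "discard"                  # steps 502-503
        if path_id in self.reserved:          # step 504
            used = self.used.get(path_id, 0) + size
            if used > self.reserved[path_id]:
                return "discard+pause"        # steps 505-507
            self.used[path_id] = used
        # steps 508-510: deliver locally or forward over an inter-switch link
        return "deliver" if dst in self.local_nodes else "forward"
```

Because usage is tracked per path ID, an overrun on one virtual communication path does not stop the switch from accepting data on other paths.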

[0056] The switch 231 takes no part in handling the protocol ID 244. The receiving node determines whether it can process the protocol specified by the protocol ID 244 of the communication data, and discards the data if it cannot. It also discards the data if the check data 247 does not match the recomputed value.

[0057] Next, the operation of the management program 132 on the computers constituting this multiplex system is described. As explained earlier, in this multiplex system each application 133 runs on exactly one computer at any given time, and if it can no longer run on that computer, another computer takes it over. In the following description, two applications (hereinafter "application 1" and "application 2") are assumed to run on the multiplex system.

[0058] FIG. 5 shows the contents of the state table 135 that the management program 132 on a computer builds in that computer's main storage device 112. The table has a row 141a, 141b for each computer. It has a column 142 that stores the computer's address on the SAN and a column 143 that stores the computer's operating state. It also has, for each application, a column 144a, 144b that stores the priority for running the application on the computer and a column 145a, 145b that stores the execution state of the application on the computer. The contents of the address column 142 are fixed in advance for the computers constituting the multiplex system.

[0059] The priority columns 144a and 144b record on which computer each application should preferentially run. A value from 1 to 9998 can be set; a computer on which the application should run in preference is given a smaller value than a computer with lower preference. The value 9999 indicates that the application is not to be run on that computer. For example, the computer on which the application should preferentially run is set to 1 and the other computer to 2. If the application may run on either computer, both are set to 1.
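The priority convention can be illustrated with a small selection function. The name `preferred_computer`, the dictionary layout, and the tie-break by computer name when priorities are equal are assumptions for the example; the text leaves the choice open when both computers are set to 1.

```python
NEVER = 9999  # priority value meaning "do not run the application here"

def preferred_computer(priorities, running):
    """Pick the operating computer with the smallest priority value.

    priorities: computer name -> priority (1..9998, or NEVER)
    running:    set of computer names currently in operation
    """
    candidates = [(p, name) for name, p in priorities.items()
                  if name in running and p != NEVER]
    return min(candidates)[1] if candidates else None
```

With priorities 1 and 2 the first computer is chosen while it is operating, and the second is chosen once the first is no longer in the operating set.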

[0060] The operating-state column 143 holds "operating" if the management program 132 on the computer is running normally and "stopped" otherwise. The execution-state columns 145a and 145b hold "operating" if the application is running on the computer and "stopped" otherwise. The management program 132 sets these values according to the processing flows described below.

[0061] FIG. 6 shows the processing flow 521 at startup of the management program 132. The management program 132 is normally started when the computer starts and keeps running until the computer stops. This startup flow is therefore normally executed when the computer starts.

[0062] First, the management program obtains from the zone management server 211 the contents of the zone definition information 213 defined on the SAN 201 (522). It then checks whether a zone containing this computer is defined (523). A zone on the SAN 201 that contains this computer is interpreted as representing the components of the multiplex system to which the computer belongs. A zone that represents the configuration of a multiplex system in this way is hereinafter called a "multiplex zone".

[0063] As described later, the zone definition is deleted when all computers of the multiplex system stop. If a zone containing this computer is defined, it can therefore be concluded that another computer of the multiplex system has already started. In that case, the management program verifies that the computers listed in the obtained zone definition information match the configuration in the state table 135 (524). A mismatch means that, while this computer was stopped, the state table was updated on another computer and the set of computers constituting the multiplex system changed. The computers in the state table are then updated to match the zone definition information (525).

[0064] Next, the management program requests the virtual communication path management server 221 to set up a virtual communication path between the computer on which it is running and the shared disk device 301 used by the multiplex system (526). From then on, when the computer reads or writes data on the shared disk device 301, it uses this virtual communication path: it sets the virtual communication path ID 243 of the communication data to the ID obtained at setup and exchanges data using the SCSI protocol.

[0065] Although omitted from FIG. 6, if the shared disk device 301 is not operating or the virtual communication path cannot be set up, the system cannot operate as a multiplex system; the management program therefore deletes the zone definition it set and terminates with an error.

[0066] Next, the management program requests the virtual communication path management server 221 to set up a virtual communication path between the computer on which it is running and each of the other computers in the state table 135 (527). From then on, management communication for configuration control between the management program 132 of this computer and the management program 132 on the computer at the other end of the path uses this virtual communication path: the virtual communication path ID 243 of the communication data is set to the ID obtained at setup, and data is exchanged using the IP protocol. This virtual communication path is hereinafter called the "management communication path".

[0067] When a management communication path is set up, its communication bandwidth is set to the maximum volume of management communication carried on that path, and its maximum delay time is set sufficiently shorter than the required failure detection time for the computer, for example one third of that time.

[0068] The management program then uses the management communication paths to notify each computer that the management program 132 has started and that this computer is now able to operate as part of the multiplex system (528). The management program 132 on each computer that receives the notification returns a response. Receiving a response shows that the responding computer is already operating as part of the multiplex system, so "operating" is recorded in the operating-state field of that computer in the state table 135 (529). If there is no response, "stopped" is recorded in that field, and the computer is excluded from configuration control of the multiplex system until a startup notification arrives from it (530).

[0069] If no computer other than this one is operating, the multiplex system must previously have terminated without shutting down properly, and processing moves to initialization of the multiplex system (531).

[0070] The system may also be configured without a virtual communication path between the computer and the shared disk device 301 used by the multiplex system. In that case, because the required communication quality is secured for management communication between the computers by their virtual communication paths, management communication is always carried out in preference to access to the shared disk device.

[0071] Next, the management program obtains the contents of the state table 135 from another computer that is operating, and copies the application execution priorities 144a, 144b and the application execution states 145a, 145b into this computer's state table 135 (532).

[0072] In a duplex system only one other computer can be operating, but in a multiplex system with three or more computers there may be several. In that case the copy may be taken from any one of them. This is because the application execution priorities are copied when each management program starts and are therefore identical, and because whenever the execution state of an application changes, the change is always notified to the management program 132 of every computer in the multiplex system, so the contents of the state tables 135 are kept consistent.

[0073] Next, the management program examines the copied state table 135 and looks for applications that are running on a computer whose execution priority for that application is lower than that of the computer just started, that is, whose execution priority 144 value in the state table is larger than that of the computer just started. Any such application is taken over so that it runs on the computer just started (533). The actual takeover proceeds as follows.
(a) The management program 132 of the takeover-destination computer (the one with higher priority) requests the management program 132 of the takeover-source computer (the one with lower priority) to stop the running application.
(b) The management program 132 of the takeover-source computer stops the application and notifies the management program 132 of the takeover-destination computer that it has stopped.
(c) The management program 132 of the takeover-destination computer starts the application and updates the state table 135.
(d) The management program 132 of the takeover-destination computer notifies the other computers constituting the multiplex system of the change in the application's execution state.
(e) The management programs 132 of the other computers constituting the multiplex system update their state tables 135 according to the notification.
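The search that precedes steps (a) to (e) can be sketched as a scan over the copied state table. The function name `apps_to_take_over` and the nested-dictionary layout of the table are assumptions made for the example; the message exchange of the takeover itself is not modelled.

```python
def apps_to_take_over(table, me):
    """Find applications running on a computer with lower priority than `me`.

    table: computer -> {application: (priority, state)}  (hypothetical layout)
    Returns a list of (application, current_host) pairs.
    """
    result = []
    for app, (my_priority, _) in table[me].items():
        for host, apps in table.items():
            priority, state = apps[app]
            # A larger priority value means a lower execution priority.
            if host != me and state == "operating" and priority > my_priority:
                result.append((app, host))
    return result
```

For each pair returned, the newly started computer would then ask the current host to stop the application and start it locally, as in steps (a) to (c).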

[0074] If, on the other hand, no zone containing this computer is defined, no other computer of the multiplex system has been started, so the management program builds a new multiplex system according to the definitions in this computer's state table 135. Here the state table 135 is a table that defines in advance the information on the devices and applications constituting the multiplex system; it is written beforehand to non-volatile storage such as the disk device 115 and loaded into the main storage device 112 for use.

[0075] First, the management program requests the zone management server 211 to define a zone containing every computer in the state table and the shared disk device used by the multiplex system (541). Next, it requests the virtual communication path management server 221 to set up a virtual communication path between the computer on which it is running and the shared disk device 301 used by the multiplex system (542). This processing is the same as step 526.

[0076] Next, it records "stopped" in the operating-state field 143 of every other computer in the state table 135 and "stopped" in the execution-state field 145 of every application (543). Because this computer is then the only one operating in the multiplex system, it starts each application on itself and reflects this in the state table 135 (544).

[0077] FIG. 7 shows the processing flow 551 of the state monitoring process of the management program 132. After the management program 132 has started normally, this process is executed at fixed intervals. At the start of the process, the management program uses the management communication paths to notify all other computers constituting the multiplex system that this computer is operating normally (552); this is hereinafter called an "alive notification".

[0078] Next, it checks whether each application in the "operating" state on this computer is running normally (553). Possible checks include confirming that the application's program process exists and running a test of the function the application provides. If an abnormality is detected, the application is judged unable to run on this computer and is disconnected on this computer (554). Specifically:
(a) The application is stopped completely on this computer.
(b) In the state table 135, the application's priority field in this computer's row is set to 9999 so that the application is not started on this computer again.
(c) The change to the state table is notified over the management communication paths to all other computers constituting the multiplex system.

[0079] Each computer that receives the state table change notification updates its state table accordingly. The computer that now has the highest execution priority for the stopped application in the updated state table starts the application and thereby takes over the processing.

[0080] Instead of judging the application unable to run on this computer as soon as an abnormality is detected, the application may be restarted on this computer until a fixed number of abnormalities have been detected. The frequency of abnormality detection, rather than the count, may also be used as the criterion for judging that the application cannot run.
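The count-based variant might look like the sketch below. The class name `RestartPolicy` and the threshold of three detections are assumptions; the text says only that a fixed number is used.

```python
class RestartPolicy:
    """Restart locally until a fixed number of abnormalities has been detected."""

    def __init__(self, max_failures=3):   # threshold value is an assumption
        self.max_failures = max_failures
        self.failures = 0

    def on_failure(self):
        self.failures += 1
        if self.failures < self.max_failures:
            return "restart"              # try again on this computer
        return "disconnect"               # fall back to disconnection (554)
```

The frequency-based variant could be obtained by counting only the detections that fall within a recent time window.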

[0081] Next, it checks the arrival of alive notifications from the other computers and determines whether any computer's notification has failed to arrive for a fixed time or longer (555). If there is no such computer, the multiplex system is operating normally and the state monitoring process ends. If there is a computer whose alive notification has not arrived, some failure has occurred, and the processing described below is performed.

[0082] The time needed to detect a failure of another computer is the sum of the interval at which alive notifications are issued (step 552), the maximum delay time configured for the management communication path, and the interval at which the arrival of alive notifications is checked (step 555). The execution interval of the state monitoring process 551 is therefore set sufficiently shorter than the required failure detection time, for example one third of that time.
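The timing relationship above can be checked with a short calculation. The function names and the numeric values are illustrative assumptions: with a required detection time of 3 seconds, setting the issue interval, the maximum delay, and the check interval to 1 second each gives a sum that just meets the requirement.

```python
def worst_case_detection(send_interval, max_delay, check_interval):
    """Sum of the three contributions named in the text, in seconds."""
    return send_interval + max_delay + check_interval

def overdue(last_seen, now, timeout):
    """Computers whose alive notification has been missing for `timeout` or longer.

    last_seen: computer -> time of the most recent alive notification
    """
    return sorted(host for host, t in last_seen.items() if now - t >= timeout)
```

`overdue` corresponds to the check in step 555; the `timeout` parameter stands for the fixed time mentioned in the text, whose value is not specified.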

【００８３】なお、アプリケーションの動作確認（処理
５５３）は、状態監視処理５５１を一定回数実行するご
とに一回だけ実行するなど、間欠で実行して、確認処理
の負荷を軽減することも可能である。The application operation check (process 553) may also be executed intermittently, for example only once every predetermined number of executions of the state monitoring process 551, to reduce the load of the check.
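
A minimal sketch of this intermittent execution follows. The class `StateMonitor`, the step names, and the default of five cycles are hypothetical; only the idea of running process 553 once every N executions of process 551 comes from the text.

```python
class StateMonitor:
    """Runs the cheap steps every cycle and the application check every Nth cycle."""

    def __init__(self, app_check_every=5):
        self.app_check_every = app_check_every  # hypothetical setting
        self.cycles = 0

    def run_cycle(self):
        """Return the names of the steps performed in this cycle of process 551."""
        self.cycles += 1
        steps = ["send_alive_notification"]          # process 552
        if self.cycles % self.app_check_every == 0:
            steps.append("check_applications")       # process 553, intermittent
        steps.append("check_alive_arrivals")         # process 555
        return steps
```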

【００８４】以下、障害発生時の処理手順について説明
する。まず、ゾーン管理サーバ２１１からＳＡＮ２０１
で定義されているゾーン定義情報２１３の内容を取得
し、当該計算機が含まれているか確認する（５５６）。
ゾーンに含まれていない場合、既に他の計算機で当該計
算機が異常であると判定されて多重系システムから切り
離し処理が終わっていることを意味している。また、ゾ
ーン情報の取得に失敗した場合は、当該計算機からＳＡ
Ｎ２０１へのアクセスに失敗していると考えられ、やは
り多重系システムから切り離されていることになる。こ
れらの場合、当該計算機上でのアプリケーションを終了
し（５５７）、当該計算機を停止する（５５８）。The processing procedure when a failure occurs is described below. First, the content of the zone definition information 213 defined for the SAN 201 is acquired from the zone management server 211, and it is checked whether the computer itself is included (556). If it is not included in the zone, this means that another computer has already judged this computer to be abnormal and has finished disconnecting it from the multiplex system. If acquisition of the zone information fails, access from this computer to the SAN 201 is considered to have failed, which likewise means that the computer is cut off from the multiplex system. In these cases, the applications on the computer are terminated (557) and the computer is stopped (558).
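
The decision made in step 556 can be sketched as follows. The function `check_own_zone`, the callback `fetch_zone_members`, and the use of `OSError` to represent a failed query are assumptions for illustration; the specification does not define the interface of the zone management server 211.

```python
def check_own_zone(fetch_zone_members, my_id):
    """Return 'continue' if this computer is still in the zone, otherwise 'shutdown'.

    fetch_zone_members is a hypothetical callable standing in for a query to the
    zone management server; it returns a set of member IDs or raises OSError.
    """
    try:
        members = fetch_zone_members()
    except OSError:
        # The zone information could not be obtained: SAN access has failed,
        # so this computer is treated as cut off (steps 557 and 558 follow).
        return "shutdown"
    if my_id not in members:
        # Another computer has already removed this one from the zone.
        return "shutdown"
    return "continue"
```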

【００８５】一方、当該計算機がゾーンに含まれている
場合は、生存監視が一定時間未着となっている送信元の
計算機で障害が発生したものと判定し、次のように、当
該障害計算機を多重系システムから切り離す処理を行
う。（ａ）当該障害計算機を、当該多重系システムとして定
義されたゾーンから外す（５５９）。（ｂ）当該障害計算機の誤動作を防止するため、当該障
害計算機に停止を要求する。具体的には、本計算機と当
該障害計算機の間に多重系ゾーンと異なるゾーンを定義
してから、停止処理要求を送信し、当該ゾーンを削除す
る（５６０）。当該障害計算機側では、ＯＳ１３１より
も優先して計算機を停止させることのできるプログラム
を用意するか、あるいは計算機上のＨＢＡ１１６が当該
要求を受信した際に自律的に計算機を停止させる機構を
備える方法が考えられる。（ｃ）状態表１３５から、当該障害計算機の行を削除す
る（５６１）。（ｄ）更新後の状態表１３５で、本計算機の実行優先順
位が最も高くなっているアプリケーションが停止中であ
れば起動し、結果的に処理を引き継ぐ（５６２）。
（ｅ）管理通信路を通じて、当該多重系システムを構成
する他の計算機全てに状態表の変更を通知する。ただ
し、二重系システムの場合には不要であり、図７にも記
述していない。On the other hand, if the computer is included in the zone, it judges that a failure has occurred in the source computer whose alive notification has not arrived for the predetermined time, and disconnects that faulty computer from the multiplex system as follows.
(a) Remove the faulty computer from the zone defined for the multiplex system (559).
(b) To prevent the faulty computer from malfunctioning, request it to stop. Specifically, define a zone, separate from the multiplex system zone, between this computer and the faulty computer, send a stop request, and then delete that zone (560). On the faulty computer side, conceivable methods are to prepare a program that can stop the computer with priority over the OS 131, or to provide a mechanism by which the HBA 116 on the computer autonomously stops the computer when it receives the request.
(c) Delete the row of the faulty computer from the state table 135 (561).
(d) In the updated state table 135, if an application for which this computer now has the highest execution priority is stopped, start it and thereby take over processing (562).
(e) Notify all the other computers constituting the multiplex system of the state table change through the management communication path. This step is unnecessary in a dual system and is therefore not shown in FIG. 7.

【００８６】なお、三重系以上の多重系システムの場合
には、（ａ）および（ｂ）の処理については、いずれか
一台が行えば良い。これは（ａ）の前に、当該障害計算
機がゾーンに含まれているかどうかをチェックし、既に
ゾーンから外されていれば、他のノードで（ａ）および
（ｂ）の処理が行われていると判断して、当該処理をス
キップするようにすれば良い。In a multiplex system with three or more computers, only one computer needs to perform steps (a) and (b). To achieve this, each computer checks, before step (a), whether the faulty computer is still included in the zone; if it has already been removed, the computer judges that another node has performed steps (a) and (b) and skips them.
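
The disconnection steps (a) to (e), together with the zone check used to skip (a) and (b), can be sketched as below. The data layout (a zone as a set, a state table as a dictionary of rows), the callbacks `send_stop` and `notify_peers`, and the convention that a smaller number means a higher execution priority are assumptions for illustration, not the format defined in the specification.

```python
def isolate_failed_computer(me, failed, zone, table, send_stop, notify_peers):
    """Disconnect `failed` and return the applications started on `me`."""
    started = []
    if failed in zone:
        zone.discard(failed)      # (a) remove from the multiplex system zone
        send_stop(failed)         # (b) stop request (temporary zone omitted here)
    # If the computer was already out of the zone, another node did (a) and (b).
    table.pop(failed, None)       # (c) delete the row of the faulty computer
    for app, entry in table[me]["apps"].items():
        # (d) find which remaining computer has the highest priority for this app
        best = min(table, key=lambda c: table[c]["apps"][app]["priority"])
        if best == me and not entry["running"]:
            entry["running"] = True
            started.append(app)
    peers = [c for c in table if c != me]
    if peers:                     # (e) nothing to notify in a dual system
        notify_peers(peers, failed)
    return started
```

In a dual system the surviving computer has no peers left, so step (e) sends nothing, matching the remark that the step is unnecessary there.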

【００８７】図８に、管理プログラム１３２の終了時の
処理フロー５７１を示す。管理プログラム１３２は、通
常、当該計算機の停止時に実行を終了する。従って、本
起動処理フローは、通常、当該計算機の停止時に実行さ
れる。FIG. 8 shows a processing flow 571 executed when the management program 132 ends. The management program 132 normally ends execution when the computer stops. Therefore, this processing flow is normally executed when the computer is stopped.

【００８８】当該計算機で実行中のアプリケーションを
全て終了し（５７２）、続いて当該管理プログラム１３
２が停止することを、管理通信路を通じて、他の計算機
に通知する（５７３）。この通知を受け取った他の計算
機上の管理プログラム１３２は、状態表１３５の停止す
る計算機の行にある、計算機稼働状態欄を「停止」と
し、各アプリケーションの実行状態の列１４５ａ、１４
５ｂを「停止」とする。そして更新後の状態表１３５
で、「稼働中」の計算機の中で当該計算機の実行優先順
位が最も高くなっているアプリケーションが停止中であ
れば起動して、結果的に処理を引き継ぐ。All applications running on the computer are terminated (572), and then the other computers are notified through the management communication path that the management program 132 is stopping (573). The management program 132 on each computer that receives this notification sets the computer operating state field in the row of the stopping computer in the state table 135 to "stopped", and sets the application execution state columns 145a and 145b to "stopped". Then, in the updated state table 135, if an application for which that computer now has the highest execution priority among the "operating" computers is stopped, the application is started and processing is thereby taken over.
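
How a receiving computer might handle the stop notification is sketched below. The dictionary layout of the state table, the status strings, and the rule that a smaller number means a higher priority are illustrative assumptions.

```python
def handle_stop_notice(table, me, stopping):
    """Update the state table for a stopping computer; return apps started on `me`."""
    row = table[stopping]
    row["status"] = "stopped"                 # computer operating state field
    for entry in row["apps"].values():
        entry["running"] = False              # execution state columns
    operating = [c for c in table if table[c]["status"] == "operating"]
    started = []
    for app, entry in table[me]["apps"].items():
        # Highest priority among computers that are still operating.
        best = min(operating, key=lambda c: table[c]["apps"][app]["priority"])
        if best == me and not entry["running"]:
            entry["running"] = True
            started.append(app)
    return started
```

The sketch assumes that `me` is itself marked "operating", so the list of operating computers is never empty.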

【００８９】続いて、停止する計算機の他に稼働中の計
算機があるかを状態表１３５から確認し（５７４）、他
に稼働中の計算機が無ければ、当該多重系システムが停
止するものと判断し、ゾーン管理サーバ２１１に依頼
し、当該多重系システムの構成に対応した多重系ゾーン
を削除する（５７５）。Subsequently, it is confirmed from the status table 135 whether there is an operating computer in addition to the computer to be stopped (574). If there is no other operating computer, it is determined that the multiplex system is to be stopped. Then, a request is made to the zone management server 211 to delete the multiplex system zone corresponding to the configuration of the multiplex system (575).

【００９０】以上が、本発明の第１の実施形態に係る多
重系システムの説明である。本実施例によれば、ネット
ワークにＳＡＮを用い、各計算機での構成制御処理の手
順を変更して計算機間リンクとして利用しているので、
多重系システムにおける各計算機を遠隔地など、任意の
位置に配置できる。また、ＳＡＮ上に仮想通信路を設定
しているので、管理通信路のリアルタイム性を保証でき
る。また、ゾーン設定情報を利用できるので、多重系シ
ステムを構成する計算機の管理が容易になる。The above is the description of the multiplex system according to the first embodiment of the present invention. According to this embodiment, a SAN is used as the network and, by changing the configuration control procedure in each computer, the SAN also serves as the inter-computer link, so each computer in the multiplex system can be placed at an arbitrary location, such as a remote site. Further, since a virtual communication path is set on the SAN, the real-time property of the management communication path can be guaranteed. Further, since the zone setting information can be used, management of the computers constituting the multiplex system becomes easy.

【００９１】次に、本発明の第２の実施形態を説明す
る。図９に、本発明の第２の実施形態に係る多重系シス
テムの構成を示す。ＳＡＮ２０１に対して、計算機１０
１〜１０５が接続されている。これらの計算機の内部構
造は、本発明の第１の実施形態の場合と同一である。ま
た、ＳＡＮ２０１、およびこれに接続されたゾーン管理
サーバ２１１、仮想通信路管理サーバ２２１の機能につ
いては、第１の実施形態の場合と同一である。Next, a second embodiment of the present invention will be described. FIG. 9 shows the configuration of a multiplex system according to the second embodiment. Computers 101 to 105 are connected to the SAN 201. The internal structure of these computers is the same as in the first embodiment. The functions of the SAN 201 and of the zone management server 211 and the virtual communication path management server 221 connected to it are also the same as in the first embodiment.

【００９２】第２の実施形態では、計算機１０１と計算
機１０２および共有ディスク装置３０１で多重系システ
ム１５１を構成し、また、計算機１０４と計算機１０５
および共有ディスク装置３０２で別な多重系システム１
５２を構成している。各々の多重系システムを構成する
計算機と共有ディスク装置の組合せに対して、第１の実
施形態で説明したのと同様に、多重系ゾーンが設定され
ている。In the second embodiment, the computer 101, the computer 102, and the shared disk device 301 constitute a multiplex system 151, and the computer 104, the computer 105, and the shared disk device 302 constitute another multiplex system 152. For the combination of computers and shared disk device constituting each multiplex system, a multiplex system zone is set in the same manner as described in the first embodiment.

【００９３】多重系システムに含まれる各計算機は、第
１の実施形態で説明した内部構造の他に、各多重系シス
テムに関するパラメータ表１３６ないし１３７を主記憶
装置１１２上に保持している。これらの各多重系システ
ムは、同一のＥｔｈｅｒｎｅｔ４０１に接続され、サー
ビス対象装置４０２ａないし４０２ｂと通信して、計算
処理機能を提供する。なお、第１の実施形態と同じく、
ネットワーク４０１はＥｔｈｅｒｎｅｔ以外のものでも
構わない。またサービス対象装置の数は任意であり、ま
た一つのサービス対象装置が複数の多重系システムと通
信しこれらの計算処理機能を同時に利用しても良い。Each computer included in a multiplex system holds, in addition to the internal structure described in the first embodiment, the parameter table 136 or 137 for its multiplex system in the main storage device 112. Both multiplex systems are connected to the same Ethernet 401 and communicate with the service target devices 402a and 402b to provide computation functions. As in the first embodiment, the network 401 may be something other than Ethernet. The number of service target devices is arbitrary, and one service target device may communicate with a plurality of multiplex systems and use their computation functions simultaneously.

【００９４】計算機１０３については、ＳＡＮ２０１な
らびにＥｔｈｅｒｎｅｔ４０１に接続されているが、図
９に示した時点では、いずれの多重系システムにも属し
ていない。この計算機１０３は、いずれかの多重系シス
テムに組み込まれることが可能である。組込は任意の時
点で可能であるが、一般に、多重系システムを構成して
いる計算機のいずれかに障害が発生しその代替として組
み込む場合、および多重系システムを構成する計算機の
いずれかを保守作業用に停止させるためその代替として
組み込む場合に行われる。The computer 103 is connected to the SAN 201 and the Ethernet 401, but at the time shown in FIG. 9 it does not belong to either multiplex system. The computer 103 can be incorporated into either multiplex system. Incorporation is possible at any time, but it is generally performed when one of the computers constituting a multiplex system fails and the computer 103 is incorporated as its replacement, or when one of the computers constituting a multiplex system is stopped for maintenance and the computer 103 is incorporated as its replacement.

【００９５】計算機１０３には待機管理プログラム１３
４がローディングされ実行されている。プログラム１３
４が実行されている間、他の計算機からの多重系システ
ムへの組み込み要求を受け付けることが可能である。A standby management program 134 is loaded and executed on the computer 103. While the program 134 is running, the computer can accept requests from other computers to incorporate it into a multiplex system.

【００９６】なお、多重系システムごとに共有ディスク
装置が使用されているが、単一の共有ディスク装置を複
数の多重系システムで共用する形態としても良い。この
場合は、一つの共有ディスク装置が複数の多重系ゾーン
の中に含まれることになる。Although a shared disk device is used for each multiplex system, a single shared disk device may be shared by a plurality of multiplex systems. In this case, one shared disk device is included in a plurality of multiplex system zones.

【００９７】以下、第１の実施形態と異なる部分につい
て、詳細に説明する。なお、動作説明は多重系システム
１５１の場合について行うが、多重系システム１５２に
ついても全く同様の動作を行う。Hereinafter, portions different from the first embodiment will be described in detail. The operation is described for the multiplex system 151, but the same operation is performed for the multiplex system 152.

【００９８】図１０に、多重系システムに関するパラメ
ータ表１３６ないし１３７の構造を示す。パラメータ表
には、当該多重系システムの構成要素となり得る計算機
と構成要素の共有ディスク装置、およびこれらに対する
ＳＡＮ２０１上でのアドレスの組合せが格納されてい
る。FIG. 10 shows the structure of the parameter tables 136 and 137 for the multiplex systems. Each parameter table stores the computers that can become components of the multiplex system, the shared disk device that is a component, and their corresponding addresses on the SAN 201.

【００９９】図１０に示すように、多重系システム１５
１に対応するパラメータ表１３６には計算機１〜３が格
納され、多重系システム１５２に対応するパラメータ表
１３７には計算機３〜５が格納されている。これによ
り、計算機３は、いずれの多重系システムの構成要素と
しても動作できるように定義されている。As shown in FIG. 10, the parameter table 136 corresponding to the multiplex system 151 stores computers 1 to 3, and the parameter table 137 corresponding to the multiplex system 152 stores computers 3 to 5. The computer 3 is thus defined so that it can operate as a component of either multiplex system.
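
The two parameter tables can be pictured with the following sketch. The dictionary layout and the function `systems_for` are hypothetical, and the SAN addresses that the tables also hold are omitted because their values are not given in the text.

```python
# Keys are the multiplex system numbers; computers are numbered 1 to 5 as in FIG. 10.
PARAMETER_TABLES = {
    151: {"computers": {1, 2, 3}, "shared_disk": 301},  # parameter table 136
    152: {"computers": {3, 4, 5}, "shared_disk": 302},  # parameter table 137
}


def systems_for(computer):
    """Return the multiplex systems that list `computer` as a possible component."""
    return sorted(system for system, entry in PARAMETER_TABLES.items()
                  if computer in entry["computers"])


print(systems_for(3))  # prints "[151, 152]"
```

Computer 3 appears in both tables, which is what allows it to serve as a spare for either system.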

【０１００】なお、ある多重系システムの構成要素とし
て組み込み済みの計算機を、他の多重系システムのパラ
メータ表に格納しても構わない。例えば、図９の状態で
の計算機４ないし計算機５を、パラメータ表１３６の中
に入れることも可能である。このように定義しておくこ
とにより、例えばある多重系システムでの処理の負荷が
高くなった場合に、既に他の多重系システムに組み込ま
れている計算機を一旦その多重系システムから削除し、
当該多重系システムに組み込むということが可能にな
る。Note that a computer already incorporated as a component of one multiplex system may also be stored in the parameter table of another multiplex system. For example, computer 4 or computer 5 in the state of FIG. 9 may be included in the parameter table 136. With such a definition, when the processing load on one multiplex system becomes high, for example, a computer already incorporated into another multiplex system can be removed from that system and then incorporated into the loaded system.

【０１０１】次に、多重系システムに計算機を追加する
場合の処理を説明する。図１１に、多重系システム１５
１に組み込まれている計算機上での管理プログラム１３
２の計算機追加指示処理の処理フロー６０１を示す。本
処理は、状態監視処理５５１で障害計算機を切り離した
後に代替の計算機を追加する目的で自動的に実行された
り、保守などの目的で構成を変更するためにオペレータ
や他のプログラムなどから要求されて実行される。Next, the process of adding a computer to a multiplex system will be described. FIG. 11 shows a processing flow 601 of the computer addition instruction process of the management program 132 on a computer incorporated in the multiplex system 151. This process is executed automatically to add a replacement computer after a faulty computer has been disconnected in the state monitoring process 551, or is executed on request from an operator or another program to change the configuration for maintenance or similar purposes.

【０１０２】まず、ゾーン管理サーバ２１１に要求し
て、ＳＡＮ２０１で設定されている全てのゾーン情報を
取得する（６０２）。次に、追加すべき計算機を選択す
る（６０３）。ここで追加すべき計算機は、多重系シス
テム１５１に関するパラメータ表１３６に存在し、か
つ、いずれのゾーンにも含まれていない計算機となる。First, a request is made to the zone management server 211 to obtain all zone information set in the SAN 201 (602). Next, a computer to be added is selected (603). The computer to be added here is a computer that exists in the parameter table 136 for the multiplex system 151 and is not included in any zone.

【０１０３】このようにして選択した計算機を、当該多
重系システムに対応する多重系ゾーンに組み入れ（６０
５）、追加する計算機で動作している待機管理プログラ
ム１３４に対して当該ノードへの参加を要求する（６０
６）。要求を受けた待機管理プログラム１３４は、当該
計算機上で管理プログラム１３２を起動し、要求の成功
を返す。成功が返ってきた場合、要求元では追加処理を
終了する（６０７）。The computer selected in this way is added to the multiplex system zone corresponding to the multiplex system (605), and the standby management program 134 running on the computer to be added is requested to join the multiplex system (606). On receiving the request, the standby management program 134 starts the management program 132 on that computer and returns success. If success is returned, the requester ends the addition process (607).

【０１０４】一方、失敗が返ってきたり応答が無い場合
は、当該計算機が停止している等の状態と考えられるの
で、当該計算機の追加を中止し、当該計算機を多重系ゾ
ーンから外し（６０８）、他に追加する計算機がないか
再度選択する（６０３）。選択すべき計算機が無い場合
は、追加処理に失敗する（６０４）。On the other hand, if failure is returned or there is no response, the computer is presumed to be stopped or in a similar state, so its addition is abandoned, the computer is removed from the multiplex system zone (608), and the selection of a computer to add is repeated (603). If there is no computer left to select, the addition process ends in failure (604).
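
The addition flow 601 (steps 602 to 608) can be sketched as a retry loop. Zones are modeled as Python sets and `request_join` is a hypothetical callback standing in for the request to the standby management program 134. The `tried` set is an added assumption so that a computer that has already failed to respond is not selected again; the text does not say how this is avoided.

```python
def add_computer(candidates, all_zones, system_zone, request_join):
    """Try to add one computer to `system_zone`; return its ID, or None on failure."""
    tried = set()
    while True:
        zoned = set().union(*all_zones)            # 602: collect all zone members
        free = [c for c in candidates              # 603: in the parameter table,
                if c not in zoned and c not in tried]  # but in no zone
        if not free:
            return None                            # 604: nothing left to select
        computer = free[0]
        system_zone.add(computer)                  # 605: add to the zone
        if request_join(computer):                 # 606: ask it to join
            return computer                        # 607: success
        system_zone.discard(computer)              # 608: remove and try another
        tried.add(computer)
```

Here `all_zones` is expected to contain `system_zone` itself, so a computer added in step 605 counts as zoned until it is removed in step 608.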

【０１０５】図１２に、多重系システムに組み込まれて
いない計算機１０３上の待機管理プログラム１３４の処
理フロー６１１を示す。まずＳＡＮ２０１経由で、多重
系への追加要求が送られるのを待機する（６１２）。追
加要求が来た場合、要求元から当該多重系システムに関
するパラメータ表１３６をコピーする（６１３）。続い
て、当該多重系システムを構成している、すなわち多重
系ゾーンに含まれている各計算機で動作している管理プ
ログラム１３２に対して、その状態表１３５に当該計算
機を追加するよう要求する（６１４）。FIG. 12 shows a processing flow 611 of the standby management program 134 on the computer 103 which is not incorporated in the multiplex system. First, it waits for a request for addition to the multiplex system to be sent via the SAN 201 (612). When an addition request is received, the parameter table 136 relating to the multiplex system is copied from the request source (613). Subsequently, a request is made to the management program 132 constituting the multiplex system, that is, operating on each computer included in the multiplex system zone, to add the computer to the status table 135 ( 614).

【０１０６】この要求を受けた各計算機では、状態表１
３５に当該計算機を追加した上、計算機稼働状態を「停
止」、各アプリケーションの動作状態を「停止」に設定
する。次に、当該多重系システムで実行すべきアプリケ
ーション１３３をローカルディスク１１５から主記憶１
１２上へローディングする（６１５）。続いて、多重系
システムの管理プログラム１３２を起動する（６１
６）。Each computer that receives this request adds the computer to its state table 135 and sets the computer operating state to "stopped" and the operation state of each application to "stopped". Next, the application 133 to be executed in the multiplex system is loaded from the local disk 115 into the main storage 112 (615). Then the management program 132 for the multiplex system is started (616).

【０１０７】このプログラムの起動処理５２１によっ
て、当該多重系システムの各計算機の状態表１３５で当
該計算機が「稼働」状態に更新され、新たに本計算機が
多重系システムの構成要素として正しく動作を開始す
る。最後に多重系システムへの追加要求発行元に、追加
処理が正常に終了したことを通知し（６１７）、終了す
る。Through the startup process 521 of this program, the computer is updated to the "operating" state in the state table 135 of each computer in the multiplex system, and the computer begins to operate correctly as a new component of the multiplex system. Finally, the issuer of the addition request is notified that the addition process has completed normally (617), and the process ends.

【０１０８】なお、本処理においては、多重系システム
で実行すべきアプリケーション１３３について、予めそ
の種類を知っており、かつ当該プログラムがローカルデ
ィスク１１５上に存在するものとして説明したが、管理
プログラム１３２の起動処理５２１の中で、他の計算機
からアプリケーションの情報ならびにプログラムをコピ
ーしてもよい。すなわち、処理５３２において、コピー
する状態表の情報の一部としてアプリケーションの種類
などの情報を渡してもらい、この状態表を参照した上
で、プログラム本体を状態表のコピー元からコピーすれ
ばよい。In the above description, the type of the application 133 to be executed in the multiplex system is assumed to be known in advance and the program is assumed to exist on the local disk 115. However, the application information and the program may instead be copied from another computer during the startup process 521 of the management program 132. That is, in process 532, information such as the application type is passed as part of the state table being copied, and, after referring to this state table, the program body is copied from the copy source of the state table.

【０１０９】ＳＡＮ２０１には、ゾーン構成が変更され
たときに、変更が発生したことを接続されているノード
に通知する機能を備えている場合がある。この場合は、
追加される計算機の待機管理プログラム１３４でこの通
知を検出することで、多重系システムへの参加要求の発
行（処理６０６）に代えることが可能である。The SAN 201 may have a function of notifying connected nodes that a change has occurred when the zone configuration is changed. In this case, detection of this notification by the standby management program 134 of the computer being added can substitute for issuing the request to join the multiplex system (process 606).

【０１１０】図１３に、多重系システム１５１に組み込
まれている計算機上での管理プログラム１３２の計算機
削除処理の処理フロー６２１を示す。本処理は、保守な
どのために構成を変更するために他のプログラムなどか
ら要求されて実行される。FIG. 13 shows a processing flow 621 of a computer deletion process of the management program 132 on a computer incorporated in the multiplex system 151. This processing is requested and executed by another program to change the configuration for maintenance or the like.

【０１１１】まず、多重系システムから削除する計算機
で動作している管理プログラム１３２に対して、動作の
停止を要求する（６２２）。この際、対象の計算機が停
止しているなどの理由で、管理プログラム１３２が要求
を正しく処理しない可能性があるが、無視して処理を継
続する。そして削除する計算機を、当該多重系システム
に対応する多重系ゾーンから削除する（６２３）。First, the management program 132 running on the computer to be deleted from the multiplex system is requested to stop operating (622). The management program 132 may fail to process the request correctly, for example because the target computer is already stopped, but this is ignored and processing continues. The computer to be deleted is then removed from the multiplex system zone corresponding to the multiplex system (623).

【０１１２】以上が、本発明の第２の実施形態に係る多
重系システムの説明である。本実施例によれば、多重系
システムの構成要素をＳＡＮのゾーン、すなわち通信を
許可するグループとして定義し、ゾーン外からの通信を
中継しないように設定できるので、ある多重系システム
に対し、ＳＡＮ上の他の多重系システムまたは他の装置
からの通信データの中継を禁止することができる。これ
により、一つの多重系システムの動作が他の多重系シス
テムに影響を与えないようにすることができる。なお、
多重系システムの複数の計算機のいずれかに障害が発生
した場合に、障害計算機からの通信データをＳＡＮ上で
中継することも禁止できる。The above is the description of the multiplex system according to the second embodiment of the present invention. According to this embodiment, the components of a multiplex system are defined as a SAN zone, that is, a group within which communication is permitted, and the SAN can be set not to relay communication from outside the zone. Relaying of communication data to a given multiplex system from other multiplex systems or other devices on the SAN can therefore be prohibited. As a result, the operation of one multiplex system can be prevented from affecting other multiplex systems. In addition, when a failure occurs in one of the computers of a multiplex system, relaying of communication data from the faulty computer on the SAN can also be prohibited.

【０１１３】また、多重系システムを構成する計算機な
らびに共通のバックアップ計算機の稼働状態に関する情
報を、ＳＡＮのゾーン設定から取得するようにしたの
で、各計算機ごとに全計算機の状態を管理する必要がな
く、必要となった時点でゾーン設定情報を利用するだけ
で済む。Further, since information on the operating states of the computers constituting the multiplex system and the common backup computer is obtained from the SAN zone setting, it is not necessary to manage the state of all the computers for each computer. It is only necessary to use the zone setting information when needed.

【０１１４】次に、本発明の第３の実施形態を説明す
る。図１４に、第３の実施形態に係る多重系システムの
構成を示す。Next, a third embodiment of the present invention will be described. FIG. 14 shows a configuration of a multiplex system according to the third embodiment.

【０１１５】本実施形態では、計算機１０１、１０２の
主記憶装置上では副管理プログラム１３９がローディン
グされ実行されている。副管理プログラム１３９は、第
１の実施形態で説明した管理プログラム１３２のうち当
該計算機の停止処理および当該計算機上で動作するアプ
リケーション１３３の起動・停止・動作確認処理と、後
述する主管理プログラム１３８との間の通信機能のみを
持つ。計算機上の主記憶装置１１２上には状態表１３５
は存在しない。各計算機上のその他の内部構造について
は、上述した第１の実施形態の場合と同一である。In this embodiment, a sub-management program 139 is loaded and executed in the main storage devices of the computers 101 and 102. Of the functions of the management program 132 described in the first embodiment, the sub-management program 139 has only the process of stopping the computer, the processes of starting, stopping, and checking the operation of the application 133 running on the computer, and the function of communicating with the main management program 138 described later. The state table 135 does not exist in the main storage device 112 of the computer. The rest of the internal structure of each computer is the same as in the first embodiment described above.

【０１１６】ＳＡＮ２０１およびこれに接続されたゾー
ン管理サーバ２１１、仮想通信路管理サーバ２２１の機
能については、第１の実施形態の場合と同一である。ま
た、Ｅｔｈｅｒｎｅｔ４０１経由でサービス対象装置４
０２と通信して、計算処理機能を提供する点についても
本発明の第１の実施形態の場合と同一である。The functions of the SAN 201 and of the zone management server 211 and the virtual communication path management server 221 connected to it are the same as in the first embodiment. The system also communicates with the service target device 402 via the Ethernet 401 to provide computation functions, as in the first embodiment.

【０１１７】計算機１０１、１０２から接続される共有
ディスク装置３０１には、ＳＡＮ２０１接続のためのＨ
ＢＡ３１４とディスク装置３１５が搭載されている。Ｈ
ＢＡ３１４はＳＡＮ経由のＳＣＳＩプロトコルの通信デ
ータを解釈し、ディスク装置３１５に送る。The shared disk device 301 connected from the computers 101 and 102 is equipped with an HBA 314 for connection to the SAN 201 and a disk device 315. The HBA 314 interprets SCSI protocol communication data received via the SAN and passes it to the disk device 315.

【０１１８】共有ディスク装置３０１には、小型演算処
理装置３１１（以下、「ＭＰＵ」と呼ぶ）、主記憶装置
３１２、記録装置３１３も備え、これらの構成要素およ
びＨＢＡ３１４は、内部バス３１０によって互いに接続
されている。ＭＰＵ３１１上では、通常ＨＢＡ３１４の
動作を指示するなど、共有ディスク装置３０１に関する
管理処理を行っている。記録装置３１３は動作パラメー
タなど、共有ディスク装置３０１への電源供給が切れた
場合に、記録しておきたいデータを格納する場所であ
り、バッテリバックアップしたＳＲＡＭなどが使われ
る。ここまでの共有ディスク装置３０１の内部構成およ
び動作は、一般的な説明であり、第１の実施形態の場合
でも同じである。The shared disk device 301 also includes a small processing unit 311 (hereinafter "MPU"), a main storage device 312, and a recording device 313; these components and the HBA 314 are connected to one another by an internal bus 310. The MPU 311 normally performs management processing for the shared disk device 301, such as directing the operation of the HBA 314. The recording device 313 stores data, such as operating parameters, that should be retained when the power supply to the shared disk device 301 is cut off; battery-backed SRAM or the like is used for it. The internal configuration and operation of the shared disk device 301 described so far are general and apply equally to the first embodiment.

【０１１９】本実施形態では、さらに共有ディスク装置
３０１上のＭＰＵ３１１において、主記憶装置３１２に
ローディングされた主管理プログラム１３８を実行する
ことが特徴である。主管理プログラム１３８は、多重系
システム上の各計算機の稼働状態やその上のアプリケー
ションの動作状態を管理する機能を有する。第１の実施
形態で計算機の主記憶装置１１２上にあった状態表１３
５は、共有ディスク装置上の主記憶装置３１２上に配置
される。This embodiment is further characterized in that the MPU 311 of the shared disk device 301 executes the main management program 138 loaded into the main storage device 312. The main management program 138 has the function of managing the operating state of each computer in the multiplex system and the operation state of the applications on them. The state table 135, which resided in the main storage device 112 of each computer in the first embodiment, is placed in the main storage device 312 of the shared disk device.

【０１２０】このような構成とすることで、多重系シス
テム上の各計算機の稼働状態やその上のアプリケーショ
ンの動作状態の管理が、共有ディスク装置３０１一カ所
に集中して実行できるため、第１の実施形態のように複
数の計算機上の管理プログラム１３２の間で通信を行
い、計算機稼働状態の確認や、状態表の計算機情報やア
プリケーション情報の更新を行ったりする必要が無くな
る。これにより構成制御の手順が簡素化されて確実性が
向上する。また、構成情報の変更手順も、共有ディスク
装置３０１上の管理表１３５に反映するだけになるので
簡素化される。With such a configuration, the management of the operating state of each computer in the multiplex system and the operating state of the application thereon can be performed centrally in one shared disk device 301. Communication between the management programs 132 on a plurality of computers as in the embodiment described above eliminates the need to check the computer operation status and update the computer information and application information in the status table. This simplifies the configuration control procedure and improves reliability. Further, the procedure for changing the configuration information is also simplified because it is only reflected in the management table 135 on the shared disk device 301.

【０１２１】以下、第１の実施形態と異なる部分につい
て、詳細に説明する。本多重系システムでは、まず最初
に共有ディスク装置３０１が起動され、主管理プログラ
ム１３８が起動する。主管理プログラム１３８は、起動
時にゾーン管理サーバ２１１に対して、計算機１０１、
１０２、および共有ディスク装置３０１を含む多重系ゾ
ーンの設定を要求する。次に、状態表１３５の各計算機
の稼働状態欄１４３に「停止」と記録し、また各アプリ
ケーションの実行状態欄１４５にも「停止」と記録す
る。The parts that differ from the first embodiment are described in detail below. In this multiplex system, the shared disk device 301 is started first, and the main management program 138 starts. On startup, the main management program 138 requests the zone management server 211 to set a multiplex system zone containing the computers 101 and 102 and the shared disk device 301. Next, it records "stopped" in the operating state column 143 of each computer in the state table 135, and also records "stopped" in the execution state column 145 of each application.

【０１２２】図１５に、主管理プログラム１３８の計算
機起動時処理の処理フロー７０１を示す。多重系システ
ムの計算機が起動した場合、当該計算機上の副管理プロ
グラム１３９が起動し、主管理プログラム１３８に対し
て当該計算機の起動を通知する。この通知を受けて処理
７０１が実行される。FIG. 15 shows a processing flow 701 of the computer startup processing of the main management program 138. When the computer of the multiplex system starts, the sub-management program 139 on the computer starts and notifies the main management program 138 of the start of the computer. Upon receiving this notification, the process 701 is executed.

【０１２３】計算機起動時には、仮想通信路管理サーバ
２２１に要求して、当該起動計算機と共有ディスク装置
３０１間に仮想通信路を設定する（７０２）。以後、当
該計算機が共有ディスク装置３０１に対してデータを読
み書きする場合、ここで設定した仮想通信路を使用して
実行する。When a computer starts, a request is made to the virtual communication path management server 221 to set a virtual communication path between the started computer and the shared disk device 301 (702). Thereafter, the computer uses this virtual communication path whenever it reads data from or writes data to the shared disk device 301.

【０１２４】次に、仮想通信路管理サーバ２２１に要求
して、当該起動計算機と主管理プログラム１３８が動作
している装置（すなわち共有ディスク装置３０１）との
間に、仮想通信路を設定する（７０３）。以後、主管理
プログラム１３８と当該起動計算機の副管理プログラム
１３９との間の通信は、ここで設定した仮想通信路（管
理通信路）を使用して実行する。なお、仮想通信路を設
定する際に指定する通信帯域幅や最大遅延時間などのパ
ラメータの選定基準については、第１の実施形態と同一
である。Next, a request is made to the virtual communication path management server 221 to set a virtual communication path between the boot computer and the device on which the main management program 138 is running (ie, the shared disk device 301) ( 703). After that, communication between the main management program 138 and the sub-management program 139 of the boot computer is executed using the virtual communication path (management communication path) set here. Note that the selection criteria for parameters such as the communication bandwidth and the maximum delay time specified when setting the virtual communication path are the same as in the first embodiment.

【０１２５】そして状態表１３５の中の当該起動計算機
の稼働状態に「稼働」と記録する（７０４）。続いて状
態表１３５を調べ、アプリケーションの中で、当該アプ
リケーションの実行優先順位が今起動した計算機よりも
低く、すなわち状態表の実行優先順位１４４の数値が今
起動した計算機よりも大きく設定されている計算機上
で、実行されているものを探す。該当するアプリケーシ
ョンがあれば、今起動した計算機上で実行するように、
副管理プログラムに対して引き継ぎ処理を指示する（７
０５）。実際の引継処理指示は、次のように行う。（ａ）主管理プログラム１３８が、引継元計算機（優先
順位の低い方）の副管理プログラム１３９に稼働中のア
プリケーションを停止するよう要求する。（ｂ）引継元計算機の副管理プログラム１３９は当該ア
プリケーションを停止し、停止したことを主管理プログ
ラム１３８に通知する。（ｃ）主管理プログラム１３８が、引継先計算機（優先
順位の高い方）の副管理プログラム１３９に、当該アプ
リケーションを起動するよう要求する。（ｄ）引継先計算機の副管理プログラム１３９は、当該
アプリケーションを起動し、起動したことを主管理プロ
グラム１３８に通知する。（ｅ）主管理プログラムは、状態表１３５のアプリケー
ションの実行状態に関する欄に新しい実行状態を反映す
る。Then "operating" is recorded as the operating state of the started computer in the state table 135 (704). Next, the state table 135 is examined to find applications that are running on a computer whose execution priority for that application is lower than that of the newly started computer, that is, a computer whose value of execution priority 144 in the state table is larger than that of the newly started computer. If there is such an application, the sub-management programs are instructed to perform takeover processing so that the application runs on the newly started computer (705). The takeover is carried out as follows.
(a) The main management program 138 requests the sub-management program 139 of the takeover source computer (the one with lower priority) to stop the running application.
(b) The sub-management program 139 of the takeover source computer stops the application and notifies the main management program 138 that it has stopped.
(c) The main management program 138 requests the sub-management program 139 of the takeover destination computer (the one with higher priority) to start the application.
(d) The sub-management program 139 of the takeover destination computer starts the application and notifies the main management program 138 that it has started.
(e) The main management program reflects the new execution states in the application execution state columns of the state table 135.
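
The takeover sequence (a) to (e) performed by the main management program 138 can be sketched as follows. The state table layout and the callbacks `stop_app` and `start_app`, which stand in for requests to the sub-management programs 139 and are assumed to return once the sub-management program has reported completion, are hypothetical.

```python
def takeover_on_startup(table, started, stop_app, start_app):
    """Move applications to the newly started computer where it has higher priority."""
    table[started]["status"] = "operating"                 # step 704
    moved = []
    for app, mine in table[started]["apps"].items():
        for other, row in table.items():
            theirs = row["apps"].get(app)
            if other == started or theirs is None:
                continue
            # A larger number means a lower execution priority.
            if theirs["running"] and theirs["priority"] > mine["priority"]:
                stop_app(other, app)                       # (a), (b)
                start_app(started, app)                    # (c), (d)
                theirs["running"] = False                  # (e) update the table
                mine["running"] = True
                moved.append((app, other, started))
    return moved
```

Stopping the application on the source before starting it on the destination follows the order of steps (a) to (d), so the application never runs on both computers at once.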

【０１２６】図１６に、主管理プログラム１３８の状態
監視処理の処理フロー７１１を示す。この処理は、主管
理プログラム１３８の正常起動後、一定時間間隔で実行
される。処理は状態表で稼働状態となっている計算機に
対して順次行う。FIG. 16 shows a processing flow 711 of the state monitoring processing of the main management program 138. This process is executed at regular time intervals after the main management program 138 is normally started. The processing is sequentially performed on the computers that are in the operating state in the state table.

【０１２７】まず、各計算機に対して、アプリケーショ
ンの動作状態の報告を要求する（７１２）。各計算機上
では、第１の実施形態で説明したのと同じ方法でアプリ
ケーションの動作状態を調べて、結果を返す。返ってき
た報告から、稼働表１３５で当該計算機上で「稼働」状
態となっている各アプリケーションが正常に動作してい
るかを確認する（７１４）。First, each computer is requested to report the operation state of its applications (712). Each computer checks the operation state of its applications by the same method as described in the first embodiment and returns the result. From the returned report, it is checked whether each application recorded as "operating" on that computer in the operation table 135 is operating normally (714).

【０１２８】動作異常を検出した場合は、当該アプリケ
ーションは当該計算機上で動作できないものと判断し、
次の切り離し処理（７１５）を行う。（ａ）当該計算機上の副管理プログラム１３９に対し
て、当該アプリケーションを完全に停止するよう指示す
る。（ｂ）状態表１３５の当該計算機行当該アプリケーショ
ン優先順位列の内容を９９９９に設定して、以後、当該
計算機上でアプリケーションが起動されないようにす
る。When an operation abnormality is detected, the application is judged unable to run on that computer, and the following disconnection process (715) is performed.
(a) Instruct the sub-management program 139 on the computer to stop the application completely.
(b) Set the priority entry for that application in that computer's row of the state table 135 to 9999, so that the application is not started on the computer thereafter.

[0129] Next, the updated state table 135 is consulted and, if there is an application that should be started, the sub-management program 139 of the relevant computer is instructed to start it (716). As a result, the application whose operation became abnormal is taken over by another computer.

[0130] As described in the first embodiment, instead of judging the application to be inoperable on the computer as soon as an abnormal operation is detected, the process may instruct the computer to restart the application until abnormal operation has been detected a certain number of times. Alternatively, the frequency of abnormality detection, rather than the number of abnormalities, may be used as the criterion for deciding whether the application is inoperable.
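
Both criteria mentioned above, a fixed count of abnormalities or a detection frequency, can be expressed with one small helper. The sketch below is illustrative only: the `FailureJudge` class, its parameters, and the default threshold of three are assumptions, and "frequency" is interpreted here as a number of detections within a sliding time window.

```python
from collections import deque
from typing import Optional


class FailureJudge:
    """Decide between restarting an application and declaring it inoperable."""

    def __init__(self, max_failures: int = 3, window_seconds: Optional[float] = None):
        self.max_failures = max_failures
        self.window_seconds = window_seconds  # None: count every abnormality ever seen
        self.times = deque()

    def record(self, now: float) -> str:
        """Register one abnormality detected at time `now` and return the action."""
        self.times.append(now)
        if self.window_seconds is not None:
            # Frequency criterion: forget detections older than the window.
            while now - self.times[0] > self.window_seconds:
                self.times.popleft()
        if len(self.times) >= self.max_failures:
            return "detach"   # treat the application as inoperable on this computer
        return "restart"      # ask the same computer to restart the application
```

With `window_seconds=None` the helper implements the count criterion; with a window it tolerates isolated abnormalities and detaches only when they cluster.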

[0131] For a computer that returned no response to the request for an application operating state report in process 713, some failure is deemed to have occurred, so the following process is performed to disconnect the computer from the multiplex system. (a) The faulty computer is removed from the zone defined for the multiplex system (717). (b) To prevent the faulty computer from malfunctioning, it is requested to stop. Specifically, a zone separate from the multiplex system zone is defined between the faulty computer and the device on which the main management program 138 is running (that is, the shared disk device 301), a stop request is transmitted, and the zone is then deleted (718). The processing on the faulty computer side is the same as process 560 of the first embodiment. (c) The row of the faulty computer is deleted from the state table 135 (719).
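
The isolation sequence (717) to (719) can be modelled with zones as named sets of members. This is a simplified sketch under stated assumptions: the `isolate_failed_computer` function, the zone names, and the `send_stop` callback are hypothetical, and real zone changes would go through the zone management server rather than a local dictionary.

```python
def isolate_failed_computer(zones, state_table, failed, manager, send_stop,
                            system_zone="multiplex"):
    """Cut a failed computer off from the multiplex system.

    zones maps a zone name to the set of its members (assumed model of SAN zoning).
    manager is the device running the main management program 138.
    send_stop(computer) stands in for transmitting the stop request.
    """
    # (a) 717: remove the failed computer from the multiplex system zone.
    zones[system_zone].discard(failed)

    # (b) 718: open a temporary zone just for the stop request, then delete it.
    temp_zone = f"stop-{failed}"
    zones[temp_zone] = {failed, manager}
    try:
        send_stop(failed)
    finally:
        del zones[temp_zone]

    # (c) 719: drop the failed computer's row from the state table.
    state_table.pop(failed, None)
```

Removing the computer from the shared zone before sending the stop request means it can no longer reach the other computers or the shared data while it is being shut down.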

[0132] Process 716 is then executed, so that the processing of the faulty computer is taken over as a result.

[0133] FIG. 17 shows the processing flow 731 of the computer stop process of the main management program 138. When a computer of the multiplex system stops, the stop process of the sub-management program 139 on that computer is executed; it stops all applications running on the computer and then notifies the main management program 138 that the computer is stopping. Process 731 is executed on receipt of this notification.

[0134] The main management program 138 sets the computer operating state column 143 in the row of the stopping computer in the state table 135 to "stopped", and sets the execution state columns 145a and 145b of each application to "stopped" (732). Then, using the updated state table 135, any application that is stopped and for which one of the "operating" computers now has the highest execution priority is started on that computer, so that the processing is taken over as a result (733).
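
Steps 732 and 733 can be sketched as below. The nested dictionary layout of the state table and the `on_computer_stopped` function are assumptions, as is the convention that a smaller number means a higher priority (suggested, but not stated, by the use of 9999 to bar an application). The sketch also assumes every computer row lists the same applications, and it marks the application as operating directly instead of sending a start instruction to a sub-management program 139.

```python
NEVER_START = 9999  # priority value that bars an application from a computer


def on_computer_stopped(table, stopped):
    """Mark `stopped` and its applications as stopped (732), then restart orphans (733)."""
    row = table[stopped]
    row["state"] = "stopped"
    for entry in row["apps"].values():
        entry["exec"] = "stopped"

    started = []
    operating = {name: r for name, r in table.items() if r["state"] == "operating"}
    for app in row["apps"]:
        if any(r["apps"][app]["exec"] == "operating" for r in operating.values()):
            continue  # still running on another computer
        candidates = [
            (r["apps"][app]["priority"], name)
            for name, r in operating.items()
            if r["apps"][app]["priority"] < NEVER_START
        ]
        if candidates:
            _, host = min(candidates)  # assumed: smaller number = higher priority
            table[host]["apps"][app]["exec"] = "operating"
            started.append((host, app))
    return started
```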

[0135] Instead of being executed by the MPU 311 on the shared disk device 301, the main management program 138 can also be executed by a CPU on the zone management server 211 or the virtual communication path management server 221, or by an MPU on a switch 231 constituting the SAN 201.

[0136] These devices generally carry a CPU or MPU for realizing their original functions, with a main storage device connected as well. Since they are also connected to the SAN 201, they can communicate with the computers 101 and 102 and with the shared disk device 301, so the operation of the main management program 138 described above can be executed unchanged. In this case, however, the device itself must also be included when the multiplex system zone is defined, so that it can communicate with the computers 101 and 102 and the shared disk device 301.

[0137] The above is the description of the multiplex system according to the third embodiment of the present invention. According to this embodiment, the shared storage device connected to the SAN holds the information on configuration control of the multiplex system internally and executes part or all of the configuration control processing of the multiplex system, so the procedures for holding the configuration control information and for executing configuration control can be simplified.

[0138]

[Effects of the Invention]

According to the present invention, the SAN can also serve as the inter-computer link, so the computers constituting the multiplex system can be installed at arbitrary locations. Even in this case, management communication retains real-time performance equivalent to that obtained with a dedicated physical communication path, so the state determination for configuration control can be made without delay and stalls in computation can be minimized.

[0139] Further, according to the present invention, the components of the multiplex system are set up as a SAN zone, so they are not affected by the communication of other multiplex systems using the SAN, which prevents malfunction of configuration control.

[0140] Further, according to the present invention, information on the components of a plurality of multiplex systems and on backup computers need only be obtained from the zone setting information when it becomes necessary, which simplifies the management of configuration information on each computer.

[0141] Further, according to the present invention, the holding of configuration control information for the multiplex system, or the configuration control processing of the multiplex system, can be carried out on a device that is already redundant, separately from the computers themselves, which have lower availability. Reliable configuration control is therefore possible without incurring the cost of new redundancy.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of a multiplex system according to a first embodiment of the present invention.

FIG. 2 is a block diagram showing the configuration of the storage area network (SAN) on which the multiplex system is premised.

FIG. 3 is a configuration diagram of communication data used in the SAN.

FIG. 4 is a processing flow diagram for when a switch constituting the SAN receives communication data from a node.

FIG. 5 is a structural diagram showing the contents managed in the state table.

FIG. 6 is a processing flow diagram of the management program at startup.

FIG. 7 is a processing flow diagram of the state monitoring process of the management program.

FIG. 8 is a processing flow diagram of the management program at termination.

FIG. 9 is a block diagram showing the configuration of a multiplex system according to a second embodiment.

FIG. 10 is a structural diagram of the parameter table for the multiplex system.

FIG. 11 is a processing flow diagram of the computer addition instruction process of the management program on a computer incorporated in the multiplex system.

FIG. 12 is a processing flow diagram of the standby management program on a computer not incorporated in the multiplex system.

FIG. 13 is a processing flow diagram of the computer deletion process of the management program on a computer incorporated in the multiplex system.

FIG. 14 is a block diagram showing the configuration of a multiplex system according to a third embodiment.

FIG. 15 is a processing flow diagram of the computer startup process of the main management program.

FIG. 16 is a processing flow diagram of the state monitoring process of the main management program.

FIG. 17 is a processing flow diagram of the computer stop process of the main management program.

FIG. 18 is a block diagram showing the configuration of a multiplex system according to the prior art.

[Explanation of symbols]

101, 102, 103, 104, 105: computer; 111: central processing unit (CPU); 116, 314: host bus adapter (HBA); 132: management program; 133: application; 135: state table; 201: storage area network (SAN); 211: zone management server; 221: virtual communication path management server; 231: switch; 301, 302: shared disk device; 311: small arithmetic processing unit (MPU); 401: Ethernet; 402, 402a, 402b: service target device.

Continued from the front page

(72) Inventor: Takao Nouchi, c/o Information Control Systems Division, Hitachi, Ltd., 5-2-1 Omika-cho, Hitachi City, Ibaraki Prefecture
(72) Inventor: Hiromichi Endo, c/o Hitachi Research Laboratory, Hitachi, Ltd., 7-1-1 Omika-cho, Hitachi City, Ibaraki Prefecture
(72) Inventor: Hidemitsu Naya, c/o Hitachi Research Laboratory, Hitachi, Ltd., 7-1-1 Omika-cho, Hitachi City, Ibaraki Prefecture
(72) Inventor: Masahiko Saito, c/o Hitachi Research Laboratory, Hitachi, Ltd., 7-1-1 Omika-cho, Hitachi City, Ibaraki Prefecture
(72) Inventor: Tetsuaki Nakamikawa, c/o Hitachi Research Laboratory, Hitachi, Ltd., 7-1-1 Omika-cho, Hitachi City, Ibaraki Prefecture

F-terms (reference): 5B034 AA04 BB17 CC01 DD02; 5B042 GA11 GA34 JJ04; 5B045 BB29 BB47 BB49 JJ08 JJ13

Claims

[Claims]

1. A multiplex system comprising a plurality of computers and a shared external storage device that is connected to the plurality of computers via a network and shared by the plurality of computers, in which processing performed by any one of the plurality of computers is taken over by another of the computers, or a part of the processing is shared with and executed by another computer, wherein management communication, performed between the plurality of computers to confirm or instruct a change of the operating state of a computer, or to confirm or change the execution state of processing executed by the plurality of computers, is carried out on the network connecting the plurality of computers and the shared external storage device.

2. The multiplex system according to claim 1, wherein the network is a storage area network having a plurality of switches with ports and inter-switch links between the switches.

3. The multiplex system according to claim 1, further comprising network management means for setting the network so that the management communication performed between the plurality of computers is carried out in preference to the communication between the plurality of computers and the shared external storage device.

4. The multiplex system according to claim 1, further comprising virtual communication path management means for setting a plurality of virtual communication paths on the network, wherein the communication between the plurality of computers and the shared external storage device and the management communication performed between the plurality of computers are carried out over different virtual communication paths.

5. The multiplex system according to claim 4, wherein the virtual communication path management means sets the virtual communication path on the network so as to guarantee a maximum communication delay time or a communication bandwidth for the virtual communication path used by the management communication.

6. A multiplex system comprising a plurality of computers and a shared external storage device that is connected to the plurality of computers via a network and shared by the plurality of computers, in which processing performed by any one of the plurality of computers is taken over by another of the computers, or a part of the processing performed by any one of the plurality of computers is shared with and executed by another computer, the multiplex system comprising zone management means for setting the network so as to prohibit relaying, to the plurality of computers and the shared external storage device of the multiplex system, of communication data transmitted from another multiplex system or another device connected to the network.

7. A multiplex system comprising a plurality of computers and a shared external storage device that is connected to the plurality of computers via a network and shared by the plurality of computers, in which processing performed by any one of the plurality of computers is taken over by another of the computers, or a part of the processing performed by any one of the plurality of computers is shared with and executed by another computer, the multiplex system comprising zone management means that, when a failure has occurred in any of the plurality of computers, sets the network so as to prohibit relaying, via the network, of communication data transmitted from the faulty computer to the computers other than the faulty computer and to the shared external storage device.

8. A multiplex system comprising one or more computers and a shared external storage device connected to the computers via a network, to which another computer connected to the network can be added as a computer constituting the multiplex system, and in which processing performed on any of the computers before the addition is taken over by another of the computers, or a part of the processing performed on any of the computers before the addition is shared with and executed by another computer, the multiplex system comprising zone management means that, when a computer addition process is performed, sets the network so as to permit relaying, via the network, of communication data transmitted from the added computer to all of the computers and the shared external storage device constituting the multiplex system after the addition.

9. A multiplex system comprising a plurality of computers and a shared external storage device connected to the plurality of computers via a network, from which any computer can be excluded as a computer constituting the multiplex system, and in which processing performed by any one of the plurality of computers is taken over by another of the computers, or a part of the processing performed by any one of the plurality of computers is shared with and executed by another computer, the multiplex system comprising zone management means that, when a computer exclusion process is performed, sets the network so as to prohibit relaying, via the network, of communication data transmitted from the excluded computer to all of the computers and the shared external storage device constituting the multiplex system after the exclusion.

10. The multiplex system according to claim 8, wherein the network is a storage area network having a plurality of switches with ports and inter-switch links between the switches, and the permission or prohibition of relaying is set for the storage area network.

11. A multiplex system comprising a plurality of computers and a shared external storage device that is connected to the plurality of computers via a network and shared by the plurality of computers, in which processing performed by any one of the plurality of computers is taken over by another of the computers, or a part of the processing performed by any one of the plurality of computers is shared with and executed by another computer, wherein a device constituting the network, or the shared external storage device, has monitoring means for monitoring the operating state of the plurality of computers.

12. The multiplex system according to claim 11, wherein the monitoring means is a multiplex system management program executed on an arithmetic processing unit of a device constituting the network or of the shared external storage device.

13. The multiplex system according to claim 12, wherein the monitoring means, based on the monitoring result, issues an instruction to change the operating state of the plurality of computers, or an instruction to change the execution state of processing performed on any of the plurality of computers.