JPH02132529A

JPH02132529A - Automatic monitor switch controller

Info

Publication number: JPH02132529A
Application number: JP63285646A
Authority: JP
Inventors: Junichi Kurihara; 潤一栗原; Toshio Hirozawa; 廣澤　敏夫; Ikuo Kimura; 木村　伊九夫
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1988-11-14
Filing date: 1988-11-14
Publication date: 1990-05-22

Abstract

PURPOSE: To shorten service stop time by having a substitute processor take over the business work that a faulty processor was carrying out, running a diagnostic program on the faulty processor, and putting that processor back to use once the fault has been recovered. CONSTITUTION: An automatic monitor switch controller 100 consists of a processor interface part 110, a switch control part 120, and a process part 130. The part 130 includes a memory part 130-1, which stores the control tables/blocks and a group of process programs, and an execution control part 130-2, which successively interprets and executes the instruction sequences of the process programs. The controller 100 receives fault-occurrence information from a control system 200 and a business work system 220 and, based on the stored control information, selects a processor to take over the business work of the faulty processor. The controller 100 then instructs the selected processor to take over the processing, attaching the details of the substitute business work. In this way, the service stop time can be shortened.

Description

[Detailed Description of the Invention]

[Field of Industrial Application]

The present invention relates to an automatic monitoring and switching control device for information processing systems such as electronic computer systems, and in particular to automating the switching of the management system in a configuration made up of multiple information processing systems.

[Prior Art]

As information processing systems, that is, electronic computer systems, have come to provide 24-hour service, service stoppage time must be kept to a minimum. Japanese Patent Laid-Open No. 60-7548 discloses an operating method for a duplex system that has a standby system in addition to the active system. The technique described there keeps the standby system powered off and switches over to it automatically when a failure occurs in the active system, aiming at savings in power and labor.

[Problem to be solved by the invention]

The conventional technique described in Japanese Patent Laid-Open No. 60-7548 concerns a control device for a duplex system consisting of an active system and a standby system. The standby system is kept powered off at all times. When a failure of the active system is detected, the control device resets the active system, at the same time reconnects the input/output device group to the standby system, powers on the standby system, and automatically issues an initial program loading command. The technique thus aims at power-saving operation of a duplex system. However, it does not disclose (1) the failure detection procedure within the active system, (2) the procedure for re-executing jobs after the standby system has started, (3) the processing procedure within the automatic switching control device, (4) the power-on control method for the standby system, (5) the procedure for changing the connections of the input/output devices, or (6) the method of resuming operation after the active system has been restored. This leaves open questions about how the technique could be realized.

Meanwhile, with the spread of online services in recent years, service stoppage time caused by failures must be kept to a minimum. One solution would be a compound computer system, in which the system is formed from multiple processors and an operating system runs on each processor.

In a compound computer system configuration, one of the processors supervises and manages the entire system (this is called the management system), and the other processors carry out business processing (these are called business systems). In this configuration, the remaining problems are how to detect a failed system and switch over immediately and automatically, and how to control reconnection after the failed system has recovered.

Accordingly, an object of the present invention is to provide, for a compound computer system configuration consisting of a management system and business systems, an automatic monitoring and switching control device comprising a mechanism that monitors the management system for abnormalities, control means that has a business system take over the functions of the management system when an abnormality is detected, and control means that runs a diagnostic program on the management system.

Another object of the present invention is to provide, for a compound computer system configuration consisting of a management system and business systems, an automatic monitoring and switching control device comprising control means that monitors a business system for abnormalities, control means that has the management system take over the functions of that business system when an abnormality is detected, and control means that runs a diagnostic program on that business system.

Another object of the present invention is to provide an automatic monitoring and switching control device comprising control means that reconnects the input/output device group that was connected to a failed processor to another processor, and has the substitute processor continue the processing that was running or waiting to run on that input/output device group.

Another object of the present invention is to provide an automatic monitoring and switching control device comprising control means that, once fault countermeasures for a failed processor are complete, incorporates that processor back into the compound computer system configuration and has it resume the duties it was assigned before the failure.

Another object of the present invention is to provide an automatic monitoring and switching control device that makes nonstop ("stopless") service possible with a compound computer system configuration.

[Means to solve the problem]

To achieve the above objects, the automatic monitoring and switching control device of the present invention is connected to each computer (processor) of the compound computer system and controls channel path selection for each input/output device group. The device checks the operating state of each processor in turn. It also has a mechanism for receiving, on an urgent basis, failure states that each computer system detects on its own.

Accordingly, when the automatic monitoring and switching control device of the present invention detects a failure in any of the processors, or is notified of a failure by a processor, it issues a command to stop the failed processor. After that processor has stopped, control means in the device selects, based on management information stored in the device in advance, a processor to take over the work that the failed processor was executing. Once the substitute processor has been determined, the device controls the channel path selection lines through the switching control unit and switches the connections, so that the input/output device group that was connected to the failed processor is reconnected to the substitute processor.

Next, the automatic monitoring and switching control device issues to the substitute processor an instruction to take over the processing, attaching the details of the work to be taken over. The substitute processor then carries out the proxy work in addition to its own work.
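The takeover sequence described above (stop the failed processor, look up a substitute in stored management information, move the I/O connections, then hand over the work) can be sketched as follows. This is a minimal illustration only: the `Processor` class, the `substitute_table` mapping, and the `fail_over` function are hypothetical names, and the patent does not specify how the management information is laid out.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    number: int
    role: str                                        # "management" or "business"
    running: bool = True
    io_groups: set = field(default_factory=set)      # I/O device groups currently connected
    duties: list = field(default_factory=list)       # work assigned on top of the normal role


def fail_over(processors, failed_no, substitute_table):
    """Stop the failed processor, pick its substitute, move its I/O and work."""
    failed = processors[failed_no]
    failed.running = False                           # stop command to the failed processor
    # Assumed form of the "management information stored in advance":
    # a plain mapping from failed processor number to substitute number.
    substitute = processors[substitute_table[failed_no]]
    substitute.io_groups |= failed.io_groups         # channel path switch to the substitute
    failed.io_groups = set()
    substitute.duties.append(f"proxy:{failed.role}")  # proxy instruction with the work attached
    return substitute
```

With a management system numbered 0 and a business system numbered 7, `fail_over(processors, 0, {0: 7})` leaves processor 7 holding both sets of I/O groups and one extra proxy duty.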

For the failed processor, meanwhile, the automatic monitoring and switching control device has control means that commands that processor to execute a diagnostic program. The failed processor can thus be diagnosed automatically before maintenance personnel arrive. Once the maintenance personnel of the computer system have repaired the faulty part, they only need to enter "recovery" from the console device of the service processor.

On receiving this "recovery" information, the automatic monitoring and switching control device starts the operating system of the recovered processor. This processing is generally called initial program loading (IPL). When it receives a report that IPL of that processor has completed, the device issues a command to the substitute processor to stop the work it had been carrying out by proxy, and reconnects the input/output device group to the original processor. The device has control means that then has the recovered processor carry out its original work.
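The recovery sequence in this paragraph is strictly ordered, which a short sketch can make explicit. The function name and the step labels below are illustrative assumptions, not terms from the patent.

```python
def recovery_steps(recovered_no, proxy_no, io_groups):
    """Ordered actions taken by the controller after "recovery" is entered."""
    steps = [
        ("start_ipl", recovered_no),             # start the OS of the recovered processor
        ("wait_ipl_complete", recovered_no),     # proceed only once IPL completion is reported
        ("stop_proxy_work", proxy_no),           # substitute stops the work it took over
    ]
    # Reconnect every I/O device group to the original processor.
    steps += [("switch_io", group, recovered_no) for group in sorted(io_groups)]
    steps.append(("resume_original_work", recovered_no))
    return steps
```

The point of the ordering is that the substitute keeps serving until the recovered processor has finished IPL, so there is no interval in which nobody performs the work.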

As a result, in operation under a compound computer system configuration, a failed processor can be disconnected and reconnected without stopping service to end users, and nonstop service can be realized.

[Effect]

In a compound computer system configuration, the automatic monitoring and switching control device of the present invention operates connected to each processor by dedicated lines. It also controls channel path selection for each input/output device group.

The operating state of each processor is therefore monitored in parallel with the processor's normal operation, and the monitoring does not cause the processor's normal operation to malfunction.

Furthermore, the automatic monitoring and switching control device automatically performs operations that have conventionally been carried out by computer system maintenance personnel, system operators, and administrators. It thereby helps prevent the erroneous operations that human error often causes and shortens the time needed to change the system's operating configuration. Although the device is described as applied to a multiple-processor configuration, it is also applicable to a single-processor configuration.

In that case, service stoppage time is reduced because a diagnostic program can be executed automatically after a failure is detected.

[Example]

An embodiment of the present invention is described below with reference to FIGS. 1 to 15-b. FIG. 1 shows the configuration of the automatic monitoring and switching control device of the present invention in outline, together with its connections to the processors and the input/output device groups. Reference numeral 100 denotes the automatic monitoring and switching control device, which consists of a processor interface unit 110, a switching control unit 120, and a processing unit 130. The processing unit 130 consists of a memory unit 130-1, which stores the control tables, control blocks, and processing programs, and an execution control unit 130-2, which interprets and executes the instruction sequences of the processing programs in order. The execution control unit 130-2 may be built from anything that has arithmetic capability, such as a microprocessor.

In the configuration example of the compound computer system in FIG. 1, the automatic monitoring and switching control device is connected to two processors (management system 200 and business system 220), but this does not limit the number of processors.

The automatic monitoring and switching control device 100 is connected to the management system 200 and the business system 220 by signal lines Q1-0 to Q2-7. For the input/output device groups, it directs channel path selection through the control signal line group Q5 to the channel path switching control devices 230 to 234. The switching control unit 120 consists of a circuit that passes the switching instruction signal Q18 from the processing unit 130 to the corresponding signal line in the signal line group Q5. In this embodiment, FIG. 2 shows the control circuit of the processor interface unit 110, FIG. 3 lists the commands that the automatic monitoring and switching control device issues to each processor, FIG. 4 lists the request information from each processor, and FIG. 5 shows the organization of the control tables, control blocks, and processing programs of the processing unit 130 in the device 100. FIGS. 6 to 14 show the details of each control table and control block and the flowcharts of the processing programs. FIGS. 15-a and 15-b show the operation flow of the monitoring program 210 in the management system 200 and the business system 220.

The operation is first outlined with reference to FIG. 1 and then described in detail. Of the reference symbols in FIG. 1 not yet explained, 235 to 238 are input/output device groups, Q6-0 to Q9-1 are channel data lines, 201 is the service processor of the management system 200, and 221 is the service processor of the business system.

Reference numerals 202 and 222 are operating systems (OS). OS 202 and OS 222 may be the same OS or different OSs.

Reference numeral 203 is the management program that runs under the management system 200, 223 is a business program that runs under the business system 220, and 224 is the management program of the business system 220. To distinguish them, the management program 203 is called the overall management program and the management program 224 the sub-management program.

The management system 200 and the business system 220 are connected by a signal line Q3. Instead of using this signal line Q3, communication between the management system 200 and the business system 220 may be carried out through the automatic monitoring and switching control device 100. In that case, commands and request information need only be added to those shown in FIGS. 3 and 4, described later.

There are two ways of detecting a system failure: (1) the automatic monitoring and switching control device 100 periodically checks the computer systems that are in operation, and (2) a computer system in operation detects a failure itself and reports it to the device 100. In this embodiment, operation under method (1) is described first, followed by operation under method (2).

The automatic monitoring and switching control device 100 issues the command of item No. 5 in FIG. 3 over the signal line Q1-0.

The service processor (SVP) 201 interprets the command from the automatic monitoring and switching control device 100 and performs the corresponding operation. The SVP 201 checks the state of the hardware itself; to check the software, including the OS, it passes control to the monitoring program 210. The monitoring program 210 checks the operating state and returns the result to the SVP 201. The SVP 201 then returns to the device 100, over the signal line Q2-0, whether a failure has occurred, including the hardware check result. The device 100 examines the data on the signal line Q2-0 to determine whether a failure has occurred. This check is repeated for each processor in the compound computer system: the check signal goes out on Q1-i and the response comes back on Q2-i. In the example of FIG. 1, the signal lines Q1-7 and Q2-7 are used for the business system 220, the seventh system.

In case (2), when a computer system in operation detects a failure, the request information shown in FIG. 4 is reported to the device 100 over the signal line Q2-i. This reporting takes place independently of (asynchronously with) the procedure by which the device 100 checks the computer systems 200 to 220.
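The two detection paths, periodic checking by the controller and asynchronous reports from the processors, can be modelled together in a small sketch. The `Monitor` class and its method names are hypothetical, and each probe callable stands in for the command/response exchange over Q1-i and Q2-i.

```python
from collections import deque


class Monitor:
    def __init__(self, probes):
        # probes: processor number -> callable returning True when the processor is healthy
        self.probes = probes
        self.urgent = deque()                # fault reports pushed by processors, method (2)

    def report_fault(self, processor_no, info):
        """Method (2): a processor reports its own failure at any time."""
        self.urgent.append((processor_no, info))

    def scan(self):
        """Method (1): check every processor in turn, return the failed ones."""
        return [no for no in sorted(self.probes) if not self.probes[no]()]

    def faults(self):
        """Collect faults from both paths for the processing unit to act on."""
        found = [(no, "detected by periodic check") for no in self.scan()]
        while self.urgent:
            found.append(self.urgent.popleft())
        return found
```

Keeping the urgent reports in a separate queue mirrors the text's point that self-reported failures arrive independently of the polling cycle.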

The processor interface unit 110 of the automatic monitoring and switching control device 100 receives the check information reported over the line Q2-i by method (1) or (2) above. The processor interface unit 110 passes the processor identification information and the failure information to the processing unit 130 over the signal line group Q11. A processor is identified by its processor number. The processing unit 130 assesses the failure state and, if it judges that the failed processor should be stopped, issues a stop command for that processor over the signal line group Q10. FIGS. 5 to 15 show the control tables and control blocks stored in the memory unit 130-1 of the processing unit 130 and the operation flow of the processing programs; they are described in detail later.

The stop command is conveyed to the failed processor over the line Q1-i. After stop processing has been carried out, an instruction to run the diagnostic program in place of the OS is issued over the line Q1-i.

If the failed processor is the management system 200, the overall management program 203, which supervises the compound computer system, has to be executed under the business system 220. This can be done by commanding the sub-management program 224 to take on the overall management function or, if the sub-management program 224 is not running under the business system 220, by starting a new overall management program there. In either case, the command is issued by the automatic monitoring and switching control device 100 over the signal line Q1-j, and the type of work to be taken over is sent on the signal line Q2-j. Before instructing a system to take over the work, the device 100 reconnects the input/output device group that the failed system was using to the system that will take over. In the example of FIG. 1, the business system 220 stands in for the management system 200, so the input/output device groups 235 to 238 are reconnected so that they can be used from the business system 220.
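The choice between promoting the sub-management program and starting a fresh overall management program, together with the rule that I/O is reconnected first, can be sketched as below. The function and step names are illustrative assumptions rather than terms from the patent.

```python
def management_takeover_steps(io_groups, substitute_no, sub_manager_running):
    """Ordered steps for handing overall management to a business system."""
    # The I/O device groups are reconnected before the takeover instruction is issued.
    steps = [("switch_io", group, substitute_no) for group in sorted(io_groups)]
    if sub_manager_running:
        # Sub-management program 224 is told to take on overall management.
        steps.append(("assume_overall_management", substitute_no))
    else:
        # No sub-management program is running, so a new one is started.
        steps.append(("start_overall_management_program", substitute_no))
    return steps
```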

To do this, a command signal from the processing unit 130 in the automatic monitoring and switching control device 100 is sent to the switching control unit 120 and passed over the signal line group Q5 to the channel path switching control devices 230 to 234. The channel paths to the input/output device groups are thereby switched from the management system 200 to the business system 220, and data can be sent and received over the channel data lines Q7-1 to Q9-1. If the diagnostic program 204 for the management system 200 is already present in the main memory (not shown) of the management system 200, the channel path of the input/output device 235 is switched as well. If, however, the original 205 of the diagnostic program has to be loaded from the input/output device 235 into the main memory of the management system 200 and executed, the input/output device 235 is reconnected only after the diagnostic program has been executed.

This switching operation is not performed for input/output device groups that the business system 220 is already using.
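The switching rules in the last two paragraphs amount to a small planning step: skip devices the substitute already uses, and hold back the device that stores the diagnostic program if that program still has to be loaded. A minimal sketch, with hypothetical function and parameter names:

```python
def plan_io_switch(failed_io, substitute_io, diag_device, diag_resident):
    """Split the failed system's I/O devices into 'switch now' and 'switch later'.

    diag_device   -- device holding the original of the diagnostic program
    diag_resident -- True if the diagnostic program is already in main memory
    """
    switch_now, deferred = [], []
    for device in sorted(failed_io):
        if device in substitute_io:
            continue                          # already usable by the substitute: no switch
        if device == diag_device and not diag_resident:
            deferred.append(device)           # still needed to load the diagnostic program
        else:
            switch_now.append(device)
    return switch_now, deferred
```

For the FIG. 1 example, device 235 would land in the deferred list whenever the diagnostic program has to be loaded from it first.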

While the diagnostic program 204 is running on the failed processor (the management system 200 in the example of FIG. 1), the maintenance personnel of the computer system arrive and analyze whether the failure was caused by hardware or by software. If hardware is the cause, the circuit identified by the diagnostic program is replaced. If software is the cause, the program that triggered the failure is corrected. The maintenance personnel then enter "recovery" from the console device (not shown) of the SVP 201. This information is conveyed to the automatic monitoring and switching control device 100 over the signal line Q2-0, and the processing unit 130 in the device 100 starts the operating system of that processor.

It does so by issuing the IPL (initial program loading) start command, item No. 3 in FIG. 3.

When IPL of the management system 200 completes, an "IPL complete" report arrives over the signal line Q2-0. The automatic monitoring and switching control device 100 then sends the command of item No. 11 in FIG. 3 to the business system that has been carrying out the overall management work by proxy, over the signal line Q1-j (Q1-7 in the example of FIG. 1).

Next, to reconnect the input/output device groups 235 to 238 to the original management system 200, a command signal from the processing unit 130 in the device 100 is sent to the switching control unit 120 and passed over the signal line group Q5 to the channel path switching control devices 230 to 234. The channel paths to the input/output device groups thereby return from the business system 220 to the management system.

The above is an outline of the operation of the automatic monitoring and switching control device 100 of the present invention. The operation is now described in detail with reference to FIG. 2 onward. FIG. 2 is a circuit diagram of the processor interface unit 110 of the device 100. In the figure, 1 is a decoder DEC, 2 is an encoder ENC, 3 is a command register CREG, 4 is a data register DREG, 5 is a command work register CWREG, 6 is a data work register DWREG, and 7 and 8 are gate circuits. CWREG and DWREG are connected to the processors of FIG. 1 by the signal lines Q1-i and Q2-i, and one pair is provided for each processor. In FIG. 2 they are therefore written with a suffix, as CWREG_i and DWREG_i.

Q10-1 is a mode signal line driven by the processing unit 130 of FIG. 1; its value is "1" when data is sent to a processor such as processor 200 and "0" when data is received. Q10-2 is an address bus on which the processing unit 130 sends the value of i that selects CWREG_i and DWREG_i. Q13 is the command data bus, and Q14 and Q15 are data buses. The signal Q16 becomes "1" when request information arrives from a processor over the signal line Q2-i; it is an input to the encoder ENC 2, which then outputs the number of that processor on the signal line Q11-1. The inputs of the encoder ENC 2, numbered in the figure, are thus the output signals from the DWREG_i of the respective processors.
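The encoder's job, turning a set of per-processor request lines into a processor number, can be illustrated with a simple priority encoder. The text does not say how simultaneous requests are resolved, so the lowest-number-first rule below is purely an assumption, as is the function name.

```python
def encode_request(request_lines):
    """Return the number of the first processor whose request line is 1, else None.

    request_lines[i] models the request signal derived from DWREG_i.
    Lowest-number-first priority is an assumption made for this sketch.
    """
    for processor_no, level in enumerate(request_lines):
        if level == 1:
            return processor_no
    return None
```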

The lines Q10-1 to Q10-3 in FIG. 2 correspond to the signal line group Q10 in FIG. 1, and Q11-1 to Q11-2 correspond to the signal line group Q11.

FIG. 3 lists the commands that the automatic monitoring switching control device 100 issues to each processor. The processing unit 130 stores one of the values shown in FIG. 3 in CREG 3 via signal line Q10-3. To select the target processor, the processing unit 130 of FIG. 1 sends the processor number on signal line Q10-2. The decoder DEC 1 decodes the processor number, and the output signal line corresponding to that number takes the value "1". For example, when the destination is the management system 200 of FIG. 1, signal line Q17 takes the value "1" and gate circuit 7 opens. As a result, the command value of FIG. 3 held in the command register CREG 3 passes over the command data bus Q13 and through gate circuit 7, and is held in the command work register CWREG 5 for that processor. If the command carries additional information, that value is held in the data register DREG 4; because signal line Q10-1 is "1", that is, send mode is specified, the value passes over data bus Q14 and is held in the data work register DWREG 6. The values of the command work register CWREG 5 and the data work register DWREG 6 are sent to the management system 200 over signal lines Q1-0 and Q2-0, respectively. The above is how control information is sent for a command addressed to the management system 200. To send to another processor system, the processing unit 130 sends that processor's number on signal line Q10-2, and the CWREGi and DWREGi corresponding to the output signal of the decoder DEC 1 are selected.
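The send path can be sketched as a small software model. This is only an illustration of the register-to-register flow described above, not the circuit itself: the class name, method names, and the use of Python lists for the per-processor work registers are assumptions introduced here.

```python
class ProcessorInterface:
    """Toy model of the send path of the processor interface unit 110."""

    def __init__(self, n_processors):
        self.n_processors = n_processors
        self.creg = 0                       # command register CREG 3
        self.dreg = 0                       # data register DREG 4
        self.cwreg = [0] * n_processors     # one CWREG 5 per processor
        self.dwreg = [0] * n_processors     # one DWREG 6 per processor

    def decode(self, processor_number):
        """Decoder DEC 1: one-hot select lines from the processor number."""
        return [int(i == processor_number) for i in range(self.n_processors)]

    def send(self, processor_number, command, data=None):
        """Load CREG (and DREG), then gate them into the selected work registers."""
        select = self.decode(processor_number)   # number arrives on Q10-2
        mode = 1                                  # Q10-1 = "1" means send mode
        self.creg = command                       # command arrives on Q10-3
        if data is not None:
            self.dreg = data
        for i, selected in enumerate(select):
            if selected and mode == 1:            # gate circuit 7 opens
                self.cwreg[i] = self.creg
                if data is not None:
                    self.dwreg[i] = self.dreg
```

Calling `ProcessorInterface(3).send(0, 5)` leaves the command value only in the work register of processor 0, which mirrors the way the decoder output opens a single gate.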

A response from a processor is received as follows. Again, for ease of explanation, the operation of the control circuit of FIG. 2 is described using a response from the management system 200 as the example.

Response information from the processor of the management system 200 is held in the data work register DWREG 6 via signal line Q2-0. Meanwhile, because the processing unit 130 of FIG. 1 knows that it is waiting for a response to a command sent to that processor, it sets signal line Q10-1 to "0" and sends the processor number on signal line Q10-2. As a result, gate circuit 8 opens, and the value of the data work register DWREG 6 passes over data bus Q15 and is held in the data register DREG 4.

Data can be received because the value of signal line Q10-1 is "0". The processing unit 130 then only has to read the value of the data register DREG 4 via signal line Q11-2.

Next, the reception of a failure report is described for the case where the management system 200 detects a system failure on its own, that is, asynchronously with the automatic monitoring switching control device 100. A report from the processor of the management system 200 or of a business system 220 is sent as a value corresponding to the failure information shown in FIG. 4 over signal line Q2-i, where i corresponds to the processor number. Here too, to keep the explanation simple, a report from the processor of the management system 200 is used as the example. A failure report from the processor of the management system 200 is held in the data work register DWREG 6 via signal line Q2-0. At this time, signal line Q16 takes the value "1" and is encoded by the encoder ENC 2. As a result, the number of the processor that sent the request signal is output on signal line Q11-1, and the processing unit 130 of FIG. 1 can receive this processor number.

The processing unit 130 sends the processor number it has just received on signal line Q10-2 and sets signal line Q10-1 to "0". As in the operation described above, gate circuit 8 opens, the request information code in the data work register DWREG 6 passes over data bus Q15 and is held in the data register DREG 4, and the processing unit 130 of FIG. 1 can then read the request information via data bus Q12-2.
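The asynchronous reception just described can be sketched in a few lines. This is a minimal model under stated assumptions: the function names are hypothetical, the rule that the lowest-numbered pending request wins is an assumption (the text does not say how simultaneous requests are ordered), and clearing the request line after the read is likewise assumed.

```python
def encode_request(request_lines):
    """Encoder ENC 2: return the number of a processor whose request line is 1.

    Assumption: if several lines are raised, the lowest number is reported.
    Returns None when no request is pending.
    """
    for number, raised in enumerate(request_lines):
        if raised:
            return number
    return None


def receive_report(dwreg, request_lines):
    """Read one pending report; returns (processor_number, code) or None."""
    number = encode_request(request_lines)   # value seen on Q11-1
    if number is None:
        return None
    # The processing unit puts `number` on Q10-2 and sets Q10-1 to "0".
    # Gate circuit 8 then copies DWREG[number] into DREG 4.
    dreg = dwreg[number]
    request_lines[number] = 0                # assumption: the read clears the request
    return number, dreg
```

With `dwreg = [21, 0, 24]` and only the third request line raised, `receive_report` returns processor number 2 and its code, and a second call finds nothing pending.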

FIG. 3 lists the commands, which are sent to the corresponding processor via CREG 3. Where necessary, additional information is set in DREG 4 and sent to the corresponding processor as well. FIG. 4 shows the request contents that a running computer system reports to the automatic monitoring switching control device 100 over signal line Q2-i.

FIG. 5 shows the organization of the control tables, control blocks, and processing programs stored in the memory unit 130-1 of the processing unit 130 in the automatic monitoring switching control device 100.

FIG. 6 shows the structure of the processor management table, FIG. 7 the structure of the system start/stop order control block 141 shown in FIG. 5, and FIG. 8 the structure of the device attribute control block 142. Referring to FIG. 5, the processor management table 140 holds the state of each processor, with one entry per processor.

Each per-processor entry therefore points to a system start/stop order control block 141 and a device attribute control block 142. In other words, a system start/stop order control block 141 and a device attribute control block 142 exist for each processor.

The processing programs, in turn, carry out their processing using the control tables and control blocks described above.

The processing programs of the automatic monitoring switching control device 100 consist of a main program 145 and, under it, a power on/off program 146, a start/stop program 147, a processor switching program 148, a diagnostic program 149, a recovery switching program 150, and a command sending program 151.

The control tables and control blocks of FIGS. 6 to 8 are described first, after which the operation of the processing programs in the automatic monitoring switching control device 100 is described with reference to the operation flow diagrams of FIGS. 9 to 14. FIGS. 15-a and 15-b are operation flow diagrams of the check program CHK 210 shown in FIG. 1. The check program CHK 210 runs on the processor side and operates in cooperation with the service processor SVP 201.

It therefore operates independently of the operating systems OS 202 and 222.

FIG. 6 shows the format of the processor management table 140, in which one entry corresponds to each processor. Each entry consists of the processor's status flag 11, an alternate CPU number 12, a business code 13, the address of the system start/stop order control block, and the address of the device attribute control block. After the operating state of each processor is checked, the result is reflected in the status flag 11; if a failure has occurred, the processing programs shown in FIG. 5 switch the work to the processor indicated by the alternate CPU number 12.
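One way to picture a table entry is as a small record type. The sketch below is an assumption-laden illustration: the field names, the `Status` values, and the helper `alternate_for` are all introduced here for clarity, and the two block addresses are modelled as plain integers.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Status(Enum):
    RUNNING = "running"
    FAILED = "failed"


@dataclass
class ProcessorEntry:
    status: Status                 # status flag 11
    alternate_cpu: Optional[int]   # alternate CPU number 12
    business_code: int             # business code 13
    start_stop_block: int          # address of start/stop order control block 141
    device_attr_block: int         # address of device attribute control block 142


def alternate_for(table: List[ProcessorEntry], failed: int) -> Optional[int]:
    """Mark processor `failed` as failed and return the processor that takes over."""
    entry = table[failed]
    entry.status = Status.FAILED
    return entry.alternate_cpu
```

Keeping the alternate CPU number inside each entry means the switching decision needs no search: the entry of the failed processor already names its substitute.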

FIG. 7 shows the format of the system start/stop order control block 141. This control block stores the start-up and shutdown procedures for the processor concerned. The processing programs interpret these procedures and, as needed, send them to the processors 200 and 220 over signal line Q2-i.

FIG. 8 shows the format of the device attribute control block 142. This control block 142 stores the channel path addresses 15 of the input/output devices connected to the processor concerned and the channel path addresses 16 for those devices as seen from the alternate processor.

FIG. 9 is a processing flow diagram of the main program 145 shown in FIG. 5. Referring to FIG. 9, process 21 waits for an interrupt for a fixed period. Waiting for an external interrupt here means waiting for a request signal from the processors connected to the automatic monitoring switching control device 100. Judgment process 22 checks whether a processor has raised a request. If so, the value on signal line Q11-1 shown in FIG. 2 is sent to the decoder DEC 1 as the processor number, and signal line Q10-1 is set to "0" in preparation for reading DREG 4 (process 23).

If no processor has raised a request, processes 24 to 29 are executed.

Process 24 accesses the processor management table 140 and prepares to examine the entry for each processor. Here, i is the iteration counter; when process 25 finds that i has reached the number of valid entries n, control returns to process 21. Process 26 examines the status flag 11 of the processor concerned so that only running processors are checked. Process 28 sends the processor number i to DEC 1 of FIG. 2 and sends the value of item 5, "operation check", from the commands of FIG. 3 to CREG 3. Setting signal line Q10-1 to "1" then sends the command to that processor. After that, signal line Q10-1 is set to "0" in preparation for receiving the result.
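The polling pass of processes 24 to 29 can be sketched as a loop over the table. This is a simplified illustration: `exchange` is a hypothetical callable standing in for the whole send-then-read sequence over the interface registers, status flags are modelled as strings, and the constant names are introduced here. The values 5 ("operation check") and 02 (normal report) are taken from the description.

```python
OPERATION_CHECK = 5   # item 5 of FIG. 3, "operation check"
NORMAL_REPORT = 2     # the "02" value the description gives for a normal report


def poll_once(status_flags, exchange):
    """Run one pass over the table and collect every non-normal reply.

    status_flags: one string per processor; only "running" ones are checked.
    exchange(i, command): hypothetical callable that sends `command` to
        processor i and returns the value subsequently read from DREG 4.
    Returns a list of (processor_number, value) pairs.
    """
    abnormal = []
    for i, flag in enumerate(status_flags):     # processes 24 and 25
        if flag != "running":                   # process 26
            continue
        value = exchange(i, OPERATION_CHECK)    # processes 28 and 29
        if value != NORMAL_REPORT:
            abnormal.append((i, value))
    return abnormal
```

Returning the abnormal replies as a list, rather than handling each one inside the loop, is a choice made here to keep the sketch short; the flow diagram handles each reply before moving to the next processor.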

Process 29 is common to the paths through process 23 and process 28, and reads the value of DREG 4 of FIG. 2.

If the value of DREG 4 is "02", it is a normal report, and control returns to process 25 to check the next processor.

If, on the other hand, the value of DREG 4 is not the normal-report value, one of the causes shown in FIG. 4 has occurred, and judgment process 30 checks whether it is a failure recovery report (DREG 4 value "24"). If it is not a failure recovery report, process 31 sets the status flag 11 shown in FIG. 6 to "failed" and passes control to the stop routine of the start/stop program 147 to perform stop processing, after which control passes to the processor switching program 148. The processing flow of the processor switching program 148 is shown in FIG. 11; its main tasks are disconnecting the device group, connecting it to the alternate processor, and issuing the instruction to switch the work. When control returns from the processor switching program 148, control passes next to the diagnostic program 149, which automatically runs a diagnostic program on the processor that caused the failure, and then control returns to process 25.

If it is a failure recovery report, process 32 sets the status flag 11 of FIG. 6 to "running" and passes control to the recovery switching program 150. The processing flow of the recovery switching program 150 is shown in FIG. 13. This program stops the work on the processor that had been acting as substitute, disconnects the devices, and then connects the device group to the processor that has recovered. It then instructs the recovered processor to perform a "business switch", that is, to resume the work, so that the recovered processor can carry on with it. The above is the operation of the processing programs shown in FIG. 5. FIGS. 10 to 14 show the processing flows of the subprograms called from the main program 145: FIG. 10 is the flow of the stop processing of the start/stop program 147, FIG. 11 the flow of the processor switching program 148, FIG. 12 the flow of the diagnostic program 149, FIG. 13 the flow of the recovery switching program 150, and FIG. 14 the flow of the command sending program. All of the characteristic operations described here appear in these figures.

Next, the operation of the check program CHK 210 shown in FIG. 1 is described. FIGS. 15-a and 15-b are processing flow diagrams of the check program CHK 210. First, judgment process 35 determines whether a command has arrived from the monitoring device 100, which can be done by checking whether a command code has been sent on signal line Q1-i. If a command has arrived from the monitoring device, the processing of FIG. 15-b is performed; this is described later. If there is no command from the monitoring device 100, processes 35 to 44 are executed.

If judgment process 36 detects a hardware failure, the request code REQ is set to 21 and process 44 is executed. The request code REQ takes the values shown in FIG. 4. Likewise, if an event matching the condition of any of judgment processes 37 to 39 occurs, the corresponding request code REQ value is set. Process 44 sends the value of the request code REQ out on signal line Q2-i to report it to the automatic monitoring switching control device 100, and control then returns to process 35.
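The chain of judgment processes can be sketched as an ordered lookup from detected events to request codes. Only the value 21 for a hardware failure is given in the description; the event names and the idea of passing the code table in as a dictionary are assumptions, and any other codes a caller supplies are placeholders for the FIG. 4 values.

```python
HARDWARE_FAILURE = 21   # value the description gives for judgment process 36


def check_events(events, request_codes):
    """Return the REQ code of the first detected event, or None if all is well.

    events: dict mapping an event name to True/False.
    request_codes: dict mapping event name to REQ code; its insertion order
        plays the role of the order of the judgment processes.
    """
    for name, code in request_codes.items():
        if events.get(name):
            return code
    return None
```

For example, `check_events({"hardware_failure": True}, {"hardware_failure": HARDWARE_FAILURE})` yields the code that process 44 would then put on signal line Q2-i.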

If a command has arrived from the automatic monitoring switching control device 100, the processing shown in FIG. 15-b is performed. Process 46 obtains the command code from signal line Q1-i and the additional information from signal line Q2-i. Branch process 47 branches to the processing that corresponds to the command code obtained in process 46. The numbers in FIG. 15-b are command code numbers, and these correspond to the command contents shown in FIG. 3.
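Branch process 47 amounts to a table-driven dispatch on the command code. The sketch below assumes a dictionary of handler functions keyed by code; the function names are hypothetical, and only item 5 ("operation check") is taken from the description. The handler shown replies with the normal-report value 02.

```python
OPERATION_CHECK = 5   # item 5 of FIG. 3


def make_dispatcher(handlers):
    """Build a dispatch function from a {command_code: handler} table."""
    def dispatch(command_code, extra=None):
        handler = handlers.get(command_code)
        if handler is None:
            raise ValueError(f"unknown command code {command_code}")
        return handler(extra)   # `extra` is the additional information, if any
    return dispatch


def reply_normal(extra):
    """Hypothetical handler: answer an operation check with a normal report."""
    return 2
```

Rejecting unknown codes with an exception is a choice made for the sketch; the description does not say how the check program treats a code that is not listed in FIG. 3.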

The above has described how work is switched by way of the automatic monitoring switching control device and how it is resumed after recovery. Alternatively, signal line Q3 in FIG. 1 may be used to switch work between the management system 200 and the business systems 220 directly. In that case, signal line Q3 connects the management system 200 to each business system 220, and the processing performed by the automatic monitoring switching control device 100 is divided between the management system 200 and the business systems 220.

[Effects of the Invention]

According to the present invention, in a compound computer system made up of a plurality of processors, the automatic monitoring switching control device has control means that constantly monitors the operating status of each processor; control means that, on detecting a failure in any processor, causes another substitute processor to execute the work that the failed processor or system was executing; control means that causes the failed processor to run a diagnostic program; and control means that, once the failure has been recovered, returns the work from the substitute processor to the original processor. This reduces service downtime and shortens the failure recovery time of the failed processor.

Furthermore, because the automatic monitoring switching control device of the present invention automatically takes over operations on the computer system that were previously performed by maintenance staff, system operators, and administrators, it helps prevent erroneous operations caused by human error and shortens the operation time needed for configuration changes.

It is also possible for the management system 200 and the business systems 220 of FIG. 1 to monitor each other's operating state. The alternatives in that case are (1) a method that goes through signal line Q3 and (2) a method that goes through the automatic monitoring switching control device 100. For (1), the processing shown in FIGS. 5 to 14 can be incorporated into the management program 203 and the sub-management program 224. For (2), it is sufficient to add items to the commands of FIG. 3 and the request contents of FIG. 4; the functions and processing involved can readily be inferred from what this embodiment discloses.

[Brief Description of the Drawings]

FIG. 1 outlines the configuration of the automatic monitoring switching control device of the present invention. FIG. 2 shows the control circuit of the processor interface unit of FIG. 1. FIG. 3 lists the commands issued from the automatic monitoring switching control device to each processor. FIG. 4 lists the request information from each processor. FIG. 5 shows the organization of the control tables, control blocks, and processing programs of the processing unit 130 in the automatic monitoring switching control device 100. FIG. 6 shows the structure of the processor management table. FIG. 7 shows the structure of the system start/stop order control block 141 shown in FIG. 5. FIG. 8 shows the structure of the device attribute control block 142 shown in FIG. 5. FIG. 9 shows the processing flow of the main program 145 shown in FIG. 5. FIG. 10 shows the processing flow of the stop processing of the start/stop program 147. FIG. 11 shows the processing flow of the processor switching program 148. FIG. 12 shows the processing flow of the diagnostic program 149. FIG. 13 shows the processing flow of the recovery switching program 150. FIG. 14 shows the processing flow of the command sending program. FIGS. 15 (a) and (b) show the processing flow of the check program CHK 210 shown in FIG. 1.

DESCRIPTION OF SYMBOLS: 100: automatic monitoring switching control device; 110: processor interface unit; 120: switching control unit; 130: processing unit; 200: processor of the management system; 201, 221: service processor SVP; 203: management program; 204: diagnostic program; 210: check program; 220: processor of a business system; 223: business program; 224: sub-management program.

Claims

[Scope of Claims]

1. In a complex system consisting of a first information processing system having a management function and a second information processing system having a business processing function, an automatic monitoring switching control device characterized by comprising: means for monitoring the presence or absence of an abnormality in the first information processing system; command means for switching the management function of the first information processing system to the second information processing system when an abnormality in the first information processing system is detected; control means for stopping the first information processing system; control means for switching the input/output device group connected to the first information processing system to the second information processing system; and command means for starting a diagnostic program on the stopped first information processing system.

2. The automatic monitoring switching control device according to claim 1, wherein the means for monitoring the presence or absence of an abnormality in the first information processing system operates independently of the first and second information processing systems and comprises: storage means for storing the states of the first and second information processing systems; storage means for storing identification information of the second information processing system for use when a failure of the first information processing system is detected; state inspection command means for the second information processing system; and task switching command means.

3. The automatic monitoring switching control device according to claim 1, further comprising control means that automatically sends a stop command to the first information processing system when a failure is detected in the first information processing system, and command means that automatically causes a diagnostic program to be executed.

4. The automatic monitoring switching control device according to claim 1 or 2, wherein the first information processing system comprises processing means that, when a status inspection command is issued to it, determines whether its own system is operating normally or abnormally, and processing means that immediately returns a response based on the determination result to the control device that issued the command.

5. In a complex system consisting of multiple information processing systems, an automatic monitoring switching control device characterized by comprising: a mechanism that sequentially inspects and monitors each information processing system for the presence or absence of an abnormality in its operating state; command means for stopping the work of an information processing system that is operating abnormally; and command means for issuing a command that causes a replacement information processing system to perform the work that was being executed by the information processing system in which the failure occurred.

6. In a complex system consisting of a plurality of information processing systems, an automatic monitoring switching control device characterized by comprising: control means within each information processing system for determining normal or abnormal operation; control means for reporting the determination result to an external control device; control means in the external control device that receives the abnormality report for selecting a replacement information processing system based on that report; command means for issuing a command that causes the selected information processing system to perform the work of the information processing system in which the failure occurred; and means for instructing the information processing system in which the failure occurred to execute a diagnostic program.

7. In a complex system consisting of multiple information processing systems, an automatic monitoring switching control device characterized by comprising, for the case where the work of a failed information processing system is being performed by another information processing system: means for receiving a failure recovery report from the failed information processing system; command means for issuing, based on the failure recovery report, a command to the information processing system that was performing the work on its behalf to stop that work; control means for switching connections to the information processing system that has recovered; and control means for instructing the information processing system that has recovered from the failure to resume operations.

8. The automatic monitoring switching control device according to any one of claims 1 to 4, connected to a complex system consisting of multiple information processing systems, characterized by comprising: control means for sending a specific control code to an information processing system in order to determine whether it is operating normally or abnormally; control means for receiving a response code from the information processing system; and control means for making the determination.

9. In a complex system consisting of multiple information processing systems, an automatic monitoring switching control device characterized in that a first information processing system that centrally manages the entire complex system and a second information processing system that performs business processing functions are each equipped with control means for monitoring the operating state of the other information processing system, and with processing means by which, when an abnormal operating state is detected, the information processing system that detected it executes the work that the abnormally operating information processing system had been executing, on that system's behalf.