JPH03209533A

JPH03209533A - Redundant system consisting of plural processors

Info

Publication number: JPH03209533A
Application number: JP2003459A
Authority: JP
Inventors: Tadashi Ohashi; 正大橋
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1990-01-12
Filing date: 1990-01-12
Publication date: 1991-09-12

Abstract

PURPOSE: To obtain a highly reliable redundant system without duplexing the entire system, by providing processors, cooling devices, and the like in the system as redundant components. CONSTITUTION: If one of the processors (PRC0 to PRCn) 11₀ to 11ₙ fails while the system is operating, a system unit controller (SUC) 5 interrupts the host OS to separate the faulty processor from the nondefective processors. The SUC 5 then cuts off power to the faulty processor and disconnects it from the system. At the same time, the SUC 5 has a microprogram loaded from the microcode storage (MCS) 7 into the spare processor (PRCn+1) 11ₙ₊₁ or, if system operation can continue at reduced scale, into any nondefective processor among PRC0 to PRCn, and starts that processor. This configuration improves the reliability of the redundant system.

Description

[Detailed Description of the Invention] [Summary] The invention concerns a system having a plurality of microprogram-controlled processing devices and a cooling device for each processing device. Conventionally, a failure of any one processing device or cooling device stopped the system, so countermeasures such as duplicating the entire system were taken. To solve this problem, the system is provided with one or more spare processing devices and cooling devices, and comprises: means for storing the system configuration information sent from the service processor at system start-up; means for backing up the initial microprogram code sent to each processing device; and means for, when a failure occurs during system operation, using the spare processing device and cooling device as substitutes and loading the backed-up microprogram code into the substitute processing device to start it up.

[Industrial Field of Application] The present invention relates to a system composed of a plurality of microprogram-controlled processing devices, such as an ultra-large computer system or a supercomputer. In particular, it relates to a redundant system consisting of a plurality of processing devices in which each processing device has its own cooling device and one or more spare devices are provided for the processing devices and cooling devices.

[Prior Art] Fig. 2 shows an example configuration of a conventional system consisting of a plurality of processing devices. In the figure, 1 is the main storage; 1a is the host OS in the main storage; 2 is the service processor (SVP); 3 is a floppy disk (FPD) storing the initial microprograms and the like; 4 is the console display device (CSL-DSP); 9b is the collective refrigerant supply unit (CCSU); 8e is the control module (CTM); 10₀ to 10ₙ are the refrigerant supply modules (CSM0 to CSMn); 11₀ to 11ₙ are the processing devices (PRC0 to PRCn); 12₀ to 12ₙ are refrigerant passages such as piping and hoses (CP0 to CPn); 30 is the interface; 31 is the power control manager (PCM); and 32₀ to 32ₙ are the power supply control devices (PC0 to PCn).

This system has n processing devices (PRC0 to PRCn) 11₀ to 11ₙ. Each processing device is connected to the host OS 1a via the interface 30, and together they form one cooperative system under the management of the host OS 1a.

Each of the processing devices (PRC0 to PRCn) 11₀ to 11ₙ is a microprogram-controlled processing device. At initial system start-up, an initial microprogram is loaded into each device, which then becomes a processing device for its assigned function.

A cooling device is provided for each of the processing devices (PRC0 to PRCn) 11₀ to 11ₙ. In this example, refrigerant supply modules (CSM0 to CSMn) 10₀ to 10ₙ are used, which cool the processing devices by circulating a cooling medium through them via the refrigerant passages (CP0 to CPn) 12₀ to 12ₙ.

The refrigerant supply modules (CSM0 to CSMn) 10₀ to 10ₙ are housed in the collective refrigerant supply unit (CCSU) 9b and are managed by the control module (CTM) 8e.

The power supply control devices (PC0 to PCn) 32₀ to 32ₙ control the supply of power to the processing devices (PRC0 to PRCn) 11₀ to 11ₙ. At system start-up, once it has been confirmed that the cooling medium is circulating normally from each refrigerant supply module (CSM0 to CSMn) 10₀ to 10ₙ to each processing device, they turn the power on as instructed by the power control manager (PCM) 31.

The service processor (SVP) 2 is a processor that operates, manages, and services the system. In addition to loading the initial microprograms at system start-up, it monitors the operating state of the system and displays that state on the console display (CSL-DSP) 4 to inform the operator.

[Problems to Be Solved by the Invention] In the conventional system configuration shown in Fig. 2, if a failure occurs in any one of the refrigerant supply modules (CSM0 to CSMn) 10₀ to 10ₙ, for example, the entire system stops (the same applies to a failure of a processing device).

In general, systems such as ultra-large computers composed of a plurality of processing devices often perform critical work, so a system stoppage has serious consequences.

Therefore, to avoid a system stoppage, countermeasures have been taken such as duplicating not only the refrigerant supply modules (CSM0 to CSMn) 10₀ to 10ₙ and the refrigerant passages (CP0 to CPn) 12₀ to 12ₙ but the entire system.

However, duplicating the entire system increases the amount of hardware, which enlarges the installation area, raises cost, and complicates control, and this often causes various problems.

The present invention has been made in view of the above problems. Its object is to provide a redundant system consisting of a plurality of processing devices that can cope efficiently with a failure in some of the system components without duplicating the system.

[Means for Solving the Problems] According to the present invention, the above object is achieved by the means described in the claims.

That is, the present invention is a redundant system consisting of a plurality of processing devices. The underlying system comprises n (n ≥ 2) microprogram-controlled processing devices that operate under the management of a host OS stored in main memory and n cooling devices that cool the respective processing devices. It has a service processor for operational management of the system, and at system start-up an initial microprogram is loaded into each processing device to determine that device's operating function. In this system, one or more spare processing devices and cooling devices are provided, and an intermediate control device interposed between the service processor and the processing devices is provided with: a system configuration table that stores the system configuration information sent from the service processor at system start-up; means for backing up the microprogram code stored in each processing device at system start-up; means for, when a failure occurs in any processing device or cooling device during system operation, obtaining permission for an interrupt to the host OS and selecting the spare processing device and cooling device on the basis of the information in the system configuration table; and means for storing the microprogram code backed up in the intermediate control device into the spare processing device and starting that processing device up.

[Operation] In the present invention, spare processing devices and cooling devices are provided for the plurality of processing devices and their corresponding cooling devices, to serve as substitutes when any of them fails. In addition, the intermediate control device provided between the service processor and the processing devices stores the system configuration information sent from the service processor at system start-up and the microprogram code sent to each processing device.

If any processing device or cooling device fails during system operation, the intermediate control device disconnects the failed device. Then, in accordance with the system configuration information, it loads the necessary part of the previously backed-up initial microprogram code into a spare processing device and starts it up, for system reconfiguration or system degradation.

With this system configuration, even if a failure occurs during operation, the system can be restored quickly without involving the service processor or the like and without restarting the system from scratch.

[Embodiment] Fig. 1 shows the system configuration of one embodiment of the present invention. In the figure, 5 is the system unit controller (SUC); 6 is the system configuration table (CFT), which stores the system configuration information; 7 is the microcode storage (MCS), which backs up the initial microprogram code sent from the service processor to each processing device; 8a to 8d are control modules (CTM0, CTM1); 9a is the collective refrigerant supply unit (CCSU); 10ₙ₊₁ is the spare refrigerant supply module (CSMn+1); 11ₙ₊₁ is the spare processing device (PRCn+1); 12ₙ₊₁ is a refrigerant passage (CPn+1); 13₀ to 13ₙ₊₁ are unit controllers (UC); 14a and 14b are redundant tolerant control devices (TLC0, TLC1); 21 is the loading path for system configuration information; 22 is the start/stop signal path for the collective refrigerant supply unit; 23 is the start/stop signal path for the refrigerant supply modules CSM0 to CSMn+1; 24 is the processing-device power on/off signal path to the unit controllers UC0 to UCn+1; 25 is the microprogram code loading path to the processing devices PRC0 to PRCn+1; 26 is the microprogram code backup path; and 27 is the OS interrupt path. The other reference numerals are the same as in Fig. 2.

Compared with the conventional system configuration shown in Fig. 2, the system configuration of this embodiment adds a spare processing device (PRCn+1) 11ₙ₊₁ and a spare refrigerant supply module (CSMn+1) 10ₙ₊₁. Even if any one of the processing devices (PRC0 to PRCn) 11₀ to 11ₙ or the refrigerant supply modules (CSM0 to CSMn) 10₀ to 10ₙ fails, the system can switch to the spare device and keep operating without trouble.

The interface and the power control manager (PCM) of the conventional system are eliminated, and a system unit controller (SUC) 5 is newly provided. Services that the service processor (SVP) 2 conventionally performed for the processing devices (PRC0 to PRCn) 11₀ to 11ₙ through the interface are now performed through the system unit controller (SUC) 5.

In addition to the conventional interface function and the power control function of the power control manager (PCM), the system unit controller (SUC) 5 takes over much of what the service processor (SVP) 2 used to do, including system power control, environmental cooling control, and service functions for the processing devices (PRC0 to PRCn+1) 11₀ to 11ₙ₊₁.

It also has an interrupt path to the host OS for the processing devices (PRC0 to PRCn+1) 11₀ to 11ₙ₊₁.

The system unit controller (SUC) 5 also has a system configuration table (CFT) 6 and a microcode storage (MCS) 7, which serves as a backup buffer for the microprogram code. The system configuration table (CFT) 6 stores two kinds of information. The first is system reconfiguration information, used when any one of the processing devices (PRC0 to PRCn) 11₀ to 11ₙ or the refrigerant supply modules (CSM0 to CSMn) 10₀ to 10ₙ fails, for rebuilding the system with the spare processing device (PRCn+1) 11ₙ₊₁ and the spare refrigerant supply module (CSMn+1) 10ₙ₊₁. The second is system degradation information, used when there are more failures than the spares can cover: processing devices and refrigerant supply modules among (PRC0 to PRCn) 11₀ to 11ₙ and (CSM0 to CSMn) 10₀ to 10ₙ that have not failed are reassigned as substitutes, and the overall scale of the system is reduced. The microcode storage (MCS) 7 backs up the microprogram code sent from the service processor (SVP) 2 to each processing device (PRC0 to PRCn) 11₀ to 11ₙ at system start-up.
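As a rough illustration of the two stores held by the SUC, the sketch below models the CFT and the MCS as plain Python data structures. The class names and the fields `role_of`, `spares`, and `degradation_order` are assumptions made for illustration only; the patent does not specify a table layout or a storage format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SystemConfigTable:
    """Hypothetical layout for the CFT; the patent gives no concrete format."""
    # Function assigned to each active unit at start-up, e.g. {"PRC0": "CPU"}.
    role_of: Dict[str, str] = field(default_factory=dict)
    # Reconfiguration information: spare units that can replace a failed one.
    spares: List[str] = field(default_factory=list)
    # Degradation information (assumed form): roles the system can give up,
    # consulted only when no spare is left.
    degradation_order: List[str] = field(default_factory=list)


@dataclass
class MicrocodeStorage:
    """Hypothetical MCS: one backed-up microprogram image per role."""
    images: Dict[str, bytes] = field(default_factory=dict)

    def backup(self, role: str, image: bytes) -> None:
        self.images[role] = image

    def fetch(self, role: str) -> bytes:
        return self.images[role]


# Example contents after start-up (all values are made up).
cft = SystemConfigTable(
    role_of={"PRC0": "CPU", "PRC1": "channel"},
    spares=["PRC2"],
    degradation_order=["channel"],
)
mcs = MicrocodeStorage()
mcs.backup("CPU", b"cpu-microcode")
mcs.backup("channel", b"channel-microcode")
```

Keeping both stores inside the SUC is what lets a replacement unit be loaded without going back to the service processor or the floppy disk.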

The operation of the embodiment's system configuration described above, from system start-up to the handling of a failure during operation, is explained below.

Note that when system power is turned on, control-circuit power is applied to the floppy disk (FPD) 3, the service processor (SVP) 2, the system unit controller (SUC) 5, the unit controllers (UC0 to UCn+1), the control modules (CTM0, CTM1), and the tolerant controllers (TLC0, TLC1).

(1) First, the system configuration information stored on the floppy disk (FPD) 3 is stored, via the service processor (SVP) 2, in the system configuration table (CFT) 6 in the system unit controller (SUC) 5.

(2) Next, the service processor (SVP) 2 starts the collective refrigerant supply unit (CCSU) 9a via the system unit controller (SUC) 5.

(3) Then, according to the contents of the system configuration table (CFT) 6 in the system unit controller (SUC) 5, each refrigerant supply module (CSM0 to CSMn) 10₀ to 10ₙ is started.

(4) Next, the system unit controller (SUC) 5 starts the unit controllers (UC0 to UCn) 13₀ to 13ₙ and monitors environmental cooling. If cooling is normal, power is turned on to the processing devices (PRC0 to PRCn) 11₀ to 11ₙ.

(5) After that, the microprogram code prepared in advance on the floppy disk (FPD) 3 in the service processor (SVP) 2 is loaded into each processing device (PRC0 to PRCn) 11₀ to 11ₙ.

At the same time as this loading, the microprogram code is stored in the microcode storage (MCS) 7 as a backup.

Steps (1) to (5) complete the system start-up, and the system enters the operating state.
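Start-up steps (1) to (5) can be sketched as a short Python routine. This is only an illustrative model under stated assumptions: the function name `start_up`, the dictionary shapes, and the `cooling_ok` callback (standing in for the cooling check made through the unit controllers) are all hypothetical and do not come from the patent.

```python
def start_up(config, microcode, cooling_ok):
    """Model start-up steps (1) to (5); returns (event log, microcode backup).

    config:     {"PRC0": "CPU", ...} role of each processing device (from the FPD)
    microcode:  {"CPU": b"...", ...} initial microprogram per role
    cooling_ok: callable(unit) -> bool, stand-in for the cooling monitor
    """
    log = []
    cft = dict(config)                   # (1) store the configuration in the CFT
    log.append("CFT loaded")
    log.append("CCSU started")           # (2) start the collective supply unit
    for unit in cft:                     # (3) start one CSM per configured unit
        log.append(f"CSM for {unit} started")
    powered = []
    for unit in cft:                     # (4) power on only if cooling is normal
        if cooling_ok(unit):
            powered.append(unit)
            log.append(f"{unit} powered on")
        else:
            log.append(f"{unit} held off: cooling abnormal")
    mcs = {}
    for unit in powered:                 # (5) load microcode and back it up
        role = cft[unit]
        mcs[role] = microcode[role]      # backup copy kept in the MCS
        log.append(f"{unit} loaded as {role}")
    return log, mcs


log, mcs = start_up(
    {"PRC0": "CPU", "PRC1": "channel"},
    {"CPU": b"\x01", "channel": b"\x02"},
    cooling_ok=lambda unit: True,
)
```

The ordering is the point of the sketch: cooling is started and checked before any processing device receives power, and the backup copy is taken during the initial load rather than afterwards.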

(6) Thereafter, if a fault occurs in one of the processing devices (PRC0 to PRCn) 11₀ to 11ₙ during system operation, for example, the system unit controller (SUC) 5 interrupts the host OS to request an instruction to disconnect the faulty processing device.

(7) Next, the system unit controller (SUC) 5 cuts off power to the faulty processing device and disconnects it from the system. It then loads the microprogram from the microcode storage (MCS) 7 into the spare processing device (PRCn+1) 11ₙ₊₁ or, if system operation can continue at reduced scale, into any normal processing device (PRC0 to PRCn), and starts that device up.
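The choice in step (7) between switching to a spare (reconfiguration) and reassigning a healthy unit (degradation) can be sketched as follows. The function `handle_failure`, its arguments, and in particular the `expendable` set used to pick a donor unit are assumptions for illustration; the patent states only that the decision is based on the information in the CFT.

```python
def handle_failure(failed, roles, spares, mcs, expendable):
    """Model step (7). Mutates roles and spares; returns (action, target, image).

    roles:      {"PRC0": "CPU", ...} active units and their functions
    spares:     list of idle spare units, e.g. ["PRC3"]
    mcs:        {"CPU": b"...", ...} backed-up microcode per role
    expendable: roles the system can give up (assumed degradation policy)
    """
    lost_role = roles.pop(failed)        # cut power and detach the failed unit
    if spares:
        # Reconfiguration: a spare takes over the lost function.
        target, action = spares.pop(0), "reconfiguration"
    else:
        # Degradation: a healthy unit doing expendable work is reassigned.
        donors = [u for u, r in roles.items()
                  if r in expendable and r != lost_role]
        if not donors:
            return "no substitute", None, None
        target, action = donors[0], "degradation"
    image = mcs[lost_role]               # taken from the MCS, not from the SVP
    roles[target] = lost_role            # the loaded microprogram sets the role
    return action, target, image


roles = {"PRC0": "CPU", "PRC1": "channel", "PRC2": "channel"}
spares = ["PRC3"]
mcs = {"CPU": b"cpu", "channel": b"chn"}

first = handle_failure("PRC0", roles, spares, mcs, expendable={"channel"})
second = handle_failure("PRC3", roles, spares, mcs, expendable={"channel"})
```

In this example the first failure is absorbed by the spare PRC3. When PRC3 later fails and no spare remains, PRC1 gives up its channel role to become the CPU, so the system keeps running with one channel fewer.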

Each of the processing devices, including the spare, (PRC0 to PRCn+1) 11₀ to 11ₙ₊₁ is configured so that it can become any type of processing device (for example, a "channel" or a "CPU") depending on the contents of its microprogram. This allows it to serve as a spare for any of the processing devices (PRC0 to PRCn) 11₀ to 11ₙ.

[Effects of the Invention] As explained above, the redundant system consisting of a plurality of processing devices according to the present invention has the following effects.

(1) By providing redundant processing devices, cooling devices, and the like within the system, a highly reliable system can be realized without duplicating the entire system.

(2) Even if two or more processing devices or cooling devices fail, any normal processing device and cooling device can substitute for the failed ones and the system can be degraded, so operation continues without the system going down.

(3) By providing the system configuration table (CFT) and the microcode storage (MCS) in the system unit controller (SUC), switching from a failed processing device to a spare processing device is performed quickly when a fault occurs.

[Brief explanation of drawings]

Fig. 1 shows the system configuration of one embodiment of the present invention, and Fig. 2 shows an example configuration of a conventional system consisting of a plurality of processing devices.

1: main storage holding the host OS; 1a: host OS; 2: service processor (SVP); 3: floppy disk (FPD); 4: console display device; 5: system unit controller (SUC); 6: system configuration table (CFT); 7: microcode storage (MCS); 8a to 8d: control modules (CTM0, CTM1); 9a: collective refrigerant supply unit (CCSU); 10₀ to 10ₙ₊₁: refrigerant supply modules (CSM0 to CSMn+1); 11₀ to 11ₙ₊₁: processing devices (PRC0 to PRCn+1); 12₀ to 12ₙ₊₁: refrigerant passages (CP0 to CPn+1); 13₀ to 13ₙ₊₁: unit controllers (UC).

Claims

[Claims] A redundant system consisting of a plurality of processing devices, in a system that comprises n (n ≥ 2) microprogram-controlled processing devices operating under the management of a host OS stored in main memory and n cooling devices that cool the respective processing devices, that has a service processor for operational management of the system, and in which, at system start-up, an initial microprogram is loaded into each processing device to determine the operating function of each processing device, characterized in that
one or more spare processing devices and cooling devices are provided in the system, and an intermediate control device interposed between the service processor and the processing devices is provided with: a system configuration table that stores the system configuration information sent from the service processor at system start-up; means for backing up the microprogram code stored in each processing device at system start-up; means for, when a failure occurs in any processing device or cooling device during system operation, obtaining permission for an interrupt to the host OS and selecting the spare processing device and cooling device on the basis of the information in the system configuration table; and means for storing the microprogram code backed up in the intermediate control device into the spare processing device and starting up that processing device.