JPS6112580B2 - Failure processing system for multiprocessor - Google Patents

Info

Publication number
JPS6112580B2
Authority
JP
Japan
Prior art keywords
processor
common bus
bus
processors
multiprocessor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
JP52137901A
Other languages
Japanese (ja)
Other versions
JPS5471537A (en)
Inventor
Koichiro Yamaguchi
Kazuo Nishimura
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi Ltd
Original Assignee
Hitachi Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi Ltd
Priority to JP13790177A (patent JPS5471537A)
Publication of JPS5471537A
Publication of JPS6112580B2
Granted

Description

DETAILED DESCRIPTION OF THE INVENTION
The present invention relates to a failure handling method for a multiprocessor that distributes load or functions in an information processing apparatus.

An example of the prior art considered closest to the present invention is explained with reference to FIG. 1. FIG. 1 shows a typical multiprocessor in which n processors (PU1 to PUn) 1₁ to 1ₙ are organically coupled by a common bus 3, to which a common memory (CM) 5, a data channel (DCH) 6, and a dedicated input/output control unit (PIOC) 7 are also connected. Each processor 1₁ to 1ₙ is connected to its corresponding individual memory (LM) 2₁ to 2ₙ and operates according to the individual program stored in that memory. In a load-distributed multiprocessor, therefore, several processors share the same kind of processing, and the corresponding LMs 2₁ to 2ₙ store the same kind of program.

In a function-distributed multiprocessor, on the other hand, different programs corresponding to the respective processing functions are stored in the LMs 2₁ to 2ₙ. The basic configuration shown in FIG. 1 is therefore generally adopted for both the load-distributed type and the function-distributed type.

The key design points of this multiprocessor are, first, which processor handles a processing request from a request source such as the DCH 6 or the PIOC 7, and second, how a processor failure is detected and how its processing is handed over to a normal processor.

Regarding the first point, conventional systems also assign processing to the processors by, for example, priority processing or cyclic processing; in hardware, this function has been handled by the common bus control unit (CBC) 4.
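The two assignment policies named above can be illustrated with a short sketch. The text does not describe how the CBC 4 actually arbitrates, so everything here is an assumption made for illustration: the names `assign_by_priority` and `CyclicAssigner` are hypothetical, processors are numbered from 0, and "priority" is taken to mean "lowest number wins".

```python
def assign_by_priority(available):
    """Fixed priority (assumed rule): the lowest-numbered available processor wins."""
    return min(available) if available else None


class CyclicAssigner:
    """Cyclic assignment: each search resumes just after the last processor chosen."""

    def __init__(self, n):
        self.n = n
        self.last = n - 1  # makes the very first search start at processor 0

    def assign(self, available):
        # Visit every processor once, starting after the previous choice.
        for step in range(1, self.n + 1):
            candidate = (self.last + step) % self.n
            if candidate in available:
                self.last = candidate
                return candidate
        return None  # no processor can take the request
```

With processors 0, 1 and 3 available, the priority rule always returns 0, while the cyclic assigner hands successive requests to 0, 1, 3 and then 0 again, which spreads the load.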

Regarding the second point, the general approach in conventional systems was to give each processor its own failure detection and shutdown functions.

This approach, however, increases the hardware of every processor, because a failed processor must be given the ability to control itself. Moreover, a failed processor may interfere with the rest of the system: for example, its common memory access request signal may become stuck in the requesting state. In that case the other processors are prevented from using the common bus 3, which may in turn bring down the entire multiprocessor system.

An object of the present invention is to eliminate the above drawbacks of the prior art and to provide an economical and highly reliable multiprocessor.

The present invention is characterized in that, in a multiprocessor whose processors are coupled by a common bus, a centralized monitoring control unit is provided which detects a failure of each processor at an early stage through a centralized monitoring control bus provided between it and the individual processors, electrically disconnects the failed processor from the common bus, and transfers the load to a normal processor, thereby constituting an economical and highly reliable multiprocessor.

The overall configuration of one embodiment of the present invention is explained with reference to the functional block diagram of FIG. 2. FIG. 2 shows the multiprocessor of the present invention, in which a centralized monitoring control unit (CSC) 8, which detects a failure of each processor at an early stage and carries out failure handling, is added to the conventional multiprocessor. The CSC 8 is connected to each of the processors 1₁ to 1ₙ by the centralized monitoring control bus 9 and the common bus 3. When one of the processors 1₁ to 1ₙ, or one of the individual memories 2₁ to 2ₙ connected to it, fails, the CSC 8 detects the failure via the centralized monitoring control bus 9 and, depending on the severity of the failure, issues a failure handling command to the failed processor over the same bus. It also issues an interrupt request to another, normal processor through the common bus 3 so that the load of the failed processor is transferred.

Next, the specific configuration and operation of the CSC 8 are explained with reference to FIGS. 3 and 4.

In FIG. 3, the i-th of the n processors 1₁ to 1ₙ is shown on the left as a representative example, and the CSC is shown on the right. The processors 1₁ to 1ₙ are generally of the stored-program control type; their overall configuration is omitted, and only the portion that interfaces with the CSC 8, which is particularly relevant to the present invention, is shown in the figure.

The fault detection circuit (ED) 10 is the part with which the processor itself detects abnormal states. It detects faults such as (a) interruption of the processor clock, (b) program runaway, detected by a fault detection timer, and (c) parity errors in the individual memory, and reports the detection of such a fault to the control circuit (CONT) 21 in the CSC 8 via the fault report signal line 11. The control circuit 21 analyzes the cause of the processor fault. For a severe fault such as (a) or (b), in order to prevent the fault of the failed processor 1ᵢ from affecting other processors via the common bus 3, it stops the failed processor via the processing unit stop signal line 15. In addition, via the processing unit disconnection signal line 16, it instructs the bus transmitter 19 and the bus receiver 20, which are connected directly to the common bus 3 and electrically control signal input and output for the processor 1ᵢ, to disconnect electrically from the common bus 3.
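The decision made by the control circuit 21 can be sketched in software, although the patent describes hardware. This is a minimal sketch under stated assumptions: `Fault`, `ProcessorPort` and `handle_fault_report` are hypothetical names, and the behaviour for fault (c) is not spelled out in the text, so the sketch simply leaves the processor running in that case.

```python
from enum import Enum


class Fault(Enum):
    CLOCK_LOSS = "a"        # (a) processor clock interruption
    PROGRAM_RUNAWAY = "b"   # (b) fault detection timer expired
    LM_PARITY_ERROR = "c"   # (c) individual memory parity error


# The text names (a) and (b) as examples of severe faults.
SEVERE = {Fault.CLOCK_LOSS, Fault.PROGRAM_RUNAWAY}


class ProcessorPort:
    """One processor as seen from the CSC (hypothetical model)."""

    def __init__(self):
        self.running = True        # cleared by the stop signal (line 15)
        self.bus_connected = True  # cleared by the disconnection signal (line 16)


def handle_fault_report(port, fault):
    """Stop and isolate the processor on a severe fault; return True if isolated."""
    if fault in SEVERE:
        port.running = False        # halt the failed processor
        port.bus_connected = False  # detach its transmitter/receiver from the common bus
        return True
    # Assumption: non-severe faults are only recorded, not isolated.
    return False
```

Keeping the severity table in the CSC rather than in each processor mirrors the point of the design: the isolation decision does not rely on the failed processor behaving correctly.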

Next, the control circuit 21 reads, via the data line 17 in the centralized monitoring control bus 9, the contents of the registers that indicate the state of the failed processor at the time of the fault, and accumulates them in order in the report queue register 22. Examples of these registers are the program counter (PC) 11, which holds the execution address of the stored program, the interrupt register (ISF) 12, which indicates whether there are external interrupts to the processor, and the status register (STR) 13, which indicates the cause of the fault. The register to be read is selected via the register selection line 18.

When the reading of the above fault information is complete, the control circuit 21 sets the bit with the corresponding number in the failed processing unit display register 23 to indicate that the processor in question has failed.
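The register readout and the display bit can be modelled as follows. This is only an illustrative sketch: the class `Csc`, the callback `read_register`, the zero-based processor index and the bit layout (bit i for processor i) are all assumptions, since the text gives no register widths or encodings.

```python
from collections import deque

# Registers named in the text: program counter, interrupt register, status register.
REGISTERS = ("PC", "ISF", "STR")


class Csc:
    """Hypothetical software model of the CSC's two storage elements."""

    def __init__(self):
        self.report_queue = deque()  # stands in for report queue register 22
        self.failed_bits = 0         # stands in for display register 23

    def capture(self, index, read_register):
        """Read each register of processor `index`, then mark it as failed.

        `read_register(name)` stands in for selecting a register on line 18
        and reading its contents over data line 17.
        """
        for name in REGISTERS:
            self.report_queue.append((index, name, read_register(name)))
        # The bit is set only after all registers have been read.
        self.failed_bits |= 1 << index
```

For example, `csc.capture(2, regs.__getitem__)` with `regs = {"PC": 0x1234, "ISF": 0, "STR": 2}` queues three entries for processor 2 and sets bit 2 of `failed_bits`.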

Next, the CSC 8 issues an interrupt request via the common bus 3 to the normal processors to report that a fault has occurred in the i-th processor 1ᵢ. In response, the contents of the failed processing unit display register 23 and the report queue register 22 held in the CSC 8 are sent to the common bus from the bus transmitters 24 and 25, under the control of the control circuit 21 through the common bus control line 26. Among the normal processors, one that is able to start the abnormality analysis program reads this information and carries out the abnormality analysis.
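The notification step can be sketched as a selection among the normal processors. The text does not say how the analyzing processor is chosen, so this sketch assumes the first eligible one in numerical order; the function name, the `can_analyze` flag and the bit layout of `failed_bits` (bit i for processor i, zero-based) are hypothetical.

```python
def notify_and_analyze(failed_bits, report_queue, processors):
    """Pick a normal processor to run the analysis.

    `processors` is a list of dicts with keys "id" and "can_analyze".
    Returns (analyzer id or None, list of failed ids, copy of the report).
    """
    failed = [i for i in range(len(processors)) if (failed_bits >> i) & 1]
    report = list(report_queue)  # contents sent over the common bus
    for p in processors:
        # Skip failed processors and those unable to start the analysis program.
        if p["id"] not in failed and p["can_analyze"]:
            return p["id"], failed, report
    return None, failed, report
```

If processor 1 has failed and processor 0 cannot start the analysis program, the sketch selects processor 2 and hands it the queued register contents.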

The above series of operations is shown in the operation diagram of FIG. 4.

As explained above, introducing the centralized monitoring control unit (CSC) into the multiprocessor eliminates the weakness of the conventional system, namely that a failure of a single processor could bring down the entire multiprocessor system, and thus makes it possible to increase the availability of the multiprocessor dramatically.

Furthermore, the centralized monitoring control unit can be implemented with far less hardware than a stored-program processor, so it is dramatically more economical than implementing the same functions with a processor.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a functional block diagram showing a typical configuration of a conventional multiprocessor. FIG. 2 is a functional block diagram of a multiprocessor showing one embodiment of the present invention. FIG. 3 is a functional diagram showing an example configuration of the centralized monitoring control unit, which is the key point of the present invention. FIG. 4 is an operation diagram showing part of the fault detection and fault handling.
1₁ to 1ₙ: processors; 2₁ to 2ₙ: individual memories; 3: common bus; 9: centralized monitoring control bus; 10: fault detection circuit; 11: program counter; 12: interrupt register; 13: status register; 19, 24, 25: bus transmitters; 20: bus receiver; 8: centralized monitoring control unit; 21: control circuit; 22: report queue register; 23: failed processing unit display register; 26: common bus control line.

Claims (1)

1. A failure handling method for a multiprocessor in which a plurality of processors are organically coupled by a common bus, to which an input/output control unit, a common memory, and the like are connected, so as to provide processing capacity exceeding that of the individual processors, characterized in that a centralized monitoring control bus is provided in addition to the common bus, and a centralized monitoring control unit is provided which, through that bus, detects a failure of an individual processor at an early stage; in the case of a major failure, stops the operation of the failed processor and at the same time electrically disconnects it from the common bus; reads the contents of the registers indicating the state of the failed processor at the time of the failure and stores the number of the failed processor; and notifies another, normal processor of the stored failure information and causes it to carry out the abnormality analysis, thereby increasing the reliability of the system.
JP13790177A 1977-11-18 1977-11-18 Failure processing system for multiprocessor Granted JPS5471537A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP13790177A JPS5471537A (en) 1977-11-18 1977-11-18 Failure processing system for multiprocessor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP13790177A JPS5471537A (en) 1977-11-18 1977-11-18 Failure processing system for multiprocessor

Publications (2)

Publication Number Publication Date
JPS5471537A JPS5471537A (en) 1979-06-08
JPS6112580B2 JPS6112580B2 (en) 1986-04-09

Family

ID=15209317

Family Applications (1)

Application Number Title Priority Date Filing Date
JP13790177A Granted JPS5471537A (en) 1977-11-18 1977-11-18 Failure processing system for multiprocessor

Country Status (1)

Country Link
JP (1) JPS5471537A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6420975U (en) * 1987-07-22 1989-02-01

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS59135553A (en) * 1983-01-24 1984-08-03 Fujitsu Ltd Holding system of fault information
JPS59173857A (en) * 1983-03-24 1984-10-02 Nec Corp Processor controlling system
JPS60541A (en) * 1983-06-16 1985-01-05 Nec Corp Isolation circuit of single device
JPS61237142A (en) * 1985-04-15 1986-10-22 Nec Corp Information processing system

Also Published As

Publication number Publication date
JPS5471537A (en) 1979-06-08

Similar Documents

Publication Publication Date Title
JPH0746322B2 (en) Faulty device identification system
US3964055A (en) Data processing system employing one of a plurality of identical processors as a controller
JPS6112580B2 (en)
JPS6350739B2 (en)
JPH05224964A (en) Bus abnormality information system
JP2937857B2 (en) Lock flag release method and method for common storage
JPS5917467B2 (en) Control computer backup method
JP3363579B2 (en) Monitoring device and monitoring system
JP2744113B2 (en) Computer system
JP2946541B2 (en) Redundant control system
JPS60134352A (en) Duplex bus control device
JP2876676B2 (en) Communication control method between processors
JP2725385B2 (en) Data transfer method of information processing system
JPH07114521A (en) Multimicrocomputer system
JP3055906B2 (en) Emergency operation method
JP2815730B2 (en) Adapters and computer systems
JP2580311B2 (en) Mutual monitoring processing method of multiplex system
JPH03278213A (en) Method for detecting and informating power supply status transition of extended storage device
JPS61135293A (en) Remote supervisory control system
JPS58114141A (en) Fault reporting system for multimicroprocessor
JPS6330660B2 (en)
JPS622335B2 (en)
JPS58201155A (en) Dual system monitoring system
JPS6010666B2 (en) Computer system monitoring method
JPS62105243A (en) Recovery device for system fault