JPS59124099A - Fault repair processing system of common storage area

Info

Publication number: JPS59124099A
Application number: JP57233745A
Authority: JP
Inventors: Masaji Ishibashi; 正路石橋
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1982-12-29
Filing date: 1982-12-29
Publication date: 1984-07-18
Also published as: JPH0236013B2

Abstract

PURPOSE: To simplify handling of a fixed fault by dividing the data transfer area into a main area and a spare area, indicating which area is to be used, and having the hardware access the indicated area automatically. CONSTITUTION: If a fixed fault occurs in the data transfer area (CIA) while the CPU microprogram is preparing an instruction, the microprogram transfers the necessary data to the spare CIA, sends a switching signal indicating that the CIA has been switched, and turns the execute I/O signal (XIO signal) on again. If the fault occurs after the CHP has received the XIO signal, a switching signal is sent to switch the CIA and the instruction is interrupted as instruction processing damage. If the CPU detects a fault when receiving the accept signal, instruction processing damage handling is performed and the switching signal is sent. Consequently, subsequent start-ups use the CIA that has no fault.

Description

Detailed Description of the Invention

[Technical Field of the Invention]

The present invention relates to a recovery method for a data processing system in which a plurality of processing devices (or a plurality of processing units within one device) communicate with each other via a commonly accessible storage means, applied when a fault occurs in that common storage means.

[Prior art to the invention]

As main memory capacity has grown, more hardware functions have been moved into firmware, and firmware has come to use part of the main memory as a hardware area.

Main memory generally has an ECC circuit, which deals with 1-bit faults. For 2-bit faults and fixed faults, however, the error is detected and handled, but almost no measures are taken for recovery.

This is because countermeasures have been designed on the assumption that double faults occur only very rarely.

However, if a fixed fault occurs in a transfer area for important data, such as the area used for communication between microprograms, all subsequent processing fails and the system goes down.

[Purpose of the invention]

The purpose of the present invention is to solve the above problem. The data transfer area (hereinafter called the CIA, communication interface area) is divided into a main area and a spare area. The device or microprogram that detects an error instructs the other device or microprogram which CIA to use. The instructed side is not aware of this instruction and accesses the area as though it were always using the main CIA; the hardware automatically accesses whichever CIA has been indicated.
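The transparent redirection described above can be sketched as follows. This is only an illustrative model, not the circuit of the patent: the class `CiaSelector`, the two base addresses, and the assumption that a switching signal always selects the spare area are all hypothetical.

```python
# Hypothetical base addresses; the patent does not give concrete values.
MAIN_CIA_BASE = 0xFF00
SPARE_CIA_BASE = 0xFE00


class CiaSelector:
    """Models hardware that maps the single CIA address the microprogram
    uses onto the main or the spare area."""

    def __init__(self):
        self.use_spare = False  # assumed to be set by the switching signal

    def switch(self):
        """Receive a switching signal: later accesses go to the spare CIA."""
        self.use_spare = True

    def physical_address(self, offset):
        """Translate an offset within the CIA into a physical address."""
        base = SPARE_CIA_BASE if self.use_spare else MAIN_CIA_BASE
        return base + offset


selector = CiaSelector()
before = selector.physical_address(4)  # lands in the main CIA
selector.switch()
after = selector.physical_address(4)   # same offset, now in the spare CIA
```

The microprogram keeps issuing the same offset in both cases; only the selector state changes, which is the property the text attributes to the hardware.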

[Embodiments of the invention]

The case where the CIA is provided in the main memory is described here.

The CIA contains the information needed to start an I/O device or to handle an I/O interrupt, for example the instruction code, channel number, device number, CAW (channel address word), condition code, and CSW (channel status word) (see FIG. 2).
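As a rough illustration of the fields listed above, the CIA could be modeled as a simple record. The field names, their types, and the sample values are assumptions made for this sketch; the actual layout is given only in FIG. 2, which is not reproduced in the text.

```python
from dataclasses import dataclass


@dataclass
class CommunicationInterfaceArea:
    """Hypothetical record of the items the text says the CIA holds."""
    instruction_code: int    # which I/O instruction the CPU is issuing
    channel_number: int
    device_number: int
    caw: int                 # channel address word
    condition_code: int = 0  # written back by the CHP
    csw: int = 0             # channel status word, written back by the CHP


# Arbitrary sample values, chosen only to show the record in use.
cia = CommunicationInterfaceArea(
    instruction_code=0x01,
    channel_number=1,
    device_number=0x80,
    caw=0x1000,
)
```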

FIG. 1 is a block diagram showing an embodiment of the present invention, in which a microprogram-controlled CPU and a likewise microprogram-controlled CHP (channel processor) communicate using a predetermined area of the main memory MS as the CIA.

The recovery processing method of the present invention is described below for each point in time at which a fault may occur.

(1) When the CPU microprogram recognizes an I/O instruction or the like, it sets the necessary data in the CIA, turns on the XIO signal (execute I/O signal) to the CHP, and thereby conveys a start request.

If a fixed fault occurs in the CIA during this period, the microprogram moves the necessary data to the spare CIA, sends a signal indicating that the CIA has been switched (hereinafter called the switching signal), and turns the XIO signal on again. In other words, the instruction is retried and is not interrupted.
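The retry in case (1) can be sketched as below. This is a minimal model under stated assumptions: the `Area` class, the `FixedFault` exception, the signal names, and the simplification that the fault is detected while writing the request data are all hypothetical and not taken from the patent.

```python
class FixedFault(Exception):
    """Raised by the model when an area with a fixed fault is written."""


class Area:
    """Hypothetical model of one CIA (main or spare)."""

    def __init__(self, faulty=False):
        self.faulty = faulty
        self.data = {}

    def write(self, key, value):
        if self.faulty:
            raise FixedFault(key)
        self.data[key] = value


def start_io(request, main, spare):
    """CPU side of case (1): set the request data, then raise XIO.
    Returns the list of signals sent to the CHP, in order."""
    signals = []
    try:
        for key, value in request.items():
            main.write(key, value)
    except FixedFault:
        # Move the necessary data to the spare CIA and announce the switch.
        for key, value in request.items():
            spare.write(key, value)
        signals.append("SWITCH")
    # The instruction is retried, not interrupted: XIO is raised either way.
    signals.append("XIO")
    return signals


request = {"instruction_code": 1, "device_number": 0x80}
faulty_main, spare = Area(faulty=True), Area()
signals_on_fault = start_io(request, faulty_main, spare)
signals_normal = start_io(request, Area(), Area())
```

In the faulty case the CHP sees a switching signal followed by XIO; in the normal case it sees XIO alone.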

(2) When the CHP receives the XIO signal, it reads the CIA, analyzes the instruction code, performs the necessary processing, writes the condition code and other results to the CIA, and turns on the ACPT signal (accept signal) to notify the CPU that processing is complete.

If a fixed fault occurs in the CIA during this period, the processing in the CHP is interrupted by channel control error handling or external damage handling. The CPU microprogram learns of the fixed fault as a machine check, sends the switching signal to switch the CIA, and interrupts the instruction as instruction processing damage.

(3) When the CPU receives the ACPT signal normally, it takes the condition code and other results out of the CIA, sets them in the PSW or stores the CSW at the prescribed address, drops the XIO signal, confirms that the ACPT signal has dropped, and transfers control to the next instruction.

If a fixed fault is detected at this point, instruction processing damage handling is performed and the switching signal is sent.

As a result, start-ups from the next time onward can be performed using a CIA that has no fixed fault.
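The three cases above differ in what happens to the current instruction, but all of them end with the switching signal being sent. The following sketch summarizes that as a lookup; the enum, the function name, and the action strings are hypothetical labels for the steps in the text, not identifiers from the patent.

```python
from enum import Enum


class Phase(Enum):
    CPU_SETUP = 1       # case (1): CPU sets data in the CIA and raises XIO
    CHP_PROCESSING = 2  # case (2): CHP reads the CIA and writes results
    CPU_COMPLETION = 3  # case (3): CPU takes results after ACPT


def recovery_actions(phase):
    """Return the recovery steps the text describes for a fixed fault
    detected in the given phase."""
    if phase is Phase.CPU_SETUP:
        return [
            "move data to spare CIA",
            "send switching signal",
            "turn XIO on again (instruction retried)",
        ]
    if phase is Phase.CHP_PROCESSING:
        return [
            "CHP interrupts processing (channel control error or external damage)",
            "CPU learns of fault as machine check",
            "send switching signal",
            "interrupt instruction as instruction processing damage",
        ]
    return [
        "instruction processing damage handling",
        "send switching signal",
    ]


all_actions = {phase: recovery_actions(phase) for phase in Phase}
```

Only case (1) retries the instruction; in cases (2) and (3) the current instruction is given up, and the benefit of the switch is that later start-ups use the fault-free CIA.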

Because the CIA is placed near the maximum address of the main memory, its address changes whenever the main memory capacity changes. Therefore, at initialization when the system is powered on, the CHP either is told the maximum main memory address by the SVP (service processor) and calculates the CIA address automatically, or is told the CIA address directly by the SVP.
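The first option, calculating the CIA address from the maximum address, might look like the sketch below. The patent states only that the CIA lies near the top of main memory; the 64-byte size, the placement of the main CIA at the very top, and the spare area directly beneath it are assumptions made for illustration.

```python
CIA_SIZE = 64  # assumed size in bytes; not specified in the patent


def cia_addresses(max_address, cia_size=CIA_SIZE):
    """Return (main_base, spare_base) for a memory whose highest valid
    address is max_address, under the assumed layout: main CIA at the
    very top of memory, spare CIA immediately below it."""
    top = max_address + 1
    main_base = top - cia_size
    spare_base = main_base - cia_size
    return main_base, spare_base


# The same routine yields different addresses for different memory sizes.
small = cia_addresses(0xFFFF)    # 64 KiB of main memory
large = cia_addresses(0xFFFFF)   # 1 MiB of main memory
```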

The main memory is also subject to patrol by the CPU: it is read and written within a fixed period, so recovery from transient faults is carried out frequently.

[Effect of the invention]

As described above, according to the present invention, the microprogram control unit always needs to be aware of only one CIA, and because CIA switching is controlled by hardware, processing is simplified.

[Brief explanation of the drawings]

FIG. 1 is a block diagram of one embodiment of the present invention. MS is the main memory, CPU is the central processing unit, CHP is the channel processor, AR is an address register, DR is a data register, MSA is the main memory address register, CIA is the common storage area, and EF is a flag indicating a fixed fault. FIG. 2 is a diagram showing an example of the contents of the CIA.

Claims

[Claims] A fault recovery processing method for a common storage area in a data processing system in which a plurality of microprogram-controlled processing devices communicate with each other via a commonly accessible common storage area, characterized in that a plurality of the common storage areas are provided, the microprogram control unit of each processing device uses a single address as the address of the common storage area, and each processing device is provided with means for detecting a fault in the common storage area and hardware means for changing that address with respect to the other processing devices when a fault is detected.