JPH09114701A

JPH09114701A - Simple multiprocessor monitoring system

Info

Publication number: JPH09114701A
Application number: JP7273874A
Authority: JP
Inventors: Yuichi Ota; 雄一大田
Original assignee: NEC Engineering Ltd
Current assignee: NEC Engineering Ltd
Priority date: 1995-10-23
Filing date: 1995-10-23
Publication date: 1997-05-02

Abstract

PROBLEM TO BE SOLVED: To provide a simple multiprocessor monitoring system in which the monitoring CPU can identify in which part of the task being processed the monitored CPU ran away, and can also determine the operation history and similar information of the runaway monitored CPU up to the occurrence of the fault. SOLUTION: COM RAMs 11 and 21 are provided with areas (STS/CMD areas) 13 and 23 that hold the detailed operating state information of sub CPUs 31 and 41 and in which main CPU 1 stores commands, such as a recovery procedure, when a fault occurs in sub CPU 31 or 41. Each of sub CPUs 31 and 41 has a runaway monitoring module that monitors that sub CPU's own runaway. In a prescribed cycle, each sub CPU temporarily suspends the task being processed, transfers processing to the runaway monitoring module to perform monitoring, and resumes the suspended task after the module finishes. Through the COM RAMs, each sub CPU thereby informs main CPU 1 of its own operating state.

Description

Detailed Description of the Invention

[0001]

[Technical field of the invention] The present invention relates to a multiprocessor monitoring system composed of a plurality of CPUs in order to distribute processing. In particular, it relates to a simple multiprocessor monitoring system that comprises a main CPU and at least one sub CPU, in which the main CPU monitors each sub CPU, and that requires no support tools or the like.

[0002]

[Prior art] A conventional example of a multiprocessor monitoring system composed of a plurality of CPUs is described in Japanese Unexamined Patent Publication No. S63-49948.

[0003] As shown in FIG. 7, this system includes a plurality of CPUs 100 and 200, and each of CPUs 100 and 200 has a ROM storing a module, that is, a program for individually detecting runaway. In addition, each of CPUs 100 and 200 is connected to a stall timer that is reset at predetermined timing during normal operation and that, when a fault (that is, a runaway) occurs, interrupts the CPU after a predetermined time has elapsed.

[0004] In the illustrated example, CPU 100 is described as the monitoring CPU and CPU 200 as the monitored CPU.

[0005] First, when the monitored CPU 200 is operating normally, the monitoring CPU 100 receives the response from the monitored CPU 200 to its request before stall timer 103 of the monitoring CPU 100 counts up and enables INT2 of the monitoring CPU 100. On receiving the response from the monitored CPU 200, the monitoring CPU 100 clears its stall timer 103.

[0006] In contrast, suppose that, because of a runaway of the monitored CPU 200 or the like, no response arrives from the monitored CPU 200 even after stall timer 103 of the monitoring CPU 100 has counted up. In that case stall timer 103 interrupts the monitoring CPU 100, and an INT2 interrupt occurs in the monitoring CPU 100. In response to the INT2 interrupt, the monitoring CPU 100 outputs NMI signal 109 to the NMI terminal of the monitored CPU 200. Within that interrupt processing, a reset operation is performed by reset signal 107. This prevents runaway and misrecognition and yields a system that does not affect the other CPUs.
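The stall-timer behaviour of paragraphs [0005] and [0006] can be sketched as a small simulation. This is only an illustration under stated assumptions: the class `StallTimer`, the function `monitor`, the tick-based timing, and the event strings are hypothetical names, not part of the cited system.

```python
class StallTimer:
    """Hypothetical model of stall timer 103: fires after `limit` ticks without a clear."""

    def __init__(self, limit):
        self.limit = limit
        self.count = 0

    def clear(self):
        self.count = 0

    def tick(self):
        # Returns True when the timer has counted up, i.e. INT2 would be raised.
        self.count += 1
        return self.count >= self.limit


def monitor(responses, limit=3):
    """Simulate the monitoring CPU.

    `responses` holds one bool per tick: True if the monitored CPU answered.
    Returns the list of events the monitoring CPU would generate.
    """
    timer = StallTimer(limit)
    events = []
    for answered in responses:
        if answered:
            timer.clear()          # response received: clear the stall timer
            continue
        if timer.tick():           # timer counted up with no response
            events += ["INT2", "NMI", "RESET"]
            timer.clear()
    return events
```

Note that the only thing this scheme reports is that a timeout happened. Nothing about the monitored CPU's task or registers is available to the monitoring CPU, which is the limitation the invention addresses.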

[0007]

[Problems to be solved by the invention] In the conventional example described above, however, the monitoring CPU could not identify in which part of the task being processed the monitored CPU ran away.

[0008] In addition, the monitoring CPU that monitors the monitored CPU could not determine the operation history and similar information of the runaway monitored CPU up to the occurrence of the fault.

[0009] For these reasons, the illustrated system does no more than prevent runaway and misrecognition, and has the problem that a recovery procedure appropriate to the fault cannot be carried out accurately.

[0010] Meanwhile, even simple multiprocessor monitoring systems that require no support tools or the like are increasingly expected to offer improved system reliability and rapid fault recovery. Under these circumstances, the system of FIG. 7, which can merely detect runaway, also has the drawback that it cannot adequately meet users' demands.

[0011] An object of the present invention is to solve the above problems by providing a simple multiprocessor monitoring system in which the monitoring CPU can identify in which part of the task being processed the monitored CPU ran away.

[0012] Another object of the present invention is to provide a simple multiprocessor monitoring system in which the monitoring CPU can determine detailed operating state information, such as the operation history of the runaway monitored CPU up to the occurrence of the fault.

[0013]

[Means for solving the problems] According to the present invention, there is provided a simple multiprocessor monitoring system comprising a main CPU, at least one sub CPU, a common memory provided between the main CPU and each sub CPU and used in common by the main CPU and each sub CPU, and a data bus connecting the main CPU, each sub CPU, and the common memory corresponding to that sub CPU. The common memory comprises a status/command area that stores detailed operating state information of the sub CPU and commands from the main CPU to the sub CPU, and a data area that stores data other than the detailed operating state information of the sub CPU and the commands from the main CPU to the sub CPU. The main CPU monitors each sub CPU by referring to the detailed operating state information stored in the status/command area of the common memory.

[0014] Further, according to the present invention, there is provided the simple multiprocessor monitoring system described above, wherein each sub CPU has means for writing its own state into the status/command area of the corresponding common memory as the detailed operating state information, and the main CPU has means for reading that detailed operating state information, so that when a runaway occurs in a sub CPU, it is possible to know in which part of the task being processed the sub CPU ran away.

[0015] Further, according to the present invention, there is provided the simple multiprocessor monitoring system described above, wherein either the main CPU or the common memory has means for accumulating the sub CPU state information, so that the operation history of the sub CPU up to the occurrence of a fault can be stored.

[0016] Furthermore, according to the present invention, there is provided the simple multiprocessor monitoring system described above, wherein the main CPU has a function of detecting the operation history and, when a fault occurs in a sub CPU, performs on that sub CPU a recovery procedure corresponding to the fault, based on the detected operation history.

[0017]

[Embodiments of the invention] Embodiments of the present invention are described below with reference to the drawings.

[0018] First, as shown in FIG. 1, in the simple multiprocessor monitoring system of this embodiment, main CPU 1 is connected to sub CPUs 31 and 41 via common memories (hereinafter, COM RAMs) 11 and 21, respectively, and data buses 4, 34, and 44.

[0019] Here, each of COM RAMs 11 and 21 comprises two areas. The first (hereinafter, STS/CMD area) 13, 23 stores the detailed operating state information of sub CPU 31, 41 and the commands, such as a recovery procedure, that main CPU 1 stores when a fault occurs in sub CPU 31, 41. The second (hereinafter, DATA area) 12, 22 is used to exchange ordinary data other than that detailed operating state information and those commands.
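The two-area layout of a COM RAM can be pictured with a small data structure. The sketch below is illustrative only: the field names (`return_address`, `registers`, `command`), the 16-byte DATA area size, and the class names are assumptions, since the text does not specify the actual memory map.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StsCmdArea:
    """Hypothetical model of STS/CMD area 13 or 23."""
    return_address: int = 0                         # written by the sub CPU
    registers: dict = field(default_factory=dict)   # written by the sub CPU
    command: Optional[str] = None                   # written by the main CPU on a fault


@dataclass
class ComRam:
    """Hypothetical model of COM RAM 11 or 21: STS/CMD area plus DATA area."""
    sts_cmd: StsCmdArea = field(default_factory=StsCmdArea)
    data: bytearray = field(default_factory=lambda: bytearray(16))  # size is an assumption


# One COM RAM per sub CPU, as in FIG. 1.
com_ram_11 = ComRam()   # shared between main CPU 1 and sub CPU 31
com_ram_21 = ComRam()   # shared between main CPU 1 and sub CPU 41
```

Keeping status and commands apart from ordinary data means the monitoring traffic does not have to share a format with, or be parsed out of, the normal inter-CPU payload.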

[0020] Each of sub CPUs 31 and 41 has a runaway monitoring module for monitoring its own runaway. As shown in FIG. 2, in a predetermined cycle the sub CPU temporarily suspends the task being processed (S101), transfers processing to the runaway monitoring module to perform monitoring (S102), and, after the module finishes, returns to the suspended task (S103).
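The cycle S101 to S103 can be sketched as a loop that interleaves task steps with calls to a monitoring routine. This is a simplified model under assumptions: the function name `run_sub_cpu`, the representation of a task as a list of steps, and a period counted in steps (rather than a timer interrupt) are all hypothetical.

```python
def run_sub_cpu(task_steps, period, monitor):
    """Run `task_steps`, invoking `monitor` after every `period` steps.

    Returns a log showing where monitoring was interleaved with the task.
    """
    log = []
    for i, step in enumerate(task_steps, start=1):
        log.append(step)                 # ordinary task processing
        if i % period == 0:
            # S101: the task is suspended at this point.
            monitor(i)                   # S102: runaway monitoring module runs
            log.append("monitor")
            # S103: processing returns to the suspended task.
    return log
```

For example, `run_sub_cpu(["a", "b", "c", "d"], 2, lambda i: None)` returns `["a", "b", "monitor", "c", "d", "monitor"]`.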

[0021] (First embodiment) Next, as a first embodiment, the processing of the runaway monitoring module of each of sub CPUs 31 and 41 in step S102 of FIG. 2 is described with reference to FIG. 3.

[0022] First, sub CPU 31, 41, having temporarily suspended the task being processed and transferred processing to the runaway monitoring module, issues a get request for COM RAM 11, 21 to that COM RAM (S201). If COM RAM 11, 21 is accessible (S202), the sub CPU sets in COM RAM 11, 21 a return address indicating where in the suspended task to return after the runaway monitoring module finishes (S203). This return address makes it possible to identify the point in the suspended task at which the runaway monitoring module was called.

[0023] Next, sub CPU 31, 41 sets in COM RAM 11, 21, as the detailed operating state information, the contents of its own internal registers (AX register / BX register / CX register / DX register / DI register / SI register / DS [data segment] / ES [extra segment] / SS [stack segment] / SP [stack pointer] / BP [base pointer] / PS [program segment] / PC [program code]) (S204).

[0024] Furthermore, as described later, the main CPU can obtain the detailed operating state information that the sub CPU has set in COM RAMs 11 and 21.

[0025] This means that sub CPUs 31 and 41 can notify main CPU 1 of the program area address, data area address, stack pointer address, and base pointer address that they are currently accessing.

[0026] After that, sub CPU 31, 41 performs the normal runaway detection processing (S205) and, in accordance with the return address set in S203, returns to the task from which the runaway monitoring module was called.
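Steps S201 to S205 of the runaway monitoring module can be sketched as follows. Several details are assumptions made for illustration: the busy-flag arbitration in `ComRam`, the explicit release after writing, and skipping the snapshot when the COM RAM is not accessible are not stated in the text, and all class and function names are hypothetical.

```python
REGISTER_NAMES = ("AX", "BX", "CX", "DX", "DI", "SI",
                  "DS", "ES", "SS", "SP", "BP", "PS", "PC")


class ComRam:
    """Hypothetical stand-in for one COM RAM, arbitrated by a simple busy flag."""

    def __init__(self):
        self.busy = False
        self.return_address = None
        self.registers = {}

    def try_get(self):
        # S201/S202: the get request succeeds only if the RAM is not in use.
        if self.busy:
            return False
        self.busy = True
        return True

    def release(self):
        self.busy = False


def runaway_monitor(com_ram, return_address, registers, detect_runaway):
    """Publish the sub CPU's state, run detection, and return the resume address."""
    if com_ram.try_get():
        try:
            com_ram.return_address = return_address                         # S203
            com_ram.registers = {n: registers[n] for n in REGISTER_NAMES}    # S204
        finally:
            com_ram.release()   # assumption: access is released after writing
    # Assumption: if the COM RAM was busy, the snapshot is skipped this cycle.
    detect_runaway()            # S205: normal runaway detection processing
    return return_address       # the caller resumes the suspended task here
```

Because the snapshot is refreshed on every cycle, a snapshot that stops changing is itself a signal: it shows the last point at which the sub CPU was still able to reach its monitoring module.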

[0027] Furthermore, as shown in FIG. 4, main CPU 1 first issues a get request for COM RAM 11, 21 to that COM RAM (S301). If COM RAM 11, 21 is accessible (S202), main CPU 1 reads the detailed operating state information of sub CPU 31, 41 stored in COM RAM 11, 21 (S303). Main CPU 1 creates an operating state history of sub CPUs 31 and 41 by writing the information it has read into a ring buffer allocated in a fixed area of its own RAM 3 (S304). From the return address, the contents of the internal registers, and so on, main CPU 1 can grasp the operating state of sub CPU 31, 41, such as the running state of the task it was executing and the addresses of the data areas in use.
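The history kept by main CPU 1 in S303 and S304 behaves like a fixed-capacity ring buffer. A minimal sketch, assuming the snapshot is a dictionary and using Python's `collections.deque` with `maxlen` in place of a hand-managed region of RAM 3; the class name `StateHistory` and the capacity are hypothetical.

```python
from collections import deque


class StateHistory:
    """Hypothetical ring buffer of sub CPU snapshots held by the main CPU."""

    def __init__(self, capacity):
        # A deque with maxlen silently discards the oldest entry when full,
        # which is the behaviour of a ring buffer in a fixed memory area.
        self.ring = deque(maxlen=capacity)

    def record(self, snapshot):
        # S303: read the snapshot; S304: append a copy to the history.
        self.ring.append(dict(snapshot))

    def latest(self):
        return self.ring[-1] if self.ring else None


history = StateHistory(capacity=3)
for pc in range(5):
    history.record({"PC": pc, "return_address": 0x100 + pc})
# Only the three most recent snapshots (PC = 2, 3, 4) remain.
```

A bounded buffer keeps memory use fixed while still preserving the most recent states leading up to a fault, which is the part of the history needed for diagnosis.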

[0028] If a fault such as a runaway occurs in sub CPU 31, 41, the sub CPU cannot transfer processing to the runaway monitoring module, so the detailed operating state information of sub CPU 31, 41 in COM RAM 11, 21 is no longer rewritten. Therefore, if the detailed operating state information in COM RAM 11, 21 has not changed for a predetermined number of checks (S305), main CPU 1 judges that sub CPU 31, 41 has stopped operating. Based on the task execution address shown in that detailed operating state information and on the operation history of sub CPU 31, 41, main CPU 1 then performs a recovery procedure appropriate to the fault at that time (S306), such as a task restart request, a task termination request, or, in the worst case, initialization by an instruction to rewrite PS and PC.
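The stop judgment of S305 amounts to counting consecutive reads in which the snapshot is unchanged. The sketch below illustrates that idea only. The threshold value, the class `StallDetector`, and especially the escalation policy in `choose_recovery` are assumptions: the text lists the three kinds of recovery but does not say how main CPU 1 selects among them.

```python
# The three recovery actions named in the text, ordered from mildest to most drastic.
RECOVERY_STEPS = ("restart_task", "terminate_task", "reinitialize_ps_pc")


class StallDetector:
    """Hypothetical S305 check: flags a sub CPU whose snapshot stops changing."""

    def __init__(self, threshold):
        self.threshold = threshold   # the "predetermined number" of unchanged reads
        self.last = None
        self.unchanged = 0

    def observe(self, snapshot):
        """Feed one read of the COM RAM; returns True when the sub CPU looks stopped."""
        if snapshot == self.last:
            self.unchanged += 1
        else:
            self.last = dict(snapshot)
            self.unchanged = 0
        return self.unchanged >= self.threshold


def choose_recovery(previous_attempts):
    """Hypothetical escalation policy for S306: try milder actions first."""
    index = min(previous_attempts, len(RECOVERY_STEPS) - 1)
    return RECOVERY_STEPS[index]
```

With `threshold=3`, the detector reports a stop on the fourth identical read, since the first read only establishes the baseline. A real system would also need to rule out a task that legitimately produces identical snapshots, which this sketch does not attempt.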

[0029] In the first embodiment of the present invention, the main CPU creates the operating state history of sub CPUs 31 and 41 in a ring buffer allocated in a fixed area of main CPU 1's own RAM 3. However, the detailed operating state information may instead be accumulated in COM RAMs 11 and 21 and the operating state history created there. In either case, processing that uses the created operating state history is performed by the main CPU.

[0030] (Second embodiment) Next, a second embodiment is described with reference to FIGS. 5 and 6, in comparison with the first embodiment described above.

[0031] In the second embodiment, in addition to the detailed operating state information of sub CPUs 31 and 41 that the runaway monitoring module writes to COM RAMs 11 and 21 in a predetermined cycle, main CPU 1 can request data from sub CPUs 31 and 41 as needed. In this way, besides the internal registers of sub CPU 31, 41 at that time, main CPU 1 can also obtain, via COM RAM 11, 21, the data at an address specified by main CPU 1.

[0032] As shown in FIG. 5, after setting its own internal registers in COM RAM 11, 21 (S204), sub CPU 31, 41 sets in COM RAM 11, 21 the data at the specified address requested by main CPU 1, and the like (S204a). The sub CPU can thereby notify main CPU 1, via COM RAM 11, 21, of the data contents of its RAM 33, 43.

[0033] Meanwhile, as shown in FIG. 6, main CPU 1 reads the detailed operating state information of sub CPU 31, 41 stored in COM RAM 11, 21 (S303) and creates the operating state history of sub CPUs 31 and 41 by writing that information into a ring buffer allocated in a fixed area of its own RAM 3 (S304). It then reads from COM RAM 11, 21 the data at the specified address that it requested from sub CPU 31, 41 (S304a), and can thereby obtain the data contents of RAM 33, 43 of the sub CPU.
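The request-and-reply exchange of the second embodiment (S204a on the sub CPU side, S304a on the main CPU side) can be sketched with a dictionary standing in for the COM RAM. The key names, the function names, and the idea that the request is posted as a list of addresses in the COM RAM are assumptions; the text states only that the request and the data pass through COM RAM 11, 21.

```python
def main_cpu_request(com_ram, addresses):
    """Main CPU posts the addresses it wants to inspect (storage location assumed)."""
    com_ram["requested_addresses"] = list(addresses)


def sub_cpu_publish(com_ram, registers, local_ram):
    """Sub CPU side of the runaway monitoring module in the second embodiment."""
    com_ram["registers"] = dict(registers)                       # S204
    com_ram["requested_data"] = {                                # S204a
        addr: local_ram[addr]
        for addr in com_ram.get("requested_addresses", [])
    }


def main_cpu_collect(com_ram, history):
    """Main CPU side: record the snapshot, then read back the requested data."""
    history.append(dict(com_ram["registers"]))                   # S303, S304
    return dict(com_ram.get("requested_data", {}))               # S304a


# Usage: main CPU 1 asks for the byte at address 0x10 of the sub CPU's RAM.
com_ram = {}
history = []
sub_ram = bytearray(range(32))        # hypothetical contents of RAM 33
main_cpu_request(com_ram, [0x10])
sub_cpu_publish(com_ram, {"AX": 1}, sub_ram)
requested = main_cpu_collect(com_ram, history)
```

The requested data rides on the same periodic monitoring cycle as the register snapshot, so no extra communication path is needed.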

[0034] Furthermore, main CPU 1 can report the detailed operating state information, operation history, and so on of each of sub CPUs 31 and 41 to a still higher-level CPU (not shown) (S304b). The operating states of all CPUs can therefore be recognized at the system level, and the higher-level CPU can also request the state of each of sub CPUs 31 and 41 from main CPU 1.

[0035]

[Effects of the invention] As described above, the present invention has a common memory comprising two areas: an area that stores the detailed operating state information of the monitored CPU and the commands, such as a recovery procedure, that the monitoring CPU stores when a fault occurs in the monitored CPU, and an area for exchanging ordinary data other than that information and those commands. By using this common memory, the invention can provide a simple multiprocessor monitoring system in which the monitoring CPU can identify in which part of the task being processed the monitored CPU ran away.

[0036] Further, the present invention can provide a simple multiprocessor monitoring system in which the monitoring CPU can determine detailed operating state information, such as the operation history of the runaway monitored CPU up to the occurrence of the fault.

[Brief description of the drawings]

FIG. 1 is a diagram showing the configuration of the simple multiprocessor monitoring system of the present invention.

FIG. 2 is a flowchart showing the operation of a sub CPU.

FIG. 3 is a flowchart showing the operation of the runaway monitoring module of the sub CPU according to the first embodiment of the present invention.

FIG. 4 is a flowchart showing the operation of the main CPU section according to the first embodiment of the present invention.

FIG. 5 is a flowchart showing the operation of the runaway monitoring module of the sub CPU according to the second embodiment of the present invention.

FIG. 6 is a flowchart showing the operation of the main CPU section according to the first embodiment of the present invention.

FIG. 7 is a diagram showing the configuration of a conventional multiprocessor monitoring system.

[Explanation of reference numerals]
1: main CPU
11, 21: common memory
31, 41: sub CPU
2, 32, 42: ROM
3, 33, 43: RAM
100: monitoring CPU
103: stall timer
107: reset signal
109: NMI signal
110: common memory (dual-port RAM)
200: monitored CPU

Claims

[Claims]

1. A simple multiprocessor monitoring system comprising a main CPU, at least one sub CPU, and a common memory provided between the main CPU and each sub CPU and used in common by the main CPU and each sub CPU, wherein the common memory includes a status/command area that stores detailed operating state information of the sub CPU and commands from the main CPU to the sub CPU, and the main CPU monitors each sub CPU by referring to the detailed operating state information stored in the status/command area of the common memory.

2. The simple multiprocessor monitoring system according to claim 1, wherein each sub CPU writes its own state into the status/command area of the corresponding common memory as the detailed operating state information, and the main CPU reads the detailed operating state information and determines the running state, so that when a runaway occurs in a sub CPU, it can be detected in which part of the task being processed by that sub CPU the runaway occurred.

3. The simple multiprocessor monitoring system according to claim 2, wherein either the main CPU or the common memory has means for accumulating the sub CPU state information, so that the operation history of the sub CPU up to the occurrence of a fault can be stored.

4. The simple multiprocessor monitoring system according to claim 3, wherein the main CPU has a function of detecting the operation history and, when a fault occurs in the sub CPU, performs on the sub CPU a recovery procedure corresponding to the fault, based on the detected operation history.