JP2846238B2 - System control management method by error trace

Info

Publication number: JP2846238B2
Application number: JP6092715A
Authority: JP
Inventors: 正人牛島
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1994-04-28
Filing date: 1994-04-28
Publication date: 1999-01-13
Anticipated expiration: 2014-01-13
Also published as: JPH07295862A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a system control management method using an error trace, which makes it easy to identify the source of an error and which controls and manages the system according to the identified error source.

[0002]

[Prior Art] In a conventional computer system consisting of a memory system operating at low speed and a small amount of hardware operating at high speed, identification of the error source assumes that an error occurs and is detected at the same time, and that the error is detected before the source of the error causes a change in program state. That is, the program that started the operation causing the error and the program being executed when the error is detected are assumed to be in the same program state (user program or system program), and the program that caused the error and the program that detected it are assumed to be the same process. The cause and the source of the error were identified, and the error was handled, on that premise.

[0003] FIG. 23 is an explanatory diagram that schematically shows, along the passage of time, how errors occur in a computer system. In the figure, 1 and 2 denote two different processes, process A and process B. Reference numeral 3 denotes the execution state of a system program, and 4 denotes the execution state of a user program. Reference numerals 5 to 9 denote the timings at which operations that cause errors are performed, and 10 to 14 denote the timings at which the errors resulting from the operations at timings 5 to 9 are detected. Timings 5 and 10, 6 and 11, 7 and 12, 8 and 13, and 9 and 14 are the respective pairs of error occurrence and error detection.

[0004] Conventional identification of the error source presupposes that an error is detected before the source of the error causes a change in program state. Because a conventional computer system consists of a memory system operating at low speed and a small amount of hardware operating at high speed (such as a cache memory or a write buffer for speeding up writes), the interval between the timing at which an error-causing operation is performed and the timing at which the resulting error is detected does not become as large as shown in FIG. 23. Errors were therefore identified on the premise that error occurrence and error detection are simultaneous and that the error is detected before the source of the error causes a change in program state.

[0005]

[Problems to Be Solved by the Invention] Identification of the error source in conventional computer systems was carried out under the conditions described above. Recent advances in hardware technology, however, have made it possible to build computer systems that use faster CPUs and large amounts of high-speed cache memory. As a result, even when the CPU executes an operation that causes an error, the resulting error is not detected immediately. Meanwhile, the CPU can execute the program in the cache memory at high speed, so by the time the error is detected the program has been executed well beyond the point where the error originated. The state of the program being executed may then differ from the state of the program that caused the error.

[0006] That is, among the pairs of error occurrence and error detection shown in FIG. 23 (timings 5 and 10, 6 and 11, 7 and 12, 8 and 13, and 9 and 14), for the pairs 6 and 11, 8 and 13, and 9 and 14 the program state at the time the error is detected differs from the program state of the error source, so the error source cannot be identified correctly. For the pair of timings 6 and 11, the user program is the error source, but the system program is wrongly identified as the error source. For the pair of timings 8 and 13, the system program is the error source, but the user program is wrongly identified as the error source. For the pair of timings 9 and 14, the unrelated process B is identified as the error source.

[0007] In other words, the premise that held readily in conventional computer systems, namely that the program that caused an error and the program that detected the error are the same process, no longer holds, and accurate error detection becomes difficult.

[0008] The object of the invention of claim 1 is to provide a system control management method using an error trace that can accurately and easily identify the program using the resource that caused an error.

[0009] The object of the invention of claim 2 is to provide a system control management method using an error trace that can accurately and easily identify the error source even when the program state at the time the error occurred differs from the program state at the time the error is detected.

[0010] The object of the invention of claim 3 is to provide a system control management method using an error trace that can accurately and easily identify the error source.

[0011] The object of the invention of claim 4 is to provide a system control management method using an error trace that can accurately and easily identify the error source.

[0012] The object of the invention of claim 5 is to provide a system control management method using an error trace that can accurately and easily identify the error source.

[0013] The object of the invention of claim 6 is to provide a system control management method using an error trace that can accurately and easily identify the error source.

[0014] The object of the invention of claim 7 is to provide a system control management method using an error trace that can accurately and easily identify the error source.

[0015] The object of the invention of claim 8 is to provide a system control management method using an error trace that can accurately and easily identify the error source.

[0016] The object of the invention of claim 9 is to provide a system control management method using an error trace that can accurately and easily identify the error source.

[0017] The object of the invention of claim 10 is to provide a system control management method using an error trace that can accurately and easily identify the error source.

[0018] The object of the invention of claim 11 is to provide a system control management method using an error trace that can efficiently collect the data needed to identify the error source and can accurately and easily identify the error source.

[0019] The object of the invention of claim 12 is to provide a system control management method using an error trace that can accurately and easily identify the error source.

[0020] The object of the invention of claim 13 is to provide a system control management method using an error trace that can accurately and easily identify the error source.

[0021] The object of the invention of claim 14 is to provide a system control management method using an error trace that accurately and easily identifies the error source and improves the reliability of the system.

[0022] The object of the invention of claim 15 is to provide a system control management method using an error trace that accurately and easily identifies the error source, raises the utilization of available resources, and improves the reliability of the system.

[0023] The object of the invention of claim 16 is to provide a system control management method using an error trace that accurately and easily identifies the error source and improves the reliability of the system.

[0024] The object of the invention of claim 17 is to provide a system control management method using an error trace that accurately and easily identifies the error source and improves the reliability of the system.

[0025] The object of the invention of claim 18 is to provide a system control management method using an error trace that accurately and easily identifies the error source and improves the reliability of the system.

[0026]

[Means for Solving the Problems] The system control management method using an error trace according to the invention of claim 1 searches attribute information stored in a trace buffer, from which it can be determined which resources a process is using, and, by determining the resources used by the process, identifies the program using the resource that caused the error.
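The lookup described in this paragraph can be sketched as follows. This is a minimal illustration, not the patented implementation: the `TraceEntry` layout, the resource names, and the function `find_users_of` are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEntry:
    """One attribute record in the trace buffer (hypothetical layout)."""
    process_id: int
    program_name: str
    resources: frozenset  # names of the resources the process was using


def find_users_of(trace_buffer, failed_resource):
    """Return the programs whose trace entries list the failed resource."""
    return sorted({entry.program_name for entry in trace_buffer
                   if failed_resource in entry.resources})


# Hypothetical trace contents.
trace_buffer = [
    TraceEntry(1, "editor", frozenset({"page:0x40", "disk0"})),
    TraceEntry(2, "logger", frozenset({"page:0x80"})),
]
print(find_users_of(trace_buffer, "page:0x80"))
```

The point of the sketch is that the search key is the failed resource, not the program that happened to be running when the error was detected.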

[0027] The system control management method using an error trace according to the invention of claim 2, when a storage device such as a write buffer or a cache is in use, freezes the data in the storage device when an error occurs, searches the attribute information stored in the trace buffer, such as the program state before or after the program state changed, and identifies the error source.

[0028] The system control management method using an error trace according to the invention of claim 3 stores in the trace buffer, as attribute information, a pointer to a program management table that is either created and managed by the operating system according to program usage or defined in advance, a copy of the management table, or a program identifier.

[0029] The system control management method using an error trace according to the invention of claim 4 stores in the trace buffer attribute information such as the time at which a program state change occurred or the time elapsed since the program state changed.

[0030] The system control management method using an error trace according to the invention of claim 5 obtains the time at which the error actually occurred from the time at which the error was detected and the error detection delay (the time required for the error to be detected after it occurred). It then uses attribute information stored in the trace buffer, such as the time at which a program state change occurred or the time elapsed since the program state changed, to find the program that was being executed at the obtained error occurrence time, and thereby identifies the error source.
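As a rough sketch of this back-dating step, the occurrence time is the detection time minus the detection delay, and the trace buffer is then searched for the program running at that time. The trace contents, time units, and function names below are assumptions made for illustration only.

```python
from bisect import bisect_right

# Hypothetical trace: (time of state change, program running from that time on).
trace_buffer = [(0, "user:A"), (40, "system"), (55, "user:A"), (90, "user:B")]


def program_at(trace, t):
    """Return the program running at time t, or None if t precedes the trace."""
    times = [timestamp for timestamp, _ in trace]
    i = bisect_right(times, t) - 1
    return trace[i][1] if i >= 0 else None


def identify_error_source(trace, detection_time, detection_delay):
    """Back-date the error by the detection delay, then look up the program."""
    occurrence_time = detection_time - detection_delay
    return program_at(trace, occurrence_time)


# Detected at t=95 while user:B runs, but the error occurred at t=50.
print(identify_error_source(trace_buffer, detection_time=95, detection_delay=45))
```

With these sample values the program running at detection time is `user:B`, while the back-dated lookup returns `system`, which is the kind of mismatch the method is meant to resolve.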

[0031] The system control management method using an error trace according to the invention of claim 6 prepares in advance a list of the error detection delays required to detect each type of error, and looks up in the list the detection delay for the type of error that occurred. It obtains the actual error occurrence time for that type of error from the actual error detection time and the detection delay obtained from the list. It then uses the time at which a program state change occurred, or the time elapsed since the program state changed, stored in the trace buffer as attribute information, to find the program that was being executed at the obtained error occurrence time, and thereby identifies the error source.
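The per-type list can be pictured as a simple table keyed by error type. The error type names and delay values below are invented for illustration; the patent text here does not give concrete figures.

```python
# Hypothetical detection delays per error type, in the trace buffer's time unit.
DETECTION_DELAY = {
    "write_buffer_parity": 30,
    "cache_ecc": 12,
    "bus_timeout": 100,
}


def occurrence_time(error_type, detection_time, delays=DETECTION_DELAY):
    """Estimate when an error occurred from its type and its detection time."""
    try:
        return detection_time - delays[error_type]
    except KeyError:
        raise ValueError(f"no detection delay listed for {error_type!r}") from None


print(occurrence_time("cache_ecc", detection_time=100))
```

Raising on an unlisted error type is a design choice of this sketch: silently assuming a zero delay would reintroduce the misattribution the list is meant to prevent.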

[0032] The system control management method using an error trace according to the invention of claim 7 stores attribute information about program state changes in the trace buffer on the basis of transitions of the CPU execution state from user program to system program or from system program to user program.

[0033] The system control management method using an error trace according to the invention of claim 8 stores attribute information about program state changes in the trace buffer on the basis of transitions of the CPU execution state from a non-privileged level to a privileged level or from a privileged level to a non-privileged level.
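A software model of transition-triggered recording might look like the sketch below. The fixed-size buffer, the entry layout, and the `set_mode` hook are assumptions; in a real system the hook would sit in the trap or mode-switch path rather than in application code.

```python
from collections import deque

USER, SYSTEM = "user", "system"


class TraceBuffer:
    """Fixed-size trace buffer; the oldest entries are overwritten first."""

    def __init__(self, capacity):
        self.entries = deque(maxlen=capacity)
        self.mode = USER

    def set_mode(self, new_mode, time, program_id):
        """Record an entry only when the execution state actually changes."""
        if new_mode != self.mode:
            self.entries.append((time, self.mode, new_mode, program_id))
            self.mode = new_mode


buf = TraceBuffer(capacity=4)
buf.set_mode(SYSTEM, time=10, program_id=7)  # user -> system: recorded
buf.set_mode(SYSTEM, time=12, program_id=7)  # no transition: ignored
buf.set_mode(USER, time=20, program_id=7)    # system -> user: recorded
print(list(buf.entries))
```

Recording only on transitions keeps the buffer small while still letting a later search reconstruct which state was active at any given time.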

[0034] The system control management method using an error trace according to the invention of claim 9 stores attribute information about program state changes in the trace buffer at the timing when interrupt processing starts or ends, at the timings when interrupt processing starts and ends, or at an arbitrary timing during interrupt processing.

[0035] The system control management method using an error trace according to the invention of claim 10 stores attribute information about program state changes in the trace buffer at the timing when interrupt processing starts or ends, at the timings when the interrupt processing starts and ends, or at an arbitrary timing during the interrupt processing, where the interrupt is caused by the completion of an input/output operation in an input/output device, the occurrence of an error during an input/output operation, a hardware error, or an exception.

[0036] The system control management method using an error trace according to the invention of claim 11 takes as interface parameters any one or a combination of the following: program counter value, CPU state value, time element, trace event number, contents of various registers, program name, process name, message, address space identifier, process identifier, program identifier, process attribute information, program attribute information, message queue identifier, and message queue attribute information. It uses an interface provided to application programs to store attribute information about program state changes as trace information.

[0037] The system control management method using an error trace according to the invention of claim 12 stores attribute information about program state changes in a trace buffer provided in a hardware control circuit.

[0038] The system control management method using an error trace according to the invention of claim 13 stores attribute information about program state changes in a trace buffer provided in hardware such as the interface circuit between the CPU and main memory, an external bus control circuit, a local bus control circuit, an input/output control circuit, or an output memory.

[0039] The system control management method using an error trace according to the invention of claim 14 controls and manages the system by, according to the type of the program identified as the error source, stopping execution of the identified program, stopping the system, or suspending execution of the program and then resuming and continuing processing once the cause of the error has been recovered.
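The choice among these three actions can be sketched as a small policy function. The specific mapping from program type to action is an illustrative assumption; this paragraph states only that the action depends on the type of the identified program.

```python
from enum import Enum


class Action(Enum):
    STOP_PROGRAM = "stop the identified program"
    STOP_SYSTEM = "stop the system"
    SUSPEND_AND_RESUME = "suspend, then resume after recovery"


def choose_action(program_type, recoverable):
    """Pick a control action for the program identified as the error source.

    The mapping is hypothetical: recoverable errors are suspended and resumed,
    unrecoverable system program errors stop the system, and unrecoverable
    user program errors stop only that program.
    """
    if recoverable:
        return Action.SUSPEND_AND_RESUME
    if program_type == "system":
        return Action.STOP_SYSTEM
    return Action.STOP_PROGRAM


print(choose_action("user", recoverable=False))
```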

[0040] The system control management method using an error trace according to the invention of claim 15 controls and manages the system by, when the program identified as the error source is a user program or a user process, returning the resources used by that user program or user process to the system management means that manages the system.

[0041] The system control management method using an error trace according to the invention of claim 16 controls and manages the system by returning the resources that were used by the program identified as the error source, such as memory areas, various management tables, or input/output devices, to their respective free resource management pools.

[0042] The system control management method using an error trace according to the invention of claim 17 controls and manages the system by not returning to the respective free resource management pools the memory page in which the program identified as the error source caused the error, or any other resource in which it caused an error.
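Returning healthy resources while withholding faulty ones can be sketched as below. The separate `quarantine` list is an assumption added so the withheld resources remain visible; the text says only that such resources are not returned to the free pool.

```python
def release_resources(owned, faulty, free_pool, quarantine):
    """Return a program's resources to the free pool, withholding faulty ones."""
    for resource in owned:
        if resource in faulty:
            quarantine.append(resource)  # kept out of circulation
        else:
            free_pool.append(resource)


free_pool, quarantine = [], []
release_resources(
    owned=["page:0x40", "page:0x80", "table:3"],
    faulty={"page:0x80"},
    free_pool=free_pool,
    quarantine=quarantine,
)
print(free_pool, quarantine)
```

Keeping the faulty page out of the pool prevents another program from being allocated the same resource and hitting the same error.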

[0043] The system control management method using an error trace according to the invention of claim 18 controls and manages the system by recovering from the error through replacing the resource in which the program identified as the error source caused the error with a substitute resource, and by resuming and continuing processing once the cause of the error has been recovered.

[0044]

[Operation] In the system control management method using an error trace according to the invention of claim 1, attribute information from which it can be determined which resources a process is using is stored in the trace buffer. This attribute information is searched, and the resources used by the process are determined from it. When the source of the error is a resource owned by the process, the program using that resource is identified as the error source.

[0045] In the system control management method using an error trace according to the invention of claim 2, when an error occurs, the data in storage devices such as the write buffer and the cache is frozen, and the attribute information stored in the trace buffer, such as the program state before or after the state of the running program changed, is searched to identify the error source accurately and easily.

[0046] In the system control management method using an error trace according to the invention of claim 3, a pointer to a program management table that is created and managed by the operating system according to program usage or defined in advance, a copy of the management table, a program identifier, or the like is stored in the trace buffer as attribute information, and the stored attribute information is searched to identify the error source accurately and easily.

[0047] In the system control management method using an error trace according to the invention of claim 4, the time at which a program state change occurred, or the time elapsed since the program state changed, is stored in the trace buffer as attribute information, and the stored attribute information is searched to identify the error source accurately and easily.

[0048] In the system control management method using an error trace according to the invention of claim 5, the program that was being executed at the error occurrence time is found from the time at which the error actually occurred, obtained from the error detection time and the error detection delay, and from attribute information stored in the trace buffer, such as the time at which a program state change occurred or the time elapsed since the program state changed. This makes it possible to identify the error source accurately and easily.

[0049] In the system control management method using an error trace according to the invention of claim 6, the actual error occurrence time for the type of error is obtained from the error detection time and from the error detection delay looked up in the list according to the type of error that occurred. Attribute information stored in the trace buffer, such as the time at which a program state change occurred or the time elapsed since the program state changed, is then used to find the program that was being executed at the obtained error occurrence time. This makes it possible to identify accurately and easily the program that is the error source.

[0050] In the system control management method using an error trace according to the invention of claim 7, attribute information indicating a program state change is stored in the trace buffer when, as a program state change, the CPU execution state transitions from user program to system program or from system program to user program. This makes it possible to identify the error source accurately and easily.

[0051] In the system control management method using an error trace according to the invention of claim 8, attribute information indicating a program state change is stored in the trace buffer when, as a program state change, the CPU execution state transitions from a non-privileged level to a privileged level or from a privileged level to a non-privileged level. This makes it possible to identify the error source accurately and easily.

[0052] In the system control management method using an error trace according to the invention of claim 9, attribute information indicating a program state change is stored in the trace buffer at the timing when interrupt processing starts or ends, at both timings, or at an arbitrary timing during interrupt processing. This makes it possible to identify the error source accurately and easily.

[0053] In the system control management method using an error trace according to the invention of claim 10, attribute information indicating a program state change is stored in the trace buffer at the timing when interrupt processing starts or ends, at the timings when the interrupt processing starts and ends, or at an arbitrary timing during the interrupt processing, where the interrupt is caused by the completion of an input/output operation in an input/output device, the occurrence of an error during an input/output operation, a hardware error, or an exception. This makes it possible to identify the error source accurately and easily.

[0054] In the system control management method using an error trace according to the invention of claim 11, any one or a combination of the following is taken as interface parameters: program counter value, CPU state value, time element, trace event number, contents of various registers, program name, process name, message, address space identifier, process identifier, program identifier, process attribute information, program attribute information, message queue identifier, and message queue attribute information. The interface, which is provided to application programs, is used to store program attribute information in the trace buffer, so that the data needed to identify the error source is collected efficiently and the error source can be identified accurately and easily.

[0055] According to the invention of claim 12, the system control management method using an error trace stores attribute information indicating a program state change in a trace buffer provided in a hardware control circuit, and uses the stored attribute information to identify the error source accurately and easily.

[0056] According to the invention of claim 13, the system control management method using an error trace stores trace information, which is attribute information indicating a program state change, in a trace buffer provided in hardware such as the interface circuit between the CPU and main memory, an external bus control circuit, a local bus control circuit, an input/output control circuit, or an output memory, and uses the stored trace information to identify the error source accurately and easily.

[0057] According to the invention of claim 14, the system control management method using an error trace, depending on the type of program identified as the error source, stops execution of the identified program, stops the system, or postpones execution of the program. Once the cause of the error has been recovered, it resumes and continues execution of the program or operation of the system, improving the reliability of the system against errors that occur.

[0058] According to the invention of claim 15, when the program identified as the error source is a user program or user process, the system control management method using an error trace returns the resources used by that user program or user process to the system management means that manages the system, improving the utilization of available resources and the reliability of the system against errors that occur.

[0059] According to the invention of claim 16, when the program identified as the error source is a user program or user process, the system control management method using an error trace returns the resources used by that user program or user process, such as memory areas, various management tables, or input/output devices, to their respective free-resource management pools, improving the utilization of available resources and the reliability of the system against errors that occur.

[0060] According to the invention of claim 17, the system control management method using an error trace does not return to the respective free-resource management pools the memory page that caused the error in the program identified as the error source, or any other resource that caused an error. This prevents errors from recurring through reuse of the faulty resource and improves the reliability of the system against errors that occur.
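The resource handling described for claims 15 to 17 can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name `release_resources`, the list-based free pool, and the separate quarantine list are hypothetical and do not come from the patent text.

```python
def release_resources(owned, faulty, free_pool, quarantine):
    """Return a terminated program's resources, holding back the faulty ones.

    owned      -- resources held by the program identified as the error source
    faulty     -- resources that caused an error (must not be reused)
    free_pool  -- hypothetical free-resource management pool
    quarantine -- hypothetical holding area for faulty resources
    """
    for resource in owned:
        if resource in faulty:
            quarantine.append(resource)   # kept out of the pool to prevent reuse
        else:
            free_pool.append(resource)    # available to other programs again


free_pool, quarantine = [], []
release_resources(["page 3", "page 7", "table 2"], {"page 7"}, free_pool, quarantine)
```

Keeping the faulty page in a separate list rather than discarding it is a design choice of the sketch; it leaves room for later repair or replacement of the resource.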

[0061] According to the invention of claim 18, the system control management method using an error trace, depending on the program identified as the error source, stops execution of the identified program, recovers from the error by replacing the resource that caused the error in that program with an alternative resource, and resumes and continues processing once the cause of the error has been recovered, improving the reliability of the system against errors that occur.

[0062] [Embodiments] Embodiment 1. An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of the system control management method using an error trace according to this embodiment. FIG. 1(a) is a block diagram for the case in which the method is implemented by a system program, and FIG. 1(b) shows the configuration of the method when trace processing hardware is used. In FIG. 1(a), 21 denotes execution processing of a system program or user program, 22 denotes a trace buffer into which trace information (the attribute information of the program) is written, and 23 denotes an error determination program that determines the source of an error. In FIG. 1(b), 22 denotes trace processing hardware which, when trace processing is started by the user program 21a or the system program 21b, collects attribute information on the program state change and stores it in the trace buffer 24.

[0063] In this system control management method using an error trace, whenever a program state change occurs, the attribute information of the process is stored in a trace buffer memory or in a trace buffer area in memory. When an error occurs, the error determination program follows the attribute information stored in the trace buffer memory or trace buffer area and identifies the program that is the source of the error.

[0064] FIG. 2 is a flowchart showing this series of operations. According to the flowchart, when a program state change occurs (step ST1), the attribute information of the program at the time of the state change is stored in the trace buffer 24 (step ST2). When the error detection function and the error handling function for detected errors, both already activated, detect that an error has occurred (step ST3), the cause of the error is investigated (step ST4). In this investigation, the cause of the error is determined from the error information, the program attribute information is read from the trace buffer 24 (step ST5), and the error source is investigated and identified from this attribute information (step ST6). Error handling is then performed on the identified error source according to the cause of the error (step ST7).
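The flow of steps ST1 to ST7 can be sketched in a few lines. The sketch below is illustrative only: the `TraceRecord` fields, the module-level `trace_buffer` list, and the `is_source` predicate standing in for the cause-specific check are assumptions, not structures defined in the text.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class TraceRecord:
    time: int      # when the state change occurred
    program: str   # program whose attribute information was captured
    state: str     # e.g. "user" or "system"


trace_buffer: List[TraceRecord] = []


def on_state_change(record: TraceRecord) -> None:
    # ST1, ST2: a state change occurred, so store its attribute information.
    trace_buffer.append(record)


def on_error(is_source: Callable[[TraceRecord], bool]) -> Optional[str]:
    # ST3, ST4: an error was detected and its cause is being investigated.
    # ST5: read the attribute information back, newest entry first.
    for record in reversed(trace_buffer):
        # ST6: the first record matching the cause names the error source.
        if is_source(record):
            # ST7: error handling would target this program.
            return record.program
    return None


on_state_change(TraceRecord(10, "process A", "user"))
on_state_change(TraceRecord(20, "process A", "system"))
on_state_change(TraceRecord(30, "process B", "user"))
source = on_error(lambda r: r.state == "user" and r.time <= 25)
```

Here the predicate selects the newest user-state record at or before time 25, so `source` is `"process A"`.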

[0065] As a result, without a trace buffer or a memory area used as a trace buffer area, the error source would have to be identified by exhaustively searching the tables of all processes or other explicit or implicit management information in the system. In this embodiment, by contrast, attribute information on program state changes is stored in the trace buffer or in the memory area used as the trace buffer area, so the program that is the error source can be identified efficiently from this attribute information.

[0066] Embodiment 2. An embodiment of the present invention is described below with reference to the drawings. FIG. 3 shows the program attribute information stored in the trace buffer 24. Program attribute information 1 to n is stored in the trace buffer 24 in order of the time at which each program state change occurred. In this embodiment, the program attribute information stored in the trace buffer 24 is searched along the time of occurrence either in ascending order (tracing past attribute information backward, against the flow of time) or in descending order (tracing attribute information in the same direction as the flow of time).

[0067] Storing the attribute information in order of occurrence time therefore makes it easy to search the entries in time order (for example, starting from the newest) following the changes over time. The error source can thus be identified efficiently, and the program that is the error source can be identified efficiently from the program attribute information.

[0068] Embodiment 3. An embodiment of the present invention is described below with reference to the drawings. FIG. 4 is a detailed flowchart of step ST6 in the flowchart of FIG. 2 described in Embodiment 1. In this embodiment, the resource in which the error occurred is detected, and the program that is the error source is identified from that resource.

[0069] According to this flowchart, the resources used by the process are checked from the program attribute information read from the trace buffer 24 in step ST5 (step ST6a). That is, the program attribute information includes information such as the resources the process is using. Resources here are the memory areas, input/output devices, and so on used by the program. Next, the resource information checked in step ST6a is detected (step ST6b), and it is determined whether the cause of the error originated from the detected resource (step ST6c). If it did not, the process returns to step ST6a, and steps ST6b through ST6c are repeated until it is determined that the cause of the error originated from the detected resource. When step ST6c determines that the cause of the error originated from the detected resource, that process is identified as the source of the error (step ST6d).
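The loop of steps ST6a to ST6d can be sketched as a nested search. The record layout (a process name paired with a list of resource names) and the function name `find_error_source` are assumptions chosen for illustration.

```python
def find_error_source(records, faulty_resource):
    """records: (process, resources) pairs read from the trace buffer in ST5."""
    for process, resources in records:          # ST6a: take the next record's resources
        for resource in resources:              # ST6b: pick out each resource
            if resource == faulty_resource:     # ST6c: did the error come from it?
                return process                  # ST6d: this process is the source
    return None                                 # no record uses the faulty resource


records = [
    ("process C", ["page 0x3000"]),
    ("process B", ["page 0x2000", "disk 1"]),
    ("process A", ["page 0x1000"]),
]
```

With these sample records, a fault on `"disk 1"` points to `"process B"`, while a resource that no record mentions yields `None`.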

[0070] In this embodiment too, therefore, the program that is the error source can be identified efficiently from the program attribute information.

[0071] Embodiment 4. An embodiment of the present invention is described below with reference to the drawings. FIG. 5 shows the program attribute information stored in the trace buffer 24 of this embodiment. Program attribute information 1 to n is stored in the trace buffer in, for example, order of occurrence time of the program state changes, and each stored entry is the attribute information of the program immediately before the state change occurred. In this case, the program attribute information stored in the trace buffer 24 may be searched in ascending or descending order along the time of occurrence.

[0072] FIG. 6 also shows program attribute information stored in the trace buffer 24. Program attribute information 1 to n is stored in the trace buffer 24 in, for example, order of occurrence time of the program state changes, and each stored entry is the attribute information of the program after the state change. In this case too, the program attribute information stored in the trace buffer may be searched in ascending or descending order along the time of occurrence.

[0073] In this embodiment, therefore, the program that is the error source can be identified efficiently from the attribute information, before or after the state change, that was stored in the trace buffer 24 when the program state changed.

[0074] Embodiment 5. An embodiment of the present invention is described below. In the system control management method using an error trace of this embodiment, when the hardware configuration has a write buffer or cache, the write buffer is frozen when an error occurs and the trace buffer is then searched. The cache or write buffer is assumed here to operate in FIFO fashion. Suppose the trace buffer stores the new program state after each state change. If a program state change occurs immediately after the operation that causes the error, and the CPU detects the error after transitioning to the new state, the CPU performs the operation of writing the state change to the trace buffer, but at that moment this write is still pending in the write buffer. As shown in the flowchart of FIG. 7, the write buffer is therefore temporarily frozen at this point to stop the state change from being written to the trace buffer. The attribute information stored in the trace buffer is then searched and examined, which makes it easy to identify the error source. If the attribute information is stored in order of occurrence time, this search may be done in the ascending or descending order described in Embodiment 2.
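The effect of freezing the write buffer can be illustrated with a small simulation. The `WriteBuffer` class, its method names, and the string records are hypothetical; the sketch only models a FIFO queue of pending trace writes that stops draining once frozen.

```python
from collections import deque


class WriteBuffer:
    """FIFO queue of trace writes that have not yet reached the trace buffer."""

    def __init__(self, trace_buffer):
        self.trace_buffer = trace_buffer
        self.pending = deque()
        self.frozen = False

    def queue(self, record):
        self.pending.append(record)

    def freeze(self):
        self.frozen = True

    def drain(self):
        # Pending writes are applied in FIFO order unless the buffer is frozen.
        while self.pending and not self.frozen:
            self.trace_buffer.append(self.pending.popleft())


trace_buffer = ["process A (user)"]      # state in which the faulty operation ran
wb = WriteBuffer(trace_buffer)
wb.queue("process A (system)")           # state change right after the faulty operation
wb.freeze()                              # error detected: freeze before the write lands
wb.drain()                               # has no effect while frozen
suspect = trace_buffer[-1]
```

Because the later state change never reaches the trace buffer, its newest entry still describes the state in which the faulty operation was issued.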

[0075] Embodiment 6. An embodiment of the present invention is described below with reference to the drawings. FIG. 8 is an explanatory diagram showing the configuration of the trace buffer in the system control management method using an error trace of this embodiment. In the figure, 31 denotes the attribute information most recently stored in the ring-shaped trace buffer, and 32 to 33 denote the attribute information stored in the respective buffers of the ring in order from the oldest.

[0076] In this embodiment, the trace buffer or trace buffer area is organized as a ring buffer and used cyclically. As a result, without making the trace buffer or trace buffer area large, the buffers can be reused starting from the one holding the oldest information once the ring buffer is full, which improves the utilization of the trace buffer.

[0077] Besides building the ring buffer from pointer-based link information, a ring buffer can also be built by using an index into an array so that the array is treated as a ring. Each buffer making up the ring buffer may also store a flag indicating whether the information stored in it is valid, or information indicating the storage time.
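An array-index ring buffer with a per-slot validity flag and storage time might look like the sketch below. The class name `RingTraceBuffer` and its methods are assumptions for illustration; the text does not prescribe a particular layout.

```python
class RingTraceBuffer:
    """Fixed-size trace buffer that overwrites its oldest slot when full."""

    def __init__(self, capacity):
        self.slots = [None] * capacity     # (time, info) pairs
        self.valid = [False] * capacity    # per-slot validity flags
        self.next = 0                      # array index of the next slot to write

    def store(self, time, info):
        self.slots[self.next] = (time, info)
        self.valid[self.next] = True
        # Wrapping the index turns the plain array into a ring.
        self.next = (self.next + 1) % len(self.slots)

    def newest_first(self):
        n = len(self.slots)
        for k in range(1, n + 1):
            i = (self.next - k) % n
            if self.valid[i]:
                yield self.slots[i]


ring = RingTraceBuffer(3)
for time, info in [(1, "A"), (2, "B"), (3, "C"), (4, "D")]:
    ring.store(time, info)
history = list(ring.newest_first())
```

After four stores into three slots, the oldest entry `(1, "A")` has been overwritten and `history` lists the remaining three entries from newest to oldest.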

[0078] Embodiment 7. As the attribute information described in Embodiments 1 to 6, a pointer to the program's management table, part or all of a copy of the management table, or a program identifier such as a process ID for identifying a process or a task ID for identifying a task may be stored in the trace buffer together with other information, as shown in FIG. 9.

[0079] This information can also be managed for each resource by bit position in a management variable area.

[0080] Embodiment 8. As the attribute information described in Embodiments 1 to 6, the time at which a program state change occurred, or the elapsed time since the program state changed, may also be stored in the trace buffer. FIG. 10 is an explanatory diagram for describing the features of one embodiment of this system control management method using an error trace. FIG. 10(a) shows on a time axis the error occurrence time T1, the time T2 from occurrence of the error until the error is recognized by the CPU, and so on. FIG. 10(b) shows the transitions of process execution proceeding over the same period.

[0081] The operation of this embodiment is described with reference to FIG. 10. The trace buffer stores, as attribute information, the time at which each program state change occurred or the elapsed time since the program state changed. As described above, let T1 be the error occurrence time and T2 the time required from occurrence of the error until the error is recognized by the CPU. Then, as shown in FIG. 10(a), the program that was executing at time Te, that is, at the time given by (T1 - T2), can be identified as the one that caused the error. According to the process execution transitions shown in FIG. 10(b), a user program was executing at time Te, so the error source can be identified as the user program.
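The lookup at time Te = T1 - T2 can be sketched with a binary search over time-ordered trace entries. The tuple layout `(time, process, state)`, the function names, and the sample times are assumptions for illustration; the sketch simply applies the formula as the text states it.

```python
from bisect import bisect_right


def running_at(trace, t):
    """Return the newest trace entry whose time is at or before t, or None."""
    times = [entry[0] for entry in trace]
    i = bisect_right(times, t) - 1
    return trace[i] if i >= 0 else None


def identify_by_time(trace, t1, t2):
    te = t1 - t2                 # Te = T1 - T2, as given in the text
    return running_at(trace, te)


# Entries are (time of state change, process, "U" user state or "S" system state).
trace = [
    (0, "process A", "U"),
    (10, "process A", "S"),
    (20, "process B", "S"),
    (30, "process B", "U"),
]
culprit = identify_by_time(trace, t1=37, t2=5)
```

With T1 = 37 and T2 = 5, Te is 32, which falls in the interval that began at time 30, so `culprit` is the user-state entry for process B.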

[0082] Embodiment 9. If the time T2 required from occurrence of the error until the error is recognized by the CPU depends on the type of error, T2 can be determined in advance for each error type and tabulated as shown in FIG. 11. The error recognition time T2 appropriate to the type of error that occurred is then read from the table, allowing the error source to be identified with good precision.
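A per-error-type table of T2 values could be as simple as a dictionary. The error type names and delay values below are invented placeholders, not figures from FIG. 11, and `estimated_error_time` is a hypothetical helper.

```python
# Hypothetical recognition delays T2 per error type (arbitrary time units).
RECOGNITION_DELAY = {
    "memory_parity": 5,
    "bus_timeout": 12,
    "io_error": 40,
}


def estimated_error_time(t1, error_type, table=RECOGNITION_DELAY):
    """Return Te = T1 - T2, with T2 looked up from the error type."""
    return t1 - table[error_type]


te = estimated_error_time(100, "bus_timeout")
```

An unknown error type raises `KeyError` in this sketch; a real system would need a policy for that case, such as a conservative default delay.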

[0083] Embodiment 10. As shown in FIG. 12, a list of programs and the resources each program uses may be stored in the trace buffer. The program using the resource that caused the error can then be found by searching the trace buffer, so the program that is the error source can be identified accurately and easily.

[0084] Embodiment 11. A process context switch may also be used as the timing for storing program attribute information in the trace buffer. In this embodiment, in step ST2 of FIG. 2, attribute information is stored in the trace buffer at the timing when execution switches from one process to another. As a result, information on the order of process execution is stored in the trace buffer as attribute information. When an error occurs, searching the attribute information in the trace buffer readily reveals the order in which processes ran before the error, so the process that was running when the error occurred can be identified easily.

[0085] To describe this more concretely with reference to FIG. 10: each time a process execution transition takes place (here, process A → process B → process C), the attribute information of the process is stored together with time information. When the actual error occurrence time is found to be Te, the attribute information stored with the time information readily shows that the process running at time Te was process B.

[0086] Embodiment 12. An embodiment of the present invention is described below. In this embodiment, in step ST2 of FIG. 2, the change in program state is stored in the trace buffer each time the CPU execution level changes. That is, storing program attribute information in the trace buffer each time a system call, a privilege-level change, an interrupt, or the like occurs makes it easy to identify whether the error source is a user program, a system program, or an interrupt program.

[0087] In FIG. 10, (S) in the process execution transitions denotes the CPU operating under a system program, and (U) denotes the CPU operating under a user program. By storing this CPU execution state in the trace buffer together with time information, it can readily be determined that at time Te the CPU was executing a user program and that the source of the error at time Te is the user program.

[0088] Furthermore, storing program state changes in the trace buffer at both of these timings yields the program execution history on the time axis (for example, process A "user state" → process A "system state" → process B "system state" → process B "user state"), enabling more reliable identification of the error source.

[0089] When program attribute information is stored in the trace buffer each time an interrupt or the like occurs, it may be stored for each interrupt caused by an input/output interrupt, an exception, a hardware error, or the like. It may further be stored at the timing when the interrupt processing starts or ends, at both timings, or at any timing during the interrupt processing.

[0090] Embodiment 13. An embodiment of the present invention is described below with reference to the drawings. FIG. 13 is a conceptual diagram showing the configuration of the system control management method using an error trace of this embodiment. This embodiment provides an application-program software interface through which a user program explicitly stores attribute information in the trace buffer. In the figure, 41 denotes a user program, 42 a system program, 43 an error determination program that identifies the error source from the attribute information stored in the trace buffer, and 44 the application-program software interface through which the user program explicitly stores attribute information in the trace buffer.

[0091] In this embodiment, a software interface is provided for explicitly storing attribute information in the trace buffer from a user program, and through this interface the user program 41 explicitly stores attribute information in the trace buffer 24. The user program is therefore written with explicit awareness that attribute information is stored in the trace buffer 24 via the software interface 44.

[0092] The parameters of the software interface include the program counter value, CPU status value, time element, trace event number, contents of various registers, program name or process name, message, address space identifier, process identifier, program identifier, process attribute information, program attribute information, message queue identifier, and message queue attribute information. This attribute information is stored in the trace buffer 24 via the software interface 44.

[0093] As a result, storage in the trace buffer 24 of attribute information such as the program counter value, CPU status value, time element, trace event number, contents of various registers, program name or process name, message, address space identifier, process identifier, program identifier, process attribute information, program attribute information, message queue identifier, and message queue attribute information can be controlled by software, and data useful for identifying the error source can be collected efficiently in the trace buffer 24.
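One possible shape for such a software interface is sketched below. The function name `trace_put`, the keyword-argument style, and the parameter spellings are assumptions; only the list of parameter kinds is taken from the description.

```python
# Parameter kinds named in the description; the identifier spellings are hypothetical.
ALLOWED_PARAMETERS = {
    "program_counter", "cpu_status", "time", "trace_event_number",
    "registers", "program_name", "process_name", "message",
    "address_space_id", "process_id", "program_id",
    "process_attributes", "program_attributes",
    "message_queue_id", "message_queue_attributes",
}


def trace_put(trace_buffer, **parameters):
    """Hypothetical interface: store the given attribute information as one record."""
    if not parameters:
        raise ValueError("at least one parameter is required")
    unknown = set(parameters) - ALLOWED_PARAMETERS
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    trace_buffer.append(dict(parameters))


trace_buffer = []
trace_put(trace_buffer, process_id=42, program_name="editor", trace_event_number=7)
```

Accepting any combination of the listed parameters mirrors the "any one or a combination" wording, while rejecting unknown names keeps the stored records predictable for the error determination program.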

[0094] Embodiment 14. An embodiment of the present invention is described below. Embodiments 11 and 12 described storing program attribute information in the trace buffer at each process context switch or each change in CPU execution level. Alternatively, attribute information may be stored in the trace buffer only when a factor specified in advance occurs. Such factors can include the type of interrupt, the type of system call, a process or program identifier, a process name, program name, or function name, and reads, writes, or execution in a specific address region.
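Tracing only on pre-specified factors amounts to a filter in front of the trace buffer. In the sketch below, the `SelectiveTracer` class and the `(kind, name)` encoding of a factor are illustrative assumptions.

```python
class SelectiveTracer:
    """Stores attribute information only for factors chosen in advance."""

    def __init__(self, factors):
        self.factors = set(factors)   # (kind, name) pairs, e.g. ("interrupt", "disk")
        self.trace_buffer = []

    def on_event(self, kind, name, attribute_info):
        if (kind, name) not in self.factors:
            return False              # not a selected factor: nothing is stored
        self.trace_buffer.append((kind, name, attribute_info))
        return True


tracer = SelectiveTracer({("system_call", "write"), ("interrupt", "disk")})
tracer.on_event("system_call", "read", "process A")
tracer.on_event("system_call", "write", "process A")
tracer.on_event("interrupt", "timer", "process B")
tracer.on_event("interrupt", "disk", "process B")
```

Only the two selected events reach the buffer, which keeps the trace small at the cost of missing state changes outside the chosen factors.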

[0095] Embodiment 15. An embodiment of the present invention is described below. In this embodiment, the trace buffer is placed in a hardware control circuit, namely the interface circuit between the CPU and main memory, an external bus control circuit, a local bus control circuit, or an input/output control circuit. A trace buffer area is also set up in a predetermined area of main memory.

【００９６】FIG. 14 is a conceptual diagram showing the configuration of the system control management method using an error trace according to this embodiment. In the figure, 51 is a CPU, 52 is an interface between the CPU 51 and a main memory 53, 54 is an external bus control circuit, 55 is a local bus control circuit, and 56 is an input/output control circuit.

【００９７】FIG. 15 is a conceptual diagram of the case in which a trace buffer (area) is provided in the interface 52 between the main memory 53 and the CPU 51 (hereinafter referred to as the PMI, Processor Memory Interface). In FIG. 15, parts identical or equivalent to those in FIG. 14 are given the same reference numerals and their description is omitted. Reference numeral 60 denotes a cache memory, 61 a write buffer, and 62 a trace buffer (area).

【００９８】When the CPU 51 writes data to the outside via the PMI 52, it uses the cache memory 60 and the write buffer 61 to speed up the write processing. That is, when the target of an external write by the CPU 51 lies in a cached area, the data is first stored in the cache memory 60 and then queued in the write buffer 61. The CPU 51 transfers control to another operation as soon as the write to the cache memory 60 is finished, and the actual write operation is executed afterward, so the CPU 51 is not affected by the processing delay of the write. The contents of the cache memory 60 are written to the main memory 53 and other destinations via the write buffer 61. Assuming that the write buffer 61 and the cache memory 60 of the CPU 51 perform write processing in FIFO order, the write operation issued by the CPU 51 and the actual physical write are not performed synchronously because of the write buffer 61 and the cache memory 60. Processing is faster as a result, but a time lag arises between the occurrence of an error and its detection, which makes the error difficult to detect. This embodiment therefore solves the problem caused by the asynchrony of error trace processing in the computer system during write processing by placing the trace buffer inside a specific hardware control circuit.

【００９９】In this embodiment, when the state of the program changes, the new program state after the change is stored in the trace buffer 62. For example, suppose that the program state changes immediately after the error-causing operation is performed, and that the error is detected after the CPU 51 has transitioned to the new state.

【０１００】That is, (1) the operation that causes the error is performed by the CPU 51, but the actual write operation is held in the write buffer 61. (2) The program then makes a state transition. An operation to write the state transition to the trace buffer 62 is issued, but the actual write of the state change to the trace buffer 62 is accumulated in the write buffer 61. (3) Because the write buffer 61 is a FIFO, the operation that causes the error is executed first and the error is detected. When steps (1), (2), and (3) proceed in this order, the operation that writes the state change to the trace buffer 62 is still in the write buffer 61 when the error is detected in (3). In this case, therefore, the write buffer 61 is temporarily frozen so that the program state-change operation stays fixed inside the write buffer 61, and the attribute information about program state changes stored in the trace buffer 62 is examined. This makes it easy to identify the source of the detected error.
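The sequence (1) to (3) can be mimicked with a small software model. This is a sketch under stated assumptions, not the hardware design: the `WriteBuffer` class, the `("data", ...)` and `("trace", ...)` operation encoding, and the `faulty` flag are hypothetical names introduced only for illustration.

```python
from collections import deque


class WriteBuffer:
    """FIFO write buffer that can be frozen when an error is detected."""

    def __init__(self):
        self.queue = deque()
        self.frozen = False

    def enqueue(self, op):
        self.queue.append(op)

    def drain_one(self, trace_buffer):
        """Perform the oldest queued write; return True if it raised an error."""
        if self.frozen or not self.queue:
            return False
        kind, payload = self.queue.popleft()
        if kind == "trace":
            trace_buffer.append(payload)   # state-change record reaches the trace buffer
            return False
        if payload.get("faulty"):
            self.frozen = True             # freeze: later queued writes stay in place
            return True
        return False


trace_buffer = ["U"]                 # state recorded before the scenario starts
wb = WriteBuffer()
wb.enqueue(("data", {"addr": 0x1000, "faulty": True}))   # (1) error-causing write
wb.enqueue(("trace", "S"))                               # (2) state change U -> S
error_detected = wb.drain_one(trace_buffer)              # (3) FIFO: faulty write runs first
wb.drain_one(trace_buffer)                               # no effect while frozen
```

After the freeze, the trace buffer still holds the user state `"U"` that was in effect when the faulty write was issued, while the later `"S"` record remains queued.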

【０１０１】FIG. 16 is an explanatory diagram showing the state of the write buffer 61 before an error occurs. Reference numeral 25 denotes error write operation information for the write operation that causes the error, and 26 denotes state change write operation information for the operation that writes a program state change to the trace buffer 62. In this case, when the write operation described by the error write operation information 25 is performed and an error occurs, the write buffer 61 is temporarily frozen so that writes to the trace buffer 62 for subsequent program state changes are not executed. The trace buffer 62 therefore retains the state that existed before the error occurred, and the error source can be identified easily from the attribute information remaining in the trace buffer 62 at the time of the error.

【０１０２】Next, the method of identifying the error source in this case is described concretely with reference to FIGS. 17, 18, and 23. In FIG. 23, S indicates that a system program is being executed and U indicates that a user program is being executed. As explained in the description of the related art, the error source could not be identified correctly for the error occurrence and error detection pairs (6, 11), (8, 13), and (9, 14). In this embodiment, by contrast, the process at the time of error detection differs for the pair (9, 14), so the error source can be identified by writing the process identifier into the attribute information of the trace buffer 62.

【０１０３】FIG. 17 is an explanatory diagram showing a correspondence table for identifying the error source in the remaining cases. In FIG. 23, at error detection times 11 and 13, the requests to update the program attribute information for error occurrences 6 and 8 are still in the write buffer 61. By examining the trace buffer 62, it can therefore be determined that the program state for the error detected at error detection time 11 was the user state, and that the program state for the error detected at error detection time 13 was the system state.

【０１０４】For error detections 10 and 12, the attribute information in the trace buffer 62 matches the CPU state at the time of error detection, so the CPU state of the error source can be learned from the attribute information in the trace buffer 62. Accordingly, as shown in FIG. 17, the error source for error detection times 10 and 11 is in the user program, and the error source for error detection times 12 and 13 is in the system program. In other words, the error source is identified correctly even for the pairs (6, 11) and (8, 13), which could not be identified correctly in the conventional approach.

【０１０５】The same effect is obtained when the program state before the change, rather than after it, is stored in the trace buffer 62 at each program state change. FIG. 18 is the correspondence table for this case. As with the table in FIG. 17, it shows that the error source for error detection times 10 and 11 is in the user program state that follows a change from the system program state, and that the error source for error detection times 12 and 13 is in the system program state that follows a change from the user program state. In other words, the error source is identified correctly even for the pairs (6, 11) and (8, 13), which could not be identified correctly in the conventional approach.
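The two conventions (storing the state after the change, as in FIG. 17, or before the change, as in FIG. 18) can be contrasted in a short sketch. The contents of the figures are not reproduced in the text, so this assumes a two-state model with only user (U) and system (S) states; the function name `identify_source` and the `stored` parameter are hypothetical.

```python
OTHER = {"U": "S", "S": "U"}   # two-state model: user (U) and system (S)


def identify_source(last_traced_state, stored="after"):
    """Return the program state in which the error was caused.

    stored="after":  the trace buffer holds the state entered by the last transition.
    stored="before": the trace buffer holds the state left by the last transition.
    """
    if stored == "after":
        return last_traced_state
    if stored == "before":
        return OTHER[last_traced_state]
    raise ValueError("stored must be 'after' or 'before'")


# The same error (caused in the user state) under both conventions.
source_fig17 = identify_source("U", stored="after")
source_fig18 = identify_source("S", stored="before")
```

With more than two program states, the "before" convention would need additional information, such as the transition itself, to recover the state at the time of the error.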

【０１０６】Embodiment 16. An embodiment of the present invention is described below. This embodiment describes the processing executed once the error source has been identified by any of Embodiments 1 to 15 described above. Once the error source is identified, it is possible to remove only the program that caused the error from the system and continue all other processing, or to stop the entire system. For example, if a user process is the source of the error, all resources used by that program are released to the free resource pool. If a system program is the source, the entire system can be stopped, or all or some of the currently running programs can be moved to another system.

【０１０７】It is also possible to stop the program that caused the error while the other programs continue to run, or to defer the process that caused the error and resume it once the cause of the error has been recovered. For example, when the cause is an I/O error and the program cannot continue without that input/output device, the program is stopped. When an alternative input/output device will become available later, execution of the program that caused the error is suspended and deferred until the alternative device can be used.

【０１０８】In addition, returning the resources that were used by the program that caused the error improves the utilization rate of the available resources.

【０１０９】Furthermore, recurrence of the error can be prevented by not returning the resource that caused the error to the free resource management pool. For example, when the cause of the error is a memory failure, the parts of the main memory used by the program that have not failed are returned to the system as free memory, while the memory that produced the error is marked unusable, is not returned to the free resource management pool, and is prohibited from use until the cause of the error is recovered. When the resource that caused the error can be replaced with a substitute resource, system reliability can be improved by marking the faulty resource unusable, not returning it to the free resource management pool, and prohibiting its use until the cause of the error is recovered. For example, when the error is caused by a memory device failure and the operation at the time of the error was a write to that memory device, recovery is possible by preparing another memory as a substitute and performing the write using that memory.

【０１１０】FIG. 19, FIG. 20, FIG. 21, and FIG. 22 are flowcharts, each showing the operation performed after the error source has been identified in an embodiment of the present invention.

【０１１１】In the operation shown in the flowchart of FIG. 19, the program of the same type as the program identified as the error source (the type being, for example, system program or user program) is specified (step ST11). It is then determined whether the specified program type is a system program (step ST12). If it is a system program, the entire system is stopped (step ST13). If it is determined in step ST12 that the program is not a system program, that is, it is a user program, execution of the user program corresponding to the program specified in step ST11 is stopped (step ST14). In this state, the user program corresponding to the specified program is stopped, but the other programs, which are not error sources, continue to run without stopping, and an operation to recover from the cause of the error is performed.

【０１１２】While the user program is stopped, it is next determined whether the cause of the error has been recovered (step ST15). When it is determined that the cause has been recovered, execution of the stopped user program is resumed and continues (step ST16).
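The branching of steps ST11 to ST16 can be sketched as follows. This is an illustration only: the function name, the string labels, the `cause_recovered` callback, and the bounded polling with `max_checks` are assumptions, since the text does not say how the recovery check in ST15 is repeated.

```python
def handle_identified_source(program_type, cause_recovered, max_checks=3):
    """Follow steps ST11 to ST16 and return the list of actions taken."""
    actions = [f"ST11: specified a {program_type} program"]
    if program_type == "system":                        # ST12
        actions.append("ST13: stop the entire system")
        return actions
    actions.append("ST14: stop the user program; other programs continue")
    for _ in range(max_checks):                         # ST15, polled a bounded number of times
        if cause_recovered():
            actions.append("ST16: resume the user program")
            break
    return actions


checks = iter([False, True])                 # recovery succeeds on the second check
user_actions = handle_identified_source("user", lambda: next(checks))
system_actions = handle_identified_source("system", lambda: True)
```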

【０１１３】In the operation shown in the flowchart of FIG. 20, the program of the same type as the program identified as the error source (for example, system program or user program) is specified (step ST21), and the resources used by the specified program are then determined (step ST22). These resources include the memory areas, the various management tables, and the input/output devices used by the program identified as the source. All or some of the resources determined in step ST22 are then returned to the resource management system that manages them (step ST23).

【０１１４】The flowchart shown in FIG. 21 omits the operation of step ST23 in the flowchart of FIG. 20, in which resources are returned to the resource management system that manages them (step ST33). This prevents the resource that caused the error from being returned to the resource management system, reused, and causing the error to occur again.
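A minimal sketch of the resource handling in FIGS. 20 and 21, combined with the memory-failure example of paragraph 0109: healthy resources go back to the free pool, and faulty ones are withheld. The function name `release_resources`, the set-based pools, and the resource names are hypothetical.

```python
def release_resources(used, faulty, free_pool, quarantined):
    """Return healthy resources to the free pool and withhold faulty ones."""
    for resource in used:
        if resource in faulty:
            quarantined.add(resource)   # not returned, so it cannot be reused
        else:
            free_pool.add(resource)     # returned to the resource manager


free_pool, quarantined = set(), set()
release_resources(
    used=["page_10", "page_11", "table_3"],
    faulty={"page_11"},
    free_pool=free_pool,
    quarantined=quarantined,
)
```

A quarantined resource would be moved back to the free pool only after the cause of the error has been recovered.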

【０１１５】In the operation shown in the flowchart of FIG. 22, error processing starts once the program that is the error source has been identified, and execution of the identified program is stopped (step ST41). The resources used by the program identified as the error source are then determined (step ST42), and it is determined whether a substitute resource exists for the faulty resource (step ST43). When it is determined that no substitute resource exists and that subsequent processing cannot continue without one, interruption processing is performed to suspend the processing (step ST44). When it is determined in step ST43 that a substitute resource exists, the resource that caused the error is replaced with the substitute resource (step ST45). The necessary setting data is then set again so that the state of the newly installed substitute resource matches the state of the resource before the replacement (step ST46), and execution of the program suspended in step ST41 is started and continues (step ST47).
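The substitution flow of steps ST41 to ST47 might look like the following sketch. The dictionary layout of `program`, the resource names, and the return strings are assumptions made for illustration.

```python
def recover_with_substitute(program, faulty_name, spares):
    """Follow steps ST41 to ST47; return "resumed" or "interrupted"."""
    program["running"] = False                       # ST41: stop the identified program
    resources = program["resources"]                 # ST42: resources it was using
    if not spares:                                   # ST43: is there a substitute?
        return "interrupted"                         # ST44: suspend processing
    spare_name = spares.pop(0)                       # ST45: swap in the substitute
    old_settings = resources.pop(faulty_name)
    resources[spare_name] = dict(old_settings)       # ST46: restore the same settings
    program["running"] = True                        # ST47: resume execution
    return "resumed"


program = {"running": True, "resources": {"mem0": {"mode": "rw"}}}
result = recover_with_substitute(program, "mem0", spares=["mem1"])

stuck = {"running": True, "resources": {"dev0": {"baud": 9600}}}
result_without_spare = recover_with_substitute(stuck, "dev0", spares=[])
```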

【０１１６】[0116]

[Effects of the Invention] As described above, according to the invention of claim 1, the attribute information of a process stored in the trace buffer, from which the resources used by that process can be determined, is searched, and the resources used by the process are determined from this attribute information so as to identify the program that uses the resource that caused the error. This makes identification of the error source accurate and easy.

【０１１７】According to the invention of claim 2, when an error occurs while a storage device such as a write buffer or a cache is in use, the data in the storage device is frozen, and the attribute information stored in the trace buffer, such as the program state before or after a change in the state of the running program, is searched to identify the error source. This makes identification of the error source accurate and easy.

【０１１８】According to the invention of claim 3, a pointer to a program management table that is created and managed by the operating system according to program usage or is defined in advance, a copy of that management table, a program identifier, or the like is stored in the trace buffer as attribute information, and the error source is identified on the basis of the stored attribute information. This makes identification of the error source accurate and easy.

【０１１９】According to the invention of claim 4, the time at which a program state change occurred, or the time elapsed since the program state changed, is stored in the trace buffer as attribute information, and the error source is identified on the basis of the stored attribute information. This makes identification of the error source accurate and easy.

【０１２０】According to the invention of claim 5, the time at which the error actually occurred is obtained from the error detection time and the error detection latency, that is, the time required for the error to be detected. The program that was running at the obtained error occurrence time is then found by searching attribute information stored in the trace buffer, such as the time at which a program state change occurred or the time elapsed since the program state changed, and the error source is identified. This makes identification of the error source accurate and easy.

【０１２１】According to the invention of claim 6, a list of the error detection latencies required to detect each type of error is prepared in advance, and the latency for the type of error that occurred is looked up in the list. The actual error occurrence time for that type of error is obtained from the error detection time and the latency from the list. The program that was running at the obtained error occurrence time is then found from the information stored in the trace buffer as attribute information, namely the time at which a program state change occurred or the time elapsed since the program state changed, and the error source is identified. This makes identification of the error source accurate and easy even when the program state at the time of error occurrence differs from the program state at the time of error detection.
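The time-based identification of claims 5 and 6 can be sketched as a table lookup followed by a search of the trace. The latency values, the error type names, the `(start_time, program)` trace layout, and the function names are all hypothetical and serve only to illustrate the arithmetic "occurrence time = detection time minus detection latency".

```python
from bisect import bisect_right

# Hypothetical list of detection latencies per error type, in clock ticks.
DETECTION_LATENCY = {"bus_parity": 40, "io_timeout": 500}


def program_running_at(trace, t):
    """trace: list of (start_time, program) pairs sorted by start_time."""
    times = [start for start, _ in trace]
    i = bisect_right(times, t) - 1          # last state change at or before t
    return trace[i][1] if i >= 0 else None


def find_error_source(trace, error_type, detected_at):
    """Estimate the occurrence time, then look up the program running then."""
    occurred_at = detected_at - DETECTION_LATENCY[error_type]
    return program_running_at(trace, occurred_at)


trace = [(0, "proc_A"), (100, "kernel"), (130, "proc_B")]
source = find_error_source(trace, "bus_parity", detected_at=150)
```

Here the error is detected at tick 150 while `proc_B` is running, but subtracting the 40-tick latency places its occurrence at tick 110, when `kernel` was running.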

【０１２２】According to the invention of claim 7, attribute information is stored in the trace buffer when the execution state of the CPU transitions from a user program to a system program or from a system program to a user program, and the error source is identified on the basis of the attribute information about program state changes stored in the trace buffer. This makes identification of the error source accurate and easy even when the program state at the time of error occurrence differs from the program state at the time of error detection.

【０１２３】According to the invention of claim 8, attribute information is stored in the trace buffer when, as a program state change, the execution state of the CPU transitions from a non-privileged level to a privileged level or from a privileged level to a non-privileged level, and the error source is identified on the basis of the attribute information stored in the trace buffer. This makes identification of the error source accurate and easy.

【０１２４】According to the invention of claim 9, attribute information is stored in the trace buffer when interrupt processing starts or ends, when it both starts and ends, or at an arbitrary time during interrupt processing, and the error source is identified on the basis of the attribute information stored in the trace buffer. This makes identification of the error source accurate and easy.

【０１２５】According to the invention of claim 10, the completion of an input/output operation in an input/output device, an error during an input/output operation, a hardware error, or an exception is treated as an interrupt factor. When one of these interrupt factors occurs, attribute information is stored in the trace buffer, and the error source is identified on the basis of the attribute information stored in the trace buffer. This makes identification of the error source accurate and easy even when the program state at the time of error occurrence differs from the program state at the time of error detection.

【０１２６】According to the invention of claim 11, attribute information is stored using an interface provided to application programs, with any one or a combination of the program counter value, CPU status value, time element, trace event number, contents of various registers, program name, process name, message, address space identifier, process identifier, program identifier, process attribute information, program attribute information, message queue identifier, and message queue attribute information as parameters of the interface, and the error source is identified on the basis of the stored trace information. Data effective for identifying the error source can therefore be collected efficiently, which makes identification of the error source accurate and easy.

【０１２７】According to the invention of claim 12, attribute information indicating program state changes is stored in a trace buffer provided inside a hardware control circuit, and the error source is identified on the basis of the stored attribute information. This makes identification of the error source accurate and easy even when the program state at the time of error occurrence differs from the program state at the time of error detection.

【０１２８】According to the invention of claim 13, the interface circuit between the CPU and the main memory, the external bus control circuit, the local bus control circuit, the input/output control circuit, and the output memory serve as hardware control circuits, attribute information is stored in a trace buffer provided in this hardware, and the error source is identified on the basis of the stored attribute information. This makes identification of the error source accurate and easy.

【０１２９】According to the invention of claim 14, depending on the type of the program identified as the error source, execution of the identified program is stopped, the system is stopped, or execution of the program is deferred, and execution of the program or operation of the system is resumed and continued once the cause of the error has been recovered. This yields a system control management method using an error trace that improves system reliability.

【０１３０】According to the invention of claim 15, when the program identified as the error source is a user program or a user process, the system is controlled and managed by returning the resources used by that user program or user process to the system management means that manages the system. This yields a system control management method using an error trace that improves the utilization rate of available resources and improves system reliability.

【０１３１】According to the invention of claim 16, the system is controlled and managed by returning resources used by the program identified as the error source, such as memory areas, various management tables, and input/output devices, to their respective free resource management pools. This yields a system control management method using an error trace that improves the utilization rate of available resources and improves system reliability.

【０１３２】According to the invention of claim 17, the system is controlled and managed by not returning to its free resource management pool the memory page, or any other resource, that produced the error in the program identified as the error source. The error therefore does not recur repeatedly, which yields a system control management method using an error trace that improves system reliability.

[0133] According to the invention of claim 18, recovery from the error is performed by replacing the resource in which the error occurred, in the program identified as the error source, with a substitute resource, and processing is resumed and continued once the cause of the error has been remedied. This provides a system control management method using error tracing that improves system reliability.
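As a rough illustration of the substitute-resource recovery described above, the following sketch swaps a failed resource for a spare and then resumes the task. The task layout, the state names, and the function `recover_with_substitute` are assumptions made for this example and do not appear in the patent.

```python
def recover_with_substitute(task, failed, spares):
    """Swap a failed resource for a spare and mark the task runnable again.

    task   : dict with 'resources' (list) and 'state' (str); hypothetical layout
    failed : the resource in which the error occurred
    spares : list of substitute resources of the same kind
    Returns True if the task was resumed, False if no spare was available.
    """
    task["state"] = "suspended"          # hold the task while recovering
    if not spares:
        return False                     # cause not removed; stay suspended
    substitute = spares.pop()
    index = task["resources"].index(failed)
    task["resources"][index] = substitute
    task["state"] = "running"            # cause removed: resume and continue
    return True


task = {"resources": ["disk0", "page42"], "state": "running"}
resumed = recover_with_substitute(task, failed="page42", spares=["page99"])
```

If no spare exists, the task in this sketch simply remains suspended, which corresponds to waiting until the cause of the error has been remedied.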

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of a system control management method using error tracing according to an embodiment of the present invention.

FIG. 2 is a flowchart showing the operation of the system control management method using error tracing according to an embodiment of the present invention.

FIG. 3 is an explanatory diagram showing program attribute information stored in the trace buffer in order of occurrence time in the system control management method using error tracing according to an embodiment of the present invention.

FIG. 4 is a flowchart showing the operation of the error source investigation and identification process in the system control management method using error tracing according to an embodiment of the present invention.

FIG. 5 is an explanatory diagram showing program attribute information stored in the trace buffer in the system control management method using error tracing according to an embodiment of the present invention.

FIG. 6 is an explanatory diagram showing program attribute information stored in the trace buffer in the system control management method using error tracing according to an embodiment of the present invention.

FIG. 7 is a flowchart showing the operation of the system control management method using error tracing according to an embodiment of the present invention.

FIG. 8 is an explanatory diagram showing the configuration of a ring-shaped trace buffer in the system control management method using error tracing according to an embodiment of the present invention.

FIG. 9 is an explanatory diagram showing attribute information stored in the trace buffer in the system control management method using error tracing according to an embodiment of the present invention.

FIG. 10 is an explanatory diagram for explaining the features of the system control management method using error tracing according to an embodiment of the present invention.

FIG. 11 is an explanatory diagram showing a list of error detection times in the system control management method using error tracing according to an embodiment of the present invention.

FIG. 12 is an explanatory diagram showing a list of resources in use in the system control management method using error tracing according to an embodiment of the present invention.

FIG. 13 is a conceptual diagram showing the configuration of the system control management method using error tracing according to an embodiment of the present invention.

FIG. 14 is a conceptual diagram showing the configuration of the system control management method using error tracing according to an embodiment of the present invention.

FIG. 15 is a conceptual diagram of the configuration of the system control management method using error tracing according to an embodiment of the present invention, in the case where a trace buffer is provided in the PMI 52.

FIG. 16 is an explanatory diagram showing the state of the write buffer before an error occurs in the configuration of the system control management method using error tracing according to an embodiment of the present invention.

FIG. 17 is an explanatory diagram showing a correspondence table for determining the error source from the program states stored in the trace buffer in the configuration of the system control management method using error tracing according to an embodiment of the present invention.

FIG. 18 is an explanatory diagram showing a correspondence table for determining the error source from the program states stored in the trace buffer in the configuration of the system control management method using error tracing according to an embodiment of the present invention.

FIG. 19 is a flowchart showing the operation after the error source has been identified in an embodiment of the present invention.

FIG. 20 is a flowchart showing the operation after the error source has been identified in an embodiment of the present invention.

FIG. 21 is a flowchart showing the operation after the error source has been identified in an embodiment of the present invention.

FIG. 22 is a flowchart showing the operation after the error source has been identified in an embodiment of the present invention.

FIG. 23 is an explanatory diagram schematically showing, over time, how errors occur in a conventional computer system.

[Explanation of symbols]

24, 62: trace buffer; 44: software interface; 51: CPU; 53: main memory; 54: external bus control circuit; 55: local bus control circuit; 56: input/output control circuit; 60: cache memory; 61: write buffer.

Continuation of the front page
(56) References: JP-A-6-4364 (JP, A); JP-A-3-48946 (JP, A); JP-A-4-307641 (JP, A); JP-A-4-344542 (JP, A); JP-A-2-271435 (JP, A); JP-A-63-91749 (JP, A); JP-A-4-286035 (JP, A); JP-A-3-184130 (JP, A); JP-A-1-213727 (JP, A)
(58) Fields searched (Int. Cl.⁶, DB name): G06F 11/28 - 11/34

Claims

(57) [Claims]

1. A system control management method using error tracing, in which trace information at the time an error occurs is stored in a trace buffer, the error source is identified from the stored trace information, and the system is controlled and managed, characterized in that attribute information for determining a process and the resources that the process is using is stored in the trace buffer as trace information, the attribute information is searched, the resources used by the process are determined from the attribute information, and the program using those resources is identified.
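The idea of storing process and resource attributes in a trace buffer and searching them can be sketched as follows. This is only an illustration under assumptions: the `TraceEntry` fields, the `TraceBuffer` class, and the resource names are hypothetical, and a bounded `collections.deque` stands in for whatever buffer organization an actual system would use.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEntry:
    time: int              # when the program state change was recorded
    process_id: int        # identifies the process
    program: str           # program name (user or system)
    resources: frozenset   # resources the process was using


class TraceBuffer:
    def __init__(self, capacity):
        # deque(maxlen=...) discards the oldest entry when full,
        # which gives ring-buffer behaviour.
        self._entries = deque(maxlen=capacity)

    def record(self, entry):
        self._entries.append(entry)

    def users_of(self, resource):
        """Return programs whose recorded attributes include the resource,
        most recent first."""
        return [e.program for e in reversed(self._entries)
                if resource in e.resources]


buf = TraceBuffer(capacity=3)
buf.record(TraceEntry(10, 1, "editor", frozenset({"page:7"})))
buf.record(TraceEntry(20, 2, "kernel", frozenset({"disk0"})))
buf.record(TraceEntry(30, 3, "backup", frozenset({"page:7", "disk0"})))
buf.record(TraceEntry(40, 1, "editor", frozenset({"page:9"})))  # evicts t=10
suspects = buf.users_of("page:7")
```

Given the resource involved in an error, `users_of` yields the candidate programs; how a single error source is then chosen among them is left open here.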

2. The system control management method using error tracing according to claim 1, wherein, when a storage device such as a write buffer or a cache is used and an error occurs, the data in the storage device is held fixed, and the error source is identified by searching the attribute information stored in the trace buffer, such as the program state before the program state change or the program state after the change.

3. The system control management method using error tracing according to claim 1, wherein a pointer to a management table of a program, the table being created and managed by the operating system in accordance with use of the program or defined in advance, or a copy of the management table, or an identifier of the program, or the like, is stored in the trace buffer as attribute information.

4. The system control management method using error tracing according to claim 1, wherein the time at which a program state change occurred, or the elapsed time since the program state changed, or the like, is stored in the trace buffer as attribute information.

5. The system control management method using error tracing according to claim 1, wherein the error occurrence time is obtained from the time at which the error was detected and the error detection time required to detect the error that occurred, the program that was executing at the obtained error occurrence time is searched for using attribute information stored in the trace buffer, such as the time at which the program state change occurred or the elapsed time since the program state changed, and the error source is thereby identified.

6. The system control management method using error tracing according to claim 1, wherein the error detection times required to detect errors are listed in advance according to error type, the error detection time corresponding to the type of the error that occurred is retrieved from the list, the actual error occurrence time for that error type is obtained from the time at which the error was detected and the retrieved error detection time, the program that was executing at that error occurrence time is searched for using attribute information stored in the trace buffer, such as the time at which the program state change occurred or the elapsed time since the program state changed, and the error source is thereby identified.
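The time calculation in claims 5 and 6 amounts to subtracting a per-type detection latency from the detection time and then looking up which program was running at the resulting instant. A minimal sketch follows; the latency values, error type names, and the function `find_error_source` are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right

# Hypothetical detection latencies per error type, in clock ticks.
DETECTION_LATENCY = {"write_buffer_parity": 40, "bus_timeout": 500}


def find_error_source(trace, error_type, detected_at):
    """Estimate when the error occurred and which program was running then.

    trace : list of (state_change_time, program) pairs sorted by time;
            each program runs from its time until the next entry.
    Returns (occurred_at, program), with program None if the occurrence
    time precedes the oldest trace entry.
    """
    occurred_at = detected_at - DETECTION_LATENCY[error_type]
    times = [t for t, _ in trace]
    i = bisect_right(times, occurred_at) - 1   # last change at or before it
    if i < 0:
        return occurred_at, None
    return occurred_at, trace[i][1]


trace = [(100, "user_A"), (180, "kernel"), (220, "user_B")]
occurred_at, source = find_error_source(trace, "write_buffer_parity", 240)
```

In this example the error is detected while `user_B` is running, but the estimated occurrence time falls in the interval when `kernel` was running, so `kernel` is reported as the source.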

7. The system control management method using error tracing according to claim 1, wherein the storing of attribute information in the trace buffer is performed on the basis of a transition of the CPU execution state, as a program state change, from a user program to a system program or from a system program to a user program.

8. The system control management method using error tracing according to claim 7, wherein the transition of the CPU execution state is a transition of the program's CPU execution state from a non-privileged level to a privileged level, or from a privileged level to a non-privileged level.
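Recording a trace entry only when the execution state changes, as in claims 7 and 8, might look like the following sketch. The `Cpu` class is a toy model introduced for this example, with a single boolean standing in for the privilege level; it does not describe any real processor interface.

```python
class Cpu:
    """Toy CPU model; names and levels are hypothetical."""

    def __init__(self):
        self.privileged = False   # start at the non-privileged (user) level
        self.trace = []           # recorded (time, direction, program) entries

    def set_level(self, privileged, time, program):
        # Record an entry only when the privilege level actually changes.
        if privileged != self.privileged:
            direction = "user->system" if privileged else "system->user"
            self.trace.append((time, direction, program))
            self.privileged = privileged


cpu = Cpu()
cpu.set_level(True, 5, "syscall_handler")
cpu.set_level(True, 6, "syscall_handler")   # no transition, not recorded
cpu.set_level(False, 9, "user_A")
```

Recording on transitions only keeps the buffer small while still marking every boundary between user and system execution.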

9. The system control management method using error tracing according to claim 7, wherein the transition of the CPU execution state is the start or the end of interrupt processing, or the start and the end of interrupt processing, or an arbitrary point during interrupt processing.

10. The system control management method using error tracing according to claim 9, wherein the interrupt processing has, as its interrupt cause, the completion of an input/output operation in an input/output device, the occurrence of an error during an input/output operation, or the occurrence of a hardware error or an exception.

11. The system control management method using error tracing according to claim 1, wherein, when attribute information is stored as trace information by using a programming interface provided to application programs, any one of a program counter value, a status value, a time element, a trace event number, the contents of various registers, a program name, a process name, a message, an address space identifier, a process identifier, a program identifier, access attribute information, program attribute information, a message queue identifier, and message queue attribute information, or a combination of these, is used as a parameter of the interface.

12. The system control management method using error tracing according to claim 1, wherein attribute information indicating a program state change is stored in a trace buffer provided in a hardware control circuit.

13. The system control management method using error tracing according to claim 12, wherein an interface circuit between the CPU and the main memory, an external bus control circuit, a local bus control circuit, an input/output control circuit, and the main memory are used as the hardware control circuits.

14. The system control management method using error tracing according to claim 1, wherein the system is controlled and managed by, in accordance with the type of the program identified as the error source, terminating execution of the identified program, or stopping the system, or suspending execution of the program and, once the cause of the error has been remedied, resuming and continuing execution of the program or operation of the system.
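The selection among the three responses named in claim 14 can be sketched as a small decision function. The mapping used here (suspend when the cause is recoverable, terminate a user program, stop the system for a system program) is an assumption made for the example; the claim itself only states that the response depends on the type of the identified program.

```python
from enum import Enum


class Action(Enum):
    TERMINATE_PROGRAM = "terminate program"
    STOP_SYSTEM = "stop system"
    SUSPEND_UNTIL_RECOVERED = "suspend until recovered"


def choose_action(program_kind, recoverable):
    """Pick a response for the identified error source.

    program_kind : "user" or "system" (hypothetical labels)
    recoverable  : True if the cause of the error can be remedied
    The policy below is illustrative, not the patent's rule.
    """
    if recoverable:
        return Action.SUSPEND_UNTIL_RECOVERED
    if program_kind == "user":
        return Action.TERMINATE_PROGRAM
    return Action.STOP_SYSTEM


decision = choose_action("user", recoverable=False)
```

Keeping the policy in one function makes it straightforward to substitute a different mapping for a particular system.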

15. The system control management method using error tracing according to claim 14, wherein, when the program identified as the error source is a user program or a user process, the system is controlled and managed by returning the resources being used by the user program or user process to the system management means that manages the system.

16. The system control management method using error tracing according to claim 15, wherein the system is controlled and managed by returning resources used by the program identified as the error source, such as memory areas, various management tables, or input/output devices, to their respective free resource management pools.

17. The system control management method using error tracing according to claim 16, wherein the system is controlled and managed by not returning to the respective free resource management pools the memory page that caused the error, or any other resource that caused the error, in the program identified as the error source.

18. The system control management method using error tracing according to claim 14, wherein the system is controlled and managed by recovering from the error by replacing the resource in which the error occurred, in the program identified as the error source, with a substitute resource, and by resuming and continuing processing once the cause of the error has been remedied.