JP2000112788A

JP2000112788A - Supporting method for fault recovery, device therefor and machine readable storage medium recording program

Info

Publication number: JP2000112788A
Application number: JP10292918A
Authority: JP
Inventors: Yuichi Kondo; 裕一近藤
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1998-09-30
Filing date: 1998-09-30
Publication date: 2000-04-21
Anticipated expiration: 2018-09-30
Also published as: JP3141856B2

Abstract

PROBLEM TO BE SOLVED: To present a user with an appropriate countermeasure that addresses the root cause of a fault when a fault occurs in a computer. SOLUTION: Each time a program 1 executes a processing unit (for example, a function), it stores the identifier allocated to that processing unit in a state monitoring table 2 and stores resource information indicating the resources used by the processing unit in a resource database 3. When a fault detection part 5 detects the occurrence of a fault, a fault cause examination part 6 finds the processing unit that is the root cause of the fault on the basis of the contents of the state monitoring table 2, the resource database 3 and a document database 4, and acquires the countermeasure stored in the document database 4 for that processing unit. A document edit display mechanism part 8 edits the countermeasure together with a memory image sampled by a resource sample mechanism part 7 and displays them on a display device 9.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a technique for presenting an appropriate countermeasure to a user when a failure occurs in a computer, and more particularly to a technique for presenting an appropriate countermeasure to a user when a failure caused by processing performed by a program occurs.

[0002]

[Prior Art] When a failure occurs during the operation of a program such as an operating system or an application program, the user has conventionally had to compare the failure message output by the system against documents such as manuals in order to identify the type and cause of the failure and to find a countermeasure. Because this is done manually, it takes a long time to arrive at a countermeasure and places a burden on the user.

[0003] To solve these problems, a technique has been proposed in which a table storing failure messages in association with countermeasures and explanatory text is provided; when a failure message is output, the table is searched using the failure message as a key, and the countermeasure and explanatory text corresponding to the failure message are presented to the user (for example, Japanese Patent Application Laid-Open No. 3-75844). With this conventional technique, a countermeasure is presented automatically when a failure occurs, so a countermeasure can be obtained in a short time without burdening the user.

[0004]

[Problems to Be Solved by the Invention] However, the conventional technique described above has the following problem. It merely presents to the user the countermeasure registered in the table for the failure message output by the system, so an appropriate countermeasure is not always presented.

[0005] For example, if a failure occurs at the moment a program accesses a specific file, the system outputs a failure message indicating that the specific file is invalid. The root cause of the failure in this case, however, is not that the file is invalid but whatever made it invalid. If, for example, another program made the file invalid before the failure message was output, or an operating error by the user made it invalid, then that is the root cause of the failure. When the cause indicated by the failure message differs from the root cause in this way, the conventional technique presents no countermeasure for the root cause, so the countermeasure it does present may not be appropriate.

[0006] An object of the present invention is therefore to make it possible to present a countermeasure for the root cause of a failure.

[0007]

[Means for Solving the Problems] To achieve the above object, the failure recovery assistance method of the present invention provides a document database that stores, in association with the identifier of each processing unit of a program, root information indicating whether that processing unit may become the root cause of a failure, and a countermeasure to be taken when that processing unit is determined to be the root cause of a failure. Each time a processing unit of the program is executed, the identifier of the processing unit is stored in a state monitoring table in execution order, and resource information indicating the resources used in the processing unit is stored in a resource database in association with the identifier of the processing unit stored in the state monitoring table. When a failure is detected, the processing unit that is the root cause of the failure is determined on the basis of the contents of the state monitoring table, the resource database and the document database, and the countermeasure stored in the document database for the identifier of that processing unit is output.

[0008] As a preferred device for carrying out this failure recovery assistance method, the failure recovery assistance device of the present invention comprises: a document database that stores, in association with the identifier of each processing unit of a program, root information indicating whether that processing unit may become the root cause of a failure, and a countermeasure to be taken when that processing unit is determined to be the root cause of a failure; a state monitoring table in which, each time a processing unit of the program is executed, the identifier of that processing unit is stored in execution order; a resource database in which, each time a processing unit of the program is executed, resource information indicating the resources used in that processing unit is stored in association with the identifier stored in the state monitoring table; a failure cause investigation unit that, when a failure is detected, determines the processing unit that is the root cause of the failure on the basis of the contents of the state monitoring table, the resource database and the document database; and a document editing and display mechanism unit that outputs the countermeasure stored in the document database for the identifier of the processing unit determined by the failure cause investigation unit to be the root cause of the failure.

[0009] In this configuration, when a failure is detected, the failure cause investigation unit determines the processing unit that is the root cause of the failure on the basis of the contents of the state monitoring table, the resource database and the document database, and the document editing and display mechanism unit outputs the countermeasure stored in the document database for the identifier of the processing unit so determined.

[0010]

[Embodiments of the Invention] Embodiments of the present invention will now be described in detail with reference to the drawings.

[0011] FIG. 1 is a block diagram showing a configuration example of an embodiment of the present invention, which comprises a program 1 such as an operating system or an application program, a state monitoring table 2, a resource database 3, a document database 4, a failure detection unit 5, a failure cause investigation unit 6, a resource collection mechanism unit 7, a document editing and display mechanism unit 8, and a display unit 9.

[0012] A distinct state identifier is assigned to each processing unit of the program 1. A processing unit here is a unit of processing defined by the program author; for example, a function may be treated as one processing unit, or a function may be divided into several parts, each of which is treated as a processing unit. Each time the program 1 executes a processing unit at run time, it stores the state identifier of that processing unit in the state monitoring table 2 in execution order, and stores resource information indicating the resources used in that processing unit in the resource database 3 in association with the state identifier stored in the state monitoring table 2.

[0013] The document database 4 stores, in association with the identifier of each processing unit of the program, root information indicating whether that processing unit may become the root cause of a failure, and a countermeasure to be taken when that processing unit is determined to be the root cause of a failure.

[0014] The failure cause investigation unit 6 has, among others, a function of determining, when the failure detection unit 5 detects a failure, the processing unit that was the root cause of the failure on the basis of the contents of the state monitoring table 2, the resource database 3 and the document database 4, and a function of retrieving the countermeasure corresponding to the root cause from the document database 4.

[0015] The resource collection mechanism unit 7 collects a memory image of the memory addresses accessed by the processing unit that was the root cause of the failure.

[0016] The document editing and display mechanism unit 8 edits the countermeasure retrieved from the document database 4 by the failure cause investigation unit 6 and the memory image collected by the resource collection mechanism unit 7, and displays them on the display unit 9.

[0017] Next, the operation of this embodiment will be described.

[0018] Each time the program 1 executes a processing unit, it writes the state identifier assigned to that processing unit into the state monitoring table 2 in execution order, and stores information indicating the resources used in that processing unit in the resource database 3 in association with the state identifier stored in the state monitoring table 2.
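The recording step described above can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the decorator `processing_unit`, the in-memory `state_table` and `resource_db`, the identifiers "APP:001" and "APP:002", and the idea of declaring each unit's resources statically are all assumptions made for the example.

```python
import functools

state_table = []   # stand-in for state monitoring table 2 (execution order)
resource_db = {}   # stand-in for resource database 3: table index -> resources


def processing_unit(state_id, resources):
    """Treat the decorated function as one processing unit (hypothetical helper)."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Record the state identifier in execution order ...
            state_table.append(state_id)
            # ... and associate the resources used with that table entry.
            resource_db[len(state_table) - 1] = set(resources)
            return func(*args, **kwargs)
        return wrapper
    return decorate


@processing_unit("APP:001", ["settings.cfg"])
def load_settings():
    return "loaded"


@processing_unit("APP:002", ["settings.cfg", "work.tmp"])
def process_data():
    return "processed"


load_settings()
process_data()
```

After the two calls, `state_table` holds the identifiers in execution order and `resource_db` maps each table position to the resources declared for that unit.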

[0019] When the failure detection unit 5 detects the occurrence of a failure, the failure cause investigation unit 6 searches for the processing unit that was the root cause of the failure on the basis of the contents of the state monitoring table 2, the resource database 3 and the document database 4 (FIG. 2, S1).

[0020] The processing of S1 is described in detail below.

[0021] When the failure detection unit 5 detects a failure, the failure cause investigation unit 6 obtains the resource information from the entry of the resource database 3 associated with the last entry of the state monitoring table 2 (the entry in which a state identifier was stored most recently) and uses it as the key (FIG. 3, S11).

[0022] The failure cause investigation unit 6 then goes back one entry in the state monitoring table 2 and refers to the contents of the entry of the resource database 3 associated with that entry (S12, S14). It then determines whether the resources indicated by the resource information stored in this entry include any resource indicated by the resource information used as the key (a related resource) (S15).

[0023] If no related resource exists (NO in S15), the processing returns to S12. If a related resource exists (YES in S15), the document database 4 is searched using as a key the state identifier stored in the entry of the state monitoring table 2 currently under consideration, and the root information corresponding to that state identifier is referred to. If the root information indicates that the processing unit may become the root cause of a failure, the processing unit corresponding to the state identifier is judged to be a root cause of the failure, and the state identifier is held together with the resource information stored in the entry referred to in S14 (S16).

[0024] Thereafter, if the resource information in the entry of the resource database 3 referred to in S14 contains resource information that is not included in the resource information currently used as the key, the failure cause investigation unit 6 adds that resource information to the key (S17) and then returns to S12. If there is no such resource information, it returns to S12 without adding anything to the key. When the above processing has been performed for all entries of the state monitoring table 2 (YES in S13), the failure cause investigation unit 6 performs the processing of S2.
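The backward search of S11 to S17 can be sketched as below. This is a minimal sketch under stated assumptions, not the patent's actual code: resources are modeled as sets of plain strings, "related" is taken to mean a non-empty set intersection with the key, and the function name `find_root_causes` and the sample identifiers and file names are hypothetical.

```python
def find_root_causes(state_table, resource_db, may_be_root):
    """Walk the state table backwards from the failure point.

    state_table: state identifiers in execution order.
    resource_db: resource_db[i] is the set of resources used by state_table[i].
    may_be_root: state identifier -> bool (the "root information").
    Returns the held (state identifier, resources) pairs in the order found.
    """
    if not state_table:
        return []
    key = set(resource_db[-1])                      # S11: key from the last entry
    held = []
    for i in range(len(state_table) - 2, -1, -1):   # S12/S13: step back one entry
        resources = resource_db[i]                  # S14: related resource entry
        if not (resources & key):                   # S15: no related resource
            continue
        state_id = state_table[i]
        if may_be_root.get(state_id, False):        # S16: hold candidate root cause
            held.append((state_id, set(resources)))
        key |= resources                            # S17: extend the key
    return held


# Hypothetical sample data: U4 was running when the failure was detected.
states = ["U1", "U2", "U3", "U4"]
resources = [{"config.ini"}, {"tmp.dat"}, {"config.ini", "cache.bin"}, {"cache.bin"}]
root_info = {"U1": True, "U2": True, "U3": False}
held = find_root_causes(states, resources, root_info)
```

In this sample, U3 shares `cache.bin` with the failing unit and so pulls `config.ini` into the key, which in turn links back to U1. U2 is skipped even though its root information is true, because it touched no related resource.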

[0025] In S2, the failure cause investigation unit 6 searches the document database 4 on the basis of the state identifier of the processing unit found in S1 to be the root cause of the failure and obtains the countermeasure. It then generates failure information combining this countermeasure with the resource information of that processing unit and passes it to the document editing and display mechanism unit 8. In S2, the failure cause investigation unit 6 also passes the resource information of that processing unit to the resource collection mechanism unit 7.

[0026] When the resource information is passed to it, the resource collection mechanism unit 7 collects a memory image on the basis of that information and passes the collected memory image to the document editing and display mechanism unit 8 (S3).

[0027] The document editing and display mechanism unit 8 edits the failure information from the failure cause investigation unit 6 and the memory image from the resource collection mechanism unit 7 and displays them on the display unit 9 (S4, S5).

[0028] Thus, according to this embodiment, a countermeasure for the root cause of a failure can be presented to the user, so that appropriate action can be taken.

[0029] FIG. 4 is a block diagram of a first example of this embodiment, which comprises a computer 10, a magnetic disk device 20, a display unit 30, and a recording medium 40.

[0030] The computer 10 includes an application program 11, an operating system 12, a state monitoring table 13, a resource database 14, a failure cause investigation unit 15, a resource collection mechanism unit 16, and a document editing and display function unit 17.

[0031] The magnetic disk device 20 includes a document database 21 and a resource storage area 22.

[0032] A distinct state identifier is assigned to each processing unit of the application program 11 and the operating system 12. A processing unit here is a unit of processing defined by the program author; for example, a function may be treated as one processing unit, or a function may be divided into several parts, each of which is treated as a processing unit. Each time the application program 11 or the operating system 12 executes a processing unit at run time, it stores the state identifier of that processing unit in the state monitoring table 13 in execution order, and stores resource information indicating the resources used in that processing unit in the resource database 14 in association with the state identifier stored in the state monitoring table 13.

[0033] As shown in FIG. 5, each entry registered in the state monitoring table 13 holds a state identifier, a first pointer indicating the order in which the state identifiers were stored, and a second pointer pointing to the related resource information.

[0034] As shown in FIG. 5, the resource database 14 stores state identifiers and resource information. The resource information indicates the resources that were used, and includes the file names of files used (configuration files, temporary files, and so on), input/output values exchanged with other processing units or with the user, the addresses of memory used, and the like.
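One possible in-memory layout for the two structures of FIG. 5 is sketched below. The figure itself is not reproduced here, so the field names, the use of list indices in place of pointers, and the helper `store` are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ResourceRecord:
    """One entry of the resource database: state identifier plus resource info."""
    state_id: str
    file_names: List[str] = field(default_factory=list)
    io_values: List[str] = field(default_factory=list)
    memory_addresses: List[int] = field(default_factory=list)


@dataclass
class StateEntry:
    """One entry of the state monitoring table."""
    state_id: str
    order: int            # stands in for the first pointer (storage order)
    resource_index: int   # stands in for the second pointer (related resource info)


resource_db: List[ResourceRecord] = []
state_table: List[StateEntry] = []


def store(state_id, **resource_fields):
    """Append a linked pair of entries for one executed processing unit."""
    resource_db.append(ResourceRecord(state_id, **resource_fields))
    state_table.append(
        StateEntry(state_id, order=len(state_table),
                   resource_index=len(resource_db) - 1)
    )


store("AP02:006", file_names=["DB01"], io_values=["YYY"])
store("AP01:002", file_names=["DB01"], memory_addresses=[0x1000])

last = state_table[-1]
last_resources = resource_db[last.resource_index]
```

Following `resource_index` from the last table entry yields the resource record that the investigation would start from when a failure is detected.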

[0035] The document database 21 stores the state identifiers of the processing units of all programs executed on the computer 10, and stores the following information in association with each state identifier.

[0036]
・Root information: information indicating whether the processing unit may become the root cause of a failure.
・Countermeasure: set only when the root information indicates that the processing unit may become the root cause of a failure. It describes a countermeasure such as "Check that the input value of variable a is correct." or "Check the following information: α, β".
・Component type: information indicating the type of component, such as function z of an application program or function y in the kernel.
・Summary of operation: an outline of the operation, such as "Takes variable a as input and outputs to file x" or "Executes zzz in response to input from the user".
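A document database entry with these four fields could be modeled as in the following sketch. The class name, field names, the identifiers "APP:001" and "KRN:001", and the pairing of the quoted sample strings with particular identifiers are illustrative assumptions, not contents of the actual database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DocumentEntry:
    may_be_root_cause: bool          # root information
    countermeasure: Optional[str]    # set only when may_be_root_cause is True
    component_type: str
    operation_summary: str


# Hypothetical contents keyed by state identifier.
document_db = {
    "APP:001": DocumentEntry(
        True,
        "Check that the input value of variable a is correct.",
        "function z of an application program",
        "Takes variable a as input and outputs to file x",
    ),
    "KRN:001": DocumentEntry(
        False,
        None,
        "function y in the kernel",
        "Executes zzz in response to input from the user",
    ),
}


def countermeasure_for(state_id):
    """Return the countermeasure only for units that may be a root cause."""
    entry = document_db.get(state_id)
    if entry is None or not entry.may_be_root_cause:
        return None
    return entry.countermeasure
```

Returning `None` for both unknown identifiers and non-root units keeps the caller's handling simple; a real system might want to distinguish the two cases.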

[0037] The failure cause investigation unit 15 has, among others, a function of searching, when the failure detection unit 12a detects a failure, for the processing unit that was the root cause of the failure on the basis of the contents of the state monitoring table 13, the resource database 14 and the document database 21, and a function of obtaining the countermeasure stored in the document database 21 for the state identifier of the processing unit found.

[0038] The resource collection mechanism unit 16 collects a memory image of the memory addresses accessed by the processing unit that was the root cause of the failure.

[0039] The document editing and display mechanism unit 17 edits the countermeasure retrieved from the document database 21 by the failure cause investigation unit 15 and the memory image collected by the resource collection mechanism unit 16, and displays them on the display unit 30.

[0040] The recording medium 40 is a disk, a semiconductor memory, or another recording medium, on which a program for causing the computer 10 to function as a failure recovery assistance device is recorded. The program stored on the recording medium 40 is read by the computer 10 and controls the operation of the computer 10, thereby implementing the failure cause investigation unit 15, the resource collection mechanism unit 16 and the document editing and display mechanism unit 17 on the computer 10.

[0041] Next, the operation of this example will be described for the case where two application programs AP01 and AP02 are running. The application program AP01 searches a database DB01 at the user's request and outputs the result to a file F01, and the application program AP02 updates the contents of the database DB01 according to user input.

[0042] Suppose that the failure detection unit 12a detects a failure while the application program AP01 is running. At this point, as shown in FIG. 5, the last entry of the state monitoring table 13 holds the state identifier "AP01:002" of the processing unit of the application program AP01 that was being executed. The resource database 14 holds the resource information "file name; DB01, entry; XXX", which indicates the resources used by that processing unit, in association with the last entry of the state monitoring table 13.

[0043] When the failure detection unit 12a detects the failure, the failure cause investigation unit 15, as shown in the flowchart of FIG. 3, takes as the key the resource information "file name; DB01, entry; XXX" stored in the entry of the resource database 14 associated with the last entry of the state monitoring table 13 (S11).

[0044] The failure cause investigation unit 15 then goes back one entry in the state monitoring table 13 and refers to the entry of the resource database 14 associated with that entry (S12, S14). If, for example, the referenced entry of the resource database 14 holds the resource information "file name; DB01, entry; XXX, input value; YYY", the determination in S15 is YES, so the failure cause investigation unit 15 performs the processing of S16.

[0045] In S16, the failure cause investigation unit 15 searches the document database 21 using as a key the state identifier "AP02:006" stored in the entry of the state monitoring table 13 currently under consideration (the second entry from the end), and obtains the root information corresponding to the state identifier "AP02:006". If the root information indicates that the processing unit may become the root cause of a failure, the failure cause investigation unit 15 holds the state identifier "AP02:006" as the state identifier of a processing unit that is a root cause of the failure, and holds, paired with that state identifier, the resource information "file name; DB01, entry; XXX, input value; YYY" from the entry referred to in S14.

[0046] Next, because the resource information "file name; DB01, entry; XXX, input value; YYY" held in S16 contains the resource information "input value; YYY", which is not included in the current key "file name; DB01, entry; XXX", the failure cause investigation unit 15 adds "input value; YYY" to the current key to form the new key "file name; DB01, entry; XXX, input value; YYY" (S17). It then goes back one more entry in the state monitoring table 13 and turns to the third entry from the end (S12), and refers to the entry of the resource database 14 corresponding to that entry (S14).

[0047] If, for example, the referenced entry of the resource database 14 holds the resource information "output value: YYY, input value 000", the determination in S15 is YES, so the failure cause investigation unit 15 performs the processing of S16.

[0048] The failure cause investigation unit 15 searches the document database 21 using as a key the state identifier "AP01:001" stored in the entry of the state monitoring table 13 currently under consideration (the third entry from the end), and obtains the root information corresponding to the state identifier "AP01:001". If the root information does not indicate that the processing unit may become the root cause of a failure, the failure cause investigation unit 15 proceeds to S17 without holding the state identifier "AP01:001". When this processing has been carried out through to the final remaining entry of the state monitoring table 13, the failure cause investigation unit 15 performs the processing of S2 shown in the flowchart of FIG. 2.
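The walk-through above can be traced with a short script. This is a sketch under assumptions: each resource set lists only names and values (so that the output value YYY of "AP01:001" matches the input value YYY already in the key), the root information values follow the "suppose" clauses of the example, and the variable names are hypothetical.

```python
# Entries in execution order; the last one was running when the failure occurred.
entries = [
    ("AP01:001", {"YYY", "000"}),          # third entry from the end
    ("AP02:006", {"DB01", "XXX", "YYY"}),  # second entry from the end
    ("AP01:002", {"DB01", "XXX"}),         # last entry
]
root_info = {"AP01:001": False, "AP02:006": True, "AP01:002": False}

key = set(entries[-1][1])   # S11: key starts as the last entry's resources
held = []                   # state identifiers held in S16
trace = []                  # (state identifier, S15 result, key after the step)

for state_id, resources in reversed(entries[:-1]):   # S12: go back one entry
    related = bool(resources & key)                  # S15: any related resource?
    if related:
        if root_info[state_id]:                      # S16: may be a root cause
            held.append(state_id)
        key |= resources                             # S17: extend the key
    trace.append((state_id, related, sorted(key)))

for step in trace:
    print(step)
print("held:", held)
```

The trace shows the key growing from DB01 and XXX to include YYY after "AP02:006", and then 000 after "AP01:001", while only "AP02:006" is held as a root cause candidate.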

[0049] In S2, the document database 21 is searched using the state identifier held in S16 as a key, and the countermeasure stored for that state identifier is obtained. Failure information combining the obtained countermeasure with the resource information held in S16 is then generated and passed to the document editing and display function unit 17, and the resource information is passed to the resource collection mechanism unit 16. If S16 held more than one pair of state identifier and resource information, the above processing may be performed for every pair, or only for the pair held last. The processing unit whose state identifier is in the pair held last is highly likely to be the root cause of the failure, so it is acceptable to process only that pair. Alternatively, when several pairs are held, failure information may be created for each pair and a priority may be embedded in each item of failure information, with the failure information for the pair held last given the highest priority and that for the pair held first given the lowest.
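The priority option described above might look like the following sketch. The function `build_failure_info`, the dictionary layout, the use of rank 1 for the highest priority, and the second identifier "AP03:004" with its countermeasure text are assumptions introduced for illustration.

```python
def build_failure_info(held, countermeasures):
    """Create one failure-information record per held pair, ranked by priority.

    held: (state identifier, resources) pairs in the order S16 held them.
    countermeasures: state identifier -> countermeasure text.
    Rank 1 is the highest priority and goes to the pair held last.
    """
    total = len(held)
    infos = []
    for position, (state_id, resources) in enumerate(held):
        infos.append({
            "state_id": state_id,
            "resources": sorted(resources),
            "countermeasure": countermeasures[state_id],
            "rank": total - position,   # last held -> rank 1
        })
    return sorted(infos, key=lambda info: info["rank"])


# Hypothetical input: two pairs held, "AP03:004" held last.
held_pairs = [
    ("AP02:006", {"DB01", "XXX", "YYY"}),
    ("AP03:004", {"YYY"}),
]
texts = {
    "AP02:006": "Check that the input value of variable a is correct.",
    "AP03:004": "Check the following information: α, β",
}
failure_infos = build_failure_info(held_pairs, texts)
```

Sorting by rank puts the record for the pair held last at the front, so a display routine can show the most likely root cause first.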

[0050] When the resource information is passed to it, the resource collection mechanism unit 16 collects the memory image indicated by the memory address contained in that information, stores the image in the resource storage area 22, and passes it to the document editing and display mechanism unit 17 (S3).

[0051] The document editing and display mechanism unit 17 then edits the failure information from the failure cause investigating unit 15 and the memory image from the resource collection mechanism unit 16, and displays them on the display unit 30 (S4, S5).

[0052] FIG. 6 is a block diagram of the second example, which comprises a computer 10a, a magnetic disk device 20a, a display unit 30, and a recording medium 40a.

[0053] The computer 10a of this example differs from the computer 10 shown in FIG. 4 in that it does not contain the state monitoring table 13 or the resource database 14, and in that it has a failure cause investigating unit 15a in place of the failure cause investigating unit 15.

[0054] The failure cause investigating unit 15a performs the processing of S1 and S2 shown in FIG. 2 not only when the failure detection unit 12a detects a failure but also when the computer 10a is restarted. The other components 11, 12, 12a, 16, and 17 of the computer 10a have the same functions as the components bearing the same reference signs in the first example.

[0055] The magnetic disk device 20a of this example differs from the magnetic disk device 20 of the first example shown in FIG. 4 in that it holds the state monitoring table 13 and the resource database 14 in addition to the document database 21 and the resource storage area 22.

[0056] The recording medium 40a is a disk, a semiconductor memory, or another recording medium on which is recorded a program for causing the computer 10a to function as a failure recovery assistance device. The program recorded on the recording medium 40a is read by the computer 10a and, by controlling the operation of the computer 10a, realizes the failure cause investigating unit 15a, the resource collection mechanism unit 16, and the document editing and display mechanism unit 17 on the computer 10a.

[0057] Next, the operation of this example will be described.

[0058] As in the example described above, each time the application program 11 or the operating system 12 executes one of its processing units, it stores a state identifier in the state monitoring table 13 and stores resource information in the resource database 14. Because the state monitoring table 13 and the resource database 14 reside on the non-volatile magnetic disk device 20a, their contents are preserved even if a failure occurs that makes the computer 10a unable to operate.
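One way to picture how table contents can survive a crash is an append-only log that is flushed to disk on every write and re-read after restart. The Python sketch below is only an illustration under assumptions not found in the specification: the class name `PersistentStateLog`, the JSON-lines file format, and the decision to stop at a partially written final line are hypothetical choices.

```python
import json
import os


class PersistentStateLog:
    """Append-only log standing in for tables kept on non-volatile storage."""

    def __init__(self, path):
        self.path = path

    def record(self, state_id, resource_info):
        """Append one executed processing unit and force it to disk."""
        line = json.dumps({"state_id": state_id, "resource_info": resource_info})
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(line + "\n")
            f.flush()
            os.fsync(f.fileno())   # ask the OS to write through its cache

    def load(self):
        """Read back all complete entries, oldest first (used after restart)."""
        if not os.path.exists(self.path):
            return []
        entries = []
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                try:
                    entries.append(json.loads(line))
                except json.JSONDecodeError:
                    break          # torn final line from a crash mid-write
        return entries
```

After a restart, a fresh `PersistentStateLog(path).load()` would return the entries recorded before the failure, which is the input that S1 and S2 need.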

[0059] When the computer 10a is restarted, the failure cause investigating unit 15a performs the processing of S1 and S2 shown in the flowchart of FIG. 2, the resource collection mechanism unit 16 performs the processing of S3, and the document editing and display mechanism unit 17 performs the processing of S4 and S5. As a result, the failure information, including the coping method, and the memory image are displayed on the display unit 30.

[0060] As described above, in this example the state monitoring table 13 and the resource database 14 are placed on the non-volatile magnetic disk device 20a. Consequently, even if a failure occurs that makes the computer 10a unable to operate, a coping method for the fundamental cause of the failure can be presented when the computer 10a is restarted. Although the state monitoring table 13 and the resource database 14 are placed on the magnetic disk device 20a in this example, they may instead be placed on another non-volatile storage device.

[0061]

[Effects of the Invention] As described above, when a failure is detected, the present invention determines the processing unit that is the fundamental cause of the failure based on the contents of the state monitoring table, the resource database, and the document database, and presents to the user the coping method stored in the document database in association with the identifier of that processing unit. It therefore has the effect that an appropriate coping method addressing the fundamental cause of the failure can be presented to the user.

[0062] Furthermore, because the present invention places the state monitoring table and the resource database on a non-volatile storage device, it has the effect that a coping method for the fundamental cause of the failure can be presented at restart even if a failure occurs that makes the computer unable to operate.

[Brief description of the drawings]

FIG. 1 is a block diagram showing a configuration example of an embodiment of the present invention.

FIG. 2 is a flowchart showing a processing example of the embodiment and of the first and second examples.

FIG. 3 is a flowchart showing a detailed processing example of S1 shown in FIG. 2.

FIG. 4 is a block diagram of a first example of the embodiment of the present invention.

FIG. 5 is a diagram showing an example of the contents of the state monitoring table 13 and the resource database 14.

FIG. 6 is a block diagram of a second example of the embodiment of the present invention.

[Explanation of symbols]

1 ... Program
2 ... State monitoring table
3 ... Resource database
4 ... Document database
5 ... Failure detection unit
6 ... Failure cause investigating unit
7 ... Resource collection mechanism unit
8 ... Document editing and display mechanism unit
9 ... Display unit
10, 10a ... Computer
11 ... Application program
12 ... Operating system
12a ... Failure detection unit
13 ... State monitoring table
14 ... Resource database
15, 15a ... Failure cause investigating unit
16 ... Resource collection mechanism unit
17 ... Document editing and display mechanism unit
20, 20a ... Magnetic disk device
21 ... Document database
22 ... Resource storage area
30 ... Display unit
40, 40a ... Recording medium

Claims

[Claims]

1. A failure recovery assistance method, characterized by: providing a document database that stores, in association with the identifier of each processing unit of a program, basic information indicating whether that processing unit may be a fundamental cause of failure and a coping method for the case where that processing unit is a fundamental cause of failure; each time a processing unit of the program is executed, storing the identifier of that processing unit in a state monitoring table in execution order, and storing resource information indicating the resources used by that processing unit in a resource database in association with the identifier stored in the state monitoring table; and, when a failure is detected, determining the processing unit that is the fundamental cause of the failure based on the contents of the state monitoring table, the contents of the resource database, and the contents of the document database, and outputting the coping method stored in the document database in association with the identifier of the determined processing unit.

2. The failure recovery assistance method according to claim 1, characterized in that, when the failure is detected: taking as a starting point the resource information, among the resource information stored in the resource database, of the processing unit that was being executed when the failure was detected, the resource information stored in the resource database is traced in the reverse of the order in which it was stored, so as to search for a processing unit that uses the resource indicated by the starting-point resource information and whose basic information in the document database indicates that it may be a fundamental cause of failure; thereafter, the process of taking the resource information of the found processing unit as a new starting point and tracing the resource information stored in the resource database in the reverse of the storage order, so as to search for a processing unit that uses the resource indicated by that starting-point resource information and whose basic information in the document database indicates that it may be a fundamental cause of failure, is performed repeatedly; and the processing unit thus found is taken as the processing unit that is the fundamental cause of the failure.

3. The failure recovery assistance method according to claim 2, characterized in that the state monitoring table and the resource database are provided on a non-volatile storage device, and in that, when the computer is restarted: taking as a starting point the resource information, among the resource information stored in the resource database, of the processing unit that was being executed when the failure was detected, the resource information stored in the resource database is traced in the reverse of the order in which it was stored, so as to search for a processing unit that uses the resource indicated by the starting-point resource information and whose basic information in the document database indicates that it may be a fundamental cause of failure; thereafter, the process of taking the resource information of the found processing unit as a new starting point and tracing the resource information stored in the resource database in the reverse of the storage order, so as to search for such a processing unit, is performed repeatedly; and the processing unit thus found is taken as the processing unit that is the fundamental cause of the failure.

4. The failure recovery assistance method according to claim 2, characterized in that the resource information includes a memory address accessed by the processing unit, and in that, when the coping method stored in the document database in association with the identifier of the processing unit that is the fundamental cause of the failure is output, the memory image indicated by the memory address included in the resource information of that processing unit is also output.

5. A failure recovery assistance device, characterized by comprising: a document database that stores, in association with the identifier of each processing unit of a program, basic information indicating whether that processing unit may be a fundamental cause of failure and a coping method for the case where that processing unit is a fundamental cause of failure; a state monitoring table in which, each time a processing unit of the program is executed, the identifier of that processing unit is stored in execution order; a resource database in which, each time a processing unit of the program is executed, resource information indicating the resources used by that processing unit is stored in association with the identifier stored in the state monitoring table; a failure cause investigating unit that, when a failure is detected, determines the processing unit that is the fundamental cause of the failure based on the contents of the state monitoring table, the contents of the resource database, and the contents of the document database; and a document editing and display mechanism unit that outputs the coping method stored in the document database in association with the identifier of the processing unit determined by the failure cause investigating unit to be the fundamental cause of the failure.

6. The failure recovery assistance device according to claim 5, characterized in that the failure cause investigating unit is configured so that, when the failure is detected: taking as a starting point the resource information, among the resource information stored in the resource database, of the processing unit that was being executed when the failure was detected, it traces the resource information stored in the resource database in the reverse of the order in which it was stored, so as to search for a processing unit that uses the resource indicated by the starting-point resource information and whose basic information in the document database indicates that it may be a fundamental cause of failure; thereafter, it repeatedly performs the process of taking the resource information of the found processing unit as a new starting point and tracing the resource information stored in the resource database in the reverse of the storage order, so as to search for such a processing unit; and it takes the processing unit thus found as the processing unit that is the fundamental cause of the failure.

7. The failure recovery assistance device according to claim 6, characterized in that the state monitoring table and the resource database are configured on a non-volatile storage device, and in that the failure cause investigating unit is configured so that, when the computer is restarted: taking as a starting point the resource information, among the resource information stored in the resource database, of the processing unit that was being executed when the failure was detected, it traces the resource information stored in the resource database in the reverse of the order in which it was stored, so as to search for a processing unit that uses the resource indicated by the starting-point resource information and whose basic information in the document database indicates that it may be a fundamental cause of failure; thereafter, it repeatedly performs the process of taking the resource information of the found processing unit as a new starting point and tracing the resource information stored in the resource database in the reverse of the storage order, so as to search for such a processing unit; and it takes the processing unit thus found as the processing unit that is the fundamental cause of the failure.

8. The failure recovery assistance device according to claim 6, characterized in that: the resource information includes a memory address accessed by the processing unit; the device comprises a resource collection mechanism unit that collects the memory image indicated by the memory address included in the resource information of the processing unit determined by the failure cause investigating unit to be the fundamental cause of the failure; and the document editing and display mechanism unit is configured to edit and output the coping method stored in the document database in association with the identifier of the processing unit that is the fundamental cause of the failure, together with the memory image collected by the resource collection mechanism unit.

9. A machine-readable recording medium on which is recorded a program for causing a computer to function as a failure recovery assistance device, the computer being provided with: a document database that stores, in association with the identifier of each processing unit of a program, basic information indicating whether that processing unit may be a fundamental cause of failure and a coping method for the case where that processing unit is a fundamental cause of failure; a state monitoring table in which, each time a processing unit of the program is executed, the identifier of that processing unit is stored in execution order; and a resource database in which, each time a processing unit of the program is executed, resource information indicating the resources used by that processing unit is stored in association with the identifier stored in the state monitoring table; the recorded program causing the computer to function as: a failure cause investigating unit that, when a failure is detected, determines the processing unit that is the fundamental cause of the failure based on the contents of the state monitoring table, the contents of the resource database, and the contents of the document database; and a document editing and display mechanism unit that outputs the coping method stored in the document database in association with the identifier of the processing unit determined by the failure cause investigating unit to be the fundamental cause of the failure.