JP2006277115A

JP2006277115A - Abnormality detection program and abnormality detection method

Info

Publication number: JP2006277115A
Application number: JP2005092780A
Authority: JP
Inventors: Seiki Yamashita; 清貴山下; Joji Kato; 丈治加藤
Original assignee: Denso Ten Ltd; Fujitsu Ltd
Current assignee: Denso Ten Ltd; Fujitsu Ltd
Priority date: 2005-03-28
Filing date: 2005-03-28
Publication date: 2006-10-12
Anticipated expiration: 2025-03-28
Also published as: JP4562568B2

Abstract

PROBLEM TO BE SOLVED: To efficiently detect abnormal operation of an application program without modifying the application program.

SOLUTION: An operation information providing unit collects operation information on processes running on an operating system and supplies it to a monitoring processing unit. An abnormality determination unit in the monitoring processing unit determines whether a process belonging to the monitored application program is operating abnormally, based on the process state, the duration of that state, the involuntary context switch count, and the voluntary context switch count contained in the supplied operation information.

COPYRIGHT: (C)2007,JPO&INPIT

Description

The present invention relates to an abnormality detection program and an abnormality detection method for detecting abnormal operation of an application program running on an operating system, and more particularly to an abnormality detection program and method that can detect such abnormal operation efficiently without modifying the monitored application program.

Abnormality detection programs are known that detect abnormal program behavior, such as an application program running on an operating system falling into an infinite loop or hanging as a result of resource contention.

For example, some abnormality detection programs detect abnormal operation by embedding detection code in the application program to be monitored. However, for programs whose source code the end user cannot modify, such as commercially available application programs, such code cannot be embedded, so these programs cannot be monitored for abnormal operation.

For this reason, techniques have been proposed that detect abnormal operation without requiring code changes to the target application program. For example, Patent Document 1 discloses a technique that provides, on the CPU side, a timer measuring the duration of a bus error caused by a runaway application program, and uses it to detect abnormal operation. Patent Document 2 discloses a technique that detects abnormal operation by measuring the time an application program takes to display a screen on a display or similar device.

Patent Document 1: JP-A-10-307737
Patent Document 2: JP 2003-58394 A

However, the technique of Patent Document 1 cannot detect abnormal operation that does not cause a bus error. An infinite loop in an application program is one such abnormality. An infinite loop not only degrades the operation of other applications but can in some cases bring the system down, so it should be detected; yet Patent Document 1 cannot detect it.

Likewise, the technique of Patent Document 2 cannot detect abnormal operation of an application program that has no display screen. For example, a resident process such as a communication daemon running on the operating system has no display screen, so Patent Document 2 cannot detect its abnormal operation.

In addition, an application program usually consists of multiple processes, but conventional abnormality detection programs cannot continue monitoring the child processes created by the main process once the main process has terminated. Unmonitored processes that remain may adversely affect other applications, so these processes must be reliably kept under monitoring.

Consequently, a major challenge is how to realize an abnormality detection program that efficiently and reliably detects abnormal operation, such as infinite loops, in every process making up an application program, without modifying the application program being monitored.

The present invention has been made to solve the problems of the prior art described above, and its object is to provide an abnormality detection program and an abnormality detection method that can efficiently detect abnormal operation of an application program without modifying that application program.

To solve the problems described above and achieve the object, the abnormality detection program according to the invention of claim 1 detects abnormal operation of an application program running on an operating system, and causes a computer to execute: an acquisition procedure that acquires, from the operating system, operation information on processes belonging to the application program; and a determination procedure that determines abnormal operation of the application program based on the operation information acquired by the acquisition procedure.

In the abnormality detection program according to the invention of claim 2, based on the invention of claim 1, the operation information acquired by the acquisition procedure includes the context switch count of the process.

In the abnormality detection program according to the invention of claim 3, based on the invention of claim 2, the context switch count is divided into a voluntary context switch count, which is the number of context switches the process caused on its own initiative, and an involuntary context switch count, which is the number of context switches caused by factors other than the process.

In the abnormality detection program according to the invention of claim 4, based on the invention of claim 1, 2, or 3, the operation information acquired by the acquisition procedure includes a process state representing the execution state of the process.

In the abnormality detection program according to the invention of claim 5, based on the invention of claim 4, when a predetermined time has elapsed with the process state unchanged, the determination procedure determines abnormal operation of the application program based on the context switch count.

In the abnormality detection program according to the invention of claim 6, based on the invention of claim 5, the determination procedure determines that the process has fallen into an infinite loop when a predetermined time has elapsed with the process state remaining "running", the voluntary context switch count has not changed, and the involuntary context switch count has changed.

In the abnormality detection program according to the invention of claim 7, based on the invention of claim 5 or 6, the determination procedure determines that the process has fallen into a CPU wait abnormality when a predetermined time has elapsed with the process state remaining "running", the voluntary context switch count has not changed, and the involuntary context switch count has not changed either.

In the abnormality detection program according to the invention of claim 8, based on the invention of claim 5, 6, or 7, the determination procedure determines that the process has fallen into an I/O wait abnormality when a predetermined time has elapsed with the process state remaining "waiting for execution", the voluntary context switch count has not changed, and the involuntary context switch count has not changed either.
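As a non-limiting illustration, the determinations of claims 5 to 8 can be sketched in Python as follows. The state labels ("running", "waiting"), the `Sample` record, the `classify` function, and the result strings are hypothetical names introduced only for this sketch. It assumes that `prev` is the sample taken when the process last entered its current state and that `threshold_s` is the predetermined time.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sample:
    """One observation of a process (hypothetical record for this sketch)."""
    state: str        # "running" or "waiting" (waiting for execution)
    voluntary: int    # voluntary context switch count
    involuntary: int  # involuntary context switch count
    timestamp: float  # observation time in seconds


def classify(prev: Sample, curr: Sample, threshold_s: float) -> Optional[str]:
    """Return an abnormality label, or None if no abnormality is determined."""
    # Claim 5: only judge when the state stayed unchanged for the set time.
    if prev.state != curr.state:
        return None
    if curr.timestamp - prev.timestamp < threshold_s:
        return None

    vol_changed = curr.voluntary != prev.voluntary
    invol_changed = curr.involuntary != prev.involuntary

    if curr.state == "running" and not vol_changed:
        # Claim 6: preempted by others but never yields by itself.
        if invol_changed:
            return "infinite_loop"
        # Claim 7: no switch of either kind while nominally running.
        return "cpu_wait"
    if curr.state == "waiting" and not vol_changed and not invol_changed:
        # Claim 8: stuck waiting with both counters frozen.
        return "io_wait"
    return None
```

Because the time threshold is an ordinary parameter, it can be chosen per application, which corresponds to the flexibility attributed to claim 5.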

In the abnormality detection program according to the invention of claim 9, based on the invention of any one of claims 1 to 8, the operation information acquired by the acquisition procedure includes the parent-child relationships of the processes belonging to the application program.

In the abnormality detection program according to the invention of claim 10, based on the invention of claim 9, the determination procedure continues determining abnormal operation of the application program, including the zombie process, even when the parent process of a child process belonging to the application program has terminated and the child process has become a zombie process.
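As a non-limiting illustration, one way to keep child processes under monitoring after their parent has terminated is to maintain the set of monitored process IDs across successive snapshots of the parent-child relationships. The function `update_monitored` and the snapshot format (a mapping from each live process ID to its parent's process ID) are assumptions made for this sketch, not part of the claimed program.

```python
from typing import Dict, Set


def update_monitored(monitored: Set[int], parent_of: Dict[int, int]) -> Set[int]:
    """Return the monitored PID set for the current snapshot.

    monitored: PIDs already known to belong to the monitored application.
    parent_of: snapshot mapping every live PID to its parent PID.
    """
    # Keep every previously monitored process that is still alive, even if
    # its parent has exited and it has been re-parented to another process.
    result = {pid for pid in monitored if pid in parent_of}

    # Add newly created descendants of any monitored process.
    changed = True
    while changed:
        changed = False
        for pid, ppid in parent_of.items():
            if pid not in result and ppid in result:
                result.add(pid)
                changed = True
    return result
```

In this sketch, a child created and orphaned between two snapshots would be missed; collecting the relationships on the operating system side, as the description proposes, avoids depending on the polling interval.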

In the abnormality detection program according to the invention of claim 11, based on the invention of any one of claims 1 to 10, the operation information includes the process priority of the processes belonging to the application program.

In the abnormality detection program according to the invention of claim 12, based on the invention of any one of claims 1 to 11, the operating system causes the computer to execute a providing procedure that collects the operation information for each process descriptor of the processes belonging to the application program and provides it to the acquisition procedure.

In the abnormality detection program according to the invention of claim 13, based on the invention of any one of claims 1 to 12, the processes belonging to the application program include a multi-threaded process.

The abnormality detection method according to the invention of claim 14 detects abnormal operation of an application program running on an operating system, and includes: an acquisition step of acquiring, from the operating system, operation information on processes belonging to the application program; and a determination step of determining abnormal operation of the application program based on the operation information acquired in the acquisition step.

In the abnormality detection method according to the invention of claim 15, based on the invention of claim 14, the operating system includes a providing step of collecting the operation information for each process descriptor of the processes belonging to the application program and providing it to the acquisition step.

According to the abnormality detection program of claim 1, operation information on the processes belonging to the application program is acquired from the operating system, and abnormal operation of the application program is determined based on the acquired operation information. Abnormal operation of the application program can therefore be detected efficiently without modifying the application program being monitored.

According to the abnormality detection program of claim 2, the operation information acquired by the acquisition procedure includes the context switch count of the process. Abnormal operation of the application program can therefore be detected efficiently by tracking how the processes making up the application program acquire and release the CPU.

According to the abnormality detection program of claim 3, the context switch count is divided into a voluntary context switch count, which is the number of context switches the process caused on its own initiative, and an involuntary context switch count, which is the number of context switches caused by factors other than the process. Abnormal operation of the application program can therefore be detected efficiently by tracking in detail how the processes making up the application program acquire and release the CPU.

According to the abnormality detection program of claim 4, the operation information acquired by the acquisition procedure includes a process state representing the execution state of the process. Abnormal operation of the application program can therefore be detected efficiently by tracking the execution state of the processes making up the application program.

According to the abnormality detection program of claim 5, when a predetermined time has elapsed with the process state unchanged, the determination procedure determines abnormal operation of the application program based on the context switch count. The threshold for determining abnormal operation can therefore be set flexibly.

According to the abnormality detection program of claim 6, the determination procedure determines that the process has fallen into an infinite loop when a predetermined time has elapsed with the process state remaining "running", the voluntary context switch count has not changed, and the involuntary context switch count has changed. Abnormal operation caused by an infinite loop can therefore be detected efficiently.

According to the abnormality detection program of claim 7, the determination procedure determines that the process has fallen into a CPU wait abnormality when a predetermined time has elapsed with the process state remaining "running", the voluntary context switch count has not changed, and the involuntary context switch count has not changed either. Abnormal operation caused by CPU wait can therefore be detected efficiently.

According to the abnormality detection program of claim 8, the determination procedure determines that the process has fallen into an I/O wait abnormality when a predetermined time has elapsed with the process state remaining "waiting for execution", the voluntary context switch count has not changed, and the involuntary context switch count has not changed either. Abnormal operation caused by I/O wait can therefore be detected efficiently.

According to the abnormality detection program of claim 9, the operation information acquired by the acquisition procedure includes the parent-child relationships of the processes belonging to the application program. Abnormal operation can therefore be detected efficiently in every process making up the monitored application program.

According to the abnormality detection program of claim 10, the determination procedure continues determining abnormal operation of the application program, including the zombie process, even when the parent process of a child process belonging to the application program has terminated and the child process has become a zombie process. Monitoring for abnormal operation can therefore continue even when a process belonging to the monitored application program has terminated and the parent-child relationships among its processes have broken down.

According to the abnormality detection program of claim 11, the operation information includes the process priority of the processes belonging to the application program. Taking process priority into account when determining abnormal operation therefore improves the accuracy of the determination.

According to the abnormality detection program of claim 12, the operating system causes the computer to execute a providing procedure that collects operation information for each process descriptor of the processes belonging to the application program and provides it to the acquisition procedure. The information needed for application monitoring can therefore be provided efficiently.

According to the abnormality detection program of claim 13, the processes belonging to the application program include a multi-threaded process. Even when the monitored application program contains a multi-threaded process, that process is handled in the same way as a non-multi-threaded process, so abnormal operation of the process can be detected efficiently.

According to the abnormality detection method of claim 14, operation information on the processes belonging to the application program is acquired from the operating system, and abnormal operation of the application program is determined based on the acquired operation information. Abnormal operation of the application program can therefore be detected efficiently without modifying the application program being monitored.

According to the abnormality detection method of claim 15, the operating system includes a providing step of collecting operation information for each process descriptor of the processes belonging to the application program and providing it to the acquisition procedure. The information needed for application monitoring can therefore be provided efficiently.

Preferred embodiments of the abnormality detection program and abnormality detection method according to the present invention are described in detail below with reference to the accompanying drawings. The following embodiments describe the case where the invention is applied to a computer running Linux (registered trademark) as its operating system.

FIG. 1 illustrates the concept of the program abnormality detection process according to the present invention. As shown in the figure, a monitoring process 100 is provided as one of the application programs running on the operating system and monitors each monitored application for abnormal operation. A monitored application program ("monitored application" in FIG. 1) usually consists of multiple processes.

When these processes are created on the operating system, a process descriptor corresponding to each process is managed inside the operating system. The operating system uses these process descriptors to schedule each process and to allocate resources such as input/output.

The operation information providing unit 200, provided inside the operating system, is a processing unit that acquires process operation information for each process descriptor and provides it to the monitoring process 100. Linux (registered trademark) is an operating system whose source code is publicly available, and its functions are being extended by RAP (Resource Archive Project). The operation information providing unit 200 is a processing unit added by extending Linux (registered trademark).

The monitoring process 100 acquires operation information on the processes belonging to a monitored application by issuing an operation information acquisition request to the operation information providing unit 200. It then detects abnormal operation of the monitored application by determining, from the acquired operation information, whether each process is operating abnormally. When it detects abnormal operation, it performs abnormality handling such as forcibly terminating the process.
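As a non-limiting illustration of acquiring such operation information from user space, the following Python sketch reads the `/proc/<pid>/status` file that stock Linux kernels expose, which carries the process state, the parent process ID, and the `voluntary_ctxt_switches` and `nonvoluntary_ctxt_switches` counters. This file is used here only as a stand-in; it is not the interface of the operation information providing unit 200, which the description presents as a kernel extension. The function names `parse_status` and `read_operation_info` are hypothetical.

```python
from typing import Dict

# Fields of /proc/<pid>/status used in this sketch.
_FIELDS = ("State", "PPid", "voluntary_ctxt_switches", "nonvoluntary_ctxt_switches")


def parse_status(text: str) -> Dict[str, str]:
    """Extract the fields of interest from the text of a status file."""
    info: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key in _FIELDS:
            info[key] = value.strip()
    return info


def read_operation_info(pid: int) -> Dict[str, str]:
    """Read and parse /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status", encoding="ascii", errors="replace") as f:
        return parse_status(f.read())
```

A monitoring loop could call `read_operation_info` periodically for each monitored process ID and compare successive results; the forced termination mentioned above could then be issued with `os.kill`.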

Conventionally, monitoring an application program usually required modifying it by adding specified source code. Abnormality detection methods that need no such modification had also been proposed, but the abnormalities they could detect were limited. They could not efficiently detect abnormalities such as an infinite loop in which an application program monopolizes the CPU, or an application program that must run in real time being kept waiting for a long period because of I/O wait or CPU wait.

In the abnormality detection process according to the present invention, the operating system collects operation information corresponding to the process descriptors it manages, and the monitoring process 100 running on the operating system acquires this operation information and detects abnormal operation of the monitored application based on it.

In this way, the monitoring process 100 does not monitor the monitored application directly; it monitors it indirectly, using the operation information corresponding to the process descriptors in the operating system. Abnormal operation can therefore be detected without modifying the monitored application, and because detailed process operation information held inside the operating system is available, a wide variety of abnormalities in the processes belonging to the monitored application can be detected efficiently.

Next, the relationship between the operation information used in the abnormality detection process according to the present invention and the abnormality types determined from it is described with reference to FIG. 2. FIG. 2 shows the relationship between operation information and abnormality types. The abnormality detection process according to the present invention focuses on the switching of the process running on the CPU (hereinafter, "context switch") and detects abnormal operation of the monitored application using the relationship between changes in the context switch count and the abnormality type.

Context switches are explained here. To realize a multi-process environment on a single-CPU computer, the processes must be scheduled so that they run on the CPU in turn, making it appear as if multiple processes were running simultaneously. For example, under real-time scheduling a timer interrupt triggers a context switch, and a process waiting to run is loaded onto the CPU in place of the running process. Under round-robin scheduling, each process is assigned a time slice, and when the time slice is used up a context switch occurs and the running process is switched.

Context switches are divided into voluntary context switches and involuntary context switches. A voluntary context switch is one that a process running on the CPU causes by releasing its own right to execute on the CPU, for example in order to perform input/output processing or to acquire a resource such as a semaphore. An involuntary context switch is one caused by a factor other than the running process itself, such as the operating system scheduling described above.

These context switches are counted, for example, by the following procedure. When the operating system detects a voluntary context switch, it transfers control from the process that released the CPU to another process and increments the voluntary context switch count held in memory. When it detects an involuntary context switch, it suspends the running process, transfers control to the interrupting process, and increments the involuntary context switch count held in memory.
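As an illustration of counters of this kind, many Unix-like systems keep per-process voluntary and involuntary context switch counts that a program can read. The following minimal sketch assumes a Unix-like platform where Python's standard `resource` module is available; it reads the counters of the calling process only and is not the mechanism described in this specification.

```python
import resource


def context_switch_counts():
    """Return (voluntary, involuntary) context switch counts of this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    # ru_nvcsw: voluntary switches, ru_nivcsw: involuntary switches.
    return usage.ru_nvcsw, usage.ru_nivcsw


voluntary, involuntary = context_switch_counts()
print(f"voluntary={voluntary} involuntary={involuntary}")
```

The values are cumulative totals since the process started, so a monitor would compare successive readings rather than use a single one.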

Even when the operating system in use does not provide such a context switch counting function, the function can be realized by monitoring, for example, the number of times a process issues I/O requests to the hardware and the number of times interrupts occur.

Because a context switch occurs whenever the running process changes, operation abnormalities of a process can be detected by examining the number of context switches.

Specifically, as shown in FIG. 2, in the case of an operation abnormality caused by an infinite loop, the process state is "running", and the number of involuntary context switches is increasing while the number of voluntary context switches is unchanged. In other words, context switches are being caused by operating system scheduling, but the processing of the process has not completed, so the process continues in a state in which it causes no voluntary context switch. A process that remains in this state for a predetermined time is therefore determined to be operating abnormally, because it is highly likely to have been using the CPU for a long time in an infinite loop.

In the case of an operation abnormality caused by waiting for the CPU, the process state is "running", and neither the number of involuntary context switches nor the number of voluntary context switches changes. When this state continues for a predetermined time, execution of the process is highly likely to be blocked by another process, so the process is determined to have an operation abnormality due to CPU wait.

In the case of an operation abnormality caused by waiting for I/O, the process is in the "waiting" state, and neither the number of involuntary context switches nor the number of voluntary context switches changes. When this state continues for a predetermined time, the I/O processing of the process is highly likely to be blocked, for example by resource contention with another process, so the process is determined to have an operation abnormality due to I/O wait.

In this way, the abnormality detection process according to the present invention focuses on context switches and detects operation abnormalities by associating changes in the number of context switches with the duration of the process state, so that operation abnormalities of the monitoring target application can be detected efficiently.
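The relationship in FIG. 2 can be sketched as a small classification function. This is a minimal illustration, not the implementation of the invention: the `Sample` type, its field names, the state strings, and the returned labels are all hypothetical, and the counts are assumed to be changes observed while the process state has remained the same.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sample:
    state: str              # "running" or "waiting" (hypothetical labels)
    state_duration: float   # seconds the state has continued unchanged
    involuntary_delta: int  # involuntary switches since the state began
    voluntary_delta: int    # voluntary switches since the state began


def classify(sample: Sample, threshold_sec: float) -> Optional[str]:
    """Return an abnormality label per FIG. 2, or None if none applies."""
    if sample.state_duration < threshold_sec:
        return None  # the state has not yet continued for the set time
    if sample.voluntary_delta != 0:
        return None  # the process is still yielding the CPU by itself
    if sample.state == "running":
        # Preempted by the scheduler but never yielding: infinite loop.
        # Not even preempted: the process is waiting for the CPU.
        return "infinite_loop" if sample.involuntary_delta > 0 else "cpu_wait"
    if sample.state == "waiting" and sample.involuntary_delta == 0:
        return "io_wait"
    return None


print(classify(Sample("running", 12.0, 40, 0), threshold_sec=10.0))
```

Keeping the duration check first means that short-lived states never raise an abnormality, which matches the requirement that a state continue for a predetermined time.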

In the abnormality detection process according to the present invention, all processes constituting the monitoring target application can be monitored. The process configuration of the monitoring target application will be described with reference to FIG. 3. FIG. 3 is a diagram showing the process configuration of the monitoring target application.

As shown in the figure, a monitoring target application usually consists of a parent process 101, which is the main process, and child processes 102, which are generated when the parent process issues a system call such as fork. Because a child process 102 may itself generate further child processes 102, a monitoring target application often has a hierarchical structure of parent-child relationships between processes.

The abnormality detection process according to the present invention detects operation abnormalities of the monitoring target application indirectly, using the process descriptor that is created in the operating system when a process is generated. Specifically, the operation information providing unit 200 provided in the operating system associates and holds the parent-child relationships of processes based on the contents of the process descriptors.

In an ordinary UNIX (registered trademark) operating system such as Linux (registered trademark), when a parent process disappears, the child processes it generated are placed under the master process and become so-called zombie processes. Conventional abnormality detection processes therefore had the problem that, once the parent process had disappeared, the child processes belonging to the monitoring target application could no longer be identified, and monitoring could not be continued.

In the abnormality detection process according to the present invention, the parent-child relationships of processes are associated and held at process generation, as described above. Monitoring of the child processes constituting the monitoring target application can therefore continue even after the main process of the application has terminated. Likewise, when a process other than the main process terminates, monitoring can continue by tracing the parent-child relationships recorded at process generation.

Next, the configuration of the computer 1 that includes the abnormality detection process according to the present invention will be described with reference to FIG. 4. FIG. 4 is a functional block diagram showing the configuration of the computer 1 that includes the abnormality detection process. As shown in the figure, the computer 1 includes a monitoring processing unit 10, an operation information provision processing unit 15, and a storage unit 20. The monitoring processing unit 10 is a processing unit of the monitoring program running on the operating system, and the operation information provision processing unit 15 is a processing unit of the operating system.

The monitoring processing unit 10 further includes an application activation unit 10a, an operation information acquisition unit 10b, an operation abnormality determination unit 10c, and an operation abnormality processing unit 10d. The operation information provision processing unit 15 further includes an operation information providing unit 15a and an operation information management unit 15b. The storage unit 20 holds the monitoring target data 20a used by the monitoring processing unit 10 and the operation information 20b used by the operation information provision processing unit 15.

The monitoring processing unit 10 is a processing unit that starts the monitoring target application and writes the monitoring target data 20a to the storage unit 20, acquires the operation information 20b from the operation information provision processing unit 15, determines operation abnormalities of the monitoring target application based on the monitoring target data 20a and the operation information 20b, and executes processing in response to an operation abnormality.

The application activation unit 10a is a processing unit that starts the monitoring target application and writes information about the application to the storage unit 20 as the monitoring target data 20a. An example of the monitoring target data 20a will be described with reference to FIG. 5. FIG. 5 is a diagram showing an example of the monitoring target data 20a.

As shown in the figure, when the application activation unit 10a starts a monitoring target application, information including the name of the started application and the main process ID of its main process is written to the storage unit 20 as the monitoring target data 20a. In the example of FIG. 5, the application activation unit 10a has three applications under monitoring (application A, application B, and application C), and the main process IDs of these monitoring target applications are 200, 240, and 300, respectively.

This embodiment describes the case in which the monitoring processing unit 10 stores, in the storage unit 20, monitoring target data 20a that includes the name of the started application and the process ID of its main process, and later associates this data with the acquired process operation information 20b. The monitoring target data 20a may, however, omit the process ID. In that case, a process name is included in the operation information 20b, and the processes belonging to the monitoring target application are identified by associating this process name with the application name included in the monitoring target data 20a.

Returning to FIG. 4, the operation information acquisition unit 10b will be described. The operation information acquisition unit 10b is a processing unit that issues an operation information acquisition request to the operation information provision processing unit 15 at a predetermined timing, acquires the process operation information 20b, and passes the acquired operation information 20b to the operation abnormality determination unit 10c. An example of the process operation information 20b acquired by the operation information acquisition unit 10b will be described with reference to FIG. 6. FIG. 6 is a diagram showing an example of the operation information 20b.

As shown in FIG. 6, the operation information 20b includes a process ID, a parent process ID, a process operation state, an operation state duration, the number of involuntary context switches, and the number of voluntary context switches. It is collected by the operation information provision processing unit 15 and provided in response to requests from the operation information acquisition unit 10b.

As shown in FIG. 6, the operation information 20b includes a process ID and a parent process ID. When a process belonging to the monitoring target application is generated, the operation information provision processing unit 15 sets the process ID of that process and the ID of the parent process that generated it. Therefore, even when a process belonging to the monitoring target application terminates and the parent-child relationship is broken, information on all processes belonging to the application can be provided to the monitoring process by tracing the parent process IDs in the operation information 20b.

The process configuration of the monitoring target application after the parent process has terminated will now be described with reference to FIG. 7. FIG. 7 is a diagram showing the process configuration of the monitoring target application when the parent process has terminated. As shown in the figure, even when the parent process 101 has terminated, the monitoring process 100 can continue to monitor the child processes 102 belonging to the monitoring target application.

This is because the process ID and the parent process ID are set in the operation information 20b when a process belonging to the monitoring target application is generated, so that all processes belonging to the application can be identified by tracing the process tree even after the parent process has terminated. When all processes belonging to a monitoring target application have terminated, the operation information provision processing unit 15 deletes the information on those processes from the operation information 20b.

Returning to FIG. 6, the process operation state will be described. The process operation state is an item representing the execution state of a process and is either "running" or "waiting". The "running" state means that the process has acquired the CPU and is running on it, or that the process was running on the CPU and had the CPU taken over by another process through an involuntary context switch. The "waiting" state means that the process was running on the CPU, released the CPU by itself, and thereby caused a voluntary context switch.

The operation state duration is an item representing how long the process operation state has continued without changing. For example, if the process operation state changes from "waiting" to "running" and 1 sec elapses while it remains "running", the operation state duration is 1 sec.

The number of involuntary context switches and the number of voluntary context switches are running totals of how many times the involuntary and voluntary context switches described above have occurred while the process operation state remained unchanged. Both counts are therefore reset when the process operation state changes, and counting starts anew. Alternatively, instead of resetting the counts, the difference from the counts at the time the process operation state changed may be used.
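The difference-based alternative mentioned above can be sketched as follows. This is a minimal illustration under assumed names: `StateTracker` and its fields are hypothetical, and the counter values passed in are assumed to be cumulative totals that are never reset.

```python
class StateTracker:
    """Track state duration and switch counts relative to the last state change."""

    def __init__(self, state, now, involuntary_total, voluntary_total):
        self.state = state
        self.since = now                      # time the current state began
        self.base_involuntary = involuntary_total
        self.base_voluntary = voluntary_total

    def update(self, state, now, involuntary_total, voluntary_total):
        """Return (duration, involuntary delta, voluntary delta) for the state."""
        if state != self.state:
            # State changed: take a new baseline instead of resetting counters.
            self.state = state
            self.since = now
            self.base_involuntary = involuntary_total
            self.base_voluntary = voluntary_total
        return (
            now - self.since,
            involuntary_total - self.base_involuntary,
            voluntary_total - self.base_voluntary,
        )


tracker = StateTracker("waiting", now=0.0, involuntary_total=10, voluntary_total=4)
tracker.update("running", now=1.0, involuntary_total=12, voluntary_total=5)
print(tracker.update("running", now=2.0, involuntary_total=15, voluntary_total=5))
```

The last call corresponds to the example in the text: the state changed from "waiting" to "running" and has remained "running" for 1 sec.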

Next, an example of the data in the process operation information 20b will be described with reference to FIG. 8. FIG. 8 is a diagram showing an example of the operation information data. The data indicated by 51 in the figure shows a state in which child processes with process IDs 204, 210, and 230 have been generated with the process with process ID 200 as their parent. If the process with process ID 200 terminates in this state, the operation information 20b becomes as indicated by 52 in the figure. Even when a process has terminated, its process ID and parent process ID remain without being erased, so the process tree can be traced using the process IDs and parent process IDs. The processes belonging to the monitoring target application can therefore continue to be monitored.
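Tracing the process tree from retained process ID and parent process ID pairs can be sketched as below. The record layout, the `alive` flag, and the function name are assumptions made for illustration; the process IDs 200, 204, 210, and 230 are taken from the example above, while the actual contents of FIG. 8 are not reproduced here.

```python
def app_processes(records, main_pid):
    """Return all PIDs in the tree rooted at main_pid, using retained PPIDs."""
    children = {}
    for pid, rec in records.items():
        children.setdefault(rec["ppid"], []).append(pid)

    found, stack, seen = [], [main_pid], set()
    while stack:
        pid = stack.pop()
        if pid in seen:
            continue  # guard against malformed (cyclic) records
        seen.add(pid)
        if pid in records:
            found.append(pid)
        stack.extend(children.get(pid, []))
    return sorted(found)


# Hypothetical records: the entry for PID 200 is kept after it has exited.
records = {
    200: {"ppid": None, "alive": False},
    204: {"ppid": 200, "alive": True},
    210: {"ppid": 200, "alive": True},
    230: {"ppid": 200, "alive": True},
    240: {"ppid": None, "alive": True},  # main process of another application
}

tree = app_processes(records, main_pid=200)
still_running = [pid for pid in tree if records[pid]["alive"]]
print(still_running)
```

Because the entry for the exited parent is retained, the children remain reachable from the main process ID stored in the monitoring target data.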

Returning to FIG. 4, the operation abnormality determination unit 10c will be described. The operation abnormality determination unit 10c is a processing unit that determines operation abnormalities of the monitoring target application using the monitoring target data 20a and the operation information 20b that the operation information acquisition unit 10b acquired from the operation information provision processing unit 15.

Specifically, the operation abnormality determination unit 10c identifies which monitoring target application each process belongs to by associating the monitoring target data 20a, which includes the application name and main process ID shown in FIG. 5, with the operation information 20b, which includes the process ID and parent process ID shown in FIG. 6. It then determines whether an operation abnormality has occurred in each process using the process state, the operation state duration, the number of involuntary context switches, and the number of voluntary context switches of that process.

For example, as shown in FIG. 2, in the case of an operation abnormality caused by an infinite loop, the process state is "running", and the number of involuntary context switches is increasing while the number of voluntary context switches is unchanged. In other words, context switches are being caused by operating system scheduling, but the processing of the process has not completed, so the process continues in a state in which it causes no voluntary context switch. When this state continues for a predetermined time, the process is determined to be operating abnormally, because it is highly likely to be occupying the CPU in an infinite loop.

In addition to the process state, operation state duration, number of involuntary context switches, and number of voluntary context switches described above, the operation abnormality determination unit 10c can use other process information in its determination. For example, the accuracy of the determination can be improved by using the scheduling policy of the process, scheduling parameters including the scheduling priority, a tick counter value representing the consumed CPU allocation time, the execution time of the process, and the like.

For example, even in a case that would be determined to be an infinite loop abnormality, a process with a high scheduling priority may simply be continuing normal processing rather than being caught in an infinite loop. In such a case, the accuracy of infinite loop determination can be improved by setting the threshold for the abnormal state duration with the scheduling priority of the process taken into account.
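One way to take scheduling priority into account is to lengthen the duration threshold for higher-priority processes. The sketch below is purely an assumption for illustration: the linear scaling, the 0 to 99 priority range, the `max_scale` factor, and the function name are not specified in this description.

```python
def loop_threshold_sec(base_sec, priority, max_priority=99, max_scale=4.0):
    """Scale the infinite-loop threshold linearly with scheduling priority."""
    # Clamp the priority into the assumed range [0, max_priority].
    priority = max(0, min(priority, max_priority))
    scale = 1.0 + (max_scale - 1.0) * priority / max_priority
    return base_sec * scale


print(loop_threshold_sec(10.0, 0))   # lowest priority keeps the base threshold
print(loop_threshold_sec(10.0, 99))  # highest priority waits four times longer
```

A process that legitimately holds the CPU because of its high priority is then given more time before it is reported as looping.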

The operation abnormality processing unit 10d is a processing unit that, when the operation abnormality determination unit 10c determines that an operation abnormality exists, performs processing such as terminating the relevant monitoring target application or displaying an error. The operation information provision processing unit 15 is a processing unit that collects the process information created in the operating system for each process descriptor, writes it to the storage unit 20 as the operation information 20b, and provides the operation information 20b in response to requests from the operation information acquisition unit 10b of the monitoring program.

The operation information providing unit 15a is a processing unit that, on receiving an operation information acquisition request from the operation information acquisition unit 10b, reads the operation information 20b from the storage unit 20 and passes it to the operation information acquisition unit 10b. The operation information management unit 15b is a processing unit that collects the process information created in the operating system and writes the collected process information to the storage unit 20 as the operation information 20b.

The storage unit 20 consists of memory such as RAM (Random Access Memory) and stores the monitoring target data 20a and the operation information 20b. The monitoring target data 20a is written to the storage unit 20 by the application activation unit 10a and read by the operation abnormality determination unit 10c. As already described with reference to FIG. 5, the monitoring target data 20a includes the application name and the main process ID.

The operation information 20b is process information written to the storage unit 20 by the operation information management unit 15b and read by the operation information providing unit 15a. As already described with reference to FIG. 6, the operation information 20b includes the process ID, the parent process ID, the process operation state, the operation state duration, the number of involuntary context switches, and the number of voluntary context switches.

Next, the processing procedures by which the monitoring program described above detects operation abnormalities of the monitoring target application will be described with reference to FIGS. 9 to 11. FIG. 9 is a flowchart showing the processing procedure for detecting an operation abnormality caused by an infinite loop.

As shown in FIG. 9, when the operation abnormality determination unit 10c receives the operation information 20b from the operation information acquisition unit 10b (step S101), it refers to the process operation state and the operation state duration in the operation information 20b and determines whether the process operation state has been "running" continuously for a predetermined time (step S102). If it has not (No at step S102), the unit determines that there is no operation abnormality due to an infinite loop (step S105) and ends the processing.

If the state has been "running" continuously for the predetermined time (Yes at step S102), the unit refers to the number of voluntary context switches in the acquired operation information 20b and determines whether that number has remained unchanged (step S103). If the number of voluntary context switches has changed (No at step S103), the unit determines that there is no operation abnormality due to an infinite loop (step S105) and ends the processing.

If the number of voluntary context switches has not changed (Yes at step S103), the unit refers to the number of involuntary context switches in the acquired operation information 20b and determines whether that number has remained unchanged (step S104). If the number of involuntary context switches has not changed (Yes at step S104), the unit determines that there is no operation abnormality due to an infinite loop (step S105) and ends the processing. If the number of involuntary context switches has changed (No at step S104), the unit determines that there is an operation abnormality due to an infinite loop (step S106) and ends the processing.

Next, the processing procedure by which the monitoring program detects an operation abnormality due to CPU wait will be described with reference to FIG. 10. FIG. 10 is a flowchart showing the processing procedure for detecting an operation abnormality due to CPU wait.

As shown in FIG. 10, when the operation abnormality determination unit 10c receives the operation information 20b from the operation information acquisition unit 10b (step S201), it refers to the process operation state and the operation state duration in the operation information 20b and determines whether the process operation state has been "running" continuously for a predetermined time (step S202). If it has not (No at step S202), the unit determines that there is no operation abnormality due to CPU wait (step S206) and ends the processing.

If the state has been "running" continuously for the predetermined time (Yes at step S202), the unit refers to the number of voluntary context switches in the acquired operation information 20b and determines whether that number has remained unchanged (step S203). If the number of voluntary context switches has changed (No at step S203), the unit determines that there is no operation abnormality due to CPU wait (step S206) and ends the processing.

If the number of voluntary context switches has not changed (Yes at step S203), the unit refers to the number of involuntary context switches in the acquired operation information 20b and determines whether that number has remained unchanged (step S204). If the number of involuntary context switches has changed (No at step S204), the unit determines that there is no operation abnormality due to CPU wait (step S206) and ends the processing. If the number of involuntary context switches has not changed (Yes at step S204), the unit determines that there is an operation abnormality due to CPU wait (step S205) and ends the processing.

Next, the processing procedure by which the monitoring program detects an operation abnormality due to I/O waiting is described with reference to FIG. 11. FIG. 11 is a flowchart showing the processing procedure for detecting an operation abnormality due to I/O waiting.

As shown in FIG. 11, when the operation abnormality determination unit 10c receives the operation information 20b from the operation information acquisition unit 10b (step S301), it refers to the process operation state and the operation state duration in the operation information 20b and determines whether the process operation state has remained "waiting (waiting for execution)" for a predetermined time (step S302). If the state has not remained "waiting" for the predetermined time (No at step S302), the unit determines that there is no operation abnormality due to I/O waiting (step S306) and ends the process.

If the state has remained "waiting" for the predetermined time (Yes at step S302), the unit refers to the spontaneous context switch count in the acquired operation information 20b and determines whether that count is unchanged (step S303). If the spontaneous context switch count has changed (No at step S303), the unit determines that there is no operation abnormality due to I/O waiting (step S306) and ends the process.

If the spontaneous context switch count has not changed (Yes at step S303), the unit refers to the externally caused context switch count in the acquired operation information 20b and determines whether that count is unchanged (step S304). If the externally caused context switch count has changed (No at step S304), the unit determines that there is no operation abnormality due to I/O waiting (step S306) and ends the process. If the externally caused context switch count has not changed (Yes at step S304), the unit determines that there is an operation abnormality due to I/O waiting (step S305) and ends the process.
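Because the FIG. 11 flow differs from the FIG. 10 flow only in the process state that is tested, the checks can be folded into a single classification function. The sketch below is a hypothetical illustration: the names `Sample`, `Abnormality`, and `classify`, the state labels, and the comparison of two successive samples are assumptions, and the infinite-loop rule is taken from the wording of claim 6 rather than from these two flowcharts.

```python
from enum import Enum
from typing import NamedTuple


class Abnormality(Enum):
    NONE = "none"
    INFINITE_LOOP = "infinite loop"
    CPU_WAIT = "CPU wait"
    IO_WAIT = "I/O wait"


class Sample(NamedTuple):
    """Hypothetical snapshot of one process's operation information."""
    state: str             # "running" or "waiting" (labels assumed here)
    state_duration: float  # seconds spent in the current state
    spontaneous: int       # spontaneous context switch count
    external: int          # externally caused context switch count


def classify(prev: Sample, curr: Sample, threshold: float) -> Abnormality:
    """Classify a process from two successive samples."""
    # S202 / S302: the state must have persisted for the predetermined time.
    if curr.state_duration < threshold:
        return Abnormality.NONE
    # S203 / S303: any change in the spontaneous count means no abnormality.
    if curr.spontaneous != prev.spontaneous:
        return Abnormality.NONE
    external_changed = curr.external != prev.external
    if curr.state == "running":
        # Claim 6 (external count changes) versus S204 / S205 (no change).
        if external_changed:
            return Abnormality.INFINITE_LOOP
        return Abnormality.CPU_WAIT
    if curr.state == "waiting":
        # S304 / S305: I/O wait only when the external count is also frozen.
        return Abnormality.NONE if external_changed else Abnormality.IO_WAIT
    return Abnormality.NONE
```

Keeping the state test last makes the shared structure of the flowcharts visible: the duration check and the spontaneous-count check are identical, and only the final branch depends on the state.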

As described above, in the abnormality detection process according to the present embodiment, the operation information providing unit collects operation information on the processes running on the operating system and provides it to the monitoring processing unit. The operation abnormality determination unit of the monitoring processing unit then determines operation abnormalities of the processes belonging to the monitored application program based on the process operation state, the operation state duration, the externally caused context switch count, and the spontaneous context switch count contained in that operation information. Operation abnormalities of the application program can therefore be detected efficiently without modifying the application program to be monitored.
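The embodiment obtains its counters from an operation information providing unit inside the operating system. As a rough substitute illustration only, and not the mechanism described here, current Linux kernels expose comparable per-process values in `/proc/<pid>/status` through the `State`, `voluntary_ctxt_switches`, and `nonvoluntary_ctxt_switches` fields. The helper names `parse_status` and `read_operation_info` below are hypothetical, and the mapping of those Linux fields onto the spontaneous and externally caused counts is an assumption. The state duration is not available from that file, so a caller would have to track it across samples.

```python
from pathlib import Path


def parse_status(text: str) -> dict:
    """Extract the state letter and both switch counters from status text."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # keep only "Key: value" lines
            fields[key.strip()] = value.strip()
    return {
        # "R (running)" -> "R"; only the one-letter state code is kept.
        "state": fields["State"].split()[0],
        # Assumed mapping: voluntary ~ spontaneous, nonvoluntary ~ external.
        "spontaneous_switches": int(fields["voluntary_ctxt_switches"]),
        "external_switches": int(fields["nonvoluntary_ctxt_switches"]),
    }


def read_operation_info(pid: int) -> dict:
    """Read one sample for `pid` (Linux only; needs a readable /proc)."""
    return parse_status(Path(f"/proc/{pid}/status").read_text())
```

Because the values are read from outside the monitored process, this kind of sampling needs no change to the application itself, which is the property the embodiment relies on.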

As described above, the abnormality detection program and the abnormality detection method according to the present invention are useful for detecting operation abnormalities of application programs, and are particularly suited to application programs that cannot be modified, such as commercial applications.

FIG. 1 is a diagram showing the concept of the program abnormality detection process according to the present invention.
FIG. 2 is a diagram showing the relationship between operation information and abnormality type.
FIG. 3 is a diagram showing the process configuration of a monitored application.
FIG. 4 is a functional block diagram showing the configuration of a computer that includes the abnormality detection process.
FIG. 5 is a diagram showing an example of monitoring target data.
FIG. 6 is a diagram showing an example of operation information.
FIG. 7 is a diagram showing the process configuration of the monitored application when the parent process has terminated.
FIG. 8 is a diagram showing an example of operation information data.
FIG. 9 is a flowchart showing the processing procedure for detecting an operation abnormality due to an infinite loop.
FIG. 10 is a flowchart showing the processing procedure for detecting an operation abnormality due to CPU waiting.
FIG. 11 is a flowchart showing the processing procedure for detecting an operation abnormality due to I/O waiting.

Explanation of symbols

1 Data providing apparatus
10 Monitoring processing unit
10a Application starting unit
10b Operation information acquisition unit
10c Operation abnormality determination unit
10d Operation abnormality handling unit
15 Operation information provision processing unit
15a Operation information providing unit
15b Operation information management unit
20 Storage unit
20a Monitoring target data
20b Operation information
51, 52 Operation information examples
100 Monitoring process
101 Parent process
102 Child process
200 Operation information providing unit

Claims

An abnormality detection program for detecting an operation abnormality of an application program operating on an operating system, the program causing a computer to execute:
an acquisition procedure for acquiring, from the operating system, operation information of a process belonging to the application program; and
a determination procedure for determining an operation abnormality of the application program based on the operation information acquired by the acquisition procedure.

The abnormality detection program according to claim 1, wherein the operation information acquired by the acquisition procedure includes a context switch count of the process.

The abnormality detection program according to claim 2, wherein the context switch count is classified into two types: a spontaneous context switch count, indicating the number of context switches that the process caused by its own instruction, and an externally caused context switch count, indicating the number of context switches caused by factors other than the process itself.

The abnormality detection program according to claim 1, wherein the operation information acquired by the acquisition procedure includes a process state representing an execution state of the process.

The abnormality detection program according to claim 4, wherein the determination procedure determines an operation abnormality of the application program based on the context switch count when a predetermined time has elapsed without the process state changing.

The abnormality detection program according to claim 5, wherein the determination procedure determines that the process has fallen into an infinite loop when the predetermined time has elapsed with the process state remaining "running", the spontaneous context switch count has not changed, and the externally caused context switch count has changed.

The abnormality detection program according to claim 5, wherein the determination procedure determines that the process has fallen into a CPU wait abnormality when the predetermined time has elapsed with the process state remaining "running", the spontaneous context switch count has not changed, and the externally caused context switch count has not changed.

The abnormality detection program according to claim 5, wherein the determination procedure determines that the process has fallen into an I/O wait abnormality when the predetermined time has elapsed with the process state remaining "waiting for execution", the spontaneous context switch count has not changed, and the externally caused context switch count has not changed.

The abnormality detection program according to any one of claims 1 to 8, wherein the operation information acquired by the acquisition procedure includes a parent-child relationship of the process belonging to the application program.

The abnormality detection program according to claim 9, wherein, even when the parent process of a child process belonging to the application program terminates and the child process becomes a zombie process, the determination procedure continues to determine operation abnormalities of the application program including the zombie process.

The abnormality detection program according to claim 1, wherein the operation information includes a process priority of a process belonging to the application program.

The abnormality detection program according to claim 1, further causing the computer to execute a providing procedure in which the operating system collects the operation information for each process descriptor of the processes belonging to the application program and provides it to the acquisition procedure.

The abnormality detection program according to claim 1, wherein the process belonging to the application program includes a multi-thread process.

An abnormality detection method for detecting an operation abnormality of an application program operating on an operating system, the method comprising:
an acquisition step of acquiring, from the operating system, operation information of a process belonging to the application program; and
a determination step of determining an operation abnormality of the application program based on the operation information acquired by the acquisition step.

The abnormality detection method according to claim 14, further comprising a providing step in which the operating system collects the operation information for each process descriptor of the processes belonging to the application program and provides it to the acquisition step.