JP2017107372A

JP2017107372A - Failure symptom detection system and failure symptom detection method

Info

Publication number: JP2017107372A
Application number: JP2015240182A
Authority: JP
Inventors: 聖人細田; Kiyoto Hosoda
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2015-12-09
Filing date: 2015-12-09
Publication date: 2017-06-15
Anticipated expiration: 2035-12-09
Also published as: JP6410705B2

Abstract

PROBLEM TO BE SOLVED: To provide a failure symptom detection system and a failure symptom detection method that enable an early response to failures occurring during software execution.

SOLUTION: The failure symptom detection system comprises: a detection definition creation device 100 that creates in advance, as a detection definition, the name of a process monitoring unit that prescribes the range in which a program failure symptom is to be detected, together with the logs used to determine whether a failure symptom is present; and a log analysis device 200 that acquires the log output each time a software program is executed and detects a failure symptom in that log on the basis of the detection definition created by the detection definition creation device 100. The log analysis device 200 detects an error log linked to a program failure and, on the basis of the detection definition, analyzes for each process monitoring unit of the program whether a detection log that is expected to occur when processing executes normally has occurred, and detects the absence of that detection log as a failure symptom.

SELECTED DRAWING: Figure 1

Description

The present invention relates to a failure sign detection system and method that detect signs of a failure from the characteristics of the system log (hereinafter simply "log") output during software execution, thereby preventing failures before they occur or detecting them early.

Conventionally, in software failure detection, error logs associated with failures are defined in advance, and at the site where the software operates an error is determined from the occurrence of a defined error log (see, for example, Patent Document 1 below).

Taking a train operation management system as an example, if a train number entered by a dispatcher does not exist in the system, the system outputs a defined error log, and the failure is identified by checking that error log.

Patent Document 1: JP 2008-9854 A

Because the conventional log-based failure detection method relies in this way on error logs linked to failures, a failure that does not explicitly produce an error log is discovered late, and as a result business operations may be disrupted.

The present invention has been made to solve the above problem. Its object is to provide a failure sign detection system and a failure sign detection method that detect failure signs not only when an error log linked to a failure is explicitly generated but also when no error log is explicitly generated, so that more failures can be responded to early.

The failure sign detection system according to the present invention detects failure signs based on the log output each time a program of the software subject to failure analysis is executed. The system comprises: a detection definition creation device that creates and registers in advance, as a detection definition, the name of a process monitoring unit that defines the range in which failure signs of the program are detected and the logs used to judge whether a failure sign is present; and a log analysis device that acquires the log output each time the program of the software is executed and detects failure signs in the acquired log based on the detection definition created by the detection definition creation device. The log analysis device detects error logs linked to program failures and, based on the detection definition created in advance by the detection definition creation device, analyzes for each process monitoring unit of the program whether the detection log that is expected to be generated when processing executes normally has been generated. If the detection log has not been generated, the log analysis device detects this as a failure sign.

The failure sign detection method according to the present invention detects failure signs based on the log output each time a program of the software subject to failure analysis is executed. The method comprises: a first step of creating and registering in advance, as a detection definition, the name of a process monitoring unit that defines the range in which failure signs of the program are detected and the logs used to judge whether a failure sign is present; and a second step of acquiring the log output each time the program of the software subject to failure analysis is executed, detecting failure signs in the acquired log based on the detection definition created in advance, and issuing a notification if a failure sign is found. In the second step, a failure sign is detected not only when an error log linked to a failure of the program of the software is detected, but also when, based on the detection definition created in advance, the start and end of processing are detected for a process monitoring unit of the program and the detection log that is expected to be generated when processing executes normally has not been generated.

According to the present invention, a failure sign can be detected not only when an error log linked to a failure is explicitly generated but also when no error log is explicitly generated, so more failures can be responded to early.

FIG. 1 is a block diagram showing the configuration of the failure sign detection system according to Embodiment 1 of the present invention.
FIG. 2 is a flowchart showing the failure sign detection processing of the failure sign detection system according to Embodiment 1 of the present invention.
FIG. 3 is an explanatory diagram showing an example of the detection definition table created by the detection definition creation device and of the log monitoring state table of the failure sign detection system according to Embodiment 1 of the present invention.
FIG. 4 is a block diagram showing the configuration of the failure sign detection system according to Embodiment 2 of the present invention.
FIG. 5 is an explanatory diagram showing an example of the test definition table defined in advance in the test information analysis unit of the failure sign detection system according to Embodiment 2 of the present invention.
FIG. 6 is a block diagram showing the configuration of the failure sign detection system according to Embodiment 3 of the present invention.
FIG. 7 is an explanatory diagram showing an example of the failure determination table set in advance in the failure applicability determination unit of the failure sign detection system according to Embodiment 3 of the present invention.
FIG. 8 is a block diagram showing the configuration of the failure sign detection system according to Embodiment 4 of the present invention.
FIG. 9 is an explanatory diagram showing an example of the test definition table defined in the test information analysis unit of the failure sign detection system according to Embodiment 4 of the present invention.
FIG. 10 is a block diagram showing the configuration of the failure sign detection system according to Embodiment 5 of the present invention.
FIG. 11 is an explanatory diagram showing an example of the source reference relation management table set and registered in advance in the influence notification unit of the failure sign detection system according to Embodiment 5 of the present invention.

Embodiment 1.
FIG. 1 is a block diagram showing the configuration of the failure sign detection system according to Embodiment 1 of the present invention.

The failure sign detection system 1 according to Embodiment 1 of the present invention comprises a detection definition creation device 100 and a log analysis device 200. Although not shown, data can be passed between the detection definition creation device 100 and the log analysis device 200 over a network connection or on media such as a CD. Each of the devices 100 and 200 is implemented on a general-purpose computer.

The detection definition creation device 100 is provided so that, in order to detect failure signs from the log output each time a program of the software 900 subject to failure analysis is executed, items such as the name of the process monitoring unit that defines the range in which failure signs of the program are detected and the logs used to judge whether a failure sign is present can be created and registered in advance as a detection definition. It comprises a definition input unit 11 and a detection definition creation unit 12.

The log analysis device 200, on the other hand, acquires the log output each time a program of the software 900 subject to failure analysis is executed, detects failure signs in the acquired log based on the detection definition created by the detection definition creation device 100, and issues a notification if a failure sign is found. It comprises a log acquisition unit 21, a log analysis unit 22, a detection definition management unit 23, a log state management unit 24, and a failure sign notification unit 25.

Next, the specific configuration of each of the devices 100 and 200 is described.

In the detection definition creation device 100, the definition input unit 11 is an interface for entering the information needed to create a detection definition, and consists of, for example, a keyboard, a mouse, or a tablet terminal. The detection definition creation unit 12 creates a detection definition from the information entered through the definition input unit 11 and holds it as a detection definition table such as that shown in FIG. 3(a).

In the log analysis device 200, the log acquisition unit 21 acquires the various logs generated when the program of the software 900 under analysis is executed. The detection definition management unit 23 acquires the detection definition created by the detection definition creation unit 12 of the detection definition creation device 100. The log analysis unit 22 analyzes the content of the logs acquired by the log acquisition unit 21 and detects failure signs, using the detection definition information held by the detection definition management unit 23. The log state management unit 24 has a memory for managing the state of the analysis in progress in the log analysis unit 22, and holds the analysis results of the log analysis unit 22 as a log monitoring state table such as that shown in FIG. 3(b). The log analysis unit 22 then judges whether a failure sign is present based on the information registered in the log state management unit 24 as a result of the log analysis, and if it judges that a failure sign is present, notifies the failure sign notification unit 25. In response to this notification from the log analysis unit 22, the failure sign notification unit 25 reports the failure sign on a display screen (not shown) of the log analysis device 200, such as a CRT or liquid crystal display, or to an external alarm device.

Next, the failure sign detection processing of the failure sign detection system 1 is described with reference to the flowchart shown in FIG. 2. The symbol S denotes a processing step.

The processing in Embodiment 1 is broadly divided into a creation flow S1 that creates the detection definition, shown in FIG. 2(a), and an analysis flow S2 that performs failure detection based on the detection definition and the log of the software under analysis, shown in FIG. 2(b).

In the creation flow S1, the detection definition information needed to detect failure signs is entered from the definition input unit 11 into the detection definition creation unit 12 (S11), and that information is registered in the detection definition creation unit 12 (S12). The detection definition creation unit 12 thereby creates a detection definition table such as that shown in FIG. 3(a) from the information entered through the definition input unit 11. The information in this detection definition table is then transferred to the detection definition management unit 23 of the log analysis device 200 and registered there.

When the detection definition table shown in FIG. 3(a) is created, a process monitoring unit of the program is set and registered as the analysis range in which failure signs are detected. Software generally outputs trace logs indicating that processing has run when it starts and executes its various processes. These trace logs are therefore used to register, for each process monitoring unit, the logs used to judge whether a failure sign is present. Specifically, for each process monitoring unit, the log that marks the start of failure sign detection is registered as the start log, and the log that marks its end is registered as the end log. In addition, a log that is normally generated immediately before the end log when the processing of the program within the process monitoring unit executes normally is registered as the detection log.
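As a rough illustration of the table described above, the following sketch models one row per process monitoring unit with a start log, a detection log, and an end log. The class name, field names, and log messages are hypothetical; the source does not give concrete log text or a storage format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionDefinition:
    """One row of a detection definition table (cf. FIG. 3(a)); names are assumed."""
    unit_name: str      # process monitoring unit name
    start_log: str      # log that marks the start of failure sign detection
    detection_log: str  # log normally emitted just before the end log
    end_log: str        # log that marks the end of the unit


def build_definition_table(rows):
    """Index definitions by process monitoring unit name, rejecting duplicates."""
    table = {}
    for row in rows:
        if row.unit_name in table:
            raise ValueError(f"duplicate process monitoring unit: {row.unit_name}")
        table[row.unit_name] = row
    return table


# Hypothetical log messages, invented for illustration only.
definitions = build_definition_table([
    DetectionDefinition("10001", "route_set start", "route_set committed", "route_set end"),
    DetectionDefinition("10002", "timetable_load start", "timetable_load stored", "timetable_load end"),
])
```

Keying the table by unit name mirrors the way the later analysis steps look up a definition for each process monitoring unit.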

In the analysis flow S2, the log acquisition unit 21 acquires a start log from the running software. Here, the software 900 under analysis is assumed to run as multiple tasks, one for each process monitoring unit name of the program.

If this start log matches a start log registered in the detection definition table (FIG. 3(a)) that has been transferred to the detection definition management unit 23, the log analysis unit 22 sets the current state to "start" in the log monitoring state table shown in FIG. 3(b), held by the log state management unit 24, using the process monitoring unit number as the key (S21).

The log analysis unit 22 also monitors the logs acquired by the log acquisition unit 21 for the detection log, and when the detection log occurs, sets the detection log occurrence to "yes" in the log monitoring state table (FIG. 3(b)) of the log state management unit 24 (S22).

The log analysis unit 22 further monitors the logs acquired by the log acquisition unit 21 for the end log, and when the end log occurs, sets the current state to "end" in the log monitoring state table (FIG. 3(b)) of the log state management unit 24 (S23).

The log analysis unit 22 then performs log analysis with reference to the log monitoring state table (FIG. 3(b)) of the log state management unit 24 (S24). That is, for each process monitoring unit whose state is "end", it judges whether the detection log has occurred (S25).

If the detection log occurrence is not "yes" even though the state is "end", the detection log that should have been generated by normal processing was not generated. The log analysis unit 22 therefore judges that the program of that process monitoring unit has a failure, and the failure sign notification unit 25 reports this on a display screen (not shown) of the log analysis device 200, such as a CRT or liquid crystal display, or to an external alarm device (S26).
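Steps S21 to S26 can be sketched as a small per-unit state machine. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes each log line arrives already tagged with its process monitoring unit name, that logs are matched by exact string comparison, and that error-log handling is done elsewhere. All class, method, and message names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Definition:
    start_log: str
    detection_log: str
    end_log: str


@dataclass
class MonitorState:
    state: str = "idle"     # "idle", "start", or "end"
    detected: bool = False  # True once the detection log has been seen


class LogAnalyzer:
    def __init__(self, definitions: Dict[str, Definition]):
        self.definitions = definitions
        self.states = {name: MonitorState() for name in definitions}

    def feed(self, unit: str, message: str) -> Optional[str]:
        """Process one log line; return a notification text or None."""
        definition = self.definitions.get(unit)
        if definition is None:
            return None  # unit is not monitored
        current = self.states[unit]
        if message == definition.start_log:
            # S21: mark the unit as started and clear the detection flag
            current.state, current.detected = "start", False
        elif current.state == "start" and message == definition.detection_log:
            # S22: record that the detection log occurred
            current.detected = True
        elif current.state == "start" and message == definition.end_log:
            # S23: mark the unit as ended
            current.state = "end"
            # S24 to S26: ended without the detection log means a failure sign
            if not current.detected:
                return f"failure sign: unit {unit} ended without its detection log"
        return None
```

Feeding a start log followed directly by an end log for the same unit returns a notification, while a start, detection, end sequence returns `None` at every step.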

As a concrete example of the above processing, in the log monitoring state table of FIG. 3(b), the task of the program whose process monitoring unit name is "10001" has not yet completed, so its state remains "start" and no end log has been generated yet. No detection log has been generated either, and the failure sign notification unit 25 outputs no failure sign notification.

For the program whose process monitoring unit name is "10002", the task has already completed, so the end log has been generated and the current state is "end". Nevertheless, the detection log occurrence is "no", so a failure may have occurred in the processing of the program whose process monitoring unit name is "10002". The failure sign notification unit 25 therefore outputs a failure sign notification.

For the program whose process monitoring unit name is "10003", on the other hand, the task has already completed, so the end log has been generated and the current state is "end". The detection log was also generated normally, so its occurrence is "yes". In this case the processing of the program whose process monitoring unit name is "10003" is judged to have executed normally, and the failure sign notification unit 25 outputs no failure sign notification.

For the program whose process monitoring unit name is "10011", an error log linked to a definite failure of the program was generated before the task completed. In that case the failure sign notification unit 25 outputs a failure sign notification regardless of whether the detection log has occurred.
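The four cases above reduce to a single decision rule, sketched below. The tuple layout and the function name are hypothetical, and the state and detection values given for unit "10011" are assumptions, since the text only states that its error log occurred before the task completed.

```python
def should_notify(state, detection_logged, error_logged):
    """Return True when a failure sign notification should be issued."""
    if error_logged:
        # An explicit error log always triggers a notification.
        return True
    # Otherwise notify only when the unit ended without its detection log.
    return state == "end" and not detection_logged


# Rows as described in the text for FIG. 3(b):
# unit name -> (current state, detection log seen, error log seen)
rows = {
    "10001": ("start", False, False),  # still running
    "10002": ("end", False, False),    # ended, detection log missing
    "10003": ("end", True, False),     # ended normally
    "10011": ("start", False, True),   # error log before completion (state assumed)
}

notified = [unit for unit, row in rows.items() if should_notify(*row)]
print(notified)  # ['10002', '10011']
```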

As described above, in Embodiment 1 the log analysis device 200 detects the start and end of processing for each process monitoring unit of the program based on the detection definition created in advance by the detection definition creation device 100, and when it finds that the detection log expected from normal processing has not been generated, it detects this as a failure sign. As a result, a failure sign can be detected not only when an error log linked to a failure is detected but also when no error log is explicitly generated, which makes an early response possible for more failures.

Embodiment 2.
FIG. 4 is a block diagram showing the configuration of the failure sign detection system according to Embodiment 2. Parts corresponding to the configuration of Embodiment 1 shown in FIG. 1 are given the same reference numerals.

Embodiment 2 is characterized by adding a test information analysis device 300 to the configuration of Embodiment 1, so that a detection definition table such as that shown in FIG. 3(a) can be created easily as the software 900 under analysis is tested.

A product on which the software 900 under analysis is installed is tested before shipment to check whether the software 900 operates as specified. In this testing, the software 900 is confirmed to operate correctly not only by entering data that is expected to produce a normal result, but also by deliberately entering data that produces an incorrect result so as to cause an error.

In Embodiment 2, therefore, the test information analysis device 300 comprises a test information input unit 31, a test information analysis unit 32, and a test information transmission unit 33.

The test information input unit 31 is an interface for entering the test information needed to define the test content, and consists of, for example, a keyboard, a mouse, or a tablet terminal. The test information entered to define the test content consists of a test number associated with each program subject to process monitoring, the process monitoring unit name, and whether the test result of that program is expected to be normal or abnormal.

The test information analysis unit 32 holds the test information defining the test content, entered through the test information input unit 31, in memory as a test definition table such as that shown in FIG. 5. When the program of the software 900 is tested, based on this test definition table, to check whether it operates as specified for each process monitoring unit, the test information analysis unit 32 extracts the log information obtained when the test result of executing the program for each process monitoring unit is normal.

That is, the test information analysis unit 32 extracts the detection log obtained when the processing result of the program executed for each process monitoring unit is normal. The test information analysis unit 32 then sends the extracted information (process monitoring unit name and detection log) to the detection definition creation device 100 through the test information transmission unit 33 as detection definition information.

In the detection definition creation unit 12, a start log and an end log are then added manually for each process monitoring unit name to the detection definition information sent from the test information analysis device 300, which completes a detection definition table such as that shown in FIG. 3(a).
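One way the extraction step might work is sketched below. The source does not say how a detection log is picked out of a normal test run, so the rule used here is an assumption: a candidate detection log is a line present in every normal-result run of a unit and absent from its abnormal-result runs. The class, function, and log names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DefinedTest:
    """One row of a test definition table (cf. FIG. 5); field names are assumed."""
    test_number: int
    unit_name: str
    expected_normal: bool  # True if the test is expected to end normally


def extract_detection_logs(tests, captured_logs):
    """Return unit name -> sorted candidate detection logs.

    captured_logs maps a test number to the log lines captured in that test.
    Assumed rule: keep lines seen in every normal run and in no abnormal run.
    """
    normal, abnormal = {}, {}
    for test in tests:
        lines = set(captured_logs.get(test.test_number, []))
        if test.expected_normal:
            if test.unit_name in normal:
                normal[test.unit_name] &= lines
            else:
                normal[test.unit_name] = lines
        else:
            abnormal.setdefault(test.unit_name, set()).update(lines)
    return {
        unit: sorted(lines - abnormal.get(unit, set()))
        for unit, lines in normal.items()
    }


# Hypothetical tests and captured log lines for one process monitoring unit.
tests = [DefinedTest(1, "10001", True), DefinedTest(2, "10001", False)]
captured = {1: ["start", "stored", "end"], 2: ["start", "end"]}
candidates = extract_detection_logs(tests, captured)
```

In this sketch the start and end logs drop out because they also appear in the abnormal run, which matches the text's arrangement in which they are added separately by hand.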

A detection definition table such as that shown in FIG. 3(a) can thus be created easily while the software 900 is being tested, which reduces the effort and preparation time compared with creating the detection definition table needed for failure sign analysis independently in the detection definition creation device 100.
The other configurations and effects are the same as in Embodiment 1, so a detailed description is omitted here.

Embodiment 3.
FIG. 6 is a block diagram showing the configuration of the failure sign detection system according to Embodiment 3 of the present invention. Parts corresponding to the configuration of Embodiment 2 shown in FIG. 4 are given the same reference numerals.

When part of the processing range within the program of the software 900 under analysis is executed, there may be programs whose processing is such that, as long as no error log is generated, the overall processing result is unaffected even if the detection log is not generated, so that operation is not hindered.

In Embodiment 3, therefore, a failure applicability determination unit 26 is added inside the log analysis device 200 of the configuration of Embodiment 2. A failure determination table such as that shown in FIG. 7, which can define that no failure sign notification is needed even when the detection log is not generated, is set in advance and registered in the memory of the failure applicability determination unit 26.

For example, in the failure determination table shown in FIG. 7, the programs whose process monitoring unit names are "1001" and "1003" are set in advance so that, if the task has already completed, the current state is "end", and no detection log has been generated, a failure sign is deemed present and notification by the failure sign notification unit 25 is allowed (YES).

By contrast, the program whose process monitoring unit name is "1002" is set in advance so that notification by the failure sign notification unit 25 is not required (NO) even if no detection log has been generated when the task has already completed and the current state is "end".

Accordingly, in Embodiment 1 described above, as shown in FIG. 3(b), for the program whose process monitoring unit name is "1002", the failure sign notification unit 25 issues a notification if the task has already completed, the current state is "end", and no detection log has been generated. In Embodiment 3, by contrast, the failure applicability determination unit 26 refers to the failure determination table shown in FIG. 7, so the notification by the failure sign notification unit 25 is suppressed even when no detection log has been generated.
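The suppression described for the failure determination table can be sketched as one extra lookup before notifying. The YES/NO values follow the text for FIG. 7; the function name, the boolean encoding, and the default of notifying for units missing from the table are assumptions.

```python
# Failure determination table as described for FIG. 7:
# unit name -> notify when the detection log is missing? (YES = True, NO = False)
failure_determination = {"1001": True, "1002": False, "1003": True}


def should_notify_with_table(unit, state, detection_logged, error_logged, table):
    """Decide whether to issue a failure sign notification for one unit."""
    if error_logged:
        # Error logs are always reported; the table only covers missing detection logs.
        return True
    if state != "end" or detection_logged:
        return False
    # Assumption: a unit absent from the table is treated as YES (notify).
    return table.get(unit, True)


suppressed = should_notify_with_table("1002", "end", False, False, failure_determination)
reported = should_notify_with_table("1001", "end", False, False, failure_determination)
```

Here `suppressed` is `False` and `reported` is `True`, matching the behavior described for units "1002" and "1001".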

As described above, Embodiment 3 provides the failure applicability determination unit 26. When part of the processing range within the program is executed and its processing does not affect the overall processing result, the absence of the detection log is judged to cause no particular operational problem, and the failure sign notification can be suppressed. This saves the effort of checking notifications for every program of every process monitoring unit name, and in particular reduces the checking time when failure sign notifications would otherwise require many checks.
The other configurations and effects are the same as in Embodiment 2, so a detailed description is omitted here.

Embodiment 4.
FIG. 8 is a block diagram showing the configuration of a failure sign detection system according to Embodiment 4 of the present invention. Parts corresponding to those in the configuration of Embodiment 3 shown in FIG. 6 are given the same reference numerals.

As shown in FIG. 8, the feature of Embodiment 4 is that a source management device 400, consisting of a log addition processing unit 41 and a log addition information transmission unit 42, is added to the configuration of Embodiment 3, so that a detection definition table such as that shown in FIG. 3(a) can be created automatically as the software 900 to be analyzed is tested.

Here, the log addition processing unit 41 receives, for each process monitoring unit corresponding to a test number, information on a start log indicating the start point and an end log indicating the end point of the program's source code. The log addition information transmission unit 42 transmits the start log and end log information added by the log addition processing unit 41 to the test information analysis device 300.

The start log and end log information transmitted from the log addition processing unit 41 is taken into the test information analysis unit 32 via the test information input unit 31. As shown in FIG. 9, the test definition table held by the test information analysis unit 32 therefore registers not only the test number associated with each program subject to process monitoring, the process monitoring unit name, and whether the test result of that program is normal or abnormal (see FIG. 5), but also the source code of the program for each process monitoring unit corresponding to the test number, together with the start log indicating the start point and the end log indicating the end point of that source code.

As described in Embodiment 2, when a test is run to check whether the program of the software 900 to be analyzed operates as specified for each process monitoring unit, the test information analysis device 300 extracts the log information obtained when the test result of program execution for each process monitoring unit is normal. The information extracted in this case includes the process monitoring unit name, the detection log, the start log, and the end log. Because the extracted information is transmitted to the detection definition creation device 100 as a detection definition, the detection definition creation unit 12 automatically creates a detection definition table such as that shown in FIG. 3(a).
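The extraction step described above might be sketched as follows. This is a hedged illustration rather than the actual processing of the test information analysis unit 32: the `UnitTestRun` record, its field names, the log strings, and the rule that every log other than the start and end logs of a normal run is treated as a detection log are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class UnitTestRun:
    """Hypothetical row of the test definition table (cf. FIG. 9)."""
    test_number: str
    unit_name: str          # process monitoring unit name
    result: str             # "normal" or "abnormal"
    start_log: str
    end_log: str
    logs: Tuple[str, ...]   # logs output while the test ran


def build_detection_definitions(runs: Iterable[UnitTestRun]) -> Dict[str, dict]:
    """Build a detection definition table from test runs that ended normally."""
    table: Dict[str, dict] = {}
    for run in runs:
        if run.result != "normal":
            continue  # only normal runs show which logs are expected
        # Assumption: logs other than the start/end markers are the
        # detection logs expected during normal processing.
        detection_logs = [
            log for log in run.logs
            if log not in (run.start_log, run.end_log)
        ]
        table[run.unit_name] = {
            "start_log": run.start_log,
            "end_log": run.end_log,
            "detection_logs": detection_logs,
        }
    return table


if __name__ == "__main__":
    runs = [
        UnitTestRun("T1", "1001", "normal", "AAA start", "AAA end",
                    ("AAA start", "AAA processed", "AAA end")),
        UnitTestRun("T2", "1002", "abnormal", "BBB start", "BBB end",
                    ("BBB start", "BBB error")),
    ]
    print(build_detection_definitions(runs))
```

Deriving the table from normal test runs is what removes the manual step: each row carries the unit name, start log, end log, and detection logs that the log analysis device later checks against.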

As described above, in Embodiment 4 the source management device 400 is added to the configuration of Embodiment 3, so that a detection definition table such as that shown in FIG. 3(a) is created automatically as the software 900 to be analyzed is tested. Compared with creating the detection definition table needed for failure sign analysis independently in the detection definition creation device 100, this greatly reduces the effort and preparation time.
The other configurations and effects are the same as in Embodiment 3, so a detailed description is omitted here.

Embodiment 5.
FIG. 10 is a block diagram showing the configuration of a failure sign detection system according to Embodiment 5 of the present invention. Parts corresponding to those in the configuration of Embodiment 4 shown in FIG. 8 are given the same reference numerals.

In Embodiment 5, as shown in FIG. 10, an influence notification unit 43 is added inside the source management device 400 of the configuration of Embodiment 4. The influence notification unit 43 holds, registered in memory in advance, a source reference relation management table such as that shown in FIG. 11, which defines the relation between the source code associated with each process monitoring unit name and the source code of the programs that reference it (reference sources) and that it references (reference destinations). The information in this source reference relation management table is notified to the failure determination unit 26.

Suppose that the log analysis by the log analysis unit 22 finds that no detection log has been generated even though the task of a program with a given process monitoring unit name has already completed and its current state is "finished", and that the failure determination table shown in FIG. 7, registered in advance in the failure determination unit 26, indicates that notification of a failure sign is allowed (YES) when the detection log has not been generated. In that case, the failure determination unit 26 refers to the source reference relation management table shown in FIG. 11, notified from the influence notification unit 43, and reports the source code of the program with that process monitoring unit name, together with the source code of its reference source and reference destination programs, via the failure sign notification unit 25 on a display screen (not shown) such as a CRT or liquid crystal display.

For example, when it is determined that the program with the process monitoring unit name "10001" shows a failure sign, the source reference relation management table shown in FIG. 11 is used to report its source code "AAA.c", the source code "main.c" of the program that is its reference source, and the source code "AAA-1.c" and "AAA-2.c" of its reference destination programs.
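The lookup in this example can be sketched as below. The table contents for "10001" are taken from the description. The dictionary layout, the key names, and the function name `sources_to_review` are hypothetical and do not reflect the actual format of FIG. 11.

```python
# Hypothetical stand-in for the source reference relation management
# table of FIG. 11. Only the "10001" row is taken from the description.
SOURCE_REFERENCE_TABLE = {
    "10001": {
        "source": "AAA.c",
        "reference_sources": ["main.c"],                   # code that references AAA.c
        "reference_destinations": ["AAA-1.c", "AAA-2.c"],  # code AAA.c references
    },
}


def sources_to_review(unit_name, table=SOURCE_REFERENCE_TABLE):
    """List the source files to report when a unit shows a failure sign."""
    entry = table.get(unit_name)
    if entry is None:
        # Assumption: an unknown unit yields nothing extra to report.
        return []
    return [
        entry["source"],
        *entry["reference_sources"],
        *entry["reference_destinations"],
    ]


if __name__ == "__main__":
    print(sources_to_review("10001"))
    # ['AAA.c', 'main.c', 'AAA-1.c', 'AAA-2.c']
```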

As a result, when no detection log is generated after the source code of a program with a given process monitoring unit name has been changed, the source code of the reference source and reference destination programs of the changed source code can also be re-checked as possibly showing a failure sign. Failure signs in peripheral processing programs that may arise from the source code change can therefore be detected without omission, which makes early response possible for a larger number of failures.
The other configurations and effects are the same as in Embodiment 4, so a detailed description is omitted here.

The present invention is not limited to the configurations of Embodiments 1 to 5 described above. Without departing from the spirit of the invention, part of the configuration of each of Embodiments 1 to 5 may be modified or omitted, and the configurations of Embodiments 1 to 5 may be combined as appropriate.

1 failure sign detection system, 100 detection definition creation device, 11 definition input unit,
12 detection definition creation unit, 200 log analysis device, 21 log acquisition unit, 22 log analysis unit, 23 detection definition management unit, 24 log state management unit, 25 failure sign notification unit,
26 failure determination unit, 300 test information analysis device, 31 test information input unit,
32 test information analysis unit, 33 test information transmission unit, 400 source management device,
41 log addition processing unit, 42 log addition information transmission unit, 43 influence notification unit.

Claims

A failure sign detection system that detects a failure sign based on a log output each time a program of software subject to failure analysis is executed, the system comprising:
a detection definition creation device that creates and registers in advance, as a detection definition, the name of a process monitoring unit that defines a range in which a failure sign of the program is detected and a log used to determine whether a failure sign is present; and a log analysis device that acquires the log output each time the program of the software is executed and detects a failure sign in the acquired log based on the detection definition created by the detection definition creation device,
wherein the log analysis device detects an error log associated with a failure of the program, analyzes, based on the detection definition created in advance by the detection definition creation device, whether a detection log that is expected to be generated when normal processing is executed has been generated for each process monitoring unit of the program, and detects a failure sign when the detection log has not been generated.

The failure sign detection system according to claim 1, further comprising a test information analysis device, wherein the test information analysis device comprises: a test information input unit that inputs test information defining test contents for each process monitoring unit of the program of the software; a test information analysis unit that, when a test is run based on the test information input by the test information input unit to check whether the program of the software operates as specified for each process monitoring unit, extracts the log information obtained when the test result of program execution for each process monitoring unit is normal; and a test information transmission unit that transmits the log information extracted by the test information analysis unit to the detection definition creation device as detection definition information.

The failure sign detection system according to claim 2, further comprising a log addition processing unit that adds, to the test information input to the test information input unit of the test information analysis device, information on a start log indicating the start point and an end log indicating the end point of the source code of the program for each process monitoring unit.

The failure sign detection system according to any one of claims 1 to 3, further comprising a failure determination unit that suppresses notification that a failure sign is present when, even though the log analysis device has determined that no detection log has been generated, notification of the failure sign is unnecessary for system operation.

The failure sign detection system according to claim 1, further comprising an influence notification unit that defines the relation between a specific source code and the source code of its reference source and reference destination, wherein, when detecting a failure sign because no detection log has been generated, the log analysis device also notifies, based on the information of the influence notification unit, that a failure sign may have occurred in the reference source and reference destination source code of the specific source code.

A failure sign detection method comprising: a first step of creating and registering in advance, as a detection definition, the name of a process monitoring unit that defines a range in which a failure sign of a program is detected and a log used to determine whether a failure sign is present, in order to detect a failure sign based on a log output each time a program of software subject to failure analysis is executed; and
a second step of acquiring the log output each time the program of the software subject to failure analysis is executed, detecting a failure sign in the acquired log based on the detection definition created in advance, and giving notification when a failure sign is present,
wherein, in the second step, not only is an error log associated with a failure of the program of the software detected, but the start and end of processing for each process monitoring unit of the program are also detected based on the detection definition created in advance, and a failure sign is detected when a detection log that is expected to be generated when normal processing is executed has not been generated.