JP2016071640A

JP2016071640A - Information processing system, logging control program, and logging control method

Info

Publication number: JP2016071640A
Application number: JP2014200524A
Authority: JP
Inventors: 敬藏小池; Keizo Koike; 愛子中野; Aiko Nakano
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2014-09-30
Filing date: 2014-09-30
Publication date: 2016-05-09
Anticipated expiration: 2034-09-30
Also published as: JP6330607B2

Abstract

PROBLEM TO BE SOLVED: To provide an information processing system, a logging control program, and a logging control method capable of protecting log information required for specifying a cause of a failure.

SOLUTION: The information processing system includes a first logging device and a second logging device. The first logging device is provided with: a reception part that receives, from a first device, notification that a second device has not responded to a first message transmitted from the first device to the second device; and a processing part that performs processing for extracting, from stored log information relating to the first device, log information corresponding to a second message received from the second device by the first device before the transmission of the first message. The second logging device is provided with: a reception device that receives transmitted first identification information; and a processing part that specifies, from stored log information relating to the second device, log information corresponding to the message transmitted to the first device after the transmission of the second message on the basis of the received first identification information.

SELECTED DRAWING: Figure 6

Description

The present invention relates to an information processing system, a logging control program, and a logging control method.

For example, a business system that provides a service to users may store information about the operation of the processes it executes (hereinafter also called operation information); the stored operation information is hereinafter also called log information. When an operation manager detects that a failure has occurred in the business system, the manager analyzes the log information recorded before and after the failure. This allows the operation manager to identify the cause of the failure.

Such a business system may also perform processing to protect log information that meets a predetermined condition (hereinafter also called protection processing), so that log information about the operation before and after a failure is not overwritten by new operation information. As a result, the operation manager can identify the cause of the failure from the log information even if there is a time lag between the occurrence of the failure and the analysis of the log information (see, for example, Patent Documents 1 and 2).

Patent Document 1: JP 7-64825 A
Patent Document 2: JP 2012-168907 A

When such a business system is built across a plurality of machines (physical machines or virtual machines), the machines cooperate with one another to perform the processing that provides the service to users. When a failure is detected on any one of the machines that make up the business system, the business system also performs protection processing on the log information of the other machines. As a result, the operation manager can identify the cause of the failure even when the cause lies in a machine other than the one that detected the failure.

However, when a business system is built across a plurality of machines, the timing at which a failure is detected may differ from machine to machine. In that case, a machine that detects the failure late also executes the log information protection processing later than the other machines. On such a machine, the log information needed to identify the cause of the failure may therefore be overwritten before it is protected.

Accordingly, an object of one embodiment is to provide an information processing system, a logging control program, and a logging control method that can identify the log information necessary for identifying the cause of a failure.

According to one aspect of the embodiment, an information processing system includes a first logging device and a second logging device.
The first logging device includes:
a storage unit that stores log information about a first device;
a receiving unit that receives, from the first device, a notification that a second device did not respond to a first message transmitted from the first device to the second device;
a processing unit that extracts, from the stored log information about the first device, log information corresponding to a second message that the first device received from the second device before transmitting the first message; and
a transmitting unit that transmits first identification information capable of identifying the extracted second message.
The second logging device includes:
a storage unit that stores log information about the second device;
a receiving unit that receives the transmitted first identification information; and
a processing unit that, based on the received first identification information, identifies, from the stored log information about the second device, log information corresponding to a message transmitted to the first device after the transmission of the second message.

According to one aspect, the log information necessary to identify the cause of a failure can be identified.

FIG. 1 is a diagram illustrating the overall configuration of the information processing system.
FIG. 2 is a diagram explaining an operation example when a failure occurs.
FIG. 3 is a diagram illustrating the hardware configuration of the information processing system.
FIGS. 4 and 5 are functional block diagrams of the physical machines of FIG. 3.
FIG. 6 is a sequence chart explaining the outline of the logging control processing in the first embodiment.
FIG. 7 is a diagram explaining the outline of the logging control processing in the first embodiment.
FIGS. 8 to 13 are flowcharts explaining the details of the logging control processing in the first embodiment.
FIGS. 14 to 18 are diagrams explaining the details of the logging control processing in the first embodiment.
FIGS. 19 to 21 are flowcharts explaining the logging control processing in the second embodiment.
FIG. 22 is a diagram explaining the logging control processing in the second embodiment.
FIGS. 23 and 24 are flowcharts explaining the processing performed when identification information cannot be transmitted.

[Configuration of information processing system]
FIG. 1 is a diagram illustrating the overall configuration of the information processing system. The information processing system 100 shown in FIG. 1 includes a physical machine 1 (hereinafter also called computer 1) and a physical machine 2 (hereinafter also called computer 2). On the physical machine 1, a monitoring target processing unit 11 (hereinafter also called the first monitoring target processing unit 11), which operates to provide a service to users, and a logging control unit 12 (hereinafter also called the first logging control unit 12), which operates to accumulate the operation information of the monitoring target processing unit 11, are running. Likewise, on the physical machine 2, a monitoring target processing unit 21 (hereinafter also called the second monitoring target processing unit 21), which operates to provide a service to users, and a logging control unit 22 (hereinafter also called the second logging control unit 22), which operates to accumulate the operation information of the monitoring target processing unit 21, are running. In the example of FIG. 1, the monitoring target processing units 11 and 21 cooperate with each other to perform the processing that provides the service to users. The logging control unit 12 stores the operation information of the monitoring target processing unit 11 in a storage medium 13 provided in the physical machine 1, and the logging control unit 22 stores the operation information of the monitoring target processing unit 21 in a storage medium 23 provided in the physical machine 2.

Note that the monitoring target processing unit 11, the logging control unit 12, the monitoring target processing unit 21, and the logging control unit 22 may each operate on a different physical machine (for example, a first device, a first logging device, a second device, and a second logging device, respectively). In this case, the storage medium 13 and the storage medium 23 may be provided in the physical machines on which the logging control unit 12 and the logging control unit 22 operate (the first logging device and the second logging device), respectively. Furthermore, the monitoring target processing unit 11, the logging control unit 12, the monitoring target processing unit 21, and the logging control unit 22 may operate in virtual machines to which resources of the physical machine 1 or the physical machine 2 are allocated.

[Operation example when a failure occurs]
Next, an operation example when a failure occurs in the information processing system 100 will be described. FIG. 2 is a diagram explaining an operation example when a failure occurs. Only the differences from FIG. 1 are described below.

In the example shown in FIG. 2, the monitoring target processing unit 11 determines that a failure has occurred in itself when, for example, the result of a process it executed is abnormal. In this case, the monitoring target processing unit 11 notifies the logging control unit 12 of the failure. On receiving the notification, the logging control unit 12 performs protection processing on, for example, the log information stored in the storage medium 13 for the period before and after the time at which the failure occurred in the monitoring target processing unit 11. That is, the logging control unit 12 protects the log information so that the entries recorded before and after the failure time are not overwritten by new operation information. As a result, even if there is a time lag between the occurrence of the failure and the analysis of the log information, the operation manager can refer to the log information recorded before and after the failure time and identify the cause of the failure.
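The protection processing described above can be illustrated with a minimal sketch. It assumes a fixed-capacity in-memory log in which the oldest unprotected entry is overwritten first and protection is a per-entry flag set for a time window around the failure; the names `LogEntry`, `RingLog`, `protect_around`, and `window` are hypothetical and do not appear in the publication.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class LogEntry:
    timestamp: float
    text: str
    protected: bool = False  # True means the entry must not be overwritten


class RingLog:
    """Fixed-capacity log (assumed model): the oldest unprotected entry is overwritten first."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries: list[LogEntry] = []

    def append(self, timestamp: float, text: str) -> bool:
        """Store a new entry; return False if every slot is protected."""
        if len(self.entries) >= self.capacity:
            for i, entry in enumerate(self.entries):
                if not entry.protected:
                    del self.entries[i]  # overwrite the oldest unprotected entry
                    break
            else:
                return False  # nothing may be overwritten
        self.entries.append(LogEntry(timestamp, text))
        return True

    def protect_around(self, failure_time: float, window: float) -> int:
        """Protect entries recorded within +/- window of the failure time."""
        count = 0
        for entry in self.entries:
            if abs(entry.timestamp - failure_time) <= window:
                entry.protected = True
                count += 1
        return count
```

With a capacity of four and entries at times 0 to 3, calling `protect_around(2.0, 1.0)` protects the entries at times 1, 2, and 3, so later appends can only displace unprotected entries.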

On the other hand, the monitoring target processing unit 21 operating on the physical machine 2 detects that a failure has occurred in the monitoring target processing unit 11 through, for example, a periodic activation check (alive monitoring) of the monitoring target processing unit 11. Depending on the interval at which the activation check is executed, the monitoring target processing unit 21 may take some time to detect the failure that occurred in the monitoring target processing unit 11. In that case, the notification from the monitoring target processing unit 21 to the logging control unit 22 is delayed, and so is the timing at which the logging control unit 22 performs the log information protection processing. The log information recorded before and after the failure of the monitoring target processing unit 11 may therefore be overwritten by new operation information. Consequently, when investigating the cause of the failure that occurred in the monitoring target processing unit 11, the operation manager may be unable to refer to the log information recorded before and after that failure.
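The risk described above can be made concrete with a back-of-the-envelope calculation. The sketch below assumes a fixed-capacity log that is filled at a constant rate, which the publication does not state; the function names and the example figures are hypothetical.

```python
def retention_seconds(capacity_entries: int, entries_per_second: float) -> float:
    """Time until the oldest entry is overwritten, assuming a constant write rate."""
    if entries_per_second <= 0:
        raise ValueError("entries_per_second must be positive")
    return capacity_entries / entries_per_second


def survives_detection_delay(
    capacity_entries: int,
    entries_per_second: float,
    detection_delay_seconds: float,
) -> bool:
    """True if entries written at the failure time still exist when the failure is detected."""
    return detection_delay_seconds < retention_seconds(capacity_entries, entries_per_second)
```

Under these assumptions, a log of 10,000 entries written at 500 entries per second retains 20 seconds of history, so an activation check that runs every 30 seconds could detect the failure only after the relevant entries have been overwritten.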

Therefore, in the present embodiment, when the monitoring target processing unit 21 does not respond to a notification transmitted by the monitoring target processing unit 11 (hereinafter also called the first message), the logging control unit 12 refers to the storage medium 13. The logging control unit 12 then transmits, to the logging control unit 22, identification information (hereinafter also called first identification information) corresponding to a notification that was received from the monitoring target processing unit 21 before the first message was transmitted (hereinafter also called the second message). The logging control unit 22, in turn, identifies the log information recorded after the transmission of the second message, based on the received identification information. This enables the logging control unit 22 to identify the log information related to the non-response of the monitoring target processing unit 21. The operation manager can then investigate the cause of the failure associated with the non-response based on the identified log information.
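The exchange between the two logging control units can be sketched as follows. This is an illustration under stated assumptions, not the publication's implementation: each log is modeled as a list of records with a monotonically increasing sequence number, a direction, a peer name, and a message identifier, and the names `Record`, `last_received_before`, and `sent_after` are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    seq: int          # position in the local log (assumed to increase over time)
    direction: str    # "send" or "recv"
    peer: str         # name of the other device
    message_id: str   # identifier carried by the message (assumed to exist)


def last_received_before(log: list[Record], peer: str, first_message_id: str) -> Optional[str]:
    """First logging side: ID of the last message received from `peer`
    before the unanswered first message was sent."""
    cutoff = next(
        (r.seq for r in log if r.direction == "send" and r.message_id == first_message_id),
        None,
    )
    if cutoff is None:
        return None
    received = [r for r in log if r.direction == "recv" and r.peer == peer and r.seq < cutoff]
    return max(received, key=lambda r: r.seq).message_id if received else None


def sent_after(log: list[Record], peer: str, second_message_id: str) -> list[Record]:
    """Second logging side: records of messages sent to `peer` after the second message."""
    anchor = next(
        (r.seq for r in log if r.direction == "send" and r.message_id == second_message_id),
        None,
    )
    if anchor is None:
        return []
    return [r for r in log if r.direction == "send" and r.peer == peer and r.seq > anchor]
```

In this model, the value returned by `last_received_before` plays the role of the first identification information, and the records returned by `sent_after` are the candidates for protection on the second machine.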

[Hardware configuration of information processing system]
Next, the configuration of the information processing system 100 will be described. FIG. 3 is a diagram illustrating the hardware configuration of the information processing system.

The physical machine 1 includes a CPU 101, which is a processor, a memory 102, an external interface (I/O unit) 103, and a storage medium 104. These components are connected to one another via a bus 105. The storage medium 104 stores, for example in a program storage area (not shown) within the storage medium 104, a program 110 (hereinafter also called the logging control program 110) for performing, among other things, processing that controls the accumulation of log information (hereinafter also called logging control processing). As shown in FIG. 3, when executing the program 110, the CPU 101 loads the program 110 from the storage medium 104 into the memory 102 and performs the logging control processing and the like in cooperation with the program 110. The storage medium 104 also has an information storage area 130 that stores, for example, information used when the logging control processing and the like are performed.

Like the physical machine 1, the physical machine 2 includes a CPU 201, which is a processor, a memory 202, an external interface (I/O unit) 203, and a storage medium 204. These components are connected to one another via a bus 205. The storage medium 204 stores, for example in a program storage area (not shown) within the storage medium 204, a program 210 (hereinafter also called the logging control program 210) for performing the logging control processing and the like. As shown in FIG. 3, when executing the program 210, the CPU 201 loads the program 210 from the storage medium 204 into the memory 202 and performs the logging control processing and the like in cooperation with the program 210. The storage medium 204 also has an information storage area 230 that stores, for example, information used when the logging control processing and the like are performed.

[Software configuration of physical machine]
FIGS. 4 and 5 are functional block diagrams of the physical machines of FIG. 3. FIG. 4 is a functional block diagram of the physical machine 1, and FIG. 5 is a functional block diagram of the physical machine 2.

By cooperating with the program 110, the CPU 101 of the physical machine 1 operates as, for example, a log information acquisition unit 111, an information receiving unit 112 (hereinafter also called the receiving unit 112), a log information extraction unit 113 (hereinafter also called the processing unit 113), an information transmitting unit 114 (hereinafter also called the transmitting unit 114), and a log information protection unit 115. By cooperating with the program 110, the CPU 101 also operates as, for example, a state acquisition unit 116, a state update unit 117, a state determination unit 118, an activation detection unit 119, and a dump acquisition unit 120. The information storage area 130 (hereinafter also called the storage unit 130) stores, for example, log information 131, protection information 132, and state information 133. The log information acquisition unit 111, the information receiving unit 112, the log information extraction unit 113, the information transmitting unit 114, the log information protection unit 115, the state acquisition unit 116, the state update unit 117, the state determination unit 118, the activation detection unit 119, and the dump acquisition unit 120 correspond to the logging control unit 12 in FIG. 1. The information storage area 130 corresponds to the storage medium 13 in FIG. 1.

Like the CPU 101, the CPU 201 of the physical machine 2 cooperates with the program 210 to operate as, for example, a log information acquisition unit 211, an information receiving unit 212 (hereinafter also called the receiving unit 212), a log information extraction unit 213 (hereinafter also called the processing unit 213), an information transmitting unit 214, and a log information protection unit 215. By cooperating with the program 210, the CPU 201 also operates as, for example, a state acquisition unit 216, a state update unit 217, a state determination unit 218, an activation detection unit 219, and a dump acquisition unit 220. The information storage area 230 (hereinafter also called the storage unit 230) stores, for example, log information 231, protection information 232, and state information 233. The log information acquisition unit 211, the information receiving unit 212, the log information extraction unit 213, the information transmitting unit 214, the log information protection unit 215, the state acquisition unit 216, the state update unit 217, the state determination unit 218, the activation detection unit 219, and the dump acquisition unit 220 correspond to the logging control unit 22 in FIG. 1. The information storage area 230 corresponds to the storage medium 23 in FIG. 1.

First, the log information acquisition unit 111, the information receiving unit 112, the log information extraction unit 113, and the information transmitting unit 114 of the physical machine 1 will be described.

The log information acquisition unit 111 of the physical machine 1 acquires, for example, the log information 131 about the monitoring target processing unit 11 shown in FIG. 1 from the monitoring target processing unit 11 and stores it in the information storage area 130. The log information acquisition unit 111 acquires, for example, information about the history of operations performed by the monitoring target processing unit 11 (hereinafter also called trace information) as the log information 131. By referring to the log information 131, the operation manager can trace the operation of the monitoring target processing unit 11 before and after the occurrence of a failure. A specific example of the log information 131 will be described later.

The information receiving unit 112 of the physical machine 1 receives, for example, notifications from the monitoring target processing unit 11. Specifically, the information receiving unit 112 receives from the monitoring target processing unit 11, for example, a notification that the monitoring target processing unit 21 of the physical machine 2 did not respond to the first message transmitted to it by the monitoring target processing unit 11 (hereinafter also called a no-response notification). When the information receiving unit 112 receives a no-response notification from the monitoring target processing unit 11, it can thereby determine that a failure has occurred in the monitoring target processing unit 21. When the information receiving unit 112 determines that a failure has occurred in the monitoring target processing unit 21, protection processing of the log information 131 and 231 can be performed as described later.

The log information extraction unit 113 of the physical machine 1 performs, for example, processing to extract the log information 131 about the monitoring target processing unit 11 from the storage medium 13. Specifically, when the information receiving unit 112 receives a no-response notification from the monitoring target processing unit 11, the log information extraction unit 113 extracts, for example, the log information 131 corresponding to the second message that the monitoring target processing unit 11 received from the monitoring target processing unit 21 before transmitting the first message.

The information transmitting unit 114 of the physical machine 1 transmits, for example, identification information capable of identifying the second message extracted by the log information extraction unit 113.

Next, the log information acquisition unit 211, the information receiving unit 212, the log information extraction unit 213, and the information transmitting unit 214 of the physical machine 2 will be described.

Like the log information acquisition unit 111, the log information acquisition unit 211 of the physical machine 2 acquires, for example, the log information 231 about the monitoring target processing unit 21 shown in FIG. 1 and stores it in the information storage area 230.

The information receiving unit 212 of the physical machine 2 receives, for example, the identification information transmitted by the information transmitting unit 114 of the physical machine 1.

The log information extraction unit 213 of the physical machine 2 identifies, for example, based on the identification information received by the information receiving unit 212, the log information 231 corresponding to messages transmitted to the monitoring target processing unit 11 after the transmission of the second message, from among the log information 231 about the monitoring target processing unit 21 stored in the storage medium 23. In this way, the log information extraction unit 213 can identify the messages that the monitoring target processing unit 21 transmitted to the monitoring target processing unit 11 after the second message, which is the last message that can be judged to have been transmitted normally. This enables the log information extraction unit 213 to extract the log information related to the first message from the information storage area 230. The operation manager can then investigate the cause of the failure based on the log information 131 and 231.

The above example describes the case where the monitoring target processing unit 11 detects a failure that occurred in the monitoring target processing unit 21, but the monitoring target processing units 11 and 21 may each monitor the other for failures. In this case, the logging control unit that receives the no-response notification from the monitoring target processing unit that detected the failure may function as the logging control unit 12 described above, and the logging control unit that stores the operation information of the other monitoring target processing unit may function as the logging control unit 22 described above.

Next, other functions of the physical machine 1 and the physical machine 2 will be described.

The log information protection unit 115 of the physical machine 1 performs, for example, protection processing on the log information 131 extracted by the log information extraction unit 113. Specifically, based on the protection information 132, which sets the range of the log information 131 to be protected, the log information protection unit 115 prohibits overwriting, in memory, of the log information 131 extracted by the log information extraction unit 113. The log information protection unit 115 can thereby prevent the log information stored before and after the failure from being overwritten by new log information. Likewise, the log information protection unit 215 of the physical machine 2 prohibits overwriting, in memory, of the log information 231 extracted by the log information extraction unit 213, based on the protection information 232, which sets the range of the log information 231 to be protected. A specific example of the protection information 132 is described later.

The state acquisition unit 116 of the physical machine 1 acquires, for example, information on the transmission and reception state of a message transmitted by the monitoring target processing unit 11 (hereinafter also referred to as a third message). Specifically, the state acquisition unit 116 periodically acquires information indicating whether the monitoring target processing unit 11 is waiting for a response to a third message that it has transmitted. The destination of the third message may be, for example, the monitoring target processing unit 21.

In response to the transmission of the third message by the monitoring target processing unit 11, the state update unit 117 of the physical machine 1 stores, for example, information indicating that the monitoring target processing unit 11 is waiting for a response to the third message in the information storage area 130 as state information 133. In response to the reception of the response to the third message, the state update unit 117 erases (updates) the corresponding information in the state information 133.

The state determination unit 118 of the physical machine 1 detects, for example, the presence of state information 133 for which the time elapsed since it was stored exceeds a predetermined time (for example, one minute). By detecting such state information 133, the state determination unit 118 can determine that a failure has occurred at the destination of the third message (for example, the monitoring target processing unit 21). The state determination unit 118 can then, for example, instruct the log information extraction unit 113 to extract the log information 131. The state acquisition unit 216, the state update unit 217, and the state determination unit 218 of the physical machine 2 perform the same processing as the state acquisition unit 116, the state update unit 117, and the state determination unit 118, respectively, so their description is omitted.
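
The interplay of the state update unit and the state determination unit can be pictured with a minimal Python sketch. The class name `PendingTable`, its method names, and the use of plain numbers as timestamps are illustrative assumptions and do not appear in the embodiment; only the one-minute threshold is taken from the text.

```python
from dataclasses import dataclass, field

TIMEOUT_SECONDS = 60.0  # the text's example threshold of one minute


@dataclass
class PendingTable:
    """Hypothetical stand-in for the state information 133."""

    stored_at: dict = field(default_factory=dict)  # message key -> time stored

    def on_send(self, key, now):
        # State update: record that a response to this message is awaited.
        self.stored_at[key] = now

    def on_response(self, key):
        # State update: erase the entry once the response has arrived.
        self.stored_at.pop(key, None)

    def overdue(self, now, timeout=TIMEOUT_SECONDS):
        # State determination: entries stored for longer than the threshold.
        return [k for k, t in self.stored_at.items() if now - t > timeout]
```

In this sketch, a non-empty result from `overdue` would correspond to the point at which log extraction is requested.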

The activation detection unit 119 of the physical machine 1 periodically performs, for example, activation confirmation (liveness confirmation) on the monitoring target processing unit 21. The activation detection unit 119 may, for example, periodically transmit a PING to the monitoring target processing unit 21. When the activation of the monitoring target processing unit 21 cannot be confirmed (that is, when the identification information cannot be transmitted), the activation detection unit 119 causes the information transmission unit 114 to hold the transmission of information. When the activation of the monitoring target processing unit 21 is subsequently confirmed, the activation detection unit 119 transmits (retransmits) the identification information. The activation detection unit 219 of the physical machine 2 is the same as the activation detection unit 119 of the physical machine 1, so its description is omitted.
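
The hold-and-retransmit behavior described above might look like the following Python sketch. `IdentificationSender`, `is_peer_alive`, and `send` are hypothetical names; the liveness probe is passed in as a callable, so the sketch does not assume any particular PING implementation.

```python
from collections import deque


class IdentificationSender:
    """Holds identification information until the peer is confirmed alive."""

    def __init__(self, is_peer_alive, send):
        self._is_peer_alive = is_peer_alive  # assumed probe, e.g. PING-based
        self._send = send                    # assumed transport callable
        self._waiting = deque()

    def submit(self, identification):
        # Queue the information, then try to deliver it immediately.
        self._waiting.append(identification)
        self.flush()

    def flush(self):
        # Intended to be called on every periodic liveness check as well.
        while self._waiting and self._is_peer_alive():
            self._send(self._waiting.popleft())
```

Queuing rather than dropping is a design choice of this sketch; the text only states that transmission waits and that the information is sent once activation is confirmed.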

The dump acquisition unit 120 of the physical machine 1 acquires, for example, a memory dump of the memory state of the monitoring target processing unit 11 in response to the information reception unit 112 receiving the notification. The dump acquisition unit 220 of the physical machine 2 is the same as the dump acquisition unit 120 of the physical machine 1, so its description is omitted.

[Outline of First Embodiment]
Next, an outline of the first embodiment will be described. FIG. 6 is a sequence chart illustrating the outline of the logging control process in the first embodiment. FIG. 7 is a diagram illustrating the outline of the logging control process in the first embodiment. For FIG. 7, only the differences from FIG. 1 are described below.

As shown in FIGS. 6 and 7, the monitoring target processing unit 11 transmits, for example, the log information 131 to the logging control unit 12 (S1). The monitoring target processing unit 21 likewise transmits the log information 231 to the logging control unit 22 (S2). The log information 131 and 231 is, for example, trace information on the operation of the monitoring target processing unit 11 or the monitoring target processing unit 21. The monitoring target processing units 11 and 21 in the example of FIG. 6 transmit the log information 131 and 231 to the logging control units 12 and 22 continually, regardless of whether S3 to S7 described below occur.

When the monitoring target processing unit 11 detects, for example, that there is no response to the first message transmitted to the monitoring target processing unit 21 (S3, S4), it sends a no-response notification regarding the first message to the logging control unit 12 (S5). That is, upon detecting no response to the first message, the monitoring target processing unit 11 determines that a failure has occurred in the monitoring target processing unit 21 and sends the no-response notification to the logging control unit 12. The monitoring target processing unit 11 may determine that there is no response when no reply arrives within a predetermined time (for example, 30 seconds) after it transmitted the first message to the monitoring target processing unit 21.

Subsequently, the logging control unit 12 that received the no-response notification acquires, for example, from the storage medium 13 the log information 131 corresponding to the second message, which the monitoring target processing unit 11 received from the monitoring target processing unit 21 before transmitting the first message (S6). The logging control unit 12 then transmits, for example, the identification information of the second message to the logging control unit 22 (S7). The logging control unit 22 that received the identification information then identifies, for example, the log information 231 corresponding to messages transmitted to the monitoring target processing unit 11 after the second message (S8).

That is, the logging control unit 12 determines that a failure has occurred in the monitoring target processing unit 21 when, for example, it receives the no-response notification from the monitoring target processing unit 11. The logging control unit 12 then extracts the second message, which can be judged to have been received normally from the monitoring target processing unit 21, and transmits the identification information of that second message to the logging control unit 22. Based on the identification information, the logging control unit 22 identifies the log information 231 corresponding to messages transmitted to the monitoring target processing unit 11 after the transmission of the second message. The logging control unit 22 can thereby extract the log information 231 stored after the transmission of the second message, which can be judged to have been transmitted normally to the monitoring target processing unit 11, and can therefore extract the log information on the transmission and reception of the first message, which took place after the transmission of the second message. Consequently, even when the information processing system to be monitored spans a plurality of machines and a failure occurs in any one of them, the operation manager can extract the failure-related log information stored on all of the machines.

As described above, according to the first embodiment, the logging control unit 12 stores the log information 131 on the monitoring target processing unit 11. The logging control unit 12 receives, from the monitoring target processing unit 11, a notification that the monitoring target processing unit 21 did not respond to the first message transmitted from the monitoring target processing unit 11 to the monitoring target processing unit 21. The logging control unit 12 then extracts, from the log information 131 on the monitoring target processing unit 11, the log information 131 corresponding to the second message that the monitoring target processing unit 11 received from the monitoring target processing unit 21 before transmitting the first message, and transmits identification information that can identify the extracted second message. The logging control unit 22, in turn, identifies, based on the received identification information, the log information 231 corresponding to messages transmitted to the monitoring target processing unit 11 after the transmission of the second message, from among the stored log information 231 on the monitoring target processing unit 21. The logging control unit 22 can thereby identify the log information 231 related to the no-response event. The operation manager can therefore identify the cause of the failure associated with the no-response event based on the identified log information 231.
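
The identification step on the logging control unit 22 side (S8) can be sketched in Python as follows. The record layout (dictionaries with `session_id`, `attribute`, `kind`, and `code` keys), the content of the identification information, and the restriction to the same session are all assumptions made for illustration; the embodiment does not specify them at this point. Following the "attribute" convention of the log information, "Request" marks a record of a message that the logged unit itself transmitted.

```python
def records_after_second_message(log_231, ident):
    """Return records of messages sent to the peer after the second message.

    log_231 is assumed to be ordered oldest first; ident is the (assumed)
    identification information describing the second message.
    """
    anchor = None
    for index, record in enumerate(log_231):
        if (record["attribute"] == "Request"
                and record["session_id"] == ident["session_id"]
                and record["kind"] == ident["kind"]
                and record["code"] == ident["code"]):
            anchor = index  # keep the most recent matching record
    if anchor is None:
        return []
    # Everything sent in the same session after the anchor is of interest.
    return [r for r in log_231[anchor + 1:]
            if r["attribute"] == "Request"
            and r["session_id"] == ident["session_id"]]
```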

[Details of First Embodiment]
Next, details of the first embodiment will be described. FIGS. 8 to 13 are flowcharts illustrating the details of the logging control process in the first embodiment. FIGS. 14 to 18 are diagrams illustrating the details of the logging control process in the first embodiment. The logging control process of FIGS. 8 to 13 is described in detail below with reference to FIGS. 14 to 18.

[Processing in the First Logging Control Unit]
First, the logging control process executed in the logging control unit 12 will be described. The log information acquisition unit 111 of the logging control unit 12 stores log information on the monitoring target processing unit 11 (S10). Specifically, the log information acquisition unit 111 stores, for example, the acquired log information 131 in the information storage area 130. A specific example of the log information 131 is described below.

The log information 131 shown in FIG. 14 has the following items: "identification ID", which identifies each piece of information included in the log information 131; "date and time", the date and time at which the information was stored in the information storage area 130; and "session ID", which identifies the session in which each message is transmitted and received. It also has "attribute", which identifies whether the information corresponds to the transmission or the reception of a message; "type", the type of the message; and "code", which identifies the content of the message. "Attribute" is set to "Request" for information corresponding to a message transmitted by the monitoring target processing unit 11, and to "Receive" for information corresponding to a message received by the monitoring target processing unit 11. "Type" is set to "data" for a message that the monitoring target processing unit 11 transmitted to another machine or the like, and to "response" for a message that responds to a message whose "type" is "data". "Type" may also be set to "awaiting response", which denotes a message that requests again the response to a message whose "type" is "data". Finally, the log information 131 shown in FIG. 14 has "storage address", which indicates the address in the information storage area 130 at which each piece of information is stored.

Specifically, in the log information 131 shown in FIG. 14, the information whose "identification ID" is 1 has "03/05 12:25:10:502" set as "date and time", "11" as "session ID", "Request" as "attribute", "data" as "type", "AAAA" as "code", and "0x11223311" as "storage address". The information whose "identification ID" is 2 has "03/05 12:25:10:503" set as "date and time", "11" as "session ID", "Receive" as "attribute", "response" as "type", "AAAA" as "code", and "0x11223322" as "storage address". The information whose "identification ID" is 4 has "03/05 12:25:10:518" set as "date and time", "15" as "session ID", "Request" as "attribute", "awaiting response" as "type", "BBBB" as "code", and "0x11223344" as "storage address". The other information in FIG. 14 is similar, so its description is omitted.
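
For readers who prefer to see the record layout as a data structure, the three entries described above can be written as the following Python sketch. The class name `LogRecord` and the field names are illustrative assumptions, and the English values "data", "response", and "awaiting response" are translations of the "type" values; the field values themselves are the ones given for identification IDs 1, 2, and 4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    """One entry of the log information 131 (field names are hypothetical)."""

    identification_id: int
    date_time: str
    session_id: int
    attribute: str        # "Request" (sent) or "Receive" (received)
    kind: str             # the "type" item
    code: str
    storage_address: int


RECORDS = [
    LogRecord(1, "03/05 12:25:10:502", 11, "Request", "data", "AAAA", 0x11223311),
    LogRecord(2, "03/05 12:25:10:503", 11, "Receive", "response", "AAAA", 0x11223322),
    LogRecord(4, "03/05 12:25:10:518", 15, "Request", "awaiting response", "BBBB", 0x11223344),
]
```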

Returning to FIG. 8, the information reception unit 112 of the logging control unit 12 waits, for example, until it receives a no-response notification from the monitoring target processing unit 11 (NO in S11). A specific example of the no-response notification is described below.

FIG. 16 shows a specific example of the no-response notification. The no-response notification shown in FIG. 16 has the following items: "error type", which indicates the type of the failure that has occurred; "own state", which indicates the state of the monitoring target processing unit 11; and "peer state", which indicates the state of the party with which the monitoring target processing unit 11 communicates. "Error type" is set to, for example, "no-response detection", which indicates that there was no response to a message transmitted by the monitoring target processing unit 11, or "abnormal response detection", which indicates that the response to a message transmitted by the monitoring target processing unit 11 had abnormal content. "Own state" and "peer state" are set to, for example, "normal", which indicates that no abnormality has occurred, "temporary abnormality", which indicates an abnormality that occurs temporarily, or "permanent abnormality", which indicates a permanent abnormality. "Normal" may include, for example, "waiting for process execution", a state of waiting until the party with which the monitoring target processing unit 11 communicates performs processing, and "waiting for response", a state of waiting for a response to a message transmitted to that party. The no-response notification shown in FIG. 16 also has "message content", which indicates the content of a message, and "session ID", which identifies the session between the monitoring target processing unit 11 and the party with which it communicates. "Message content" is set to the content of the message that triggered the monitoring target processing unit 11 to send the no-response notification to the logging control unit 12. In FIG. 16, "message content" holds the "date and time", "attribute", "type", and "code" of that message. As described later, this allows the log information extraction unit 113 to search for the log information 131 corresponding to the no-response notification based on the "message content" included in the notification.

Specifically, the no-response notification shown in FIG. 16 has "no-response detection" set as "error type", "normal" as "own state", and "permanent abnormality" as "peer state". It also has "03/05 12:25:10:539, Receive, awaiting response, CCCC" set as "message content" and "11" as "session ID".
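
As an illustration only, the notification of FIG. 16 could be represented as in the Python sketch below. The class name, the field names, and the assumption that "message content" is a comma-separated string are not stated in the embodiment; the values are those given above, with the state and type names translated into English.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NoResponseNotification:
    """Hypothetical representation of the no-response notification."""

    error_type: str
    own_state: str
    peer_state: str
    message_content: str
    session_id: int

    def message_fields(self):
        # Split "date and time, attribute, type, code" into its four parts.
        return tuple(part.strip() for part in self.message_content.split(","))


notification = NoResponseNotification(
    error_type="no-response detection",
    own_state="normal",
    peer_state="permanent abnormality",
    message_content="03/05 12:25:10:539, Receive, awaiting response, CCCC",
    session_id=11,
)
```

Splitting the message content into separate fields makes it straightforward to compare against individual items of the log information 131.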

By changing the "error type" of the no-response notification, the monitoring target processing unit 11 can also notify the logging control unit 12 of a failure detected by means other than no-response detection.

Returning to FIG. 8, when the no-response notification is received from the monitoring target processing unit 11 (YES in S11), the log information extraction unit 113 of the logging control unit 12 extracts, for example, the log information 131 stored in the information storage area 130. Specifically, the log information extraction unit 113 extracts, for example, the log information 131 that was stored in the information storage area 130 after the log information 131 corresponding to the second message, which the monitoring target processing unit 11 received from the monitoring target processing unit 21 before transmitting the first message (S12). To do so, the log information extraction unit 113 extracts the log information 131 corresponding to the second message, that is, a message that can be judged to have been received normally from the monitoring target processing unit 21.

Specifically, when the information reception unit 112 receives the no-response notification shown in FIG. 16, the log information extraction unit 113 refers to, for example, the "message content" of the notification and identifies the log information 131 that contains the same content. For example, in the log information 131 shown in FIG. 14, the information that contains the same content as the "message content" of the no-response notification shown in FIG. 16 is the information whose "identification ID" is "9". The log information extraction unit 113 therefore identifies the information whose "identification ID" is "9" as the log information 131 corresponding to the no-response notification received from the monitoring target processing unit 11.

Next, the log information extraction unit 113 refers further to the log information 131 and extracts the information corresponding to the second message, which the monitoring target processing unit 11 received from the monitoring target processing unit 21 before transmitting the first message to the monitoring target processing unit 21. Specifically, the log information extraction unit 113 extracts information that precedes the information whose "identification ID" is "9", whose "session ID" is "11", and whose "attribute" is "Receive". In the log information 131 shown in FIG. 14, this is the information whose "identification ID" is "2". The log information extraction unit 113 can thereby extract information on a message that was transmitted by the monitoring target processing unit 21 and that can be judged to have been received normally by the monitoring target processing unit 11.

When there are multiple pieces of information on messages that were transmitted by the monitoring target processing unit 21 and that can be judged to have been received normally by the monitoring target processing unit 11, the log information extraction unit 113 preferably extracts the most recent one. This allows the log information protection unit 115 to limit the amount of information subject to protection processing.
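
The two-step search described above (locate the record that matches the notification, then walk backwards to the most recent "Receive" record of the same session) can be sketched in Python as follows. The function name, the dictionary keys, and the representation of the message content as a four-element tuple are assumptions for illustration.

```python
def find_second_message(log, content, session_id):
    """Return the record of the second message, or None if it is not found.

    log: records ordered oldest first (dicts with hypothetical keys).
    content: (date_time, attribute, kind, code) from the notification.
    """
    # Step 1: find the record that matches the notification's message content.
    trigger = next(
        (i for i, r in enumerate(log)
         if (r["date_time"], r["attribute"], r["kind"], r["code"]) == content),
        None,
    )
    if trigger is None:
        return None
    # Step 2: scan backwards so that the most recent candidate wins.
    for record in reversed(log[:trigger]):
        if record["session_id"] == session_id and record["attribute"] == "Receive":
            return record
    return None
```

Scanning in reverse order implements the stated preference for the most recent normally received message, which keeps the protected range small.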

Returning to FIG. 8, the log information protection unit 115 executes, for example, protection processing on the extracted log information 131 (S13). That is, by setting the protection information 132, the log information protection unit 115 prevents the log information 131 stored after the extracted log information 131 from being overwritten. The log information protection unit 115 can thereby protect the log information 131 on the transmission and reception of the first message, which took place after the reception of the second message. Specifically, for the log information 131 shown in FIG. 14, the log information protection unit 115 protects, for example, the range from the information whose "identification ID" is "2" to the information whose "identification ID" is "11" (the most recent information included in the log information 131). A specific example of the protection information 132 is described below.

FIG. 17 is a diagram showing a specific example of the protection information 132. The protection information 132 shown in FIG. 17 has the following items: "start address", which indicates the start address of the memory area allocated for storing the log information 131; "write start address", which indicates the start address of the memory area in which the log information 131 is to be stored next; and "overwrite prohibition", which indicates whether overwriting is prohibited. It also has "overwrite-prohibited start address", which indicates the start address of the memory area in which overwriting is prohibited, and "overwrite-prohibited end address", which indicates the end address of that area. Specifically, the protection information 132 shown in FIG. 17 has "0x11223311" set as "start address", "0x112233cc" as "write start address", and "yes" as "overwrite prohibition". It also has "0x112233bb" set as "overwrite-prohibited start address" and "0x112233cc" as "overwrite-prohibited end address".
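
One way to picture how such protection information could gate writes is the Python sketch below. The class `ProtectionInfo`, the method `may_write`, and the treatment of the end address as inclusive are assumptions; the embodiment states only which items the protection information holds and the example values.

```python
from dataclasses import dataclass


@dataclass
class ProtectionInfo:
    """Hypothetical representation of the protection information 132."""

    start_address: int
    write_start_address: int
    overwrite_prohibited: bool
    prohibited_first: int
    prohibited_last: int  # assumed to be inclusive

    def may_write(self, address, length):
        # A write is allowed only if it does not touch the protected range.
        if not self.overwrite_prohibited:
            return True
        end = address + length - 1
        return end < self.prohibited_first or address > self.prohibited_last


# Values taken from the example of FIG. 17.
info = ProtectionInfo(0x11223311, 0x112233CC, True, 0x112233BB, 0x112233CC)
```

Under these assumptions, a logger would call `may_write` before storing each new record and skip or relocate the write when it returns `False`.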

Returning to FIG. 8, the dump acquisition unit 120 of the logging control unit 12 outputs, for example, the contents of the memory of the physical machine 1 to a file to acquire a memory dump (S14). This allows the operation manager to conduct a more detailed investigation when determining the cause of the failure.

FIG. 10 is a flowchart explaining the details of the process in which the dump acquisition unit 120 acquires a memory dump (hereinafter also referred to as the dump acquisition process). In the flowchart of FIG. 10, the dump acquisition unit 120 of the logging control unit 12 transmits, for example, a message for confirming the state to the monitoring target processing unit 21 (S41). When there is no response to that message and the dump acquisition unit 120 determines that the monitoring target processing unit 21 has not recovered from the abnormality (YES in S41), the dump acquisition unit 120 acquires, for example, a memory dump (S42). On the other hand, when there is a response and the dump acquisition unit 120 determines that the monitoring target processing unit 21 has recovered from the abnormality (NO in S41), the dump acquisition unit 120 does not acquire a memory dump. The reason is that acquiring a memory dump requires stopping the monitoring target processing unit 11 (the process running in the monitoring target processing unit 11), which may affect the service that the monitoring target processing unit 11 provides to users. The dump acquisition unit 120 therefore checks the state of the monitoring target processing unit 21 again before acquiring the memory dump, and may skip the acquisition when it determines that the monitoring target processing unit 21 has recovered from the abnormality. This allows the dump acquisition unit 120 to limit the impact of memory dump acquisition on the service.
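
The re-check before the dump (S41 and S42) can be illustrated with a minimal sketch. The function name and the two callables `probe` and `take_dump` are hypothetical names introduced only for illustration; the description does not specify how the status message is sent or how the dump is written.

```python
from typing import Callable


def dump_acquisition(probe: Callable[[], bool], take_dump: Callable[[], None]) -> bool:
    """Re-check the peer (S41) and dump only if it is still silent (S42).

    probe() is assumed to return True when the peer answered the
    status-check message. The return value tells whether a dump was taken.
    """
    if probe():
        # Peer has recovered: skip the dump so the local process keeps running.
        return False
    take_dump()
    return True


if __name__ == "__main__":
    taken = []
    print(dump_acquisition(lambda: False, lambda: taken.append("dump")))  # peer silent
    print(dump_acquisition(lambda: True, lambda: taken.append("dump")))   # peer recovered
    print(taken)
```

Passing the probe and the dump action as callables keeps the decision logic separate from the transport and the dump mechanism, both of which are left open here.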

In S41, the monitoring target processing unit 11 may instead transmit the message for confirming the state of the monitoring target processing unit 21. When it determines that the monitoring target processing unit 21 has not recovered from the abnormality, the monitoring target processing unit 11 may transmit a no-response notification to the logging control unit 12 again. The dump acquisition unit 120 of the logging control unit 12 may then acquire a memory dump in response to, for example, receiving the no-response notification twice from the monitoring target processing unit 11.

Returning to FIG. 8, the information transmission unit 114 of the logging control unit 12 transmits to the logging control unit 22, for example, identification information that identifies both the second message extracted by the log information extraction unit 113 and the first message (S15).

FIG. 11 is a flowchart explaining the details of the process in which the information transmission unit 114 transmits identification information (hereinafter also referred to as the identification information transmission process). In the flowchart of FIG. 11, as in S14, it is determined whether the monitoring target processing unit 21 has recovered from the abnormality (S51). When it is determined that the monitoring target processing unit 21 has not recovered from the abnormality (YES in S51), the information transmission unit 114 transmits identification information that includes an instruction to acquire a memory dump to the logging control unit 22 (S52). On the other hand, when it is determined that the monitoring target processing unit 21 has recovered from the abnormality (NO in S51), the information transmission unit 114 transmits identification information that does not include an instruction to acquire a memory dump to the logging control unit 22 (S53). That is, as in S14, when the monitoring target processing unit 21 is determined to have recovered from the abnormality, the information transmission unit 114 does not instruct the logging control unit 22 to acquire a memory dump. As in S14, this allows the information transmission unit 114 to limit the impact of memory dump acquisition on the service. A specific example of the identification information is described below.

FIG. 18 is a diagram illustrating a specific example of the identification information. The identification information shown in FIG. 18 has, as items, "processing content", which indicates the processing requested of the destination of the identification information, and the "session ID" described with reference to FIG. 14. In "processing content", a value such as "overwrite prohibited" is set, which instructs the destination of the identification information not to overwrite the log information 231. The identification information shown in FIG. 18 also has "message content 1", which is the content of the first message, "message content 2", which is the content of the second message, and "memory dump acquisition", which indicates whether the destination of the identification information is instructed to acquire a memory dump. In "message content 1" and "message content 2", for example, the same content as the "message content" described with reference to FIG. 16 is set.

Specifically, in the identification information shown in FIG. 18, "overwrite prohibited" is set as "processing content", "31" is set as "session ID", and "03/05 12:25:10:539, Receive, waiting for response, CCCC" is set as "message content 1". In addition, "03/05 12:25:10:503, Receive, response, AAAA" is set as "message content 2", and "no" is set as "memory dump acquisition".
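
As a rough illustration, the identification information of FIG. 18 and the branch of S51 to S53 could be modeled as below. The class, field, and function names are assumptions chosen for readability; the description defines only the items themselves, not a concrete data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentificationInfo:
    processing_content: str    # e.g. "overwrite prohibited"
    session_id: str
    message_content_1: str     # first message (the one left unanswered)
    message_content_2: str     # second message (last one received from the peer)
    acquire_memory_dump: bool  # the "memory dump acquisition" item


def build_identification_info(session_id: str, first_msg: str, second_msg: str,
                              peer_recovered: bool) -> IdentificationInfo:
    """Include the dump instruction only when the peer has not recovered (S51 to S53)."""
    return IdentificationInfo(
        processing_content="overwrite prohibited",
        session_id=session_id,
        message_content_1=first_msg,
        message_content_2=second_msg,
        acquire_memory_dump=not peer_recovered,
    )


if __name__ == "__main__":
    info = build_identification_info(
        "31",
        "03/05 12:25:10:539, Receive, waiting for response, CCCC",
        "03/05 12:25:10:503, Receive, response, AAAA",
        peer_recovered=True,
    )
    print(info)
```

With `peer_recovered=True` the record matches the FIG. 18 example, in which no memory dump is requested.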

Note that the monitoring target processing unit 11 may set, in the "session ID" of the no-response notification, the port number of the port used when the first message was transmitted (the port number on the monitoring target processing unit 11 side). In this case, the logging control unit 12 stores in advance, for example, correspondence information (not shown) that associates the host name of each message destination with the port number on the monitoring target processing unit 11 side used to communicate with that destination. On receiving the no-response notification, the logging control unit 12 obtains the host name of the destination of the first message based on, for example, the port number set in the no-response notification. The logging control unit 12 may further refer to the correspondence information to identify the port number of the logging control unit that acquires the operation information of the destination to which the first message was transmitted, and may transmit the identification information to the identified port number (S15 in FIG. 8).

The logging control unit 12 may also acquire the state of the monitoring target processing unit 21 periodically, for example. Specifically, as the state of the monitoring target processing unit 21, the logging control unit 12 acquires information on "waiting for processing execution", which is the state of waiting until processing is performed at the communication partner, or on "waiting for response", which is the state of waiting for a response to a message transmitted to the communication partner. Furthermore, when the state of the monitoring target processing unit 21 is "waiting for response", the logging control unit 12 acquires information about the message for which a response is awaited. Then, for example, when both the newly acquired state and the previously acquired state of the monitoring target processing unit 21 are "waiting for response" and the awaited message is the same in both, the logging control unit 12 may determine that the monitoring target processing unit 21 is in a no-response state. In this case, the logging control unit 12 may perform the processing from S12 onward in the same way as when it receives a no-response notification from the monitoring target processing unit 11 (YES in S11). As a result, even when, for example, the timer used in the monitoring target processing unit 11 to detect that the monitoring target processing unit 21 is not responding is not operating normally, the logging control unit 12 can still detect that the monitoring target processing unit 21 is in a no-response state.
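
The comparison of two consecutive polls described above can be sketched as follows. The tuple layout and the function name are hypothetical; the description states only that the state and the awaited message are compared.

```python
from typing import Optional, Tuple

# (state, identifier of the message being waited on, or None)
Status = Tuple[str, Optional[str]]

WAITING = "waiting for response"


def is_unresponsive(previous: Status, current: Status) -> bool:
    """True when two consecutive polls wait on the same unanswered message."""
    prev_state, prev_msg = previous
    cur_state, cur_msg = current
    return (
        prev_state == WAITING
        and cur_state == WAITING
        and prev_msg is not None
        and prev_msg == cur_msg
    )


if __name__ == "__main__":
    print(is_unresponsive((WAITING, "CCCC"), (WAITING, "CCCC")))  # same message pending
    print(is_unresponsive((WAITING, "CCCC"), (WAITING, "DDDD")))  # a newer message pending
```

Comparing the awaited message as well as the state avoids flagging a peer that answered one message and is now processing the next.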

[Processing in the second logging control unit]
Next, the logging control process executed in the logging control unit 22 is described. As shown in FIG. 9, the log information acquisition unit 211 of the logging control unit 22 stores log information about the monitoring target processing unit 21 (S20). Specifically, the log information acquisition unit 211 stores, for example, the acquired log information 231 in the information storage area 230. A specific example of the log information 231 is described below.

FIG. 15 is a diagram illustrating a specific example of the log information 231. The log information 231 shown in FIG. 15 has, for example, the same items as the log information 131 shown in FIG. 14. Specifically, in the entry of the log information 231 whose "identification ID" is 1, "03/05 12:25:10:502" is set as "date and time" and "23" is set as "session ID". In the same entry, "Request" is set as "attribute", "data" is set as "type", "GGGG" is set as "code", and "0x22334411" is set as "storage address". The "session ID" in the log information 231 may use information shared with the log information 131, or may use information managed separately by the logging control unit 12 and the logging control unit 22. The other entries in FIG. 15 follow the same pattern, and their description is omitted.

Returning to FIG. 9, the information reception unit 212 of the logging control unit 22 waits, for example, until it receives the identification information of the first message and the second message from the logging control unit 12 (NO in S21). When the identification information is received from the logging control unit 12 (YES in S21), the log information extraction unit 213 of the logging control unit 22 extracts, for example, the log information 231 stored in the information storage area 230. Specifically, based on the identification information that identifies the second message received by the information reception unit 212, the log information extraction unit 213 extracts, for example, the log information corresponding to messages transmitted to the monitoring target processing unit 11 after the transmission of the second message (S22). In other words, the log information extraction unit 213 extracts the log information 231 that was stored in the information storage area 230 after the second message, which is a message the monitoring target processing unit 11 received.
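
The extraction in S22 amounts to locating the entry for the second message and taking everything recorded after it. A minimal sketch follows, assuming the log is a chronologically ordered list of dictionaries with a hypothetical "content" key; returning an empty list when the second message is not found is also an assumption, since the description does not cover that case.

```python
from typing import Dict, List


def extract_after(log: List[Dict[str, str]], second_message: str) -> List[Dict[str, str]]:
    """Return the entries recorded after the entry matching the second message."""
    for index, entry in enumerate(log):
        if entry["content"] == second_message:
            return log[index + 1:]
    # Assumption: nothing is extracted when the second message is not logged.
    return []


if __name__ == "__main__":
    log = [
        {"id": "1", "content": "Request, data, GGGG"},
        {"id": "2", "content": "Receive, response, AAAA"},
        {"id": "3", "content": "Request, data, HHHH"},
    ]
    print(extract_after(log, "Receive, response, AAAA"))
```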

Then, like the log information extraction unit 113, the log information extraction unit 213 executes, for example, protection processing for the extracted log information 231 (S23). That is, the log information extraction unit 213 prevents the extracted log information 231 from being overwritten by newly generated log information 231. This allows the log information extraction unit 213 to protect the log information 231 that was stored after the transmission of the second message.

FIG. 12 is a flowchart explaining the details of the process of protecting the log information 231 in the logging control unit 22 (hereinafter also referred to as the log information protection process). As shown in the flowchart of FIG. 12, the log information protection unit 215 checks whether the "processing content" of the identification information received by the information reception unit 212 is "overwrite prohibited" (S61). When "overwrite prohibited" is set in the "processing content" of the received identification information (YES in S61), the log information protection unit 215 executes, as in S23, protection processing for the log information 231 stored in the information storage area 230 after the extracted log information 231 (S62). On the other hand, when a value other than "overwrite prohibited" is set in the "processing content" of the received identification information (NO in S61), the log information protection unit 215 does not extract the log information 231. That is, the log information protection unit 215 does not protect the log information 231 when there is no instruction from the monitoring target processing unit 11 regarding protection of the log information 231.
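
One way to picture S61 and S62 is a bounded log store that refuses to evict protected entries. This is only a sketch under assumptions: the class `LogStore`, its eviction policy, and the error raised when every entry is protected are all hypothetical and are not taken from the description.

```python
from typing import Iterable, List, Set


class LogStore:
    """Bounded log store; the oldest unprotected entry is evicted first."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries: List[str] = []    # oldest first
        self.protected: Set[str] = set()

    def append(self, entry_id: str) -> None:
        if len(self.entries) >= self.capacity:
            for index, existing in enumerate(self.entries):
                if existing not in self.protected:
                    del self.entries[index]
                    break
            else:
                # Assumption: refuse new entries rather than lose protected ones.
                raise RuntimeError("log store holds only protected entries")
        self.entries.append(entry_id)

    def handle(self, processing_content: str, extracted_ids: Iterable[str]) -> bool:
        """S61: protect only when the request says "overwrite prohibited" (S62)."""
        if processing_content != "overwrite prohibited":
            return False
        self.protected.update(extracted_ids)
        return True


if __name__ == "__main__":
    store = LogStore(capacity=2)
    store.append("a")
    store.append("b")
    store.handle("overwrite prohibited", ["a"])
    store.append("c")      # evicts "b", because "a" is protected
    print(store.entries)
```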

Returning to FIG. 9, like the dump acquisition unit 120, the dump acquisition unit 220 acquires a memory dump by, for example, writing the contents of the memory of the physical machine 1 to a file (S24). This allows the operation manager to investigate the cause of the failure in more detail.

FIG. 13 is a flowchart explaining the details of the process in which the dump acquisition unit 220 acquires a memory dump (hereinafter also referred to as the dump acquisition process). In the flowchart of FIG. 13, the dump acquisition unit 220 of the logging control unit 22 checks, for example, whether "memory dump acquisition" in the identification information received by the information reception unit 212 is "yes" (S71). When "yes" is set in "memory dump acquisition" of the received identification information (YES in S71), the dump acquisition unit 220 acquires, for example, a memory dump (S72). On the other hand, when "no" is set in "memory dump acquisition" of the received identification information (NO in S71), the dump acquisition unit 220 does not acquire a memory dump. A value of "no" in "memory dump acquisition" of the identification information received from the logging control unit 12 means that the logging control unit 12 has detected the recovery of the monitoring target processing unit 21 and has judged that a memory dump is unnecessary. The dump acquisition unit 220 therefore does not acquire a memory dump in that case. This allows the dump acquisition unit 220 to limit the impact of memory dump acquisition on the service.

Returning to FIG. 9, the log information extraction unit 213 determines, based on the received identification information that identifies the first message, whether log information 231 corresponding to the first message is stored (S25). When log information 231 corresponding to the first message is stored, the log information extraction unit 213 can determine that the monitoring target processing unit 21 received the first message. In this case, the log information extraction unit 213 can determine that a failure occurred in, for example, the information transmission unit 214 of the monitoring target processing unit 21, which transmits the response to the first message to the monitoring target processing unit 11. When log information 231 corresponding to the first message is not stored, the log information extraction unit 213 can determine that the monitoring target processing unit 21 did not receive the first message. In this case, the log information extraction unit 213 can determine that a failure occurred in, for example, the information transmission unit 114 of the monitoring target processing unit 11, which transmitted the first message.
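
The inference in S25 reduces to a single lookup: whether the receiver's log contains the first message. A minimal sketch follows; the enum and function names are hypothetical, and the result is only a hint for the investigation rather than a confirmed diagnosis.

```python
from enum import Enum
from typing import Iterable


class SuspectedSide(Enum):
    RECEIVER_RESPONSE_PATH = "first message arrived; response path on the receiver is suspected"
    SENDER_TRANSMIT_PATH = "first message never arrived; transmit path on the sender is suspected"


def locate_fault(receiver_log: Iterable[str], first_message: str) -> SuspectedSide:
    """Apply the S25 rule to the messages recorded on the receiver side."""
    if first_message in set(receiver_log):
        return SuspectedSide.RECEIVER_RESPONSE_PATH
    return SuspectedSide.SENDER_TRANSMIT_PATH


if __name__ == "__main__":
    print(locate_fault(["AAAA", "CCCC"], "CCCC").value)
    print(locate_fault(["AAAA"], "CCCC").value)
```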

[Second Embodiment]
Next, a second embodiment is described. FIGS. 19 to 21 are flowcharts explaining the logging control process in the second embodiment. FIG. 22 is a diagram explaining the logging control process in the second embodiment. The logging control process of FIGS. 19 to 21 is described with reference to FIG. 22.

In the second embodiment, protection processing of the log information 131 and 231 is performed not only when the no-response condition described in the first embodiment is detected, but also according to the processing state of the monitoring target processing units 11 and 21. Specifically, the logging control units 12 and 22 regularly acquire the processing states of the monitoring target processing units 11 and 21, respectively. When the processing state of the monitoring target processing unit 11 or 21 is not updated even after a predetermined time (for example, one minute) has elapsed, the logging control units 12 and 22 determine that a failure has occurred in the monitoring target processing unit 11 or the monitoring target processing unit 21. For example, when a failure occurs in the monitoring target processing unit 21, the first embodiment requires that the monitoring target processing unit 11 has transmitted a message to the monitoring target processing unit 21 in order for the failure to be detected. In contrast, the second embodiment makes it possible to detect a failure occurring in the monitoring target processing unit 21 without the monitoring target processing unit 11 transmitting a message to the monitoring target processing unit 21. The details of the logging control process in the second embodiment are described below.

[Details of the state update process]
First, the details of the process of updating the state information 133 in the logging control unit 12 (hereinafter also referred to as the state update process) are described. As shown in FIG. 19, the state acquisition unit 116 of the logging control unit 12 waits, for example, until the state acquisition timing arrives (NO in S91). The state acquisition timing may be, for example, at one-minute intervals. When the state acquisition timing arrives (YES in S91), the state acquisition unit 116 checks the state of a session by referring to, for example, the log information 131. The state update unit 117 of the logging control unit 12 then sets the result in the corresponding state information 133 (S92). A specific example of the state information 133 is described below.

FIG. 22 is a diagram showing a specific example of the state information 133. The state information 133 shown in FIG. 22 has, as items, the "session ID" described with reference to FIG. 14, the "current state", which indicates the current state of each session, and the "previous state", which indicates the state of each session at the previous state acquisition.

In "current state" and "previous state", values such as "waiting for response", which indicates that a response to a transmitted message is awaited from the message destination, and "normal", which indicates that no message is outstanding (no response is awaited), are set. For example, in the log information 131 described with reference to FIG. 14, the state update unit 117 determines, for each "session ID", whether an entry whose "attribute" is "Receive" exists for each entry whose "attribute" is "Request". For a "session ID" in which every "Request" entry has a corresponding "Receive" entry, the state update unit 117 sets the "current state" to "normal". On the other hand, for a "session ID" in which a "Request" entry has no corresponding "Receive" entry, the state acquisition unit 116 sets the "current state" to "waiting for response". Specifically, among the entries of the log information 131 in FIG. 14 whose "session ID" is "11", the "Receive" entries (those whose "identification ID" is 2 and 9) corresponding to the "Request" entries (those whose "identification ID" is 1 and 7) all exist. In this case, the state update unit 117 therefore sets the "current state" corresponding to the "session ID" of "11" in the state information 133 to "normal". In contrast, among the entries whose "session ID" is "15", only some of the "Receive" entries (those whose "identification ID" is 5 and 8) corresponding to the "Request" entries (those whose "identification ID" is 3, 4, and 7) exist. That is, in the state shown in FIG. 14, the information reception unit 112 is waiting for a response from the communication partner whose "session ID" is "15". In this case, the state update unit 117 therefore sets the "current state" corresponding to the "session ID" of "15" in the state information 133 to "waiting for response".
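
The per-session rule above can be sketched by counting "Request" and "Receive" entries. Matching by count is a simplifying assumption: the description says only that a "Receive" entry must exist for each "Request" entry, not how the two are paired. The function name and the tuple layout are hypothetical, and the sample log is illustrative rather than a copy of FIG. 14.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def session_states(log: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each session ID to "normal" or "waiting for response".

    Each log item is (session_id, attribute), where attribute is
    "Request" or "Receive".
    """
    requests: Counter = Counter()
    receives: Counter = Counter()
    for session_id, attribute in log:
        if attribute == "Request":
            requests[session_id] += 1
        elif attribute == "Receive":
            receives[session_id] += 1
    return {
        session_id: "normal" if receives[session_id] >= count else "waiting for response"
        for session_id, count in requests.items()
    }


if __name__ == "__main__":
    log = [
        ("11", "Request"), ("11", "Receive"),
        ("15", "Request"), ("15", "Request"), ("15", "Receive"),
        ("11", "Request"), ("11", "Receive"),
    ]
    print(session_states(log))
```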

Returning to FIG. 22, the state information 133 shown in FIG. 22 specifically indicates that sessions whose "session ID" is "11", "13", "15", "21", "23", and "31" are established. For the session whose "session ID" is "11", "normal" is set as the "current state" and "normal" is set as the "previous state". For the session whose "session ID" is "23", "waiting for response" is set as the "current state" and "waiting for response" is set as the "previous state". That is, the session whose "session ID" is "23" is in the "waiting for response" state in both the "previous state" and the "current state". In this case, the logging control unit 12 therefore determines that a failure may have occurred in the monitoring target processing unit 11 or the like, and can, for example, perform protection processing of the log information 131 and 231. The other entries in FIG. 22 follow the same pattern, and their description is omitted.

Note that, before updating the state information 133, the state update unit 117 may, for example, copy the value set in "current state" into "previous state". This prevents the information acquired by the state acquisition unit 116 in the previous round from being overwritten by the information it newly acquires.
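
The shift of "current state" into "previous state" before each update can be sketched as below. The dictionary layout and the default of "normal" for a session seen for the first time are assumptions made for illustration.

```python
from typing import Dict

StateInfo = Dict[str, Dict[str, str]]


def update_state_info(state_info: StateInfo, new_states: Dict[str, str]) -> None:
    """Copy current into previous, then store the newly acquired state."""
    for session_id, new_state in new_states.items():
        # Assumption: a session with no history starts from "normal".
        previous = state_info.get(session_id, {}).get("current", "normal")
        state_info[session_id] = {"previous": previous, "current": new_state}


if __name__ == "__main__":
    info: StateInfo = {}
    update_state_info(info, {"23": "waiting for response"})
    update_state_info(info, {"23": "waiting for response"})
    print(info)
```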

Returning to FIG. 19, when the state acquisition unit 116 has finished acquiring the state of every session established with the monitoring target processing unit 11 (YES in S93), it waits until the next state acquisition timing (S91). On the other hand, when state acquisition has not been completed for all sessions (NO in S93), the state acquisition unit 116 acquires the state of the sessions for which acquisition is not yet complete (S92).

[Details of the state determination process]
Next, the details of the process of evaluating the state information 133 in the logging control unit 12 (hereinafter also referred to as the state determination process) are described. As shown in FIG. 20, the state determination unit 118 of the logging control unit 12 waits, for example, until the state determination timing arrives (NO in S101). Like the state acquisition timing, the state determination timing may be, for example, at one-minute intervals. When the state determination timing arrives (YES in S101), the state determination unit 118 refers to, for example, the state information 133 and checks whether there is an entry in which both the "previous state" and the "current state" are "waiting for response" (S102).

When no entry has both the "previous state" and the "current state" set to "waiting for response" (NO in S102), the state determination unit 118 waits again until the next state determination timing (S101). On the other hand, when such an entry exists (YES in S102), the log information extraction unit 113 extracts, for example, the log information 131. Specifically, the log information extraction unit 113 extracts the information stored in the information storage area 130 after the log information 131 corresponding to the message that the monitoring target processing unit 11 received from the monitoring target processing unit 21 before transmitting the third message (hereinafter also referred to as the fourth message) (S103). That is, when an entry has both the "previous state" and the "current state" set to "waiting for response", the state determination unit 118 determines that the information may not have been updated because a failure occurred in the monitoring target processing unit 11 (or in the communication partner of the monitoring target processing unit 11). The log information extraction unit 113 then extracts, for the session in which both the "previous state" and the "current state" are "waiting for response", the log information 131 corresponding to a message that can be judged to have been received normally by the monitoring target processing unit 11. This allows the log information extraction unit 113 to identify and extract the log information 131 that includes the information from the time the failure occurred.
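
The check in S102 can be sketched as a scan over the state information for sessions stuck in "waiting for response" across two rounds. The dictionary layout and the function name are hypothetical; the sample values mirror the sessions "11" and "23" described for FIG. 22.

```python
from typing import Dict, List

WAITING = "waiting for response"


def stalled_sessions(state_info: Dict[str, Dict[str, str]]) -> List[str]:
    """Return session IDs whose previous and current states are both waiting."""
    return sorted(
        session_id
        for session_id, state in state_info.items()
        if state["previous"] == WAITING and state["current"] == WAITING
    )


if __name__ == "__main__":
    state_info = {
        "11": {"previous": "normal", "current": "normal"},
        "23": {"previous": WAITING, "current": WAITING},
    }
    print(stalled_sessions(state_info))
```

A non-empty result corresponds to YES in S102, after which extraction, protection, dump acquisition, and notification (S103 to S106) would follow.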

Subsequently, as in the case of FIG. 8, the log information protection unit 115 executes, for example, protection processing on the extracted log information 131 (S104). The dump acquisition unit 120 then, for example, outputs the contents of the memory of the physical machine 1 to a file to acquire a memory dump (S105). Further, the information transmission unit 114 transmits to the logging control unit 22, for example, identification information capable of identifying the fourth message extracted by the log information extraction unit 113 and the third message (S106).

Meanwhile, as shown in FIG. 21, the information receiving unit 212 of the logging control unit 22 waits, for example, until it receives identification information from the logging control unit 12 (NO in S111). When identification information is received from the logging control unit 12 (YES in S111), the log information extraction unit 213 of the logging control unit 22 extracts, for example, the log information 231 stored in the information storage area 230. Specifically, based on the identification information identifying the fourth message received by the information receiving unit 212, the log information extraction unit 213 extracts, for example, the log information corresponding to messages transmitted to the monitoring target processing unit 11 after the transmission of the fourth message (S112).
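The extraction in S112 might be sketched as follows. This is only an illustration: the dictionary-based log layout, the field names `message_id`, `direction`, and `peer`, and the function `entries_sent_after` are assumptions and do not appear in the specification.

```python
def entries_sent_after(log, message_id, peer):
    """Return log entries for messages sent to `peer` after the identified message.

    `log` is assumed to be a list of dicts in storage order.
    """
    # Locate the entry that corresponds to the received identification information.
    anchor = next(
        (i for i, entry in enumerate(log) if entry["message_id"] == message_id),
        None,
    )
    if anchor is None:
        # The identified message is not in this log; nothing can be anchored.
        return []
    return [
        entry for entry in log[anchor + 1:]
        if entry["direction"] == "sent" and entry["peer"] == peer
    ]
```

Returning an empty list when the identified message is missing is a design choice of this sketch; the specification does not state how that case is handled at this step.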

The log information extraction unit 213 then executes, for example, protection processing on the extracted log information 231 in the same manner as the log information extraction unit 113 (S113). Subsequently, the dump acquisition unit 220, for example, outputs the contents of the memory of the physical machine 1 to a file in the same manner as the dump acquisition unit 120 and acquires a memory dump (S114). Further, the log information extraction unit 213 determines, for example, whether the log information 231 corresponding to the first message is stored, based on the received identification information identifying the third message (S115).

That is, according to the second embodiment, when the processing state of the monitoring target processing unit 11 is not updated even after a predetermined time (for example, one minute) has elapsed, the logging control unit 12 determines that a failure has occurred in the monitoring target processing unit 11 or the like. This makes it possible to detect a failure occurring in the monitoring target processing unit 11 or the like even when no messages are being transmitted or received. The logging control unit 12 can therefore detect a failure promptly.

[Processing when identification information cannot be transmitted]
Next, processing in the first embodiment for the case where the identification information of the second message cannot be transmitted (S15 in FIG. 8) is described. FIGS. 23 and 24 are flowcharts for the case where the identification information cannot be transmitted.

For example, when a failure that requires a restart occurs in the physical machine 2, the logging control unit 22 may be unable to receive the identification information transmitted from the logging control unit 12. In this case, the logging control unit 12 waits, for example, until the restart of the physical machine 2 is complete and then retransmits the identification information. Here, as part of the restart, the physical machine 2 outputs a memory dump that includes log information from before and after the occurrence of the failure in the physical machine 2. After the restart of the physical machine 2 is complete, the logging control unit 22 can therefore investigate the cause of the failure that occurred in the physical machine 2 based on the contents of the received identification information and the contents of the memory dump. That is, in this case, by referring to the contents of the memory dump, the logging control unit 22 can investigate the cause of the failure without performing protection processing on the log information 231. The processing for the case where the identification information cannot be transmitted is described in detail below.

FIG. 23 is a flowchart illustrating the identification information transmission processing for the case where the identification information cannot be transmitted. As shown in FIG. 23, the information transmission unit 114 determines whether the monitoring target processing unit 21 has recovered from the abnormality, as in the case described with reference to FIG. 11 (S121). When it determines that the monitoring target processing unit 21 has not recovered from the abnormality (YES in S121), the information transmission unit 114 transmits to the logging control unit 22 identification information that includes an instruction to acquire a memory dump (S122). On the other hand, when it determines that the monitoring target processing unit 21 has recovered from the abnormality (NO in S121), the information transmission unit 114 transmits to the logging control unit 22 identification information that does not include an instruction to acquire a memory dump (S123).

When the transmission of the identification information is complete (YES in S124), the identification information transmission processing ends, as in the case described with reference to FIG. 11. On the other hand, when the transmission of the identification information is not complete (NO in S124), the information transmission unit 114 waits, for example, until the identification information can be transmitted (NO in S125). That is, when the physical machine 2 is being restarted because of the failure that occurred, the information transmission unit 114 cannot transmit the identification information to the logging control unit 22. In this case, the information transmission unit 114 therefore periodically checks the status of the physical machine 2 (the logging control unit 22), for example by sending a PING to the physical machine 2. When it becomes possible to transmit the identification information to the logging control unit 22 (YES in S125), the information transmission unit 114 transmits (retransmits), for example, the identification information that could not be transmitted to the logging control unit 22 (S126).
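The flow of S121 to S126 might be sketched as follows. This is a minimal illustration, not the implementation in the specification: the function `send_identification`, the callables `send` and `is_reachable`, the `acquire_dump` field, and the polling interval are all assumptions. The `is_reachable` callable stands in for a PING-style probe of the physical machine 2.

```python
import time


def send_identification(payload, peer_recovered, send, is_reachable,
                        poll_interval=5.0, sleep=time.sleep):
    """Send identification information, waiting and resending while the peer is down.

    `send(message)` is assumed to return True when transmission completes.
    `is_reachable()` is assumed to return True once the peer responds again.
    Returns the number of transmission attempts.
    """
    # S121-S123: request a memory dump only if the peer has not recovered.
    message = {**payload, "acquire_dump": not peer_recovered}
    attempts = 1
    while not send(message):        # S124 (first attempt) / S126 (resend)
        while not is_reachable():   # S125: poll until the peer is back
            sleep(poll_interval)
        attempts += 1
    return attempts
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a caller would normally leave it at the default.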

Next, the processing that the logging control unit 22 performs after the physical machine 2 is restarted (hereinafter also referred to as restart-time logging control processing) is described. After starting up following the restart of the physical machine 2, the logging control unit 22 waits, for example, until it receives identification information from the logging control unit 12 (NO in S131). When the identification information is received (YES in S131), the log information extraction unit 213 refers, for example, to the memory dump acquired when the physical machine 2 was restarted and extracts the log information 231 stored after the log information 231 corresponding to the received identification information (S132). That is, when the physical machine 2 has been restarted, the log information 231 stored in the information storage area 230 has been output as a memory dump. By referring to the contents of the output memory dump, the logging control unit 22 can therefore investigate the cause of the failure that occurred in the physical machine 2 without performing protection processing on the log information 231.

Thereafter, as in the case described with reference to FIG. 10, the log information extraction unit 213 determines, for example, whether the log information 231 corresponding to the first message is stored, based on the received identification information identifying the first message (S133).

The embodiments described above are summarized in the following supplementary notes.

(Supplementary Note 1)
An information processing system comprising:
a first logging device including
a storage unit that stores log information relating to a first device,
a receiving unit that receives, from the first device, a notification that a second device did not respond to a first message transmitted from the first device to the second device,
a processing unit that performs processing for extracting, from the stored log information relating to the first device, log information corresponding to a second message that the first device received from the second device before transmitting the first message, and
a transmission unit that transmits first identification information capable of identifying the extracted second message; and
a second logging device including
a storage unit that stores log information relating to the second device,
a receiving unit that receives the transmitted first identification information, and
a processing unit that specifies, based on the received first identification information, from the stored log information relating to the second device, log information corresponding to a message transmitted to the first device after the transmission of the second message.

(Supplementary Note 2)
The information processing system according to Supplementary Note 1, wherein
the processing unit of the second logging device further extracts the specified log information and executes protection processing on it.

(Supplementary Note 3)
The information processing system according to Supplementary Note 1, wherein
the transmission unit transmits identification information capable of identifying the first message, and
the processing unit of the second logging device determines, based on the received identification information of the first message, whether log information corresponding to the first message is stored in the storage unit of the second logging device.

(Supplementary Note 4)
The information processing system according to Supplementary Note 1, wherein
the processing unit of the second logging device stores in the storage unit, in response to transmission of a third message from the second device to the first device, state information indicating that the second device is waiting to receive a response to the third message, and erases the stored state information in response to reception of the response to the third message,
when the processing unit of the second logging device detects state information for which the time elapsed since it was stored exceeds a predetermined time, the processing unit extracts, from the stored log information relating to the second device, log information corresponding to a fourth message that the second device received from the first device before transmitting the third message,
the transmission unit transmits identification information capable of identifying the extracted fourth message,
the receiving unit of the first logging device receives the transmitted identification information of the fourth message, and
based on the received identification information of the fourth message, log information corresponding to a message transmitted to the second device after the transmission of the fourth message is extracted from the stored log information relating to the first device.

(Supplementary Note 5)
The information processing system according to Supplementary Note 1, wherein
when the identification information cannot be transmitted to the second logging device, the transmission unit waits until the identification information can be transmitted to the second logging device and then retransmits the identification information.

(Supplementary Note 6)
The information processing system according to Supplementary Note 1, wherein
the processing unit of the first logging device acquires a memory dump relating to the state of the memory of the first device in response to the receiving unit of the first logging device receiving the notification, and
the processing unit of the second logging device acquires a memory dump relating to the state of the memory of the second device in response to the receiving unit of the first logging device receiving the identification information.

(Supplementary Note 7)
The information processing system according to Supplementary Note 6, wherein
the processing unit of the first logging device checks the state of the second device when the receiving unit of the first logging device receives the notification, and does not acquire the memory dump when it determines that the state of the second device is normal.

(Supplementary Note 8)
The information processing system according to Supplementary Note 6, wherein
the processing unit of the second logging device checks the state of the second device when the receiving unit of the second logging device receives the identification information, and does not acquire the memory dump when it determines that the state of the second device is normal.

(Supplementary Note 9)
A logging control program that causes a computer to execute a process comprising:
storing log information relating to a first device;
receiving, from the first device, a notification that a second device did not respond to a first message transmitted from the first device to the second device;
performing processing for extracting, from the stored log information relating to the first device, log information corresponding to a second message that the first device received from the second device before transmitting the first message; and
transmitting identification information capable of identifying the extracted second message.

(Supplementary Note 10)
A logging control program that causes a computer to execute a process comprising:
storing log information relating to a second device;
receiving, from a first device, identification information capable of identifying a message that the second device transmitted to the first device; and
specifying, based on the received identification information, from the stored log information relating to the second device, log information corresponding to a message transmitted to the first device after the transmission of the message.

(Supplementary Note 11)
A logging control method in which:
a first logging device stores log information relating to a first device;
the first logging device receives, from the first device, a notification that a second device did not respond to a first message transmitted from the first device to the second device;
the first logging device performs processing for extracting, from the stored log information relating to the first device, log information corresponding to a second message that the first device received from the second device before transmitting the first message;
the first logging device transmits identification information capable of identifying the extracted second message;
a second logging device stores log information relating to the second device;
the second logging device receives the transmitted identification information; and
the second logging device specifies, based on the received identification information, from the stored log information relating to the second device, log information corresponding to a message transmitted to the first device after the transmission of the second message.

1: Physical machine
2: Physical machine
11: Monitoring target processing unit
12: Logging control unit
13: Storage medium
21: Monitoring target processing unit
22: Logging control unit
23: Storage medium

Claims

An information processing system comprising:
a first logging device including
a storage unit that stores log information relating to a first device,
a receiving unit that receives, from the first device, a notification that a second device did not respond to a first message transmitted from the first device to the second device,
a processing unit that performs processing for extracting, from the stored log information relating to the first device, log information corresponding to a second message that the first device received from the second device before transmitting the first message, and
a transmission unit that transmits first identification information capable of identifying the extracted second message; and
a second logging device including
a storage unit that stores log information relating to the second device,
a receiving unit that receives the transmitted first identification information, and
a processing unit that specifies, based on the received first identification information, from the stored log information relating to the second device, log information corresponding to a message transmitted to the first device after the transmission of the second message.

The information processing system according to claim 1, wherein
the processing unit of the second logging device further extracts the specified log information and executes protection processing on it.

The information processing system according to claim 1, wherein
the transmission unit transmits identification information capable of identifying the first message, and
the processing unit of the second logging device determines, based on the received identification information of the first message, whether log information corresponding to the first message is stored in the storage unit of the second logging device.

The information processing system according to claim 1, wherein
the processing unit of the second logging device stores in the storage unit, in response to transmission of a third message from the second device to the first device, state information indicating that the second device is waiting to receive a response to the third message, and erases the stored state information in response to reception of the response to the third message,
when the processing unit of the second logging device detects state information for which the time elapsed since it was stored exceeds a predetermined time, the processing unit extracts, from the stored log information relating to the second device, log information corresponding to a fourth message that the second device received from the first device before transmitting the third message,
the transmission unit transmits identification information capable of identifying the extracted fourth message,
the receiving unit of the first logging device receives the transmitted identification information of the fourth message, and
based on the received identification information of the fourth message, log information corresponding to a message transmitted to the second device after the transmission of the fourth message is extracted from the stored log information relating to the first device.

A logging control program that causes a computer to execute a process comprising:
storing log information relating to a first device;
receiving, from the first device, a notification that a second device did not respond to a first message transmitted from the first device to the second device;
performing processing for extracting, from the stored log information relating to the first device, log information corresponding to a second message that the first device received from the second device before transmitting the first message; and
transmitting identification information capable of identifying the extracted second message.

A logging control program that causes a computer to execute a process comprising:
storing log information relating to a second device;
receiving, from a first device, identification information capable of identifying a message that the second device transmitted to the first device; and
specifying, based on the received identification information, from the stored log information relating to the second device, log information corresponding to a message transmitted to the first device after the transmission of the message.

A logging control method in which:
a first logging device stores log information relating to a first device;
the first logging device receives, from the first device, a notification that a second device did not respond to a first message transmitted from the first device to the second device;
the first logging device performs processing for extracting, from the stored log information relating to the first device, log information corresponding to a second message that the first device received from the second device before transmitting the first message;
the first logging device transmits identification information capable of identifying the extracted second message;
a second logging device stores log information relating to the second device;
the second logging device receives the transmitted identification information; and
the second logging device specifies, based on the received identification information, from the stored log information relating to the second device, log information corresponding to a message transmitted to the first device after the transmission of the second message.