JP2011170518A

JP2011170518A - State monitoring device and method

Info

Publication number: JP2011170518A
Application number: JP2010032455A
Authority: JP
Inventors: Hidehiro Kametani; 秀洋亀谷
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2010-02-17
Filing date: 2010-02-17
Publication date: 2011-09-01

Abstract

PROBLEM TO BE SOLVED: To provide a state monitoring device and method capable of properly setting a threshold for determining whether an abnormality has occurred, even with a small number of samples.

SOLUTION: Status data of a device is collected at intervals of a preset first period. Of the collected status data, a plurality of pieces of status data acquired at intervals of a prescribed second period longer than the first period are statistically processed to calculate an average value and a standard deviation corresponding to the device, and the average value and the standard deviation are calculated at intervals of the first period. The calculated standard deviation is corrected by filtering according to the characteristics of the device, and a threshold for the device is calculated at intervals of the first period on the basis of the average value and the corrected standard deviation. During actual operation, the calculated threshold is used to determine whether an abnormality has occurred in the device.

COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to a state monitoring apparatus and method for detecting an abnormality or failure of a device incorporated in the apparatus itself.

The quality of a computer, or of a system including a computer, is expressed by an index called RAS, from the initials of Reliability, Availability, and Serviceability. For example, a computer that requires high reliability is equipped with a program (hereinafter referred to as a RAS program) that monitors the devices incorporated in the computer for abnormal operation or failure and reports the occurrence of any abnormality or failure. The RAS program also includes programs for restoring data and identifying the cause of a failure when an abnormality or failure occurs. Examples of devices to be monitored include a CPU (Central Processing Unit), an HDD (Hard Disk Drive), an LCD (Liquid Crystal Display), a battery, a cooling fan, and a PCI (Peripheral Component Interconnect) device.

A state monitoring apparatus realized by a computer equipped with a RAS program includes various sensors for measuring temperature, voltage, current, and the like, and threshold values set in advance for the respective devices. It determines whether a device abnormality or failure has occurred by using the value observed by a sensor (hereinafter also referred to as state data) and the corresponding threshold value.

A technique for determining, by using a predetermined threshold value, whether an abnormality or failure has occurred in various electronic devices other than computers, such as freezers and refrigerators, air conditioners, and vending machines, is described in Patent Document 1, for example.

Patent Document 2 describes a technique in which state data indicating the state acquired from each device is processed with an adaptive lattice filter in order to detect a steady state change of a device to be monitored.

Patent Document 3 describes a configuration in which each computer connected to a network (referred to as a "station" in Patent Document 3) acquires RAS information, so that the RAS information of all computers connected to the network can be managed from each computer.

Patent Document 1: JP 2003-172567 A
Patent Document 2: JP 2007-201648 A
Patent Document 3: JP-A-8-331125

As described above, the threshold value used in the RAS program is an important index for determining whether a device is abnormal. However, because the optimum threshold value varies with the system configuration, the tasks being processed, the usage environment, and so on, it is difficult for a user or maintenance personnel to set an optimum value.

In the state monitoring apparatus of the background art, the threshold value is basically fixed and does not change unless a user or maintenance personnel changes it. Therefore, if, for example, the upper threshold is set to the maximum expected variation and the lower threshold is set to the minimum expected variation, the margin between the observed value and the thresholds becomes large. As a result, even if an abnormality occurs, it cannot be detected unless the observed value at the time of the abnormality exceeds a threshold value.

Patent Document 1 shows a technique in which state data (observed values) indicating the state of each device is collected at predetermined intervals, and the threshold value needed for failure determination is determined based on the normal distribution curve or t distribution curve of that data. However, such a method needs a large amount of state data to determine the threshold value, and may require a long learning period before the threshold value can be determined. For example, when observed values are collected periodically at a set time on a set day of the week, only about 4 to 5 samples are obtained even one month after learning starts. In that case the uncertainty of the threshold value is large, so more samples are needed, and it may take several months to determine the threshold value. It is therefore desirable to be able to determine an appropriate threshold value even when the number of samples is small.

The present invention has been made to solve the above problems of the background art, and its object is to provide a state monitoring apparatus and method capable of appropriately determining the threshold value used for determining whether an abnormality has occurred, even with a small number of samples.

In order to achieve the above object, a state monitoring apparatus of the present invention is a state monitoring apparatus that determines whether an abnormality has occurred in a device to be monitored based on whether state data indicating the state of the device exceeds a predetermined threshold value, the apparatus comprising:
data collecting means for collecting the state data of the device at every preset first period;
statistical data calculating means for statistically processing, among the state data collected by the data collecting means, a plurality of state data acquired at a predetermined second period longer than the first period, thereby calculating an average value and a standard deviation corresponding to the device, and for calculating the average value and standard deviation for each first period;
threshold value determining means for correcting the standard deviation calculated by the statistical data calculating means by filtering it according to the characteristics of the device, and for calculating the threshold value corresponding to the device for each first period based on the average value and the corrected standard deviation; and
operation monitoring means for determining whether an abnormality has occurred in the device by using the threshold value calculated by the threshold value determining means.

A state monitoring method of the present invention is a state monitoring method for determining whether an abnormality has occurred in a device to be monitored based on whether state data indicating the state of the device exceeds a predetermined threshold value, the method comprising:
collecting the state data of the device at every preset first period;
statistically processing, among the collected state data, a plurality of state data acquired at a predetermined second period longer than the first period, thereby calculating an average value and a standard deviation corresponding to the device, and calculating the average value and standard deviation for each first period;
correcting the calculated standard deviation by filtering it according to the characteristics of the device, and calculating the threshold value corresponding to the device for each first period based on the average value and the corrected standard deviation; and
determining whether an abnormality has occurred in the device by using the calculated threshold value.

According to the present invention, the threshold value used for determining whether an abnormality has occurred can be appropriately determined even with a small number of samples.

FIG. 1 is a perspective view showing a configuration example of a computer that realizes the state monitoring apparatus of the present invention.
FIG. 2 is a block diagram showing a configuration example of the state monitoring apparatus of the present invention.
FIG. 3 is a flowchart showing an example of the processing of the state monitoring apparatus of the first embodiment.
FIG. 4 is a table showing an example of the state data acquired by the data collection unit shown in FIG. 2.
FIG. 5 is a graph showing an example of the state data acquired by the data collection unit shown in FIG. 2.
FIG. 6 is a graph showing an example of the state data acquired by the data collection unit shown in FIG. 2.
FIG. 7 is a graph showing how the average value of the CPU temperature calculated from the state data shown in FIG. 6 changes.
FIG. 8 is a table showing the average value, standard deviation, and change amount of the CPU temperature calculated from the state data shown in FIG. 6.
FIG. 9 is a graph showing an example of the distribution of the CPU temperature shown in FIG. 6 at a given time on a given day of the week.
FIG. 10 is a flowchart showing an example of the filter processing performed by the state monitoring apparatus of the first embodiment.
FIG. 11 is a graph showing the standard deviations calculated by the threshold value determination unit shown in FIG. 2 for the observation period on a given day of the week, arranged in ascending order.
FIG. 12 is a graph showing how the threshold values obtained by the threshold value determination unit shown in FIG. 2 and the observed values change.
FIG. 13 is a flowchart showing an example of the filter processing performed by the state monitoring apparatus of the second embodiment.

Next, the present invention will be described with reference to the drawings.

In the present invention, state data for each device is collected at every predetermined collection cycle (first cycle), and a plurality of state data acquired at every predetermined statistical processing cycle (second cycle), such as at a set time on a set day of the week or at a set time each day, is statistically processed to obtain a threshold value for each device corresponding to that time on that day of the week or that time of day. Threshold values for each device are obtained for every state data collection cycle, and during actual operation these threshold values are used in time series, matched to the day of the week and the time, to determine whether an abnormality has occurred. The state data to be acquired includes the monitoring items of the RAS program, such as CPU temperature, system temperature, HDD temperature, LCD temperature, battery temperature, HDD SMART error information, voltage, fan speed, PCI parity, and energization time. In addition, in the present invention, the increase in threshold uncertainty caused by a small number of samples is suppressed by applying, to the standard deviation σ obtained by the statistical processing, filter processing that depends on the characteristics of the monitored device.
(First embodiment)
FIG. 1 is a perspective view showing a configuration example of a computer that realizes the state monitoring apparatus of the present invention.

As shown in FIG. 1, the computer 100 is, for example, a laptop computer that includes a main body 101, which contains a motherboard with a CPU that executes predetermined processing according to various programs and an input device with which a user enters commands and data, and an LCD unit 102 that displays operation results, processing results, and the like. Like a well-known personal computer, the computer 100 shown in FIG. 1 includes a CPU, an HDD, an LCD, a power supply device, a cooling fan, a PCI device, and the like.

FIG. 2 is a block diagram showing a configuration example of the state monitoring apparatus of the present invention.

As shown in FIG. 2, the state monitoring apparatus of the present invention includes a data collection unit 111, a statistical data calculation unit 112, a threshold value determination unit 113, an operation monitoring unit 114, a statistical data storage unit 200, a threshold data storage unit 210, and a monitoring log storage unit 220.

The data collection unit 111, the statistical data calculation unit 112, the threshold value determination unit 113, and the operation monitoring unit 114 are realized, for example, by a CPU (not shown) of the computer 100 shown in FIG. 1 executing processing according to the RAS program of the present invention. The statistical data storage unit 200, the threshold data storage unit 210, and the monitoring log storage unit 220 are realized, for example, by a nonvolatile storage device (an HDD or the like) of the computer 100 shown in FIG. 1.

The data collection unit 111 acquires, at every preset collection cycle (first cycle: for example, every minute), the observed values (state data) of the monitoring items corresponding to the devices, such as temperature, voltage, fan speed, errors detected by a self-diagnostic function (HDD), parity errors (PCI device), and energization time.

The statistical data calculation unit 112 statistically processes the observed values collected by the data collection unit 111 and calculates an average value and a standard deviation σ for each monitoring item. The average value and standard deviation σ are calculated from a plurality of observed values acquired at a predetermined statistical processing cycle (second cycle) longer than the collection cycle, such as at a set time on a set day of the week or at a set time each day. The statistical data calculation unit 112 also obtains the average value and standard deviation σ for each collection cycle and stores them in the statistical data storage unit 200 in time series.

The threshold value determination unit 113 calculates threshold values (an upper limit and a lower limit) for each monitoring item based on the average value and standard deviation σ calculated by the statistical data calculation unit 112. In the present embodiment, the filter processing described later is applied to the standard deviation σ of monitoring items that satisfy a preset condition, after which the average value + 3σ is set as the upper threshold and the average value - 3σ is set as the lower threshold. The threshold value determination unit 113 stores the threshold values for each device, obtained for every state data collection cycle, in the threshold data storage unit 210 in time series.

The operation monitoring unit 114 monitors the state of each device by using the threshold values (upper and lower thresholds) calculated for each collection cycle by the threshold value determination unit 113 in time series, matched to the day of the week and the time. When an observed value exceeds a threshold value, it determines that an abnormality has occurred and notifies the user, for example by displaying the occurrence of the abnormality on the LCD unit 102.
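The lookup performed during operation can be sketched as follows. This is only an illustration under stated assumptions: the table layout keyed by (weekday, "HH:MM") and the names `is_abnormal` and `threshold_table` are hypothetical and do not come from the patent text.

```python
from datetime import datetime

def is_abnormal(timestamp, value, threshold_table):
    """Return True when value falls outside the thresholds for its time slot.

    threshold_table is a hypothetical mapping
    {(weekday, "HH:MM"): (lower, upper)}, where weekday follows
    datetime.weekday() (Monday is 0). A missing slot raises KeyError.
    """
    slot = (timestamp.weekday(), timestamp.strftime("%H:%M"))
    lower, upper = threshold_table[slot]
    # "Exceeds the threshold" is read here as: above the upper limit
    # or below the lower limit.
    return value < lower or value > upper

# Hypothetical thresholds for Mondays at 09:40.
table = {(0, "09:40"): (36.0, 48.0)}
monday = datetime(2009, 11, 23, 9, 40)
print(is_abnormal(monday, 50.0, table))
print(is_abnormal(monday, 42.0, table))
```

Keying the table by slot rather than by absolute date reflects the description that thresholds are reused in time series according to the day of the week and the time.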

The statistical data storage unit 200 stores the observed values (state data) collected by the data collection unit 111.

The threshold data storage unit 210 stores the threshold values (upper limit and lower limit) for each monitoring item obtained by the threshold value determination unit 113.

The monitoring log storage unit 220 stores the time of occurrence of each abnormality detected by the operation monitoring unit 114, the content of the abnormality, and the like.

Next, the operation of the state monitoring apparatus of the first embodiment will be described with reference to FIGS. 3 to 12.

FIG. 3 is a flowchart showing an example of the processing of the state monitoring apparatus of the first embodiment.

As shown in FIG. 3, the state monitoring apparatus (computer 100) first collects state data (observed values) for each device with the data collection unit 111 (step S1). In the present embodiment, the threshold values used during operation are determined from the state data collected in step S1, so the state data for each device is collected in the same state as during actual operation, that is, while the computer is running according to application programs and the like. FIG. 4 shows an example of the state data collected by the data collection unit 111.

In FIG. 4, [TIM_RTC] indicates the acquisition time of the state data, [TMP_CPU] indicates the CPU temperature, and [TMP_SYS] indicates the internal temperature of the main body 101. [TMP_HDD] indicates the HDD temperature, [TMP_BAT] indicates the battery temperature, [TMP_LCD] indicates the internal temperature of the LCD unit 102, and [VLT_1.8] indicates the power supply voltage (DC voltage).
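One way to hold a row of this table in memory is sketched below. The field labels come from FIG. 4 as described above; the class name `StateRecord`, the helper `record_from_fields`, the ISO timestamp format, and the units are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class StateRecord:
    tim_rtc: datetime  # [TIM_RTC] acquisition time
    tmp_cpu: float     # [TMP_CPU] CPU temperature (unit assumed: deg C)
    tmp_sys: float     # [TMP_SYS] internal temperature of the main body
    tmp_hdd: float     # [TMP_HDD] HDD temperature
    tmp_bat: float     # [TMP_BAT] battery temperature
    tmp_lcd: float     # [TMP_LCD] internal temperature of the LCD unit
    vlt_1_8: float     # [VLT_1.8] power supply voltage (unit assumed: V)

def record_from_fields(fields):
    """Build a record from a mapping keyed by the FIG. 4 labels.

    The ISO 8601 timestamp format is an assumption; the patent does not
    specify how [TIM_RTC] is encoded.
    """
    return StateRecord(
        tim_rtc=datetime.fromisoformat(fields["TIM_RTC"]),
        tmp_cpu=float(fields["TMP_CPU"]),
        tmp_sys=float(fields["TMP_SYS"]),
        tmp_hdd=float(fields["TMP_HDD"]),
        tmp_bat=float(fields["TMP_BAT"]),
        tmp_lcd=float(fields["TMP_LCD"]),
        vlt_1_8=float(fields["VLT_1.8"]),
    )
```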

The data collection unit 111 collects state data by, for example, logging the observed value of each monitoring item at every predetermined collection cycle (see FIG. 5). The collected state data is stored in the statistical data storage unit 200. FIG. 5 shows, for example, how the observed value of each monitoring item changes during the same time period (9:40 to 10:40) on two days, November 23 and November 24.

Next, the state monitoring apparatus uses the statistical data calculation unit 112 to statistically process the state data for each device collected in step S1, and calculates an average value and a standard deviation σ for each monitoring item (step S2).

The computer 100 used for ordinary business applications often has a processing load that is determined on a fixed cycle, such as at a set time on a set day of the week (for example, the week's results are processed at 18:00 every Friday) or at a set time each day (for example, the day's sales are batch-processed at 18:00). Therefore, when the processing amount (load) of the computer depends on the time of day, a plurality of state data acquired at the same time every day may be statistically processed, and when the processing amount (load) depends on the day of the week and the time, a plurality of state data acquired at the same time on the same day of the week may be statistically processed. For example, the state data for each monitoring item (CPU temperature, fan speed, etc.) at the same time on the same day of the week is read from the statistical data storage unit 200 (see FIG. 6), an average value and a standard deviation σ are calculated for each monitoring item from the plurality of state data at that time on that day of the week, and the average value and standard deviation σ are calculated in this way for each collection cycle. In the present embodiment, the acquisition cycle of the state data used in such statistical processing is referred to as the statistical processing cycle (second cycle). FIG. 6 shows, for example, how each observed value changes from 9:40 to 10:40 on Monday, November 23, and how the CPU temperature changes from 9:40 to 10:40 on the same day of the week (every Monday: November 2, 9, 16, 23, and 30).
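The grouping described above, where samples taken at the same time on the same day of the week are pooled, can be sketched as follows. The function name `slot_statistics` and the use of the sample standard deviation are assumptions; the patent does not state which estimator is used.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, stdev

def slot_statistics(samples):
    """Group (timestamp, value) samples by (weekday, "HH:MM") slot.

    Returns {slot: (average, sigma)}. Sigma is the sample standard
    deviation (an assumption) and is 0.0 when a slot has one sample.
    """
    groups = defaultdict(list)
    for timestamp, value in samples:
        slot = (timestamp.weekday(), timestamp.strftime("%H:%M"))
        groups[slot].append(value)
    stats = {}
    for slot, values in groups.items():
        sigma = stdev(values) if len(values) > 1 else 0.0
        stats[slot] = (mean(values), sigma)
    return stats

# Hypothetical CPU temperatures at 09:40 on three consecutive Mondays.
samples = [
    (datetime(2009, 11, 2, 9, 40), 40.0),
    (datetime(2009, 11, 9, 9, 40), 42.0),
    (datetime(2009, 11, 16, 9, 40), 44.0),
]
print(slot_statistics(samples))
```

With one-minute collection, each slot gains only one sample per week, which is why a month of learning yields roughly four or five values per slot.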

FIG. 7 is a graph showing how the average value of the CPU temperature calculated from the state data shown in FIG. 6 changes, and FIG. 8 is a table showing the average value, standard deviation, and change amount of the CPU temperature calculated from the state data shown in FIG. 6. The variables shown in FIGS. 7 and 8 have the following meanings.
t_1, t_2, ..., t_n: state data acquisition times in a preset observation period (for example, 9:40 to 10:40 every Monday)
x_1, x_2, ..., x_n: state data (CPU temperature) acquired at t_1, t_2, ..., t_n
x̄_n: average value of the state data (CPU temperature) acquired at t_n
Δx_1, Δx_2, ..., Δx_(n-1): change amounts of the state data (CPU temperature) from t_1 to t_n
Δx_max: maximum of the change amounts (Δx_1 to Δx_(n-1)) of the state data from t_1 to t_n
σ_max: maximum of the standard deviation σ from t_1 to t_n

Based on the state data shown in FIG. 6, the statistical data calculation unit 112 calculates the average value x̄_n, the standard deviation σ_n, and the change amount Δx for each t_n. FIG. 8 shows an example of the calculation result. The state data at a given time on each day then follows a normal distribution, as shown in FIG. 9.

Next, the state monitoring apparatus uses the threshold value determination unit 113 to apply filter processing to each standard deviation σ obtained in step S2 (step S3). In the present embodiment, the filter processing is applied to each standard deviation σ in the preset observation period (t_1 to t_n) when, for example, the condition Δx_max ≥ 2σ_max is satisfied. When this condition holds, the state data is considered not to lie within a range of simple variation but to be increasing or decreasing rapidly, so the standard deviation σ needs to be optimized (given a large margin). For example, as shown in the graphs of FIGS. 5 and 6, the CPU temperature rises rapidly when processing starts and falls rapidly when processing ends. Applying filter processing to state data with a large change amount Δx, such as the CPU temperature, suppresses the increase in uncertainty of the average value and standard deviation σ, and of the threshold values derived from them, that is caused by the small number of state data samples.

The condition for applying the filter processing is not limited to Δx_max ≥ 2σ_max and may be set to, for example, Δx_max ≥ σ_max, Δx_max ≥ 3σ_max, or Δx_max ≥ 4σ_max. However, if the factor by which the maximum standard deviation σ_max is multiplied is reduced, the filter processing is also applied to devices whose state data changes little, so the processing load of the threshold value determination unit 113 increases. Conversely, if the factor is increased, the processing load of the threshold value determination unit 113 decreases, but the filter processing is applied only when the state data changes greatly, which is unlikely during normal operation. It then becomes impossible to suppress the increase in uncertainty of the average value, standard deviation σ, threshold values, and so on that is caused by the small number of state data samples. The condition for applying the filter processing is therefore preferably about Δx_max ≥ 2σ_max.
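The applicability check can be sketched as follows. It assumes that each change amount is the absolute difference between the averages at consecutive acquisition times, which the text does not state explicitly; the function name `needs_filtering` and the parameter `factor` are hypothetical.

```python
def needs_filtering(averages, sigmas, factor=2.0):
    """Check the condition dx_max >= factor * sigma_max for one period.

    averages: per-time-slot average values for t_1..t_n.
    sigmas:   per-time-slot standard deviations for t_1..t_n.
    The change amounts are taken as absolute differences between
    consecutive averages (an assumption made for this sketch).
    """
    changes = [abs(b - a) for a, b in zip(averages, averages[1:])]
    if not changes or not sigmas:
        return False  # too little data to evaluate the condition
    return max(changes) >= factor * max(sigmas)

# A sharp temperature rise between slots triggers the filter.
print(needs_filtering([40.0, 41.0, 50.0, 49.0], [1.0, 2.0, 1.5, 1.0]))
# A slow drift relative to the spread does not.
print(needs_filtering([40.0, 41.0, 42.0], [1.0, 2.0, 1.0]))
```

Raising `factor` to 3 or 4 makes the check stricter, which matches the trade-off described above between processing load and how often the correction is applied.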

FIG. 10 shows an example of the filter processing applied to the standard deviation σ of the CPU temperature.

In the filter processing, the standard deviations σ calculated at t_1, t_2, ..., t_n in an observation period on the same day of the week in which the same processing is executed repeatedly (for example, 9:00 to 17:00 on Monday) are sorted in ascending order (see FIG. 11). A large standard deviation σ means that the acquired state data varies widely. A small standard deviation σ means that the acquired state data varies little, and the uncertainty of σ due to the small number of state data samples is considered to be large. Therefore, when σ is large, its value is used as it is or is corrected to a slightly larger value. When σ is small, its value is corrected to a larger value. That is, each standard deviation σ is corrected by multiplying it by a factor that increases as σ decreases. Because the amount of change of a monitoring item differs with the device characteristics, this filter processing is set individually for each monitoring item.

For example, in the filter processing for the CPU temperature, when the range from the minimum to the maximum value of the standard deviation σ is taken as 100% (see FIG. 11), values of σ at 80% or more are used as is, values from 60% to 80% are multiplied by 1.1, values from 30% to 60% are multiplied by 1.25, and values at 30% or less are multiplied by 1.5, as shown in FIG. 10.
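The CPU temperature example can be sketched as follows. Several points are assumptions made for illustration: the percentage is read as the linear position of σ between the smallest value (0%) and the largest value (100%) in the observation periods, the overlapping band edges are resolved as coded below, and all function names are hypothetical.

```python
def percent_position(sigma: float, sigma_min: float, sigma_max: float) -> float:
    """Position of sigma between sigma_min (0%) and sigma_max (100%)."""
    if sigma_max == sigma_min:
        return 100.0  # all values equal: treat them as the top band (assumption)
    return 100.0 * (sigma - sigma_min) / (sigma_max - sigma_min)


def cpu_temperature_multiplier(pct: float) -> float:
    """Multiplier for the CPU temperature bands described in the text."""
    if pct >= 80.0:
        return 1.0   # 80% or more: used as is
    if pct >= 60.0:
        return 1.1   # 60% to 80%
    if pct > 30.0:
        return 1.25  # 30% to 60% (edge handling is an assumption)
    return 1.5       # 30% or less


def filter_cpu_sigmas(sigmas: list[float]) -> list[float]:
    """Apply the band multipliers to every standard deviation in the list."""
    lo, hi = min(sigmas), max(sigmas)
    return [s * cpu_temperature_multiplier(percent_position(s, lo, hi))
            for s in sigmas]
```

With this reading, the smallest standard deviations receive the largest correction, which matches the stated intent of widening thresholds where the estimate is least certain.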

After performing the above filter processing on each standard deviation σ, the threshold value determination means 113 sets, for each collection period, the average value + 3σ as the upper threshold and the average value − 3σ as the lower threshold (step S4).

The average value ± 3σ is used for the upper and lower thresholds because, during normal operation, the state data falls within this range with a probability of 99.74%, so state data that exceeds a threshold can be regarded as abnormal. In practice, because each standard deviation σ has been filtered in step S3, the state data is expected to fall within the set threshold range with an even higher probability. FIG. 12 shows how the observed (measured) CPU temperature and the calculated thresholds change.
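The threshold calculation and the coverage figure can be checked with a short sketch. It assumes that the state data is normally distributed, which the text does not state explicitly; the function names are hypothetical. Under that assumption the two-sided 3σ coverage evaluates to roughly 99.73%, close to the figure quoted above.

```python
import math


def thresholds(mean: float, sigma: float, k: float = 3.0) -> tuple[float, float]:
    """Return (lower, upper) thresholds as mean -/+ k * sigma."""
    return mean - k * sigma, mean + k * sigma


def normal_coverage(k: float) -> float:
    """Probability that a normally distributed value lies within k sigma of the mean."""
    return math.erf(k / math.sqrt(2.0))
```

For example, `thresholds(50.0, 2.0)` gives a lower threshold of 44.0 and an upper threshold of 56.0. Passing a filtered (enlarged) σ widens this band accordingly.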

The threshold value determination means 113 obtains a threshold for each device for every state data collection period and stores these thresholds in the threshold data storage unit 210 in time series, keyed to the day of the week and the time.

Once the thresholds for each device have been obtained, the state monitoring device monitors each device for abnormalities by having the operation monitoring means 114 determine whether the state data falls within the range between the corresponding upper and lower thresholds (step S5). When the state data exceeds the upper or lower threshold, the device is judged to be abnormal, the occurrence of the abnormality is notified, and the name of the device in which the abnormality occurred, the time of occurrence, the details of the abnormality, and so on are saved to the monitoring log storage unit 220 (step S6).
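Steps S5 and S6 can be sketched as a lookup followed by a range check. The data layout is an assumption made for illustration: thresholds are held in a dictionary keyed by device name, weekday, and "HH:MM" time slot, and the monitoring log is a plain list of dictionaries. None of these names comes from the patent.

```python
from datetime import datetime


def check_state(device, value, when, threshold_table, monitoring_log):
    """Return True and append a log entry if value is outside its thresholds.

    threshold_table maps (device, weekday, "HH:MM") to (lower, upper);
    this layout is hypothetical.
    """
    key = (device, when.weekday(), when.strftime("%H:%M"))
    lower, upper = threshold_table[key]
    if lower <= value <= upper:
        return False  # within range: no abnormality
    monitoring_log.append({
        "device": device,
        "time": when.isoformat(),
        "detail": f"value {value} outside [{lower}, {upper}]",
    })
    return True
```

Because the key includes the weekday and time slot, the same device is compared against different thresholds at different times, as in FIG. 12.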

Once the operation monitoring means 114 has started monitoring the devices, the thresholds for each device need not be updated unless the operating conditions of a device change. Even after operation has started, however, the processing of steps S1 to S4 in FIG. 3 may be executed while the operation monitoring means 114 monitors the device state, so that the thresholds for each device are updated as needed.

According to the present embodiment, the threshold for each device is calculated by statistical processing, so an optimum threshold is set without the user, maintenance personnel, or the like having to set it.

In addition, the state data is statistically processed in units of a predetermined statistical processing cycle, such as a given time on a predetermined day of the week or a given time on each day, to obtain a threshold for each device corresponding to that time, and each threshold is obtained for every state data collection period. The thresholds therefore vary along the time axis, collection period by collection period, in accordance with the day of the week and the time (see FIG. 12). As a result, no threshold is set with an unnecessary margin relative to the observed values, and abnormalities that could not be detected by the background-art computer, whose thresholds are fixed, can now be detected.

Further, filtering the value of the standard deviation σ suppresses the increase in threshold uncertainty caused by the small number of state data samples. A more appropriate threshold can therefore be set.
(Second Embodiment)
Next, a computer according to the second embodiment will be described with reference to the drawings.

The first embodiment showed an example in which the filter processing is applied to the standard deviation σ of the CPU temperature. The second embodiment proposes filter processing that can also be applied to other devices. The configuration of the state monitoring device and the other processing are the same as in the first embodiment, so their description is omitted.

FIG. 13 is a flowchart showing an example of the filter processing performed by the state monitoring device of the second embodiment.

The variables a₁ to a_{n+1} and x₁ to x_n shown in FIG. 13 are set for each device in accordance with an abnormality detection policy determined in advance by the user or the like. For example, to narrow the interval between the upper and lower thresholds, that is, to raise the sensitivity of abnormality detection at the cost of a higher likelihood of false detection, the values of the variable a are set small. To widen the interval between the upper and lower thresholds, that is, to lower the sensitivity of abnormality detection and reduce the likelihood of false detection, the values of the variable a are set large.

The number of values set for the variable x should be increased when the sorted standard deviations σ vary widely and decreased when they vary little. For example, when the variation of σ is large, x may be set to 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20%, and 10% so that σ is corrected in 10 steps; when the variation of σ is small, x may be set to 70% and 30% so that σ is corrected in 3 steps.
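A generalized filter driven by the variables x and a can be sketched as follows. FIG. 13 is not reproduced here, so the control flow is an assumption: the n breakpoints x are taken in descending order, the first breakpoint that the percentage position reaches selects the matching multiplier, and the last multiplier a_{n+1} applies below every breakpoint. The function names and the sample multipliers are hypothetical.

```python
def stepwise_multiplier(pct: float, x: list[float], a: list[float]) -> float:
    """Pick a multiplier from a (n + 1 values) using breakpoints x (n values, descending)."""
    if len(a) != len(x) + 1:
        raise ValueError("a must have exactly one more element than x")
    for bound, factor in zip(x, a):
        if pct >= bound:
            return factor
    return a[-1]  # below every breakpoint


def filter_sigmas(sigmas: list[float], x: list[float], a: list[float]) -> list[float]:
    """Correct each standard deviation by the multiplier for its percentage position."""
    lo, hi = min(sigmas), max(sigmas)
    span = hi - lo
    corrected = []
    for s in sigmas:
        # Linear position between the smallest and largest value (assumption).
        pct = 100.0 if span == 0 else 100.0 * (s - lo) / span
        corrected.append(s * stepwise_multiplier(pct, x, a))
    return corrected
```

With `x = [70.0, 30.0]` and three multipliers the correction has 3 steps, and with nine breakpoints it has 10 steps, matching the counts given in the text. Larger values in `a` widen the resulting thresholds and lower the detection sensitivity.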

The variables a₁ to a_{n+1} and x₁ to x_n used in the filter processing shown in FIG. 13 may be set by a specialist such as a software or hardware engineer or an SE (System Engineer), or the user may be allowed to set them through GUI (Graphical User Interface) operations as part of the functions provided by the RAS software.

According to the present embodiment, the same effects as in the first embodiment are obtained, and filter processing that can be applied to a wider range of devices and observation items is realized.

DESCRIPTION OF SYMBOLS
100 Computer
101 Main body part
102 LCD part
111 Data collection means
112 Statistical data calculation means
113 Threshold value determination means
114 Operation monitoring means
200 Statistical data storage unit
210 Threshold data storage unit
220 Monitoring log storage unit

Claims

A state monitoring device that determines whether or not an abnormality has occurred in a device based on whether or not state data indicating a state of a device to be monitored exceeds a predetermined threshold value,
Data collecting means for collecting the state data of the device every preset first period;
Statistical data calculating means for calculating an average value and a standard deviation corresponding to the device by statistically processing, among the state data collected by the data collecting means, a plurality of state data acquired in a predetermined second period longer than the first period, the average value and the standard deviation being calculated for each first period;
Threshold value determining means for correcting the value of the standard deviation calculated by the statistical data calculating means by filter processing according to the characteristics of the device, and for calculating, for each first period, a threshold value corresponding to the device based on the average value and the corrected standard deviation;
Using the threshold value calculated by the threshold value determining means, operation monitoring means for determining the presence or absence of abnormality of the device;
A state monitoring device.

The state monitoring device according to claim 1, wherein, where the maximum value of the change amount of the state data collected in a preset observation period is Δx_max and the maximum value of the standard deviation calculated in the observation period is σ_max, the threshold value determining means performs the filter processing on each standard deviation obtained in the observation period when the condition Δx_max ≧ 2σ_max is satisfied.

The state monitoring device according to claim 1 or 2, wherein, as the filter processing, the threshold value determining means performs a correction in which the value by which the standard deviation is multiplied is made larger as the value of the standard deviation becomes smaller.

The state monitoring device according to claim 1, wherein, where the standard deviation is σ, the threshold value determining means sets the average value + 3σ as an upper threshold value and the average value −3σ as a lower threshold value.

A status monitoring method for determining whether or not an abnormality has occurred in a device based on whether or not status data indicating a status of a device to be monitored exceeds a predetermined threshold value,
Collecting the device status data for each preset first period,
Among the collected state data, by statistically processing a plurality of state data acquired in a predetermined second period longer than the first period, an average value and a standard deviation corresponding to the device are calculated, An average value and a standard deviation are calculated for each first period,
The calculated standard deviation value is corrected by filter processing according to the characteristics of the device, and a threshold value corresponding to the device is calculated for each first period based on the average value and the corrected standard deviation,
A state monitoring method for determining whether or not an abnormality has occurred in the device using the calculated threshold value.

The state monitoring method according to claim 5, wherein, where the maximum value of the change amount of the state data collected in a preset observation period is Δx_max and the maximum value of the standard deviation calculated in the observation period is σ_max, the filter processing is performed on each standard deviation obtained in the observation period when the condition Δx_max ≧ 2σ_max is satisfied.

The state monitoring method according to claim 5 or 6, wherein, as the filtering process, correction is performed to increase a value multiplied by the standard deviation value as the standard deviation value decreases.

The state monitoring method according to claim 5, wherein, where the standard deviation is σ, the average value + 3σ is set as an upper threshold value and the average value −3σ is set as a lower threshold value.