WO2011036809A1 - Abnormality identification system and method thereof - Google Patents

Info

Publication number
WO2011036809A1
WO2011036809A1 · PCT/JP2009/066806 (JP2009066806W)
Authority
WO
WIPO (PCT)
Prior art keywords
normal
abnormal
data
determination
variables
Prior art date
Application number
PCT/JP2009/066806
Other languages
French (fr)
Japanese (ja)
Inventor
Ken Ueno (研 植野)
Topon Kumar Paul (クマル トポン ポール)
Original Assignee
Toshiba Corporation (株式会社 東芝)
Priority date
Filing date
Publication date
Application filed by Toshiba Corporation
Priority to PCT/JP2009/066806 priority Critical patent/WO2011036809A1/en
Publication of WO2011036809A1 publication Critical patent/WO2011036809A1/en

Classifications

    • G — PHYSICS
    • G05 — CONTROLLING; REGULATING
    • G05B — CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B23/00 — Testing or monitoring of control systems or parts thereof
    • G05B23/02 — Electric testing or monitoring
    • G05B23/0205 — Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults
    • G05B23/0218 — Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults characterised by the fault detection method dealing with either existing or incipient faults
    • G05B23/0224 — Process history based detection method, e.g. whereby history implies the availability of large amounts of data
    • G05B23/0227 — Qualitative history assessment, whereby the type of data acted upon, e.g. waveforms, images or patterns, is not relevant, e.g. rule based assessment; if-then decisions
    • G05B23/0235 — Qualitative history assessment based on a comparison with predetermined threshold or range, e.g. "classical methods", carried out during normal operation; threshold adaptation or choice; when or how to compare with the threshold

Definitions

  • The present invention relates to an abnormality determination system and method.
  • Background sensor fusion technology is described in Japanese Patent No. 3931879 and Japanese Patent Laid-Open No. 2005-165421.
  • In such technology, it is mainstream to convert sensor data from continuous time-series values into discrete values and handle them as categorical data.
  • The present invention provides an abnormality determination system and method capable of performing abnormality determination on a monitoring target (or target object) with high accuracy, using sensing data of a plurality of sensors (or sensor nodes) accumulated in the past.
  • The abnormality determination system of the present invention includes a data storage unit that stores a plurality of training data, each being a set of (a) a plurality of time-series data for a plurality of variables obtained by observing a monitoring target with a plurality of sensors and (b) a normal or abnormal class representing the state of the monitoring target when the time-series data were acquired.
  • A waveform dividing unit specifies a plurality of sections for each of the plurality of variables and extracts, from the time-series data included in the training data, segment data corresponding to those sections.
  • An evaluation unit selects, for each variable, a best section from among the plurality of sections extracted by the waveform dividing unit.
  • A calculation unit calculates normal and abnormal conditional probabilities for the best section from the number of times each section of each variable is determined to be normal or abnormal, and calculates normal and abnormal prior probabilities from the total numbers of normal and abnormal classes contained in the training data.
  • A storage unit stores, for each variable, the identification information of the best section, the segment data of the best section, the classes associated with that segment data, the normal and abnormal conditional probabilities of the best section, and the normal and abnormal prior probabilities.
  • A sensing unit observes the monitoring target with the plurality of sensors and acquires new time-series data for the plurality of variables, and a selection unit selects segment data from the newly acquired time-series data according to the best section of each variable.
  • A determination unit detects, for each variable, the top predetermined number of nearest segment data in the storage unit by the nearest neighbor method, multiplies the ratios of the normal and abnormal classes among those segments by the stored normal and abnormal conditional probabilities, multiplies the resulting values across the variables together with the normal and abnormal prior probabilities to calculate normal and abnormal likelihoods, and determines the state of the monitoring target according to whichever likelihood is greater.
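The determination step above can be sketched in Python as follows. This is an illustrative sketch, not the patent's implementation: the function name, data layout, and the toy numbers are assumptions. For each variable, the normal/abnormal class ratio among the k nearest stored segments is multiplied by that variable's stored conditional probability; the products are combined across variables with the prior probabilities, and the larger likelihood decides the state.

```python
# Hedged sketch of the determination unit's likelihood computation.
def determine_state(knn_class_ratios, cond_probs, priors):
    """knn_class_ratios: per-variable dicts {"normal": r, "abnormal": 1-r}
    cond_probs: per-variable dicts of stored conditional probabilities
    priors: {"normal": p0, "abnormal": q0}
    Returns (state with larger likelihood, both likelihoods)."""
    likelihood = dict(priors)
    for ratios, probs in zip(knn_class_ratios, cond_probs):
        for c in ("normal", "abnormal"):
            likelihood[c] *= ratios[c] * probs[c]
    return max(likelihood, key=likelihood.get), likelihood

# toy two-variable example (numbers are made up for illustration)
state, lh = determine_state(
    knn_class_ratios=[{"normal": 0.8, "abnormal": 0.2},
                      {"normal": 0.6, "abnormal": 0.4}],
    cond_probs=[{"normal": 0.7, "abnormal": 0.3},
                {"normal": 0.5, "abnormal": 0.5}],
    priors={"normal": 0.9, "abnormal": 0.1},
)
```

In this toy case the normal likelihood dominates, so the monitoring target is judged normal.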
  • In another aspect, the abnormality determination system of the present invention includes a first database that stores a plurality of training data, each including a plurality of first labels indicating whether the sensor data observed by a plurality of sensor nodes monitoring a target object is abnormal or normal, and a second label indicating whether the state of the target object is abnormal or normal.
  • A decision fusion rule learning unit (A-1) generates a plurality of candidate solutions by repeatedly performing random mapping using a coding method that maps the presence or absence of each sensor node to a bit string, and (A-2) evaluates the fitness of each candidate solution against the first database and, following a genetic algorithm, repeatedly generates new candidate solutions by crossover and mutation operations on candidate solutions selected based on fitness, thereby determining the optimal candidate solution having the best fitness and identifying the sensor nodes whose bits are set in it.
  • A general determination unit (B-1) determines whether the sensor data observed by each identified sensor node is abnormal or normal, using a classifier prepared in advance for that sensor node, and (B-2) determines the target object to be abnormal when all determination results for the identified sensor nodes indicate abnormality, and normal when at least one of the results indicates normal.
  • To evaluate fitness, the decision fusion rule learning unit detects, for each of the plurality of training data, the first labels of the sensor nodes whose bits are set in a candidate solution, selects whichever of normal and abnormal is indicated by the larger number of detected first labels, and calculates the ratio at which the selected state matches the state indicated by the second label across the plurality of training data.
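The fitness evaluation just described can be sketched as follows. This is a minimal illustration under assumptions (function name, label encoding, and tie handling — ties are treated as normal here, which the text does not specify): a candidate bit string selects sensor nodes, the majority state among the selected nodes' first labels is compared with the second label, and fitness is the match ratio over the training data.

```python
# Hedged sketch of the decision-fusion-rule fitness function.
def fitness(candidate_bits, training_data):
    """training_data: list of (first_labels, second_label), where first_labels
    is a list of "normal"/"abnormal", one per sensor node."""
    matches = 0
    for first_labels, second_label in training_data:
        # keep only labels of nodes whose bit is set in the candidate
        selected = [lab for bit, lab in zip(candidate_bits, first_labels) if bit]
        n_abn = sum(lab == "abnormal" for lab in selected)
        # majority vote; ties fall back to "normal" (an assumption)
        majority = "abnormal" if n_abn > len(selected) - n_abn else "normal"
        matches += (majority == second_label)
    return matches / len(training_data)

data = [(["abnormal", "abnormal", "normal"], "abnormal"),
        (["normal", "normal", "abnormal"], "normal")]
score = fitness([1, 1, 0], data)  # majority vote matches both second labels
```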
  • According to the present invention, it is possible to perform abnormality determination on a monitoring target (or target object) with high accuracy using data of a plurality of sensors (or sensor nodes) accumulated in the past.
  • FIG. 1 shows the configuration of an abnormality determination system according to a first embodiment of the present invention.
  • The flow of the training learning process by the server is shown.
  • The detailed processing flow of the pseudo-determination evaluation process is shown.
  • Details of step S205 of FIG. 3 are shown.
  • Processing added in various modifications is shown.
  • An example of the training data set in the training data storage unit is shown.
  • An example of waveform amplitudes is shown.
  • An example of a feature vector after conversion into a power spectrum is shown.
  • An example of waveform division is shown.
  • Another example of waveform division is shown.
  • Yet another example of waveform division is shown.
  • Division in the case of a power spectrum is shown.
  • An example of division of training data is shown.
  • Another example of division of training data is shown.
  • An example of finding nearest-neighbor segment data is shown.
  • The manner of performing the nearest-neighbor calculation is shown.
  • An example of a score table is shown.
  • An example of data (a determination model) in the best model storage unit is shown.
  • The processing of a first modification is shown.
  • The processing of a second modification is shown.
  • The processing of a third modification is shown.
  • The processing of a fourth modification is shown.
  • An example of the determination model according to a fifth modification is shown.
  • The concept of a model formula according to the fifth modification is shown.
  • An example of conditional probability calculation is shown.
  • The manner of performing the nearest-neighbor calculation is shown.
  • A format example of a frequency distribution table is shown.
  • The manner in which determination target data is scanned is shown.
  • The manner of cutting out data (a waveform) is shown.
  • An example of a display screen for the abnormality-determined portion of the determination target data is shown.
  • The operation flow in the client is shown.
  • An example of a hardware configuration for realizing the server and the client is shown.
  • An example of applying a segment template is shown.
  • The operation of the client according to a modification is shown.
  • The overall configuration of a remote monitoring system according to a second embodiment is shown.
  • An example of a format of sensor data is shown.
  • An example of the sensor data extracted in order to build a single-channel abnormality determination model is shown.
  • A typical genetic algorithm processing flow for solving the problem is shown.
  • An example of optimal division of feature regions in a training waveform using a genetic algorithm is shown.
  • An example of a format of extracted sensor data used to construct a decision fusion rule is shown.
  • An example of a database of decision fusion rules in which one line corresponds to one decision fusion rule is shown.
  • An example of a database of decision fusion rules in which one rule consists of many decision fusion rules is shown.
  • An example of converting a classification rule into a number of decision fusion rules is shown.
  • An example of coding in a genetic algorithm for constructing a classification rule is shown.
  • An example of the processing flow of a genetic algorithm for constructing a classification rule from a sensor data set is shown.
  • An example of generating offspring using crossover and mutation in a genetic algorithm is shown.
  • An example of evaluating candidate classification rules in a genetic algorithm is shown.
  • An example of coding based on S-expressions in genetic programming is shown.
  • An example of tree-based coding in genetic programming is shown.
  • An example of a processing flow of genetic programming for constructing a classification rule from sensor data is shown.
  • An example of classification rule evaluation in genetic programming is shown.
  • An example of generating offspring using crossover and mutation in genetic programming is shown.
  • The flow of processing inside the gateway is shown.
  • An example of abnormality determination on data from a sensor node (a test waveform) is shown.
  • An example of the format of data sent to the server at the remote monitoring site when the sensor node status matches at least one of the decision fusion rules is shown.
  • An example of the format of data sent to the server at the remote monitoring site when the state of the sensor node does not match any of the decision fusion rules is shown.
  • An example of the operation in the gateway is shown.
  • FIG. 1 shows a configuration of an abnormality determination system according to the first embodiment of the present invention.
  • This abnormality determination system includes a server (monitoring center device) and a client (remote monitoring terminal).
  • The server performs training learning using past sensor data (time-series data) obtained by observing the monitoring target, together with a class (abnormal or normal) identifying the status of the monitoring target when the sensor data was acquired, and thereby generates a determination model for determining new sensor data.
  • The client observes the monitoring target, acquires sensor data, and uses the acquired sensor data and the determination model to determine whether the monitoring target is normal or abnormal.
  • FIG. 2 is a flowchart showing the flow of the training learning process by the server.
  • First, the server reads various parameters set by the user (S101). For example, parameters such as the maximum waveform division number z_max (used in step S106) are read from a recording medium such as a memory or a hard disk.
  • Next, the waveform division number parameter z is set to 0 (S102).
  • The training data input unit 12 then reads the training data set from the training data storage unit 11 and inputs it to the waveform preprocessing unit 13 at the next stage (S103).
  • Fig. 6 shows an example of a training data set in the training data storage unit 11.
  • Each training data is composed of at least one type of time series data (sensor data) and a class.
  • The class is the determination result obtained when a maintenance person or the like judged the state of the target device (monitoring target) at the time the corresponding time-series data was acquired in the past.
  • Classes are, for example, abnormal and normal. However, there may be multiple types of abnormal state, such as abnormal type A and abnormal type B.
  • In this example, each training datum includes time-series data of four variables (channels).
  • The class of training data d1 to dN is normal, and the class of training data dN+1 to dM is abnormal.
  • The time-series data of the four variables are obtained from the corresponding four sensors.
  • Here the time-series data have the same size (length in the time-axis direction), but the size may differ for each variable (channel).
  • The waveform preprocessing unit 13 preprocesses each time-series datum included in the training data set (S104).
  • For example, a feature vector such as an amplitude spectrum may be acquired by signal processing such as power-spectrum conversion by FFT, short-time Fourier transform, or wavelet transform.
  • Alternatively, waveform amplitude values at a plurality of predetermined times may be acquired.
  • FIG. 7 shows examples of waveform amplitudes acquired at a plurality of predetermined times.
  • FIG. 8 shows an example of the feature vector after conversion to the power spectrum.
  • The waveform may be further processed using a low-pass (smoothing) filter. This is effective when noise is superimposed on the waveform amplitude or when it is desired to capture the general characteristics of the waveform.
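The preprocessing options above (power-spectrum conversion by FFT, low-pass smoothing) can be sketched as follows. This is an illustrative sketch; the use of numpy and the function names are assumptions of this illustration, not part of the patent.

```python
import numpy as np

def power_spectrum(waveform):
    """Convert a real waveform into a power-spectrum feature vector."""
    spec = np.fft.rfft(waveform)
    return spec.real ** 2 + spec.imag ** 2  # power at each frequency bin

def smooth(waveform, window=5):
    """Simple moving-average low-pass (smoothing) filter."""
    kernel = np.ones(window) / window
    return np.convolve(waveform, kernel, mode="same")
```

For a pure 4 Hz sine sampled over one second, the power spectrum peaks at frequency bin 4, which is the kind of feature the amplitude-spectrum vector captures.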
  • In steps S105 to S110, while the waveform division number z is sequentially increased from 1 to the maximum waveform division number z_max, the time-series data is divided into z sections, and an important section is determined for each variable at each division number z (1 ≤ z ≤ z_max). Then, the important section of each variable at the division number yielding the highest evaluation value is determined as the optimum section. In addition, the data portion (segment data) of each optimum section in each time-series datum of the training data set is stored in association with the corresponding class. Details of steps S105 to S110 are described below.
  • In step S105, the server increments the waveform division number z by one.
  • In step S106, the server determines whether the waveform division number z exceeds the maximum division number z_max. If it does, the process proceeds to step S111; otherwise, it proceeds to step S107.
  • In step S107, the waveform division unit 14 divides each time-series datum of the training data set on the time axis into z sections and cuts out segment data.
  • The division method is simple: the data is divided into sections of equal width. However, another division method may be used.
  • The extracted segment data is stored in the segment storage unit 15.
  • Each segment datum (partial time-series data) cut out in this way is stored in the segment storage unit 15 in association with the value of z, the training data ID, and the variable ID.
  • For a pre-processed time-series feature vector, the data may instead be divided along the frequency-axis direction, as shown in the corresponding figure.
  • In that case, dividing the time-series data means dividing it along the frequency-axis direction after the time-series data has been converted into a power spectrum.
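The equal-width division of step S107 can be sketched as follows. This is a minimal sketch; the function name is an assumption, and numpy's `array_split` is used so that a remainder that does not divide evenly is still handled (the patent only requires roughly equal widths and explicitly permits other division methods).

```python
import numpy as np

def divide_waveform(series, z):
    """Divide a time series into z roughly equal-width sections (segment data)."""
    return np.array_split(np.asarray(series), z)

segments = divide_waveform(range(12), 4)  # four sections of length 3
```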
  • In step S108, a pseudo determination evaluation process is performed by the pseudo determination evaluation unit 17, the probability/likelihood calculation unit 16, and the best model selection unit 19.
  • FIG. 3 is a flowchart showing a detailed process flow of the pseudo judgment evaluation process (S108).
  • First, the pseudo judgment evaluation unit 17 divides the (segmented) training data set into a plurality of divided sets, and labels the divided sets 1 to Vmax (S201).
  • One divided set may consist of a single piece of training data or of a plurality of pieces of training data.
  • In this example, the training data set is divided down to individual training data, so Vmax matches the total number of training data.
  • The identifier v is incremented by 1 (S203).
  • The pseudo judgment evaluation unit 17 selects, as the pseudo-determination target data set Tv, the divided set indicated by identifier v among those produced in step S201. That is, the divided sets are partitioned into the pseudo-determination target data set Tv and the remaining divided sets.
  • Hereinafter, "pseudo-determination target data Tv" refers to the case where the set Tv contains a single piece of training data.
  • Next, the pseudo judgment evaluation unit 17 performs a modeling process by training learning using leave-out cross validation, and obtains an evaluation value r (S205).
  • That is, the training data other than the pseudo-determination target data Tv are used to estimate (pseudo-determine) the class of Tv.
  • The evaluation value r is obtained by checking whether the estimated result matches the actual class of the pseudo-determination target data Tv.
  • Leave-one-out cross validation (in which each divided set contains only one training datum) is effective when the number of training data is small.
  • In step S207, the pseudo determination evaluation unit 17 adds the evaluation value r to the cumulative evaluation value q.
  • In step S208, the pseudo determination evaluation unit 17 determines whether the divided-set identifier v exceeds Vmax; that is, whether every divided set has been selected as the pseudo-determination target data set Tv. If v does not exceed Vmax, the process returns to step S203; if it does, the process proceeds to step S209.
  • In step S209, the pseudo determination evaluation unit 17 calculates the pseudo correct-answer rate Gz (average evaluation value) by dividing q by Vmax, the number of evaluations (number of divided sets). Thus one pseudo correct-answer rate Gz is obtained for each waveform division number z.
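The loop of steps S201 to S209 can be sketched as follows. This is an illustrative sketch under assumptions (function and variable names are made up; `pseudo_determine` stands in for the modeling process of step S205): each divided set is held out in turn, its evaluation value r is accumulated into q, and the pseudo correct-answer rate Gz is q divided by the number of divided sets.

```python
# Hedged sketch of the leave-out cross-validation loop (S201–S209).
def pseudo_correct_rate(divided_sets, pseudo_determine):
    q = 0.0
    for v, target in enumerate(divided_sets):
        rest = divided_sets[:v] + divided_sets[v + 1:]  # all other divided sets
        r = pseudo_determine(target, rest)  # evaluation value in [0.0, 1.0]
        q += r
    return q / len(divided_sets)  # Gz, the average evaluation value

# toy example: a stand-in determiner that is correct for 3 of 4 held-out sets
results = iter([1.0, 1.0, 0.0, 1.0])
Gz = pseudo_correct_rate([["d1"], ["d2"], ["d3"], ["d4"]],
                         lambda target, rest: next(results))
```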
  • In step S210, conditional probability calculation is performed (described later).
  • In step S211, important segment determination is performed (described later).
  • In step S109, the pseudo judgment evaluation unit 17 determines whether the pseudo correct-answer rate Gz calculated in step S209 is smaller than the pseudo correct-answer rate Gz-1 obtained at the previous waveform division number z-1.
  • If Gz is equal to or greater than Gz-1, the waveform division number z is incremented by 1, and the same procedure is repeated.
  • If the pseudo correct-answer rate Gz is smaller than Gz-1, it is determined that a larger pseudo correct-answer rate cannot be obtained, and the process proceeds to step S110.
  • (Step S205: modeling processing by training learning)
  • FIG. 4 is a flowchart showing details of step S205 in FIG.
  • A case where the waveform division number z is 4 will be described as an example.
  • Step S301 may be performed only once, and its processing may be skipped on subsequent iterations.
  • In step S302, the pseudo determination evaluation unit 17 performs initialization, setting the variable (channel) ID index i to 0 and the section ID index j to 0.
  • In step S303a, the pseudo judgment evaluation unit 17 increments the variable (channel) index i by 1, and in step S303b it increments the section index j by 1.
  • The pseudo determination evaluation unit 17 then estimates (pseudo-determines) the class of the pseudo-determination target data Tv using the k-nearest-neighbor method, which has a proven record in time-series classification problems.
  • In the k-nearest-neighbor method, the k cases closest to the pseudo-determination target in the feature space are extracted, and the class occupying the largest share among the classes of those k cases is determined as the estimated class of the target. This is described in detail below.
  • As the distance measure, a scale such as dynamic time warping (DTW) distance or Euclidean distance may be used.
  • For example, the distances from the training data d13, d14, d15, d17, and d16 to the variable-1 segment data s1 are calculated as 3.5, 9.3, 12.9, 13.2, and 14.1, respectively.
  • Here, dist(x, y) denotes the distance between segment data x and segment data y.
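A compact DTW distance and a k-nearest-neighbor class estimate along the lines just described can be sketched as follows. Function and variable names are illustrative, not from the patent, and the absolute difference is used as the local cost (an assumption).

```python
import math

def dtw(x, y):
    """Classic O(n*m) dynamic-time-warping distance between two sequences."""
    n, m = len(x), len(y)
    d = [[math.inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # insertion
                                 d[i][j - 1],      # deletion
                                 d[i - 1][j - 1])  # match
    return d[n][m]

def knn_class(query, labeled_segments, k=3):
    """labeled_segments: list of (segment, class). Returns the majority class
    among the k nearest segments, plus the normal/abnormal counts."""
    nearest = sorted(labeled_segments, key=lambda sc: dtw(query, sc[0]))[:k]
    counts = {"normal": 0, "abnormal": 0}
    for _, c in nearest:
        counts[c] += 1
    return max(counts, key=counts.get), counts

stored = [([1, 2, 3], "normal"), ([1, 2, 4], "normal"), ([9, 9, 9], "abnormal")]
cls, counts = knn_class([1, 2, 3], stored, k=2)
```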
  • In step S305, the pseudo judgment evaluation unit 17 updates the normal and abnormal frequency distribution tables, per variable and per section, based on the normal and abnormal frequencies obtained in step S304.
  • A format example of the frequency distribution table is shown in FIG. 27.
  • A frequency distribution table is prepared for each waveform division number z. Initially, all entries of the table are set to zero. In the above calculation example, 5 is added to the normal entry of section s1 in the distribution table of channel 1 (upper left of FIG. 27), and nothing is added to the abnormal entry.
  • Next, the pseudo evaluation determination unit 17 updates the score table according to whether the estimation in step S304 was correct.
  • The score table stores a score for each combination of channel and segment (section) selected in the course of the pseudo-determination evaluation (see the upper diagram of FIG. 17, described later).
  • A score table exists for each waveform division number z.
  • In step S309, it is determined whether the section index j has reached jmax. If not, the process returns to step S303b to increment j and select the next section; if so, the process proceeds to step S310.
  • FIG. 16 shows the nearest-neighbor calculation of step S304 for section s2 of variable 1 (channel 1). Here the estimation result is abnormal, and the pseudo-determination target data dN+1 is also abnormal, so the estimate is correct.
  • In step S310, it is determined whether the variable (channel) index i has reached imax. If not, the process returns to step S303a to select the next variable (channel); if so, the process proceeds to step S311. FIG. 26 shows the nearest-neighbor calculation of step S304 for section s2 of variable 3 (channel 3). Here the estimation result is normal while the pseudo-determination target data dN+1 is abnormal, so the estimate is incorrect.
  • In step S311, the evaluation value of the pseudo-determination target data Tv is calculated.
  • For example, the evaluation value r is set to 1.0 when the estimate is correct and 0.0 otherwise.
  • Alternatively, r may be set to 1.0 when the number of correct answers exceeds the number of incorrect answers, and 0.0 when it does not.
  • Alternatively, the ratio of the number of correct answers to the number of determinations may be used as the evaluation value r.
  • Steps S302 to S310 may also be performed for each training datum, after which an evaluation value is calculated according to the same criteria.
  • In step S207 of FIG. 3, the evaluation value r is added to q, updating q. The process then proceeds to selection of the next pseudo-determination target data Tv, and the flow of FIG. 4 is performed in the same way.
  • In step S210, the probability/likelihood calculation unit 16 calculates the normal and abnormal conditional probabilities p(X | C) for each combination of variable and segment, based on the frequency distribution tables updated in step S305 of FIG. 4. For example, for the pair of variable 2 and segment s2, p(X2s2 | C) is calculated.
  • The conditional probabilities for variable 2 and variable 3 are shown in the upper right of the center of FIG. 17; they are obtained from the illustrated frequency distribution f(X2s2 | C).
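The conversion of a frequency distribution table into conditional probabilities can be sketched as follows. This is a hedged sketch: the table layout and the normalization direction (each class's frequencies normalized over the sections of one variable) are assumptions based on the description of FIG. 27, not a statement of the patent's exact formula.

```python
# Illustrative conversion of one variable's frequency table into p(X | C).
def conditional_probs(freq_table):
    """freq_table: {section: {"normal": n, "abnormal": m}} for one variable.
    Returns {section: {"normal": p, "abnormal": q}}, where each class's
    frequencies are normalized over that variable's sections."""
    totals = {c: sum(row[c] for row in freq_table.values())
              for c in ("normal", "abnormal")}
    return {s: {c: (row[c] / totals[c] if totals[c] else 0.0)
                for c in ("normal", "abnormal")}
            for s, row in freq_table.items()}

probs = conditional_probs({"s1": {"normal": 5, "abnormal": 0},
                           "s2": {"normal": 3, "abnormal": 4}})
```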
  • In step S211, an important segment (important section) is determined for each variable. That is, once every training datum has been selected as the pseudo-determination target data Tv and the flow of FIG. 4 has been performed for each, a score table such as that in the upper part of FIG. 17 is finally obtained.
  • The pseudo judgment evaluation unit 17 selects the important segment (important section) of each variable (channel) based on this score table and records the information in the segment storage unit 15. Specifically, for each variable, the segment (section) with the highest score in the score table is selected as the important segment. For example, for variable (channel) 1, segment s1 has the highest score, so s1 is selected as the important segment. Similarly, for variables 2 to 4, segments s2, s2, and s4 are selected as important segments. Selection methods for the case where several segments have the same score, and other selection methods, are described later.
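The per-variable selection in step S211 is just an argmax over the score table, which can be sketched as follows. The dictionary layout is an illustrative assumption, and tie-breaking (here, whichever section comes first) is left open, as in the text.

```python
# Sketch of important-segment selection: highest score per variable (channel).
def important_segments(score_table):
    """score_table: {variable: {section: score}} -> {variable: best section}"""
    return {var: max(scores, key=scores.get)
            for var, scores in score_table.items()}

best = important_segments({
    1: {"s1": 9, "s2": 4, "s3": 2, "s4": 1},
    2: {"s1": 3, "s2": 8, "s3": 5, "s4": 2},
})
```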
  • When the important segment has been determined in step S211, this flow ends, and the process proceeds to step S109 in FIG. 2.
  • In step S109, as described above, it is checked whether the pseudo correct-answer rate Gz calculated in step S209 is smaller than the rate Gz-1 obtained at waveform division number z-1. If it is smaller, then in the next step S110 the best model selection unit 19 stores the important segment of each variable selected when Gz-1 was obtained in the best model storage unit 18 as the best segment (best section).
  • Likewise, when it is determined in step S106 that the waveform division number z is larger than the maximum waveform division number z_max, the best segment (best section) at Gz-1 is specified and stored in the same manner. Note that, because z was incremented in step S105, Gz-1 in this case corresponds to Gz_max.
  • In step S111, the best model selection unit 19 stores, in the best model storage unit 18, the normal and abnormal prior probability information and the per-variable conditional probabilities obtained in step S210 (those corresponding to the z from which the best model was obtained).
  • The conditional probabilities stored may be limited to those of the segment specified as the best segment of each variable.
  • A model formula (described later) used for determination by the client is also stored in the best model storage unit 18.
  • The model formula can be generated automatically once the best segment of each variable has been determined.
  • Furthermore, the best model selection unit 19 reads the segment data of the best section of each variable, and the corresponding classes, from the segment storage unit 15 and stores them in the best model storage unit 18.
  • The segment data to be read may cover all the training data, a predetermined number of training data for each of normal and abnormal, or may be determined by other criteria.
  • The best model selection unit 19 also stores in the best model storage unit 18 detailed information (time lengths) of the sections (segments) obtained at the selected waveform division number z; at least the detailed information of the section corresponding to the best segment is stored.
  • Here, segment s2 is specified as the best segment for all the variables.
  • In the model formula, the conditioning "| C" is omitted for simplification, and the probability is written simply as p(X1), and so on.
  • The transmission unit 20 transmits the determination model stored in the best model storage unit 18 to the client.
  • Alternatively, model formula template data may be given to the client in advance, and the client may generate the model formula from the template based on the best segment (best section) of each variable.
  • In that case, the server need not include the model formula in the determination model sent to the client.
  • When there are a plurality of clients, the determination model is transmitted to each of them. By receiving the determination model, a client becomes ready for abnormality determination.
  • A supplementary explanation of conditional probabilities follows.
  • When calculating the conditional probability p(Xi | C), the type of the attribute value of attribute Xi becomes a problem.
  • For a discrete attribute value ai of attribute Xi, the probability is calculated from how often ai occurs; in this embodiment, however, the segment data cut out from a time-series waveform has no attribute value in itself, so the probability cannot be calculated in this way.
  • It is conceivable to cluster the time-series waveforms into several categories and treat each category as an attribute value, as is done in the domains of time-series clustering and time-series classification, but it is not easy to decide how many category types to use. Moreover, as the number of divisions increases, a set of clusters must be obtained for each variable, and this grows with the number of variables handled simultaneously, which is unrealistic. Treating all the segments of every variable as attributes Xi is also conceivable, but raises the problem of increased probability-calculation cost.
  • FIG. 19 is a diagram for explaining an important segment determination method according to the first modification.
  • In this example, the scores of segments s1 and s4 of variable (channel) 4 are the same (9 points each).
  • For variables 1 to 3, segments s1, s2, and s2 are selected, respectively.
  • When scores tie in this way, the likelihood is calculated according to the following equation. Since the calculation of the equation yields a vector composed of the normal likelihood and the abnormal likelihood, the abnormal likelihood is selected and compared between the candidates.
  • Here, p(C) is the prior probability and p(Xj = si | C) is the conditional probability.
  • For the conditional probability, the value calculated in step S210 in FIG. 3 can be used.
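The likelihood comparison described above can be sketched in a few lines. The prior and conditional probability values below are hypothetical stand-ins for the values read from the probability tables of the determination model.

```python
def likelihood(prior, cond_probs):
    """Likelihood of a class: the prior p(C) times the product of the
    conditional probabilities p(Xj = s_ij | C) over all variables."""
    value = prior
    for p in cond_probs:
        value *= p
    return value

# Hypothetical probabilities for one segment combination.
normal = likelihood(0.9, [0.6, 0.7, 0.5])    # p(normal) * prod p(Xj | normal)
abnormal = likelihood(0.1, [0.8, 0.9, 0.9])  # p(abnormal) * prod p(Xj | abnormal)
state = "abnormal" if abnormal > normal else "normal"
```

The candidate (or state) with the larger resulting value is the one adopted.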
  • In the above, the important segment is specified by the maximum score. As another method, it is possible to select the combination of segments across the variables that yields the highest likelihood. This is because, although the best segment is selected for each variable individually, this does not always give the best determination accuracy when the pseudo-judgment is made over all the variables together.
  • That is, the likelihood of abnormality is calculated for all the combinations of segments across the variables (S222 in FIG. 5), and the combination with the highest likelihood is selected (S223 in FIG. 5).
  • (Part 2) The following method is also possible as a second modification.
  • The lower limit threshold θ of the evaluation score is determined in advance, all segments having a score larger than the lower limit threshold are selected as candidates, and the final important segment is determined from among the candidates.
  • FIG. 20 is a diagram for explaining an important segment determination method according to the second modification.
  • The lower threshold θ is set to 5. Segments with scores greater than 5 in each of the variables (channels) 1 to 4 are selected as candidates: for variable 1, segments s1 and s4; for variable 2, s2; for variable 3, s2; and for variable 4, s1, s3, and s4. When the selected segments are combined across the variables, the following six candidates c1 to c6 are obtained.
  • The likelihood of abnormality is calculated for each candidate in the same manner as in the first modification (S222 in FIG. 5). Then, the candidate with the highest likelihood is selected (S223 in FIG. 5).
  • the likelihood L1 of the candidate c1 is 0.013
  • the likelihood L2 of the candidate c2 is 0.031
  • the likelihood L3 of the candidate c3 is 0.024
  • the likelihood L4 of the candidate c4 is 0.062
  • the likelihood L5 of the candidate c5 is calculated as 0.033
  • the likelihood L6 of the candidate c6 is calculated as 0.093. Since the likelihood of the candidate c6 is the largest, the segments included in the candidate c6 are determined as the important segments. That is, segment s4 is the important segment for variable 1, s2 for variable 2, s2 for variable 3, and s4 for variable 4.
  • In this method, no important segment is selected for a variable (channel) that has no segment with a score above the threshold θ. Such a variable can be regarded as less necessary for abnormality detection, and there is the advantage that the data of that variable (channel) need not be used for abnormality determination.
  • a plurality of candidates may be selected.
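The candidate-generation step of this second modification (select every segment scoring above θ, then combine across variables) can be sketched as follows. The score-table values are hypothetical, chosen only to reproduce the six candidates c1 to c6 of FIG. 20.

```python
from itertools import product

theta = 5  # lower limit threshold of the evaluation score
# Hypothetical score tables: {segment: score} per variable (channel).
scores = {
    1: {"s1": 7, "s4": 9},
    2: {"s2": 8},
    3: {"s2": 6},
    4: {"s1": 9, "s3": 6, "s4": 9},
}

# Keep only segments whose score exceeds theta; a variable with no
# qualifying segment is simply excluded from the determination.
qualified = {v: [s for s, sc in segs.items() if sc > theta]
             for v, segs in scores.items()}
qualified = {v: segs for v, segs in qualified.items() if segs}

variables = sorted(qualified)
candidates = [dict(zip(variables, combo))
              for combo in product(*(qualified[v] for v in variables))]
```

Each candidate (one segment per remaining variable) would then be scored by the abnormal likelihood, and the highest-likelihood candidate kept.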
  • (Modification 3) Depending on the sensing target (monitoring target), it may not be necessary to consider the time differences between the sensors. In that case, a segment at the same position may be selected in each variable (channel), and estimation may be performed by the k-nearest neighbor method on that common segment position.
  • The conditional probability is calculated in the flow of FIG. 4 corresponding to step S205 of FIG. 3 (the conditional probability is calculated per divided-set identifier v).
  • The normal and abnormal likelihoods are calculated by multiplying, across the variables, the normal and abnormal conditional probabilities of the same segment (for example, s2), and further multiplying by the respective prior probabilities; the state with the larger value is adopted. This is shown by Equation 1-3 below.
  • the score table is updated (the size of the score table is 1 ⁇ 4 compared to 4 ⁇ 4 in the first embodiment).
  • Formula 1-4, which is obtained by removing the prior probability p(C) from the estimation formula shown in Formula 1-3, may be used instead of Formula 1-3.
  • FIG. 22 is a diagram for explaining a method for determining the best segment according to the fifth modification.
  • the segment length and provisional position of the best segment are determined in advance for each variable.
  • Each segment (section) with a predetermined segment length is placed at its provisional position, each segment is shifted back and forth along the time axis from the provisional position in units of the minimum movement interval Δ, and the position where the pseudo-judgment evaluation value Gz is best is obtained.
  • the combination is determined, thereby determining the best segment.
  • the maximum width of the shift is the absolute value of the segment length.
  • While moving in units of Δ, the combination with the highest likelihood of abnormality is selected from all the combinations of segment positions across the variables. In this way, every possible combination of segment positions is searched, and the best segment of each variable is determined by pseudo-judgment evaluation and comparison.
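The Δ-unit position search of this modification can be sketched as a grid search over shifted start positions. The `evaluate` callback is a stand-in assumption for the pseudo-judgment evaluation (the actual evaluation value Gz comes from the pseudo-judgment described earlier); the positions and shapes here are hypothetical.

```python
from itertools import product

def best_positions(provisional, seg_len, delta, evaluate):
    """Search segment start positions around the provisional positions.

    Each variable's segment is shifted from its provisional position in
    steps of `delta`, up to the segment length in either direction, and
    the combination with the best evaluation score is kept.
    `evaluate` maps a tuple of start positions to a score (higher is better).
    """
    offsets = range(-seg_len, seg_len + 1, delta)
    grids = [[p + o for o in offsets] for p in provisional]
    return max(product(*grids), key=evaluate)

# Hypothetical evaluation: prefer positions close to (12, 30).
target = (12, 30)
score = lambda pos: -sum(abs(a - b) for a, b in zip(pos, target))
best = best_positions([10, 28], seg_len=4, delta=2, evaluate=score)
```

As the text notes, the maximum shift width equals the segment length, which bounds the size of each grid.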
  • The variable dependency relationship, that is, the dependency relationship between the sensors, is specified to the server in advance by the user.
  • the likelihood is calculated by Expression 1-5.
  • The normal and abnormal conditional probabilities of the variable X2 given the variable X3 are also stored in the best model storage unit 18 and included in the determination model.
  • FIG. 24 schematically shows the concept of the model expression of Expression 1-6.
  • In Expression 1-6, the dependency of the variable X2 on the variable X3 is expressed by the conditional probability p(X2 = s2 | X3 = s2, C), which is used in place of p(X2 = s2 | C).
  • p(X2 = s2 | X3 = s2, C) is calculated as the frequency f(X2 = s2, X3 = s2, C) divided by the frequency f(X3 = s2, C), and the frequency tables of f(X2 = s2, X3 = s2, C) and f(X3 = s2, C) are obtained from the training data.
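The frequency-ratio estimate of a dependent conditional probability can be sketched as follows; the frequency counts are hypothetical values of the kind that would be read from the training-data frequency tables.

```python
def cond_prob(freq_joint, freq_parent):
    """p(X2=s2 | X3=s2, C) estimated as f(X2=s2, X3=s2, C) / f(X3=s2, C)."""
    return freq_joint / freq_parent if freq_parent else 0.0

# Hypothetical frequency counts from a training frequency table.
f_joint_normal = 18   # f(X2=s2, X3=s2, C=normal)
f_parent_normal = 24  # f(X3=s2, C=normal)
p = cond_prob(f_joint_normal, f_parent_normal)  # 0.75
```

Guarding against a zero parent frequency avoids division by zero when a parent segment never occurs in the training data for a class.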
  • the client includes a sensing unit 30 that newly senses data using a plurality of sensors, and stores data sensed by the sensing unit 30 (this data will be referred to as determination target data) in the sensing data storage unit 31.
  • the determination target data input unit 32 monitors whether or not the determination target data is stored in the sensing data storage unit 31. If new determination target data is input, the determination target data is read and input to the waveform preprocessing unit 33.
  • the waveform preprocessing unit 33 performs waveform preprocessing in the same manner as described in the waveform preprocessing unit 13 of the server.
  • the model receiving unit 34 receives the determination model sent from the server.
  • the determination model storage unit 35 stores the determination model received by the model reception unit 34.
  • Based on the segment template, which includes the best segment of each variable included in the determination model, the segment selection unit 36 scans the determination target data at regular time intervals, cuts out the data, and outputs it to the abnormality determination unit 39.
  • The abnormality determination unit 39 calculates the abnormal and normal likelihoods based on the cut-out data of each variable input from the segment selection unit 36 and the determination model. The abnormality determination unit 39 determines abnormal if the abnormal likelihood is higher than the normal likelihood, and normal otherwise.
  • For each variable, the distances between the cut-out data and the corresponding segment data in the determination model are compared, and the determination results (normal or abnormal) of the top k′ closest segments are identified. Here, 1 ≤ k′ ≤ k, and k and k′ are set in advance as system parameters. Then, likelihood calculation is performed using the model formula (2) in FIG. 18 or (2) in FIG. 23, the normal likelihood and the abnormal likelihood are calculated, and the result is output as the determination result. When both likelihood values are the same, a predetermined one of the two states is taken as the determination result.
  • That is, the ratio of normal to abnormal among the k′ nearest neighbors is used as the conditional probability p(Xnew = s2 | C).
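The k′-nearest-neighbor estimate of the normal and abnormal conditional probabilities might be sketched like this. Euclidean distance and the sample segments are assumptions for illustration; the embodiment may use other distance measures such as DTW.

```python
def knn_cond_prob(query, training, k_prime):
    """Estimate p(normal) and p(abnormal) for a new segment from the
    labels of its k' nearest training segments (Euclidean distance)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    neighbors = sorted(training, key=lambda t: dist(query, t[0]))[:k_prime]
    n_normal = sum(1 for _, label in neighbors if label == "normal")
    return n_normal / k_prime, (k_prime - n_normal) / k_prime

# Hypothetical stored segments with their labels.
training = [([1.0, 1.1], "normal"), ([0.9, 1.0], "normal"),
            ([3.0, 3.2], "abnormal"), ([1.2, 0.9], "normal"),
            ([2.9, 3.1], "abnormal")]
p_normal, p_abnormal = knn_cond_prob([1.0, 1.0], training, k_prime=3)
```

The two ratios would then be plugged into the model formula in place of the conditional probabilities.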
  • The segment selection unit 36 and the abnormality determination unit 39 cut out segment data by sliding the segment template over the determination target data, in the manner of a sliding window, and calculate the likelihood using the determination model and the segment data.
  • the abnormality determination unit 39 outputs the determination results for the number of times scanned.
  • FIG. 29 shows a case where the data is determined to be normal in a certain part of the first half but abnormal in the second half.
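The repeated scan with the segment template can be sketched as a sliding window that yields one determination per position. The `judge` function and the series values are hypothetical stand-ins for the model-based determination.

```python
def scan(data, window, step, judge):
    """Slide a window of length `window` over `data` in steps of `step`
    and return one determination result per position."""
    results = []
    for start in range(0, len(data) - window + 1, step):
        results.append(judge(data[start:start + window]))
    return results

# Hypothetical judge: "abnormal" when the window mean exceeds 1.0.
series = [0.2, 0.3, 0.1, 0.2, 1.8, 2.0, 1.9, 2.2]
verdicts = scan(series, window=4, step=2,
                judge=lambda w: "abnormal" if sum(w) / len(w) > 1.0
                else "normal")
```

As in FIG. 29, a run that is normal early on can flip to abnormal in later windows.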
  • the notification display unit 38 notifies or displays the determination result by the abnormality determination unit 39.
  • FIG. 30 shows a screen display in the case of highlighting only the abnormality determination portion of the determination target data.
  • the determination result is notified, for example, to a remote monitoring terminal or a maintenance staff or a staff of the server using a display or a speaker.
  • the device control unit 37 controls the operation to be monitored according to the determination result by the abnormality determination unit 39. For example, when it is determined that there is an abnormality, the monitoring target is urgently stopped.
  • The determination result storage unit 40 accumulates the determination result of the abnormality determination unit 39 together with time-series data of a predetermined time length (for example, the same time length as the already stored training data) of each variable extracted for the determination from the determination target data.
  • the determination result transmission unit 41 transmits the time series data of each variable and the corresponding determination result to the server.
  • the server determination result receiving unit 22 receives the time-series data of each variable transmitted from the client and the corresponding determination result, and the notification unit 21 displays or notifies the monitoring member of these. After the monitoring person confirms that the determination result is correct, the notification unit 21 adds the time series data and the determination result to the training data storage unit 11 of the server in response to an instruction input from the monitoring person.
  • the determination result is corrected according to the instruction input from the observer, and the time series data and the corrected determination result are stored in the training data storage unit 11.
  • the training data input unit 12 of the server may detect that the data in the training data storage unit 11 has been updated and recalculate the determination model.
  • In this way, the judgment model can be refined, that is, its accuracy can be improved continuously. This means that the abnormality determination accuracy may improve during daily monitoring operation, which is considered particularly effective in areas where high abnormality determination performance is required.
  • FIG. 31 is a flowchart showing an operation flow from the input of the determination target data in the client until the determination by the abnormality determination unit 39 is performed.
  • the determination target data input unit 32 reads the determination target data in the sensing data storage unit 31 and inputs it to the waveform preprocessing unit 33 (S401).
  • the waveform preprocessing unit 33 performs preprocessing on the determination target data (S402), and the segment selection unit 36 extracts data based on the segment template including the best segment of each variable included in the determination model (S403). Then, the cut out data is input to the abnormality determination unit 39 (S404).
  • the abnormality determination unit 39 calculates k'-nearest neighbor (S407), and calculates a conditional probability (the ratio of normal and abnormal based on k'-nearest neighbor) (S408).
  • Using the results of S407 and S408, the likelihoods are calculated according to the above-described model formula (S410).
  • The abnormality determination unit 39 compares the abnormal likelihood with the normal likelihood and outputs the state with the larger value as the determination result (S411).
  • FIG. 32 shows an example of a hardware configuration for realizing a server and a client.
  • the server includes a CPU 51, RAM 52, ROM 53, HDD 54, I / O 55, display 56, speaker 57, I / O controller 58, and network interface 59.
  • the training data storage unit 11 and the segment storage unit 15 of the server are configured by the HDD 54, for example.
  • the model transmitting unit 20 and the determination result receiving unit 22 can be configured by a network interface 59.
  • the other elements 12, 13, 14, 16, 17, 19, and 21 can be configured by logic circuits as program modules that are executed by the CPU 51, for example.
  • the program modules are stored in the ROM 53 or the HDD 54, read out by the CPU 51, developed in the RAM 52, and executed, whereby the operations of the corresponding logic circuits are realized.
  • the client has a CPU 61, RAM 62, ROM 63, HDD 64, I / O controller 65, display 66, speaker 67, I / O 68, and network interface 69.
  • the client sensing data storage unit 31, the determination model storage unit 35, and the determination result storage unit 40 can be configured by the RAM 62 or the HDD 64.
  • the model receiving unit 34 and the determination result transmitting unit 41 can be configured by a network interface 69.
  • Other elements 32, 33, 36, 37, and 38 can be configured by a logic circuit as a program module to be executed by the CPU 61, for example.
  • the program modules are stored in the ROM 63 or the HDD 64, and the CPU 61 reads the program modules, develops them in the RAM 62, and executes them, thereby realizing the operations of the corresponding logic circuits.
  • the server and the client are separated. However, some or all of the server functions may be performed by the client, and some or all of the client functions may be performed by the server.
  • execution of a process by a computer includes a case where a single computer executes the process, and a case where the process is distributed and executed by a plurality of computers.
  • the likelihood is calculated by applying the segment template at regular intervals on the determination target data.
  • With this method, however, the time required for the determination sometimes exceeds the allowable upper limit.
  • Information processing equipment such as a remote monitoring terminal installed in the field has severe restrictions on the computing resources (memory, CPU, etc.) that can be used for the determination processing. If the monitored device (equipment) must be stopped immediately in case of an abnormality, or if the communication path from the field terminal to the monitoring center server has limited communication performance, scanning at regular intervals and sending all the results to the server is unrealistic.
  • Therefore, an upper limit threshold is determined in advance for each variable, and the upper limit threshold of each variable is stored in the determination model storage unit 35. When the value of any one variable, or of all the variables, exceeds its upper threshold, the device control unit 37 takes a measure such as an emergency stop of the device.
  • FIG. 33 shows an example in which the segment template is applied so as to select the portion that exceeds the upper threshold in each variable.
  • That is, the segment template is applied so as to include the portion exceeding the upper threshold, the data is cut out as shown in FIG. 33, and the abnormality determination unit 39 performs the determination based on the cut-out data.
  • When the determination result is normal, the emergency stop (control operation) is automatically canceled by the device control unit 37.
  • the upper threshold may be given in advance by the user.
  • some threshold candidates may be selected and compared using the above-described method of dividing and determining training data, and a threshold having the best pseudo-determination performance may be adopted from the compared candidates.
  • FIG. 34 is a flowchart showing an example of the operation of the client according to this modification.
  • The determination target data input unit 32 monitors whether sensor data has been input to the sensing data storage unit 31 (S501). When sensor data has not been input, it is confirmed whether a stop instruction has been input from a supervisor or the like; if it has, this flow is terminated, and if not, the process returns to S501.
  • the determination target data input unit 32 reads the sensor data from the sensing data storage unit 31 as determination target data, and outputs it to the abnormality determination unit 39 via the waveform preprocessing unit 33.
  • the abnormality determination unit 39 determines whether all or any one or more of the variables exceed the respective upper thresholds while moving the segment template (S505). When it does not exceed, the process returns to step S501, and when it exceeds, emergency stop of the monitoring target device is performed via the device control unit 37 (S506).
  • The segment selection unit 36 cuts out the data of each variable at the position of the segment template where the upper limit threshold was determined to be exceeded and sends it to the abnormality determination unit 39, which performs the determination based on the determination model (S507).
  • When the state is determined to be normal, the abnormality determination unit 39 cancels the emergency stop or the like via the device control unit 37 (S509), and stores the determination result together with data of a certain time length including the extracted data in the determination result storage unit 40 (S511). On the other hand, when it is determined that there is an abnormality, the abnormality determination unit 39 notifies or displays that fact via the notification display unit 38 (S510).
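The flow of S501 to S511 might be condensed into a single monitoring step as below. The callback names (`stop`, `resume`, `determine`) are illustrative only, not part of the embodiment.

```python
def monitor_step(values, upper, stop, resume, determine):
    """One pass of the modified client loop: if any variable exceeds its
    upper threshold, stop the device, run the model-based determination
    on the cut-out data, and cancel the stop when the verdict is normal."""
    if not any(v > upper[name] for name, v in values.items()):
        return None                 # keep monitoring (back to S501)
    stop()                          # emergency stop (S506)
    verdict = determine(values)     # model-based determination (S507)
    if verdict == "normal":
        resume()                    # cancel the emergency stop (S509)
    return verdict

# Hypothetical channel readings and thresholds.
log = []
verdict = monitor_step(
    {"ch1": 4.2, "ch2": 0.9}, {"ch1": 3.0, "ch2": 2.0},
    stop=lambda: log.append("stop"), resume=lambda: log.append("resume"),
    determine=lambda v: "normal")
```

Stopping first and only then running the (slower) model-based determination reflects the safety-first ordering of the flowchart.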
  • As described above, according to this embodiment, the determination can be performed while utilizing the set of accumulated multi-channel sensor data, the abnormality determination results (classes) corresponding to the data, and the dependency relationships between the sensors, which improves the accuracy of the determination.
  • Moreover, the determination can be performed in a form that can indicate its basis, namely which part of the waveform data (the positional relationship between the sections of each variable) contributed to the determination.
  • a large number of sensor nodes are widely used to monitor various features of the object in order to improve the performance of detecting the state (normal or abnormal) of the target object.
  • These applications include object tracking, image recognition, collision avoidance in vehicles, remote monitoring of areas, and remote monitoring of plant operations.
  • the definition of the target object varies depending on the problem. For example, in remote monitoring of a train station area, the target object is a person, and in the collision avoidance in the vehicle, the target object is a vehicle.
  • the target object can be a component in the device.
  • Many target objects may also exist in a remote monitoring system.
  • the term “sensor node” is used to mean a sensor setup that monitors the state of a feature in a target object.
  • the data from the sensor node is sent to a server at the remote monitoring center because computing resources in the remote monitoring device (eg, a remote listening device) are limited.
  • the remote monitoring center server analyzes the data and takes appropriate action.
  • This approach is not feasible if the sensor nodes generate a large amount of data and the communication bandwidth is constrained.
  • a gateway that transmits data from the sensor node.
  • a false alarm (FA)
  • An abnormal event in the sensor node may be triggered by an abnormality in some other sensor nodes.
  • the abnormality of the other sensor node is the cause of the abnormality of the target object.
  • the determination of abnormal events and their causes is very important, especially when equipment or plant operations are linked with human safety and security.
  • Bayesian network is widely used to show the causal relationship in sensor nodes, and conditional probability table (CPT) is used to infer the cause of abnormal events in sensor nodes.
  • A Bayesian network can be created manually. However, if the number of sensor nodes is very large and the relationships between the sensor nodes are hidden, it is virtually impossible to construct a Bayesian network manually.
  • a Bayesian network may be automatically constructed from data, but creating a Bayesian network from data is a difficult problem. Therefore, for a remote monitoring system including a large number of sensor nodes, it is not feasible to construct an optimal Bayesian network and to infer the cause of abnormality of sensor nodes used in the Bayesian network.
  • the sensor network consists of many sensor nodes, and the target object may be monitored by many sensors. In such a large network, not all sensor nodes may be necessary to find anomalous events in the target object. The removal of unnecessary sensor nodes leads to a reduction in the cost of the monitoring system.
  • more expensive sensors may be used to monitor the state of the object.
  • Identifying, from the sensor network, a set of sensors that can substitute for an expensive sensor is very helpful in reducing the cost of the sensor network.
  • In that case, the target object is the expensive sensor node.
  • In this embodiment, an abnormal event of a target object is detected efficiently and reliably, the cause of the abnormality (the causing sensor node) is identified from among many sensor nodes, the communication overhead of data transmission from the gateway to the server in the remote monitoring center is reduced, and sensor nodes unnecessary for the target object can be identified.
  • FIG. 35 shows the configuration of the abnormality determination system according to this embodiment.
  • This system detects the abnormal state (or abnormal event) of the target object in the monitoring site or plant, and identifies the cause of the abnormal state (hereinafter, the cause may also be referred to as a judgment basis).
  • This system comprises a gateway (client) 100 at the monitoring site and a server 200 at the remote monitoring center.
  • The gateway 100 includes a single-channel abnormality determination unit 102, a comprehensive determination unit 103 that performs comprehensive abnormality determination and basis identification using a decision fusion rule, a data filtering unit 104, an abnormality determination model database 105, a decision fusion rule database 106, and a receiving unit 107.
  • The remote monitoring center server 200 includes a receiving unit 205 that receives data from the gateway 100, a sensor data database 201 that holds sensor data, a determination result and determination basis database 204, a single-channel abnormality determination model learning unit 203, a decision fusion rule learning unit 202, and a transmission unit 206.
  • FIG. 36 shows an example of the sensor data database 201.
  • The database 201 stores, each with a time stamp, the sensor data observed from each sensor node, a status label indicating whether each piece of sensor data is normal or abnormal (first label), and a determination label indicating whether the target object is normal or abnormal (second label).
  • the determination label may be referred to as a class label.
  • the judgment label of the target object is given by the maintenance staff or the staff at the monitoring site after confirming the actual state of the target object at the time indicated in the time stamp.
  • the state label of each sensor node is determined according to the criteria (model, classifier) prepared for each sensor node. For example, when the reference is a threshold value, the status label is determined to be abnormal if the sensor data value exceeds the threshold value, and normal if it does not exceed the threshold value.
  • the determination of the status label may be automatically determined and given by the apparatus, or may be given by a maintenance person or an attendant.
  • As the sensor node, various types of sensors such as a sound sensor, a vibration sensor, and a temperature sensor can be used. The outputs from the sound sensor and the vibration sensor are waveform data; the output from the temperature sensor is a value integrated along the time axis.
  • the single-channel abnormality determination model learning unit 203 in the server 200 learns a single-channel abnormality determination model (classifier) that classifies each sensing data for each sensor node in the sensor data database 201 as abnormal or normal.
  • the single channel abnormality determination model learning unit 203 transmits the single channel abnormality determination model generated for each sensor node to the gateway 100 via the transmission unit 206.
  • the receiving unit 107 of the gateway 100 receives a single channel abnormality determination model (classifier) for each sensor node and stores it in the abnormality determination model database 105.
  • In order to learn a single-channel abnormality determination model (classifier), the single-channel abnormality determination model learning unit 203 first extracts, for each sensor node, the data and state labels recorded at various time stamps from the sensor data database 201. An example of the data extracted for one channel (sensor node) is shown in FIG. The unit then learns different types of classifiers according to the type of the data (for example, whether the data is waveform data).
  • A predetermined threshold is used as a classifier for classifying the data as abnormal or normal. It is very difficult to determine an appropriate threshold: if the threshold is set to a very high value, many false negatives are predicted, and if it is set to a low value, many false positives are predicted.
  • a method for determining the optimum threshold value from the extracted data and the state label will be described.
  • This method is based on the method of selecting features for optimal splitting of training data used in C4.5.
  • C4.5 is disclosed in "C4.5: Programs for Machine Learning" by Quinlan [Morgan Kaufmann Publishers, 1993].
  • First, the values of the training data are sorted. For example, when the data on the left of FIG. 38 is sorted, the result is as shown in the middle of FIG. 38.
  • the median (breakpoint) of the value intervals with different state labels is calculated. For example, since the states of ID9 and ID8 are different in the figure, the median value of the data of ID9 and ID8 is “2.1”.
  • the median value thus calculated is a candidate threshold value.
  • Each candidate threshold value is evaluated by an index such as accuracy, F-score, geometric mean, or AUCB, and a median value (candidate threshold value) that returns the best score (fitness) is selected as the optimum threshold value.
  • AUCB is based on Paul et al., "Genetic algorithm based methods for identification of health risk factors aimed at preventing metabolic syndrome" [SEAL '08: Proceedings of the 7th International Conference on Simulated Evolution and Learning, Berlin, Springer-Verlag, 2008].
  • N_TP is the number of true positives, N_TN is the number of true negatives, N_FP is the number of false positives, and N_FN is the number of false negatives.
  • candidate thresholds 3.0 and 3.6 have the highest score, so one of these is selected as the optimal threshold.
  • the selection may be random or user specified.
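The breakpoint-based threshold search described above (sort, take the midpoints where the state label changes, and score each candidate) can be sketched as follows. Accuracy is used as the evaluation index here, though the text also allows F-score, geometric mean, or AUCB; the sample data is hypothetical and chosen so that two candidates tie, as in the example with 3.0 and 3.6.

```python
def optimal_thresholds(samples):
    """Candidate thresholds are the midpoints between adjacent sorted
    values whose state labels differ; each candidate is scored here by
    classification accuracy (value > threshold => abnormal), and all
    best-scoring candidates are returned."""
    ordered = sorted(samples)                         # (value, label) pairs
    candidates = [(a[0] + b[0]) / 2
                  for a, b in zip(ordered, ordered[1:]) if a[1] != b[1]]

    def accuracy(th):
        correct = sum(1 for v, lab in samples
                      if (lab == "abnormal") == (v > th))
        return correct / len(samples)

    best = max(accuracy(th) for th in candidates)
    return [th for th in candidates if accuracy(th) == best], best

# Hypothetical training values with state labels.
samples = [(1.8, "normal"), (2.4, "normal"), (2.9, "normal"),
           (3.4, "normal"), (3.1, "abnormal"), (4.1, "abnormal"),
           (5.0, "abnormal")]
thresholds, score = optimal_thresholds(samples)
```

When several thresholds tie, one is picked at random or by the user, as the text states.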
  • When the output from the sensor node is waveform data, special consideration is necessary in constructing the classifier (abnormality determination model).
  • The waveform data is first processed by a signal processing technique such as moving average, discrete wavelet transform (DWT), or short-time Fourier transform (STFT), and the classifier (abnormality determination model) is learned in the next step.
  • the simplest method for learning the classifier (abnormality determination model) in the case of waveform data is the threshold method.
  • In this threshold method, the highest peaks of the waveform data measured at various time stamps are acquired, and the optimum threshold is learned using the method described above.
  • Another possible technique is to extract many feature values, such as the maximum and minimum amplitude, the average and standard deviation, and the area under the waveform, from the waveform data, and to learn the classifier (abnormality determination model) based on the extracted feature values.
  • One example of a classifier that can be used to classify waveform data is the k-nearest neighbor (kNN) classifier.
  • The k-nearest neighbor (kNN) classifier uses dynamic time warping (DTW), which can handle variable-length partial waveforms, as a distance measure.
  • The k-nearest neighbor (kNN) classifier is disclosed by Dasarathy in "Nearest Neighbor (NN) Norms: NN Pattern Classification Techniques" [IEEE Computer Society Press, 1991].
  • DTW is also disclosed in “A comparative study of several dynamic time-warping algorithms for connected word recognition” [The Bell System Technical Journal, 60 (7): 1389-1409, September 1981] by Myers and Rabiner.
  • DTW operations are very slow when the database and/or the number of observation points is very large. So, instead of DTW, faster distance calculation methods such as cross-correlation, Euclidean distance, or functions based on the t-statistic or signal-to-noise ratio (SNR) may be used to calculate the distance between two waveforms.
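For concreteness, a minimal version of the DTW distance mentioned above: this is the textbook dynamic-programming formulation with absolute difference as the local cost, not necessarily the exact variant of Myers and Rabiner.

```python
def dtw_distance(a, b):
    """Classic dynamic-time-warping distance between two sequences,
    using absolute difference as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

# The same shape shifted in time has DTW distance 0, while a pointwise
# comparison of the two sequences accumulates a large difference.
x = [0, 0, 1, 2, 1, 0, 0]
y = [0, 1, 2, 1, 0, 0, 0]
d_warp = dtw_distance(x, y)                       # 0.0 after warping
d_point = sum(abs(p - q) for p, q in zip(x, y))   # 4 without warping
```

The O(n·m) table is exactly why DTW becomes slow on long waveforms, motivating the faster distances listed above.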
  • When the waveform contains many data points, it takes a very long execution time to determine whether a test waveform is abnormal; using feature regions of the waveform makes the abnormality determination more accurate and faster.
  • The feature regions can be extracted by using an optimization algorithm such as a genetic algorithm (GA). Genetic algorithms are disclosed in "Adaptation in Natural and Artificial Systems" by Holland [University of Michigan Press, Ann Arbor, Michigan, 1975] and "Genetic Algorithms in Search, Optimization, and Machine Learning" by Goldberg [Addison-Wesley (Reading, MA), 1989].
  • FIG. 39 is a flowchart showing a general processing flow of the genetic algorithm.
  • First, control parameter values such as the candidate solution size (population size), offspring size, maximum number of generations, crossover probability, and mutation probability are initialized (S1002).
  • an initial candidate solution is randomly generated (S1003).
  • the group of generated initial candidate solutions corresponds to the initial population.
  • each candidate solution is evaluated and each fitness is calculated (S1004).
  • While the termination criterion is not satisfied (NO in S1005), several candidate solutions are selected from the previous-generation population to generate new candidate solutions (descendants) (S1006).
  • the selection is made according to a predetermined criterion based on the score (fitness) of each candidate solution. For example, a predetermined number of candidate solutions or a candidate solution having a fitness equal to or greater than a predetermined value may be selected as a predetermined criterion.
  • Next, new candidate solutions (offspring) are generated by applying crossover and mutation operators to the selected candidate solutions (S1007). Then, the new candidate solutions (offspring) are evaluated in the same manner as in step S1004, and each fitness is calculated (S1008).
  • a new set of candidate solutions (new population) is generated by combining the candidate solution selected from the previous generation population and the newly generated candidate solution (S1009).
  • The candidate solutions are selected according to a predetermined criterion; for example, a predetermined number of candidate solutions having high fitness, or candidate solutions having a fitness equal to or higher than a predetermined value, are selected.
  • the same candidate solution selected in step S1006 may be selected.
  • When the termination criterion is satisfied (YES in S1005), the best candidate solution (for example, the candidate solution having the highest fitness) is obtained as the best solution (S1010).
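The flow S1002–S1010 above can be sketched as a minimal genetic algorithm loop. This is an illustrative sketch, not the patent's implementation; the bit-string encoding and the toy fitness function (`sum`, i.e., counting 1 bits) are assumptions for demonstration.

```python
import random

def genetic_algorithm(fitness, string_len=8, pop_size=20, n_offspring=20,
                      max_gen=50, p_cross=0.9, p_mut=0.05, seed=0):
    rng = random.Random(seed)                      # S1002: control parameters
    pop = [[rng.randint(0, 1) for _ in range(string_len)]
           for _ in range(pop_size)]               # S1003: initial population
    for _ in range(max_gen):                       # S1005: termination criterion
        scored = sorted(pop, key=fitness, reverse=True)   # S1004: evaluation
        parents = scored[:pop_size // 2]           # S1006: selection
        children = []
        while len(children) < n_offspring:         # S1007: crossover + mutation
            a, b = rng.sample(parents, 2)
            if rng.random() < p_cross:
                cut = rng.randrange(1, string_len)
                a = a[:cut] + b[cut:]
            children.append([1 - g if rng.random() < p_mut else g for g in a])
        # S1008-S1009: evaluate offspring and form the new population
        pop = sorted(parents + children, key=fitness, reverse=True)[:pop_size]
    return max(pop, key=fitness)                   # S1010: best solution

best = genetic_algorithm(fitness=sum)  # toy fitness: number of 1 bits
```

Because the parents are carried over into the new population, the best fitness in the population never decreases from one generation to the next.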
  • FIG. 40 specifically shows an example of optimal segmentation of a waveform using a genetic algorithm (GA).
  • First, a plurality of candidate solutions (here, 50) are created by extracting partial waveforms arbitrarily (randomly) from a plurality of waveforms with different time stamps (S1011).
  • In each candidate solution, only one partial waveform is taken from each waveform.
  • the width of the waveform to be cut out may be constant or not constant.
  • each candidate solution is evaluated by the k-nearest neighbor method (S1012).
  • In the k-nearest neighbor method, for example, the partial waveforms included in the candidate solution are classified; that is, classification statistics such as the number of true positives (N_TP), the number of true negatives (N_TN), the number of false positives (N_FP), and the number of false negatives (N_FN) are calculated, and the fitness of the candidate solution is calculated based on these statistics. For the fitness, for example, various indexes such as the accuracy described above can be used.
  • An example of the process performed in step S1012 is shown below. However, the example described here is merely an example, and the present invention is not limited to this.
  • one of the partial waveforms 1 to 4 included in candidate 1 (here, partial waveform 4) is removed.
  • the top k partial waveforms closest to the partial waveform 4 are selected from the remaining partial waveforms.
  • Here k = 3, so all the remaining partial waveforms are selected.
  • Each state (normal or abnormal) of the selected partial waveform is specified, the total number of normals and the total number of abnormalities are calculated, and the larger state is selected.
  • the actual state (determination label) of the partial waveform 4 is compared with the selected state. If they match, the answer is correct, and if they do not match, the answer is incorrect.
  • the partial waveforms 1 to 3 are also selected in order and compared to identify the correct or incorrect answer. The ratio of the number of correct answers to the number of comparisons is calculated, and this ratio is set as the fitness of candidate 1.
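The leave-one-out evaluation of step S1012 described above can be sketched as follows. The representation of a candidate solution as a list of (partial_waveform, label) pairs and the use of Euclidean distance are illustrative assumptions.

```python
def distance(a, b):
    """Euclidean distance between two equal-length partial waveforms."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def knn_fitness(candidate, k=3):
    """Fraction of partial waveforms whose held-out k-NN majority state
    matches their actual label; this ratio is the candidate's fitness."""
    correct = 0
    for i, (wave, label) in enumerate(candidate):
        rest = candidate[:i] + candidate[i + 1:]         # remove one waveform
        rest.sort(key=lambda wl: distance(wave, wl[0]))  # closest first
        top = [lbl for _, lbl in rest[:k]]               # top-k neighbors
        majority = max(set(top), key=top.count)          # larger state wins
        correct += (majority == label)                   # correct / incorrect
    return correct / len(candidate)
```

For example, with two clearly separated clusters of partial waveforms, `knn_fitness` with k = 1 returns 1.0, since each held-out waveform's nearest neighbor shares its state.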
  • Next, candidate solutions satisfying a predetermined criterion are selected based on fitness (S1014), and based on the selected candidate solutions, a new set of candidate solutions is generated by performing crossover (S1015) and mutation (S1016) operations. In crossover, partial waveforms of one candidate solution are exchanged with the corresponding partial waveforms of another candidate solution. In mutation, one partial waveform is replaced with another partial waveform taken from the same waveform.
  • Each new candidate solution (offspring) is evaluated to calculate its fitness (S1017), and the candidate solutions selected from the old set and the new candidate solutions (offspring) are combined to create a new set of candidate solutions (new population) (S1018).
  • Candidate solutions to be selected from the old set are selected according to a predetermined criterion based on fitness (for example, selecting a predetermined number of candidate solutions having high fitness or a candidate solution having a fitness equal to or higher than a predetermined value).
  • the candidate solution selected in S1014 may be selected.
  • a candidate solution (partial waveform set) having the best fitness is obtained as an optimized set of training partial waveforms (S1019).
  • the obtained set corresponds to a single channel abnormality determination model.
  • the obtained set and the k nearest neighbor algorithm may be combined and handled as a single channel abnormality determination model.
  • the decision fusion rule learning unit 202 learns a decision fusion rule (or classification rule) for detecting an abnormality of the target object.
  • the decision fusion rule learning unit 202 transmits the generated decision fusion rule to the gateway 100 via the transmission unit 206.
  • The receiving unit 107 of the gateway 100 receives this decision fusion rule and stores it in the decision fusion rule database 106.
  • The purpose of the decision fusion rule is to detect an abnormality of the target object and identify the basis (sensor nodes) of the abnormality in the target object. Sensor nodes that are not included in the decision fusion rule can be identified as unnecessary for detecting an abnormality of the target object.
  • the decision fusion rule learning unit 202 stores the learned decision fusion rule in an internal database, and transmits it to the gateway 100 and stores it in the decision fusion rule database 106 as described above.
  • the decision fusion rule learning unit 202 extracts data as shown in FIG. 41 from the sensor database as shown in FIG. 36 in order to learn the decision fusion rule.
  • the data to be extracted includes each time stamp (ID), a state label (normal or abnormal) of each sensor node, and a determination label (normal or abnormal) of the target object.
  • A classification rule learning approach is applied to these extracted data to generate a decision fusion rule. That is, a classification rule that can accurately predict the state (normal or abnormal) of the target object is learned using the determination label of the target object as the class label and the state labels of the sensor nodes as feature values. In other words, a combination of feature selection and classification is used to learn the decision fusion rules.
  • the resulting classification rules are composed of single or multiple decision fusion rules.
  • This classification rule consists of one decision fusion rule and is interpreted as follows: when the data of the sensor nodes N4, N8, and N19 are abnormal, the target object is in an abnormal state at that time. In the causal context, the cause of the abnormality of the target object is that the data of the sensor nodes N4, N8, and N19 are abnormal. Another interpretation is that only three sensor nodes, N4, N8, and N19, are needed to discover anomalies in the target object. Hereinafter, the name of a sensor node alone (that is, a variable representing the sensor node) may be used to mean that the sensing data of that sensor node is in an abnormal state.
  • the above notation means that when all sensor nodes (Na, Nb,..., Nk) are abnormal, the target object is in an abnormal state.
  • classification rules include the following four decision fusion rules.
  • Classification rules can have various formats.
  • FIG. 42 shows an example of the AND format
  • FIG. 43 shows an example of the rule format.
  • the AND format in the first line in FIG. 42 means that when the data of the sensor nodes N4, N8, and N19 are all abnormal, the target object A is in an abnormal state at that time.
  • the AND format in the second row means that when all the data of the sensor nodes N4, N8 and N25 are abnormal, the target object A is in an abnormal state at that time.
  • the AND format in the third row means that when all the data of the sensor nodes N10 and N19 are abnormal, the target object A is in an abnormal state at that time.
  • the AND format on the fourth line means that when all the data of the sensor nodes N10 and N25 are abnormal, the target object A is in an abnormal state at that time.
  • The rule format in the first line of FIG. 43 means that if, in at least one of the groups (N4, N8, N19), (N4, N8, N25), (N10, N19), and (N10, N25), all the sensor node data are abnormal, then the target object A is abnormal.
  • the evaluation using the AND format rule of FIG. 42 is easier and faster than the evaluation using the classification rule including a large number of decision fusion rules as shown in FIG. Furthermore, in the AND format rule of FIG. 42, all the sensor nodes in the row are obtained as the determination basis, so that the basis identification becomes easier.
  • the rule format of FIG. 43 is a compact expression of a large number of decision fusion rules, but extra syntax analysis is required to identify the judgment basis.
  • the classification rule can be transformed into a sum of product (SOP) format of multiple decision fusion rules, as shown in FIG. Note that if the sensor node is included in a number of decision fusion rules, indexing the sensor node can reduce the confirmation cost.
  • the decision fusion rule learning unit 202 can identify sensor nodes that are not included in the decision fusion rule as unnecessary sensor nodes for detecting an abnormality of the target object by scanning all learned decision fusion rules.
  • For example, it can be identified that sensor nodes N4, N8, N10, N19, and N25 are necessary and that the remaining sensor nodes are unnecessary.
  • One of the purposes of the decision fusion rule learning unit 202 is to find a combination of sensor nodes necessary for predicting an abnormal state in the target object.
  • When there are n sensor nodes, there are 2^n combinations of sensor nodes.
  • When n is small, all combinations can be searched exhaustively, and the combination with the maximum support (evaluation value) on the training data can be found as the best one. Instead of the single best combination, multiple combinations having support greater than a threshold may be found.
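For small n, the exhaustive search described above might look like the following sketch. The training-record format and the support definition used here (fraction of records where "all selected nodes abnormal" agrees with the target's state) are assumptions for illustration.

```python
from itertools import combinations

def find_combinations(nodes, records, threshold=0.9):
    """records: list of (node_states, target_state), where node_states maps a
    node name to 'abnormal' or 'normal'. Returns every combination whose
    support on the records meets the threshold."""
    results = []
    for r in range(1, len(nodes) + 1):          # all 2^n - 1 non-empty subsets
        for combo in combinations(nodes, r):
            hits = sum(
                (all(states[n] == 'abnormal' for n in combo)
                 == (target == 'abnormal'))
                for states, target in records)
            support = hits / len(records)
            if support >= threshold:
                results.append((combo, support))
    return results
```

Because the number of subsets doubles with every added node, this brute-force approach is only feasible for small n; for large n, the heuristic searches described next are used instead.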
  • When n is large, various heuristic search algorithms, such as a genetic algorithm (GA) or genetic programming (GP), can be used.
  • FIG. 46 shows an example of a processing flow for constructing a classification rule by a genetic algorithm.
  • an encoding method for mapping between a solution space and a search space is determined (S1101).
  • Hereinafter, the term classification rule is also used to mean a decision fusion rule.
  • FIG. 45 shows an example of encoding the solution of this problem for the GA: a binary string consisting of 0s and 1s. When there are n sensor nodes, the length of each string (each candidate solution) is n. That is, this encoding method maps the selection of each of the plurality of sensor nodes to a binary value. A 0 means that the corresponding sensor node does not affect the target object; a 1 means that if the target object is in an abnormal state, the sensor node corresponding to that 1 must be in an abnormal state.
  • When the GA is used with this encoding, only one decision fusion rule is derived per GA run; the rule is the AND of the states of all sensor nodes whose bit is 1. That is, the problem reduces to a feature selection problem.
  • an initial candidate solution (initial candidate classification rule) is randomly generated (S1103).
  • In this example, the string length is 5, and 10 candidate solutions are generated.
  • each candidate solution is evaluated, and each fitness is calculated (S1104).
  • a candidate solution is selected from the current population according to a predetermined criterion based on fitness (S1106).
  • As a predetermined criterion based on fitness, for example, a predetermined number of candidate solutions are selected in order of fitness, or candidate solutions having a fitness equal to or higher than a predetermined value are selected.
  • FIG. 47 shows an example of generating offspring using crossover and mutation in the genetic algorithm (GA).
  • the generated offspring are evaluated in the same manner as in step S1104, and the fitness of each is calculated (S1107).
  • a candidate solution is selected according to a predetermined standard based on fitness in the previous generation population, and a new population is generated by combining the selected candidate solution and the generated descendant (S1108).
  • As a predetermined criterion, for example, a predetermined number of candidate solutions are selected in order of fitness, or candidate solutions having a fitness equal to or higher than a predetermined value are selected.
  • the candidate solution (candidate classification rule) having the highest fitness in the population at that time is acquired as the best classification rule (S1109).
  • the GA generates one decision fusion rule from the best classification rule for each execution.
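With the bit-string encoding of FIG. 45, the best candidate solution found by the GA decodes directly into one AND-format decision fusion rule. The following sketch illustrates this decoding; the node names and the textual rule format are illustrative assumptions.

```python
def decode_rule(bits, node_names):
    """A 1 at position i selects node_names[i]; the decoded rule is the AND
    of all selected sensor nodes being abnormal."""
    selected = [name for bit, name in zip(bits, node_names) if bit == 1]
    return ' AND '.join(f'{n}=abnormal' for n in selected)

# Best candidate solution of a GA run over 5 sensor nodes (string length 5)
rule = decode_rule([0, 1, 0, 1, 1], ['N1', 'N2', 'N3', 'N4', 'N5'])
```

Here `rule` reads 'N2=abnormal AND N4=abnormal AND N5=abnormal', matching the AND format of FIG. 42.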
  • The tree can be expressed either as an S-expression (Symbolic Expression) as shown in FIG. 49 or as a decision tree as shown in FIG. For compactness, encoding based on the S-expression is preferred.
  • FIG. 51 shows an example of a processing flow for constructing a classification rule by genetic programming (GP).
  • an encoding method for mapping between the solution space and the search space is determined (S1201).
  • In the genetic algorithm (GA), the genotype is expressed by a string (see FIG. 45), but in genetic programming (GP), it is expressed by a tree structure.
  • the name (variable) of the sensor node is assigned to the terminal node of the tree structure, and the logical operator is assigned to the non-terminal node. That is, an encoding method is used that defines a variable representing a sensor node selected from a plurality of sensor nodes at the end node of the tree structure and a mapping of a logic operation symbol selected from the plurality of logic operation symbols at a non-terminal node of the tree structure.
  • In the decision tree expression, on the other hand, the name (variable) of a sensor node is assigned to a non-terminal node, a value indicating true (target object is abnormal) or false (target object is normal) is assigned to each end node, and each branch corresponds to a state of the node directly above it.
  • an initial candidate solution (initial candidate classification rule) is randomly generated (S1203).
  • the size of the tree structure is also determined randomly within the constraints for each candidate solution, and variables and logical operators are randomly assigned to each node of the tree structure.
  • each candidate solution is evaluated and each fitness is calculated (S1204).
  • a candidate solution is selected according to a predetermined criterion based on fitness in the current population (S1205).
  • As a predetermined criterion, for example, a predetermined number of candidate solutions are selected in order of fitness, or candidate solutions having a fitness equal to or higher than a predetermined value are selected.
  • Next, crossover and mutation operations are applied to the selected candidate solutions to generate offspring (new candidate classification rules) (S1205).
  • FIG. 53 shows an example of generating offspring using crossover and mutation in genetic programming (GP).
  • In FIG. 53, one node is replaced with another node, but this is only an example; subtrees of different sizes may also be exchanged. For example, one node may be replaced with a subtree composed of a plurality of hierarchies.
  • The generated offspring are evaluated in the same manner as in step S1204, and the respective fitness values are calculated (S1206).
  • candidate solutions are selected according to a predetermined criterion in the previous generation population, and a new population is generated by combining the selected candidate solutions and the generated descendants (S1207).
  • As a predetermined criterion, a predetermined number of candidate solutions are selected in order of fitness, or candidate solutions having a fitness equal to or higher than a predetermined value are selected.
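An S-expression classification rule of the kind evolved by GP can be represented and evaluated, for instance, as a nested tuple. This representation is an assumption for illustration, not the patent's encoding; terminal nodes are sensor-node variables and non-terminal nodes are logical operators, as described above.

```python
def eval_tree(tree, states):
    """Evaluate an S-expression rule tree. states maps a sensor-node name
    to True (abnormal) / False (normal)."""
    if isinstance(tree, str):                 # terminal node: a variable
        return states[tree]
    op, *args = tree                          # non-terminal: logical operator
    vals = [eval_tree(a, states) for a in args]
    if op == 'AND':
        return all(vals)
    if op == 'OR':
        return any(vals)
    if op == 'NOT':
        return not vals[0]
    raise ValueError(f'unknown operator: {op}')

# (OR (AND N4 N8) N10): target abnormal if both N4 and N8, or N10, are abnormal
tree = ('OR', ('AND', 'N4', 'N8'), 'N10')
```

Crossover on this representation exchanges subtrees between two trees, and mutation replaces a node or subtree, as in the processing of FIG. 53.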
  • FIG. 54 shows a processing flow inside the gateway.
  • the single channel abnormality determination unit 102 of the gateway 100 collects data from a plurality of sensor nodes 1 to n (S2001).
  • the data from the sensor node may be waveform data or a set of values observed at regular time intervals. The time interval for observation may be different for each sensor node.
  • the single channel abnormality determination unit 102 uses the abnormality determination model for each sensor node in the abnormality determination model database 105 to classify the data from each sensor node as abnormal or normal (S2002).
  • the sensor data classification for each sensor node is shown on the left of FIG.
  • FIG. 55 shows an example of test waveform classification using an abnormality determination model (optimized training partial waveform group).
  • the test waveform is first divided into a plurality of sections.
  • the division method is specified in advance.
  • As a division method, for example, the waveform is divided into a predetermined number of sections with a constant width.
  • the optimum partial waveform data with the closest distance for each section is identified from the abnormality determination model.
  • The state (normal or abnormal) of the optimum partial waveform data identified for each section is confirmed, and the state with the larger count is adopted; when the counts are equal, abnormality is adopted.
  • Such classification is performed for at least sensor nodes included in the decision fusion rule.
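The section-by-section classification of FIG. 55 described above (divide the test waveform, find the nearest training partial waveform per section, and adopt the majority state) might be sketched as follows. The fixed-width division, the squared-distance measure, and resolving ties toward abnormal are illustrative assumptions.

```python
def classify_waveform(test_wave, model, width):
    """model: list of (partial_waveform, state) pairs, each partial waveform
    of length `width` (the single channel abnormality determination model)."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    votes = []
    for start in range(0, len(test_wave) - width + 1, width):
        section = test_wave[start:start + width]
        nearest = min(model, key=lambda ps: dist(section, ps[0]))
        votes.append(nearest[1])      # state of the closest partial waveform
    n_abn = votes.count('abnormal')
    # adopt the larger count; a tie is resolved as abnormal here
    return 'abnormal' if n_abn >= len(votes) - n_abn else 'normal'
```

The per-section nearest-neighbor lookups are independent, so this step parallelizes naturally across sections and sensor nodes.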
  • The sensor nodes included in the decision fusion rule are designated in advance in the single channel abnormality determination unit 102. The designation may be made by a notification from the comprehensive determination unit 103, or by maintenance personnel.
  • the comprehensive judgment unit 103 extracts the decision fusion rule (classification rule) of the target object from the decision fusion rule database 106 (S2003).
  • a classification rule composed of a plurality of decision fusion rules is extracted.
  • The overall determination unit 103 checks whether the extracted decision fusion rules match the states (normal or abnormal) of the sensing data of the sensor nodes included in the rules (S2004); if at least one decision fusion rule matches (YES in S2004), the target object is determined to be in an abnormal state. In the example of FIG. 58, two decision fusion rules are satisfied.
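The check of step S2004 — matching the per-node states from S2002 against decision fusion rules in SOP form — can be sketched as follows. The rule and state representations are assumptions for illustration.

```python
def target_is_abnormal(rules, states):
    """rules: list of node-name tuples, each an AND term of the SOP-format
    classification rule; states maps a node name to 'abnormal' or 'normal'.
    Returns (abnormal?, list of satisfied rules as the determination basis)."""
    satisfied = [r for r in rules
                 if all(states.get(n) == 'abnormal' for n in r)]
    return bool(satisfied), satisfied

rules = [('N4', 'N8', 'N19'), ('N10', 'N19')]
```

The satisfied rules double as the determination basis reported to the monitoring center, since each lists exactly the sensor nodes whose abnormal states triggered the determination.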
  • the data filtering unit 104 sends the determination result (the target object is abnormal) and the state (determination basis) of the sensor node included in the satisfied decision fusion rule to the server 200 of the remote monitoring center (S2005).
  • An example of a message format transmitted to server 200 in step S2005 is shown in FIG.
  • these data (determination result and determination basis) transmitted from the gateway 100 are received by the receiving unit 205 and stored in the database 204.
  • When no decision fusion rule is satisfied (NO in S2004), the data filtering unit 104 sends a list of all sensor nodes indicating an abnormal state to the server 200 of the monitoring center (S2006).
  • An example of the message format transmitted to the server 200 in this step S2006 is shown in FIG.
  • the data filtering unit 104 sends the data of these sensor nodes to the server 200 of the remote monitoring center (S2007).
  • the time stamp of the sensor node data may also be transmitted simultaneously.
  • the gateway 100 may discard the sensor node data in a normal state.
  • The server 200 may store these data (the list of sensor nodes in an abnormal state, the data of these sensor nodes, and the time stamps) in the sensor database 201. At this time, the determination label of the target object is set to normal, and the state labels of sensor nodes not included in the list are also set to normal. Thereafter, the single channel abnormality determination model learning unit 203 or the decision fusion rule learning unit 202 may perform the above-described processing based on, for example, the updated sensor database 201.
  • In the embodiments above, the abnormality determination model for each sensor and the decision fusion rule that comprehensively combines the determinations of the abnormality determination models are all learned in the server.
  • the present invention is not limited to this setup, and if the gateway has sufficient computing resources, the abnormality determination model and the decision fusion rule may be learned at the gateway.
  • Since a feature selection technique is used to learn the decision fusion rule, sensor nodes unnecessary for the target object can be identified and removed from the remote monitoring system, thereby reducing the cost of the system.
  • Since state matching is performed only for the sensor nodes specified in the decision fusion rule, the cause of an abnormal event can be identified efficiently from among many sensor nodes.
  • Detection of an abnormal event becomes reliable because the states of a large number of sensors are used to confirm the abnormality in the target object.
  • No prior knowledge about the causal relationships among the sensor nodes is necessary to learn (construct) the decision fusion rule.
  • Communication overhead for transmitting data to the remote monitoring center is reduced by using the data filter unit. Only when the abnormality of the target object cannot be confirmed by the decision fusion rule, the sensor data may be sent to the server of the remote monitoring center by the data filtering unit.
  • According to the first and second embodiments, it is possible to extract only the portions of the accumulated multi-channel sensor data that contribute to the determination and to generate a determination model that takes the probabilistic dependency between channels into account. Determination can be performed using the generated model, and the determination basis can be indicated accurately, without excess or deficiency. In addition, because highly accurate training data can be added when new input arrives, the performance of the determination model can be improved continuously.
  • The present invention can be used as various remote monitoring systems for quality control, maintenance, and condition monitoring, including manufacturing system monitoring, elevator monitoring, air conditioning system monitoring, power system monitoring, and vital sensing and health equipment monitoring in medical and nursing care.

Abstract

Using data obtained by a plurality of sensors and accumulated in the past, abnormalities in a monitored object are identified with high precision. An abnormality identification system is provided with: a waveform segmentation unit which designates a plurality of segments for each of a plurality of variables and extracts a plurality of segment data, that is, the data relating to the plurality of segments; an evaluation unit which, for each of the variables, evaluates each of the segments by the nearest neighbor method using the plurality of segment data extracted by the waveform segmentation unit, thereby selecting the best segment, which is one of the segments; and a calculation unit which, for each of the variables, calculates the conditional probabilities of normality and abnormality of the best segment from the number of times each segment is identified as normal and the number of times it is identified as abnormal, and calculates the prior probabilities of normality and abnormality from the total number of normal classes and the total number of abnormal classes included in a plurality of pieces of training data.

Description

Abnormality determination system and method thereof
 The present invention relates to an abnormality determination system and a method thereof.
 With the recent spread of sensor networks, sensor data analysis technology is required in various scenes such as sensor fusion. However, it has been difficult to improve the determination performance of determination apparatuses that use multivariate time series data.
 SVMs and neural networks are relatively widely used classifiers with generally good discrimination accuracy (see Patent No. 3624546). However, because they are difficult to construct and their grounds for judgment are difficult to understand, they may be unacceptable in the field.
 In sensor fusion technology (Patent No. 3931879, JP 2005-165421 A), abnormality determination from the information of multiple sensors is possible using a probabilistic model. In this case, it is the mainstream to convert sensor data from continuous time-series values into discrete values and handle them as categorical data.
Patent No. 3624546
Patent No. 3931879
JP 2005-165421 A
JP 2007-64307 A
 The present invention provides an abnormality determination system and method capable of performing highly accurate abnormality determination on a monitoring target (or target object) using the sensing data of a plurality of sensors (or sensor nodes) accumulated in the past.
 The abnormality determination system of the present invention comprises: a data storage unit that stores a plurality of training data, each being a set of a plurality of time-series data relating to a plurality of variables obtained by observing a monitoring target with a plurality of sensors, together with a normal class or an abnormal class representing the state of the monitoring target when the plurality of time-series data were acquired; a waveform dividing unit that designates a plurality of sections for each of the plurality of variables and, for each variable, extracts a plurality of segment data, which are the data of the plurality of sections, from the plurality of time-series data included in the plurality of training data; an evaluation unit that, for each variable, performs a determination by the nearest neighbor method for each of the plurality of sections using the plurality of segment data extracted by the waveform dividing unit, thereby selecting a best section, which is one of the plurality of sections; a calculation unit that, for each variable, calculates the normal and abnormal conditional probabilities of the best section based on the number of times each of the plurality of sections is determined to be normal and the number of times it is determined to be abnormal, and calculates normal and abnormal prior probabilities from the total number of normal classes and the total number of abnormal classes included in the plurality of training data; a storage unit that stores the normal and abnormal prior probabilities and, for each variable, the identification information of the best section, the segment data of the best section, the class associated with the segment data, and the normal and abnormal conditional probabilities of the best section; a sensing unit that observes the monitoring target with a plurality of sensors and acquires a plurality of time-series data relating to a plurality of variables; a selection unit that, for each variable, selects segment data from the plurality of time-series data acquired by the sensing unit according to the respective best section; and a determination unit that, for each variable, detects a predetermined number of top-ranked segment data for the selected segment data by the nearest neighbor method using the segment data in the storage unit, multiplies the respective ratios of the normal class and the abnormal class among the predetermined number of segment data by the normal and abnormal conditional probabilities in the storage unit, multiplies the resulting values across the plurality of variables, and further multiplies by the normal and abnormal prior probabilities to calculate the normal and abnormal likelihoods, and determines the state of the monitoring target to be whichever of normal and abnormal has the larger likelihood.
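The likelihood computation described in the claim above — per-variable k-NN class ratios multiplied by the stored conditional probabilities, combined across variables, and weighted by the prior probabilities, with the larger likelihood deciding the state — can be sketched as follows. The data shapes (`prior` and `per_variable` dictionaries) are assumptions for illustration.

```python
def decide(prior, per_variable):
    """prior: {'normal': p, 'abnormal': p}, the prior probabilities.
    per_variable: list of dicts {'ratio': {...}, 'cond': {...}} giving, for
    one variable, the class ratios among the k nearest segment data and the
    stored conditional probabilities of the best section.
    Returns (decided state, likelihoods)."""
    like = dict(prior)                          # start from the priors
    for v in per_variable:
        for c in ('normal', 'abnormal'):
            like[c] *= v['ratio'][c] * v['cond'][c]   # multiply per variable
    return max(like, key=like.get), like        # larger likelihood wins
```

This has the structure of a naive-Bayes decision: each variable contributes an independent factor, and the priors weight the two hypotheses.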
 The abnormality determination system of the present invention comprises: a first database that stores a plurality of training data, each including a plurality of first labels respectively indicating whether the sensor data observed by a plurality of sensor nodes monitoring a target object is abnormal or normal, and a second label indicating whether the state of the target object is abnormal or normal; a decision fusion rule learning unit that (A-1) generates a plurality of candidate solutions by performing, a plurality of times and at random, a mapping according to an encoding method that maps the presence or absence of each of the plurality of sensor nodes to a bit string, and (A-2) determines an optimal candidate solution having the optimal fitness by repeating, according to a genetic algorithm, evaluation of the fitness of each of the plurality of candidate solutions against the first database and generation of new candidate solutions through crossover and mutation operations on candidate solutions selected based on their fitness, and identifies the sensor nodes whose bits are set in the optimal candidate solution; and an overall determination unit that (B-1) determines whether the sensor data observed by each identified sensor node is abnormal or normal using a classifier prepared in advance for that sensor node, which decides whether given sensor data is abnormal or normal, and (B-2) determines that the target object is abnormal when all the determination results for the identified sensor nodes indicate abnormality, and that the target object is normal when at least one of the determination results indicates normality. As the fitness evaluation of each of the plurality of candidate solutions, the decision fusion rule learning unit detects, for each of the plurality of training data, the first labels of the sensor nodes whose bits are set in the candidate solution, selects the more frequent of the normal and abnormal states indicated by the detected first labels, and calculates the proportion of the training data for which the selected state matches the state indicated by the second label.
 本発明により、過去に蓄積した複数のセンサ(またはセンサノード)のデータを用いて、監視対象(またはターゲットオブジェクト)に対する異常判定を高精度に行うことが可能となる。 According to the present invention, it is possible to perform an abnormality determination on a monitoring target (or target object) with high accuracy using data of a plurality of sensors (or sensor nodes) accumulated in the past.
本発明の第1実施形態に係る異常判定システムの構成を示す。Shows the configuration of an abnormality determination system according to the first embodiment of the present invention.
サーバによる訓練学習プロセスの流れを示す。Shows the flow of the training/learning process performed by the server.
疑似判定評価処理の詳細な処理の流れを示す。Shows the detailed flow of the pseudo judgment evaluation process.
図3のステップS205の詳細を示す。Shows the details of step S205 in FIG. 3.
各種変形例で追加される処理を示す。Shows processing added in various modifications.
訓練データ格納部における訓練データ集合の例を示す。Shows an example of a training data set in the training data storage unit.
波形振幅の例を示す。Shows an example of waveform amplitudes.
パワースペクトルへ変換後の特徴ベクトルの例を示す。Shows an example of a feature vector after conversion to a power spectrum.
波形分割の一例を示す。Shows an example of waveform division.
波形分割の他の例を示す。Shows another example of waveform division.
波形分割のさらに他の例を示す。Shows yet another example of waveform division.
パワースペクトルの場合の分割例を示す。Shows an example of division in the case of a power spectrum.
訓練データの分割例を示す。Shows an example of division of training data.
訓練データの他の分割例を示す。Shows another example of division of training data.
最近傍のセグメントデータを見つける例を示す。Shows an example of finding the nearest-neighbor segment data.
最近傍計算を行う様子を示す。Shows how the nearest-neighbor calculation is performed.
スコア表の一例を示す。Shows an example of a score table.
最良モデル格納部内のデータ例(判定モデル)を示す。Shows an example of data (a determination model) in the best model storage unit.
第1の変形例の処理を示す。Shows the processing of a first modification.
第2の変形例の処理を示す。Shows the processing of a second modification.
第3の変形例の処理を示す。Shows the processing of a third modification.
第4の変形例の処理を示す。Shows the processing of a fourth modification.
第5の変形例に係る判定モデルの一例を示す。Shows an example of a determination model according to a fifth modification.
第5の変形例に係るモデル式の概念を示す。Shows the concept of a model formula according to the fifth modification.
条件付き確率の計算例を示す。Shows an example of conditional-probability calculation.
最近傍計算を行う様子を示す。Shows how the nearest-neighbor calculation is performed.
頻度分布表のフォーマット例を示す。Shows a format example of a frequency distribution table.
判定対象データ上をスキャンする様子を示す。Shows scanning over the determination target data.
データ(波形)を切り出す様子を示す。Shows cutting out data (a waveform).
判定対象データのうち異常判定の部分の表示画面例を示す。Shows an example of a display screen for the portion of the determination target data determined to be abnormal.
クライアントにおける動作フローを示す。Shows the operation flow in the client.
サーバおよびクライアントを実現するためのハードウェア構成の一例を示す。Shows an example of a hardware configuration for realizing the server and the client.
セグメントテンプレートの当てはめ例を示す。Shows an example of fitting a segment template.
変形例に係るクライアントの動作の一例を示す。Shows an example of the operation of a client according to a modification.
第2の実施形態に係る遠隔監視システムの全体構成を示す。Shows the overall configuration of a remote monitoring system according to a second embodiment.
センサデータのフォーマットの一例を示す。Shows an example of a sensor data format.
単独チャネル異常判断モデルを構築するために抽出されたセンサデータの一例を示す。Shows an example of sensor data extracted to construct a single-channel abnormality judgment model.
C4.5に基づいた方法を使用して最適な閾値がどのように学習されるかを示す。Shows how an optimal threshold is learned using a method based on C4.5.
問題の解決のために典型的な遺伝的アルゴリズムの処理のフローを示す。Shows the processing flow of a typical genetic algorithm for solving the problem.
遺伝的アルゴリズムを用いて、訓練波形において特徴領域の最適なセグメンテーションの例を示す。Shows an example of optimal segmentation of feature regions in a training waveform using a genetic algorithm.
決定フュージョンルールを構築するために用いられる抽出されたセンサデータのフォーマットの一例を示す。Shows an example format of extracted sensor data used to construct decision fusion rules.
1行が1つの決定フュージョンルールに相当する場合における決定フュージョンルールのデータベースの例を示す。Shows an example of a decision fusion rule database in which one row corresponds to one decision fusion rule.
1つのルールが多数の決定フュージョンルールから成る場合における決定フュージョンルールのデータベースの例を示す。Shows an example of a decision fusion rule database in which one rule consists of many decision fusion rules.
分類ルールを多数の決定フュージョンルールへ変換する例を示す。Shows an example of converting a classification rule into a number of decision fusion rules.
分類ルールの構築のために遺伝的アルゴリズムにおける符号化の例を示す。Shows an example of encoding in a genetic algorithm for constructing classification rules.
センサデータ集合から分類ルールを構築するために遺伝的アルゴリズムの処理のフローの例を示す。Shows an example of the processing flow of a genetic algorithm for constructing classification rules from a sensor data set.
遺伝的アルゴリズムにおいて交叉と突然変異とを用いて子孫を生成する例を示す。Shows an example of generating offspring using crossover and mutation in a genetic algorithm.
遺伝的アルゴリズムにおいて候補分類ルールを評価する例を示す。Shows an example of evaluating candidate classification rules in a genetic algorithm.
遺伝的プログラミングにおいてS表現に基づいた符号化の例を示す。Shows an example of S-expression-based encoding in genetic programming.
遺伝的プログラミングにおいて木ベースの符号化の例を示す。Shows an example of tree-based encoding in genetic programming.
センサデータからの分類ルールの構築のための遺伝的プログラミングの処理のフローの例を示す。Shows an example of the processing flow of genetic programming for constructing classification rules from sensor data.
遺伝的プログラミングにおいて分類ルールの評価の例を示す。Shows an example of evaluating classification rules in genetic programming.
遺伝的プログラミングにおいて交叉と突然変異とを用いて子孫を生成する例を示す。Shows an example of generating offspring using crossover and mutation in genetic programming.
ゲートウェイの内部における処理のフローを示す。Shows the flow of processing inside the gateway.
センサノード(テスト波形)からのデータにおける異常の判断の例を示す。Shows an example of abnormality judgment on data from a sensor node (a test waveform).
センサノードの状態が決定フュージョンルールの少なくとも1つに一致するときに遠隔監視サイトのサーバへ送られるデータのフォーマットの例を示す。Shows an example format of data sent to the server at the remote monitoring site when the sensor node state matches at least one of the decision fusion rules.
センサノードの状態が決定フュージョンルールのいずれにも一致しないときに遠隔監視サイトのサーバへ送られるデータのフォーマットの例を示す。Shows an example format of data sent to the server at the remote monitoring site when the sensor node state does not match any of the decision fusion rules.
ゲートウェイにおける動作の例を示す。Shows an example of operation in the gateway.
第1実施形態First embodiment
 図1は、本発明の第1実施形態に係る異常判定システムの構成を示す。 
 この異常判定システムはサーバ(監視センター装置)と、クライアント(遠隔監視端末)とを備える。サーバは、監視対象の観測により得られた過去のセンサデータ(時系列データ)と、当該センサデータの取得時における監視対象の状態を識別するクラス(異常あるいは正常)とを活用して訓練学習を行うことにより、新たなセンサデータの判定を行うための判定モデルを生成する。クライアントは、監視対象を観測してセンサデータを取得し、取得したセンサデータと判定モデルとを用いて監視対象が正常であるか異常であるかの判定を行う。
FIG. 1 shows a configuration of an abnormality determination system according to the first embodiment of the present invention.
This abnormality determination system includes a server (monitoring center device) and a client (remote monitoring terminal). The server performs training learning using past sensor data (time-series data) obtained by observation of the monitoring target and a class (abnormal or normal) that identifies the status of the monitoring target at the time of acquisition of the sensor data. As a result, a determination model for determining new sensor data is generated. The client observes the monitoring target, acquires sensor data, and determines whether the monitoring target is normal or abnormal using the acquired sensor data and the determination model.
(サーバ)
 図2は、サーバによる訓練学習プロセスの流れを示すフローチャートである。
(server)
FIG. 2 is a flowchart showing the flow of the training learning process by the server.
 まずサーバは、ユーザによって設定された各種パラメータを読み込む(S101)。例えば最大波形分割数z_max(ステップS106で使用)などのパラメータを読み込む。読み込みは、メモリ、ハードディスク等の記録媒体から行う。 First, the server reads various parameters set by the user (S101), for example the maximum waveform division number z_max (used in step S106). The parameters are read from a recording medium such as a memory or a hard disk.
 次に初期設定を行うことにより、波形分割数のパラメータzを0に設定する(S102)。 Next, by performing initial setting, the parameter z of the number of waveform divisions is set to 0 (S102).
 次に、訓練データ入力部12が、訓練データ格納部11から訓練データ集合を読み出し、次段の波形前処理部13に入力する(S103)。 Next, the training data input unit 12 reads the training data set from the training data storage unit 11 and inputs it to the waveform preprocessing unit 13 at the next stage (S103).
 図6に訓練データ格納部11における訓練データ集合の例を示す。 Fig. 6 shows an example of a training data set in the training data storage unit 11.
 各訓練データはそれぞれ、少なくとも1種類以上の時系列データ(センサデータ)と、クラスとの組で構成される。ここでクラスとは、過去において該当する時系列データが取得されたときの対象機器(監視対象)の状態を保守員等が判定した判定結果である。クラスは例えば異常と正常がある。ただし異常タイプA、異常タイプBのように複数種類の異常状態があってもよい。ここでは説明を分かりやすくするために正常・異常の2つのクラスがある場合を説明する。図示の例では、訓練データは、4つの変量(チャネル)の時系列データを含んでいる。訓練データd1~dNのクラスは正常、訓練データdN+1~dMのクラスは異常である。4つの変量の時系列データはそれぞれ該当する4つのセンサから取得されたものである。ここでは説明の簡単のため各時系列データのサイズ(時間軸方向の長さ)は同じであるとするが、変量(チャネル)毎にサイズが異なっていてもかまわない。 Each training data is composed of a pair of at least one type of time-series data (sensor data) and a class. Here, the class is the result of a determination, made by maintenance personnel or the like, of the state of the target device (monitoring target) at the time the corresponding time-series data was acquired in the past. The classes are, for example, abnormal and normal; however, there may be multiple types of abnormal state, such as abnormal type A and abnormal type B. For ease of explanation, the case of two classes, normal and abnormal, is described here. In the illustrated example, the training data includes time-series data of four variables (channels). The class of training data d1 to dN is normal, and the class of training data dN+1 to dM is abnormal. The time-series data of the four variables are obtained from the corresponding four sensors. For simplicity of explanation, the time-series data are assumed here to have the same size (length in the time-axis direction), but the size may differ for each variable (channel).
 次に、波形前処理部13が、訓練データ集合に含まれる各時系列データの前処理を行う(S104)。前処理としてたとえばFFTによるパワースペクトル変換や短時間フーリエ変換、ウェーブレット変換等の信号処理を施すことにより、振幅スペクトルなどの特徴ベクトルを取得してもよい。あるいは、複数の所定時刻における波形振幅値を取得しても構わない。図7に複数の所定時刻において取得した波形振幅の例を示す。図8にパワースペクトルへ変換後の特徴ベクトルの例を示す。前処理としては、さらに、低周波域通過フィルタ(平滑化フィルタ)を用いて波形を処理してもよい。これは、波形振幅にノイズが乗っている場合や波形の大局的特徴をつかみたい場合に有効である。または限定された帯域のみの波形のみをフィルタで取りだしてもよい。または、たとえば非特許文献1にあるように線分近似やチェビシェフ近似、APCA近似など、様々な波形近似計算を行っても良い。なお前処理を特に行わずに次の処理へ進むことも可能である。以降の説明では、理解の簡単のため、前処理を経ていない図6の時系列データを用いて説明する。 Next, the waveform preprocessing unit 13 preprocesses each time series data included in the training data set (S104). As preprocessing, a feature vector such as an amplitude spectrum may be acquired by performing signal processing such as power spectrum conversion by FFT, short-time Fourier transform, and wavelet transform. Alternatively, waveform amplitude values at a plurality of predetermined times may be acquired. FIG. 7 shows examples of waveform amplitudes acquired at a plurality of predetermined times. FIG. 8 shows an example of the feature vector after conversion to the power spectrum. As preprocessing, the waveform may be further processed using a low-frequency pass filter (smoothing filter). This is effective when noise is added to the waveform amplitude or when it is desired to grasp the general characteristics of the waveform. Alternatively, only a waveform in a limited band may be extracted by a filter. Alternatively, various waveform approximation calculations such as line segment approximation, Chebyshev approximation, and APCA approximation may be performed as described in Non-Patent Document 1, for example. It is also possible to proceed to the next process without performing any pre-processing. In the following description, for ease of understanding, description will be made using the time-series data of FIG. 6 that has not undergone preprocessing.
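For illustration only, the preprocessing options mentioned above (power-spectrum conversion and low-pass smoothing) can be sketched as follows; a naive O(N^2) DFT stands in for the FFT, and the filter is a simple moving average. Function names are assumptions of this sketch, not the embodiment's actual signal processing.

```python
import math

def moving_average(x, w=3):
    """Low-pass (smoothing) filter: simple moving average of window width w,
    shrinking the window at the edges."""
    half = w // 2
    return [sum(x[max(0, i - half):i + half + 1]) /
            len(x[max(0, i - half):i + half + 1]) for i in range(len(x))]

def power_spectrum(x):
    """Power spectrum |X_k|^2 for k = 0 .. N//2, computed with a naive DFT;
    a real implementation would use an FFT instead."""
    n = len(x)
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = sum(-x[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        spec.append(re * re + im * im)
    return spec
```

For a pure sinusoid of 4 cycles over the window, the spectrum peaks at bin k = 4, which is the kind of feature vector illustrated in FIG. 8.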
 次に、ステップS105~ステップS110では、波形分割数zを1から最大波形分割数z_maxまで順次増大させながら、時系列データを波形分割数zで複数の区間へ分割し、波形分割数z(1~z_max)のそれぞれにおいて、変量毎の重要区間を決定する。そして最も高い評価値が得られた波形分割数のときの各変量の重要区間を最適区間として決定する。また訓練データ集合の各時系列データにおける、各最適区間のデータ部分(セグメントデータ)を、該当するクラスと関連づけて記憶する。以下ステップS105~S110の詳細を説明する。 Next, in steps S105 to S110, while the waveform division number z is sequentially increased from 1 to the maximum waveform division number z_max, the time-series data is divided into a plurality of sections by the waveform division number z, and for each waveform division number z (1 to z_max) an important section is determined for each variable. Then, the important section of each variable at the waveform division number that yields the highest evaluation value is determined as the optimum section. In addition, the data portion (segment data) of each optimum section in each time-series data of the training data set is stored in association with the corresponding class. Details of steps S105 to S110 are described below.
 ステップS105ではサーバが、波形分割数zを1インクリメントする。 In step S105, the server increments the waveform division number z by one.
 ステップS106では、サーバが、波形分割数zが最大分割数z_maxを超えたかどうかを判断し、超えた場合はステップS111に進む。超えていない場合は、ステップS107に進む。 In step S106, the server determines whether or not the waveform division number z exceeds the maximum division number z_max. If it exceeds, the process proceeds to step S111. If not, the process proceeds to step S107.
 ステップS107では、波形分割部14が、波形分割数zで、訓練データ集合の各時系列データを時間軸上で分割してセグメントデータを切り出す。分割方法はここでは簡単のため分割幅が均等になるように分割するが、別の方法で分割してもかまわない。切り出したセグメントデータはセグメント格納部15に格納する。 In step S107, the waveform division unit 14 divides each time-series data of the training data set on the time axis by the waveform division number z and cuts out segment data. For simplicity, the division here is performed so that the division widths are equal, but another division method may be used. The cut-out segment data is stored in the segment storage unit 15.
 波形分割の一例を図9(z=1の場合)、図10(z=2の場合)、図11(z=4の場合)に示す。z=1の場合は実際には分割は行われないことに注意する。このように切り出した各セグメントデータ(部分時系列データ)は、zの値と訓練データIDと変量IDとに関連づけてセグメント格納部15に蓄積しておく。前処理後の時系列データ(特徴ベクトル)がパワースペクトルの場合は図12のように周波数軸方向に沿ってデータを分割すればよい。ここでは、z=3の場合を示し、周波数帯が3分割されている。本発明において時系列データを分割するというときは、時系列データをパワースペクトルに変換して扱う場合には周波数軸方向に沿って分割することを意味するものとする。 An example of waveform division is shown in FIG. 9 (when z = 1), FIG. 10 (when z = 2), and FIG. 11 (when z = 4). Note that no division is actually performed when z = 1. Each segment data (partial time series data) cut out in this way is stored in the segment storage unit 15 in association with the value of z, the training data ID, and the variable ID. When the pre-processed time-series data (feature vector) is a power spectrum, the data may be divided along the frequency axis direction as shown in FIG. Here, the case of z = 3 is shown, and the frequency band is divided into three. In the present invention, dividing time series data means dividing the time series data along the frequency axis direction when the time series data is converted into a power spectrum.
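The equal-width division of step S107 can be sketched as follows; how the remainder samples are distributed when the length is not divisible by z is an assumption of this sketch.

```python
def split_waveform(x, z):
    """Divide time series x into z contiguous segments of (near-)equal width.
    When len(x) is not divisible by z, earlier segments receive one extra
    sample (an assumed tie-breaking convention)."""
    n = len(x)
    base, extra = divmod(n, z)
    segments, start = [], 0
    for i in range(z):
        width = base + (1 if i < extra else 0)
        segments.append(x[start:start + width])
        start += width
    return segments
```

With z = 1 the whole series is returned as a single segment, matching the note that no division is actually performed in that case.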
 次にステップS108では疑似判定評価部17、確率尤度計算部16および最良モデル選定部19による疑似判定評価処理を行う。 Next, in step S108, a pseudo determination evaluation process is performed by the pseudo determination evaluation unit 17, the probability likelihood calculation unit 16, and the best model selection unit 19.
 図3は疑似判定評価処理(S108)の詳細な処理の流れを示すフローチャートである。 FIG. 3 is a flowchart showing a detailed process flow of the pseudo judgment evaluation process (S108).
 まず疑似判定評価部17が、訓練データ集合(セグメント化したもの)を複数の分割集合に分割し、複数の分割集合を1からVmaxでラベル付けする(S201)。1つの分割集合は1つの訓練データからなっていてもよいし、複数の訓練データからなっていてもよい。1つの分割集合が、1つの訓練データからなるときは、訓練データ集合は訓練データの総数分に分割され、したがってVmaxは訓練データの総数に一致する。以下では説明の簡単のため、特に断りのない限り、1つの分割集合は1つの訓練データからなっているとする。 First, the pseudo judgment evaluation unit 17 divides the training data set (segmented) into a plurality of divided sets, and labels the plurality of divided sets with 1 to Vmax (S201). One divided set may consist of a single piece of training data or a plurality of pieces of training data. When one divided set consists of one training data, the training data set is divided into the total number of training data, and therefore Vmax matches the total number of training data. In the following, for simplicity of explanation, it is assumed that one divided set consists of one training data unless otherwise specified.
 次に、初期設定を行うことにより、分割集合識別子v=0、評価値q=0.0とする(S202)。 Next, by performing initialization, the divided set identifier v = 0 and the evaluation value q = 0.0 are set (S202).
 次に、vを1インクリメントする(S203)。 Next, v is incremented by 1 (S203).
 次に、疑似判定評価部17が、ステップS201で分割された複数の分割集合のうち識別子vに示されるものを疑似判定対象データ集合Tvとして選定する。すなわち複数の分割集合を疑似判定対象データ集合Tvと、それ以外の分割集合とに分ける。v=1のときの例を図13に、v=2のときの例を図14に示す。上述の通り、ここでは、複数の分割集合のそれぞれは1つの訓練データからなるため、疑似判定対象データ集合Tvは1つの訓練データを含む。従って、以下では、疑似判定対象データTvと称するときは疑似判定対象データ集合Tvが1つの訓練データを含む場合を指すものとする。 Next, the pseudo-judgment evaluation unit 17 selects, as a pseudo-judgment target data set Tv, the one indicated by the identifier v among the plurality of split sets divided in step S201. That is, a plurality of divided sets are divided into a pseudo determination target data set Tv and other divided sets. FIG. 13 shows an example when v = 1, and FIG. 14 shows an example when v = 2. As described above, since each of the plurality of divided sets includes one piece of training data, the pseudo determination target data set Tv includes one piece of training data. Therefore, hereinafter, the pseudo-determination target data Tv refers to a case where the pseudo-determination target data set Tv includes one piece of training data.
 次に疑似判定評価部17が、訓練学習によるCross Validationを用いたモデル化処理を行い、評価値rを取得する(S205)。この処理では、疑似判定対象データTvのクラス(判定結果)を擬似的に伏せ、残りの訓練データを用いて、疑似判定対象データTvのクラスを推定する。推定した結果が、疑似判定対象データTvの実際のクラスと一致しているかを算出することにより評価値rを取得する。特にLeave-One-out Cross Validation(1つの分割集合には1つの訓練データのみ含める)は訓練データが少数の場合に有効である。ただし分割集合に含める訓練データが1つのときは疑似判定処理に時間がかかり過ぎる問題があり、この問題を避けたい場合は、1つの分割集合に複数の訓練データを含め、分割集合の1つを疑似判定対象集合として選択し、すべての部分集合が1回ずつ疑似判定対象集合となるように評価を繰り返せばよい。これは一般にCross Validationと呼ばれる評価方法である。本例では上記したように分割集合には1つの訓練データが含まれる場合を想定する。 Next, the pseudo judgment evaluation unit 17 performs modeling processing using cross-validation over the training data and obtains an evaluation value r (S205). In this processing, the class (determination result) of the pseudo determination target data Tv is temporarily hidden, and the class of Tv is estimated using the remaining training data. The evaluation value r is obtained by checking whether the estimated result matches the actual class of the pseudo determination target data Tv. In particular, leave-one-out cross-validation (each divided set contains exactly one training data) is effective when the number of training data is small. However, when each divided set contains only one training data, the pseudo judgment processing can take too long; to avoid this, multiple training data may be included in one divided set, one divided set is selected as the pseudo judgment target set, and the evaluation is repeated so that every subset becomes the pseudo judgment target set exactly once. This is the evaluation method generally called cross-validation. In this example, as described above, each divided set is assumed to contain one training data.
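The leave-one-out evaluation loop described above (each record's class is hidden in turn and estimated from the rest) can be sketched as follows; the classifier is passed in as a function, and all names are illustrative.

```python
def leave_one_out(data, classify):
    """data: list of (features, true_class) records.
    classify(train, features) returns an estimated class for `features`
    using `train` (the remaining records) as the training set.
    Returns the fraction of held-out records classified correctly,
    i.e. the pseudo correct-answer rate when each divided set holds one record."""
    q = 0.0
    for v in range(len(data)):
        features, true_class = data[v]
        train = data[:v] + data[v + 1:]      # hide record v's class
        r = 1.0 if classify(train, features) == true_class else 0.0
        q += r
    return q / len(data)
```

Any per-record classifier can be plugged in, for example a 1-nearest-neighbor rule on a scalar feature.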
 ステップS207では、疑似判定評価部17が、評価値rを評価値qに加算する。 In step S207, the pseudo determination evaluation unit 17 adds the evaluation value r to the evaluation value q.
 ステップS208では疑似判定評価部17が、分割集合識別子vがVmaxを超えたかどうかを判定する。すなわち訓練データ集合の各訓練データのそれぞれが疑似判定対象データTvとして選定されたかどうか(複数の分割集合のそれぞれが疑似判定対象データ集合Tvとして選定されたかどうか)を判定する。Vmaxを超えていないときはステップS203に戻り、超えたときはステップS209に進む。 In step S208, the pseudo determination evaluation unit 17 determines whether or not the divided set identifier v exceeds Vmax. That is, it is determined whether each training data of the training data set is selected as the pseudo determination target data Tv (whether each of the plurality of divided sets is selected as the pseudo determination target data set Tv). When it does not exceed Vmax, the process returns to step S203, and when it exceeds, the process proceeds to step S209.
 ステップS209では疑似判定評価部17が、評価値qを、評価回数(分割集合の個数)であるv_maxで除算することにより疑似正答率Gz(平均評価値)を計算する。これにより1つの波形分割数zに対応して1つの疑似正答率Gz(平均評価値)が得られることとなる。 In step S209, the pseudo determination evaluation unit 17 calculates the pseudo correct answer rate Gz (average evaluation value) by dividing the evaluation value q by v_max which is the number of evaluations (number of divided sets). Accordingly, one pseudo correct answer rate Gz (average evaluation value) is obtained corresponding to one waveform division number z.
 ステップS210(条件付き確率の計算)およびステップS211(重要セグメントの決定)については後述する。 Step S210 (conditional probability calculation) and step S211 (important segment determination) will be described later.
 図2に戻り、ステップS109では、疑似判定評価部17が、ステップS209で計算された疑似正答率Gzが、1つ前の波形分割数z-1のときの疑似正答率Gz-1より小さいか否かを判定する。GzがGz-1以上のときは、さらに大きい値の疑似正答率が得られる可能性があると判断し、ステップS105に戻り、波形分割数zを1インクリメントして、同様の手順を繰り返す。一方、疑似正答率GzがGz-1より小さいときは、これより大きい値の疑似正答率を得られないと判断し、ステップS110に進む。 Returning to FIG. 2, in step S109 the pseudo judgment evaluation unit 17 determines whether the pseudo correct-answer rate Gz calculated in step S209 is smaller than the pseudo correct-answer rate Gz-1 obtained for the previous waveform division number z-1. When Gz is greater than or equal to Gz-1, it is judged that a still higher pseudo correct-answer rate may be obtainable; the process returns to step S105, the waveform division number z is incremented by 1, and the same procedure is repeated. On the other hand, when the pseudo correct-answer rate Gz is smaller than Gz-1, it is judged that no higher pseudo correct-answer rate can be obtained, and the process proceeds to step S110.
 ここでステップS110、S111の説明を行うに先立ち、図3のステップS205(訓練学習によるモデル化処理)の詳細を説明する。 Here, prior to description of steps S110 and S111, details of step S205 (modeling processing by training learning) in FIG. 3 will be described.
 図4は図3のステップS205の詳細を示すフローチャートである。ここでは波形分割数z=4の場合を例に説明する。 FIG. 4 is a flowchart showing details of step S205 in FIG. Here, a case where the number of waveform divisions z = 4 will be described as an example.
 まず、ステップS301では、確率・尤度計算部16が、訓練データ集合(z=4でセグメント化されている)における各訓練データのクラスに基づき、正常および異常のそれぞれの生起確率を事前確率p(Ci)として計算する。たとえば訓練データ集合のサイズが200であり、正常クラスが140個、異常クラスが60個存在する場合は、正常の事前確率p(C1=正常)=0.7、異常の事前確率p(C2=異常)=0.3である(図25の左上を参照)。なお本ステップS301は1回のみ行えばよく、次回以降は本ステップの処理はスキップしてよい。 First, in step S301, the probability/likelihood calculation unit 16 calculates the occurrence probabilities of normal and abnormal as prior probabilities p(Ci), based on the class of each training data in the training data set (segmented with z = 4). For example, when the size of the training data set is 200 with 140 normal-class and 60 abnormal-class data, the normal prior probability is p(C1 = normal) = 0.7 and the abnormal prior probability is p(C2 = abnormal) = 0.3 (see the upper left of FIG. 25). Step S301 needs to be performed only once, and may be skipped from the next iteration onward.
 次に、ステップS302で、疑似判定評価部17が、初期設定を行うことにより、変量ID(チャネルID)を示すiを0に設定し、区間のIDを示すjを0に設定する。 Next, in step S302, the pseudo determination evaluation unit 17 performs initialization and sets i indicating variable ID (channel ID) to 0 and j indicating section ID to 0.
 次に、ステップS303aで疑似判定評価部17が、チャネルiを1インクリメントし、ステップS303bで変量(チャネル)jを1インクリメントする。 Next, in step S303a, the pseudo judgment evaluation unit 17 increments channel i by 1, and in step S303b increments variable (channel) j by 1.
 次に、ステップS304で、疑似判定評価部17が、疑似判定対象データTvに対して、時系列データ分類問題で実績のあるk-最近傍法を用いて、疑似判定対象データTvのクラス推定(疑似判定)を行う。k-最近傍法とは、特徴空間上で、疑似判定対象に最も近いk個の事例を抽出し、そのk個の事例のそれぞれのクラスの中で、最も多数を占めるクラスを、疑似判定対象の推定クラスとして決定する判定方法である。以下詳細に説明する。 Next, in step S304, the pseudo judgment evaluation unit 17 performs class estimation (pseudo judgment) of the pseudo determination target data Tv using the k-nearest-neighbor method, which has a proven record in time-series data classification problems. The k-nearest-neighbor method extracts, in the feature space, the k cases closest to the pseudo judgment target and determines, as the estimated class of the pseudo judgment target, the class that occupies the majority among the classes of those k cases. This is described in detail below.
 疑似判定対象データTvにおける各変量(各チャネル)の各セグメントデータについて、k個の最近傍のセグメントデータを、疑似判定対象データTv以外の残りの訓練データ(残りの分割集合)の中から同一変量内で見つける。図15に疑似判定対象データdN+1の変量(チャネル)1およびセグメントs1に着目し、疑似判定対象データdN+1のセグメントデータs1に最も類似度が高い(距離が近い)上位k個のセグメントデータs1を、残りの訓練データの変量(チャネル)1の時系列データから見つける例を示す。ただしk=5とする。図示の例では、訓練データd13, d14, d15, d17, d16における変量1のセグメントデータs1が特定されている。ここでセグメントデータ間の距離の計算には、Dynamic Time Warping(DTW)距離やEuclidean(ユークリッド)距離などの尺度を用いればよい。ここでは訓練データd13, d14, d15, d17, d16の変量1のセグメントデータs1に対する距離がそれぞれ3.5, 9.3, 12.9, 13.2, 14.1と計算されている。なお図中、dist (x,y)はセグメントデータxとセグメントデータyとの距離を示す。 For each segment data of each variable (each channel) in the pseudo determination target data Tv, the k nearest segment data are found within the same variable from the remaining training data (the remaining divided sets) other than the pseudo determination target data Tv. FIG. 15 shows an example that focuses on variable (channel) 1 and segment s1 of the pseudo determination target data dN+1 and finds, from the time-series data of variable (channel) 1 of the remaining training data, the top k segment data s1 with the highest similarity (smallest distance) to the segment data s1 of dN+1, where k = 5. In the illustrated example, the segment data s1 of variable 1 in training data d13, d14, d15, d17, and d16 are identified. For the calculation of the distance between segment data, a measure such as the Dynamic Time Warping (DTW) distance or the Euclidean distance may be used. Here, the distances to the segment data s1 of variable 1 of training data d13, d14, d15, d17, and d16 are calculated as 3.5, 9.3, 12.9, 13.2, and 14.1, respectively. In the figure, dist(x, y) denotes the distance between segment data x and segment data y.
 このように上位k(=5)個のセグメントデータを特定したらこれらのセグメントデータに関連するクラスの中で、最も個数の多いクラスを特定する。これを定式化すると式1-1のようになる。 When the top k (= 5) segment data are identified in this way, the class with the largest count among the classes associated with these segment data is identified. This is formulated as Equation 1-1:

  Ĉ = argmax_{C ∈ {正常 (normal), 異常 (abnormal)}} freq(C)   …(式1-1)

ここでfreq(C)は、k個の最近傍セグメントデータのうちクラスがCであるものの個数である。Here, freq(C) is the number of the k nearest segment data whose class is C.
 図15の例では、訓練データd13, d14, d15, d17, d16のクラスはすべて正常である。すなわち異常の頻度、正常の頻度は、freq(異常、正常)=(0,5)である。よって上記式1-1に従って、推定結果は正常と判定される。ここで、疑似判定対象データdN+1の実際のクラスは異常である。従ってこの推定結果は不正解(誤り)となる。 In the example of FIG. 15, the classes of training data d13, d14, d15, d17, and d16 are all normal; that is, the frequencies of abnormal and normal are freq(abnormal, normal) = (0, 5). Therefore, according to Equation 1-1, the estimation result is determined to be normal. Here, the actual class of the pseudo determination target data dN+1 is abnormal, so this estimation result is an incorrect answer (an error).
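The k-nearest-neighbor estimation of step S304 can be sketched as follows, using the Euclidean distance between segment data and k = 5 as in the example above; the tie-breaking rule (a tie counts as normal) is an assumption of this sketch.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length segment data vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_estimate(query_segment, candidates, k=5):
    """candidates: list of (segment, class_label) pairs drawn from the same
    variable (channel) of the remaining training data.
    Returns the majority class among the k nearest segments."""
    nearest = sorted(candidates,
                     key=lambda sc: euclidean(query_segment, sc[0]))[:k]
    n_abnormal = sum(1 for _, c in nearest if c == 'abnormal')
    return 'abnormal' if n_abnormal > k - n_abnormal else 'normal'
```

A DTW distance could be substituted for `euclidean` without changing the voting logic.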
 次にステップS305では、疑似判定評価部17は、ステップS304で得られた上記正常の頻度と異常の頻度とに基づき、変量毎かつ区間毎の正常および異常の頻度分布表を更新する。頻度分布表のフォーマット例を図27に示す。頻度分布表はたとえば波形分割数z毎に用意される。最初、頻度分布表の全ての項目にゼロが設定されている。上記計算例では、チャネル1の分布表(図27の左上)においてセクションs1の正常の項目に5を加算し、異常の項目には何ら加算しない。 Next, in step S305, the pseudo judgment evaluation unit 17 updates the normal and abnormal frequency distribution tables for each variable and for each section based on the normal frequency and the abnormal frequency obtained in step S304. A format example of the frequency distribution table is shown in FIG. For example, the frequency distribution table is prepared for each waveform division number z. Initially, all items in the frequency distribution table are set to zero. In the above calculation example, 5 is added to the normal item of section s1 in the distribution table of channel 1 (upper left of FIG. 27), and nothing is added to the abnormal item.
 次にステップS306では、疑似判定評価部17が、ステップS304での推定が正解か不正解かに応じてスコア表を更新する。スコア表とは、疑似判定評価を進める過程で選択される全てのチャネルとセグメント(区間)との組合せ毎にスコアを格納するものである(後述する図17の上図を参照)。スコア表の各マスの初期値は0である。正解の場合には該当するマスに所定のスコア(ここでは1)を加算する。例えば図16に示すように、疑似判定対象データd1における変量(チャネル)1のセグメントs2に関して、ステップS304での推定結果が正解であったとした場合、変量(チャネル)1とセグメントs2に対応するマスのスコアscore(ch1,s2)に1を加算する。すなわち、score(ch1,s2)=0+1となる。スコア表は、波形分割数z毎に存在する。 Next, in step S306, the pseudo judgment evaluation unit 17 updates the score table according to whether the estimation in step S304 is correct or incorrect. The score table stores a score for each combination of channel and segment (section) selected in the course of the pseudo judgment evaluation (see the upper diagram of FIG. 17, described later). The initial value of each cell of the score table is 0. When the estimation is correct, a predetermined score (here, 1) is added to the corresponding cell. For example, as shown in FIG. 16, if the estimation result in step S304 for segment s2 of variable (channel) 1 in the pseudo determination target data d1 is correct, 1 is added to the score score(ch1, s2) of the cell corresponding to variable (channel) 1 and segment s2; that is, score(ch1, s2) = 0 + 1. A score table exists for each waveform division number z.
 次にステップS309ではセグメントjがjmaxに達したか否かを判定し、達していないときはステップS303bに戻ってjをインクリメントして次のセグメントを選択する。達したときは次のステップS310に進む。上記図16には変量1(チャネル1)のセクションs2についてステップS304の最近傍計算を行う様子が示される。ここでは推定結果が異常であり、疑似判定対象データdN+1も異常であるため正解となっている。 In step S309, it is determined whether segment j has reached jmax. If not, the process returns to step S303b, where j is incremented and the next segment is selected; if it has been reached, the process proceeds to the next step, S310. FIG. 16 shows how the nearest-neighbor calculation of step S304 is performed for section s2 of variable 1 (channel 1). Here, the estimation result is abnormal, and the pseudo determination target data dN+1 is also abnormal, so the estimation is a correct answer.
 ステップS310では、変量(チャネル)iがimaxに達したか否かを判定し、達していないときはステップS303aに戻って次の変量(チャネル)を選択し、達したときは、次のステップS311に進む。また図26には変量3(チャネル3)のセクションs2についてステップS304の最近傍計算を行う様子が示される。ここでは推定結果が正常であり、疑似判定対象データdN+1は異常であるため不正解となっている。 In step S310, it is determined whether the variable (channel) i has reached imax. If not, the process returns to step S303a and the next variable (channel) is selected; if it has been reached, the process proceeds to the next step, S311. FIG. 26 shows how the nearest-neighbor calculation of step S304 is performed for section s2 of variable 3 (channel 3). Here, the estimation result is normal, while the pseudo determination target data dN+1 is abnormal, so the estimation is an incorrect answer.
 ステップS311では疑似判定対象データTvの評価値を計算する。変量1~4毎のセグメントs1~s4について行った合計16回の判定(S304)のうち少なくともいずれか1つについて判定結果が異常でありかつ正解であるときは評価値rを1.0、それ以外のときは0.0とする。または正解の回数が不正解の回数よりも多いときは1.0、正解の回数が不正解の回数以下のときは0.0としてもよい。または判定回数に対する正解の回数の比率を評価値rとしてもよい。ステップS311を終えたら本フローを終了し、図3のステップS207に戻る。 In step S311, the evaluation value of the pseudo determination target data Tv is calculated. When at least one of the total of 16 judgments (S304) performed for segments s1 to s4 of each of variables 1 to 4 is both abnormal and correct, the evaluation value r is set to 1.0; otherwise, it is set to 0.0. Alternatively, r may be set to 1.0 when the number of correct answers exceeds the number of incorrect answers and 0.0 otherwise, or the ratio of the number of correct answers to the number of judgments may be used as the evaluation value r. When step S311 is completed, this flow ends and the process returns to step S207 in FIG. 3.
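The first of the three alternatives for the evaluation value r given above (r = 1.0 when at least one per-variable, per-segment judgment is both abnormal and correct, and 0.0 otherwise) can be sketched as follows; the list-based layout of the judgments is an assumption of this sketch.

```python
def evaluation_value(judgments, true_class):
    """judgments: list of estimated classes, one per (variable, segment)
    pair (e.g. 4 variables x 4 segments = 16 entries).
    Returns 1.0 if at least one judgment is 'abnormal' AND correct
    (i.e. true_class is also 'abnormal'); otherwise returns 0.0."""
    for estimated in judgments:
        if estimated == 'abnormal' and estimated == true_class:
            return 1.0
    return 0.0
```

The majority-based and ratio-based alternatives mentioned in the text would replace the loop with a count of correct versus incorrect judgments.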
 なお1つの分割集合に複数の訓練データが含まれるときは各訓練データについてステップS302~S310の処理を行い、その後、同様の基準により評価値を計算すればよい。 Note that when a plurality of training data are included in one divided set, the processing of steps S302 to S310 may be performed for each training data, and thereafter, an evaluation value may be calculated according to the same criteria.
 図3のステップS207では評価値rをqに加算して、qを更新する。そして、次の疑似判定対象データTvの選定に進み、同様にして図4のフローを行う。 In step S207 of FIG. 3, the evaluation value r is added to q, and q is updated. Then, the process proceeds to selection of the next pseudo determination target data Tv, and the flow of FIG. 4 is performed in the same manner.
 Once all training data (all divided sets) have each been selected once as the pseudo judgment target data (pseudo judgment target data set) Tv, the process proceeds to step S209 and calculates the pseudo correct-answer rate (average evaluation value) Gz = q / v_max, that is, the average of the evaluation values over all training data (all divided sets).
 In the next step S210, the probability/likelihood calculation unit 16 calculates the normal and abnormal conditional probabilities p(X|C) for each combination of variable and segment, based on the frequency distribution table updated in step S305 of FIG. 4. For example, for the pair of variable 2 and segment s2, p(X2=s2|C) = (abnormal = 0.8, normal = 0.067).
 Here p(X|C) expresses, under each condition C (that is, C = abnormal and C = normal), the probability that the variable X takes each segment. In other words, p(X|C) is obtained by normalizing the normal/abnormal distribution over the segments separately for C = normal and C = abnormal. With P(X) alone, even if the normal probability exceeds the abnormal probability, the comparison is distorted: if the prior occurrence probability of normal in P(C) is, say, five times that of abnormal, this difference in prior probabilities must be corrected by a factor of 1/5 when considering the conditional probability. Therefore the probability of each segment is here normalized by p(C) and computed as p(X|C). For each value of C (normal and abnormal), the probabilities are computed so that they sum to 1 over all possible segment types.
 As an example, the conditional probability calculations for variables 2 and 3 are shown at the top center and upper right of FIG. 25, respectively. Given the illustrated frequency distribution f(X2|C) for variable 2, the conditional probability p(X2|C) is calculated as in the illustrated table: for each of normal and abnormal, the frequency of each segment is divided by the total frequency. Variable 3 is calculated by the same method, and although not illustrated, the conditional probabilities of variables 1 and 4 are calculated likewise. When the probabilities are strongly skewed, for example an abnormal conditional probability of 0.1 against a normal conditional probability of 0.9, the normal probability can be said to be very high compared with a case such as abnormal = 0.55 and normal = 0.45, and such a probability approaches a confidence measure.
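The per-class normalization just described can be sketched as follows. The frequency values are hypothetical, chosen so that the quoted probabilities p(X2=s2|C) = (abnormal = 0.8, normal = 0.067) come out; they are not the actual values of FIG. 25:

```python
# Sketch of step S210: turning a frequency table f(X|C) into conditional
# probabilities p(X|C), normalized separately for C=normal and C=abnormal.
freq = {  # freq[clazz][segment] -- hypothetical counts
    "abnormal": {"s1": 0, "s2": 4, "s3": 1, "s4": 0},
    "normal":   {"s1": 6, "s2": 1, "s3": 5, "s4": 3},
}

def conditional_probabilities(freq):
    p = {}
    for clazz, counts in freq.items():
        total = sum(counts.values())
        # within each class, probabilities over all segments sum to 1
        p[clazz] = {seg: n / total for seg, n in counts.items()}
    return p

p_x2 = conditional_probabilities(freq)
print(p_x2["abnormal"]["s2"])          # 0.8
print(round(p_x2["normal"]["s2"], 3))  # 0.067
```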
 The process then proceeds to step S211, where an important segment (important section) is determined for each variable. That is, once all training data have been selected as the pseudo judgment target data Tv and the flow of FIG. 4 has been performed for each, a score table such as that shown at the top of FIG. 17 is finally obtained. In step S211, the pseudo judgment evaluation unit 17 selects the important segment (important section) of each variable (channel) based on this score table and records the information in the segment storage unit 15. Specifically, for each variable, the segment (section) with the highest score in the table is selected as the important segment. For example, since segment s1 has the highest score for variable (channel) 1, s1 is selected as its important segment; similarly, segments s2, s2, and s4 are selected for variables 2 to 4. Selection methods for the case where several segments share the same score, and other selection methods, are described later.
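A minimal sketch of the per-variable selection of step S211; the score values below are illustrative and merely shaped like the FIG. 17 example, and only the maximum-score rule itself comes from the text:

```python
# Sketch of step S211: pick the highest-scoring segment of each variable
# from the score table (hypothetical scores).
scores = {
    "variable1": {"s1": 9, "s2": 3, "s3": 1, "s4": 2},
    "variable2": {"s1": 2, "s2": 8, "s3": 0, "s4": 1},
    "variable3": {"s1": 1, "s2": 7, "s3": 2, "s4": 3},
    "variable4": {"s1": 0, "s2": 2, "s3": 1, "s4": 9},
}

# argmax over segments, independently for each variable
important = {var: max(segs, key=segs.get) for var, segs in scores.items()}
print(important)
# {'variable1': 's1', 'variable2': 's2', 'variable3': 's2', 'variable4': 's4'}
```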
 When the important segments have been determined in step S211, this flow ends and the process proceeds to step S109 of FIG. 2.
 In step S109, as mentioned briefly above, it is checked whether the pseudo correct-answer rate Gz calculated in step S209 is smaller than the rate Gz-1 obtained for waveform division number z-1. If it is smaller, the best model selection unit 19 stores, in the next step S110, the important segments per variable that were selected when Gz-1 was obtained in the best model storage unit 18 as the best segments (best sections).
 Likewise, when step S106 determines that the waveform division number z exceeds the maximum waveform division number z_max, the best segments (best sections) for Gz-1 are identified and stored in the same manner. Note that in this case, since z was incremented in step S105, Gz-1 coincides with Gz_max.
 Once the best segment (best section) of each channel (variable) has been obtained, if merging other segments (other sections) of a channel leaves room to improve the pseudo judgment performance, a merging approach may be introduced and the merged segment used as the best segment.
 Next, in step S111, the best model selection unit 19 stores the normal and abnormal prior probability information and the conditional probabilities of each variable obtained in step S210 (those corresponding to the z for which the best model was obtained) in the best model storage unit 18. The stored conditional probabilities may be limited to those of the segments identified as the best segment of each variable.
 The model formula (described later) used by the client for judgment is also stored in the best model storage unit 18. The model formula can be generated automatically once the best segment of each variable is determined.
 The best model selection unit 19 also reads the segment data of the best section of each variable and the corresponding class from the segment storage unit 15 and stores them in the best model storage unit 18. The segments read out may cover all of the training data, a predetermined number of training data for each of normal and abnormal, or may be determined by other criteria.
 The best model selection unit 19 further stores in the best model storage unit 18 the detailed information (time length) of each section (segment) obtained when dividing by the adopted waveform division number z; at least the detailed information of the sections corresponding to the best segments is stored.
 These information data stored in the best model storage unit 18 form the judgment model. FIG. 18 shows an example of the data stored in the best model storage unit 18.
 In the example of FIG. 18, segment s2 is identified as the best segment for all variables. For simplicity of notation, FIG. 18 omits the detailed tables of p(X1|C), p(X3|C), and p(X4|C) and writes simply p(X1|C), p(X3|C), p(X4|C). How to read the model formula of (2) is described later.
 Next, the transmission unit 20 transmits the judgment model stored in the best model storage unit 18 to the client. Alternatively, template data for the model formula may be given to the client in advance, and the client may generate the model formula from the template based on the best segment (best section) of each variable; in this case the server need not include the model formula in the judgment model sent to the client. When a plurality of clients are connected to one server, the judgment model is transmitted to each of them. By receiving the judgment model, the client becomes ready for abnormality judgment.
 (Supplement) A supplementary explanation of the conditional probabilities is given here. Normally, to estimate a judgment result with a Bayes classifier, the conditional probability p(Xi|C) must be obtained for each variable (attribute) Xi, and the question is what the attribute values of Xi should be. Conditional probabilities are usually calculated from how often each discrete attribute value ai of the attribute Xi occurs; in the present embodiment, however, the segment data cut out of a time-series waveform have no attribute values in themselves, so the probabilities cannot be computed directly. In the fields of time-series clustering and time-series classification, time-series waveforms are commonly clustered into several category types, and each category type is treated as an attribute value, but the appropriate number of clusters (category types) is not easy to determine. Moreover, when the division number grows, or when many variables are handled at once, a cluster count would have to be determined for every segment of every variable, which is impractical. Treating every segment of every variable as an attribute Xi is also conceivable, but this raises the cost of the probability computation. In the present embodiment, therefore, the segment type yielding the highest pseudo judgment performance is used as the attribute value of each variable's attribute Xi, which makes it possible to discretize the time-series data within the framework of the Bayes classification problem without losing information. The segment selected for each variable can then be expressed with a degree of confidence; for example, as shown earlier, the conditional probability can be expressed as p(X2=s2|C) = (abnormal = 0.8, normal = 0.067).
 (First modification of the server)
 When, as in the score table of FIG. 17, only one segment (section) has the highest score in each variable, the important segment is uniquely determined. When several segments tie for the highest score, however, that method cannot decide uniquely. In such a case, the important segments (important sections) are determined by the following method.
 FIG. 19 is a diagram explaining the important-segment determination method according to the first modification.
 In the example of FIG. 19, segments s1 and s4 of variable (channel) 4 have the same score (9 points each). Variables 1, 2, and 3 each have a single highest-scoring segment, namely s1, s2, and s2, respectively.
 In this case, a plurality of candidates are generated by taking all combinations of the highest-scoring segments across variables 1 to 4. Then, as shown in the flowchart of FIG. 5, the abnormal likelihood of each candidate is calculated (S222). The likelihood is a measure of the plausibility of a judgment and is defined as the product of probabilities. The calculated likelihoods are compared, and the candidate with the highest likelihood is selected (S223).
 In the illustrated example, candidate c1 = (variable 1, variable 2, variable 3, variable 4) = (s1, s2, s2, s1) and candidate c2 = (s1, s2, s2, s4) are generated, the abnormal likelihood is calculated for each candidate, and the candidate with the highest likelihood is selected (pseudo judgment evaluation by likelihood calculation).
 The likelihood is calculated according to the following equation. Since the calculation yields a vector consisting of the normal likelihood and the abnormal likelihood, the abnormal likelihood is taken and compared between the candidates. Here p(C) is the prior probability and p(Xj = sj|C) is the conditional probability; for the latter, the values calculated in step S210 of FIG. 3 can be used.

 L(C) = p(C) × Π_j p(Xj = sj | C)    (Equation 1-2)

 where sj denotes the candidate's segment for variable Xj.
 In the example of FIG. 19, actually calculating the abnormal likelihood of each candidate gives 0.087 for candidate c1 and 0.084 for candidate c2; since candidate c1 has the larger value, it is selected. That is, the important segment of variable (channel) 4 is determined to be s1. Note that when, as in the example of FIG. 19, only one variable has tied segments, the same result is obtained by comparing the abnormal conditional probabilities of the tied segments and choosing the segment with the largest value as the important segment.
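The candidate comparison can be sketched as below. The prior and conditional probabilities are invented for illustration (they are not the FIG. 25 values), so the resulting likelihoods differ from the 0.087 and 0.084 quoted above, but the selection rule of Equation 1-2 is the same:

```python
from math import prod

# Sketch of the tie-breaking of FIG. 19: for each candidate (one segment per
# variable), the abnormal likelihood is
#   L(abnormal) = p(abnormal) * prod_j p(Xj = sj | abnormal)
p_abnormal = 0.5  # hypothetical prior
p_cond = {  # p_cond[variable][segment] = p(Xj=segment | C=abnormal)
    1: {"s1": 0.7, "s4": 0.5},
    2: {"s2": 0.8},
    3: {"s2": 0.75},
    4: {"s1": 0.6, "s4": 0.55},
}

def abnormal_likelihood(candidate):
    return p_abnormal * prod(p_cond[j][seg] for j, seg in candidate.items())

c1 = {1: "s1", 2: "s2", 3: "s2", 4: "s1"}
c2 = {1: "s1", 2: "s2", 3: "s2", 4: "s4"}
best = max([c1, c2], key=abnormal_likelihood)
# c1 wins here because p(X4=s1|abnormal)=0.6 > p(X4=s4|abnormal)=0.55
```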
 (Second modification of the server)
 (Method 1) In the first embodiment, the important segment was identified by the maximum score; as an alternative, the combination of segments across the variables with the highest likelihood can be selected. The reason is that although the best segment is selected within each variable, that choice does not necessarily give the best judgment accuracy when the pseudo judgment is made over all variables together. In the processing of this modification, the abnormal likelihood is calculated for all combinations of segments across the variables, for example according to Equation 1-2 above (S222 in FIG. 5), and the combination with the highest likelihood is selected (S223 in FIG. 5).
 (Method 2) The following method is also possible as part of this second modification.
 Namely, a lower-limit threshold θ on the evaluation score is determined in advance, every segment whose score exceeds the threshold is selected as a candidate, and the final important segments are determined from among the candidates.
 FIG. 20 is a diagram explaining the important-segment determination method according to this second modification.
 The lower-limit threshold θ is set to 5. In each of variables (channels) 1 to 4, the segments with a score above 5 are selected as candidates: s1 and s4 for variable 1, s2 for variable 2, s2 for variable 3, and s1, s3, and s4 for variable 4. Combining the selected segments across the variables yields the following six candidates c1 to c6.
 c1 = (s1, s2, s2, s1), c2 = (s4, s2, s2, s1), c3 = (s1, s2, s2, s3), c4 = (s4, s2, s2, s3), c5 = (s1, s2, s2, s4), c6 = (s4, s2, s2, s4)
 The abnormal likelihood is calculated for each candidate in the same manner as in Method 1 (S222 in FIG. 5), and the candidate with the highest likelihood is selected (S223 in FIG. 5). In this example, the likelihoods are calculated as L1 = 0.013 for candidate c1, L2 = 0.031 for c2, L3 = 0.024 for c3, L4 = 0.062 for c4, L5 = 0.033 for c5, and L6 = 0.093 for c6. Since candidate c6 has the largest likelihood, the segments contained in c6 are determined as the important segments: s4 for variable 1, s2 for variable 2, s2 for variable 3, and s4 for variable 4.
 With this determination method, no important segment is selected for a variable (channel) that has no segment with a score above the threshold θ. Such a variable can be regarded as being of little necessity for abnormality detection, and there is the additional advantage that the data of that variable (channel) need not be used for the abnormality judgment. When several candidates share the largest likelihood, a plurality of candidates may be selected.
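A sketch of Method 2: candidate generation with the threshold θ followed by selection on abnormal likelihood. The scores follow the FIG. 20 example (s1 and s4 above θ for variable 1; s1, s3, s4 for variable 4), while the probabilities are invented for illustration:

```python
from itertools import product
from math import prod

theta = 5  # lower-limit threshold on the evaluation score
scores = {  # score table shaped like FIG. 20
    1: {"s1": 9, "s2": 3, "s3": 2, "s4": 7},
    2: {"s1": 1, "s2": 8, "s3": 0, "s4": 4},
    3: {"s1": 2, "s2": 9, "s3": 3, "s4": 1},
    4: {"s1": 6, "s2": 2, "s3": 7, "s4": 8},
}
p_cond = {  # hypothetical p(Xj=segment | C=abnormal)
    1: {"s1": 0.3, "s4": 0.7},
    2: {"s2": 0.8},
    3: {"s2": 0.75},
    4: {"s1": 0.3, "s3": 0.5, "s4": 0.7},
}
p_abnormal = 0.5  # hypothetical prior

variables = sorted(scores)
# keep only segments whose score exceeds theta, per variable
per_var = [[s for s, sc in scores[v].items() if sc > theta] for v in variables]
candidates = list(product(*per_var))  # 2 * 1 * 1 * 3 = 6 candidates

def likelihood(cand):
    return p_abnormal * prod(p_cond[v][s] for v, s in zip(variables, cand))

best = max(candidates, key=likelihood)
print(best)  # ('s4', 's2', 's2', 's4')
```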
 (Third modification of the server)
 Depending on the sensing target (monitoring target), it may be unnecessary to consider the time differences between the sensors. In that case, the segment at the same position may be selected in every variable (channel), and the estimation may be performed by the k-nearest-neighbor method on series of the same segment.
 In this case, the conditional probabilities are calculated per flow unit of FIG. 4 corresponding to step S205 of FIG. 3 (that is, per divided-set identifier v). As shown in FIG. 21, the normal and abnormal conditional probabilities of the same segment (for example, s2) are multiplied across the variables, and the prior probability is further multiplied in, giving the normal and abnormal likelihoods; the state with the larger value is adopted. This is expressed by Equation 1-3 below:

 estimated state = argmax over C of p(C) × Π_j p(Xj = s | C)    (Equation 1-3)
 If the adopted state matches the state of the pseudo judgment target data, the judgment is correct and 1 is added to the score; if not, no score is added. The same calculation is performed for the other same-position segments (s1, s3, s4) across the variables, adopting normal or abnormal for each, and the score is incremented for each correct judgment. The score table is updated in this way (its size is 1 × 4, compared with 4 × 4 in the first embodiment).
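One scoring round of this modification (Equation 1-3 applied to segment position s2) can be sketched as follows; the prior and conditional probabilities are illustrative, not the FIG. 25 values:

```python
from math import prod

# Sketch of the third modification: multiply the same-segment conditional
# probabilities across all variables and the prior, then adopt the class
# with the larger likelihood and score a correct judgment.
p_prior = {"normal": 0.8, "abnormal": 0.2}  # hypothetical priors
p_cond_s2 = {  # p(Xj=s2 | C) for variables 1..4, hypothetical
    "normal":   [0.5, 0.4, 0.6, 0.5],
    "abnormal": [0.3, 0.8, 0.7, 0.2],
}

likelihood = {c: p_prior[c] * prod(p_cond_s2[c]) for c in ("normal", "abnormal")}
estimate = max(likelihood, key=likelihood.get)

score = {"s1": 0, "s2": 0, "s3": 0, "s4": 0}
true_class = "normal"  # class of the pseudo judgment target data
if estimate == true_class:
    score["s2"] += 1  # correct judgment: add 1 to this segment's score
```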
 In the example of FIG. 21, the normal and abnormal likelihoods are calculated in this way (for ease of understanding, the values of the table in FIG. 25 are used here). The normal likelihood is higher than the abnormal likelihood, so normal is the estimation result in this case. Since the class of the pseudo judgment target data d1 is normal, this estimation is correct, and 1 is added to the score score(s2) for segment s2.
 In some cases, the bias of the class prior distribution may already have been corrected by the k-nearest-neighbor method. In that case, Equation 1-4, obtained by removing the prior probability distribution p(C) from the estimation formula of Equation 1-3, may be used instead of Equation 1-3:

 estimated state = argmax over C of Π_j p(Xj = s | C)    (Equation 1-4)
 (Fourth modification of the server)
 This modification shows another method for determining the best segment of each variable (channel).
 FIG. 22 is a diagram explaining the best-segment determination method according to this modification.
 First, the segment length and provisional position of the best segment are determined in advance for each variable. A segment (section) of the predetermined length is placed at each provisional position, and each segment is shifted back and forth along the time axis from its provisional position in steps of the minimum movement interval Δ; the combination of positions that gives the best pseudo judgment evaluation value Gz is determined, and thereby the best segments are determined. In the example of FIG. 22, the maximum shift width equals the segment length. Among all combinations of segments across the variables obtained by shifting in Δ steps, the combination with the highest abnormal likelihood is selected. In this way, the possible combinations of segment positions are explored, and the best segment of each variable is determined by comparing the pseudo judgment evaluations.
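The position search can be sketched as an exhaustive Δ-step scan. The evaluation function below is a toy stand-in for the pseudo judgment evaluation value Gz, and all names and numbers are assumptions for illustration:

```python
from itertools import product

def candidate_positions(provisional, seg_len, delta):
    # shift at most one segment length backward and forward, in delta steps
    offsets = range(-seg_len, seg_len + 1, delta)
    return [provisional + off for off in offsets]

def best_positions(provisionals, seg_len, delta, evaluate):
    # enumerate every cross-variable combination of shifted positions and
    # keep the combination with the best evaluation
    per_var = [candidate_positions(p, seg_len, delta) for p in provisionals]
    return max(product(*per_var), key=evaluate)

# toy evaluation: prefer positions closest to a (hypothetical) optimum
optimum = (12, 30, 8)
evaluate = lambda pos: -sum(abs(a - b) for a, b in zip(pos, optimum))
print(best_positions((10, 28, 6), seg_len=4, delta=2, evaluate=evaluate))
# (12, 30, 8)
```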
 (Fifth modification of the server)
 This modification describes the case where a stochastic dependency from a first variable to a second variable is known. Here the first variable is variable (channel) X3 and the second variable is variable (channel) X2. The dependency between the variables, that is, between the sensors, is specified to the server in advance by the user.
 When there is a stochastic dependency from variable X3 to variable X2, the likelihood is calculated according to Equation 1-5, and the judgment estimation formula (the formula that selects the larger of the normal and abnormal likelihoods) becomes Equation 1-6:

 L(C) = p(C) × p(X2 | X3, C) × Π over j ≠ 2 of p(Xj | C)    (Equation 1-5)
 estimated state = argmax over C of p(C) × p(X2 | X3, C) × Π over j ≠ 2 of p(Xj | C)    (Equation 1-6)
 When the prior probability of the class distribution is not used, as in the modification using Equation 1-4, the judgment may be made with Equation 1-7 instead of Equation 1-6:

 estimated state = argmax over C of p(X2 | X3, C) × Π over j ≠ 2 of p(Xj | C)    (Equation 1-7)
 The differences from the first embodiment are that in this modification the likelihood is calculated by Equation 1-5, and that the model formula stored in the best model storage unit 18 is the one shown as Equation 1-8 below (the meaning of P(Xnew=s2|C) and the like is described later). In this modification, the normal and abnormal conditional probabilities of variable X2 given variable X3 are also stored in the best model storage unit 18 and included in the judgment model.

 [Equation 1-8]
 FIG. 23 shows an example of the judgment model according to this modification. FIG. 24 schematically illustrates the concept of the model formula of Equation 1-6.
 Here, p(X2=s2|X3=s2,C) indicates that X2=s2 has a probabilistic dependency on the combination of values taken by C and X3=s2. In other words, C and the value taken by X3=s2 exert what may be called a (positive or negative) synergistic influence on X2=s2.
 A calculation example of the conditional probability p(X2=s2|X3=s2,C) is shown at the bottom of FIG. 25. p(X2=s2|X3=s2,C) is calculated from the frequency f(X2=s2|X3=s2,C). Here, as an approximation, the sum of the frequencies f(X2=s2|C) and f(X3=s2|C) is used to obtain f(X2=s2|X3=s2,C). This is one way of preventing the frequency in each cell of the conditional probability table from becoming too small when k is small. In the illustrated example, f(X2=s2|C=abnormal) = 4 and f(X3=s2|C=abnormal) = 2 give f(X2=s2|X3=s2,C=abnormal) = 4 + 2 = 6, while f(X2=s2|C=normal) = 1 and f(X3=s2|C=normal) = 3 give f(X2=s2|X3=s2,C=normal) = 1 + 3 = 4.
 As an alternative calculation method, when the smallness of the frequencies is not a problem, the frequency table of f(X2=s2|X3=s2,C) may be created directly by counting, among the 3 + 2 = 5 data items satisfying f(X3=s2|C=abnormal) and f(X3=s2|C=normal), the abnormal and normal frequencies for which X2=s2.
 The method of calculating the conditional probabilities from the frequency table is the same as for p(X2|C) and p(X3|C) described above, so a duplicate explanation is omitted.
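The frequency approximation of the worked example can be written out as follows; only the summed frequencies from the text are reproduced, and the subsequent normalization into probabilities proceeds exactly as for p(X2|C):

```python
# Sketch of the fifth modification's approximation (FIG. 25, bottom):
# f(X2=s2 | X3=s2, C) is approximated by f(X2=s2|C) + f(X3=s2|C).
f_x2 = {"abnormal": 4, "normal": 1}  # f(X2=s2 | C), from the text
f_x3 = {"abnormal": 2, "normal": 3}  # f(X3=s2 | C), from the text

f_joint = {c: f_x2[c] + f_x3[c] for c in f_x2}
print(f_joint)  # {'abnormal': 6, 'normal': 4}
```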
 (Client)
 The client of FIG. 1 will now be described.
 The client includes a sensing unit 30 that newly senses data with a plurality of sensors, and stores the data sensed by the sensing unit 30 (hereinafter called the judgment target data) in a sensing data storage unit 31.
 The judgment target data input unit 32 monitors whether judgment target data has entered the sensing data storage unit 31; when new judgment target data is present, it reads the data and inputs it to the waveform preprocessing unit 33.
 The waveform preprocessing unit 33 performs waveform preprocessing in the same manner as described for the waveform preprocessing unit 13 of the server.
 The model receiving unit 34 receives the judgment model sent from the server.
 The judgment model storage unit 35 stores the judgment model received by the model receiving unit 34.
 As shown in FIG. 28, the segment selection unit 36 scans over the judgment target data at fixed time intervals using a segment template composed of the best segments of the variables contained in the judgment model, reads out the data inside the scanned template, and outputs it to the abnormality judgment unit 39.
 The abnormality judgment unit 39 calculates the abnormal and normal likelihoods from the cut-out data of each variable input from the segment selection unit 36 and from the judgment model. It judges abnormal if the abnormal likelihood is higher than the normal likelihood, and normal otherwise.
 例えば、各変量において図29の左のようにデータ(波形)が切り出された場合、判定モデルにおいてそれぞれ対応するセグメントデータ(図18の(4)または図23の(4))との距離を比較し、距離の近い上位k’個のセグメントの判定結果(正常または異常)を特定する。ただし、1<k’<kとし、kならびにk’はあらかじめシステムパラメータとして設定しておく。そして、図18の(2)または図23の(2)のモデル式を使って尤度計算を行い、正常となる尤度と、異常となる尤度とを計算し、値の大きい尤度の方を
Figure JPOXMLDOC01-appb-M000009
として出力する。両者の尤度の値が同じときはあらかじめ定めた一方を判定結果とする。
For example, when data (waveforms) are cut out for each variable as shown on the left of FIG. 29, the distances to the corresponding segment data in the determination model (FIG. 18 (4) or FIG. 23 (4)) are compared, and the determination results (normal or abnormal) of the top k' closest segments are identified. Here 1 < k' < k, and k and k' are set in advance as system parameters. Then, likelihood calculation is performed using the model formula of FIG. 18 (2) or FIG. 23 (2) to obtain the normal likelihood and the abnormal likelihood, and the likelihood with the larger value is output as
Figure JPOXMLDOC01-appb-M000009
When the two likelihood values are equal, a predetermined one of the two states is taken as the determination result.
 より詳細に、図18(2)のモデル式に従う場合は、まず変量毎に、正常と異常の割合(条件付き確率)p(Xnew=s2|C)をそれぞれ計算する。たとえばある変量に関し、k’=5で、正常の回数が1,異常の回数が4のときは、p(Xnew=s2|C=正常)=0.2、p(Xnew=s2|C=異常)=0.8となる。そして、変量毎のp(Xnew=s2|C)に、判定モデルに含まれる条件付き確率表の該当するセクション(区間)の正常および異常の確率p(Xj=s2|C)を乗じる(本例ではすべての変量でセクションはs2である)。すなわち、変量毎のp(Xnew=s2|C)に、p(X1=s2|C), p(X2=s2|C), p(X3=s2|C), p(X4=s2|C)を乗じる。そしてさらに正常および異常の事前確率p(C)を掛け合わせ、これにより正常および異常の尤度を得る。図23(2)のモデル式に従う場合は、さらにp(X2|X3,C)の正常および異常の確率を読み出して、これらを乗じることにより正常および異常の尤度を得る。最終的に、正常および異常のうち、大きい方の尤度の状態を決定する。 More specifically, when following the model formula of FIG. 18 (2), the ratio of normal to abnormal (conditional probability) p(Xnew=s2|C) is first calculated for each variable. For example, for a certain variable, when k'=5 with 1 normal and 4 abnormal neighbors, p(Xnew=s2|C=normal)=0.2 and p(Xnew=s2|C=abnormal)=0.8. Then, p(Xnew=s2|C) for each variable is multiplied by the normal and abnormal probabilities p(Xj=s2|C) of the corresponding section of the conditional probability table included in the determination model (in this example the section is s2 for all variables); that is, p(Xnew=s2|C) is multiplied by p(X1=s2|C), p(X2=s2|C), p(X3=s2|C), and p(X4=s2|C). These products are further multiplied by the prior probabilities p(C) of normal and abnormal, which yields the normal and abnormal likelihoods. When following the model formula of FIG. 23 (2), the normal and abnormal probabilities of p(X2|X3,C) are additionally read out and multiplied in to obtain the likelihoods. Finally, the state (normal or abnormal) with the larger likelihood is determined.
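The likelihood calculation described above can be sketched in code as follows — a minimal illustration under assumed data structures; the function and argument names are hypothetical and not part of the embodiment:

```python
def classify(knn_labels_per_var, cpt_per_var, prior):
    """Likelihood calculation in the style of the naive-Bayes model formula.

    knn_labels_per_var: for each variable, the class labels of its k' nearest
        segments, e.g. ["normal", "abnormal", ...]
    cpt_per_var: for each variable, {class: p(Xj = s2 | C)} taken from the
        conditional probability table of the determination model
    prior: {class: p(C)} prior probabilities of normal and abnormal
    """
    likelihood = {}
    for c in ("normal", "abnormal"):
        l = prior[c]
        for labels, cpt in zip(knn_labels_per_var, cpt_per_var):
            p_new = labels.count(c) / len(labels)  # p(Xnew = s2 | C)
            l *= p_new * cpt[c]                    # multiply in p(Xj = s2 | C)
        likelihood[c] = l
    # the state with the larger likelihood wins; ties fall back to "normal" here
    return "abnormal" if likelihood["abnormal"] > likelihood["normal"] else "normal"
```

With k'=5 and one normal and four abnormal neighbors, as in the example above, p(Xnew=s2|C=normal)=0.2 and p(Xnew=s2|C=abnormal)=0.8, so under uniform priors and a symmetric conditional probability table the result is "abnormal".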
 このようにセグメント選択部36と異常判定部39とにより、ちょうど滑走窓をあてがって尤度計算をするのと同様に、判定対象データからセグメントテンプレートを滑走させてセグメントデータを切り出し、判定モデルとセグメントデータとを用いて、尤度を計算する。 In this way, the segment selection unit 36 and the abnormality determination unit 39 slide the segment template over the determination target data to cut out segment data, just as a sliding window would be applied for likelihood calculation, and calculate the likelihood using the determination model and the segment data.
 このようにセグメントテンプレートを滑走させることで、異常判定部39では、スキャンした回数だけの判定結果が出力される。たとえば前半のある個所では正常であったが、後半に異常と判定された場合は図29のようになる。 By sliding the segment template in this way, the abnormality determination unit 39 outputs as many determination results as the number of scans. For example, FIG. 29 shows a case where a portion in the first half is normal but the second half is determined to be abnormal.
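The sliding of the segment template can be sketched as follows; the data layout, the scan step, and the judge callback are illustrative assumptions rather than the embodiment's actual interfaces:

```python
def scan(data_per_var, template, step, judge):
    """Slide a segment template over determination target data.

    data_per_var: {variable: list of samples}
    template: {variable: (offset, length)} -- the best segment per variable
    step: scan interval in samples
    judge: callback mapping {variable: segment} to "normal"/"abnormal"
    Returns one determination result per scan position.
    """
    n = min(len(v) for v in data_per_var.values())
    span = max(off + ln for off, ln in template.values())  # template width
    results = []
    for start in range(0, n - span + 1, step):
        segs = {var: data_per_var[var][start + off: start + off + ln]
                for var, (off, ln) in template.items()}
        results.append(judge(segs))
    return results
```

A run over a single-variable series with an abnormal burst in the second half yields a result per position, mirroring FIG. 29.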
 通知表示部38は、異常判定部39による判定結果を通知または表示する。一例として、図30に、判定対象データのうち異常判定の部分だけを強調表示させる場合の画面表示を示す。判定結果は例えば表示器やスピーカで、遠隔監視端末またはサーバの保守員や係員に通知される。 The notification display unit 38 notifies or displays the determination result of the abnormality determination unit 39. As an example, FIG. 30 shows a screen display in which only the portions of the determination target data determined to be abnormal are highlighted. The determination result is notified, for example via a display or a speaker, to maintenance personnel or attendants at the remote monitoring terminal or the server.
 機器制御部37は、異常判定部39による判定結果に応じて監視対象の動作を制御する。たとえば異常と判定されたときは、監視対象を緊急停止させる。 The device control unit 37 controls the operation to be monitored according to the determination result by the abnormality determination unit 39. For example, when it is determined that there is an abnormality, the monitoring target is urgently stopped.
 判定結果格納部40は、異常判定部39での判定結果と、判定対象データから判定のために切り出した各変量のデータを含む一定時間長(たとえば既に格納済みの訓練データと同一時間長)の時系列データを蓄積する。 The determination result storage unit 40 accumulates the determination result of the abnormality determination unit 39, together with time-series data of a fixed time length (for example, the same length as the already stored training data) containing the data of each variable cut out from the determination target data for the determination.
 判定結果送信部41は、各変量の時系列データと、該当する判定結果とをサーバに送信する。 The determination result transmission unit 41 transmits the time series data of each variable and the corresponding determination result to the server.
 サーバの判定結果受信部22はクライアントから送信された各変量の時系列データと、該当する判定結果とを受信し、通知部21がこれらを監視員に対して表示または通知する。監視員がこの判定結果に間違いないことを確認した後に、通知部21は監視員からの指示入力に応じて、これらの時系列データおよび判定結果をサーバの訓練データ格納部11に追加する。 The server determination result receiving unit 22 receives the time-series data of each variable transmitted from the client and the corresponding determination result, and the notification unit 21 displays or notifies the monitoring member of these. After the monitoring person confirms that the determination result is correct, the notification unit 21 adds the time series data and the determination result to the training data storage unit 11 of the server in response to an instruction input from the monitoring person.
 もし判定に誤りがあれば判定結果を監視員の指示入力に応じて修正した後、これらの時系列データと修正された判定結果とを訓練データ格納部11に格納する。 If there is an error in the determination, the determination result is corrected according to the instruction input from the observer, and the time series data and the corrected determination result are stored in the training data storage unit 11.
 サーバの訓練データ入力部12は、訓練データ格納部11中のデータが更新されたことを検知し、判定モデルの再計算を行うようにしてもよい。 The training data input unit 12 of the server may detect that the data in the training data storage unit 11 has been updated and recalculate the determination model.
 このように常に新しい訓練データを装置に与えられる仕組みを用意することで,判定モデルの精緻化、つまり精度向上が継続的に行えるようになる。これは,日々の監視オペレーションの中で異常判定精度を向上できる可能性を意味しており、高い異常判定性能が要求される領域において特に威力を発揮するものであると考えられる。 By providing a mechanism that continuously feeds new training data to the apparatus in this way, the determination model can be refined, that is, its accuracy can be improved continuously. This means that the abnormality determination accuracy can potentially be improved through daily monitoring operations, which is considered particularly effective in domains where high abnormality determination performance is required.
 図31は、クライアントにおける判定対象データの入力から異常判定部39による判定が行われるまでの動作フローを示すフローチャートである。 FIG. 31 is a flowchart showing an operation flow from the input of the determination target data in the client until the determination by the abnormality determination unit 39 is performed.
 判定対象データ入力部32は、センシングデータ格納部31内の判定対象データを読み出して波形前処理部33に入力する(S401)。 The determination target data input unit 32 reads the determination target data in the sensing data storage unit 31 and inputs it to the waveform preprocessing unit 33 (S401).
 波形前処理部33は、判定対象データに前処理を行い(S402)、セグメント選択部36は、判定モデルに含まれる各変量の最良セグメントからなるセグメントテンプレートに基づきデータの切り出しを行う(S403)。そして切り出したデータを異常判定部39に入力する(S404)。 The waveform preprocessing unit 33 performs preprocessing on the determination target data (S402), and the segment selection unit 36 extracts data based on the segment template including the best segment of each variable included in the determination model (S403). Then, the cut out data is input to the abnormality determination unit 39 (S404).
 異常判定部39は、変量毎(S406)に、k’-最近傍の計算(S407)、条件付き確率(k’-最近傍に基づく正常および異常のそれぞれの割合)の計算を行い(S408)、各変量についてそれぞれS407,S408の処理を終えたら(S409のYES)、前述したモデル式に従って、正常の尤度および異常の尤度を計算する(S410)。異常判定部39は、異常の尤度と正常の尤度とを比較して、値の大きい方の状態を判定の結果として下す(S411)。 For each variable (S406), the abnormality determination unit 39 performs the k'-nearest-neighbor calculation (S407) and calculates the conditional probabilities (the ratios of normal and abnormal among the k' nearest neighbors) (S408). When the processing of S407 and S408 has been completed for every variable (YES in S409), the normal likelihood and the abnormal likelihood are calculated according to the model formula described above (S410). The abnormality determination unit 39 compares the abnormal likelihood with the normal likelihood and outputs the state with the larger value as the determination result (S411).
 図32は、サーバおよびクライアントを実現するためのハードウェア構成の一例を示す。 FIG. 32 shows an example of a hardware configuration for realizing a server and a client.
 サーバはCPU51、RAM52,ROM53、HDD54、I/O55、表示器56、スピーカ57、I/Oコントローラ58、ネットワークインタフェース59を備える。サーバの訓練データ格納部11、セグメント格納部15は例えばHDD54により構成される。モデル送信部20、判定結果受信部22はネットワークインタフェース59により構成されることができる。その他の要素12,13、14、16、17、19、21はたとえばCPU51に実行させるプログラムモジュールとして論理回路によって構成されることができる。プログラムモジュールはROM53またはHDD54に格納しておき、CPU51によってプログラムモジュールを読み出し、RAM52に展開して実行することでそれぞれ対応する論理回路の動作が実現される。 The server includes a CPU 51, RAM 52, ROM 53, HDD 54, I / O 55, display 56, speaker 57, I / O controller 58, and network interface 59. The training data storage unit 11 and the segment storage unit 15 of the server are configured by the HDD 54, for example. The model transmitting unit 20 and the determination result receiving unit 22 can be configured by a network interface 59. The other elements 12, 13, 14, 16, 17, 19, and 21 can be configured by logic circuits as program modules that are executed by the CPU 51, for example. The program modules are stored in the ROM 53 or the HDD 54, read out by the CPU 51, developed in the RAM 52, and executed, whereby the operations of the corresponding logic circuits are realized.
 クライアントはCPU61、RAM62,ROM63、HDD64、I/Oコントローラ65、表示器66、スピーカ67、I/O68、ネットワークインタフェース69を備える。クライアントのセンシングデータ格納部31、判定モデル格納部35、判定結果格納部40は、例えばRAM62またはHDD64により構成されることができる。モデル受信部34、判定結果送信部41はネットワークインタフェース69により構成されることができる。その他の要素32、33、36、37、38は、たとえばCPU61に実行させるプログラムモジュールとして構成されることができる。プログラムモジュールはROM63またはHDD64に格納しておき、CPU61によってプログラムモジュールを読み出し、RAM62に展開して実行することでそれぞれ対応する論理回路の動作が実現される。 The client includes a CPU 61, RAM 62, ROM 63, HDD 64, I/O controller 65, display 66, speaker 67, I/O 68, and network interface 69. The sensing data storage unit 31, determination model storage unit 35, and determination result storage unit 40 of the client can be configured by, for example, the RAM 62 or the HDD 64. The model receiving unit 34 and the determination result transmitting unit 41 can be configured by the network interface 69. The other elements 32, 33, 36, 37, and 38 can be configured, for example, as program modules executed by the CPU 61. The program modules are stored in the ROM 63 or the HDD 64; the CPU 61 reads them out, loads them into the RAM 62, and executes them, thereby realizing the operations of the corresponding logic circuits.
 なお本実施形態ではサーバとクライアントとに分かれているが、サーバの機能の一部またはすべてをクライアントで行っても良いし、またクライアントの機能の一部または全部をサーバで行っても良い。本発明においてコンピュータが処理を実行するとは、単一のコンピュータが当該処理を実行する場合と、複数のコンピュータで分散して当該処理を実行する場合とを含む。 In this embodiment, the functions are divided between the server and the client; however, some or all of the server functions may be performed by the client, and some or all of the client functions may be performed by the server. In the present invention, execution of a process by a computer includes the case where a single computer executes the process and the case where the process is executed in a distributed manner by a plurality of computers.
(クライアントの変形例)
 これまでのクライアントの説明では、判定対象データ上で一定時間毎にセグメントテンプレートを適用して尤度を計算した。しかしながら、この方法では、判定に要求される時間の上限を超えてしまう場合がある。特に遠隔監視端末のような現場に設置される情報処理機器などは、判定処理にまわせる計算資源(メモリ量やCPU等)に厳しい制約がある。異常の場合に即座に監視対象装置(機器)を停止する必要がある場合、あるいは現場の遠隔監視端末から監視センターサーバへの通信路の通信性能限界がある場合などは、一定時間毎にスキャンしてすべての結果をサーバに送信することは非現実的となる。
(Modification of the client)
In the description of the client so far, the likelihood is calculated by applying the segment template to the determination target data at regular time intervals. However, this method may exceed the upper limit of the time allowed for the determination. In particular, information processing devices installed in the field, such as remote monitoring terminals, have severe constraints on the computing resources (memory, CPU, etc.) that can be devoted to determination processing. When the monitored device (equipment) must be stopped immediately in the event of an abnormality, or when the communication path from the field remote monitoring terminal to the monitoring center server has a limited communication capacity, scanning at regular intervals and sending all the results to the server is unrealistic.
 このような場合には、あらかじめ各変量で上限閾値を決めておき、判定モデル格納部35に各変量に対する上限閾値を記憶させておく。そして、いずれか1つの変量またはすべての変量について変量の値が、上限閾値を超えたら、機器制御部37にて装置の緊急停止をさせるなどの方法を行う。図33の例では、各変量において上限閾値を超える部分を選択するようにセグメントテンプレートが当てはめられた例を示す。 In such a case, an upper threshold is determined in advance for each variable, and the upper threshold for each variable is stored in the determination model storage unit 35. Then, when the value of any one variable, or of all the variables, exceeds its upper threshold, the device control unit 37 takes an action such as an emergency stop of the device. The example of FIG. 33 shows a segment template applied so as to select, in each variable, the portion exceeding the upper threshold.
 その後、あるいはこれと並行して、上限閾値を超えた部分を含むようにセグメントテンプレートを適用して図33のようにデータを切り出し、切り出したデータに基づき異常判定部39による判定を行う。異常判定により正常と判定されたときは、緊急停止(管制運転)を機器制御部37により自動解除する。なお、上限閾値は、事前にユーザが与えておけばよい。あるいは、訓練データを分割して判定する前述した方法で、いくつかの閾値候補を選択してそれらを比較し、その比較した候補の中から疑似判定性能が最も良い閾値を採用してもよい。 Thereafter, or in parallel with this, the segment template is applied so as to include the portion exceeding the upper threshold, the data is cut out as shown in FIG. 33, and the abnormality determination unit 39 makes a determination based on the cut-out data. When the determination result is normal, the emergency stop (controlled operation) is automatically released by the device control unit 37. The upper threshold may be given in advance by the user. Alternatively, several threshold candidates may be selected and compared using the above-described method of dividing the training data for evaluation, and the threshold with the best pseudo-determination performance may be adopted from among the compared candidates.
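The flow of this modification — threshold check, emergency stop, detailed determination, and automatic release — can be sketched as follows; the callback names are hypothetical:

```python
def threshold_monitor(values, upper, stop, release, determine, mode="any"):
    """Threshold-triggered determination for resource-constrained clients.

    values: {variable: latest value}; upper: {variable: upper threshold}.
    stop / release: emergency-stop and stop-release actions on the device.
    determine: full model-based determination on segments cut out around
        the exceedance.
    mode: "any" -> one variable over its threshold triggers;
          "all" -> every variable must exceed its threshold.
    """
    over = [values[v] > upper[v] for v in upper]
    if not (any(over) if mode == "any" else all(over)):
        return "normal"            # nothing exceeded: keep monitoring
    stop()                         # emergency stop first
    result = determine()           # then the detailed determination
    if result == "normal":
        release()                  # false alarm: release the emergency stop
    return result
```

The full determination runs only after a cheap threshold test fires, which keeps the per-sample cost low on the field terminal.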
 図34は、本変形例に係るクライアントの動作の一例を示すフローチャートである。 FIG. 34 is a flowchart showing an example of the operation of the client according to this modification.
 判定対象データ入力部32がセンシングデータ格納部31にセンサデータが入力されたか否かを監視する(S501)。センサデータが入力されない場合は、監視員等から停止指示が入力されたか確認し、入力された場合は本フローを終了し、入力されない場合はS501に戻る。 The determination target data input unit 32 monitors whether sensor data is input to the sensing data storage unit 31 (S501). When sensor data is not input, it is confirmed whether a stop instruction is input from a supervisor or the like. If it is input, this flow is terminated, and if it is not input, the process returns to S501.
 S502においてセンサデータが入力された場合は、判定対象データ入力部32はセンサデータをセンシングデータ格納部31から判定対象データとして読み出し、波形前処理部33を介して異常判定部39に出力する。異常判定部39はセグメントテンプレートを移動させながら、各変量のすべてまたはいずれか1つ以上がそれぞれの上限閾値を超えたか判断する(S505)。超えていないときはステップS501に戻り、超えたときは機器制御部37を介して監視対象機器の緊急停止等を行う(S506)。 When sensor data is input in S502, the determination target data input unit 32 reads the sensor data from the sensing data storage unit 31 as determination target data, and outputs it to the abnormality determination unit 39 via the waveform preprocessing unit 33. The abnormality determination unit 39 determines whether all or any one or more of the variables exceed the respective upper thresholds while moving the segment template (S505). When it does not exceed, the process returns to step S501, and when it exceeds, emergency stop of the monitoring target device is performed via the device control unit 37 (S506).
 一方、セグメント選択部36では、上限閾値を超えたと判断されたときのセグメントテンプレートの位置における各変量のデータを切り出して異常判定部39に送り、異常判定部39は、受け取った各変量のデータと、判定モデルに基づき判定を行う(S507)。 On the other hand, the segment selection unit 36 cuts out the data of each variable at the position of the segment template at the time the upper threshold is judged to have been exceeded, and sends it to the abnormality determination unit 39; the abnormality determination unit 39 makes a determination based on the received data of each variable and the determination model (S507).
 正常と判定したときは、異常判定部39は、機器制御部37を介して緊急停止等を解除し(S509)、判定結果と、切り出したデータを含む一定時間長のデータとを判定結果格納部40に格納する(S511)。一方、異常と判定したときは、異常判定部39は、その旨を通知表示部38を介して通知または表示する(S510)。 When the result is normal, the abnormality determination unit 39 releases the emergency stop or the like via the device control unit 37 (S509), and stores the determination result, together with data of a fixed time length including the cut-out data, in the determination result storage unit 40 (S511). On the other hand, when the result is abnormal, the abnormality determination unit 39 notifies or displays that fact via the notification display unit 38 (S510).
 以上、本発明の第1実施形態によれば、蓄積された多チャンネルセンサデータとそれらに対応した異常判定結果(クラス)の組を活用し、センサ同士の依存関係を指定可能としつつ、判定精度を向上させることができる。波形データのどの部分が判定に寄与しているかの根拠を示せる形(変量毎の区間の位置関係)で判定を行うことができる。 As described above, according to the first embodiment of the present invention, by utilizing sets of accumulated multi-channel sensor data and the corresponding abnormality determination results (classes), the determination accuracy can be improved while the dependencies between sensors can be specified. The determination can be performed in a form (the positional relationship of the sections of each variable) that indicates which parts of the waveform data contribute to the determination.
第2実施形態 Second Embodiment
 多くのアプリケーションでは、ターゲットオブジェクトの状態(正常または異常)の検出の性能を改善するために、多数のセンサノードが、そのオブジェクトの様々な特徴をモニターするために広く用いられている。これらのアプリケーションは、オブジェクト・トラッキング、画像認識、車両中の衝突回避、エリアの遠隔監視、およびプラントのオペレーションの遠隔モニタリングを含む。ターゲットオブジェクトの定義は問題によって異なり、例えば、列車駅のエリアの遠隔監視では、ターゲットオブジェクトは人であり、車両中の衝突回避では、ターゲットオブジェクトは車両である。ターゲットオブジェクトは装置中のコンポーネントになることもある。また、多くのターゲットオブジェクトは遠隔監視システムに存在する場合もある。以降、用語“センサノード”は、ターゲットオブジェクト中の特徴の状態をモニターするセンサセットアップを意味するために用いられる。 In many applications, a large number of sensor nodes are widely used to monitor various features of the object in order to improve the performance of detecting the state (normal or abnormal) of the target object. These applications include object tracking, image recognition, collision avoidance in vehicles, remote monitoring of areas, and remote monitoring of plant operations. The definition of the target object varies depending on the problem. For example, in remote monitoring of a train station area, the target object is a person, and in the collision avoidance in the vehicle, the target object is a vehicle. The target object can be a component in the device. Many target objects may also exist in a remote monitoring system. Hereinafter, the term “sensor node” is used to mean a sensor setup that monitors the state of a feature in a target object.
 遠隔監視装置(例えば遠隔の聴話装置)中のコンピューティングリソースは制限されるため、ほとんどのこれらのアプリケーションでは、センサノードからのデータは遠隔監視センターのサーバへ送られる。遠隔監視センターのサーバではデータを解析して適切なアクションを講じる。しかしながら、センサノードが大量のデータを生成することにより通信帯域幅が圧迫される場合、このアプローチは実現可能ではない。 Because the computing resources in a remote monitoring device (for example, a remote listening device) are limited, in most of these applications the data from the sensor nodes is sent to a server at the remote monitoring center. The server at the remote monitoring center analyzes the data and takes appropriate action. However, this approach is not feasible when the sensor nodes generate a large amount of data and the communication bandwidth is strained.
 そこでセンサノードが異常な状態である場合に、センサノードからデータを送信するゲートウェイ(クライアント)の使用により通信オーバーヘッドを削減することが可能である。しかしながら、この場合も、雑音により、センサノードの異常な状態を決定する際に何度も、誤報(フォールスアラーム(FA))が生じる場合がある。 Therefore, communication overhead can be reduced by using a gateway (client) that transmits data from a sensor node when the sensor node is in an abnormal state. However, even in this case, noise may cause many false alarms (FA) when determining the abnormal state of a sensor node.
 センサノードにおける異常なイベントは、他のいくつかのセンサノードにおける異常が引き金となって起きることがある。この場合、他のセンサノードの異常は、ターゲットオブジェクトの異常の原因である。特に装置あるいはプラントのオペレーションが人間の安全性およびセキュリティで関連づけられる場合、異常なイベントおよびその原因の決定は非常に重要である。 An abnormal event in the sensor node may be triggered by an abnormality in some other sensor nodes. In this case, the abnormality of the other sensor node is the cause of the abnormality of the target object. The determination of abnormal events and their causes is very important, especially when equipment or plant operations are linked with human safety and security.
 ベイジアンネットワークはセンサノード中の因果関係を示すために広く用いられ、また条件付き確率テーブル(CPT)はセンサノードにおける異常なイベントの原因を推論するために用いられる。センサノードの数が少なく、センサノードにおける因果関係が知られている場合、ベイジアンネットワークは手動で作ることができる。センサノードの数が非常に大きく、センサノードにおける関係が隠されている場合、手動でベイジアンネットワークを構築することは事実上不可能である。一方ベイジアンネットワークは、データから自動的に構築されることもあるが、データからベイジアンネットワークを作るのはNP困難な問題である。したがって、多くのセンサノードを含んでいる遠隔監視システムについては、最適なベイジアンネットワークの構築、およびそのベイジアンネットワークで使用するセンサノードの異常の原因の推論は実現可能ではない。 Bayesian networks are widely used to represent causal relationships among sensor nodes, and conditional probability tables (CPTs) are used to infer the cause of an abnormal event at a sensor node. When the number of sensor nodes is small and the causal relationships among them are known, a Bayesian network can be constructed manually. When the number of sensor nodes is very large and the relationships among them are hidden, manually constructing a Bayesian network is virtually impossible. A Bayesian network can instead be constructed automatically from data, but learning a Bayesian network from data is an NP-hard problem. Therefore, for a remote monitoring system containing many sensor nodes, constructing an optimal Bayesian network and inferring the cause of sensor node abnormalities with it is not feasible.
 センサネットワークが多くのセンサノードからなり、ターゲットオブジェクトが多くのセンサによってモニターされることがある。そのような大きなネットワークでは、すべてのセンサノードがターゲットオブジェクトの異常なイベントを発見するのに必要だとは限らない。不必要なセンサノードの除去は、監視システムのコストの縮小につながる。 A sensor network may consist of many sensor nodes, and a target object may be monitored by many sensors. In such a large network, not every sensor node is necessarily needed to detect abnormal events of the target object. Removing unnecessary sensor nodes leads to a reduction in the cost of the monitoring system.
 いくつかのアプリケーションではさらに、高価なセンサがオブジェクトの状態をモニターするために用いられることがある。高価なセンサの置換を実行できる1セットのセンサをセンサネットワークから決定することは、センサネットワークのコストを縮小するのに非常に役立つ。そのようなアプリケーションでは、ターゲットオブジェクトは高価なセンサノードである。 In some applications, furthermore, expensive sensors may be used to monitor the state of an object. Determining a set of sensors from the sensor network that can substitute for an expensive sensor is very helpful in reducing the cost of the sensor network. In such applications, the target object is the expensive sensor node.
 本実施形態は、ターゲットオブジェクトの異常イベントを効率的かつ高信頼性で検出し、多くのセンサノードの中から異常の原因(原因となるセンサノード)を識別し、ゲートウェイから遠隔監視センターにおけるサーバへのデータの送信の通信オーバーヘッドを削減し、ターゲットオブジェクトに対する不必要なセンサノードの識別を可能にするものである。 The present embodiment detects abnormal events of a target object efficiently and reliably, identifies the cause of an abnormality (the causing sensor node) from among many sensor nodes, reduces the communication overhead of data transmission from the gateway to the server in the remote monitoring center, and enables identification of sensor nodes unnecessary for the target object.
 以下、図面を参照しながら、本実施形態について説明する。 Hereinafter, the present embodiment will be described with reference to the drawings.
 図35は本実施形態に係る異常判定システムの構成を示す。このシステムは、監視現場またはプラントの中でのターゲットオブジェクトの異常状態(あるいは異常イベント)の検出、および異常状態の原因(以降、原因は判定根拠と呼ばれることもある)を識別する。 FIG. 35 shows the configuration of the abnormality determination system according to this embodiment. This system detects the abnormal state (or abnormal event) of the target object in the monitoring site or plant, and identifies the cause of the abnormal state (hereinafter, the cause may also be referred to as a judgment basis).
 このシステムは、監視現場のゲートウェイ(クライアント)100および遠隔監視センターのサーバ200から成る。 This system comprises a gateway (client) 100 at the monitoring site and a server 200 at the remote monitoring center.
 ゲートウェイ100は、単チャネル異常判定部102、決定フュージョンルールにより総合異常判定および根拠特定を行う総合判定部103、データフィルタリング部104、異常判定モデルデータベース105、決定フュージョンルールデータベース106および受信部107を備える。 The gateway 100 includes a single-channel abnormality determination unit 102, a comprehensive determination unit 103 that performs overall abnormality determination and basis identification using decision fusion rules, a data filtering unit 104, an abnormality determination model database 105, a decision fusion rule database 106, and a receiving unit 107.
 遠隔監視センターのサーバ200は、ゲートウェイ100からデータを受信する受信部205、センサデータを保持するセンサデータ・データベース201、判定結果と判定根拠のデータベース204、単チャネル異常判別モデル学習部203、決定フュージョンルール学習部202、および送信部206を備える。 The server 200 of the remote monitoring center includes a receiving unit 205 that receives data from the gateway 100, a sensor data database 201 that holds sensor data, a database 204 of determination results and determination bases, a single-channel abnormality determination model learning unit 203, a decision fusion rule learning unit 202, and a transmission unit 206.
 図36は、センサデータ・データベース201の一例を示す。 FIG. 36 shows an example of the sensor data database 201.
 データベース201は、各センサノードから観測されたセンサデータと、各センサデータが正常か異常かを示す状態ラベル(第1ラベル)と、ターゲットオブジェクトが正常か異常かを示す判定ラベル(第2ラベル)とからなる複数の訓練データを様々なタイムスタンプで格納している。判定ラベルはクラスラベルと称されることもある。 The database 201 stores, with various time stamps, a plurality of training data each consisting of the sensor data observed from each sensor node, a state label (first label) indicating whether each sensor data is normal or abnormal, and a determination label (second label) indicating whether the target object is normal or abnormal. The determination label may also be referred to as a class label.
 ターゲットオブジェクトの判定ラベルは監視現場の保守員または係員等が、タイムスタンプに示される時刻における実際のターゲットオブジェクトの状態を確認して付与したものである。各センサノードの状態ラベルは、センサノード毎に用意された基準(モデル、分類器)に従って決定されたものである。たとえば当該基準が閾値の場合、センサデータの値が、閾値を超えていれば異常、超えていなければ正常と状態ラベルが決定される。状態ラベルの決定は、装置により自動的に決定され付与されたものでもよいし、保守員または係員が付与してもよい。 The determination label of the target object is given by maintenance personnel or attendants at the monitoring site after confirming the actual state of the target object at the time indicated by the time stamp. The state label of each sensor node is determined according to a criterion (model, classifier) prepared for each sensor node. For example, when the criterion is a threshold, the state label is determined to be abnormal if the sensor data value exceeds the threshold, and normal otherwise. The state label may be determined and given automatically by the apparatus, or given by maintenance personnel or attendants.
 センサノードとしては、音センサ、振動センサおよび温度センサといった様々なタイプのセンサが用いられることができる。音センサおよび振動センサからの出力は、波形データである。温度センサからの出力は時間軸に沿って集積された値である。 As the sensor node, various types of sensors such as a sound sensor, a vibration sensor, and a temperature sensor can be used. Outputs from the sound sensor and the vibration sensor are waveform data. The output from the temperature sensor is a value integrated along the time axis.
 サーバ200における単チャネル異常判定モデル学習部203は、センサデータ・データベース201における各センサノードについてそれぞれのセンシングデータを異常か正常かに分類する単チャネル異常判定モデル(分類器)を学習する。単チャネル異常判定モデル学習部203はセンサノード毎に生成した単チャネル異常判定モデルを、送信部206を介してゲートウェイ100に送信する。ゲートウェイ100の受信部107はセンサノード毎の単チャネル異常判定モデル(分類器)を受信して異常判定モデルデータベース105に格納する。 The single-channel abnormality determination model learning unit 203 in the server 200 learns a single-channel abnormality determination model (classifier) that classifies each sensing data for each sensor node in the sensor data database 201 as abnormal or normal. The single channel abnormality determination model learning unit 203 transmits the single channel abnormality determination model generated for each sensor node to the gateway 100 via the transmission unit 206. The receiving unit 107 of the gateway 100 receives a single channel abnormality determination model (classifier) for each sensor node and stores it in the abnormality determination model database 105.
 単チャネル異常判定モデル学習部203は、単チャネル異常判定モデル(分類器)を学習するのに、最初、センサデータ・データベース201から様々なタイムスタンプで記録されたデータおよび状態ラベルをセンサノードごとに抽出する。ある1つのチャネル(センサノード)について抽出したデータの一例を図37に示す。単チャネル異常判定モデル学習部203ではデータのタイプ(例えば波形データであるか否か)に応じて、異なるタイプの分類器を学習する。 To learn a single-channel abnormality determination model (classifier), the single-channel abnormality determination model learning unit 203 first extracts, for each sensor node, the data and state labels recorded at various time stamps from the sensor data database 201. FIG. 37 shows an example of the data extracted for one channel (sensor node). The single-channel abnormality determination model learning unit 203 learns a different type of classifier depending on the type of data (for example, whether or not the data is waveform data).
 すなわち、センサノードからの出力が時間軸に沿って集められた値のような単一の値である場合は、あらかじめ定められた閾値が、異常か正常かにデータを分類するための分類器として用いられる。適切な閾値の決定は非常に困難である。閾値が非常に高い値にセットされれば、多くの偽陰性が予測され、低値にセットされる場合、多くの偽陽性が予測される。ここで、抽出したデータおよび状態ラベルから最適な閾値を決定する方法を示す。 That is, when the output from a sensor node is a single value, such as a value accumulated along the time axis, a predetermined threshold is used as the classifier for classifying the data as abnormal or normal. Determining an appropriate threshold is very difficult: if the threshold is set to a very high value, many false negatives are expected, and if it is set to a low value, many false positives are expected. A method for determining the optimum threshold from the extracted data and state labels is described below.
 この方式は、C4.5で訓練データから最適な分割のための特徴を選択する手法に基づく。C4.5はQuinlanによる"C4.5: Programs for Machine Learning"[Morgan Kaufmann Publishers, 1993]に開示されている。 This method is based on the technique used in C4.5 for selecting the feature that best splits the training data. C4.5 is disclosed by Quinlan in "C4.5: Programs for Machine Learning" [Morgan Kaufmann Publishers, 1993].
 この方法では最初に、訓練データ(抽出したデータ)の値をソートする。たとえば図38の左のデータをソートすると、図38の中間のようになる。次に、状態ラベルが異なる値区間の中央値(区切り点)を計算する。例えば同図ではID9,ID8の状態が異なるため、ID9,ID8のデータの中央値を計算すると「2.1」となる。このように計算された中央値は候補閾値となる。各候補閾値は、正確度、F-スコア、幾何平均あるいはAUCBのような指標で評価され、最良のスコア(フィットネス)を返す中央値(候補閾値)が、最適な閾値として選択される。 In this method, the values of the training data (the extracted data) are first sorted. For example, sorting the data on the left of FIG. 38 yields the middle of FIG. 38. Next, the median (breakpoint) of each value interval where the state labels differ is calculated. For example, since the states of ID9 and ID8 differ in the figure, the median of the data of ID9 and ID8 is calculated as 2.1. The medians calculated in this way become the candidate thresholds. Each candidate threshold is evaluated by an index such as accuracy, F-score, geometric mean, or AUCB, and the median (candidate threshold) returning the best score (fitness) is selected as the optimum threshold. AUCB is disclosed by Paul et al. in "Genetic algorithm based methods for identification of health risk factors aimed at preventing metabolic syndrome" [SEAL '08: Proceedings of the 7th International Conference on Simulated Evolution and Learning, pages 210-219, Berlin, Heidelberg, 2008. Springer-Verlag].
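This breakpoint search can be sketched as follows, using accuracy as the evaluation index; the function names are illustrative:

```python
def best_threshold(values, labels, score):
    """Select the optimum threshold from training data.

    values: sensor readings; labels: "normal"/"abnormal" per reading.
    Candidate thresholds are the midpoints between adjacent sorted values
    whose state labels differ; the candidate with the best score wins.
    """
    pairs = sorted(zip(values, labels))
    candidates = [(pairs[i][0] + pairs[i + 1][0]) / 2
                  for i in range(len(pairs) - 1)
                  if pairs[i][1] != pairs[i + 1][1]]
    return max(candidates, key=lambda t: score(t, values, labels))

def accuracy(threshold, values, labels):
    # classify a reading as "abnormal" when it exceeds the threshold
    pred = ["abnormal" if v > threshold else "normal" for v in values]
    return sum(p == l for p, l in zip(pred, labels)) / len(labels)
```

The `score` callback can be swapped for F-score, geometric mean, or AUCB without changing the candidate-generation step.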
 正確度、F-スコア、幾何平均、感度(sensitivity)、特異度(specificity)、適合率(precision)、再現率(recall)の定義を以下に示す。
Figure JPOXMLDOC01-appb-M000010
The definitions of accuracy, F-score, geometric mean, sensitivity, specificity, precision, and recall are shown below.
Figure JPOXMLDOC01-appb-M000010
 ここで、NTPは真陽性の数、NTNは真陰性の数、NFPは偽陽性の数、NFNは偽陰性の数である。 Here, NTP is the number of true positives, NTN the number of true negatives, NFP the number of false positives, and NFN the number of false negatives.
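These indices can be computed from the four counts as in the following sketch (standard textbook definitions; the function name is illustrative):

```python
import math

def indices(n_tp, n_tn, n_fp, n_fn):
    """Evaluation indices from true/false positive/negative counts."""
    sensitivity = n_tp / (n_tp + n_fn)            # also called recall
    specificity = n_tn / (n_tn + n_fp)
    precision = n_tp / (n_tp + n_fp)
    return {
        "accuracy": (n_tp + n_tn) / (n_tp + n_tn + n_fp + n_fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": 2 * precision * sensitivity / (precision + sensitivity),
        "geometric_mean": math.sqrt(sensitivity * specificity),
    }
```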
 In the example shown, four candidate thresholds are computed; candidate thresholds 3.0 and 3.6 share the highest score, so one of them is selected as the optimal threshold. The selection may be random or user-specified.
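A minimal sketch of the threshold-learning procedure described above (sort, take midpoints at label changes, score each candidate). The default score here assumes that abnormal readings lie above the threshold and uses accuracy; both the direction and the index are assumptions, since the embodiment allows any of the indices listed above:

```python
def learn_threshold(values, labels, score=None):
    """Learn an optimal threshold from training data (a sketch).
    values: sensor readings; labels: "normal"/"abnormal" state labels."""
    # Sort the training data by value, carrying the state labels along.
    pairs = sorted(zip(values, labels))
    # Midpoints between adjacent values with differing labels are candidates.
    candidates = [(pairs[i][0] + pairs[i + 1][0]) / 2
                  for i in range(len(pairs) - 1)
                  if pairs[i][1] != pairs[i + 1][1]]
    if score is None:
        # Assumed default index: accuracy of "abnormal if value >= threshold".
        def score(th):
            correct = sum(1 for v, s in pairs
                          if (s == "abnormal") == (v >= th))
            return correct / len(pairs)
    # Return the candidate threshold with the best score (fitness).
    return max(candidates, key=score)
```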
 On the other hand, when the output of a sensor node is waveform data, special consideration is needed when constructing the classifier (abnormality determination model). Typically, the waveform data is first processed with a signal processing technique such as a moving average, the discrete wavelet transform (DWT), or the short-time Fourier transform (STFT), and the classifier (abnormality determination model) is learned in a subsequent step.
 The simplest way to learn a classifier (abnormality determination model) for waveform data is the threshold method: the highest peak of the waveform measured at each time stamp is taken, and the optimal threshold is learned with the method described above. Another possible technique is to extract a number of feature values from the waveform data, such as the maximum and minimum amplitudes, the mean and standard deviation, and the area under the waveform, and to learn the classifier (abnormality determination model) from the extracted feature values.
 One example of a classifier that can be used to classify waveform data is the k-nearest-neighbor (kNN) classifier, using as its distance measure dynamic time warping (DTW), which can cope with variable-length partial waveforms. The kNN classifier is disclosed by Dasarathy in "Nearest Neighbor (NN) Norms: NN Pattern Classification Techniques" [IEEE Computer Society Press, 1991], and DTW is disclosed by Myers and Rabiner in "A comparative study of several dynamic time-warping algorithms for connected word recognition" [The Bell System Technical Journal, 60(7):1389-1409, September 1981].
 However, DTW becomes very slow when the database and/or the number of observation points is very large. In that case, a faster distance measure such as cross-correlation, the Euclidean distance, a t-statistic, or a signal-to-noise-ratio (SNR) based function may be used instead of DTW to compute the distance between two waveforms.
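For reference, the classic DTW distance named above can be sketched with a simple dynamic-programming table; this is a textbook formulation offered only as an illustration, not the embodiment's implementation:

```python
def dtw_distance(a, b):
    """Dynamic-time-warping distance between two sequences (a sketch)."""
    inf = float("inf")
    # dp[i][j]: minimum cost of aligning a[:i] with b[:j].
    dp = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    dp[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            dp[i][j] = cost + min(dp[i - 1][j],      # insertion
                                  dp[i][j - 1],      # deletion
                                  dp[i - 1][j - 1])  # match
    return dp[len(a)][len(b)]
```

The quadratic table is what makes DTW slow on long waveforms, which is why the faster measures above may be substituted.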
 Here, not every region of a training waveform is important for detecting an abnormality in a test waveform; only the characteristic regions of the training waveforms are needed for abnormality detection. Furthermore, when a waveform contains many data points, deciding whether a test waveform is abnormal takes a very long execution time, whereas using only the characteristic partial waveforms may allow abnormalities to be detected more accurately and more quickly.
 Given a set of training waveforms, the characteristic regions can be extracted by using an optimization algorithm such as a genetic algorithm (GA). Genetic algorithms are disclosed by Holland in "Adaptation in Natural and Artificial Systems" [University of Michigan Press, Ann Arbor, Michigan, 1975] and by Goldberg in "Genetic Algorithms in Search, Optimization, and Machine Learning" [Addison-Wesley, Reading, MA, 1989].
 FIG. 39 is a flowchart showing the general processing flow of a genetic algorithm.
 First, an encoding scheme that maps between the solution space and the search space is determined (S1001).
 Once the encoding scheme has been determined, the values of various control parameters, such as the population size, the offspring size, the maximum number of generations, and the crossover and mutation probabilities, are initialized (S1002).
 Next, initial candidate solutions are generated at random (S1003). The set of generated initial candidate solutions constitutes the initial population.
 Each candidate solution is then evaluated and its fitness computed (S1004).
 Next, it is checked whether a termination criterion has been met, for example that the maximum number of generations has been reached or that the best candidate solution in the population has reached the optimal fitness (S1005).
 If the termination criterion is not met (NO in S1005), some candidate solutions are selected from the previous-generation population to generate new candidate solutions (offspring) (S1006). The selection is made according to a predetermined criterion based on the score (fitness) of each candidate solution, for example taking a predetermined number of the fittest candidate solutions, or all candidate solutions whose fitness is at or above a predetermined value.
 New candidate solutions (offspring) are generated by applying crossover and mutation operators to the selected candidate solutions (S1007). The new candidate solutions (offspring) are then evaluated in the same way as in step S1004 and their fitness computed (S1008).
 Next, a new set of candidate solutions (a new population) is generated by combining candidate solutions selected from the previous-generation population with the newly generated candidate solutions (S1009). The candidate solutions are selected according to a predetermined criterion, for example a predetermined number of the fittest candidate solutions, or those whose fitness is at or above a predetermined value. The same candidate solutions selected in step S1006 may be selected.
 When the termination criterion is met (YES in S1005), the best candidate solution in the current population (for example, the candidate solution with the highest fitness) is obtained as the best solution (S1010).
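The flow of steps S1003 through S1010 can be sketched as a generic GA loop. This is an illustration only; all operator functions (`init`, `evaluate`, `select`, `crossover`, `mutate`) and the parameter defaults are caller-supplied assumptions, not values fixed by the embodiment:

```python
import random

def genetic_algorithm(init, evaluate, select, crossover, mutate,
                      pop_size=30, generations=20, mutation_rate=0.1):
    """Generic GA loop following steps S1003-S1010 (a sketch).
    select() must return at least two parents from the scored population."""
    population = [init() for _ in range(pop_size)]            # S1003
    for _ in range(generations):                              # S1005 loop
        scored = [(evaluate(c), c) for c in population]       # S1004/S1008
        parents = select(scored)                              # S1006
        offspring = []
        while len(offspring) < pop_size - len(parents):       # S1007
            a, b = random.sample(parents, 2)
            child = crossover(a, b)
            if random.random() < mutation_rate:
                child = mutate(child)
            offspring.append(child)
        population = parents + offspring                      # S1009
    return max(population, key=evaluate)                      # S1010
```

Because the selected parents are carried into the new population (S1009), the best fitness in the population never decreases between generations.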
 FIG. 40 shows a concrete example of optimal segmentation of waveforms using a genetic algorithm (GA). The following description focuses on the waveform data of a single sensor node.
 First, many candidate solutions (here, 50) are created by extracting partial waveforms arbitrarily (at random) from a plurality of waveforms with different time stamps (S1011). For each candidate solution, exactly one partial waveform is taken from each waveform. Here there are four waveforms 1 to 4, corresponding to four time stamps, and each candidate solution contains partial waveforms 1 to 4 cut out of waveforms 1 to 4, respectively. The width of the extracted partial waveforms may or may not be constant.
 Next, each candidate solution is evaluated with the k-nearest-neighbor method (S1012). In the k-nearest-neighbor method, for example, the partial waveforms contained in the candidate solution are classified, classification statistics such as the number of true positives (N_TP), true negatives (N_TN), false positives (N_FP), and false negatives (N_FN) are computed, and the fitness of the candidate solution is computed from those statistics. For the fitness, various indices such as the accuracy described above can be used. An example of the processing performed in step S1012 is given below; it is only an example, and the present invention is not limited to it.
 For example, consider computing the fitness of the candidate solution with ID 1 in FIG. 40. First, one of the partial waveforms 1 to 4 contained in candidate 1 (here, partial waveform 4) is set aside. The top k partial waveforms closest in distance to partial waveform 4 are selected from the remaining partial waveforms; here k = 3, so all of the remaining ones are selected. The state (normal or abnormal) of each selected partial waveform is identified, the total numbers of normal and abnormal ones are counted, and the majority state is chosen. The actual state (determination label) of partial waveform 4 is then compared with the chosen state: if they match, the prediction is correct; otherwise it is incorrect. Partial waveforms 1 to 3 are set aside and compared in turn in the same way, identifying correct and incorrect predictions. The ratio of the number of correct predictions to the number of comparisons is computed and taken as the fitness of candidate 1.
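The leave-one-out kNN fitness computation just described can be sketched as follows; the function name, the pluggable `distance` argument, and the label strings are assumptions:

```python
def loo_knn_fitness(waveforms, labels, distance, k=3):
    """Leave-one-out kNN fitness for one candidate solution (a sketch).
    waveforms: the candidate's partial waveforms;
    labels: their state labels ("normal"/"abnormal")."""
    correct = 0
    for i, w in enumerate(waveforms):
        # Set aside partial waveform i and rank the rest by distance.
        rest = [(distance(w, v), labels[j])
                for j, v in enumerate(waveforms) if j != i]
        rest.sort(key=lambda t: t[0])
        neighbors = [lab for _, lab in rest[:k]]
        # Majority vote among the k nearest neighbors.
        predicted = max(set(neighbors), key=neighbors.count)
        if predicted == labels[i]:
            correct += 1
    # Ratio of correct predictions to comparisons = the candidate's fitness.
    return correct / len(waveforms)
```

Any of the distance measures discussed earlier (DTW, Euclidean distance, etc.) can be passed as `distance`.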
 It is checked whether the termination criterion described with FIG. 39 is satisfied (S1013). If it is not (NO), candidate solutions satisfying a predetermined criterion are selected based on fitness (S1014), and a new set of candidate solutions is generated from the selected ones by crossover (S1015) and mutation (S1016) operations. In crossover, a partial waveform of one candidate solution is exchanged with the corresponding partial waveform of another candidate solution. In mutation, one partial waveform is replaced with another taken from the same waveform.
 Each new candidate solution (offspring) is evaluated and its fitness computed (S1017), and candidate solutions selected from the old set are combined with the new candidate solutions (offspring) to obtain a new set of candidate solutions (a new population) (S1018). The candidate solutions selected from the old set are chosen according to a predetermined fitness-based criterion (for example, a predetermined number of the fittest candidate solutions, or those whose fitness is at or above a predetermined value); the candidate solutions selected in S1014 may be used.
 When the termination criterion is satisfied (YES in S1013), the candidate solution (set of partial waveforms) with the best fitness is obtained as the optimized set of training partial waveforms (S1019). The obtained set corresponds to a single-channel abnormality determination model; the obtained set together with the k-nearest-neighbor algorithm may also be treated as the single-channel abnormality determination model.
 The decision fusion rule learning unit 202 learns decision fusion rules (or classification rules) for detecting abnormalities of the target object, and transmits the generated decision fusion rules to the gateway 100 via the transmission unit 206. The receiving unit 107 of the gateway 100 receives the decision fusion rules and stores them in the decision fusion rule database 106.
 A decision fusion rule detects an abnormality of the target object and identifies the grounds (sensor nodes) for the abnormality of the target object. Sensor nodes not included in any decision fusion rule can be identified as sensor nodes unnecessary for detecting abnormalities of the target object. The decision fusion rule learning unit 202 stores the learned decision fusion rules in an internal database and, as described above, transmits them to the gateway 100 for storage in the decision fusion rule database 106.
 To learn the decision fusion rules, the decision fusion rule learning unit 202 extracts data as shown in FIG. 41 from a sensor database such as that of FIG. 36. The extracted data comprise each time stamp (ID), the state label (normal or abnormal) of each sensor node, and the determination label (normal or abnormal) of the target object. A classification rule learning approach is applied to the extracted data to generate the decision fusion rules.
 That is, using the determination label of the target object as the class label and the state labels of the sensor nodes as feature values, a classification rule that can accurately predict the state (normal or abnormal) of the target object is learned. In other words, a combination of feature selection and classification is used to learn the decision fusion rules. The resulting classification rule consists of one or more decision fusion rules. For example, consider the following classification rule:
 IF (N4 = abnormal AND N8 = abnormal AND N19 = abnormal) THEN (target object = abnormal) ELSE (target object = normal).
 This classification rule consists of a single decision fusion rule and is interpreted as follows: when the data of sensor nodes N4, N8, and N19 are abnormal, the target object is then in an abnormal state. In a causal context, the cause of the abnormality of the target object is that the data of sensor nodes N4, N8, and N19 are abnormal. Another interpretation is that only the three sensor nodes N4, N8, and N19 are needed to detect an abnormality of the target object.
 Hereinafter, the name of a sensor node alone (that is, the variable representing the sensor node) may be used to mean that the sensing data of that sensor node is in an abnormal state.
 The following notation is also used to express decision fusion rules:

  (Na, Nb, …, Nk) ⇒ target object

 This notation means that when sensor nodes Na, Nb, …, Nk are all abnormal, the target object is in an abnormal state.
 Another example of a classification rule is:
 IF (((N4 AND N8) OR N10) AND (N19 OR N25)) THEN (target object = abnormal) ELSE (target object = normal)
 The classification rule in this case contains the following four decision fusion rules:
   (a) (N4, N8, N19) ⇒ target object;
   (b) (N4, N8, N25) ⇒ target object;
   (c) (N10, N19) ⇒ target object;
   (d) (N10, N25) ⇒ target object.
 Classification rules (decision fusion rules) can take various formats. FIG. 42 shows an example of the AND format, and FIG. 43 an example of the rule format.
 The first row of the AND format in FIG. 42 means that when the data of sensor nodes N4, N8, and N19 are all abnormal, target object A is then in an abnormal state. Likewise, the second row means that target object A is in an abnormal state when the data of N4, N8, and N25 are all abnormal; the third row, when the data of N10 and N19 are all abnormal; and the fourth row, when the data of N10 and N25 are all abnormal.
 The rule format in the first row of FIG. 43 means that target object A is abnormal when, for at least one of (N4, N8, N19), (N4, N8, N25), (N10, N19), and (N10, N25), the data of all sensor nodes in that group are abnormal.
 Evaluation with the AND format rules of FIG. 42 is easier and faster than evaluation with a classification rule containing many decision fusion rules as in FIG. 43. Furthermore, with the AND format rules of FIG. 42, all sensor nodes in a row are obtained directly as the grounds for the determination, so identifying the grounds is easier.
 The rule format of FIG. 43, on the other hand, is a compact representation of many decision fusion rules, but extra parsing is required to identify the grounds for a determination. Using algebraic transformation, the classification rule can be transformed into a sum-of-products (SOP) format of many decision fusion rules, as shown in FIG. 44. Note that when a sensor node appears in many decision fusion rules, indexing the sensor nodes can reduce the confirmation cost.
 By scanning all of the learned decision fusion rules, the decision fusion rule learning unit 202 can identify sensor nodes included in no decision fusion rule as sensor nodes unnecessary for detecting abnormalities of the target object.
 For example, in the case of the classification rule consisting of the four decision fusion rules (a), (b), (c), and (d) above, it can be identified that sensor nodes N4, N8, N10, N19, and N25 are necessary for detecting abnormalities of the target object and that the remaining sensor nodes are unnecessary.
 One purpose of the decision fusion rule learning unit 202 is to find the combinations of sensor nodes needed to predict an abnormal state of the target object.
 For n sensor nodes there are 2^n combinations of sensor nodes. When n is small, all combinations can be searched exhaustively, and the combination with the greatest support (evaluation value) in the training data can be found as the best one. Instead of the single best combination, all combinations whose support exceeds a threshold may be found. When n is very large, however, an exhaustive search of all combinations is infeasible. In that situation, various heuristic search algorithms such as a genetic algorithm (GA) or genetic programming (GP) can be used.
 When a genetic algorithm is used, one decision fusion rule can be constructed per run of the algorithm; obtaining many decision fusion rules therefore requires many GA runs.
 A method of constructing classification rules (decision fusion rules) by applying a GA is described below with reference to FIGS. 45 to 48.
 FIG. 46 shows an example of the processing flow for constructing a classification rule with a genetic algorithm.
 First, an encoding scheme that maps between the solution space and the search space is determined (S1101). In the GA context, the term "classification rule" is used to mean a decision fusion rule.
 FIG. 45 shows an example of an encoding for the problem to be solved by the GA: a binary string of 0s and 1s. When there are n sensor nodes, the length of each string (each candidate solution) is n; that is, this encoding maps whether each of the sensor nodes is selected to a binary value. A 0 means that the corresponding sensor node does not affect the target object. A 1 means that when the target object is in an abnormal state, the sensor node corresponding to that 1 must also be in an abnormal state. When the GA is used with this encoding, only one decision fusion rule is derived per GA run, each derived rule being an AND operation over the sensor-node states indicated by the string (0 or 1). The problem is thus reduced to a feature selection problem.
 Next, the values of the various control parameters are determined in the same way as in S1002 of FIG. 39 (S1102). Initial candidate solutions (initial candidate classification rules) are then generated at random (S1103). In the illustrated example the number of sensor nodes n is 5, so the string length is 5; ten candidates are generated in this example.
 Next, each candidate solution is evaluated and its fitness computed (S1104). FIG. 48 shows an example of the evaluation of candidate solutions: based on the candidate solution and the extracted data as in FIG. 41, a determination is made for each data row with the candidate solution, yielding a normal or abnormal determination per row. The proportion of rows for which the determination matches the actual state of the target object is then computed as the fitness. In the illustrated example, the determination matches the actual state for 8 of the 10 data rows and differs for the remaining 2, so the fitness (accuracy) is 8/10 = 0.8.
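The step-S1104 evaluation of one binary-string candidate can be sketched as follows; the row layout (a mapping of node names to state labels plus the target's determination label) is an assumed representation of the extracted data of FIG. 41:

```python
def rule_fitness(bitstring, rows):
    """Fitness of one binary-string candidate (a sketch).
    rows: (node_states, label) pairs, where node_states maps "N1", "N2", ...
    to "normal"/"abnormal" and label is the target's determination label."""
    # Sensor nodes whose bit is 1 are the ones selected by the rule.
    selected = [f"N{i + 1}" for i, b in enumerate(bitstring) if b == 1]
    correct = 0
    for node_states, label in rows:
        # The rule fires only if every selected node is abnormal.
        predicted = ("abnormal"
                     if all(node_states[n] == "abnormal" for n in selected)
                     else "normal")
        if predicted == label:
            correct += 1
    # Proportion of rows whose determination matches the actual state.
    return correct / len(rows)
```

For instance, the string "10011" of FIG. 47 corresponds to `bitstring = [1, 0, 0, 1, 1]`, i.e. the rule over N1, N4, and N5.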
 Next, it is checked whether the termination condition is satisfied (S1105). If it is not (NO), candidate solutions are selected from the current population according to a predetermined fitness-based criterion (S1106), for example a predetermined number of the fittest candidate solutions, or those whose fitness is at or above a predetermined value.
 Next, based on the selected candidate solutions, crossover and mutation operations are applied to generate offspring (new candidate classification rules). FIG. 47 shows an example of generating offspring with crossover and mutation in the genetic algorithm (GA). In FIG. 47, for example, the candidate solution "10011" (candidate classification rule 1) is interpreted as follows:
 IF (N1 = abnormal AND N4 = abnormal AND N5 = abnormal) THEN target object = abnormal ELSE target object = normal.
 The generated offspring are then evaluated in the same way as in step S1104 and their fitness computed (S1107).
 Next, candidate solutions are selected from the previous-generation population according to a predetermined fitness-based criterion, and a new population is generated by combining the selected candidate solutions with the generated offspring (S1108). The predetermined criterion may be, for example, to take a predetermined number of the fittest candidate solutions, or those whose fitness is at or above a predetermined value.
 When the termination condition is satisfied (YES in S1105), the candidate solution (candidate classification rule) with the highest fitness in the current population is obtained as the best classification rule (S1109).
 Thus each run of the GA generates one decision fusion rule from the best classification rule.
 On the other hand, to construct a classification rule containing many decision fusion rules as shown in FIG. 43, various machine learning classification methods and feature selection methods can be used. One such method of constructing such classification rules is genetic programming (GP), disclosed by Koza in "Genetic Programming: On the Programming of Computers by Means of Natural Selection" [MIT Press, 1992]. Another example of a classification-rule construction method is C4.5, disclosed by Quinlan in "C4.5: Programs for Machine Learning" [Morgan Kaufmann Publishers, 1993].
 Because genetic programming (GP) can derive a variety of tree structures, and because many trees are explored over many generations during a run, GP derives more suitable classification rules than C4.5.
 GP (genetic programming) uses a tree-based encoding. The tree can be either a symbolic expression (S-expression) as shown in FIG. 49 or a decision tree as shown in FIG. 50. For compactness, encoding based on the S-expression is preferred.
 The tree structure of the S-expression in FIG. 49 corresponds to (((N4 AND N8) OR N10) AND (N19 OR N25)) described above, and is interpreted as follows:
 IF (((N4 AND N8) OR N10) AND (N19 OR N25)) THEN (target object = abnormal) ELSE (target object = normal)
 As another example, the S-expression (N1 XOR N3) AND N5 is interpreted as follows:
 IF (N1 = abnormal XOR N3 = abnormal) AND (N5 = abnormal) THEN target object = abnormal ELSE target object = normal.
 In the decision tree of FIG. 50, for example, "N4 -> (False) -> N10 -> (True) -> N19 -> (True) -> True answer" means that if sensor node N4 is false (normal), sensor node N10 is true (abnormal), and sensor node N19 is true (abnormal), then the target object is true (abnormal).
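The interpretation of such S-expression rules can be sketched in code. This is a minimal illustration, not part of the patent: rules are encoded as nested tuples, and a sensor state of True means "abnormal".

```python
# Minimal sketch (not from the patent) of interpreting an S-expression
# classification rule. True means "abnormal", False means "normal".

def evaluate(rule, states):
    """Recursively evaluate a rule tree against per-sensor states."""
    if isinstance(rule, str):           # terminal node: sensor node name
        return states[rule]
    op, *args = rule                    # non-terminal node: logical operator
    vals = [evaluate(a, states) for a in args]
    if op == "AND":
        return all(vals)
    if op == "OR":
        return any(vals)
    if op == "XOR":
        return vals[0] != vals[1]
    if op == "NOT":
        return not vals[0]
    raise ValueError("unknown operator: %s" % op)

# (((N4 AND N8) OR N10) AND (N19 OR N25)) from FIG. 49:
rule = ("AND", ("OR", ("AND", "N4", "N8"), "N10"), ("OR", "N19", "N25"))
states = {"N4": False, "N8": False, "N10": True, "N19": True, "N25": False}
verdict = "abnormal" if evaluate(rule, states) else "normal"
```

With the illustrative states shown, N10 and N19 are abnormal, so the rule fires and the verdict is "abnormal".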
 FIG. 51 shows an example of the processing flow for constructing a classification rule by genetic programming (GP). Here, the case where the S-expression is used as the tree structure is shown. In the figure, AND, OR, NOT, and XOR are logical operators.
 First, an encoding method that maps between the solution space and the search space is determined (S1201). In the genetic algorithm (GA), a genotype is represented by an array (see FIG. 45), whereas in genetic programming (GP) it is represented by a tree structure.
 In the case of the S-expression, a sensor node name (variable) is assigned to each terminal node of the tree, and a logical operator is assigned to each non-terminal node. That is, an encoding method is used that maps a variable representing a sensor node selected from the plurality of sensor nodes to each terminal node of the tree structure, and a logical operation symbol selected from a plurality of logical operation symbols to each non-terminal node.
 In the case of a decision tree, on the other hand, a sensor node name (variable) is assigned to each non-terminal node; a value indicating true (the target object is abnormal) or false (the target object is normal) is assigned to each terminal node; and each branch is assigned a value indicating that the variable immediately above it is true (the variable's value is abnormal) or false (the variable's value is normal).
 Next, the values of various control parameters are determined in the same manner as in step S1002 of FIG. 39 (S1202).
 Next, initial candidate solutions (initial candidate classification rules) are generated randomly (S1203). The size of the tree structure is also determined randomly within constraints for each candidate solution, and variables and logical operators are randomly assigned to the nodes of the tree.
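Step S1203 can be sketched as follows. The operator set, the sensor count, and the 30% chance of stopping at a terminal early are illustrative assumptions, not values from the patent:

```python
import random

# Assumed illustrative primitives: four logical operators and 25 sensors.
OPERATORS = ["AND", "OR", "NOT", "XOR"]
SENSORS = ["N%d" % i for i in range(1, 26)]

def random_tree(max_depth):
    """Grow a random S-expression tree within a depth constraint (S1203)."""
    if max_depth <= 1 or random.random() < 0.3:
        return random.choice(SENSORS)          # terminal: sensor variable
    op = random.choice(OPERATORS)
    arity = 1 if op == "NOT" else 2            # NOT is unary, the rest binary
    return (op,) + tuple(random_tree(max_depth - 1) for _ in range(arity))

# Initial population of 20 random candidate classification rules.
population = [random_tree(max_depth=4) for _ in range(20)]
```

Each candidate is either a bare sensor variable or a nested tuple whose head is an operator, matching the S-expression encoding described above.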
 Next, each candidate solution is evaluated and its fitness is calculated (S1204). FIG. 52 shows an example of fitness calculation for a candidate solution (candidate classification rule). Based on the candidate solution and the extracted data as shown in FIG. 41, a normal/abnormal determination is made for each data item (each row) using the candidate solution. The fitness is the fraction of the determination results that match the actual state (determination label) of the target object. In the illustrated example, the determination result matches the actual state (determination label) for 8 of the 10 data items and differs for the remaining 2, so the fitness (accuracy) is 8/10 = 0.8. When a decision tree is used as the tree structure, the fitness can be calculated in the same manner.
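The fitness computation of S1204 can be sketched as below. The rule and the four training rows are illustrative, not the data of FIG. 52; True means "abnormal":

```python
# Fitness (S1204) = fraction of training rows whose rule verdict matches
# the actual determination label. Data values here are illustrative.

def evaluate(rule, states):
    if isinstance(rule, str):                  # terminal: sensor variable
        return states[rule]
    op, *args = rule
    vals = [evaluate(a, states) for a in args]
    if op == "AND":
        return all(vals)
    if op == "OR":
        return any(vals)
    if op == "XOR":
        return vals[0] != vals[1]
    return not vals[0]                         # NOT

def fitness(rule, dataset):
    """dataset: list of (sensor_states, target_is_abnormal) pairs."""
    hits = sum(evaluate(rule, states) == label for states, label in dataset)
    return hits / len(dataset)

rule = ("AND", "N1", "N2")
dataset = [
    ({"N1": True,  "N2": True},  True),   # predicted abnormal, actually abnormal
    ({"N1": True,  "N2": False}, False),  # predicted normal,   actually normal
    ({"N1": False, "N2": True},  True),   # predicted normal,   actually abnormal
    ({"N1": False, "N2": False}, False),  # predicted normal,   actually normal
]
acc = fitness(rule, dataset)   # 3 of 4 rows match -> 0.75
```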
 Next, candidate solutions are selected from the current population according to a predetermined criterion based on fitness (S1205). As the predetermined criterion, for example, a predetermined number of candidates with the highest fitness are selected, or candidate solutions whose fitness is at or above a predetermined value are selected. Crossover and mutation operations are then applied to the selected candidate solutions to generate offspring (new candidate classification rules) (S1205). FIG. 53 shows an example of generating offspring using crossover and mutation in genetic programming (GP). In the illustrated mutation, one node is replaced with another single node, but this is only an example; mutation may also exchange subtrees of different sizes. For example, a single node may be replaced with a subtree consisting of multiple levels.
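Subtree crossover and mutation on the tuple-encoded trees can be sketched as follows. This is an assumed minimal implementation, not the patent's; trees are nested tuples whose element 0 is the operator:

```python
import random

def subtrees(tree, path=()):
    """Enumerate (path, subtree) pairs; a path indexes into nested tuples."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            for item in subtrees(child, path + (i,)):
                yield item

def replace_at(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(p1, p2):
    """Swap one randomly chosen subtree of each parent (cf. FIG. 53)."""
    path1, sub1 = random.choice(list(subtrees(p1)))
    path2, sub2 = random.choice(list(subtrees(p2)))
    return replace_at(p1, path1, sub2), replace_at(p2, path2, sub1)

def mutate(tree, sensors):
    """Replace one randomly chosen subtree with a fresh terminal node.
    (As noted in the text, real mutation may also insert whole subtrees.)"""
    path, _ = random.choice(list(subtrees(tree)))
    return replace_at(tree, path, random.choice(sensors))

parent1 = ("AND", "N4", ("OR", "N8", "N10"))
parent2 = ("XOR", "N19", "N25")
child1, child2 = crossover(parent1, parent2)
```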
 Next, the generated offspring are evaluated in the same manner as in step S1204, and their fitness values are calculated (S1206).
 Next, candidate solutions are selected from the previous generation's population according to a predetermined criterion, and a new population is generated by combining the selected candidate solutions with the generated offspring (S1207). As the predetermined criterion, a predetermined number of candidate solutions with the highest fitness are selected, or candidate solutions whose fitness is at or above a predetermined value are selected.
 Next, it is checked whether a termination condition is satisfied (for example, a candidate solution with the desired fitness has been obtained, or the maximum number of generations to be executed has been reached) (S1208). If not (NO), the process returns to step S1205; if the termination condition is satisfied (YES), the candidate solution (candidate classification rule) with the highest fitness in the population at that time is taken as the best classification rule (S1209). More specifically, the logical expression (or decision tree) specified by that candidate solution is obtained as the best classification rule.
 FIG. 54 shows the flow of processing inside the gateway.
 The single-channel abnormality determination unit 102 of the gateway 100 collects data from the sensor nodes 1 to n (S2001). The data from a sensor node may be waveform data or a set of values observed at fixed time intervals. The observation interval may differ from sensor node to sensor node.
 The single-channel abnormality determination unit 102 classifies the data from each sensor node as abnormal or normal, using the per-sensor-node abnormality determination model in the abnormality determination model database 105 (S2002). The classification of the sensor data for each sensor node is shown on the left of FIG. 58.
 Here, an example of a classification method when the data is a waveform will be described with reference to FIG. 55. FIG. 55 shows an example of classifying a test waveform using an abnormality determination model (an optimized group of training partial waveforms). As shown on the right of FIG. 55, the test waveform is first divided into a plurality of sections. The division method is specified in advance; for example, the waveform is divided into a predetermined number of sections of fixed width. For each section, the optimum partial waveform data closest to that section is identified from within the abnormality determination model. The state (normal or abnormal) of the optimum partial waveform data identified for each section is then checked, and the majority state is adopted. In the example of FIG. 55, abnormal is adopted. Such classification is performed at least for the sensor nodes included in the decision fusion rules. The sensor nodes included in the decision fusion rules are designated in advance to the single-channel abnormality determination unit 102. The designation may be made by a notification from the comprehensive determination unit 103, or by maintenance or service personnel.
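The section-wise classification described above can be sketched as follows. The Euclidean distance, the fixed-width split, and all data values are illustrative assumptions consistent with the text, not taken from FIG. 55:

```python
# Sketch of FIG. 55: split the test waveform into fixed-width sections,
# find the nearest training partial waveform for each section, and take
# a majority vote over the attached normal/abnormal labels.

def distance(a, b):
    """Euclidean distance between two equal-length sample sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def classify_waveform(test_wave, model, width):
    """model: list of (partial_waveform, label), label in {"normal", "abnormal"}."""
    sections = [test_wave[i:i + width] for i in range(0, len(test_wave), width)]
    votes = []
    for sec in sections:
        if len(sec) < width:
            continue                    # ignore a trailing short section
        _, label = min(model, key=lambda m: distance(sec, m[0]))
        votes.append(label)
    return max(("normal", "abnormal"), key=votes.count)

# Illustrative model: one normal and one abnormal partial waveform.
model = [([0, 0, 0], "normal"), ([5, 5, 5], "abnormal")]
wave = [5, 5, 4, 0, 1, 0, 6, 5, 5]     # 3 sections of width 3
verdict = classify_waveform(wave, model, width=3)
```

Two of the three sections fall nearest the abnormal partial waveform, so the majority vote classifies this illustrative waveform as "abnormal".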
 Next, the comprehensive determination unit 103 extracts the decision fusion rules (classification rule) for the target object from the decision fusion rule database 106 (S2003). In the upper right of FIG. 58, a classification rule consisting of a plurality of decision fusion rules has been extracted.
 The comprehensive determination unit 103 checks whether each extracted decision fusion rule matches the states (normal or abnormal) of the sensing data of the sensor nodes included in that rule (S2004). If at least one decision fusion rule matches (YES in S2004), the target object is determined to be in an abnormal state. In the example of FIG. 58, two decision fusion rules are satisfied.
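The matching in S2004 can be sketched as follows. The data shapes are assumptions for illustration: a fusion rule is a mapping from sensor node to its required state, and the satisfied rules serve as the basis for the determination:

```python
# Sketch of S2004 (assumed data shapes, not the patent's internal format).

def rule_matches(rule, states):
    """A fusion rule matches when every listed node is in its required state."""
    return all(states.get(node) == required for node, required in rule.items())

def judge(fusion_rules, states):
    """Return the verdict plus the satisfied rules (the determination basis)."""
    satisfied = [r for r in fusion_rules if rule_matches(r, states)]
    if satisfied:
        return "abnormal", satisfied      # S2005: send result and basis
    return "not confirmed", satisfied     # S2006: abnormality not confirmed

rules = [
    {"N4": "abnormal", "N8": "abnormal"},
    {"N10": "abnormal", "N19": "normal"},
]
states = {"N4": "abnormal", "N8": "abnormal", "N10": "normal", "N19": "normal"}
verdict, basis = judge(rules, states)
```

Here the first rule is satisfied, so the verdict is "abnormal" and `basis` holds that rule as the determination basis.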
 In this case, the data filtering unit 104 sends the determination result (the target object is abnormal) and the states of the sensor nodes included in the satisfied decision fusion rules (the basis for the determination) to the server 200 of the remote monitoring center (S2005). FIG. 56 shows an example of the message format transmitted to the server 200 in step S2005. In the server 200, these data (the determination result and its basis) transmitted from the gateway 100 are received by the receiving unit 205 and stored in the database 204.
 On the other hand, if none of the decision fusion rules included in the classification rule is satisfied, the data filtering unit 104 sends a list of all sensor nodes indicating an abnormal state to the server 200 of the monitoring center (S2006). FIG. 57 shows an example of the message format transmitted to the server 200 in step S2006. The data filtering unit 104 further sends the data of these sensor nodes to the server 200 of the remote monitoring center (S2007). The time stamps of the sensor node data may be transmitted at the same time. At this point, the gateway 100 may discard the data of the sensor nodes that are in a normal state.
 The server 200 may store these data (the list of sensor nodes in an abnormal state, the data of those sensor nodes, and the time stamps) in the sensor database 201. In this case, the determination label of the target object is set to normal, and the state labels of the sensor nodes not included in the list are also set to normal. Thereafter, for example, the single-channel abnormality determination model learning unit 203 or the decision fusion rule learning unit 202 may perform the processing described above based on the updated sensor database 201.
 In the embodiment described above, the abnormality determination models based on the individual sensors and the decision fusion rules that comprehensively combine the determinations of the individual abnormality determination models were all learned in the server. However, the present invention is not limited to this setup; if the gateway has sufficient computing resources, the abnormality determination models and the decision fusion rules may be learned in the gateway.
 As described above, according to the present embodiment, since a feature selection technique is used to learn the decision fusion rules, sensor nodes unnecessary for the target object can be identified and removed from the remote monitoring system, reducing the cost of the system. Unlike inference in a Bayesian network, only the states of the sensor nodes specified in the decision fusion rules are matched, so the cause of an abnormal event can be identified efficiently from among many sensor nodes. Detection of an abnormal event becomes more reliable because the states of many sensors are used to confirm an abnormality in the target object. Furthermore, no prior knowledge about the causal relationships among the sensor nodes is needed to learn (construct) the decision fusion rules.
 The communication overhead of transmitting data to the remote monitoring center is reduced by using the data filtering unit. Only when an abnormality of the target object cannot be confirmed by the decision fusion rules does the data filtering unit need to send the sensor data to the server of the remote monitoring center.
 As described above, according to the first and second embodiments, only the portions that contribute to the determination can be extracted from the accumulated multi-channel sensor data, and a determination model can be generated that takes the probabilistic dependencies between channels into account. Determination can then be performed using the generated determination model, and the basis for the determination can be indicated precisely, with neither excess nor deficiency. In addition, when new input arrives, highly reliable training data can be added, so the performance of the determination model can be improved continuously.
 The present invention can be used in various remote monitoring systems for quality control, maintenance, and condition monitoring, such as monitoring systems for semiconductor manufacturing equipment and for manufacturing equipment on product production lines, elevator monitoring devices, air-conditioning system monitoring devices, power system monitoring devices, monitoring systems for vital sensing devices in medical and nursing care, and monitoring devices for health equipment.

Claims (8)

  1.  An abnormality determination system comprising:
     a data storage unit that stores a plurality of training data, each pairing a plurality of time-series data on a plurality of variables obtained by observing a monitoring target with a plurality of sensors with a normal class or an abnormal class representing the state of the monitoring target at the time the plurality of time-series data were acquired;
     a waveform dividing unit that specifies a plurality of sections for each of the plurality of variables and extracts, for each of the plurality of variables, a plurality of segment data, which are the data of the plurality of sections, from the plurality of time-series data included in the plurality of training data;
     an evaluation unit that, for each of the plurality of variables, performs determination by the nearest neighbor method for each of the plurality of sections using the plurality of segment data extracted by the waveform dividing unit, thereby selecting a best section, which is one of the plurality of sections;
     a calculation unit that,
       for each of the plurality of variables, calculates normal and abnormal conditional probabilities of the best section based on the number of times each of the plurality of sections was determined to be normal and the number of times it was determined to be abnormal, and
       calculates normal and abnormal prior probabilities from the total number of normal classes and the total number of abnormal classes included in the plurality of training data;
     a storage unit that stores the normal and abnormal prior probabilities and, for each of the plurality of variables, identification information of the best section, the segment data of the best section, the class associated with the segment data, and the normal and abnormal conditional probabilities of the best section;
     a sensing unit that observes the monitoring target with a plurality of sensors to acquire a plurality of time-series data on a plurality of variables;
     a selection unit that, for each of the plurality of variables, selects segment data from the plurality of time-series data acquired by the sensing unit in accordance with the respective best section; and
     a determination unit that,
       for each of the plurality of variables, detects a predetermined number of top-ranked segment data for the segment data selected by the selection unit, by the nearest neighbor method using the segment data in the storage unit,
       for each of the plurality of variables, multiplies the respective ratios of the normal class and the abnormal class in the predetermined number of segment data by the normal and abnormal conditional probabilities in the storage unit, multiplies the products across the plurality of variables, and further multiplies by the normal and abnormal prior probabilities to calculate normal and abnormal likelihoods, and
       determines the state of the monitoring target to be whichever of normal and abnormal has the larger likelihood.
  2.  The system according to claim 1, wherein
     the calculation unit calculates, for a first variable and a second variable specified in advance among the plurality of variables, first conditional probabilities that the nearest neighbor determination by the best section of the first variable is normal and abnormal when the nearest neighbor determination by the best section of the second variable is normal and abnormal,
     the storage unit stores the normal and abnormal first conditional probabilities, and
     the determination unit calculates the normal and abnormal likelihoods by further multiplying by the normal and abnormal first conditional probabilities.
  3.  The system according to claim 2, wherein the evaluation unit calculates the number of correct determinations for each of the plurality of sections and selects the section with the highest number of correct determinations as the best section.
  4.  The system according to claim 2, wherein
     the calculation unit calculates, for at least one variable among the plurality of variables, when there are two or more sections whose numbers of correct determinations are equal or at or above a threshold, the abnormal conditional probability for each of the two or more sections, and
     the evaluation unit calculates, for each of the two or more sections, an abnormal likelihood by multiplying the abnormal conditional probability calculated for the best section of a variable other than the at least one variable among the plurality of variables by the abnormal conditional probability of that section, and selects whichever of the two or more sections has the larger abnormal likelihood as the best section of the at least one variable.
  5.  An abnormality determination method executed by a computer, the computer executing:
     a step of reading data from a data storage unit that stores a plurality of training data, each pairing a plurality of time-series data on a plurality of variables obtained by observing a monitoring target with a plurality of sensors with a normal class or an abnormal class representing the state of the monitoring target at the time the plurality of time-series data were acquired;
     a waveform dividing step of specifying a plurality of sections for each of the plurality of variables and extracting, for each of the plurality of variables, a plurality of segment data, which are the data of the plurality of sections, from the plurality of time-series data included in the plurality of training data;
     an evaluation step of, for each of the plurality of variables, performing determination by the nearest neighbor method for each of the plurality of sections using the plurality of segment data extracted in the waveform dividing step, thereby selecting a best section, which is one of the plurality of sections;
     a calculation step of,
       for each of the plurality of variables, calculating normal and abnormal conditional probabilities of the best section based on the number of times each of the plurality of sections was determined to be normal and the number of times it was determined to be abnormal, and
       calculating normal and abnormal prior probabilities from the total number of normal classes and the total number of abnormal classes included in the plurality of training data;
     a storage step of storing the normal and abnormal prior probabilities and, for each of the plurality of variables, identification information of the best section, the segment data of the best section, the class associated with the segment data, and the normal and abnormal conditional probabilities of the best section;
     a sensing step of acquiring a plurality of time-series data on a plurality of variables by observing the monitoring target with a plurality of sensors;
     a selection step of, for each of the plurality of variables, selecting segment data from the plurality of time-series data acquired in the sensing step in accordance with the respective best section; and
     a determination step of,
       for each of the plurality of variables, detecting a predetermined number of top-ranked segment data for the selected segment data by the nearest neighbor method using the stored segment data,
       for each of the plurality of variables, multiplying the respective ratios of the normal class and the abnormal class in the predetermined number of segment data by the stored normal and abnormal conditional probabilities, multiplying the products across the plurality of variables, and further multiplying by the normal and abnormal prior probabilities to calculate normal and abnormal likelihoods, and
       determining the state of the monitoring target to be whichever of normal and abnormal has the larger likelihood.
  6.  An abnormality determination system comprising:
     a first database that stores a plurality of training data including a plurality of first labels each indicating whether the sensor data observed by a plurality of sensor nodes monitoring a target object are abnormal or normal, and a second label indicating whether the state of the target object is normal or abnormal;
     a decision fusion rule learning unit that
       (A-1) generates a plurality of candidate solutions by randomly performing, a plurality of times, a mapping using an encoding method that maps the presence or absence of each of the plurality of sensor nodes to a bit string, and
       (A-2) determines an optimal candidate solution having optimal fitness by repeatedly performing, in accordance with a genetic algorithm, evaluation of the fitness of each of the plurality of candidate solutions against the first database and generation of new candidate solutions by crossover and mutation operations on candidate solutions selected based on the fitness, and identifies the sensor nodes whose presence bits are set in the optimal candidate solution; and
     a comprehensive determination unit that
       (B-1) determines whether the sensor data observed by each identified sensor node are abnormal or normal, using a classifier prepared in advance for the identified sensor node that decides whether given sensor data are abnormal or normal, and
       (B-2) determines that the target object is abnormal when all the determination results for the identified sensor nodes indicate abnormality, and determines that the target object is normal when at least one of the determination results indicates normality,
     wherein the decision fusion rule learning unit, as the evaluation of the fitness of each of the plurality of candidate solutions,
     detects, for each of the plurality of training data, the first labels of the sensor nodes whose presence bits are set in the candidate solution, selects whichever of normal and abnormal appears more often among the detected first labels, and calculates the rate at which the state selected for each of the plurality of training data matches the state indicated by the second label of each of the plurality of training data.
  7.  ターゲットオブジェクトを監視する複数のセンサノードにより観測されたセンサデータがそれぞれ異常か正常かをそれぞれ示す複数の第1ラベルと、前記ターゲットオブジェクトの状態が正常か正常かを示す第2ラベルとを含む複数の訓練データを記憶する第1のデータベースと、
      (A-1)木構造の末端ノードに前記複数のセンサノードから選択したセンサノードを表す変数、前記木構造の非末端ノードに複数の論理演算記号から選択した論理演算記号をマッピングすることを規定した、S表現への符号化方法を用い、前記マッピングを複数回、ランダムに行うことにより複数の候補解を生成し、
      (A-2)前記第1のデータベースに対する前記複数の候補解のそれぞれのフィットネスの評価と、前記フィットネスに基づき選択される候補解の交叉および突然変異オペレーションによる新たな候補解の生成とを、遺伝的プログラミングに従って、繰り返し行うことにより最適フィットネスをもつ最適候補解を求め、前記最適候補解によって特定される論理演算式を取得する決定フュージョンルール学習部と、
      (B-1)前記論理演算式に含まれる変数のセンサノードにより観測されたセンサデータが異常か正常かを、前記センサノードに対してあらかじめ用意された、与えられたセンサデータを異常および正常のいずれかに決定する分類器を用いて判定し、
      (B-2)前記センサデータが異常のとき前記センサノードの変数が真、正常のとき偽としたときに、前記論理演算式が真か偽かを判定し、判定が真となるときは前記ターゲットオブジェクトが異常、偽となるときは正常であることを決定する総合判定部と、
     を備え、
     前記決定フュージョンルール学習部は、前記複数の候補解のそれぞれのフィットネスの評価として、
     前記複数の訓練データのそれぞれについて、前記候補解に含まれるセンサノードの第1ラベルが異常を示すときは前記センサノードの変数が真、正常を示すときは偽として、前記候補解によって特定される論理演算式が真か偽かを判定し、真のときは前記ターゲットオブジェクトは異常、偽のときは正常と決定し、
     前記複数の訓練データのそれぞれに対して決定した状態と前記複数の訓練データのそれぞれの第2ラベルに示される状態とが一致する割合を計算する、
     ことを特徴とする異常判定システム。
    A first database that stores a plurality of training data, each including a plurality of first labels each indicating whether sensor data observed by a plurality of sensor nodes monitoring a target object is abnormal or normal, and a second label indicating whether the state of the target object is normal or abnormal;
    (A-1) a decision fusion rule learning unit that generates a plurality of candidate solutions by randomly performing, a plurality of times, a mapping defined by an encoding method into S-expressions, in which a variable representing a sensor node selected from the plurality of sensor nodes is mapped to each terminal node of a tree structure and a logical operation symbol selected from a plurality of logical operation symbols is mapped to each non-terminal node of the tree structure, and
    (A-2) obtains, in accordance with genetic programming, an optimal candidate solution having optimal fitness by repeatedly evaluating the fitness of each of the plurality of candidate solutions against the first database and generating new candidate solutions by crossover and mutation operations on candidate solutions selected based on the fitness, and acquires the logical operation expression specified by the optimal candidate solution; and
    (B-1) an overall determination unit that determines whether the sensor data observed by the sensor node of each variable included in the logical operation expression is abnormal or normal, using a classifier prepared in advance for that sensor node which classifies given sensor data as either abnormal or normal, and
    (B-2) sets the variable of a sensor node to true when its sensor data is abnormal and to false when it is normal, determines whether the logical operation expression is true or false, and determines that the target object is abnormal when the determination is true and normal when it is false,
    the system comprising the above, wherein
    the decision fusion rule learning unit, as the fitness evaluation of each of the plurality of candidate solutions,
    for each of the plurality of training data, sets the variable of each sensor node included in the candidate solution to true when the first label of that sensor node indicates abnormal and to false when it indicates normal, determines whether the logical operation expression specified by the candidate solution is true or false, and determines that the target object is abnormal when the expression is true and normal when it is false, and
    calculates the rate at which the state determined for each of the plurality of training data matches the state indicated by the second label of that training data,
    the system being an abnormality determination system characterized by the above.
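The fitness evaluation described in the claim above can be sketched as follows. This is a minimal illustration, not the patented implementation: the sensor names, the example rule, and the training records are all hypothetical, and an S-expression is represented as a nested tuple with logical operators at non-terminal nodes and sensor-node variables at terminal nodes.

```python
# Minimal sketch of the claim's fitness evaluation: a candidate solution is an
# S-expression whose terminal nodes are sensor-node variables and whose
# non-terminal nodes are logical operators. All names here are illustrative.

def evaluate(expr, assignment):
    """Recursively evaluate an S-expression against a truth assignment."""
    if isinstance(expr, str):                      # terminal node: sensor variable
        return assignment[expr]
    op, *args = expr                               # non-terminal node: operator
    vals = [evaluate(a, assignment) for a in args]
    if op == "AND":
        return all(vals)
    if op == "OR":
        return any(vals)
    if op == "NOT":
        return not vals[0]
    raise ValueError(f"unknown operator: {op}")

def fitness(candidate, training_data):
    """Rate at which the rule's decision matches the second (target) label.

    Each training record is (first_labels, second_label): first_labels maps a
    sensor variable to True (its first label says abnormal) or False (normal),
    and second_label is True when the target object is abnormal.
    """
    matches = 0
    for first_labels, second_label in training_data:
        decided_abnormal = evaluate(candidate, first_labels)  # true -> abnormal
        matches += (decided_abnormal == second_label)
    return matches / len(training_data)

# Hypothetical training records and candidate rule.
data = [
    ({"s1": True,  "s2": False, "s3": True},  True),
    ({"s1": False, "s2": False, "s3": False}, False),
    ({"s1": True,  "s2": True,  "s3": False}, True),
    ({"s1": False, "s2": True,  "s3": False}, False),
]
rule = ("AND", "s1", ("OR", "s2", "s3"))
print(fitness(rule, data))  # -> 1.0 on this toy data
```

In the genetic-programming loop of step (A-2), this match rate would be computed for every candidate solution in the population, and candidates selected on it would then undergo crossover and mutation.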
  8.  A first database that stores a plurality of training data, each including a plurality of first labels each indicating whether sensor data observed by a plurality of sensor nodes monitoring a target object is abnormal or normal, and a second label indicating whether the state of the target object is normal or abnormal;
      (A-1) a decision fusion rule learning unit that generates a plurality of candidate solutions by randomly performing, a plurality of times, a mapping defined by an encoding method into decision trees, in which a variable representing a sensor node selected from the plurality of sensor nodes is mapped to each non-terminal node of a tree structure, a value indicating whether the variable of the node immediately above is true or false is mapped to each branch of the tree structure, and a value indicating true or false is mapped to each terminal node of the tree structure, and
      (A-2) obtains, in accordance with genetic programming, an optimal candidate solution having optimal fitness by repeatedly evaluating the fitness of each of the plurality of candidate solutions against the first database and generating new candidate solutions by crossover and mutation operations on candidate solutions selected based on the fitness, and acquires the decision tree specified by the optimal candidate solution; and
      (B-1) an overall determination unit that determines whether the sensor data observed by the sensor node of each variable included in the decision tree is abnormal or normal, using a classifier prepared in advance for that sensor node which classifies given sensor data as either abnormal or normal, and
      (B-2) sets the variable corresponding to a sensor node to true when its sensor data is abnormal and to false when it is normal, determines whether the decision tree evaluates to true or false, and determines that the target object is abnormal when the result is true and normal when it is false,
     the system comprising the above, wherein
     the decision fusion rule learning unit, as the fitness evaluation of each of the plurality of candidate solutions,
     for each of the plurality of training data, sets each variable included in the candidate solution to true when the first label of that variable's sensor node indicates abnormal and to false when it indicates normal, determines whether the decision tree specified by the candidate solution evaluates to true or false, and determines that the target object is abnormal when the result is true and normal when it is false, and
     calculates, as the fitness, the rate at which the state determined for each of the plurality of training data matches the state indicated by the second label of that training data,
     the system being an abnormality determination system characterized by the above.
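The decision-tree fitness evaluation in the claim above can be sketched in the same spirit. This is a minimal, hypothetical illustration: a tree is a nested tuple whose non-terminal nodes hold a sensor variable with a branch for each truth value of that variable, and whose terminal nodes hold the final true (abnormal) / false (normal) decision.

```python
# Minimal sketch of the claim's decision-tree fusion rule: non-terminal nodes
# hold sensor variables, each branch corresponds to the truth value of the
# node immediately above it, and terminal nodes hold the final decision
# (True = abnormal, False = normal). All names here are illustrative.

def decide(tree, assignment):
    """Walk the decision tree using the truth assignment of sensor variables."""
    if isinstance(tree, bool):            # terminal node: final decision
        return tree
    variable, if_true, if_false = tree    # non-terminal node and its two branches
    branch = if_true if assignment[variable] else if_false
    return decide(branch, assignment)

def fitness(tree, training_data):
    """Rate at which the tree's decision matches the second (target) label."""
    matches = sum(
        decide(tree, first_labels) == second_label
        for first_labels, second_label in training_data
    )
    return matches / len(training_data)

# A hypothetical tree: if s1 is abnormal, the decision follows s2;
# otherwise the target object is judged normal.
tree = ("s1", ("s2", True, False), False)
data = [
    ({"s1": True,  "s2": True},  True),
    ({"s1": True,  "s2": False}, False),
    ({"s1": False, "s2": True},  False),
]
print(fitness(tree, data))  # -> 1.0 on this toy data
```

As with the S-expression encoding, the genetic-programming loop would evolve such trees by crossover and mutation, selecting on this match rate.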
    A first database that stores a plurality of training data, each including a plurality of first labels each indicating whether sensor data observed by a plurality of sensor nodes monitoring a target object is abnormal or normal, and a second label indicating whether the state of the target object is normal or abnormal;
    (A-1) a decision fusion rule learning unit that generates a plurality of candidate solutions by randomly performing, a plurality of times, a mapping defined by an encoding method into decision trees, in which a variable representing a sensor node selected from the plurality of sensor nodes is mapped to each non-terminal node of a tree structure, a value indicating whether the variable of the node immediately above is true or false is mapped to each branch of the tree structure, and a value indicating true or false is mapped to each terminal node of the tree structure, and
    (A-2) obtains, in accordance with genetic programming, an optimal candidate solution having optimal fitness by repeatedly evaluating the fitness of each of the plurality of candidate solutions against the first database and generating new candidate solutions by crossover and mutation operations on candidate solutions selected based on the fitness, and acquires the decision tree specified by the optimal candidate solution; and
    (B-1) an overall determination unit that determines whether the sensor data observed by the sensor node of each variable included in the decision tree is abnormal or normal, using a classifier prepared in advance for that sensor node which classifies given sensor data as either abnormal or normal, and
    (B-2) sets the variable corresponding to a sensor node to true when its sensor data is abnormal and to false when it is normal, determines whether the decision tree evaluates to true or false, and determines that the target object is abnormal when the result is true and normal when it is false,
    the system comprising the above, wherein
    the decision fusion rule learning unit, as the fitness evaluation of each of the plurality of candidate solutions,
    for each of the plurality of training data, sets each variable included in the candidate solution to true when the first label of that variable's sensor node indicates abnormal and to false when it indicates normal, determines whether the decision tree specified by the candidate solution evaluates to true or false, and determines that the target object is abnormal when the result is true and normal when it is false, and
    calculates, as the fitness, the rate at which the state determined for each of the plurality of training data matches the state indicated by the second label of that training data,
    the system being an abnormality determination system characterized by the above.
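Steps (B-1) and (B-2) of the overall determination unit, common to both claims, can be sketched as follows. The threshold classifiers, sensor names, and the example fused rule are all hypothetical stand-ins for the classifiers "prepared in advance" and the rule learned by genetic programming.

```python
# Minimal sketch of the overall determination unit: B-1 labels each sensor's
# reading abnormal or normal with a per-sensor classifier, and B-2 evaluates
# the learned fusion rule on those labels. All names here are illustrative.

def make_threshold_classifier(limit):
    """A stand-in classifier: a reading above `limit` is labelled abnormal."""
    return lambda reading: reading > limit

classifiers = {
    "s1": make_threshold_classifier(80.0),   # e.g. a temperature sensor
    "s2": make_threshold_classifier(5.0),    # e.g. a vibration sensor
}

# A hypothetical learned rule: the object is abnormal only when both
# sensors report abnormal readings.
rule = lambda a: a["s1"] and a["s2"]

def overall_determination(rule, sensor_data, classifiers):
    """B-1: classify each sensor reading; B-2: evaluate the fusion rule."""
    assignment = {name: classifiers[name](reading)   # True means abnormal
                  for name, reading in sensor_data.items()}
    return "abnormal" if rule(assignment) else "normal"

print(overall_determination(rule, {"s1": 85.2, "s2": 7.1}, classifiers))  # abnormal
print(overall_determination(rule, {"s1": 85.2, "s2": 3.0}, classifiers))  # normal
```

The same `overall_determination` skeleton serves either encoding: only the representation of `rule` (logical expression or decision tree) changes between claims 7 and 8.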
PCT/JP2009/066806 2009-09-28 2009-09-28 Abnormality identification system and method thereof WO2011036809A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2009/066806 WO2011036809A1 (en) 2009-09-28 2009-09-28 Abnormality identification system and method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2009/066806 WO2011036809A1 (en) 2009-09-28 2009-09-28 Abnormality identification system and method thereof

Publications (1)

Publication Number Publication Date
WO2011036809A1 true WO2011036809A1 (en) 2011-03-31

Family

ID=43795579

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2009/066806 WO2011036809A1 (en) 2009-09-28 2009-09-28 Abnormality identification system and method thereof

Country Status (1)

Country Link
WO (1) WO2011036809A1 (en)

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016117086A1 (en) * 2015-01-22 2016-07-28 三菱電機株式会社 Chronological data search device and chronological data search program
WO2018100655A1 (en) * 2016-11-30 2018-06-07 株式会社日立製作所 Data collection system, abnormality detection system, and gateway device
EP3336636A1 (en) * 2016-12-19 2018-06-20 Palantir Technologies Inc. Machine fault modelling
CN108255142A (en) * 2018-01-19 2018-07-06 山东大陆计量科技有限公司 Quality of production control method and device
JP6362808B1 (en) * 2017-07-31 2018-07-25 三菱電機株式会社 Information processing apparatus and information processing method
JP2018156415A (en) * 2017-03-17 2018-10-04 株式会社リコー Diagnosis device, diagnosis system, diagnosis method and program
US10354196B2 (en) 2016-12-16 2019-07-16 Palantir Technologies Inc. Machine fault modelling
JP2019159779A (en) * 2018-03-13 2019-09-19 アズビル株式会社 Multivariate time series data synchronization method and multivariate time series data processing device
JP2019179395A (en) * 2018-03-30 2019-10-17 オムロン株式会社 Abnormality detection system, support device and abnormality detection method
JP6600120B1 (en) * 2019-02-06 2019-10-30 オーウエル株式会社 Management system, machine learning apparatus and management method therefor
WO2019235161A1 (en) * 2018-06-04 2019-12-12 日本電信電話株式会社 Data analysis system and data analysis method
CN110781433A (en) * 2019-10-11 2020-02-11 腾讯科技(深圳)有限公司 Data type determination method and device, storage medium and electronic device
US10663961B2 (en) 2016-12-19 2020-05-26 Palantir Technologies Inc. Determining maintenance for a machine
CN111325258A (en) * 2020-02-14 2020-06-23 腾讯科技(深圳)有限公司 Characteristic information acquisition method, device, equipment and storage medium
CN112001212A (en) * 2019-05-27 2020-11-27 株式会社东芝 Waveform segmentation device and waveform segmentation method
US10928817B2 (en) 2016-12-19 2021-02-23 Palantir Technologies Inc. Predictive modelling
CN112446647A (en) * 2020-12-14 2021-03-05 上海众源网络有限公司 Abnormal element positioning method and device, electronic equipment and storage medium
WO2021075039A1 (en) * 2019-10-18 2021-04-22 日本電気株式会社 Time-series data processing method
CN113168171A (en) * 2018-12-05 2021-07-23 三菱电机株式会社 Abnormality detection device and abnormality detection method
US11092460B2 (en) 2017-08-04 2021-08-17 Kabushiki Kaisha Toshiba Sensor control support apparatus, sensor control support method and non-transitory computer readable medium
US11163853B2 (en) 2017-01-04 2021-11-02 Kabushiki Kaisha Toshiba Sensor design support apparatus, sensor design support method and non-transitory computer readable medium
JPWO2020170304A1 (en) * 2019-02-18 2021-12-02 日本電気株式会社 Learning devices and methods, predictors and methods, and programs
CN114553756A (en) * 2022-01-27 2022-05-27 烽火通信科技股份有限公司 Equipment fault detection method based on joint generation countermeasure network and electronic equipment
CN114613120A (en) * 2022-03-29 2022-06-10 新奥(中国)燃气投资有限公司 Remote meter reading abnormity identification method and device
CN115017121A (en) * 2022-08-05 2022-09-06 山东天意机械股份有限公司 Concrete production equipment data storage system
CN115860590A (en) * 2023-03-02 2023-03-28 广东慧航天唯科技有限公司 Intelligent analysis early warning method and system for enterprise emission pollution data
US11874854B2 (en) 2020-01-06 2024-01-16 Kabushiki Kaisha Toshiba Information processing apparatus, information processing method, and computer program
CN117495113A (en) * 2024-01-02 2024-02-02 海纳云物联科技有限公司 Building fire safety assessment method, equipment and medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003271231A (en) * 2002-03-15 2003-09-26 Mitsubishi Heavy Ind Ltd Estimation device of detector drift and monitor system of detector
JP2004356510A (en) * 2003-05-30 2004-12-16 Fujitsu Ltd Device and method of signal processing
JP2008287495A (en) * 2007-05-17 2008-11-27 Toshiba Corp Equipment state monitoring device and equipment state monitoring method and program

Cited By (51)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10223069B2 (en) 2015-01-22 2019-03-05 Mitsubishi Electric Corporation Time-series data search device and computer readable medium
JPWO2016117086A1 (en) * 2015-01-22 2017-04-27 三菱電機株式会社 Time-series data search device and time-series data search program
CN107111643A (en) * 2015-01-22 2017-08-29 三菱电机株式会社 Time series data retrieves device and time series data search program
WO2016117086A1 (en) * 2015-01-22 2016-07-28 三菱電機株式会社 Chronological data search device and chronological data search program
CN107111643B (en) * 2015-01-22 2018-12-28 三菱电机株式会社 Time series data retrieves device
WO2018100655A1 (en) * 2016-11-30 2018-06-07 株式会社日立製作所 Data collection system, abnormality detection system, and gateway device
US11067973B2 (en) 2016-11-30 2021-07-20 Hitachi, Ltd. Data collection system, abnormality detection method, and gateway device
JPWO2018100655A1 (en) * 2016-11-30 2019-06-27 株式会社日立製作所 Data acquisition system, anomaly detection method, and gateway device
US10354196B2 (en) 2016-12-16 2019-07-16 Palantir Technologies Inc. Machine fault modelling
US10928817B2 (en) 2016-12-19 2021-02-23 Palantir Technologies Inc. Predictive modelling
EP3336636A1 (en) * 2016-12-19 2018-06-20 Palantir Technologies Inc. Machine fault modelling
US11755006B2 (en) 2016-12-19 2023-09-12 Palantir Technologies Inc. Predictive modelling
US10663961B2 (en) 2016-12-19 2020-05-26 Palantir Technologies Inc. Determining maintenance for a machine
US10996665B2 (en) 2016-12-19 2021-05-04 Palantir Technologies Inc. Determining maintenance for a machine
US11163853B2 (en) 2017-01-04 2021-11-02 Kabushiki Kaisha Toshiba Sensor design support apparatus, sensor design support method and non-transitory computer readable medium
JP2018156415A (en) * 2017-03-17 2018-10-04 株式会社リコー Diagnosis device, diagnosis system, diagnosis method and program
US10613960B2 (en) 2017-07-31 2020-04-07 Mitsubishi Electric Corporation Information processing apparatus and information processing method
JP6362808B1 (en) * 2017-07-31 2018-07-25 三菱電機株式会社 Information processing apparatus and information processing method
WO2019026134A1 (en) * 2017-07-31 2019-02-07 三菱電機株式会社 Information processing device and information processing method
US11092460B2 (en) 2017-08-04 2021-08-17 Kabushiki Kaisha Toshiba Sensor control support apparatus, sensor control support method and non-transitory computer readable medium
CN108255142A (en) * 2018-01-19 2018-07-06 山东大陆计量科技有限公司 Quality of production control method and device
JP7051503B2 (en) 2018-03-13 2022-04-11 アズビル株式会社 Multivariate time series data synchronization method and multivariate time series data processing device
JP2019159779A (en) * 2018-03-13 2019-09-19 アズビル株式会社 Multivariate time series data synchronization method and multivariate time series data processing device
JP2019179395A (en) * 2018-03-30 2019-10-17 オムロン株式会社 Abnormality detection system, support device and abnormality detection method
JP7106997B2 (en) 2018-06-04 2022-07-27 日本電信電話株式会社 Data analysis system and data analysis method
JP2019211942A (en) * 2018-06-04 2019-12-12 日本電信電話株式会社 Data analysis system and data analysis method
WO2019235161A1 (en) * 2018-06-04 2019-12-12 日本電信電話株式会社 Data analysis system and data analysis method
CN113168171B (en) * 2018-12-05 2023-09-19 三菱电机株式会社 Abnormality detection device and abnormality detection method
CN113168171A (en) * 2018-12-05 2021-07-23 三菱电机株式会社 Abnormality detection device and abnormality detection method
WO2020161835A1 (en) * 2019-02-06 2020-08-13 オーウエル株式会社 Management system and machine learning device therefor and managing method
JP6600120B1 (en) * 2019-02-06 2019-10-30 オーウエル株式会社 Management system, machine learning apparatus and management method therefor
JPWO2020170304A1 (en) * 2019-02-18 2021-12-02 日本電気株式会社 Learning devices and methods, predictors and methods, and programs
CN112001212A (en) * 2019-05-27 2020-11-27 株式会社东芝 Waveform segmentation device and waveform segmentation method
CN110781433A (en) * 2019-10-11 2020-02-11 腾讯科技(深圳)有限公司 Data type determination method and device, storage medium and electronic device
CN110781433B (en) * 2019-10-11 2023-06-02 腾讯科技(深圳)有限公司 Data type determining method and device, storage medium and electronic device
JP7315017B2 (en) 2019-10-18 2023-07-26 日本電気株式会社 Time series data processing method
WO2021075039A1 (en) * 2019-10-18 2021-04-22 日本電気株式会社 Time-series data processing method
US11885720B2 (en) 2019-10-18 2024-01-30 Nec Corporation Time series data processing method
JPWO2021075039A1 (en) * 2019-10-18 2021-04-22
US20220334030A1 (en) * 2019-10-18 2022-10-20 Nec Corporation Time series data processing method
US11874854B2 (en) 2020-01-06 2024-01-16 Kabushiki Kaisha Toshiba Information processing apparatus, information processing method, and computer program
CN111325258B (en) * 2020-02-14 2023-10-24 腾讯科技(深圳)有限公司 Feature information acquisition method, device, equipment and storage medium
CN111325258A (en) * 2020-02-14 2020-06-23 腾讯科技(深圳)有限公司 Characteristic information acquisition method, device, equipment and storage medium
CN112446647A (en) * 2020-12-14 2021-03-05 上海众源网络有限公司 Abnormal element positioning method and device, electronic equipment and storage medium
CN114553756B (en) * 2022-01-27 2023-06-13 烽火通信科技股份有限公司 Equipment fault detection method based on joint generation countermeasure network and electronic equipment
CN114553756A (en) * 2022-01-27 2022-05-27 烽火通信科技股份有限公司 Equipment fault detection method based on joint generation countermeasure network and electronic equipment
CN114613120A (en) * 2022-03-29 2022-06-10 新奥(中国)燃气投资有限公司 Remote meter reading abnormity identification method and device
CN115017121A (en) * 2022-08-05 2022-09-06 山东天意机械股份有限公司 Concrete production equipment data storage system
CN115860590B (en) * 2023-03-02 2023-04-28 广东慧航天唯科技有限公司 Intelligent analysis and early warning method and system for enterprise emission pollution data
CN115860590A (en) * 2023-03-02 2023-03-28 广东慧航天唯科技有限公司 Intelligent analysis early warning method and system for enterprise emission pollution data
CN117495113A (en) * 2024-01-02 2024-02-02 海纳云物联科技有限公司 Building fire safety assessment method, equipment and medium

Similar Documents

Publication Publication Date Title
WO2011036809A1 (en) Abnormality identification system and method thereof
Bashar et al. TAnoGAN: Time series anomaly detection with generative adversarial networks
Liu et al. Missing value imputation for industrial IoT sensor data with large gaps
Jiménez et al. Maintenance management based on machine learning and nonlinear features in wind turbines
US9626600B2 (en) Event analyzer and computer-readable storage medium
Van Der Gaag Bayesian belief networks: odds and ends
WO2022225579A1 (en) Variables &amp; implementations of solution automation &amp; interface analysis
JP5342708B1 (en) Anomaly detection method and apparatus
US20060106797A1 (en) System and method for temporal data mining
US20030200191A1 (en) Viewing multi-dimensional data through hierarchical visualization
WO2019050624A1 (en) Processing of computer log messages for visualization and retrieval
CN117131110B (en) Method and system for monitoring dielectric loss of capacitive equipment based on correlation analysis
Aydin et al. The prediction algorithm based on fuzzy logic using time series data mining method
Yang et al. A very fast decision tree algorithm for real-time data mining of imperfect data streams in a distributed wireless sensor network
Netzer et al. Intelligent anomaly detection of machine tools based on mean shift clustering
CN116523499A (en) Automatic fault diagnosis and prediction method and system based on data driving model
Zhao et al. A comparative study on unsupervised anomaly detection for time series: Experiments and analysis
Xiang et al. Reliable post-signal fault diagnosis for correlated high-dimensional data streams
US20160019267A1 (en) Using data mining to produce hidden insights from a given set of data
Furqan et al. Heart disease prediction using machine learning algorithms
Rajapaksha et al. Supervised machine learning algorithm selection for condition monitoring of induction motors
JP2020173525A (en) Risk response analysis system, risk response analysis method and risk response analysis program
JP5033155B2 (en) Similar partial sequence detection apparatus, similar partial sequence detection method, and similar partial sequence detection program
KR102224684B1 (en) System and method for generating prediction of technology transfer base on machine learning
Barach et al. Fuzzy decision trees in medical decision making support systems

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09849839

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09849839

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP