WO2021095222A1 - Threshold value generation device, threshold value generation method, and threshold value generation program - Google Patents

Info

Publication number
WO2021095222A1
Authority
WO
WIPO (PCT)
Prior art keywords
threshold
candidate group
determination accuracy
threshold value
candidates
Prior art date
Application number
PCT/JP2019/044818
Other languages
French (fr)
Japanese (ja)
Inventor
信秋 田中
西田 博幸
Original Assignee
三菱電機株式会社
三菱電機エンジニアリング株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 三菱電機株式会社, 三菱電機エンジニアリング株式会社
Priority to PCT/JP2019/044818 (WO2021095222A1)
Priority to JP2021555739A (JP7012913B2)
Priority to TW109114016A (TW202121205A)
Publication of WO2021095222A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning

Definitions

  • the present invention relates to a threshold generation device, a threshold generation method, and a threshold generation program.
  • the MT (Mahalanobis Taguchi) method is one of the most representative methods among them (see, for example, Non-Patent Document 1).
  • in the MT method, the distribution formed by the normal sample set in the feature space is learned in advance as the reference space, and at the time of judgment, a sample is identified as normal or abnormal depending on how far the observed feature vector deviates from the reference space.
  • the Mahalanobis distance indicates how much a sample deviates from the reference space.
  • a small Mahalanobis distance indicates that the sample is close to normal, and a large one indicates that it is close to abnormal. That is, the Mahalanobis distance can be interpreted as a value indicating the degree of abnormality of the sample.
  • a method for setting a threshold value for discriminating between normal and abnormal with respect to the degree of abnormality has not been established, and in many cases, trial and error is required to find an appropriate threshold value.
  • Non-Patent Document 2 describes a method of learning the latent distribution of features of a normal sample set using a variational autoencoder and determining normality or abnormality based on the degree of deviation from that distribution.
  • like the MT method, the variational autoencoder ultimately outputs a value indicating the degree of abnormality, but Non-Patent Document 2 does not specifically describe how an appropriate threshold value should be set for this degree of abnormality.
  • the first is to establish a method to reflect the user's orientation in the threshold.
  • when the degrees of abnormality of the normal sample set and the abnormal sample set are plotted on a number line, as shown in FIG. 1, there may be a region where the degrees of abnormality of the normal samples (○ marks) and those of the abnormal samples (x marks) are mixed. In such a case, no threshold value can completely distinguish between normal and abnormal, and it is desirable to consider the trade-off between the erroneous judgment rate of normal samples and the oversight rate of abnormal samples. Which of these criteria is emphasized depends on the user. Therefore, it is desirable to establish a method that can reflect the user's orientation in the threshold generation process.
  • the second is to establish a method for determining the optimum threshold value after reflecting the above-mentioned user orientation, that is, under the constraint specified by the user. For example, when generating a "threshold value at which the oversight rate of abnormal samples is 10%", a simple method is to list various threshold candidates and select the one whose oversight rate is closest to 10%. However, when the degrees of abnormality of the normal samples and the abnormal samples can be completely separated on the number line, as shown in FIG. 2, and the oversight rate of abnormal samples can be made smaller than 10% without side effects, this method has the problem that a "better threshold" clearly exists but is not selected. Similar problems occur when the region where the degrees of abnormality of normal samples and abnormal samples are mixed is small, or when few samples are available for determining the threshold. Therefore, it is desirable to establish a method that can determine an appropriate threshold in any situation.
  • An object of the present invention is to obtain an appropriate threshold value based on a specified constraint.
  • the threshold generation device includes: a threshold candidate group generation unit that generates a first threshold candidate group including one or more threshold candidates based on a plurality of degrees of abnormality assigned to a plurality of samples; a first determination accuracy calculation unit that calculates a first determination accuracy of each of the one or more threshold candidates included in the first threshold candidate group; a constraint designation unit that specifies a constraint on the first determination accuracy; a first threshold selection unit that selects one or more threshold candidates from the first threshold candidate group based on the constraint and generates a second threshold candidate group including the selected one or more threshold candidates; a second determination accuracy calculation unit that calculates a second determination accuracy of each of the one or more threshold candidates included in the second threshold candidate group; and a second threshold selection unit that outputs, as a final threshold, a threshold candidate selected from the second threshold candidate group based on the second determination accuracy.
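  • the threshold generation method includes the steps of: generating a first threshold candidate group including one or more threshold candidates based on a plurality of degrees of abnormality assigned to a plurality of samples; calculating a first determination accuracy of each of the one or more threshold candidates included in the first threshold candidate group; specifying a constraint on the first determination accuracy; selecting one or more threshold candidates from the first threshold candidate group based on the constraint and generating a second threshold candidate group including the selected one or more threshold candidates; calculating a second determination accuracy of each of the one or more threshold candidates included in the second threshold candidate group; and outputting, as a final threshold, a threshold candidate selected from the second threshold candidate group based on the second determination accuracy.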
  • an appropriate threshold value based on a specified constraint can be obtained.
  • FIG. 7 is a diagram showing that, in the example of FIGS. 3 to 6, a plurality of degrees of abnormality are acquired from one sample. FIG. 8 is a diagram showing that one representative value is calculated from the plurality of degrees of abnormality acquired from one sample and is assigned as the degree of abnormality of the sample. FIG. 9 is a functional block diagram schematically showing the configuration of the threshold generation device according to the embodiment of the present invention. FIG. 10 is a diagram showing an example of the threshold candidates included in the first threshold candidate group generated by the threshold candidate group generation unit.
  • the threshold value generator, the threshold value generation method, and the threshold value generation program according to the embodiment of the present invention will be described below with reference to the drawings.
  • the following embodiments are merely examples, and various modifications can be made within the scope of the present invention.
  • the threshold generation device generates a threshold value used when determining whether the device is in a normal state or an abnormal state.
  • the threshold generation device according to the present embodiment generates the threshold value based on, for example, the degree of abnormality of the device obtained by analyzing the waveform of the operating sound or vibration of the device, detected with an acoustic sensor or a vibration sensor (that is, a measuring device).
  • the threshold generation device performs a quality inspection of the motor based on the vibration of the motor which is a device.
  • the threshold value generator detects the vibration generated by the motor, and when the degree of abnormality obtained as a result of the analysis is equal to or higher than the threshold value, the motor is considered to be defective.
  • the threshold generation device generates a spectrogram as shown in FIG. 3 by performing wavelet transform on the vibration waveform measured by the measuring instrument.
  • FIG. 3 is a diagram showing an example of a spectrogram showing the vibration of the motor.
  • in FIG. 3, the vertical axis represents time, the horizontal axis represents frequency, and the shading density represents the strength of vibration.
  • FIG. 4 is a diagram schematically showing the spectrogram shown in FIG. 3 as a matrix.
  • the vertical axis represents time and the horizontal axis represents frequency.
  • one square box corresponds to one point on the spectrogram.
  • it is assumed that a power value is assigned to each of the plurality of boxes.
  • the time / frequency resolutions in FIGS. 3 and 4 are different from each other.
  • FIG. 5 is a diagram showing an example in which the matrix shown in FIG. 4 is regarded as a feature vector in which the characteristics of the timbre at each time are quantified.
  • the vertical axis represents time and the horizontal axis represents frequency.
  • the matrix shown in FIG. 4 can be regarded as a feature vector in which the characteristics of the timbre are quantified for each time.
  • a plurality of feature vectors shown in FIG. 5 are generated from the vibration waveform of one motor.
  • the method of obtaining the feature vector from the vibration waveform is not limited to the method by wavelet transform as described above.
  • as a method for obtaining a feature vector from a vibration waveform, for example, Fourier transform, filter bank analysis, cepstrum analysis, LPC (Linear Predictive Coding) analysis, or the like can be used.
  • as a method of obtaining a feature vector from a vibration waveform, it is also possible to construct a feature vector by combining various acoustic features.
  • the various acoustic features are, for example, a peak value, an RMS (Root Mean Square) value, a fundamental frequency, and the like.
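As a rough illustration of combining acoustic features into a feature vector, the following minimal sketch computes a peak value and an RMS value for one frame of a waveform. The function name `acoustic_features` and the sample frame are hypothetical, and a fundamental-frequency estimate is deliberately left out to keep the example short.

```python
import math

def acoustic_features(waveform):
    """Return [peak value, RMS value] for one frame of a vibration waveform."""
    # Peak value: largest absolute amplitude in the frame.
    peak = max(abs(x) for x in waveform)
    # RMS value: square root of the mean of the squared amplitudes.
    rms = math.sqrt(sum(x * x for x in waveform) / len(waveform))
    return [peak, rms]

# Hypothetical frame of four samples, used only for illustration.
frame = [0.0, 3.0, -4.0, 0.0]
features = acoustic_features(frame)
```

Applying the function to successive frames of one waveform would give one such feature vector per frame.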
  • FIG. 6 is a diagram showing an anomaly degree model learner 10 that receives a set of multiple feature vectors obtained from normal motor samples (i.e., normal samples) and abnormal motor samples (i.e., abnormal samples) and generates an anomaly degree model. As shown in FIGS. 3 to 5, analysis is performed using a plurality of samples including normal samples and abnormal samples, and the set of feature vectors obtained as a result is input to the anomaly degree model learner 10.
  • the anomaly degree model is a model in which one feature vector is input and a single anomaly degree is calculated and output by performing some conversion on the feature vector.
  • the anomaly model learner 10 constructs such an anomaly model from a set of given feature vectors.
  • as an example, linear discriminant analysis (LDA) is used.
  • in LDA, a set of normal class feature vectors and a set of abnormal class feature vectors are given as training data, and a projection vector that emphasizes the difference between normal and abnormal is found based on the distribution of those feature vectors.
  • the anomaly model learner 10 first obtains the mean vector of all the feature vectors included in both the normal class and the anomaly class. Similarly, the anomaly model learner 10 finds the standard deviation for each element of all feature vectors and stores these values in a standard deviation vector. Both vectors are N-dimensional when the feature vector is N-dimensional (N is a positive integer).
  • the anomaly degree model learner 10 normalizes all feature vectors using these two vectors. Normalization is to adjust the vector so that the mean of each element of the vector is 0 and the standard deviation is 1.
  • specifically, normalization replaces each feature vector x with (x − mean vector) ÷ (standard deviation vector), where "÷" is the division for each element of the vector.
  • the anomaly model learner 10 normalizes all the feature vectors as described above, and then calculates the mean vector of the normal class and the mean vector of the anomaly class.
  • the anomaly model learner 10 then calculates the covariance matrix of the normal class and the covariance matrix of the anomaly class, based on the mean vector of the normal class and the mean vector of the anomaly class, respectively.
  • the anomaly model learner 10 uses equations (1) and (2) to obtain the projection vector.
  • This equation (3) corresponds to the anomaly model in LDA.
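Equations (1) to (3) are not reproduced in this text, so the following is only a minimal sketch of the procedure described above, assuming the textbook Fisher formulation: the within-class scatter is taken as the sum of the two class covariance matrices, and the projection vector points from the normal-class mean toward the abnormal-class mean. The function names and the sample data are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def fit_lda_anomaly_model(normal, abnormal):
    """Fit a Fisher-LDA anomaly model; each row is one feature vector."""
    all_vectors = np.vstack([normal, abnormal])
    mu = all_vectors.mean(axis=0)      # mean vector over both classes
    sigma = all_vectors.std(axis=0)    # per-element standard deviation
    z_normal = (normal - mu) / sigma   # element-wise normalization
    z_abnormal = (abnormal - mu) / sigma
    mean_n = z_normal.mean(axis=0)     # normal-class mean vector
    mean_a = z_abnormal.mean(axis=0)   # abnormal-class mean vector
    # Assumed form: within-class scatter = sum of the class covariances.
    s_w = np.cov(z_normal, rowvar=False) + np.cov(z_abnormal, rowvar=False)
    # Assumed form: Fisher projection vector, solving s_w @ w = mean_a - mean_n.
    w = np.linalg.solve(s_w, mean_a - mean_n)
    return mu, sigma, w

def anomaly_degree(x, mu, sigma, w):
    """Project one normalized feature vector onto w to get a scalar score."""
    return float(w @ ((np.asarray(x) - mu) / sigma))

# Hypothetical two-dimensional training data, for illustration only.
normal = np.array([[1.0, 2.0], [2.0, 1.0], [1.5, 2.5], [2.5, 1.5]])
abnormal = np.array([[4.0, 5.0], [5.0, 4.0], [4.5, 5.5], [5.5, 4.5]])
mu, sigma, w = fit_lda_anomaly_model(normal, abnormal)
```

With this orientation of the projection vector, abnormal samples receive larger scores on average than normal samples, which matches the convention that a larger degree of abnormality means closer to abnormal.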
  • the analysis method used by the anomaly model learner 10 is not limited to LDA.
  • the anomaly model learner 10 can use, for example, a support vector machine, a neural network, or a Gaussian mixture model.
  • although the abnormality degree model learner 10 described above uses both the normal class and the abnormality class as training data, it is also possible to adopt a method of learning using only the data of a single class.
  • in that case, for example, an MT method, a principal component analysis (PCA), an autoencoder, a one-class support vector machine, or the like can be used as the abnormality degree model learner 10.
  • FIG. 7 is a diagram showing that a plurality of degrees of abnormality are acquired from one motor as one sample in the example shown in FIGS. 3 to 6.
  • a single feature vector is transformed into a single degree of abnormality.
  • a plurality of feature vectors can be obtained from the vibration waveform of one motor. Therefore, in this case, as shown in FIG. 7, a plurality of degrees of abnormality can be obtained from one motor.
  • FIG. 8 is a diagram showing that one representative value is calculated based on a plurality of abnormality degrees acquired from one motor as one sample, and this is assigned as the abnormality degree of the motor.
  • the plurality of abnormality degrees shown in FIG. 7 obtained from one motor are aggregated by some method into a single representative value, and this representative value is assigned as the degree of abnormality of the motor.
  • the simplest method for calculating the representative value is to use the average value of the plurality of abnormality degrees. Any other statistic, such as the maximum, standard deviation, or mode, can also be used as the representative value.
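The aggregation step can be sketched as follows, using only the Python standard library. The function name `representative_degree`, the method keys, and the sample values are hypothetical; the standard deviation here is the population standard deviation, which is one possible choice.

```python
import statistics

def representative_degree(degrees, method="mean"):
    """Aggregate the abnormality degrees of one sample into one value."""
    aggregators = {
        "mean": statistics.mean,      # simplest choice: the average
        "max": max,                   # largest degree of abnormality
        "stdev": statistics.pstdev,   # population standard deviation
        "mode": statistics.mode,      # most frequent value
    }
    return aggregators[method](degrees)

# Hypothetical abnormality degrees obtained from one motor.
motor_degrees = [1.0, 2.0, 2.0, 5.0]
mean_value = representative_degree(motor_degrees)
max_value = representative_degree(motor_degrees, "max")
mode_value = representative_degree(motor_degrees, "mode")
```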
  • Threshold generation device 20: in the following, one degree of abnormality is assigned to each motor, which is one sample, by using the method described above. As shown in FIGS. 1 and 2, a threshold generation device and a method are described that automatically generate an appropriate threshold value reflecting the user's orientation by using as many degrees of abnormality as there are motors.
  • FIG. 9 is a block diagram schematically showing the configuration of the threshold generation device 20 according to the present embodiment.
  • the threshold value generation device 20 is a device capable of implementing the threshold value generation method according to the present embodiment.
  • the threshold value generation device 20 includes a threshold value candidate group generation unit 21, a constraint designation unit 22, a first threshold value selection unit 23, a first determination accuracy calculation unit 24, a second threshold value selection unit 25, and a second determination accuracy calculation unit 26.
  • the threshold candidate group generation unit 21 generates a first threshold candidate group including one or more threshold candidates based on a plurality of degrees of abnormality assigned to each of the plurality of samples.
  • the plurality of samples are a plurality of motors.
  • the first determination accuracy calculation unit 24 calculates the first determination accuracy of each of one or more threshold candidates included in the first threshold candidate group.
  • the constraint specifying unit 22 specifies a constraint on the first determination accuracy. For example, the constraint specifying unit 22 accepts the input of a numerical value and determines the constraint based on the numerical value. The input of the numerical value is performed by the user, for example.
  • the first threshold selection unit 23 selects one or more threshold candidates from the first threshold candidate group based on the constraint, and generates a second threshold candidate group including the one or more selected threshold candidates. For example, the first threshold selection unit 23 selects threshold candidates satisfying the constraint from the first threshold candidate group, and generates a second threshold candidate group including the one or more selected threshold candidates.
  • the second determination accuracy calculation unit 26 calculates the second determination accuracy of each of one or more threshold candidates included in the second threshold candidate group.
  • the second threshold value selection unit 25 outputs the threshold value candidates selected from the second threshold value candidate group based on the second determination accuracy as the final threshold value. For example, the second threshold value selection unit 25 selects the threshold value candidate having the maximum second determination accuracy from the second threshold value candidate group and outputs it as the final threshold value.
  • Constraint designation unit 22: the constraint on the first determination accuracy specified by the constraint specifying unit 22 is determined based on, for example, a numerical value input by the user.
  • the constraint is, for example, a condition as shown below.
  • the user can select one of the following constraints (A1) to (A4) and freely specify the value E in the constraint.
  • (A1) Set the oversight of abnormal samples to E% or less.
  • (A2) Set the false detection of normal samples to E% or less.
  • (A3) Set the detection rate of abnormal samples to E% or more.
  • (A4) Set the detection rate of normal samples to E% or more.
  • a user who wants to prioritize avoiding overlooking an abnormal sample may select the constraint (A1) and set the value E at that time to a small value. Further, the user who wants to prioritize the avoidance of false detection of the normal sample may select the constraint (A2) and set the value E at that time to a small value.
  • TPR: true positive rate (True Positive Rate), the proportion of abnormal samples that are correctly determined to be abnormal.
  • TNR: true negative rate (True Negative Rate), the proportion of normal samples that are correctly determined to be normal.
  • the problem of selecting the optimum threshold value that satisfies the constraint specified as described above can be interpreted as the problem of selecting the optimum threshold value under the constraint that "one of TPR and TNR is set to an arbitrary value or more."
  • the constraint (A1) that "the oversight of an abnormal sample is 10% or less” can be realized by selecting the optimum threshold value under the "TPR is 90% or more” constraint.
  • the above-mentioned four types of constraints (A1) to (A4) can be replaced with the following constraints (B1) to (B4), in each of which "one of TPR and TNR is set to an arbitrary value or more."
  • constraints (A1) to (A4) are equivalent to the constraints (B1) to (B4), respectively.
  • (B1) Set TPR to (100-E)% or more.
  • (B2) Set TNR to (100-E)% or more.
  • (B3) Set TPR to E% or more.
  • (B4) Set TNR to E% or more.
  • for example, the constraint (A1) with E = 20, that is, "the oversight of abnormal samples is 20% or less," can be replaced with the TPR and TNR constraint (B1), that is, "TPR is 80% or more."
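The correspondence between constraints (A1) to (A4) and (B1) to (B4) amounts to a small lookup, sketched below. The function name `to_rate_constraint` and the tuple representation (metric name, minimum percentage) are assumptions made for illustration.

```python
def to_rate_constraint(kind, e_percent):
    """Convert constraint (A1)-(A4) with value E into (metric, minimum %)."""
    table = {
        "A1": ("TPR", 100 - e_percent),  # oversight of abnormal <= E%
        "A2": ("TNR", 100 - e_percent),  # false detection of normal <= E%
        "A3": ("TPR", e_percent),        # detection rate of abnormal >= E%
        "A4": ("TNR", e_percent),        # detection rate of normal >= E%
    }
    return table[kind]

# Constraint (A1) with E = 20 corresponds to "TPR is 80% or more".
constraint = to_rate_constraint("A1", 20)
```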
  • FIG. 10 is a diagram showing an example of threshold candidates C1 to C13 included in the first threshold candidate group generated by the threshold candidate group generation unit 21.
  • FIG. 11 is a diagram showing other examples of threshold candidates C21 to C25 included in the first threshold candidate group generated by the threshold candidate group generation unit 21.
  • the threshold candidate group generation unit 21 generates a first threshold candidate group including one or more threshold candidates in order to obtain a threshold that satisfies the specified constraint.
  • various methods for generating threshold candidates can be considered. As one example, shown in FIG. 10, the midpoints between all adjacent degrees of abnormality plotted on a number line can be enumerated as threshold candidates.
  • when the degrees of abnormality of m samples are given, m-1 threshold candidates are generated, where m is a positive integer.
  • in the example of FIG. 10, the degrees of abnormality of 14 samples are given, and as a result, 13 threshold candidates C1 to C13 are generated.
  • the advantage of this method is that when the degrees of abnormality of the normal samples and those of the abnormal samples are significantly different from each other, as shown in FIG. 11, the threshold candidate C23 having the maximum margin for distinguishing the two is generated. This improves generalization performance for unknown samples.
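The midpoint enumeration described above can be sketched in a few lines. The function name `midpoint_candidates` and the sample values are hypothetical.

```python
def midpoint_candidates(degrees):
    """Return the midpoints between adjacent degrees of abnormality."""
    ordered = sorted(degrees)
    # Pair each value with its right-hand neighbor on the number line.
    return [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]

# Four hypothetical degrees of abnormality give three candidates (m - 1).
candidates = midpoint_candidates([4.0, 1.0, 8.0, 2.0])
```

If two samples share the same degree of abnormality, this sketch produces a candidate equal to that value; how such ties should be handled is not stated in the text.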
  • FIG. 12 is a diagram showing an example of the first determination accuracy calculated by the first determination accuracy calculation unit 24 and the constraints specified by the constraint designation unit 22 as Table 1.
  • the first threshold value selection unit 23 uses the first determination accuracy calculation unit 24 to obtain the first determination accuracy for each threshold value candidate for the first threshold value candidate group obtained as described above.
  • a set of TPR and TNR is used as a specific example of the first determination accuracy.
  • the pair of TPR and TNR is obtained as the first determination accuracy for the threshold candidates C1 to C13.
  • among these threshold candidates, those satisfying the above-mentioned constraint "TPR is 80% or more" are selected and output as the second threshold candidate group.
  • FIG. 13 is a flowchart showing the operation of the first threshold value selection unit 23 and the first determination accuracy calculation unit 24.
  • the first threshold value selection unit 23 selects one threshold value candidate that has not yet been selected from the first threshold value candidate group (step S11), and the first determination accuracy calculation unit 24 calculates the first determination accuracy for the selected threshold value candidate (step S12).
  • the first threshold value selection unit 23 determines whether or not the first determination accuracy satisfies the specified constraint (step S13). If the constraint is satisfied (YES in step S13), the first threshold value selection unit 23 adds the threshold candidate to the second threshold candidate group (step S14) and then determines whether or not all the threshold candidates have been selected (step S15).
  • if the first determination accuracy does not satisfy the specified constraint (NO in step S13), the first threshold selection unit 23 does not add the threshold candidate to the second threshold candidate group and proceeds to determine whether or not all the threshold candidates have been selected (step S15).
  • when all the threshold candidates have been selected (YES in step S15), the first threshold selection unit 23 outputs the second threshold candidate group to the second threshold selection unit 25 (step S16). If an unselected threshold candidate remains (NO in step S15), the process returns to step S11.
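The loop of steps S11 to S16 can be sketched as follows. It assumes that a sample is judged abnormal when its degree of abnormality is equal to or higher than the threshold, as in the motor example, and that the constraint has the form "TPR (or TNR) is at least a given percentage." The function names and the sample data are hypothetical.

```python
def rates(threshold, normal, abnormal):
    """Return (TPR, TNR) in percent for one threshold candidate."""
    # A sample is judged abnormal when its degree is >= the threshold.
    tpr = 100.0 * sum(d >= threshold for d in abnormal) / len(abnormal)
    tnr = 100.0 * sum(d < threshold for d in normal) / len(normal)
    return tpr, tnr

def select_candidates(candidates, normal, abnormal, metric, minimum):
    """Keep the candidates whose first determination accuracy meets the constraint."""
    selected = []
    for c in candidates:                        # steps S11 and S15
        tpr, tnr = rates(c, normal, abnormal)   # step S12
        value = tpr if metric == "TPR" else tnr
        if value >= minimum:                    # step S13
            selected.append(c)                  # step S14
    return selected                             # step S16

# Hypothetical degrees of abnormality and midpoint candidates.
normal = [1.0, 2.0, 3.0, 5.0]
abnormal = [4.0, 6.0, 7.0, 8.0, 9.0]
first_group = [1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5]
second_group = select_candidates(first_group, normal, abnormal, "TPR", 80)
```

In this toy data, the candidates up to 5.5 detect at least four of the five abnormal samples, so they form the second threshold candidate group.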
  • FIG. 14 is a diagram showing an example of the second determination accuracy calculated by the second determination accuracy calculation unit 26 as Table 2.
  • the second determination accuracy calculation unit 26 obtains the second determination accuracy for the threshold candidates C1 to C6 included in the second threshold candidate group. Since the second threshold value selection unit 25 uniquely selects the final threshold value from the second threshold value candidate group, these threshold candidates are evaluated by a scale different from the first determination accuracy. This evaluation is performed by the second determination accuracy calculation unit 26.
  • as the second determination accuracy, "the smaller value of TPR and TNR" is used.
  • the threshold candidate having the highest second determination accuracy is C6. Therefore, this threshold candidate C6 is output as the final threshold.
  • FIG. 15 is a flowchart showing the operation of the second threshold value selection unit 25 and the second determination accuracy calculation unit 26.
  • the second threshold value selection unit 25 selects one threshold value candidate that has not yet been selected from the second threshold value candidate group (step S21), and the second determination accuracy calculation unit 26 calculates the second determination accuracy for the selected threshold value candidate (step S22).
  • the second threshold value selection unit 25 determines whether or not the second determination accuracy is greater than the maximum value of the second determination accuracy stored in the memory (step S23). If it is greater (YES in step S23), the second threshold value selection unit 25 stores (that is, updates) the maximum value of the second determination accuracy (step S24) and then determines whether or not all the threshold candidates have been selected (step S25). If it is not greater (NO in step S23), the second threshold selection unit 25 does not update the maximum value of the second determination accuracy and proceeds to determine whether or not all the threshold candidates have been selected (step S25).
  • when all the threshold candidates have been selected (YES in step S25), the second threshold selection unit 25 outputs the threshold candidate having the maximum second determination accuracy as the final threshold (step S26). If an unselected threshold candidate remains (NO in step S25), the process returns to step S21.
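Steps S21 to S26 amount to a running-maximum search, sketched below with "the smaller value of TPR and TNR" as the second determination accuracy. As before, a sample is assumed to be judged abnormal when its degree of abnormality is equal to or higher than the threshold; the function names and data are hypothetical.

```python
def rates(threshold, normal, abnormal):
    """Return (TPR, TNR) in percent for one threshold candidate."""
    tpr = 100.0 * sum(d >= threshold for d in abnormal) / len(abnormal)
    tnr = 100.0 * sum(d < threshold for d in normal) / len(normal)
    return tpr, tnr

def final_threshold(candidates, normal, abnormal):
    """Return the candidate that maximizes min(TPR, TNR)."""
    best, best_score = None, -1.0
    for c in candidates:                           # steps S21 and S25
        score = min(rates(c, normal, abnormal))    # step S22
        if score > best_score:                     # step S23
            best, best_score = c, score            # step S24
    return best                                    # step S26

# Hypothetical data and second threshold candidate group.
normal = [1.0, 2.0, 3.0, 5.0]
abnormal = [4.0, 6.0, 7.0, 8.0, 9.0]
second_group = [1.5, 2.5, 3.5, 4.5, 5.5]
threshold = final_threshold(second_group, normal, abnormal)
```

Because the comparison in step S23 is strict, the first candidate reaching the maximum is kept when several candidates tie; the text does not specify a tie-breaking rule, so this is a design choice of the sketch.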
  • the evaluation scale used as the first determination accuracy and the second determination accuracy may be other than those based on TPR or TNR.
  • the evaluation scale can utilize any statistic or a combination thereof, such as accuracy of correct answer, accuracy rate, and F value (F-score or F-measure).
  • as described above, with the threshold value generator 20 according to the present embodiment, the user's orientation is reflected in the threshold value in the form of a constraint on the first determination accuracy, and the final threshold value is selected from among the candidates satisfying the constraint by using the second determination accuracy. This makes it possible to select an appropriate threshold value while reflecting the user's orientation.
  • the method of specifying the range of numerical values that can be taken by the judgment accuracy is intuitively understandable to the user, and the user's labor for adjusting the threshold value is small.
  • with the form of specifying a range of numerical values, it is possible to leave room for the system to select a more appropriate threshold value within that range. This makes it possible to both reflect the user's orientation and optimize the threshold value.
  • the final threshold value surely reflects the user's orientation.
  • since the threshold value finally output is narrowed down to one, additional work such as the user selecting the final threshold value from a plurality of presented threshold value candidates becomes unnecessary, and the labor of the user can be reduced.
  • the second criterion, "the smaller value of TPR and TNR," is always 0 for useless threshold candidates such as one that "determines that any input sample is normal" or one that "determines that any input sample is abnormal." Therefore, by avoiding the selection of such useless threshold candidates, practical thresholds can be expected to be generated in various situations.
  • when the normal samples and the abnormal samples can be completely separated, the judgment accuracy is 100% wherever the threshold value is set between the normal sample closest to the abnormal samples and the abnormal sample closest to the normal samples.
  • in that case, the generalization performance for an unknown sample can be maximized by setting the threshold value midway between the two, as in the objective function for optimizing the support vector machine.
  • FIG. 16 is a diagram showing an example of the hardware configuration of the threshold value generation device 20 according to the present embodiment.
  • the threshold generation device 20 has a memory 32 for storing a program and a processor 31 such as a CPU (Central Processing Unit) for executing the program.
  • the program can include a threshold generation program for implementing the threshold generation method according to the present embodiment. All or part of the function of the threshold generator 20 shown in FIG. 9 can be realized by the processor 31 that executes the program. All or part of the function of the threshold generation device 20 shown in FIG. 16 may be realized by a semiconductor integrated circuit.
  • the threshold generation device 20 includes a display 34 as a display means as an interface for the user to specify a constraint on the determination accuracy, an input device 35 such as a mouse, a keyboard, and a touch panel, and a hard disk 33 as a storage device. You may have.
  • 20 Threshold generation device, 21 Threshold candidate group generation unit, 22 Constraint designation unit, 23 1st threshold selection unit, 24 1st judgment accuracy calculation unit, 25 2nd threshold selection unit, 26 2nd judgment accuracy calculation unit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Testing And Monitoring For Control Systems (AREA)
  • Testing Of Devices, Machine Parts, Or Other Structures Thereof (AREA)

Abstract

A threshold value generation device (20) has: a threshold value candidate group generation unit (21) for generating a first threshold value candidate group that includes one or more threshold value candidates (C1-C13), on the basis of a plurality of abnormality levels respectively assigned to a plurality of samples; a first determination accuracy calculation unit (24) for calculating first determination accuracies (TPR, TNR) respectively for the one or more threshold value candidates (C1-C13) that are included in the first threshold value candidate group; a restriction designation unit (22) for designating restrictions on the first determination accuracies; a first threshold value selection unit (23) for selecting the one or more threshold value candidates from the first threshold value candidate group on the basis of the restrictions and generating a second threshold value candidate group that includes the selected one or more threshold value candidates; a second determination accuracy calculation unit (26) for calculating second determination accuracies (TPR, TNR) of the one or more threshold value candidates (C1-C6) that are included in the second threshold value candidate group; and a second threshold value selection unit (25) for outputting, as a final threshold value, the threshold value candidate that is selected from the second threshold value candidate group on the basis of the second determination accuracies.

Description

Threshold generation device, threshold generation method, and threshold generation program
The present invention relates to a threshold generation device, a threshold generation method, and a threshold generation program.
Various methods have been developed for estimating the soundness of equipment by measuring its operating sound or vibration with an acoustic sensor or a vibration sensor and analyzing the resulting waveform. The MT (Mahalanobis-Taguchi) method is one of the most representative of these (see, for example, Non-Patent Document 1). In the MT method, the distribution formed by a set of normal samples in the feature space is learned in advance as a reference space, and at determination time a sample is identified as normal or abnormal according to how far the observed feature vector deviates from the reference space.
The MT method yields the Mahalanobis distance, which indicates how far a sample deviates from the reference space. A small Mahalanobis distance indicates that the sample is close to normal, and a large one indicates that the sample is close to abnormal. In other words, the Mahalanobis distance can be interpreted as a value indicating the degree of abnormality of the sample. However, no method has been established for setting a threshold that discriminates between normal and abnormal on the basis of this degree of abnormality, and in many cases finding an appropriate threshold requires trial and error.
This threshold-setting problem arises not only in classical methods such as the MT method but also in modern methods such as deep learning. For example, Non-Patent Document 2 describes a method that uses a variational autoencoder to learn the latent distribution of the features of a set of normal samples and determines normality or abnormality from the degree of deviation from that distribution. However, what the variational autoencoder ultimately outputs is, as in the MT method, a value indicating the degree of abnormality, and Non-Patent Document 2 does not specifically describe how to set an appropriate threshold for it.
To determine an appropriate threshold for the degree of abnormality automatically, it is desirable to solve the following problems.
The first is to establish a method for reflecting the user's preference in the threshold. In many cases, when the degrees of abnormality of a set of normal samples and a set of abnormal samples are plotted on a number line, a region arises in which the degrees of abnormality of the normal samples (circles) and of the abnormal samples (crosses) are intermixed, as shown in FIG. 1. In such a case, no threshold can completely separate normal from abnormal, and it is desirable to consider the trade-off between the misjudgment rate for normal samples and the miss rate for abnormal samples. Which of these criteria to emphasize differs from user to user. It is therefore desirable to establish a method that can reflect the user's preference in the threshold generation process.
The second is to establish a method for determining the optimal threshold while reflecting the user's preference described above, that is, the constraint specified by the user. For example, to generate "a threshold at which the miss rate for abnormal samples is 10%," a simple approach is to enumerate various threshold candidates and select the one whose miss rate for abnormal samples is 10% or closest to 10%. With this approach, however, when the degrees of abnormality of the normal and abnormal samples are completely separable on the number line, as shown in FIG. 2, and the miss rate for abnormal samples can be made smaller than 10% without side effects, a clearly "better threshold" exists but is not selected. A similar problem arises when the region in which the degrees of abnormality of normal and abnormal samples are intermixed is small, or when few samples are available for determining the threshold. It is therefore desirable to establish a method that can determine an appropriate threshold in any situation.
An object of the present invention is to obtain an appropriate threshold based on a specified constraint.
Means for solving the problem
A threshold generation device according to one aspect of the present invention includes: a threshold candidate group generation unit that generates a first threshold candidate group including one or more threshold candidates on the basis of a plurality of degrees of abnormality respectively assigned to a plurality of samples; a first determination accuracy calculation unit that calculates a first determination accuracy for each of the one or more threshold candidates included in the first threshold candidate group; a constraint specifying unit that specifies a constraint on the first determination accuracy; a first threshold selection unit that selects one or more threshold candidates from the first threshold candidate group on the basis of the constraint and generates a second threshold candidate group including the selected one or more threshold candidates; a second determination accuracy calculation unit that calculates a second determination accuracy for each of the one or more threshold candidates included in the second threshold candidate group; and a second threshold selection unit that outputs, as a final threshold, the threshold candidate selected from the second threshold candidate group on the basis of the second determination accuracy.
A threshold generation method according to another aspect of the present invention includes the steps of: generating a first threshold candidate group including one or more threshold candidates on the basis of a plurality of degrees of abnormality respectively assigned to a plurality of samples; calculating a first determination accuracy for each of the one or more threshold candidates included in the first threshold candidate group; specifying a constraint on the first determination accuracy; selecting one or more threshold candidates from the first threshold candidate group on the basis of the constraint and generating a second threshold candidate group including the selected one or more threshold candidates; calculating a second determination accuracy for each of the one or more threshold candidates included in the second threshold candidate group; and outputting, as a final threshold, the threshold candidate selected from the second threshold candidate group on the basis of the second determination accuracy.
According to the present invention, an appropriate threshold based on a specified constraint can be obtained.
A diagram showing an example in which the degrees of abnormality of normal samples and abnormal samples are plotted on a number line.
A diagram showing another example in which the degrees of abnormality of normal samples and abnormal samples are plotted on a number line.
A diagram showing an example of a spectrogram of motor vibration.
A diagram schematically representing the spectrogram shown in FIG. 3 as a matrix.
A diagram showing an example in which the matrix shown in FIG. 4 is regarded as timbre feature vectors for each time.
A diagram showing an abnormality degree model learner that receives a set of feature vectors based on normal samples and abnormal samples and generates an abnormality degree model.
A diagram showing that, in the example shown in FIGS. 3 to 6, a plurality of degrees of abnormality are obtained from one sample.
A diagram showing that one representative value is calculated from the plurality of degrees of abnormality obtained from one sample and is assigned as the degree of abnormality of that sample.
A functional block diagram schematically showing the configuration of the threshold generation device according to an embodiment of the present invention.
A diagram showing an example of threshold candidates included in the first threshold candidate group generated by the threshold candidate group generation unit.
A diagram showing another example of threshold candidates included in the second threshold candidate group generated by the first threshold selection unit.
A diagram showing, as Table 1, an example of the first determination accuracy calculated by the first determination accuracy calculation unit and the constraint specified by the constraint specifying unit.
A flowchart showing the operation of the first threshold selection unit and the first determination accuracy calculation unit.
A diagram showing, as Table 2, an example of the second determination accuracy calculated by the second determination accuracy calculation unit.
A flowchart showing the operation of the second threshold selection unit and the second determination accuracy calculation unit.
A diagram showing an example of the hardware configuration of the threshold generation device according to the embodiment.
A threshold generation device, a threshold generation method, and a threshold generation program according to an embodiment of the present invention are described below with reference to the drawings. The following embodiment is merely an example, and various modifications are possible within the scope of the present invention.
The threshold generation device according to the present embodiment generates a threshold used to determine whether a device is in a normal state or an abnormal state. For example, the threshold is applied to the degree of abnormality of the device, which is obtained by analyzing a waveform of the operating sound or vibration of the device detected with an acoustic sensor or a vibration sensor (that is, a measuring instrument). As a specific application example, the present embodiment assumes a situation in which the quality of a motor, which is the device, is inspected on the basis of the vibration of the motor. The vibration generated by the motor is detected and analyzed, and when the resulting degree of abnormality is equal to or greater than the threshold, the motor is regarded as defective.
<< Derivation of the degree of abnormality >>
First, a specific example of a method for calculating the degree of abnormality from the vibration waveform of a motor serving as a sample is described. The threshold generation device according to the present embodiment first applies a wavelet transform to the vibration waveform measured by the measuring instrument to generate a spectrogram such as the one shown in FIG. 3. FIG. 3 is a diagram showing an example of a spectrogram representing the vibration of a motor. In FIG. 3, the vertical axis represents time, the horizontal axis represents frequency, and the darkness of the shading represents the strength of the vibration.
FIG. 4 is a diagram schematically representing the spectrogram shown in FIG. 3 as a matrix. In FIG. 4, the vertical axis represents time and the horizontal axis represents frequency. Each square cell in FIG. 4 corresponds to one point on the spectrogram, and a power value is assumed to be assigned to each cell. For convenience of drawing, the time and frequency resolutions of FIG. 3 and FIG. 4 differ from each other.
FIG. 5 is a diagram showing an example in which the matrix shown in FIG. 4 is regarded as a set of feature vectors that quantify the characteristics of the timbre at each time. In FIG. 5, the vertical axis represents time and the horizontal axis represents frequency. As shown in FIG. 5, each row of the matrix shown in FIG. 4 can be regarded as a feature vector that quantifies the characteristics of the timbre at one time. In this way, the plurality of feature vectors shown in FIG. 5 are generated from the vibration waveform of one motor.
The method of obtaining feature vectors from a vibration waveform is not limited to the wavelet-transform method described above. For example, a Fourier transform, filter bank analysis, cepstrum analysis, or LPC (Linear Predictive Coefficient) analysis can also be used. A feature vector can also be constructed by combining various acoustic features, such as the peak value, the RMS (root mean square) value, and the fundamental frequency.
FIG. 6 is a diagram showing an abnormality degree model learner 10 that receives a set of feature vectors obtained from samples of normal motors (that is, normal samples) and samples of abnormal motors (that is, abnormal samples) and generates an abnormality degree model. A plurality of samples including normal samples and abnormal samples are analyzed as shown in FIGS. 3 to 5, and the resulting set of feature vectors is input to the abnormality degree model learner 10. An abnormality degree model takes one feature vector as input, applies some transformation to it, and outputs a single degree of abnormality. The abnormality degree model learner 10 constructs such an abnormality degree model from the given set of feature vectors. The following describes, as a specific example, the case where the abnormality degree model learner 10 uses linear discriminant analysis (LDA).
LDA is a method that is given a set of normal-class feature vectors and a set of abnormal-class feature vectors as training data and, based on the distribution of those feature vectors, finds the projection vector that most strongly emphasizes the difference between normal and abnormal. The abnormality degree model learner 10 first obtains the mean vector of all feature vectors contained in both the normal class and the abnormal class:
[Equation image: JPOXMLDOC01-appb-M000001]
Similarly, the abnormality degree model learner 10 obtains the standard deviation of each element over all feature vectors. The vector storing these per-element standard deviations is denoted as follows:
[Equation image: JPOXMLDOC01-appb-M000002]
When the feature vectors are N-dimensional (N is a positive integer), both of these vectors are also N-dimensional.
Next, the abnormality degree model learner 10 normalizes all feature vectors using the above-mentioned vectors:
[Equation image: JPOXMLDOC01-appb-M000003]
Normalization means adjusting the vectors so that each element has a mean of 0 and a standard deviation of 1.
Specifically, normalization uses the vectors
[Equation image: JPOXMLDOC01-appb-M000004]
to obtain
[Equation image: JPOXMLDOC01-appb-M000005]
where "÷" denotes element-wise division of vectors.
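The equation images for the normalization are not available in this text. As a minimal sketch, assuming the normalization subtracts the mean vector and divides element-wise by the standard deviation vector (as the surrounding description suggests), it could be written as follows. The function names and sample data are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import fmean, pstdev


def fit_normalizer(vectors):
    """Return the per-element mean and standard deviation over all feature vectors."""
    columns = list(zip(*vectors))
    mean = [fmean(col) for col in columns]
    # Assumption: population standard deviation; the text does not say which variant is used.
    std = [pstdev(col) for col in columns]
    return mean, std


def normalize(vector, mean, std):
    """Element-wise (x - mean) / std; assumes no element has zero standard deviation."""
    return [(x - m) / s for x, m, s in zip(vector, mean, std)]


# Hypothetical 2-dimensional feature vectors.
features = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
mean, std = fit_normalizer(features)
normalized = [normalize(v, mean, std) for v in features]
```

After this step each element of the normalized vectors has mean 0 and standard deviation 1 over the training set.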
Next, after normalizing all feature vectors as described above, the abnormality degree model learner 10 obtains the mean vector of the normal class and the mean vector of the abnormal class:
[Equation image: JPOXMLDOC01-appb-M000006]
Similarly, based on the mean vector of the normal class and the mean vector of the abnormal class, the abnormality degree model learner 10 obtains the respective covariance matrices:
[Equation image: JPOXMLDOC01-appb-M000007]
Finally, the abnormality degree model learner 10 obtains the projection vector
[Equation image: JPOXMLDOC01-appb-M000008]
by the following equations (1) and (2).
[Equation image: JPOXMLDOC01-appb-M000009]
When a feature vector
[Equation image: JPOXMLDOC01-appb-M000010]
is input, it is converted into a single degree of abnormality d by the following equation (3). Equation (3) corresponds to the abnormality degree model in LDA.
[Equation image: JPOXMLDOC01-appb-M000011]
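The equation images for the projection vector and for equations (1) to (3) are not available in this text. As a rough illustration only, the sketch below uses the textbook two-class Fisher discriminant: the projection vector is w = (S_n + S_a)^-1 (mu_a - mu_n) and the degree of abnormality is d = w . x. This may differ from the exact equations in the specification. All names and data are hypothetical, and the feature vectors are assumed to be already normalized.

```python
from statistics import fmean


def mean_vector(vectors):
    """Per-element mean of a list of equal-length vectors."""
    return [fmean(col) for col in zip(*vectors)]


def covariance(vectors, mean):
    """Population covariance matrix of the vectors around the given mean."""
    n = len(mean)
    cov = [[0.0] * n for _ in range(n)]
    for v in vectors:
        diff = [x - m for x, m in zip(v, mean)]
        for i in range(n):
            for j in range(n):
                cov[i][j] += diff[i] * diff[j] / len(vectors)
    return cov


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_lda(normal, abnormal):
    """Textbook Fisher direction, oriented so that abnormal samples score higher."""
    mu_n, mu_a = mean_vector(normal), mean_vector(abnormal)
    cov_n, cov_a = covariance(normal, mu_n), covariance(abnormal, mu_a)
    within = [[p + q for p, q in zip(rn, ra)] for rn, ra in zip(cov_n, cov_a)]
    return solve(within, [a - n for a, n in zip(mu_a, mu_n)])


def abnormality(w, x):
    """Project a feature vector onto w to obtain a single degree of abnormality."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical training data: two well-separated classes.
normal = [[0.0, 0.0], [1.0, 0.2], [0.2, 1.0], [1.0, 1.0]]
abnormal = [[3.0, 3.0], [4.0, 3.2], [3.2, 4.0], [4.0, 4.0]]
w = fit_lda(normal, abnormal)
scores_normal = [abnormality(w, x) for x in normal]
scores_abnormal = [abnormality(w, x) for x in abnormal]
```

Because w points from the normal-class mean toward the abnormal-class mean, a larger d indicates a sample closer to the abnormal class, which matches the "equal to or greater than the threshold" rule used for defect determination.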
The analysis method used by the abnormality degree model learner 10 is not limited to LDA. The abnormality degree model learner 10 can use, for example, a support vector machine, a neural network, or a Gaussian mixture model. In the example above, the abnormality degree model learner 10 uses both the normal class and the abnormal class as training data, but a method that learns from data of a single class only can also be adopted. In that case, the abnormality degree model learner 10 can use, for example, the MT method, principal component analysis (PCA), an autoencoder, or a one-class support vector machine.
FIG. 7 is a diagram showing that, in the example shown in FIGS. 3 to 6, a plurality of degrees of abnormality are obtained from one motor serving as one sample. The abnormality degree model described above converts a single feature vector into a single degree of abnormality. In addition, as shown in FIG. 5, a plurality of feature vectors are obtained from the vibration waveform of one motor. In this case, therefore, a plurality of degrees of abnormality are obtained from one motor, as shown in FIG. 7.
FIG. 8 is a diagram showing that one representative value is calculated from the plurality of degrees of abnormality obtained from one motor serving as one sample and is assigned as the degree of abnormality of that motor. As shown in FIG. 8, to simplify the final determination of normal or abnormal, the plurality of degrees of abnormality obtained from one motor (shown in FIG. 7) are aggregated in some way, and the resulting single representative value is assigned as the degree of abnormality of that motor. The simplest way to calculate the representative value is to take the mean of the plurality of degrees of abnormality. Any other statistic, such as the maximum, the standard deviation, or the mode, can also be used as the representative value.
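The aggregation step can be sketched as follows. This is only an illustration: the function name `representative`, the `method` parameter, and the per-frame values are hypothetical, and only the mean and the maximum are shown.

```python
from statistics import fmean


def representative(abnormalities, method="mean"):
    """Collapse the per-time degrees of abnormality of one sample into one value."""
    if method == "mean":
        return fmean(abnormalities)
    if method == "max":
        return max(abnormalities)
    raise ValueError(f"unsupported method: {method}")


# Hypothetical degrees of abnormality obtained from one motor, one per time frame.
per_frame = [0.2, 0.4, 1.5, 0.3]
motor_score_mean = representative(per_frame)
motor_score_max = representative(per_frame, "max")
```

The mean smooths out a short abnormal burst, whereas the maximum preserves it, so the choice of statistic affects which kinds of defect remain visible in the per-motor value.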
<< Threshold generation device 20 >>
In the following, it is assumed that one degree of abnormality has been assigned to each motor (one sample) using the method described above and that, as shown in FIGS. 1 and 2, as many degrees of abnormality as there are motors have been obtained from a plurality of motors. A threshold generation device and method that use those degrees of abnormality to automatically generate an appropriate threshold reflecting the user's preference are described.
FIG. 9 is a block diagram schematically showing the configuration of the threshold generation device 20 according to the present embodiment. The threshold generation device 20 is a device capable of carrying out the threshold generation method according to the present embodiment. The threshold generation device 20 includes a threshold candidate group generation unit 21, a constraint specifying unit 22, a first threshold selection unit 23, a first determination accuracy calculation unit 24, a second threshold selection unit 25, and a second determination accuracy calculation unit 26.
The threshold candidate group generation unit 21 generates a first threshold candidate group including one or more threshold candidates on the basis of a plurality of degrees of abnormality respectively assigned to a plurality of samples. Here, the plurality of samples are a plurality of motors. The first determination accuracy calculation unit 24 calculates a first determination accuracy for each of the one or more threshold candidates included in the first threshold candidate group.
The constraint specifying unit 22 specifies a constraint on the first determination accuracy. For example, the constraint specifying unit 22 accepts the input of a numerical value, entered for example by a user, and determines the constraint on the basis of that value. The first threshold selection unit 23 selects one or more threshold candidates from the first threshold candidate group on the basis of the constraint and generates a second threshold candidate group including the selected one or more threshold candidates. For example, the first threshold selection unit 23 selects the threshold candidates that satisfy the constraint from the first threshold candidate group.
The second determination accuracy calculation unit 26 calculates a second determination accuracy for each of the one or more threshold candidates included in the second threshold candidate group. The second threshold selection unit 25 outputs, as the final threshold, the threshold candidate selected from the second threshold candidate group on the basis of the second determination accuracy. For example, the second threshold selection unit 25 selects from the second threshold candidate group the threshold candidate with the highest second determination accuracy and outputs it as the final threshold.
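The two-stage selection can be sketched as a filter followed by an argmax. The sketch assumes that both determination accuracies have already been computed for every candidate and that the constraint has the form "first determination accuracy is at least a given value". The tuple layout, the function name, the sample numbers, and the behavior when no candidate satisfies the constraint are all assumptions made for illustration.

```python
def select_threshold(candidates, min_first_accuracy):
    """Pick a final threshold from (threshold, first_accuracy, second_accuracy) tuples.

    Stage 1 keeps the candidates whose first accuracy satisfies the constraint
    (the second threshold candidate group). Stage 2 returns the threshold with
    the highest second accuracy among them.
    """
    second_group = [c for c in candidates if c[1] >= min_first_accuracy]
    if not second_group:
        return None  # Assumption: the text does not say what happens in this case.
    # On ties, max() keeps the first candidate encountered.
    return max(second_group, key=lambda c: c[2])[0]


# Hypothetical candidates: (threshold, first accuracy, second accuracy).
candidates = [
    (0.2, 1.00, 0.25),
    (0.4, 1.00, 0.50),
    (0.7, 0.75, 0.75),
    (1.0, 0.50, 1.00),
]
final_threshold = select_threshold(candidates, min_first_accuracy=0.8)
```

With these hypothetical numbers, only the first two candidates satisfy the constraint, and the one with the higher second accuracy is returned.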
<< Constraint specifying unit 22 >>
The constraint on the first determination accuracy specified by the constraint specifying unit 22 is determined, for example, on the basis of a numerical value entered by the user. The constraint is, for example, one of the conditions shown below. The user can select one of the following constraints (A1) to (A4) and freely specify the value E in it.
(A1) The miss rate for abnormal samples is E% or less.
(A2) The false detection rate for normal samples is E% or less.
(A3) The detection rate for abnormal samples is E% or more.
(A4) The detection rate for normal samples is E% or more.
A user who wants to give priority to avoiding missed abnormal samples can select constraint (A1) and set E to a small value. A user who wants to give priority to avoiding false detection of normal samples can select constraint (A2) and set E to a small value.
Here, the quantities TPR (true positive rate) and TNR (true negative rate) are introduced. "Positive" means not normal, that is, the target to be detected, so here "positive" corresponds to abnormal samples. In other words, TPR represents "the proportion of abnormal samples that the system determines to be abnormal," and TNR represents "the proportion of normal samples that the system determines to be normal."
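As a small illustration of these two definitions, the sketch below computes TPR and TNR for one threshold, treating a sample as abnormal when its degree of abnormality is equal to or greater than the threshold (the rule stated earlier for the motor inspection). The function name and the scores are hypothetical.

```python
def tpr_tnr(normal_scores, abnormal_scores, threshold):
    """Return (TPR, TNR) for a threshold; a score >= threshold is judged abnormal."""
    true_positives = sum(s >= threshold for s in abnormal_scores)
    true_negatives = sum(s < threshold for s in normal_scores)
    return true_positives / len(abnormal_scores), true_negatives / len(normal_scores)


# Hypothetical degrees of abnormality, one per sample.
normal_scores = [0.1, 0.3, 0.4, 0.8]
abnormal_scores = [0.6, 0.9, 1.2, 1.5]
tpr, tnr = tpr_tnr(normal_scores, abnormal_scores, threshold=0.7)
```

Here one abnormal sample (0.6) falls below the threshold and one normal sample (0.8) falls above it, so both rates are 3/4.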
Selecting the optimal threshold that satisfies a constraint specified as above can be interpreted as selecting the optimal threshold under a constraint of the form "one of TPR and TNR is at least a given value." For example, constraint (A1) with E = 10, "the miss rate for abnormal samples is 10% or less," can be realized by selecting the optimal threshold under the constraint "TPR is 90% or more." In the same way, the four constraints (A1) to (A4) can be replaced with the constraints (B1) to (B4) below. In other words, constraints (A1) to (A4) are equivalent to constraints (B1) to (B4), respectively.
(B1) TPR is (100-E)% or more.
(B2) TNR is (100-E)% or more.
(B3) TPR is E% or more.
(B4) TNR is E% or more.
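The correspondence between (A1)-(A4) and (B1)-(B4) can be written as a small lookup. The function name `to_rate_constraint` and the string labels are hypothetical; the mapping itself follows the list above.

```python
# For each user-facing constraint: (metric to bound from below, whether to use 100 - E).
_RULES = {
    "A1": ("TPR", True),   # miss rate for abnormal samples <= E%  ->  TPR >= (100 - E)%
    "A2": ("TNR", True),   # false detection of normal samples <= E%  ->  TNR >= (100 - E)%
    "A3": ("TPR", False),  # detection rate for abnormal samples >= E%  ->  TPR >= E%
    "A4": ("TNR", False),  # detection rate for normal samples >= E%  ->  TNR >= E%
}


def to_rate_constraint(kind, e_percent):
    """Translate constraint (A1)-(A4) with value E into (metric, minimum percent)."""
    metric, complement = _RULES[kind]
    minimum = 100.0 - e_percent if complement else float(e_percent)
    return metric, minimum


constraint = to_rate_constraint("A1", 20)
```

For constraint (A1) with E = 20 this yields a lower bound of 80% on TPR.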
The following describes the case where constraint (A1) is selected and E = 20% is entered, that is, where the constraint "the miss rate for abnormal samples is 20% or less" is specified. This constraint (A1) can be replaced with the TPR/TNR constraint (B1), that is, "TPR is 80% or more."
<< Threshold candidate group generation unit 21 >>
FIG. 10 is a diagram showing an example of threshold candidates C1 to C13 included in the first threshold candidate group generated by the threshold candidate group generation unit 21. FIG. 11 is a diagram showing another example, threshold candidates C21 to C25, included in the first threshold candidate group generated by the threshold candidate group generation unit 21. To obtain a threshold that satisfies the specified constraint, the threshold candidate group generation unit 21 generates a first threshold candidate group including one or more threshold candidates. Various methods of generating threshold candidates are conceivable. As one example, as shown in FIG. 10, the midpoints between all adjacent degrees of abnormality plotted on a number line can be enumerated as threshold candidates. That is, when the degrees of abnormality of m samples are given, m-1 threshold candidates are generated, where m is a positive integer. In FIG. 10, the degrees of abnormality of 14 samples are given, and as a result 13 threshold candidates C1 to C13 are generated. The advantage of this method is that, when the degrees of abnormality of the normal samples and those of the abnormal samples are far apart as shown in FIG. 11, the threshold candidate C23, which maximizes the margin for distinguishing the two, is generated. This improves generalization performance for unknown samples.
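The midpoint enumeration described above can be sketched as follows. This is a minimal illustration rather than the implementation of the threshold candidate group generation unit 21; the function name `generate_threshold_candidates` and the sample values are hypothetical.

```python
def generate_threshold_candidates(degrees_of_abnormality):
    """Return the midpoints between adjacent degrees of abnormality.

    For m input values this yields m - 1 candidates, as in the text.
    (Hypothetical helper; the name is not from the original document.)
    """
    ordered = sorted(degrees_of_abnormality)
    # Pair each value with its right-hand neighbour on the number line.
    return [(low + high) / 2 for low, high in zip(ordered, ordered[1:])]


# Made-up degrees of abnormality for four samples (not taken from FIG. 10).
scores = [4.0, 1.0, 7.0, 2.0]
candidates = generate_threshold_candidates(scores)
print(candidates)
```

With four input values the sketch produces three candidates, matching the m-1 count stated above.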
<< First threshold selection unit 23 and first determination accuracy calculation unit 24 >>
FIG. 12 is a diagram showing, as Table 1, an example of the first determination accuracy calculated by the first determination accuracy calculation unit 24 and the constraint designated by the constraint designation unit 22. For the first threshold candidate group obtained as described above, the first threshold selection unit 23 uses the first determination accuracy calculation unit 24 to obtain the first determination accuracy of each threshold candidate. Here, "the pair of TPR and TNR" is used as a specific example of the first determination accuracy. In the example of FIG. 12, the pair of TPR and TNR is obtained as the first determination accuracy for each of the threshold candidates C1 to C13. Among these threshold candidates, those that satisfy the above-mentioned constraint "TPR is 80% or more" are selected and output as the second threshold candidate group.
FIG. 13 is a flowchart showing the operation of the first threshold selection unit 23 and the first determination accuracy calculation unit 24. The first threshold selection unit 23 selects, from the first threshold candidate group, one threshold candidate that has not yet been selected (step S11), and the first determination accuracy calculation unit 24 calculates the first determination accuracy of the selected threshold candidate (step S12).
Next, the first threshold selection unit 23 determines whether the first determination accuracy satisfies the specified constraint. If the constraint is satisfied (YES in step S13), the first threshold selection unit 23 adds the threshold candidate to the second threshold candidate group (step S14) and then determines whether all threshold candidates have been selected (step S15). If the first determination accuracy does not satisfy the specified constraint (NO in step S13), the first threshold selection unit 23 determines whether all threshold candidates have been selected (step S15) without adding the threshold candidate to the second threshold candidate group.
If all threshold candidates have been selected (YES in step S15), the first threshold selection unit 23 outputs the second threshold candidate group to the second threshold selection unit 25 (step S16). If an unselected threshold candidate remains (NO in step S15), the process returns to step S11.
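The loop of steps S11 to S16 can be sketched as follows for the constraint "TPR is 80% or more." This is only an illustration under stated assumptions: a sample is assumed to be determined abnormal when its degree of abnormality exceeds the threshold, and the function names and sample values are hypothetical rather than taken from FIG. 12.

```python
def tpr_tnr(threshold, normal_scores, abnormal_scores):
    """Return (TPR, TNR), assuming 'score > threshold' means abnormal."""
    tpr = sum(s > threshold for s in abnormal_scores) / len(abnormal_scores)
    tnr = sum(s <= threshold for s in normal_scores) / len(normal_scores)
    return tpr, tnr


def select_second_group(candidates, normal_scores, abnormal_scores, min_tpr):
    """Keep the candidates whose TPR satisfies the constraint (steps S11 to S16)."""
    second_group = []
    for threshold in candidates:                                   # S11
        tpr, _tnr = tpr_tnr(threshold, normal_scores, abnormal_scores)  # S12
        if tpr >= min_tpr:                                         # S13
            second_group.append(threshold)                         # S14
    return second_group                                            # S16


# Made-up degrees of abnormality for illustration only.
normal = [1.0, 2.0, 3.0, 6.0]
abnormal = [4.0, 5.0, 7.0, 8.0, 9.0]
candidates = [1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5]
second_group = select_second_group(candidates, normal, abnormal, min_tpr=0.8)
print(second_group)
```

In this made-up data set, only the candidates up to 4.5 keep at least four of the five abnormal samples above the threshold, so only those remain in the second threshold candidate group.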
<< Second threshold selection unit 25 and second determination accuracy calculation unit 26 >>
FIG. 14 is a diagram showing, as Table 2, an example of the second determination accuracy calculated by the second determination accuracy calculation unit 26. The second determination accuracy calculation unit 26 obtains the second determination accuracy for the threshold candidates C1 to C6 shown in FIG. 14. To select the final threshold uniquely from the second threshold candidate group, the second threshold selection unit 25 evaluates these threshold candidates on a scale different from the first determination accuracy. This evaluation is performed by the second determination accuracy calculation unit 26. Here, "the smaller of TPR and TNR" is used as a specific example of the second determination accuracy. In the example shown in FIG. 14, the threshold candidate with the highest second determination accuracy is C6. Therefore, this threshold candidate C6 is output as the final threshold.
FIG. 15 is a flowchart showing the operation of the second threshold selection unit 25 and the second determination accuracy calculation unit 26. The second threshold selection unit 25 selects, from the second threshold candidate group, one threshold candidate that has not yet been selected (step S21), and the second determination accuracy calculation unit 26 calculates the second determination accuracy of the selected threshold candidate (step S22).
Next, the second threshold selection unit 25 determines whether the second determination accuracy is larger than the maximum value of the second determination accuracy stored in the memory (step S23). If it is larger (YES in step S23), the second threshold selection unit 25 stores it as the new maximum value of the second determination accuracy (that is, updates the maximum) (step S24) and then determines whether all threshold candidates have been selected (step S25). If it is not larger (NO in step S23), the second threshold selection unit 25 determines whether all threshold candidates have been selected (step S25) without updating the maximum value of the second determination accuracy.
If all threshold candidates have been selected (YES in step S25), the second threshold selection unit 25 outputs the threshold candidate with the maximum second determination accuracy as the final threshold (step S26). If an unselected threshold candidate remains (NO in step S25), the process returns to step S21.
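Steps S21 to S26 amount to a running-maximum search over the second threshold candidate group. The sketch below uses "the smaller of TPR and TNR" as the second determination accuracy. It assumes that a sample is determined abnormal when its degree of abnormality exceeds the threshold; the function names and sample values are hypothetical and are not taken from FIG. 14.

```python
def tpr_tnr(threshold, normal_scores, abnormal_scores):
    """Return (TPR, TNR), assuming 'score > threshold' means abnormal."""
    tpr = sum(s > threshold for s in abnormal_scores) / len(abnormal_scores)
    tnr = sum(s <= threshold for s in normal_scores) / len(normal_scores)
    return tpr, tnr


def select_final_threshold(second_group, normal_scores, abnormal_scores):
    """Return the candidate that maximises min(TPR, TNR) (steps S21 to S26)."""
    best_threshold = None
    best_accuracy = float("-inf")  # stored maximum of the second accuracy
    for threshold in second_group:                           # S21
        accuracy = min(tpr_tnr(threshold, normal_scores, abnormal_scores))  # S22
        if accuracy > best_accuracy:                         # S23
            best_accuracy = accuracy                         # S24
            best_threshold = threshold
    return best_threshold, best_accuracy                     # S26


# Made-up data: candidates that already satisfy "TPR is 80% or more".
normal = [1.0, 2.0, 3.0, 6.0]
abnormal = [4.0, 5.0, 7.0, 8.0, 9.0]
second_group = [1.5, 2.5, 3.5, 4.5]
final_threshold, final_accuracy = select_final_threshold(second_group, normal, abnormal)
print(final_threshold, final_accuracy)
```

Because the comparison in step S23 is a strict "larger than," a later candidate that merely ties the stored maximum does not replace the earlier one in this sketch. The original text does not state how ties are to be resolved.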
The evaluation scales used as the first determination accuracy and the second determination accuracy may be other than those based on TPR or TNR. For example, any statistic, such as accuracy, precision, or the F value (F-score or F-measure), or any combination of such statistics, can be used as the evaluation scale.
<< Effects >>
As described above, with the threshold generation device 20 according to the present embodiment, the user's preference is reflected in the threshold in the form of a constraint on the first determination accuracy, and an appropriate threshold is then selected, using the second determination accuracy, from among the candidates that satisfy the constraint. An appropriate threshold can thus be selected while reflecting the user's preference.
In addition, the method of specifying the range of values that the determination accuracy may take is intuitively understandable to the user, and the effort the user must spend on adjusting the threshold is small. Moreover, because the specification takes the form of a numerical range, room is left for the system to choose a more appropriate threshold within that range. This makes it possible both to reflect the user's preference and to optimize the threshold.
Furthermore, since only the threshold candidates that satisfy the constraint specified by the user are selected as the second threshold candidates, the final threshold reliably reflects the user's preference.
Furthermore, since the threshold that is finally output is narrowed down to one, additional work such as having the user choose the final threshold from among a plurality of presented threshold candidates becomes unnecessary, which reduces the user's effort.
Furthermore, when the proportions of normal and abnormal samples in the whole data set differ greatly, determination accuracies such as accuracy or the F value become less reliable. TPR and TNR, however, are not affected by the ratio of normal to abnormal samples, so a highly reliable threshold can be generated in a variety of situations.
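The effect of class imbalance can be seen in a small example. The data below are made up for illustration, and the sketch assumes that a sample is determined abnormal when its degree of abnormality exceeds the threshold.

```python
# Made-up, heavily imbalanced data: 98 normal samples and 2 abnormal samples.
normal = [0.1] * 98
abnormal = [0.9] * 2

# A threshold above every score determines every sample to be normal.
threshold = 1.0

true_positives = sum(s > threshold for s in abnormal)
true_negatives = sum(s <= threshold for s in normal)

accuracy = (true_positives + true_negatives) / (len(normal) + len(abnormal))
tpr = true_positives / len(abnormal)
tnr = true_negatives / len(normal)

print(accuracy, tpr, tnr)
```

Here the accuracy looks high even though no abnormal sample is detected, whereas the TPR exposes the problem directly.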
In addition, the second criterion, "the smaller of TPR and TNR," is always 0 for useless threshold candidates, such as one that determines every input sample to be normal or one that determines every input sample to be abnormal. Selection of such useless threshold candidates is therefore avoided, and a practical threshold can be expected to be generated in a variety of situations.
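The following sketch checks this property on made-up data. It assumes that a sample is determined abnormal when its degree of abnormality exceeds the threshold; the function name `min_tpr_tnr` is hypothetical.

```python
def min_tpr_tnr(threshold, normal_scores, abnormal_scores):
    """Return the smaller of TPR and TNR, assuming 'score > threshold' means abnormal."""
    tpr = sum(s > threshold for s in abnormal_scores) / len(abnormal_scores)
    tnr = sum(s <= threshold for s in normal_scores) / len(normal_scores)
    return min(tpr, tnr)


# Made-up degrees of abnormality for illustration only.
normal = [1.0, 2.0, 3.0]
abnormal = [2.5, 4.0, 5.0]

always_abnormal = min_tpr_tnr(0.0, normal, abnormal)   # threshold below every score
always_normal = min_tpr_tnr(10.0, normal, abnormal)    # threshold above every score
useful = min_tpr_tnr(3.5, normal, abnormal)            # threshold between the groups

print(always_abnormal, always_normal, useful)
```

Both degenerate thresholds score 0, so any candidate that separates at least some samples correctly on both sides is preferred over them.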
Furthermore, when the degrees of abnormality of the normal samples and the abnormal samples are completely separable on the number line, the determination accuracy is 100% no matter where the threshold is set between the normal sample closest to the abnormal samples and the abnormal sample closest to the normal samples. In such a case, setting the threshold exactly midway between the two, as with the objective function used to optimize a support vector machine, can maximize generalization performance for unknown samples.
<< Modification >>
FIG. 16 is a diagram showing an example of the hardware configuration of the threshold generation device 20 according to the present embodiment. As shown in FIG. 16, the threshold generation device 20 has a memory 32 that stores a program and a processor 31, such as a CPU (Central Processing Unit), that executes the program. The program can include a threshold generation program for carrying out the threshold generation method according to the present embodiment. All or part of the functions of the threshold generation device 20 shown in FIG. 9 can be realized by the processor 31 executing the program. All or part of the functions of the threshold generation device 20 shown in FIG. 16 may be realized by a semiconductor integrated circuit. The threshold generation device 20 may also have a display 34 serving as display means and as an interface through which the user specifies the constraint on the determination accuracy, an input device 35 such as a mouse, a keyboard, or a touch panel, and a hard disk 33 serving as a storage device.
20 threshold generation device, 21 threshold candidate group generation unit, 22 constraint designation unit, 23 first threshold selection unit, 24 first determination accuracy calculation unit, 25 second threshold selection unit, 26 second determination accuracy calculation unit.

Claims (11)

  1.  A threshold generation device comprising:
     a threshold candidate group generation unit that generates a first threshold candidate group including one or more threshold candidates based on a plurality of degrees of abnormality respectively assigned to a plurality of samples;
     a first determination accuracy calculation unit that calculates a first determination accuracy of each of the one or more threshold candidates included in the first threshold candidate group;
     a constraint designation unit that designates a constraint on the first determination accuracy;
     a first threshold selection unit that selects one or more threshold candidates from the first threshold candidate group based on the constraint and generates a second threshold candidate group including the selected one or more threshold candidates;
     a second determination accuracy calculation unit that calculates a second determination accuracy of each of the one or more threshold candidates included in the second threshold candidate group; and
     a second threshold selection unit that outputs, as a final threshold, a threshold candidate selected from the second threshold candidate group based on the second determination accuracy.
  2.  The threshold generation device according to claim 1, wherein the constraint designation unit accepts input of a numerical value and determines the constraint based on the numerical value.
  3.  The threshold generation device according to claim 1 or 2, wherein the first threshold selection unit selects, from the first threshold candidate group, threshold candidates that satisfy the constraint and generates the second threshold candidate group including the selected one or more threshold candidates.
  4.  The threshold generation device according to any one of claims 1 to 3, wherein the second threshold selection unit selects, from the second threshold candidate group, the threshold candidate for which the second determination accuracy is maximum and outputs it as the final threshold.
  5.  The threshold generation device according to any one of claims 1 to 4, wherein
     the plurality of samples include normal samples and abnormal samples, and
     the first determination accuracy calculation unit outputs, as the first determination accuracy, a pair of TPR and TNR calculated based on the plurality of degrees of abnormality.
  6.  The threshold generation device according to any one of claims 1 to 4, wherein
     the plurality of samples include normal samples and abnormal samples, and
     the second determination accuracy calculation unit outputs, as the second determination accuracy, the smaller of the TPR and the TNR calculated based on the plurality of degrees of abnormality.
  7.  The threshold generation device according to claim 5, wherein the second determination accuracy calculation unit outputs the smaller of the TPR and the TNR as the second determination accuracy.
  8.  The threshold generation device according to any one of claims 1 to 7, wherein the threshold candidate group generation unit sets each threshold candidate included in the first threshold candidate group at the midpoint between mutually adjacent degrees of abnormality among the plurality of degrees of abnormality arranged on a number line.
  9.  The threshold generation device according to any one of claims 1 to 8, wherein
     the plurality of samples are a plurality of devices, and
     the plurality of degrees of abnormality are intensities of sound or vibration emitted from the plurality of devices.
  10.  A threshold generation method comprising:
     a step of generating a first threshold candidate group including one or more threshold candidates based on a plurality of degrees of abnormality respectively assigned to a plurality of samples;
     a step of calculating a first determination accuracy of each of the one or more threshold candidates included in the first threshold candidate group;
     a step of designating a constraint on the first determination accuracy;
     a step of selecting one or more threshold candidates from the first threshold candidate group based on the constraint and generating a second threshold candidate group including the selected one or more threshold candidates;
     a step of calculating a second determination accuracy of each of the one or more threshold candidates included in the second threshold candidate group; and
     a step of outputting, as a final threshold, a threshold candidate selected from the second threshold candidate group based on the second determination accuracy.
  11.  A threshold generation program that causes a computer to execute:
     a process of generating a first threshold candidate group including one or more threshold candidates based on a plurality of degrees of abnormality respectively assigned to a plurality of samples;
     a process of calculating a first determination accuracy of each of the one or more threshold candidates included in the first threshold candidate group;
     a process of designating a constraint on the first determination accuracy;
     a process of selecting one or more threshold candidates from the first threshold candidate group based on the constraint and generating a second threshold candidate group including the selected one or more threshold candidates;
     a process of calculating a second determination accuracy of each of the one or more threshold candidates included in the second threshold candidate group; and
     a process of outputting, as a final threshold, a threshold candidate selected from the second threshold candidate group based on the second determination accuracy.
PCT/JP2019/044818 2019-11-15 2019-11-15 Threshold value generation device, threshold value generation method, and threshold value generation program WO2021095222A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
PCT/JP2019/044818 WO2021095222A1 (en) 2019-11-15 2019-11-15 Threshold value generation device, threshold value generation method, and threshold value generation program
JP2021555739A JP7012913B2 (en) 2019-11-15 2019-11-15 Threshold generator, threshold generation method, and threshold generation program
TW109114016A TW202121205A (en) 2019-11-15 2020-04-27 Threshold value generation device, threshold value generation method, and threshold value generation program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2019/044818 WO2021095222A1 (en) 2019-11-15 2019-11-15 Threshold value generation device, threshold value generation method, and threshold value generation program

Publications (1)

Publication Number Publication Date
WO2021095222A1 true WO2021095222A1 (en) 2021-05-20

Family

ID=75913026

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2019/044818 WO2021095222A1 (en) 2019-11-15 2019-11-15 Threshold value generation device, threshold value generation method, and threshold value generation program

Country Status (3)

Country Link
JP (1) JP7012913B2 (en)
TW (1) TW202121205A (en)
WO (1) WO2021095222A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7457752B2 (en) 2022-06-15 2024-03-28 株式会社安川電機 Data analysis system, data analysis method, and program

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2018528521A (en) * 2015-07-31 2018-09-27 クゥアルコム・インコーポレイテッドQualcomm Incorporated Media classification
JP2018190128A (en) * 2017-05-01 2018-11-29 日本電信電話株式会社 Setting device, analysis system, setting method and setting program

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Consideration of the difference between ROC and PR curves-continued", ROC CURVE EQUIVALENT, 2016, XP055823548, Retrieved from the Internet <URL:https://qiita.com/skyshk/items/bfb3ad19b47b7ca94829> [retrieved on 20191217] *

Also Published As

Publication number Publication date
JPWO2021095222A1 (en) 2021-05-20
JP7012913B2 (en) 2022-01-28
TW202121205A (en) 2021-06-01

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19952447

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021555739

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19952447

Country of ref document: EP

Kind code of ref document: A1