WO2021095222A1

WO2021095222A1 - Threshold value generation device, threshold value generation method, and threshold value generation program

Info

Publication number: WO2021095222A1
Application number: PCT/JP2019/044818
Authority: WO
Inventors: 信秋田中; 西田　博幸
Original assignee: 三菱電機株式会社; 三菱電機エンジニアリング株式会社
Priority date: 2019-11-15
Filing date: 2019-11-15
Publication date: 2021-05-20
Also published as: JPWO2021095222A1; JP7012913B2; TW202121205A

Abstract

A threshold value generation device (20) has: a threshold value candidate group generation unit (21) for generating a first threshold value candidate group that includes one or more threshold value candidates (C1-C13), on the basis of a plurality of abnormality levels respectively assigned to a plurality of samples; a first determination accuracy calculation unit (24) for calculating first determination accuracies (TPR, TNR) respectively for the one or more threshold value candidates (C1-C13) that are included in the first threshold value candidate group; a restriction designation unit (22) for designating restrictions on the first determination accuracies; a first threshold value selection unit (23) for selecting the one or more threshold value candidates from the first threshold value candidate group on the basis of the restrictions and generating a second threshold value candidate group that includes the selected one or more threshold value candidates; a second determination accuracy calculation unit (26) for calculating second determination accuracies (TPR, TNR) of the one or more threshold value candidates (C1-C6) that are included in the second threshold value candidate group; and a second threshold value selection unit (25) for outputting, as a final threshold value, the threshold value candidate that is selected from the second threshold value candidate group on the basis of the second determination accuracies.

Description

Threshold generator, threshold generation method, and threshold generation program

The present invention relates to a threshold generation device, a threshold generation method, and a threshold generation program.

Various methods have been developed to estimate the soundness of equipment by measuring its operating sound or vibration with an acoustic sensor or vibration sensor and analyzing the waveform. The MT (Mahalanobis-Taguchi) method is one of the most representative of these methods (see, for example, Non-Patent Document 1). In the MT method, the distribution formed by a set of normal samples in the feature space is learned in advance as the reference space, and at the time of judgment a sample is identified as normal or abnormal depending on how far the observed feature vector deviates from the reference space.

What the MT method yields is the Mahalanobis distance, which indicates how far a sample deviates from the reference space. A small Mahalanobis distance indicates that the sample is close to normal, and a large one indicates that it is close to abnormal. That is, the Mahalanobis distance can be interpreted as a value indicating the degree of abnormality of the sample. However, no method has been established for setting a threshold value on the degree of abnormality to discriminate between normal and abnormal, and in many cases trial and error is required to find an appropriate threshold value.

This threshold-setting problem arises not only in classical methods such as the MT method but also in modern methods such as deep learning. For example, Non-Patent Document 2 describes a method of learning the latent distribution of the features of a set of normal samples using a variational autoencoder and determining normality or abnormality based on the degree of deviation from that distribution. However, like the MT method, the variational autoencoder ultimately outputs a value indicating the degree of abnormality, and Non-Patent Document 2 does not specifically describe how an appropriate threshold value should be set for this degree of abnormality.

In order to automatically determine the appropriate threshold value for the degree of abnormality, it is desirable to solve the following problems.

The first is to establish a method for reflecting the user's preference in the threshold. In many cases, when the degrees of abnormality of a set of normal samples and a set of abnormal samples are plotted on a number line as shown in FIG. 1, there is a region in which the degrees of abnormality of the normal samples (○ marks) and those of the abnormal samples (x marks) are mixed. In such a case, no threshold value can completely distinguish normal from abnormal, and it is desirable to consider the trade-off between the false detection rate of normal samples and the oversight rate of abnormal samples. Which of these criteria is emphasized depends on the user. Therefore, it is desirable to establish a method that can reflect the user's preference in the threshold generation process.

The second is to establish a method for determining the optimum threshold value after reflecting the user's preference described above, that is, the constraint specified by the user. For example, when generating a "threshold value at which the oversight rate of abnormal samples is 10%", a simple approach would be to enumerate various threshold candidates and select the one whose oversight rate of abnormal samples is closest to 10%. However, when the degrees of abnormality of the normal samples and the abnormal samples can be completely separated on the number line as shown in FIG. 2, so that the oversight rate of abnormal samples can be made smaller than 10% without side effects, this approach has the problem that a clearly "better threshold" exists but is not selected. Similar problems occur when the region in which the degrees of abnormality of normal samples and abnormal samples are mixed is small, or when few samples are available for determining the threshold. Therefore, it is desirable to establish a method that can determine an appropriate threshold in any situation.

An object of the present invention is to obtain an appropriate threshold value based on a specified constraint.

Means to solve the problem

The threshold generation device according to one aspect of the present invention includes: a threshold candidate group generation unit that generates a first threshold candidate group including one or more threshold candidates based on a plurality of degrees of abnormality respectively assigned to a plurality of samples; a first determination accuracy calculation unit that calculates a first determination accuracy for each of the one or more threshold candidates included in the first threshold candidate group; a constraint designation unit that designates a constraint on the first determination accuracy; a first threshold selection unit that selects one or more threshold candidates from the first threshold candidate group based on the constraint and generates a second threshold candidate group including the selected one or more threshold candidates; a second determination accuracy calculation unit that calculates a second determination accuracy for each of the one or more threshold candidates included in the second threshold candidate group; and a second threshold selection unit that outputs, as a final threshold, a threshold candidate selected from the second threshold candidate group based on the second determination accuracy.

The threshold generation method according to another aspect of the present invention includes: a step of generating a first threshold candidate group including one or more threshold candidates based on a plurality of degrees of abnormality respectively assigned to a plurality of samples; a step of calculating a first determination accuracy for each of the one or more threshold candidates included in the first threshold candidate group; a step of designating a constraint on the first determination accuracy; a step of selecting one or more threshold candidates from the first threshold candidate group based on the constraint and generating a second threshold candidate group including the selected one or more threshold candidates; a step of calculating a second determination accuracy for each of the one or more threshold candidates included in the second threshold candidate group; and a step of outputting, as a final threshold, a threshold candidate selected from the second threshold candidate group based on the second determination accuracy.

According to the present invention, an appropriate threshold value based on a specified constraint can be obtained.

FIG. 1 is a diagram showing an example in which the degrees of abnormality of normal samples and abnormal samples are plotted on a number line.
FIG. 2 is a diagram showing another example in which the degrees of abnormality of normal samples and abnormal samples are plotted on a number line.
FIG. 3 is a diagram showing an example of a spectrogram of the vibration of a motor.
FIG. 4 is a diagram schematically representing the spectrogram shown in FIG. 3 as a matrix.
FIG. 5 is a diagram showing an example in which the matrix shown in FIG. 4 is regarded as a feature vector of the timbre at each time.
FIG. 6 is a diagram showing the anomaly model learner, which receives a set of feature vectors based on normal samples and abnormal samples and generates an anomaly model.
FIG. 7 is a diagram showing that, in the example shown in FIGS. 3 to 6, a plurality of degrees of abnormality are acquired from one sample.
FIG. 8 is a diagram showing that one representative value is calculated from the plurality of degrees of abnormality acquired from one sample and is assigned as the degree of abnormality of that sample.
FIG. 9 is a functional block diagram schematically showing the configuration of the threshold generation device according to the embodiment of the present invention.
FIG. 10 is a diagram showing an example of the threshold candidates included in the first threshold candidate group generated by the threshold candidate group generation unit.
FIG. 11 is a diagram showing another example of the threshold candidates included in the second threshold candidate group generated by the first threshold selection unit.
FIG. 12 is a diagram showing, as Table 1, an example of the first determination accuracy calculated by the first determination accuracy calculation unit and the constraint specified by the constraint designation unit.
FIG. 13 is a flowchart showing the operation of the first threshold selection unit and the first determination accuracy calculation unit.
FIG. 14 is a diagram showing, as Table 2, an example of the second determination accuracy calculated by the second determination accuracy calculation unit.
FIG. 15 is a flowchart showing the operation of the second threshold selection unit and the second determination accuracy calculation unit.
FIG. 16 is a diagram showing an example of the hardware configuration of the threshold generation device according to the embodiment.

The threshold value generator, the threshold value generation method, and the threshold value generation program according to the embodiment of the present invention will be described below with reference to the drawings. The following embodiments are merely examples, and various modifications can be made within the scope of the present invention.

The threshold generation device according to the present embodiment generates a threshold value used to determine whether a device is in a normal state or an abnormal state. For example, it generates this threshold based on the degree of abnormality of the device, obtained by analyzing the waveform of the device's operating sound or vibration detected with an acoustic sensor or vibration sensor (that is, a measuring instrument). In the present embodiment, as a specific application example, it is assumed that the threshold generation device performs a quality inspection of a motor based on the vibration of the motor, which is the device. The threshold generation device according to the present embodiment detects the vibration generated by the motor, and when the degree of abnormality obtained as a result of the analysis is equal to or higher than the threshold value, the motor is considered defective.

<< Derivation of the degree of abnormality >>
First, a specific example of a method of calculating the degree of abnormality based on the vibration waveform of a motor, which serves as a sample, will be described. The threshold generation device according to the present embodiment first generates a spectrogram as shown in FIG. 3 by applying a wavelet transform to the vibration waveform measured by the measuring instrument. FIG. 3 is a diagram showing an example of a spectrogram of the vibration of the motor. In FIG. 3, the vertical axis represents time, the horizontal axis represents frequency, and the shading density represents the strength of the vibration.

FIG. 4 is a diagram schematically showing the spectrogram shown in FIG. 3 as a matrix. In FIG. 4, the vertical axis represents time and the horizontal axis represents frequency. In FIG. 4, one square box corresponds to one point on the spectrogram. In FIG. 4, it is assumed that a power value is assigned to each of the plurality of boxes. However, for convenience of drawing, the time / frequency resolutions in FIGS. 3 and 4 are different from each other.

FIG. 5 is a diagram showing an example in which the matrix shown in FIG. 4 is regarded as a feature vector in which the characteristics of the timbre at each time are quantified. In FIG. 5, the vertical axis represents time and the horizontal axis represents frequency. As shown in FIG. 5, the matrix shown in FIG. 4 can be regarded as a feature vector in which the characteristics of the timbre are quantified for each time. As a result, a plurality of feature vectors shown in FIG. 5 are generated from the vibration waveform of one motor.

The method of obtaining the feature vector from the vibration waveform is not limited to the wavelet transform described above. For example, Fourier transform, filter bank analysis, cepstrum analysis, LPC (Linear Predictive Coding) analysis, or the like can be used. It is also possible to construct a feature vector by combining various acoustic features, such as a peak value, an RMS (Root Mean Square) value, and a fundamental frequency.

FIG. 6 is a diagram showing the anomaly model learner 10, which receives a set of feature vectors obtained from samples of normal motors (i.e., normal samples) and samples of abnormal motors (i.e., abnormal samples) and generates an anomaly model. As shown in FIGS. 3 to 5, analysis is performed on a plurality of samples including normal samples and abnormal samples, and the resulting set of feature vectors is input to the anomaly model learner 10. The anomaly model is a model that takes one feature vector as input and, by applying some transformation to it, calculates and outputs a single degree of abnormality. The anomaly model learner 10 constructs such an anomaly model from the given set of feature vectors. Hereinafter, as a specific example, the case where the anomaly model learner 10 uses linear discriminant analysis (LDA) will be described.

LDA is a method that takes a set of normal-class feature vectors and a set of abnormal-class feature vectors as training data and, based on the distribution of those feature vectors, finds a projection vector that emphasizes the difference between normal and abnormal. The anomaly model learner 10 first obtains the mean vector of all the feature vectors included in both the normal class and the abnormal class, denoted here by $\boldsymbol{\mu}$. Similarly, the anomaly model learner 10 obtains the standard deviation of each element over all the feature vectors; the vector that stores these per-element standard deviations is denoted here by $\boldsymbol{\sigma}$. When the feature vectors are N-dimensional (N is a positive integer), both of these vectors are also N-dimensional.

Next, the anomaly model learner 10 normalizes all the feature vectors using $\boldsymbol{\mu}$ and $\boldsymbol{\sigma}$. Normalization adjusts the vectors so that each element has a mean of 0 and a standard deviation of 1. Specifically, for a feature vector $\mathbf{x}$, normalization obtains

$$\hat{\mathbf{x}} = (\mathbf{x} - \boldsymbol{\mu}) \div \boldsymbol{\sigma}$$

where "÷" denotes element-wise division of vectors.

After normalizing all the feature vectors in this way, the anomaly model learner 10 calculates the mean vector of the normal class, $\boldsymbol{\mu}_n$, and the mean vector of the abnormal class, $\boldsymbol{\mu}_a$.

Similarly, the anomaly model learner 10 calculates the covariance matrices of the normal class and the abnormal class, $\boldsymbol{\Sigma}_n$ and $\boldsymbol{\Sigma}_a$, each taken about the mean vector of its own class.

Finally, the anomaly model learner 10 obtains the projection vector $\mathbf{w}$ using equations (1) and (2).

When a feature vector $\mathbf{x}$ is input, the formula that converts it into a single degree of abnormality d is equation (3). Equation (3) corresponds to the anomaly model in LDA.
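The formulas of equations (1) to (3) do not survive in this text. As a point of reference only, the standard two-class Fisher LDA formulation can be written as below. This is a sketch under assumptions, not a reproduction of the publication's equations: the symbols are chosen here, with $\boldsymbol{\mu}_n$, $\boldsymbol{\mu}_a$ the class mean vectors and $\boldsymbol{\Sigma}_n$, $\boldsymbol{\Sigma}_a$ the class covariance matrices of the normalized feature vectors, $\mathbf{S}_W$ a pooled within-class matrix, $\mathbf{w}$ the projection vector, and $\hat{\mathbf{x}}$ the normalized input. The sign is chosen so that a larger d means closer to abnormal.

```latex
\begin{aligned}
\mathbf{S}_W &= \boldsymbol{\Sigma}_n + \boldsymbol{\Sigma}_a \\
\mathbf{w}   &= \mathbf{S}_W^{-1}\,(\boldsymbol{\mu}_a - \boldsymbol{\mu}_n) \\
d            &= \mathbf{w}^{\top}\hat{\mathbf{x}}
\end{aligned}
```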

The analysis method used by the anomaly model learner 10 is not limited to LDA. The anomaly model learner 10 can use, for example, a support vector machine, a neural network, or a Gaussian mixture model. Further, although the anomaly model learner 10 described above uses both the normal class and the abnormal class as training data, a method that learns from the data of a single class only can also be adopted. In this case, the anomaly model learner 10 can use, for example, the MT method, principal component analysis (PCA), an autoencoder, or a one-class support vector machine.

FIG. 7 is a diagram showing that, in the example shown in FIGS. 3 to 6, a plurality of degrees of abnormality are acquired from one motor as one sample. In the anomaly model described above, a single feature vector is converted into a single degree of abnormality. Further, as shown in FIG. 5, a plurality of feature vectors are obtained from the vibration waveform of one motor. Therefore, as shown in FIG. 7, a plurality of degrees of abnormality are obtained from one motor.

FIG. 8 is a diagram showing that one representative value is calculated from the plurality of degrees of abnormality acquired from one motor as one sample and is assigned as the degree of abnormality of that motor. As shown in FIG. 8, in order to simplify the final judgment of normal or abnormal, the plurality of degrees of abnormality obtained from one motor (shown in FIG. 7) are aggregated by some method, and the resulting single representative value is assigned as the degree of abnormality of the motor. The simplest way to calculate the representative value is to use the average of the plurality of degrees of abnormality. Any other statistic, such as the maximum, the standard deviation, or the mode, can also be used as the representative value.
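The aggregation described above can be sketched as follows. This is a minimal illustration, not the publication's implementation: the function name `representative_degree`, the `method` keys, and the sample values are all hypothetical.

```python
import statistics


def representative_degree(degrees, method="mean"):
    """Collapse the degrees of abnormality of one sample into one value.

    `degrees` holds one degree of abnormality per feature vector.
    The function name and the method keys are illustrative assumptions.
    """
    reducers = {
        "mean": statistics.fmean,    # the simplest choice named in the text
        "max": max,
        "stdev": statistics.pstdev,  # population standard deviation
        "mode": statistics.mode,
    }
    return reducers[method](degrees)


# Hypothetical degrees obtained from the feature vectors of one motor.
degrees_for_one_motor = [0.5, 1.5, 1.0, 3.0]
mean_value = representative_degree(degrees_for_one_motor)
max_value = representative_degree(degrees_for_one_motor, "max")
print(mean_value, max_value)
```

Which statistic suits a given inspection depends on the failure mode: a mean smooths out short bursts, whereas a maximum emphasizes them.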

<< Threshold generator 20 >>
In the following, one degree of abnormality is assigned to each motor, which is one sample, by the method described above. A threshold generation device and method are then described that use the resulting degrees of abnormality, one for each of the plurality of motors (as shown in FIGS. 1 and 2), to automatically generate an appropriate threshold value reflecting the user's preference.

FIG. 9 is a block diagram schematically showing the configuration of the threshold generation device 20 according to the present embodiment. The threshold value generation device 20 is a device capable of implementing the threshold value generation method according to the present embodiment. The threshold value generation device 20 includes a threshold value candidate group generation unit 21, a constraint designation unit 22, a first threshold value selection unit 23, a first determination accuracy calculation unit 24, a second threshold value selection unit 25, and a second determination accuracy calculation unit 26.

The threshold candidate group generation unit 21 generates a first threshold candidate group including one or more threshold candidates based on a plurality of degrees of abnormality respectively assigned to a plurality of samples. Here, the plurality of samples are a plurality of motors. The first determination accuracy calculation unit 24 calculates the first determination accuracy of each of the one or more threshold candidates included in the first threshold candidate group.

The constraint designation unit 22 designates a constraint on the first determination accuracy. For example, the constraint designation unit 22 accepts the input of a numerical value and determines the constraint based on that value. The numerical value is input by the user, for example. The first threshold selection unit 23 selects one or more threshold candidates from the first threshold candidate group based on the constraint and generates a second threshold candidate group including the selected one or more threshold candidates. For example, the first threshold selection unit 23 selects the threshold candidates that satisfy the constraint from the first threshold candidate group and generates a second threshold candidate group including them.

The second determination accuracy calculation unit 26 calculates the second determination accuracy of each of the one or more threshold candidates included in the second threshold candidate group. The second threshold value selection unit 25 outputs, as the final threshold value, the threshold candidate selected from the second threshold candidate group based on the second determination accuracy. For example, the second threshold value selection unit 25 selects the threshold candidate with the maximum second determination accuracy from the second threshold candidate group and outputs it as the final threshold value.

<< Constraint specification unit 22 >>
The constraint on the first determination accuracy specified by the constraint specifying unit 22 is determined based on, for example, a numerical value input by the user. The constraint is, for example, a condition as shown below. The user can select one of the following constraints (A1) to (A4) and freely specify the value E in the constraint.
(A1) The oversight rate of abnormal samples is E% or less.
(A2) The false detection rate of normal samples is E% or less.
(A3) The detection rate of abnormal samples is E% or more.
(A4) The detection rate of normal samples (the rate at which they are determined to be normal) is E% or more.

A user who wants to prioritize avoiding overlooking an abnormal sample may select the constraint (A1) and set the value E at that time to a small value. Further, the user who wants to prioritize the avoidance of false detection of the normal sample may select the constraint (A2) and set the value E at that time to a small value.

Here, the quantities called TPR (True Positive Rate) and TNR (True Negative Rate) are introduced. "Positive" does not mean normal; it means testing positive, that is, being the object to be detected. Therefore, "Positive" here corresponds to the abnormal sample. In other words, TPR represents "the percentage of abnormal samples that the system determines to be abnormal", and TNR represents "the percentage of normal samples that the system determines to be normal".
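As a minimal sketch of these two definitions, the following computes TPR and TNR for one threshold. It assumes, as in the motor example, that a sample is judged abnormal when its degree of abnormality is equal to or higher than the threshold; the function name `tpr_tnr` and the sample values are hypothetical.

```python
def tpr_tnr(normal_degrees, abnormal_degrees, threshold):
    """Return (TPR, TNR) as fractions in [0, 1] for one threshold.

    A sample is judged abnormal when its degree is >= threshold.
    """
    detected = sum(1 for d in abnormal_degrees if d >= threshold)
    passed = sum(1 for d in normal_degrees if d < threshold)
    return detected / len(abnormal_degrees), passed / len(normal_degrees)


# Hypothetical degrees of abnormality; the two groups overlap.
normal = [1.0, 2.0, 3.0, 5.0]
abnormal = [4.0, 6.0, 7.0, 8.0]
tpr, tnr = tpr_tnr(normal, abnormal, threshold=4.5)
print(tpr, tnr)
```

Here one abnormal sample (4.0) is overlooked and one normal sample (5.0) is falsely detected, so both rates are three out of four.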

The problem of selecting the optimum threshold value that satisfies a constraint specified as described above can be interpreted as the problem of selecting the optimum threshold value under a constraint that "one of TPR and TNR is at least a given value". For example, the constraint (A1) "the oversight rate of abnormal samples is 10% or less" can be realized by selecting the optimum threshold value under the constraint "TPR is 90% or more". Similarly, the four constraints (A1) to (A4) described above can be replaced with the following constraints (B1) to (B4), each of which requires one of TPR and TNR to be at least a given value. In other words, the constraints (A1) to (A4) are equivalent to the constraints (B1) to (B4), respectively.
(B1) TPR is (100-E)% or more.
(B2) TNR is (100-E)% or more.
(B3) TPR is E% or more.
(B4) TNR is E% or more.
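The correspondence between (A1) to (A4) and (B1) to (B4) amounts to a small lookup, sketched below. The function name `to_rate_constraint` and the string keys are illustrative assumptions.

```python
def to_rate_constraint(kind, e_percent):
    """Map a user constraint (A1)-(A4) to (metric, minimum percent).

    The returned pair corresponds to (B1)-(B4).
    """
    table = {
        "A1": ("TPR", 100 - e_percent),  # oversight of abnormal samples <= E%
        "A2": ("TNR", 100 - e_percent),  # false detection of normal samples <= E%
        "A3": ("TPR", e_percent),        # detection rate of abnormal samples >= E%
        "A4": ("TNR", e_percent),        # detection rate of normal samples >= E%
    }
    return table[kind]


metric, minimum = to_rate_constraint("A1", 20)
print(metric, minimum)
```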

Hereinafter, the case where the constraint (A1) is selected and E = 20 is input, that is, the case where the constraint "the oversight rate of abnormal samples is 20% or less" is specified, will be described. This constraint (A1) can be replaced with the constraint (B1) on TPR, that is, "TPR is 80% or more".

<< Threshold candidate group generating unit 21 >>
FIG. 10 is a diagram showing an example of threshold candidates C1 to C13 included in the first threshold candidate group generated by the threshold candidate group generation unit 21. FIG. 11 is a diagram showing another example, with threshold candidates C21 to C25, of the first threshold candidate group generated by the threshold candidate group generation unit 21. The threshold candidate group generation unit 21 generates a first threshold candidate group including one or more threshold candidates in order to obtain a threshold that satisfies the specified constraint. Various methods of generating threshold candidates are conceivable. As one example, shown in FIG. 10, the midpoints between all adjacent degrees of abnormality plotted on a number line can be enumerated as threshold candidates. That is, when the degrees of abnormality of m samples are given (m is a positive integer), m-1 threshold candidates are generated. In FIG. 10, the degrees of abnormality of 14 samples are given, and as a result 13 threshold candidates C1 to C13 are generated. The advantage of this method is that, when the degrees of abnormality of the normal samples and those of the abnormal samples differ significantly as shown in FIG. 11, the threshold candidate C23 with the maximum margin for distinguishing the two is generated. This improves generalization performance for unknown samples.
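The midpoint enumeration can be sketched as follows. The function name `midpoint_candidates` and the sample values are hypothetical.

```python
def midpoint_candidates(degrees):
    """Return the midpoints between adjacent degrees of abnormality.

    Given m degrees, the result has m - 1 candidates in ascending order.
    """
    ordered = sorted(degrees)
    return [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]


# Hypothetical degrees of abnormality of normal and abnormal samples together.
degrees = [1.0, 2.0, 3.0, 5.0, 4.0, 6.0, 7.0, 8.0]
candidates = midpoint_candidates(degrees)
print(candidates)
```

When the two classes are well separated, the midpoint of the gap between them is among the candidates, which is the maximum-margin candidate the text refers to.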

<< 1st threshold selection unit 23 and 1st determination accuracy calculation unit 24 >>
FIG. 12 is a diagram showing an example of the first determination accuracy calculated by the first determination accuracy calculation unit 24 and the constraints specified by the constraint designation unit 22 as Table 1. The first threshold value selection unit 23 uses the first determination accuracy calculation unit 24 to obtain the first determination accuracy for each threshold value candidate for the first threshold value candidate group obtained as described above. Here, "a set of TPR and TNR" is used as a specific example of the first determination accuracy. In the example of FIG. 12, the pair of TPR and TNR is obtained as the first determination accuracy for the threshold candidates C1 to C13. Among these threshold candidates, those satisfying the above-mentioned constraint "TPR is 80% or more" are selected and output as the second threshold candidate group.

FIG. 13 is a flowchart showing the operation of the first threshold value selection unit 23 and the first determination accuracy calculation unit 24. The first threshold value selection unit 23 selects one threshold candidate that has not yet been selected from the first threshold candidate group (step S11), and the first determination accuracy calculation unit 24 calculates the first determination accuracy for the selected threshold candidate (step S12).

Next, the first threshold value selection unit 23 determines whether or not the first determination accuracy satisfies the specified constraint (step S13). If the constraint is satisfied (YES in step S13), the first threshold value selection unit 23 adds the threshold candidate to the second threshold candidate group (step S14) and then determines whether or not all the threshold candidates have been selected (step S15). If the first determination accuracy does not satisfy the specified constraint (NO in step S13), the first threshold value selection unit 23 does not add the threshold candidate to the second threshold candidate group and proceeds to determine whether or not all the threshold candidates have been selected (step S15).

When all the threshold candidates have been selected (YES in step S15), the first threshold value selection unit 23 outputs the second threshold candidate group to the second threshold value selection unit 25 (step S16). If an unselected threshold candidate remains (NO in step S15), the process returns to step S11.
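Steps S11 to S16 can be sketched as a filter over the first threshold candidate group. This is a minimal illustration assuming that the first determination accuracy is the pair (TPR, TNR) and that a sample is judged abnormal when its degree is equal to or higher than the threshold. The function names and data are hypothetical and are not the values behind FIG. 12.

```python
def tpr_tnr(normal, abnormal, threshold):
    """Return (TPR, TNR) as fractions; a degree >= threshold is judged abnormal."""
    tpr = sum(d >= threshold for d in abnormal) / len(abnormal)
    tnr = sum(d < threshold for d in normal) / len(normal)
    return tpr, tnr


def select_by_constraint(candidates, normal, abnormal, metric, minimum):
    """Build the second candidate group from the first (steps S11 to S16)."""
    second_group = []
    for candidate in candidates:                         # S11: next unselected candidate
        tpr, tnr = tpr_tnr(normal, abnormal, candidate)  # S12: first determination accuracy
        value = tpr if metric == "TPR" else tnr
        if value >= minimum:                             # S13: constraint satisfied?
            second_group.append(candidate)               # S14: add to second group
    return second_group                                  # S16: output second group


# Hypothetical degrees of abnormality and their midpoint candidates.
normal = [1.0, 2.0, 3.0, 5.0]
abnormal = [4.0, 6.0, 7.0, 8.0, 9.0]
first_group = [1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5]

# Constraint (B1) with E = 20: TPR is 80% or more.
second_group = select_by_constraint(first_group, normal, abnormal, "TPR", 0.8)
print(second_group)
```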

<< Second threshold value selection unit 25 and second determination accuracy calculation unit 26 >>
FIG. 14 is a diagram showing, as Table 2, an example of the second determination accuracy calculated by the second determination accuracy calculation unit 26. The second determination accuracy calculation unit 26 obtains the second determination accuracy for the threshold candidates C1 to C6 included in the second threshold candidate group. In order for the second threshold value selection unit 25 to select a single final threshold value from the second threshold candidate group, these threshold candidates are evaluated on a scale different from the first determination accuracy. This evaluation is performed by the second determination accuracy calculation unit 26. Here, as a specific example of the second determination accuracy, "the smaller of TPR and TNR" is used. In the example shown in FIG. 14, the threshold candidate with the highest second determination accuracy is C6. Therefore, the threshold candidate C6 is output as the final threshold.

FIG. 15 is a flowchart showing the operation of the second threshold value selection unit 25 and the second determination accuracy calculation unit 26. The second threshold value selection unit 25 selects one threshold candidate that has not yet been selected from the second threshold candidate group (step S21), and the second determination accuracy calculation unit 26 calculates the second determination accuracy for the selected threshold candidate (step S22).

Next, the second threshold value selection unit 25 determines whether or not the second determination accuracy is greater than the maximum value of the second determination accuracy stored in the memory (step S23). If it is greater (YES in step S23), the second threshold value selection unit 25 stores it as the new maximum value of the second determination accuracy (that is, updates the maximum) (step S24) and then determines whether or not all the threshold candidates have been selected (step S25). If the second determination accuracy is not greater than the stored maximum value (NO in step S23), the second threshold value selection unit 25 does not update the maximum value and proceeds to determine whether or not all the threshold candidates have been selected (step S25).

When all the threshold candidates have been selected (YES in step S25), the second threshold value selection unit 25 outputs the threshold candidate with the maximum second determination accuracy as the final threshold (step S26). If an unselected threshold candidate remains (NO in step S25), the process returns to step S21.
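Steps S21 to S26 amount to a running-maximum search, sketched below with "the smaller of TPR and TNR" as the second determination accuracy. The function names and data are hypothetical. Keeping the first candidate on a tie is an assumption of this sketch, following from the strict "greater than" comparison in step S23.

```python
def tpr_tnr(normal, abnormal, threshold):
    """Return (TPR, TNR) as fractions; a degree >= threshold is judged abnormal."""
    tpr = sum(d >= threshold for d in abnormal) / len(abnormal)
    tnr = sum(d < threshold for d in normal) / len(normal)
    return tpr, tnr


def select_final(second_group, normal, abnormal):
    """Return (final threshold, its score) following steps S21 to S26."""
    best_candidate, best_score = None, float("-inf")
    for candidate in second_group:                        # S21: next unselected candidate
        score = min(tpr_tnr(normal, abnormal, candidate)) # S22: smaller of TPR and TNR
        if score > best_score:                            # S23: greater than stored maximum?
            best_candidate, best_score = candidate, score # S24: update the maximum
    return best_candidate, best_score                     # S26: output final threshold


# Hypothetical data: candidates that already satisfy "TPR is 80% or more".
normal = [1.0, 2.0, 3.0, 5.0]
abnormal = [4.0, 6.0, 7.0, 8.0, 9.0]
second_group = [1.5, 2.5, 3.5, 4.5, 5.5]
final_threshold, best_score = select_final(second_group, normal, abnormal)
print(final_threshold, best_score)
```

Taking the smaller of the two rates favors the candidate whose weaker rate is as high as possible, so neither oversight nor false detection is sacrificed more than the constraint requires.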

The evaluation scale used as the first determination accuracy and the second determination accuracy need not be based on TPR or TNR. For example, any statistic or combination of statistics can be used as the evaluation scale, such as the correct answer accuracy, the accuracy rate, and the F value (F-score or F-measure).

"effect"
As described above, with the threshold generation device 20 according to the present embodiment, the user's preference is reflected in the threshold in the form of a constraint on the first determination accuracy, and a threshold is then selected from among the candidates satisfying the constraint by using the second determination accuracy. This makes it possible to select an appropriate threshold while reflecting the user's preference.

In addition, specifying the range of values that the determination accuracy may take is intuitive for the user, so little effort is needed to adjust the threshold. Furthermore, because the constraint is given as a range of values, the system retains room to select a more appropriate threshold within that range. This makes it possible both to reflect the user's preference and to optimize the threshold.

Further, since only the threshold candidates satisfying the constraint specified by the user are selected as the second threshold candidates, the final threshold reliably reflects the user's preference.
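The narrowing from the first threshold candidate group to the second can be sketched as a simple filter. The sketch below assumes that the constraint is given as closed ranges on TPR and TNR; the function name `filter_candidates`, the candidate labels, and the accuracy values are hypothetical.

```python
from typing import Dict, List, Tuple

Range = Tuple[float, float]  # (lower bound, upper bound), both inclusive


def filter_candidates(
    first_accuracy: Dict[str, Tuple[float, float]],
    tpr_range: Range = (0.0, 1.0),
    tnr_range: Range = (0.0, 1.0),
) -> List[str]:
    """Keep only the candidates whose (TPR, TNR) satisfies both ranges."""
    selected = []
    for name, (tpr, tnr) in first_accuracy.items():
        tpr_ok = tpr_range[0] <= tpr <= tpr_range[1]
        tnr_ok = tnr_range[0] <= tnr <= tnr_range[1]
        if tpr_ok and tnr_ok:
            selected.append(name)
    return selected


# Hypothetical first determination accuracies: label -> (TPR, TNR).
first_group = {"T1": (1.0, 0.4), "T2": (0.9, 0.7), "T3": (0.6, 0.95)}

# A user who wants to miss few abnormal samples might require TPR >= 0.8.
second_group = filter_candidates(first_group, tpr_range=(0.8, 1.0))
print(second_group)
```

Since the final threshold is chosen only from the filtered list, it cannot violate the specified ranges, whichever candidate the second determination accuracy favors.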

Further, since the threshold that is finally output is narrowed down to one, additional work such as having the user select the final threshold from a plurality of presented threshold candidates becomes unnecessary, and the user's effort can be reduced.

In addition, when the proportions of normal and abnormal samples in the data are highly imbalanced, the reliability of determination accuracies such as the correct answer accuracy or the F value is lowered. However, since TPR and TNR are not affected by the ratio of normal to abnormal samples, a highly reliable threshold can be generated in various situations.
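The effect of class imbalance can be checked with a small experiment. The sketch below assumes that a sample is judged abnormal when its degree of abnormality exceeds the threshold, and it uses the plain accuracy (correct judgments divided by all samples) as the comparison metric; the data and names are hypothetical.

```python
from typing import Sequence, Tuple


def rates(threshold: float,
          normal: Sequence[float],
          abnormal: Sequence[float]) -> Tuple[float, float, float]:
    """Return (TPR, TNR, accuracy) for one threshold."""
    tp = sum(a > threshold for a in abnormal)   # abnormal judged abnormal
    tn = sum(n <= threshold for n in normal)    # normal judged normal
    tpr = tp / len(abnormal)
    tnr = tn / len(normal)
    accuracy = (tp + tn) / (len(normal) + len(abnormal))
    return tpr, tnr, accuracy


# Hypothetical degrees of abnormality.
normal = [0.1, 0.2, 0.3, 0.6]
abnormal = [0.5, 0.7]

balanced = rates(0.65, normal, abnormal)
# Same score distribution, but 100 times as many normal samples.
imbalanced = rates(0.65, normal * 100, abnormal)

print(balanced)
print(imbalanced)
```

In this example the threshold misses half of the abnormal samples in both cases, so TPR and TNR are identical, yet the accuracy rises toward 1 as normal samples come to dominate the data.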

In addition, the second determination accuracy, "the smaller value of TPR and TNR", is always 0 for useless threshold candidates such as one that "determines every input sample to be normal" or one that "determines every input sample to be abnormal". It can therefore be expected that the selection of such useless threshold candidates is avoided and that practical thresholds are generated in various situations.
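This property is easy to confirm numerically. The following sketch places one threshold below every sample and one above every sample, again assuming that a sample is judged abnormal when its degree of abnormality exceeds the threshold; all names and values are hypothetical.

```python
from typing import Sequence


def min_tpr_tnr(threshold: float,
                normal: Sequence[float],
                abnormal: Sequence[float]) -> float:
    """Smaller value of TPR and TNR for one threshold."""
    tpr = sum(a > threshold for a in abnormal) / len(abnormal)
    tnr = sum(n <= threshold for n in normal) / len(normal)
    return min(tpr, tnr)


# Hypothetical degrees of abnormality.
normal = [0.1, 0.2, 0.3]
abnormal = [0.5, 0.7]
all_scores = normal + abnormal

always_abnormal = min(all_scores) - 1.0  # every sample exceeds this threshold
always_normal = max(all_scores) + 1.0    # no sample exceeds this threshold

print(min_tpr_tnr(always_abnormal, normal, abnormal))  # TNR is 0
print(min_tpr_tnr(always_normal, normal, abnormal))    # TPR is 0
print(min_tpr_tnr(0.4, normal, abnormal))              # separates the classes
```

Any candidate that classifies at least one sample of each class correctly scores above 0, so the two degenerate thresholds can win only if no such candidate exists.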

Furthermore, if the degrees of abnormality of the normal samples and the abnormal samples can be completely separated on the number line, the determination accuracy is 100% wherever the threshold is set between the normal sample closest to the abnormal samples and the abnormal sample closest to the normal samples. In such a case, the generalization performance for unknown samples can be maximized by setting the threshold midway between the two, as in the objective function optimized by a support vector machine.
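One way to obtain such a threshold is to place candidates midway between adjacent degrees of abnormality. The sketch below illustrates this idea; the function name `midpoint_candidates`, the removal of duplicate values, and the sample data are assumptions made for illustration.

```python
from typing import List, Sequence


def midpoint_candidates(scores: Sequence[float]) -> List[float]:
    """Return the midpoints between adjacent distinct degrees of abnormality."""
    ordered = sorted(set(scores))  # assumption: duplicates are collapsed
    return [(low + high) / 2 for low, high in zip(ordered, ordered[1:])]


# Hypothetical, completely separable degrees of abnormality.
normal = [1.0, 2.0, 3.0]
abnormal = [7.0, 9.0]

candidates = midpoint_candidates(normal + abnormal)
print(candidates)

# The candidate in the gap between the classes is equally far from both.
gap_candidate = (max(normal) + min(abnormal)) / 2
print(gap_candidate in candidates, gap_candidate - max(normal))
```

Here the candidate in the gap leaves the same margin to the nearest normal sample and to the nearest abnormal sample, which is the sense in which the choice resembles margin maximization.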

<< Modification example >>
FIG. 16 is a diagram showing an example of the hardware configuration of the threshold generation device 20 according to the present embodiment. As shown in FIG. 16, the threshold generation device 20 has a memory 32 that stores a program and a processor 31, such as a CPU (Central Processing Unit), that executes the program. The program can include a threshold generation program for implementing the threshold generation method according to the present embodiment. All or part of the functions of the threshold generation device 20 shown in FIG. 9 can be realized by the processor 31 executing the program. All or part of the functions of the threshold generation device 20 shown in FIG. 16 may be realized by a semiconductor integrated circuit. Further, the threshold generation device 20 may include a display 34 as display means and an input device 35 such as a mouse, a keyboard, or a touch panel, which serve as an interface for the user to specify a constraint on the determination accuracy, as well as a hard disk 33 as a storage device.

20 threshold generation device, 21 threshold candidate group generation unit, 22 constraint designation unit, 23 first threshold selection unit, 24 first determination accuracy calculation unit, 25 second threshold selection unit, 26 second determination accuracy calculation unit.

Claims

A threshold generation device comprising:
a threshold candidate group generation unit that generates a first threshold candidate group including one or more threshold candidates, based on a plurality of degrees of abnormality respectively assigned to a plurality of samples;
a first determination accuracy calculation unit that calculates a first determination accuracy for each of the one or more threshold candidates included in the first threshold candidate group;
a constraint designation unit that designates a constraint on the first determination accuracy;
a first threshold selection unit that selects one or more threshold candidates from the first threshold candidate group based on the constraint and generates a second threshold candidate group including the selected one or more threshold candidates;
a second determination accuracy calculation unit that calculates a second determination accuracy for each of the one or more threshold candidates included in the second threshold candidate group; and
a second threshold selection unit that outputs, as a final threshold, a threshold candidate selected from the second threshold candidate group based on the second determination accuracy.
The threshold generation device according to claim 1, wherein the constraint designating unit receives input of a numerical value and determines the constraint based on the numerical value.
The threshold generation device according to claim 1 or 2, wherein the first threshold selection unit selects, from the first threshold candidate group, threshold candidates satisfying the constraint and generates the second threshold candidate group including the one or more selected threshold candidates.
The threshold generation device according to any one of claims 1 to 3, wherein the second threshold selection unit selects, from the second threshold candidate group, the threshold candidate having the maximum second determination accuracy and outputs it as the final threshold.
The threshold generation device according to any one of claims 1 to 4, wherein
the plurality of samples include a normal sample and an abnormal sample, and
the first determination accuracy calculation unit outputs, as the first determination accuracy, a set of a TPR and a TNR calculated based on the plurality of degrees of abnormality.
The threshold generation device according to any one of claims 1 to 4, wherein
the plurality of samples include a normal sample and an abnormal sample, and
the second determination accuracy calculation unit outputs, as the second determination accuracy, the smaller value of a TPR and a TNR calculated based on the plurality of degrees of abnormality.
The threshold generation device according to claim 5, wherein the second determination accuracy calculation unit outputs the smaller value of the TPR and the TNR as the second determination accuracy.
The threshold generation device according to any one of claims 1 to 7, wherein the threshold candidate group generation unit sets each threshold candidate included in the first threshold candidate group midway between mutually adjacent degrees of abnormality among the plurality of degrees of abnormality arranged on a number line.
The threshold generation device according to any one of claims 1 to 8, wherein
the plurality of samples are a plurality of devices, and
the plurality of degrees of abnormality are intensities of sound or vibration emitted from the plurality of devices.
A threshold generation method comprising:
a step of generating a first threshold candidate group including one or more threshold candidates, based on a plurality of degrees of abnormality respectively assigned to a plurality of samples;
a step of calculating a first determination accuracy for each of the one or more threshold candidates included in the first threshold candidate group;
a step of designating a constraint on the first determination accuracy;
a step of selecting one or more threshold candidates from the first threshold candidate group based on the constraint and generating a second threshold candidate group including the selected one or more threshold candidates;
a step of calculating a second determination accuracy for each of the one or more threshold candidates included in the second threshold candidate group; and
a step of outputting, as a final threshold, a threshold candidate selected from the second threshold candidate group based on the second determination accuracy.
A threshold generation program that causes a computer to execute:
a process of generating a first threshold candidate group including one or more threshold candidates, based on a plurality of degrees of abnormality respectively assigned to a plurality of samples;
a process of calculating a first determination accuracy for each of the one or more threshold candidates included in the first threshold candidate group;
a process of designating a constraint on the first determination accuracy;
a process of selecting one or more threshold candidates from the first threshold candidate group based on the constraint and generating a second threshold candidate group including the selected one or more threshold candidates;
a process of calculating a second determination accuracy for each of the one or more threshold candidates included in the second threshold candidate group; and
a process of outputting, as a final threshold, a threshold candidate selected from the second threshold candidate group based on the second determination accuracy.