TW202121205A - Threshold value generation device, threshold value generation method, and threshold value generation program - Google Patents
- Publication number
- TW202121205A
- Authority
- TW
- Taiwan
- Prior art keywords
- threshold value
- candidate group
- aforementioned
- determination accuracy
- candidates
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Testing And Monitoring For Control Systems (AREA)
- Testing Of Devices, Machine Parts, Or Other Structures Thereof (AREA)
Abstract
Description
The present invention relates to a threshold value generation device, a threshold value generation method, and a recording medium on which a threshold value generation program is recorded.
Various techniques have been developed that measure the operating sound or vibration of a machine with an acoustic sensor or a vibration sensor and analyze the waveform to estimate the soundness of the machine. The MT (Mahalanobis-Taguchi) method is one of the most representative of these techniques (see, for example, Non-Patent Document 1). In the MT method, the distribution formed by a set of normal samples in the feature space is learned in advance as a reference space, and at determination time a sample is identified as normal or abnormal according to how far the observed feature vector deviates from the reference space.
[Prior Art Documents]
[Non-Patent Documents]
[Non-Patent Document 1] Kazuo Tatebayashi, "Introduction to the Taguchi Method" (in Japanese), pp. 167-185, JUSE Press, 2004.
[Non-Patent Document 2] Yuma Koizumi and four others, "Unsupervised Detection of Anomalous Sound based on Deep Learning and the Neyman-Pearson Lemma", IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2019.
[Problems to be Solved by the Invention]
What the MT method yields is the Mahalanobis distance, which expresses how far a sample deviates from the reference space. A small Mahalanobis distance means the sample is close to normal, and a large one means it is close to abnormal. In other words, the Mahalanobis distance can be interpreted as a value representing the degree of abnormality of the sample. However, no method has been established for setting the threshold value that separates normal from abnormal on this degree of abnormality, and in most cases trial and error is needed to find an appropriate threshold value.
This threshold-setting problem is not limited to classical techniques such as the MT method; it arises equally in modern techniques such as deep learning. For example, Non-Patent Document 2 describes a technique that uses a variational autoencoder to learn the latent distribution of the features of a set of normal samples and determines normal or abnormal from the degree of deviation from that distribution. However, what the variational autoencoder ultimately outputs is, as with the MT method, a value representing the degree of abnormality, and Non-Patent Document 2 does not specifically describe how an appropriate threshold value should be set for it.
To determine an appropriate threshold value for the degree of abnormality automatically, the following two problems should be solved.
The first is to establish a method for making the threshold value reflect the user's preference. In most cases, when the degrees of abnormality of a set of normal samples and a set of abnormal samples are plotted on a number line, as shown in FIG. 1, there is a region in which the degrees of abnormality of the normal samples (circles) and of the abnormal samples (crosses) are intermixed. In that case no threshold value separates normal from abnormal perfectly, and the trade-off between the false-detection rate of normal samples and the miss rate of abnormal samples has to be taken into account. Which of these criteria matters more differs from user to user. It is therefore desirable to establish a technique that lets the threshold value generation process reflect the user's preference.
The second is to establish a method for determining the optimal threshold value after the user's preference, that is, the constraint specified by the user, has been reflected. For example, to generate "a threshold value at which the miss rate of abnormal samples is 10%", a simple approach would be to enumerate various threshold value candidates and choose the one whose miss rate of abnormal samples is 10%, or closest to 10%. With this approach, however, when the degrees of abnormality of the normal and abnormal samples are completely separable on the number line, as shown in FIG. 2, and the miss rate of abnormal samples could be brought below 10% with no side effects, a clearly "better threshold value" exists but is not selected. The same problem arises when the region in which normal and abnormal degrees of abnormality are intermixed is small, or when few samples are available for determining the threshold value. It is therefore desirable to establish a technique that can determine an appropriate threshold value under any circumstances.
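The weakness of the simple "closest to 10%" approach can be illustrated with a small sketch. The sample values, the helper `miss_rate`, and the midpoint candidates are illustrative assumptions, not data from the patent; a sample is treated as abnormal when its degree of abnormality is at or above the threshold.

```python
def miss_rate(abnormal, threshold):
    """Fraction of abnormal samples judged normal (score below threshold)."""
    return sum(s < threshold for s in abnormal) / len(abnormal)

# Hypothetical, fully separable data in the spirit of FIG. 2.
normal = [1.0, 2.0, 3.0]
abnormal = [float(v) for v in range(10, 20)]  # 10.0 ... 19.0

# Candidate thresholds: midpoints between adjacent scores (an assumption here).
scores = sorted(normal + abnormal)
candidates = [(a + b) / 2 for a, b in zip(scores, scores[1:])]

# Naive rule: pick the candidate whose miss rate is closest to 10%.
naive_choice = min(candidates, key=lambda t: abs(miss_rate(abnormal, t) - 0.10))

# The naive rule lands inside the abnormal cluster and misses one sample,
# although 6.5 would miss nothing and raise no false detections.
print(naive_choice, miss_rate(abnormal, naive_choice), miss_rate(abnormal, 6.5))
```

With these numbers the naive rule settles on a threshold that deliberately misses one abnormal sample, which is the situation the paragraph describes.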
An object of the present invention is to obtain an appropriate threshold value based on a specified constraint.
[Means for Solving the Problems]
A threshold value generation device according to one aspect of the present invention includes: a threshold value candidate group generation unit that generates a first threshold value candidate group containing one or more threshold value candidates, based on a plurality of degrees of abnormality respectively assigned to a plurality of samples; a first determination accuracy calculation unit that calculates a first determination accuracy for each of the one or more threshold value candidates contained in the first threshold value candidate group; a constraint designation unit that designates a constraint on the first determination accuracy; a first threshold value selection unit that selects one or more threshold value candidates from the first threshold value candidate group based on the constraint and generates a second threshold value candidate group containing the selected one or more threshold value candidates; a second determination accuracy calculation unit that calculates a second determination accuracy for each of the one or more threshold value candidates contained in the second threshold value candidate group; and a second threshold value selection unit that outputs, as a final threshold value, a threshold value candidate selected from the second threshold value candidate group based on the second determination accuracy.
A threshold value generation method according to another aspect of the present invention includes the steps of: generating a first threshold value candidate group containing one or more threshold value candidates, based on a plurality of degrees of abnormality respectively assigned to a plurality of samples; calculating a first determination accuracy for each of the one or more threshold value candidates contained in the first threshold value candidate group; designating a constraint on the first determination accuracy; selecting one or more threshold value candidates from the first threshold value candidate group based on the constraint and generating a second threshold value candidate group containing the selected one or more threshold value candidates; calculating a second determination accuracy for each of the one or more threshold value candidates contained in the second threshold value candidate group; and outputting, as a final threshold value, a threshold value candidate selected from the second threshold value candidate group based on the second determination accuracy.
[Effects of the Invention]
According to the present invention, an appropriate threshold value based on the specified constraint can be obtained.
A threshold value generation device, a threshold value generation method, and a computer-readable recording medium on which a threshold value generation program is recorded, according to an embodiment of the present invention, are described below with reference to the drawings. The following embodiment is merely an example, and various modifications are possible within the scope of the present invention.
The threshold value generation device of this embodiment generates the threshold value used to determine whether a machine is in a normal state or an abnormal state. For example, it generates that threshold value from the machine's degree of abnormality, which is obtained by analyzing the waveform of the machine's operating sound or vibration detected with an acoustic sensor or a vibration sensor (that is, a measuring instrument). As a concrete application, this embodiment assumes that the threshold value generation device performs quality inspection of a motor, the machine in question, based on its vibration. The threshold value generation device of this embodiment detects the vibration produced by the motor, and if the degree of abnormality obtained from the analysis is equal to or greater than the threshold value, the motor is regarded as defective.
《Derivation of the degree of abnormality》
First, a concrete example of calculating the degree of abnormality from the vibration waveform of a motor, which serves as a sample, is described. The threshold value generation device of this embodiment applies a wavelet transform to the vibration waveform measured by the measuring instrument and thereby generates a spectrogram such as the one shown in FIG. 3. FIG. 3 shows an example of a spectrogram representing the vibration of a motor. In FIG. 3, the vertical axis represents time, the horizontal axis represents frequency, and the shading density represents the intensity of the vibration.
FIG. 4 schematically represents the spectrogram of FIG. 3 as a matrix. In FIG. 4, the vertical axis represents time and the horizontal axis represents frequency. Each square cell corresponds to one point on the spectrogram, and a power value is assigned to each cell. For convenience of drawing, the time and frequency resolutions of FIG. 3 and FIG. 4 differ.
FIG. 5 shows an example in which the matrix of FIG. 4 is regarded as feature vectors, each quantifying the timbre at one time instant. In FIG. 5, the vertical axis represents time and the horizontal axis represents frequency. As FIG. 5 shows, the matrix of FIG. 4 can be regarded as a set of feature vectors, one per time instant, each quantifying the timbre at that instant. A plurality of feature vectors, as shown in FIG. 5, is thus generated from the vibration waveform of one motor.
The technique for obtaining feature vectors from the vibration waveform is not limited to the wavelet transform described above. For example, a Fourier transform, filter bank analysis, cepstrum analysis, or LPC (Linear Predictive Coefficient) analysis may be used. A feature vector may also be formed by combining various acoustic features, such as the peak value, the RMS (root mean square) value, and the fundamental frequency.
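As a rough illustration of combining acoustic features into a feature vector, the sketch below computes two of the features named above, the peak value and the RMS value, for one frame of a waveform. The function names and the sample frame are hypothetical; the patent does not prescribe how these features are computed.

```python
import math

def peak_value(frame):
    """Largest absolute amplitude in the frame."""
    return max(abs(x) for x in frame)

def rms_value(frame):
    """Root mean square of the frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

# Hypothetical frame of a vibration waveform.
frame = [3.0, -4.0, 3.0, -4.0]

# A two-element feature vector built from simple acoustic features.
feature_vector = [peak_value(frame), rms_value(frame)]
print(feature_vector)
```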
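FIG. 6 shows an abnormality degree model learner 10 that receives a set of feature vectors obtained from samples of normal motors (normal samples) and samples of abnormal motors (abnormal samples) and generates an abnormality degree model. The set of feature vectors obtained by analyzing a plurality of samples, both normal and abnormal, as shown in FIGS. 3 to 5, is input to the abnormality degree model learner 10. An abnormality degree model takes one feature vector as input, applies some transformation to it, and outputs a single degree of abnormality. The abnormality degree model learner 10 builds such a model from the given set of feature vectors. As a concrete example, the case in which the abnormality degree model learner 10 uses linear discriminant analysis (LDA) is described below.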
LDA is a technique that is given a set of normal-class feature vectors and a set of abnormal-class feature vectors as training data and, based on the distribution of these feature vectors, finds the projection vector that most emphasizes the difference between normal and abnormal. The abnormality degree model learner 10 first computes the mean vector $\mu$ of all feature vectors in both the normal class and the abnormal class. Likewise, it computes the standard deviation of each element over all feature vectors; let $\sigma$ denote the vector holding these element-wise standard deviations. When the feature vectors are $N$-dimensional ($N$ is a positive integer), both $\mu$ and $\sigma$ are $N$-dimensional.
Next, the abnormality degree model learner 10 normalizes all feature vectors using the vectors $\mu$ and $\sigma$. Normalization means adjusting the vectors so that each element has mean 0 and standard deviation 1.
Specifically, normalizing a vector $a$ means computing $(a - \mu) \div \sigma$, where "$\div$" denotes element-wise division.
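The normalization step can be sketched as follows. This is a minimal sketch assuming the population standard deviation and assuming that no element has zero spread; the function names `fit_normalizer` and `normalize` and the sample vectors are illustrative only.

```python
from statistics import mean, pstdev

def fit_normalizer(vectors):
    """Element-wise mean (mu) and standard deviation (sigma) over all vectors."""
    columns = list(zip(*vectors))
    mu = [mean(c) for c in columns]
    sigma = [pstdev(c) for c in columns]  # assumption: population std, nonzero
    return mu, sigma

def normalize(a, mu, sigma):
    """Element-wise (a - mu) / sigma."""
    return [(x - m) / s for x, m, s in zip(a, mu, sigma)]

# Hypothetical 2-dimensional feature vectors from both classes.
vectors = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
mu, sigma = fit_normalizer(vectors)
normalized = [normalize(v, mu, sigma) for v in vectors]
print(normalized)
```

After this step each element of the normalized vectors has mean 0 and standard deviation 1 across the set, as the text requires.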
After normalizing all feature vectors in this way, the abnormality degree model learner 10 computes the mean vector of the normal class, $\mu_1$, and the mean vector of the abnormal class, $\mu_2$.
Likewise, using the mean vector of the normal class and the mean vector of the abnormal class, the abnormality degree model learner 10 computes the covariance matrices $\Sigma_1$ and $\Sigma_2$ of the two classes.
Finally, the abnormality degree model learner 10 obtains the projection vector $w$ from equations (1) and (2) below.
[Math 9: equations (1) and (2); equation image not reproduced in this text]
When a feature vector $x$ is input, it is converted into a single degree of abnormality $d$ by equation (3) below. Equation (3) corresponds to the abnormality degree model in LDA.
[Math 11: equation (3); equation image not reproduced in this text]
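The equation images for (1) to (3) are not available in this text. For orientation only, the block below writes out the textbook two-class Fisher LDA using the symbols defined above, with $\mu_1, \Sigma_1$ for the normal class and $\mu_2, \Sigma_2$ for the abnormal class. This is an assumption: the patent's own equations may differ in weighting, scaling, or sign, and $\Sigma_W$ is a symbol introduced here for the within-class scatter.

```latex
% Assumed textbook form, not a reproduction of the patent's equations (1)-(3).
\begin{align}
  \Sigma_W &= \Sigma_1 + \Sigma_2 \\
  w &= \Sigma_W^{-1}\,(\mu_2 - \mu_1) \\
  d &= w^{\top}\bigl((x - \mu) \div \sigma\bigr)
\end{align}
```

With this sign convention a larger $d$ means the normalized feature vector lies further toward the abnormal class, which matches the use of $d$ as a degree of abnormality.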
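The analysis method used by the abnormality degree model learner 10 is not limited to LDA. For example, a support vector machine, a neural network, or a Gaussian mixture model may be used. Although the abnormality degree model learner 10 described here uses both the normal class and the abnormal class as training data, a technique that learns from data of a single class may also be adopted. In that case the abnormality degree model learner 10 may use, for example, the MT method, principal component analysis (PCA), an autoencoder, or a one-class support vector machine.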
FIG. 7 shows that, in the example of FIGS. 3 to 6, a plurality of degrees of abnormality is obtained from one motor, that is, one sample. The abnormality degree model converts a single feature vector into a single degree of abnormality, and a plurality of feature vectors is obtained from the vibration waveform of one motor, as shown in FIG. 5. Consequently, a plurality of degrees of abnormality is obtained from one motor, as shown in FIG. 7.
FIG. 8 shows that one representative value is calculated from the plurality of degrees of abnormality obtained from one motor (one sample) and is assigned as the degree of abnormality of that motor. To simplify the final normal/abnormal decision, the degrees of abnormality obtained from one motor (FIG. 7) are aggregated in some way, and the resulting single representative value is assigned as the degree of abnormality of the motor. The simplest choice of representative value is the mean of the degrees of abnormality. Any other statistic, such as the maximum, the standard deviation, or the mode, may also be used.
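Aggregating the per-time-instant degrees of abnormality into one representative value could look like the sketch below. The dictionary `AGGREGATORS`, the function name, and the sample values are hypothetical; the statistics offered are a subset of those mentioned in the text.

```python
from statistics import mean, pstdev

# Statistic choices named in the text; the keys are illustrative labels.
AGGREGATORS = {"mean": mean, "max": max, "std": pstdev}

def representative_value(anomaly_degrees, how="mean"):
    """Collapse the degrees of abnormality of one sample into a single value."""
    return AGGREGATORS[how](anomaly_degrees)

# Hypothetical degrees of abnormality for one motor, one per time instant.
per_instant = [0.2, 0.4, 0.9, 0.5]
motor_score = representative_value(per_instant)          # mean
motor_peak = representative_value(per_instant, "max")   # maximum
print(motor_score, motor_peak)
```

Using the maximum instead of the mean makes the score sensitive to a single short burst, which may or may not be desirable depending on the failure mode.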
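《Threshold value generation device 20》
The following describes a threshold value generation device and method for the case in which one degree of abnormality has been assigned to each sample, that is, each motor, by the method described above, so that as many degrees of abnormality as there are motors are obtained from a plurality of motors, as shown in FIGS. 1 and 2. The device and method use those degrees of abnormality to automatically generate an appropriate threshold value that reflects the user's preference.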
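FIG. 9 is a block diagram schematically showing the configuration of the threshold value generation device 20 of this embodiment. The threshold value generation device 20 is a device capable of carrying out the threshold value generation method of this embodiment. It includes a threshold value candidate group generation unit 21, a constraint designation unit 22, a first threshold value selection unit 23, a first determination accuracy calculation unit 24, a second threshold value selection unit 25, and a second determination accuracy calculation unit 26.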
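The threshold value candidate group generation unit 21 generates a first threshold value candidate group containing one or more threshold value candidates, based on the degrees of abnormality respectively assigned to a plurality of samples. Here, the samples are motors. The first determination accuracy calculation unit 24 calculates a first determination accuracy for each of the threshold value candidates contained in the first threshold value candidate group.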
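The constraint designation unit 22 designates a constraint on the first determination accuracy. For example, it accepts a numerical input, typically entered by the user, and determines the constraint from that value. The first threshold value selection unit 23 selects one or more threshold value candidates from the first threshold value candidate group based on the constraint and generates a second threshold value candidate group containing the selected candidates. For example, it selects from the first threshold value candidate group the candidates that satisfy the constraint and places them in the second threshold value candidate group.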
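The second determination accuracy calculation unit 26 calculates a second determination accuracy for each of the threshold value candidates contained in the second threshold value candidate group. The second threshold value selection unit 25 outputs, as the final threshold value, a candidate selected from the second threshold value candidate group based on the second determination accuracy. For example, it selects the candidate with the largest second determination accuracy and outputs it as the final threshold value.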
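The flow through units 21 to 26 can be sketched end to end as follows. This is a minimal sketch under stated assumptions: candidates are midpoints between adjacent scores, the first determination accuracy is the pair (TPR, TNR), the second is the smaller of the two, and a score at or above the threshold is judged abnormal. The function names and sample data are hypothetical.

```python
def rates(normal, abnormal, threshold):
    """Return (TPR, TNR) for a threshold; score >= threshold means abnormal."""
    tpr = sum(s >= threshold for s in abnormal) / len(abnormal)
    tnr = sum(s < threshold for s in normal) / len(normal)
    return tpr, tnr

def generate_threshold(normal, abnormal, metric, minimum):
    """Two-stage selection: filter by constraint, then maximize min(TPR, TNR)."""
    scores = sorted(normal + abnormal)
    # Unit 21: first candidate group (midpoints of adjacent scores).
    first_group = [(a + b) / 2 for a, b in zip(scores, scores[1:])]
    # Units 23 and 24: keep candidates whose TPR or TNR meets the constraint.
    idx = 0 if metric == "TPR" else 1
    second_group = [t for t in first_group
                    if rates(normal, abnormal, t)[idx] >= minimum]
    if not second_group:
        return None  # no candidate satisfies the constraint
    # Units 25 and 26: pick the candidate with the largest min(TPR, TNR).
    return max(second_group, key=lambda t: min(rates(normal, abnormal, t)))

# Hypothetical degrees of abnormality with an overlapping region.
normal = [1.0, 2.0, 3.0, 5.0]
abnormal = [4.0, 6.0, 7.0, 8.0, 9.0]
final = generate_threshold(normal, abnormal, "TPR", 0.8)
print(final)
```

Returning `None` when no candidate satisfies the constraint is a design choice of this sketch; the text does not say how that case is handled.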
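《Constraint designation unit 22》
The constraint on the first determination accuracy designated by the constraint designation unit 22 is determined, for example, from a numerical value entered by the user. The constraint is, for example, one of the conditions below. The user selects one of constraints (A1) to (A4) and can freely specify the value E in it.
(A1) Keep misses of abnormal samples at or below E%.
(A2) Keep false detections of normal samples at or below E%.
(A3) Keep the detection rate of abnormal samples at or above E%.
(A4) Keep the detection rate of normal samples at or above E%.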
A user who gives priority to avoiding misses of abnormal samples can select constraint (A1) and set E to a small value. A user who gives priority to avoiding false detections of normal samples can select constraint (A2) and set E to a small value.
Here, two quantities are introduced: TPR (true positive rate) and TNR (true negative rate). "Positive" means the non-normal case, that is, the target to be detected, so here "positive" corresponds to abnormal samples. In other words, TPR is the proportion of abnormal samples that the system determines to be abnormal, and TNR is the proportion of normal samples that the system determines to be normal.
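The two rates can be computed directly from the degrees of abnormality and a threshold. The sketch below assumes, as in the motor example, that a sample is judged abnormal when its degree of abnormality is at or above the threshold; the function name and the sample values are illustrative.

```python
def tpr_tnr(normal_scores, abnormal_scores, threshold):
    """TPR: share of abnormal samples judged abnormal.
    TNR: share of normal samples judged normal."""
    true_positives = sum(s >= threshold for s in abnormal_scores)
    true_negatives = sum(s < threshold for s in normal_scores)
    return (true_positives / len(abnormal_scores),
            true_negatives / len(normal_scores))

# Hypothetical degrees of abnormality.
normal = [0.1, 0.2, 0.3, 0.6]
abnormal = [0.5, 0.7, 0.8, 0.9]
tpr, tnr = tpr_tnr(normal, abnormal, threshold=0.55)
print(tpr, tnr)
```

The miss rate of abnormal samples is then `1 - tpr`, and the false-detection rate of normal samples is `1 - tnr`.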
The problem of selecting the optimal threshold value that satisfies a constraint designated as above can be restated as the problem of selecting the optimal threshold value under a constraint of the form "keep one of TPR and TNR at or above a given value". For example, constraint (A1) with "keep misses of abnormal samples at or below 10%" is realized by selecting the optimal threshold value under the constraint "keep TPR at or above 90%". In the same way, the four constraints (A1) to (A4) can be rewritten as constraints (B1) to (B4) below; constraints (A1) to (A4) are equivalent to constraints (B1) to (B4), respectively.
(B1) Keep TPR at or above (100-E)%.
(B2) Keep TNR at or above (100-E)%.
(B3) Keep TPR at or above E%.
(B4) Keep TNR at or above E%.
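The correspondence between (A1) to (A4) and (B1) to (B4) is a simple lookup. In the sketch below, the function name and the tuple representation `(metric, minimum percent)` are assumptions chosen for illustration.

```python
def to_rate_constraint(kind, e_percent):
    """Map a user constraint (A1..A4 with value E) to (metric, minimum percent)."""
    table = {
        "A1": ("TPR", 100 - e_percent),  # misses of abnormal samples <= E%
        "A2": ("TNR", 100 - e_percent),  # false detections of normal <= E%
        "A3": ("TPR", e_percent),        # detection rate of abnormal >= E%
        "A4": ("TNR", e_percent),        # detection rate of normal >= E%
    }
    return table[kind]

print(to_rate_constraint("A1", 10))  # misses <= 10% becomes TPR >= 90%
```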
The following describes the case in which constraint (A1) is selected and E = 20% is entered, that is, the constraint "keep misses of abnormal samples at or below 20%" is designated. This constraint (A1) can be rewritten as constraint (B1) on TPR and TNR, namely "keep TPR at or above 80%".
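《Threshold value candidate group generation unit 21》
FIG. 10 shows an example of threshold value candidates C1 to C13 contained in the first threshold value candidate group generated by the threshold value candidate group generation unit 21. FIG. 11 shows another example, threshold value candidates C21 to C25 contained in the first threshold value candidate group. The threshold value candidate group generation unit 21 generates a first threshold value candidate group containing one or more threshold value candidates so that a threshold value satisfying the designated constraint can be obtained. Various ways of generating candidates are conceivable; one example, shown in FIG. 10, is to take the midpoint between every pair of adjacent degrees of abnormality plotted on the number line as a threshold value candidate. That is, when the degrees of abnormality of m samples are given (m is a positive integer), m-1 threshold value candidates are generated. In FIG. 10, the degrees of abnormality of 14 samples are given, and 13 threshold value candidates C1 to C13 result. An advantage of this method is that, when the degrees of abnormality of the normal samples and of the abnormal samples are far apart as in FIG. 11, it generates the candidate C23 that maximizes the margin between the two. This improves generalization to unknown samples.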
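The midpoint rule described above can be sketched in a few lines. The function name and the sample values are illustrative; the sketch assumes at least two samples.

```python
def generate_candidates(anomaly_degrees):
    """Midpoints between adjacent degrees of abnormality on the number line."""
    ordered = sorted(anomaly_degrees)
    return [(low + high) / 2 for low, high in zip(ordered, ordered[1:])]

# Hypothetical, well-separated data in the spirit of FIG. 11:
# two low (normal) scores and two high (abnormal) scores.
degrees = [9.0, 1.0, 8.0, 2.0]
candidates = generate_candidates(degrees)
print(candidates)  # m = 4 samples give m - 1 = 3 candidates
```

The middle candidate sits halfway across the gap between the two clusters, which is the maximum-margin position the text refers to.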
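《First threshold value selection unit 23 and first determination accuracy calculation unit 24》
FIG. 12 shows, as Table 1, an example of the first determination accuracy calculated by the first determination accuracy calculation unit 24 and the constraint designated by the constraint designation unit 22. For the first threshold value candidate group obtained as above, the first threshold value selection unit 23 uses the first determination accuracy calculation unit 24 to obtain the first determination accuracy of each threshold value candidate. Here, the pair (TPR, TNR) is used as a concrete example of the first determination accuracy. In the example of FIG. 12, the pair of TPR and TNR is obtained as the first determination accuracy for each of the threshold value candidates C1 to C13. Of these candidates, those satisfying the aforementioned constraint "keep TPR at or above 80%" are selected and output as the second threshold value candidate group.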
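FIG. 13 is a flowchart showing the operation of the first threshold value selection unit 23 and the first determination accuracy calculation unit 24. The first threshold value selection unit 23 selects one threshold value candidate that has not yet been selected from the first threshold value candidate group (step S11), and the first determination accuracy calculation unit 24 calculates the first determination accuracy for the selected candidate (step S12).
Next, the first threshold value selection unit 23 determines whether the first determination accuracy satisfies the designated constraint. If it does (YES in step S13), the candidate is added to the second threshold value candidate group (step S14), and the unit then determines whether all candidates have been selected (step S15). If the first determination accuracy does not satisfy the designated constraint (NO in step S13), the candidate is not added to the second threshold value candidate group, and the unit proceeds to determine whether all candidates have been selected (step S15).
If all candidates have been selected (YES in step S15), the first threshold value selection unit 23 outputs the second threshold value candidate group to the second threshold value selection unit 25 (step S16). If unselected candidates remain (NO in step S15), processing returns to step S11.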
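Steps S11 to S16 amount to a filter over the first candidate group. The sketch below follows the flowchart loop literally; the function names, the `(metric, minimum)` form of the constraint, and the sample data are assumptions for illustration, with a score at or above the threshold judged abnormal.

```python
def tpr_tnr(normal, abnormal, threshold):
    """Return (TPR, TNR); score >= threshold means abnormal."""
    tpr = sum(s >= threshold for s in abnormal) / len(abnormal)
    tnr = sum(s < threshold for s in normal) / len(normal)
    return tpr, tnr

def select_by_constraint(first_group, normal, abnormal, metric, minimum):
    second_group = []
    for candidate in first_group:                        # S11: next candidate
        tpr, tnr = tpr_tnr(normal, abnormal, candidate)  # S12: first accuracy
        value = tpr if metric == "TPR" else tnr
        if value >= minimum:                             # S13: constraint met?
            second_group.append(candidate)               # S14: keep it
    return second_group                                  # S15/S16: all done

# Hypothetical data; candidates are midpoints of adjacent scores.
normal = [1.0, 2.0, 3.0, 5.0]
abnormal = [4.0, 6.0, 7.0, 8.0, 9.0]
scores = sorted(normal + abnormal)
first_group = [(a + b) / 2 for a, b in zip(scores, scores[1:])]

# Constraint (B1) with E = 20: keep TPR at or above 80%.
second_group = select_by_constraint(first_group, normal, abnormal, "TPR", 0.8)
print(second_group)
```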
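《Second threshold value selection unit 25 and second determination accuracy calculation unit 26》
FIG. 14 shows, as Table 2, an example of the second determination accuracy calculated by the second determination accuracy calculation unit 26. The second determination accuracy calculation unit 26 obtains the second determination accuracy for the threshold value candidates C1 to C6 shown in FIG. 14. Because the second threshold value selection unit 25 has to select a single final threshold value from the second threshold value candidate group, these candidates are evaluated with a measure different from the first determination accuracy. This evaluation is performed by the second determination accuracy calculation unit 26. Here, "the smaller of TPR and TNR" is used as a concrete example of the second determination accuracy. In the example of FIG. 14, the candidate with the highest second determination accuracy is C6, so candidate C6 is output as the final threshold value.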
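FIG. 15 is a flowchart showing the operation of the second threshold value selection unit 25 and the second determination accuracy calculation unit 26. The second threshold value selection unit 25 selects one threshold value candidate that has not yet been selected from the second threshold value candidate group (step S21), and the second determination accuracy calculation unit 26 calculates the second determination accuracy for the selected candidate (step S22).
Next, the second threshold value selection unit 25 determines whether the second determination accuracy is greater than the maximum second determination accuracy stored in memory (step S23). If it is greater (YES in step S23), the unit stores it as the new maximum (step S24) and then determines whether all candidates have been selected (step S25). If it is not greater than the stored maximum (NO in step S23), the maximum is left unchanged, and the unit proceeds to determine whether all candidates have been selected (step S25).
If all candidates have been selected (YES in step S25), the second threshold value selection unit 25 outputs the candidate with the largest second determination accuracy as the final threshold value (step S26). If unselected candidates remain (NO in step S25), processing returns to step S21.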
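Steps S21 to S26 are a running-maximum search over the second candidate group. The sketch below assumes the second determination accuracy is the smaller of TPR and TNR, as in the example above; the function names and the data are hypothetical, and a score at or above the threshold is judged abnormal.

```python
def tpr_tnr(normal, abnormal, threshold):
    """Return (TPR, TNR); score >= threshold means abnormal."""
    tpr = sum(s >= threshold for s in abnormal) / len(abnormal)
    tnr = sum(s < threshold for s in normal) / len(normal)
    return tpr, tnr

def select_final(second_group, normal, abnormal):
    best_candidate, best_score = None, -1.0              # stored maximum
    for candidate in second_group:                       # S21: next candidate
        score = min(tpr_tnr(normal, abnormal, candidate))  # S22: second accuracy
        if score > best_score:                           # S23: new maximum?
            best_candidate, best_score = candidate, score  # S24: update
    return best_candidate, best_score                    # S25/S26: output

# Hypothetical data and a second candidate group that satisfies TPR >= 80%.
normal = [1.0, 2.0, 3.0, 5.0]
abnormal = [4.0, 6.0, 7.0, 8.0, 9.0]
second_group = [1.5, 2.5, 3.5, 4.5, 5.5]
final_threshold, final_score = select_final(second_group, normal, abnormal)
print(final_threshold, final_score)
```

Because the comparison in S23 is strict, the first of several tied candidates is kept; the text does not specify a tie-breaking rule, so this is a property of the sketch only.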
The evaluation measures used as the first and second determination accuracies need not be based on TPR or TNR. For example, any statistic such as accuracy, precision, or the F-measure (F-score), or a combination of these, may be used.
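《Effects》
As described above, with the threshold value generation device 20 of this embodiment, the user's preference is reflected in the threshold value in the form of a constraint on the first determination accuracy, and an appropriate threshold value is selected, using the second determination accuracy, from among the candidates that satisfy the constraint. An appropriate threshold value can thus be selected while the user's preference is reflected.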
In addition, specifying the range of values that the determination accuracy may take is intuitive for the user, so little user effort is spent on adjusting the threshold value. Moreover, because the constraint takes the form of a range of values, the system is left room to choose a more appropriate threshold value within that range. Reflecting the user's preference and optimizing the threshold value can therefore both be achieved.
Furthermore, only threshold value candidates that satisfy the constraint designated by the user are selected into the second threshold value candidate group, so the final threshold value reliably reflects the user's preference.
Furthermore, exactly one threshold value is finally output, so the user does not need additional work such as choosing the final threshold value from several presented candidates, which reduces the user's effort.
Furthermore, when the proportions of normal and abnormal samples in the data differ greatly, determination accuracies such as accuracy or the F-measure become less reliable. TPR and TNR, in contrast, are not affected by the ratio of normal to abnormal samples, so a highly reliable threshold value can be generated in a variety of situations.
Furthermore, "the smaller of TPR and TNR", used as the second determination accuracy, is always 0 for unhelpful threshold value candidates such as one that judges every input sample normal or one that judges every input sample abnormal. Selection of such unhelpful candidates is therefore avoided, and practical threshold values can be expected in a variety of situations.
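Both points, the weakness of plain accuracy under class imbalance and the zero score of degenerate thresholds, can be checked with a small sketch. The data (98 normal and 2 abnormal samples) and the function name `evaluate` are hypothetical; a score at or above the threshold is judged abnormal.

```python
def evaluate(normal, abnormal, threshold):
    """Return (accuracy, TPR, TNR) for one threshold."""
    tp = sum(s >= threshold for s in abnormal)
    tn = sum(s < threshold for s in normal)
    accuracy = (tp + tn) / (len(normal) + len(abnormal))
    return accuracy, tp / len(abnormal), tn / len(normal)

# Heavily imbalanced hypothetical data.
normal = [0.1] * 98
abnormal = [0.9] * 2

always_normal = evaluate(normal, abnormal, 1.0)    # nothing is flagged
always_abnormal = evaluate(normal, abnormal, 0.0)  # everything is flagged
useful = evaluate(normal, abnormal, 0.5)           # separates the classes

# Accuracy looks fine for the useless "always normal" threshold,
# while min(TPR, TNR) exposes it.
print(always_normal[0], min(always_normal[1:]))
print(always_abnormal[0], min(always_abnormal[1:]))
print(useful[0], min(useful[1:]))
```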
Furthermore, when the degrees of abnormality of normal and abnormal samples are completely separable on the number line, a threshold value set anywhere between the normal sample closest to the abnormal samples and the abnormal sample closest to the normal samples gives a determination accuracy of 100%. In that case, as with the objective function used to optimize a support vector machine, setting the threshold value exactly midway between the two maximizes generalization to unknown samples.
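《Modification》
FIG. 16 shows an example of the hardware configuration of the threshold value generation device 20 of this embodiment. As shown in FIG. 16, the threshold value generation device 20 includes a memory 32 that stores a program and a processor 31, such as a CPU (Central Processing Unit), that executes the program. The program may include a threshold value generation program for carrying out the threshold value generation method of this embodiment. All or part of the functions of the threshold value generation device 20 shown in FIG. 9 can be implemented by the processor 31 executing the program. All or part of the functions of the threshold value generation device 20 shown in FIG. 16 may also be implemented by a semiconductor integrated circuit. The threshold value generation device 20 may further include a display 34 serving as display means and as an interface through which the user designates the constraint on the determination accuracy; an input device 35 such as a mouse, a keyboard, or a touch panel; and a hard disk 33 serving as a storage device.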
10: abnormality degree model learner
20: threshold value generation device
21: threshold value candidate group generation unit
22: constraint designation unit
23: first threshold value selection unit
24: first determination accuracy calculation unit
25: second threshold value selection unit
26: second determination accuracy calculation unit
31: processor
32: memory
33: hard disk
34: display
35: input device
[FIG. 1] A diagram showing an example in which the degrees of abnormality of normal samples and abnormal samples are plotted on a number line.
[FIG. 2] A diagram showing another example in which the degrees of abnormality of normal samples and abnormal samples are plotted on a number line.
[FIG. 3] A diagram showing an example of a spectrogram of the vibration of a motor.
[FIG. 4] A diagram schematically representing the spectrogram of FIG. 3 as a matrix.
[FIG. 5] A diagram showing an example in which the matrix of FIG. 4 is regarded as feature vectors of the timbre at each time instant.
[FIG. 6] A diagram showing an abnormality degree model learner that receives a set of feature vectors obtained from normal samples and abnormal samples and generates an abnormality degree model.
[FIG. 7] A diagram showing that, in the example of FIGS. 3 to 6, a plurality of degrees of abnormality is obtained from one sample.
[FIG. 8] A diagram showing that one representative value is calculated from the plurality of degrees of abnormality obtained from one sample and is assigned as the degree of abnormality of the sample.
[FIG. 9] A functional block diagram schematically showing the configuration of the threshold value generation device of the embodiment of the present invention.
[FIG. 10] A diagram showing an example of threshold value candidates contained in the first threshold value candidate group generated by the threshold value candidate group generation unit.
[FIG. 11] A diagram showing another example of threshold value candidates contained in the first threshold value candidate group generated by the threshold value candidate group generation unit.
[FIG. 12] A diagram showing, as Table 1, an example of the first determination accuracy calculated by the first determination accuracy calculation unit and the constraint designated by the constraint designation unit.
[FIG. 13] A flowchart showing the operation of the first threshold value selection unit and the first determination accuracy calculation unit.
[FIG. 14] A diagram showing, as Table 2, an example of the second determination accuracy calculated by the second determination accuracy calculation unit.
[FIG. 15] A flowchart showing the operation of the second threshold value selection unit and the second determination accuracy calculation unit.
[FIG. 16] A diagram showing an example of the hardware configuration of the threshold value generation device of the embodiment.
20: Threshold value generation device
21: Threshold value candidate group generation unit
22: Constraint designation unit
23: First threshold value selection unit
24: First determination accuracy calculation unit
25: Second threshold value selection unit
26: Second determination accuracy calculation unit
Claims (11)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
WOPCT/JP2019/044818 | 2019-11-15 | ||
PCT/JP2019/044818 WO2021095222A1 (en) | 2019-11-15 | 2019-11-15 | Threshold value generation device, threshold value generation method, and threshold value generation program |
Publications (1)
Publication Number | Publication Date |
---|---|
TW202121205A (en) | 2021-06-01 |
Family
ID=75913026
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW109114016A TW202121205A (en) | 2019-11-15 | 2020-04-27 | Threshold value generation device, threshold value generation method, and threshold value generation program |
Country Status (3)
Country | Link |
---|---|
JP (1) | JP7012913B2 (en) |
TW (1) | TW202121205A (en) |
WO (1) | WO2021095222A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP7457752B2 (en) | 2022-06-15 | 2024-03-28 | 株式会社安川電機 | Data analysis system, data analysis method, and program |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170032247A1 (en) | 2015-07-31 | 2017-02-02 | Qualcomm Incorporated | Media classification |
JP6588495B2 (en) | 2017-05-01 | 2019-10-09 | 日本電信電話株式会社 | Analysis system, setting method and setting program |
2019
- 2019-11-15 JP JP2021555739A patent/JP7012913B2/en active Active
- 2019-11-15 WO PCT/JP2019/044818 patent/WO2021095222A1/en active Application Filing
2020
- 2020-04-27 TW TW109114016A patent/TW202121205A/en unknown
Also Published As
Publication number | Publication date |
---|---|
JP7012913B2 (en) | 2022-01-28 |
JPWO2021095222A1 (en) | 2021-05-20 |
WO2021095222A1 (en) | 2021-05-20 |