JP4431988B2 - Knowledge creating apparatus and knowledge creating method

Info

Publication number: JP4431988B2
Application number: JP2005207910A
Authority: JP
Inventors: 裕司平山; 敦清水
Original assignee: Omron Corp
Current assignee: Omron Corp
Priority date: 2005-07-15
Filing date: 2005-07-15
Publication date: 2010-03-17
Anticipated expiration: 2025-07-15
Also published as: JP2007024697A

Description

The present invention relates to a knowledge creating apparatus and a knowledge creating method. More specifically, it relates to a technique for generating the inspection logic (effective feature quantities, parameter adjustment of feature quantities, and the like) of an inspection apparatus that extracts feature quantities from input measurement data of an inspection target and determines a state based on the extracted feature quantities.

Automobiles, home appliances, and similar products make very heavy use of rotating equipment that incorporates drive-system components such as motors. In an automobile, for example, rotating equipment is mounted throughout: in the engine, power steering, power seats, transmission, and elsewhere. Home appliances containing such equipment include refrigerators, air conditioners, washing machines, and various other products. When such rotating equipment actually operates, sound is generated as the motor or other component rotates.

Some of these sounds arise inevitably from normal operation, while others arise from defects (failures). Causes of abnormal sound associated with defects include bearing abnormalities, abnormal internal contact, imbalance, and foreign matter contamination. More specifically, there are abnormal sounds that occur once per gear revolution, caused by a chipped gear tooth, a trapped foreign object, a spot flaw, or the rotating and stationary parts inside a motor rubbing against each other for an instant during rotation. In addition, sounds that people find unpleasant occur across the audible range of 20 Hz to 20 kHz; one example is a sound at about 15 kHz. A sound containing such a predetermined frequency component is also treated as an abnormal sound. Of course, abnormal sounds are not limited to this frequency.

Sound associated with such defects is not only unpleasant but may also lead to further failures. For quality assurance of each of these products, production plants therefore usually carry out a “sensory inspection”, in which an inspector relies on the five senses, such as hearing and touch, to judge whether abnormal sound is present. Specifically, the inspector listens by ear or touches the product by hand to check the vibration. Sensory inspection is defined in JIS Z 8144 (sensory inspection terminology).

However, sensory inspection that relies on an inspector's five senses not only requires skilled technique, but its results also vary widely between individuals and over time. Furthermore, the results are difficult to record as data or to quantify, which makes them difficult to manage. To solve these problems, abnormal sound inspection apparatuses have been developed for detecting abnormalities in products containing drive-system components, with the aim of stable inspection based on quantitative and clear criteria.

Patent Document 1 discloses a conventional abnormal sound inspection apparatus that automatically performs an inspection (so-called abnormal sound inspection) to discriminate normal from abnormal based on a vibration waveform obtained from the inspection target. The invention disclosed in Patent Document 1 comprehensively discriminates whether the inspection target is normal or abnormal by using feature quantities obtained from the time-axis waveform together with feature quantities obtained from the frequency-axis waveform.

The reason for performing abnormal sound inspection comprehensively, based on waveforms obtained from different axes such as the time axis and the frequency axis, is as follows. With the previously developed inspections that used only feature quantities obtained from the time-axis waveform, or only feature quantities obtained from the frequency-axis waveform, it is difficult to detect every abnormal sound, because each feature quantity has its strengths and weaknesses. Abnormal sound inspection using multiple feature quantities has higher discrimination capability than inspection using a single feature quantity.

To begin with, a drive-system component consists of mechanisms that repeat rotational or reciprocating motion, and if such a mechanism has even a slight mechanical abnormality, an abnormal component caused by it (a component that differs in some way from the normal component emitted by a good product) is always transmitted to the surroundings as vibration or sound. Compared with the normal component, however, the abnormal component in abnormal sound inspection is only a slight difference in the vibration or sound waveform. Even when the difference can be distinguished by the ear of a skilled person, waveform analysis has sometimes failed to detect it because it is buried in noise. This is because earlier abnormal sound inspection discriminated using only feature quantities obtained from the time-axis waveform or only those obtained from the frequency-axis waveform, and moreover based on a single feature quantity. Patent Document 1 therefore judges normal or abnormal comprehensively based on multiple feature quantities obtained from multiple axes. The invention disclosed in Patent Document 1 uses fuzzy rules as the discrimination rules and judges normal or abnormal from the multiple feature quantities by fuzzy inference.

The fuzzy inference used for the discrimination rules in the abnormal sound inspection of Patent Document 1 has the advantage that, compared with other discrimination models such as neural networks, people can easily understand the discrimination rules. A neural network, for example, is a network formed by interconnecting a large number of neuron models, and the basis for how it discriminated and reached a given result is obscure and hard to grasp intuitively. People find it hard to trust what they cannot grasp intuitively, all the more so when it is an inspection apparatus on which quality depends.

Fuzzy inference, by contrast, uses membership functions that express vagueness, and a discrimination rule based on fuzzy inference can link the basis of a judgment to its result in a form people readily understand, such as “IF feature quantity A = large THEN abnormal”. Something that can be grasped intuitively in this way is also easy to explain. For a business that provides quality solutions, the discrimination rules are easy to explain as the inspection logic of the inspection apparatus, so customers who receive the explanation are more readily convinced and can adopt the apparatus with confidence.
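As an illustration of how a rule of the form “IF feature quantity A = large THEN abnormal” can be evaluated, the following is a minimal sketch. The ramp-shaped membership function, the function names, and the breakpoints 2.0 and 4.0 are assumptions made for this example; the source does not specify the shape of the membership functions.

```python
def membership_large(x, low, high):
    """Degree (0.0 to 1.0) to which x counts as 'large'.

    Assumed shape: 0 at or below `low`, 1 at or above `high`,
    and a straight line in between.
    """
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def rule_abnormal(feature_a, low=2.0, high=4.0):
    """IF feature A = large THEN abnormal.

    The rule fires to the degree that feature A is 'large'.
    The breakpoints are hypothetical values for illustration.
    """
    return membership_large(feature_a, low, high)


for value in (1.0, 3.0, 5.0):
    print(value, rule_abnormal(value))
```

Because the rule's output is a degree rather than a yes/no answer, the breakpoints can be read directly as "where 'large' begins and ends", which is the kind of explanation an inspector can compare against existing criteria.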

In addition, customers about to introduce an abnormal sound inspection apparatus have in many cases relied until then on sensory inspection by the ears of skilled workers (sensory inspectors). Against inspection standards that are commonly worded as, for example, “there shall be no abnormal sound”, these sensory inspectors already hold their own judgment criteria, know-how, and insight. In such cases the abnormal sound inspection apparatus replaces the sensory inspection the inspectors have been performing, so consistency with their judgment criteria, know-how, and insight is naturally required. Here too, being able to explain easily how the created discrimination rules agree with the knowledge (inspection criteria) held by the sensory inspectors is a major business advantage of fuzzy inference for a solution provider who is accountable to the customer.

As the number of feature quantities used increases, the discrimination rules for pass/fail judgment become more complex, or more of them are needed. To perform abnormal sound inspection with high accuracy, the discrimination rules must therefore be created accurately. Non-Patent Document 1 discloses a technique for reducing the man-hours needed to create discrimination rules for abnormal sound inspection: in the automatic generation of discrimination rules (inspection logic), a genetic algorithm is used for selecting the feature quantities used in the rules and for searching their parameters. Using a genetic algorithm for feature selection and parameter search makes it possible to automate or semi-automate the creation of discrimination rules, which until then was thought possible only through trial and error based on human intuition and experience.

Non-Patent Document 2 also discloses a technique for automatically creating discrimination rules for abnormal sound inspection. In this technique, an appropriate number of normal data and abnormal data are selected from the normal and abnormal data collected for rule generation; a genetic algorithm is applied to the selected data to select the feature quantities used in the discrimination rules and to adjust their parameters; and discrimination rules are generated by fuzzy inference from the selected feature quantities and the adjusted parameters. A technique that generates, from normal data and abnormal data, the discrimination rule that best separates them is generally called defect identification.

However, defect identification requires both normal data and abnormal data in advance as sample data in order to discriminate normal from abnormal. Because abnormal data is harder to obtain than normal data, there is the problem that discrimination rules cannot be generated when no abnormal data is available.
Patent Document 1: Japanese Patent No. 3484665
Non-Patent Document 1: OMRON Technics Vol. 43 No. 1 pp. 99-105 (2003)
Non-Patent Document 2: OMRON Technics Vol. 44 No. 1 pp. 48-53 (2004)

As described above, various abnormal sound inspection apparatuses have been developed. In every case, high-performance pass/fail judgment algorithms are created and improved with two aims. The first is to eliminate misses, in which a defective (abnormal) product is misjudged as good (normal); this must be reliably prevented because a defective product would otherwise be shipped. The second is to reduce the over-detection rate, in which a good product is misjudged as defective; this wastes good products, which are discarded instead of shipped, and lowers yield. As a result, the number of feature quantities used has grown, and so has the number of samples required to create better judgment rules.

For example, by adjusting the parameters so that the optimum feature value is output for each item of inspection target data, even subtle cases between normal and abnormal can be judged. Here, a parameter is, for example, a threshold: if a product is defective when feature quantity A is at or above a certain value (the threshold), that threshold for feature quantity A is the parameter, and deciding what value to give it is the parameter adjustment (learning). If the parameter adjustment is insufficient or has not been done, correct inspection may not be possible.
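The idea of treating a threshold as the parameter to be adjusted can be sketched as a simple search. This is an illustrative assumption only: the function names, the candidate set, and the selection criterion (no misses first, then the fewest over-detections) are not taken from the source, which uses a genetic algorithm in the cited prior art.

```python
def evaluate(threshold, ok_values, ng_values):
    """Count errors for the rule 'feature A >= threshold means defective'."""
    missed = sum(1 for v in ng_values if v < threshold)         # defects judged good
    overdetected = sum(1 for v in ok_values if v >= threshold)  # good judged defective
    return missed, overdetected


def adjust_threshold(ok_values, ng_values):
    """Try every observed value as a threshold.

    Keep only thresholds with zero misses, then prefer the one with the
    fewest over-detections (ties go to the larger threshold).
    Returns (threshold, overdetected) or None if there are no candidates.
    """
    candidates = sorted(set(ok_values) | set(ng_values))
    best = None
    for t in candidates:
        missed, over = evaluate(t, ok_values, ng_values)
        if missed > 0:
            continue
        if best is None or over < best[1] or (over == best[1] and t > best[0]):
            best = (t, over)
    return best


# Hypothetical feature values for good (OK) and defective (NG) samples.
ok = [1.0, 1.2, 1.5, 2.1]
ng = [2.0, 2.6, 3.0]
print(adjust_threshold(ok, ng))
```

With these sample values the search settles on the smallest NG value as the threshold, accepting one over-detection in exchange for zero misses, which mirrors the priority stated in the text.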

Accordingly, as the number of feature quantities and the number of samples increase, the judgment rules generated and checked for suitability during learning become more varied, and it takes longer for learning to converge and finally produce the optimum judgment rules.

Non-Patent Document 2 discloses selecting an appropriate number of normal data and abnormal data. Specifically, it states: “A data set to be used for parameter adjustment is selected. A reference space is obtained from the OK data using 40 kinds of feature quantities, and the Mahalanobis distance is calculated for each item of data. From all the data, the specified numbers of OK data and NG data are selected with balance in mind, by comparing them against the overall distribution.” In other words, if the numbers of OK data (normal data) and NG data (abnormal data) are poorly balanced, appropriate and efficient learning may not be possible.
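The calculation of a reference space from OK data and a Mahalanobis distance for each item can be sketched as follows. To stay self-contained this sketch assumes only two feature quantities per item (the cited document uses 40), so the covariance matrix can be inverted by hand; the function names are hypothetical.

```python
from statistics import mean


def reference_space(ok_rows):
    """Mean vector and inverse 2x2 covariance matrix estimated from OK data only.

    Each row is a pair (feature_1, feature_2). Two features are an
    assumption of this sketch, not of the source.
    """
    xs = [r[0] for r in ok_rows]
    ys = [r[1] for r in ok_rows]
    mx, my = mean(xs), mean(ys)
    n = len(ok_rows)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in ok_rows) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("covariance matrix is singular")
    inverse = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inverse


def mahalanobis(row, center, inverse):
    """Mahalanobis distance of one item from the centre of the reference space."""
    dx, dy = row[0] - center[0], row[1] - center[1]
    squared = (dx * (inverse[0][0] * dx + inverse[0][1] * dy)
               + dy * (inverse[1][0] * dx + inverse[1][1] * dy))
    return squared ** 0.5


# Hypothetical OK data forming a square around (1, 1).
ok_rows = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
center, inverse = reference_space(ok_rows)
print(mahalanobis((1.0, 1.0), center, inverse))  # item at the centre
print(mahalanobis((3.0, 1.0), center, inverse))  # item outside the OK cloud
```

Items near the centre of the OK distribution get a small distance and items near or beyond its edge get a large one, which is what makes the distance usable for ranking the data.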

In practice, it is difficult to prepare in advance abnormal data from the defective products that occur during actual judgment processing. Most of the sample data available at learning time is normal data from good products, so the numbers of normal and abnormal samples differ greatly. To learn from an appropriate number of normal and abnormal data, some normal data must therefore be deleted (thinned out). If data is deleted indiscriminately, however, the computation time for parameter adjustment may be shortened, but over-detection increases and the apparatus may not deliver the required inspection performance.

As an example, suppose that a scatter diagram of normal data and abnormal data, with two feature quantities PV and PN as axes, is as shown in FIG. 1(a). A scatter diagram is a diagram in which several indices that quantitatively express the characteristics of the data are defined and each item of data is plotted, according to its index values, in the space whose axes are those indices. This example uses two kinds of feature quantities as the axes.

In this scatter diagram, white circles are normal data and black circles are abnormal data. When the judgment rule is created taking all the data into account, the data are separated by thresholds on each feature quantity along the boundary lines shown by wavy lines in the figure. An example of the resulting rule is:
IF PV > A AND PN > B THEN abnormal data

Now suppose that part of the normal data is deleted from the data shown in FIG. 1(a), and a judgment rule is created by learning from the remaining normal data (representative data) and the abnormal data. If the selection is biased as shown in FIG. 1(b), for example (the dotted white circles are deleted, and learning uses the normal data shown as solid white circles and the abnormal data shown as black circles), incorrect inspection knowledge may be created that tries to separate with the single feature quantity PV what can in fact only be separated using two feature quantities. An example of the resulting rule is:
IF PV > A THEN abnormal data

Similarly, if the selection is biased as shown in FIG. 1(c) (the dotted white circles are deleted, and learning uses the normal data shown as solid white circles and the abnormal data shown as black circles), the boundary between normal data and abnormal data (the threshold used at inspection time) may not be set well. An example of the resulting rule is:
IF PV > C AND PN > D THEN abnormal data

When an unsuitable rule is generated in this way, the normal data shown as dotted white circles is misjudged as abnormal data, and over-detection increases. Because this is an inspection apparatus, misjudging abnormal data as a good product (a miss) and shipping it must be avoided. Consequently, unless appropriate data is selected, over-detection increases as described above and more products are discarded needlessly.
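The effect of a rule learned from biased data can be reproduced with a few made-up points. In this sketch the coordinates and the thresholds A and B are hypothetical values chosen so that the two-feature rule separates the data cleanly; they are not read from FIG. 1.

```python
A, B = 5.0, 5.0  # hypothetical thresholds on PV and PN

# Hypothetical (PV, PN) samples.
normal = [(2, 2), (3, 8), (8, 3), (4, 4), (7, 2), (2, 7)]
abnormal = [(7, 7), (8, 6), (6, 9)]


def full_rule(pv, pn):
    """IF PV > A AND PN > B THEN abnormal (learned from all data)."""
    return pv > A and pn > B


def biased_rule(pv, pn):
    """IF PV > A THEN abnormal (learned after normal data with large PV was thinned out)."""
    return pv > A


def count_errors(rule):
    """Return (over-detections, misses) of a rule on the complete data set."""
    overdetected = sum(1 for pv, pn in normal if rule(pv, pn))
    missed = sum(1 for pv, pn in abnormal if not rule(pv, pn))
    return overdetected, missed


print("full rule:  ", count_errors(full_rule))
print("biased rule:", count_errors(biased_rule))
```

Both rules catch every abnormal sample, but the single-feature rule also flags the normal samples that have a large PV and a small PN, which is the over-detection the text describes.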

An object of the present invention is to provide a knowledge creating apparatus and a knowledge creating method that can reduce the time required for parameter adjustment, without impairing the optimality of the parameter adjustment result, by appropriately reducing the data used for knowledge creation.

To achieve the above object, the knowledge creating apparatus of the present invention creates pass/fail judgment knowledge from waveform data belonging to one same type and waveform data not belonging to that type, the knowledge being used for pass/fail judgment in an inspection apparatus that extracts feature quantities from input measurement data of an inspection target and judges the quality of the inspection target based on the extracted feature quantities. The apparatus comprises data selection means for selecting predetermined waveform data from among a plurality of acquired waveform data belonging to the same type, and knowledge creation means for creating the pass/fail judgment knowledge using the selected data, which is the waveform data selected by the data selection means, together with the waveform data not belonging to the same type. The data selection means has a function of selecting a predetermined number of boundary data belonging to the boundary region of the group to which the plurality of waveform data to be selected belong, and a selection function of selecting, as representative data, part of the waveform data not belonging to the boundary region; the selected boundary data and representative data together form the selected data. The selection function performs a provisional feature quantity calculation on all waveform data belonging to the same type, normalizes the plurality of feature values obtained for each waveform data based on the results of that calculation, obtains the Mahalanobis distance, selects the boundary data from the group with a large Mahalanobis distance, and selects the representative data from the group with a small Mahalanobis distance.

In the embodiments, the knowledge creation means is realized by the “parameter optimization unit” and the “rule creation unit”, but it is of course not limited to what is shown in the embodiments, provided that it creates pass/fail judgment knowledge based on the selected data.

As the selection method in the data selection means, the boundary data may be selected as the data whose Mahalanobis distance is at or above a set threshold, or as a predetermined amount (a specific number, proportion, or the like) taken in descending order of Mahalanobis distance. The representative data may be selected by sorting the waveform data in order of Mahalanobis distance and picking waveform data at every predetermined count, or by sorting in the same way and picking waveform data at predetermined distance intervals. Selection by other methods is of course not precluded.
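The four selection methods listed above can be sketched as small functions over a list of precomputed Mahalanobis distances. The function names and the sample distances are hypothetical, and each function returns indices into the distance list rather than waveform data.

```python
def boundary_by_threshold(distances, threshold):
    """Boundary data: indices whose distance is at or above the threshold."""
    return [i for i, d in enumerate(distances) if d >= threshold]


def boundary_by_count(distances, count):
    """Boundary data: indices of the `count` largest distances."""
    order = sorted(range(len(distances)), key=lambda i: distances[i], reverse=True)
    return sorted(order[:count])


def representatives_every_nth(distances, candidates, step):
    """Representative data: sort candidates by distance, keep every `step`-th one."""
    order = sorted(candidates, key=lambda i: distances[i])
    return order[::step]


def representatives_by_interval(distances, candidates, interval):
    """Representative data: sort candidates by distance and keep one each time
    the distance has advanced by at least `interval` since the last pick."""
    order = sorted(candidates, key=lambda i: distances[i])
    picked, next_level = [], None
    for i in order:
        if next_level is None or distances[i] >= next_level:
            picked.append(i)
            next_level = distances[i] + interval
    return picked


# Hypothetical distances for seven normal waveforms.
distances = [0.25, 0.5, 1.0, 1.25, 1.5, 3.0, 3.5]
boundary = boundary_by_threshold(distances, 2.5)
inner = [i for i in range(len(distances)) if i not in boundary]
selected = sorted(set(boundary) | set(representatives_every_nth(distances, inner, 2)))
print(boundary, selected)
```

Picking every n-th item keeps the selection proportional to where the data is dense, while picking at distance intervals spreads the representatives evenly over the distance range; which is preferable depends on the data.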

Furthermore, building on the inventions described above, the apparatus preferably includes judgment means for judging whether the result of pass/fail judgment performed using the pass/fail judgment knowledge created by the knowledge creation means has reached a target value of discrimination performance (in the embodiments the data selection unit doubles as this means, but it may of course be configured separately), and a function of increasing the data selected by the data selection means and performing data selection again when the judgment means finds that the target value has not been reached. This invention is realized by the second embodiment.

The process of increasing the data selected by the data selection means preferably includes at least one of the following (1) to (3). Other methods are of course not precluded. (1) When waveform data belonging to the same type is misjudged as belonging to a different group, the misjudged waveform data is added as boundary data. (2) When waveform data not belonging to the same type is misjudged as belonging to the same type, the total number of data selected is increased. (3) When waveform data not belonging to the same type is misjudged as belonging to the same type and, in addition, waveform data belonging to the same type is misjudged as belonging to a different group, the amount selected as boundary data is increased.
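One possible reading of cases (1) to (3) is sketched below. The selection plan structure, the function name, and the increment `step` are assumptions for illustration, and the sketch treats the three cases as mutually exclusive, which the text does not require.

```python
def enlarge_selection(plan, overdetected_ids, missed_count, step=5):
    """Return an updated copy of the selection plan; `plan` itself is not modified.

    plan: dict with hypothetical keys
        "total"           - total number of same-type data to select
        "boundary_count"  - how many of them are taken as boundary data
        "forced_boundary" - ids that must always be included as boundary data
    overdetected_ids: ids of same-type data misjudged as a different group
    missed_count: number of other-type data misjudged as the same type
    """
    new_plan = {
        "total": plan["total"],
        "boundary_count": plan["boundary_count"],
        "forced_boundary": set(plan["forced_boundary"]),
    }
    if overdetected_ids and missed_count > 0:
        # Case (3): both kinds of error occurred, so select more boundary data.
        new_plan["boundary_count"] += step
    elif overdetected_ids:
        # Case (1): add the misjudged same-type data as boundary data.
        new_plan["forced_boundary"].update(overdetected_ids)
    elif missed_count > 0:
        # Case (2): increase the total number of data selected.
        new_plan["total"] += step
    return new_plan


plan = {"total": 20, "boundary_count": 5, "forced_boundary": set()}
print(enlarge_selection(plan, [7, 9], 0))
print(enlarge_selection(plan, [], 2))
print(enlarge_selection(plan, [7], 1))
```

In use, such a function would sit inside a loop that reselects data, recreates the knowledge, and re-evaluates it until the target discrimination performance is reached.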

Meanwhile, the knowledge creating method according to the present invention is a method used in a knowledge creating apparatus that creates pass/fail judgment knowledge from waveform data belonging to one same type and waveform data not belonging to that type, the knowledge being used for pass/fail judgment in an inspection apparatus that extracts feature quantities from input measurement data of an inspection target and judges the quality of the inspection target based on the extracted feature quantities. The method includes a data selection process of selecting predetermined waveform data from among a plurality of acquired waveform data belonging to the same type, and a knowledge creation process of creating the pass/fail judgment knowledge using the selected data, which is the waveform data selected by executing the data selection process, together with the waveform data not belonging to the same type. The data selection process includes a selection process of selecting a predetermined number of boundary data belonging to the boundary region of the group to which the plurality of waveform data to be selected belong, and a process of selecting, as representative data, part of the waveform data not belonging to the boundary region and combining the selected boundary data and representative data into the selected data. The selection process performs a provisional feature quantity calculation on all waveform data belonging to the same type, normalizes the plurality of feature values obtained for each waveform data based on the results of that calculation, obtains the Mahalanobis distance, selects the boundary data from the group with a large Mahalanobis distance, and selects the representative data from the group with a small Mahalanobis distance.

According to the present invention, by suitably reducing the data used for knowledge creation, the time required for parameter adjustment can be reduced without impairing the optimality of the parameter adjustment result.

Before describing the apparatus of one embodiment of the present invention, an inspection system that performs pass/fail judgment (sensory inspection) using the judgment rules set by this apparatus is described. As shown in FIG. 2, signals from a microphone 2 and an acceleration pickup 3, placed in contact with or close to the inspection object 1, are amplified by an amplifier 4, converted into digital data by an AD converter 5, and then supplied to the inspection apparatus 10. The inspection apparatus 10 acquires waveform data based on the sound data collected by the microphone 2 and the vibration data collected by the acceleration pickup 3, extracts feature quantities, and performs abnormality judgment. As is clear from FIG. 2, the inspection apparatus 10 is a computer comprising a CPU main body 10a, an input device 10b such as a keyboard and mouse, and a display 10c. Where necessary, it may also include an external storage device, or a communication function for communicating with an external database to obtain necessary information.

As shown in FIG. 3, the inspection apparatus 10 includes: a feature amount extraction unit 11 that extracts feature amounts from the waveform data acquired via the A/D converter 5; a determination unit 12 that determines, based on the values of the feature amounts extracted by the feature amount extraction unit 11, whether the data is normal or abnormal; a feature amount calculation parameter storage unit 13 that stores the feature amounts to be extracted by the feature amount extraction unit 11 and their parameters; and a fuzzy rule storage unit 14 that stores the fuzzy rules used by the determination unit 12 in the pass/fail determination process. In the determination unit 12, a fuzzy inference unit 12a performs fuzzy inference on the given feature amounts according to the rules stored in the fuzzy rule storage unit 14. The resulting degree of conformity is passed to a threshold processing unit 12b, which applies threshold processing to make the pass/fail determination. The determination result of the determination unit 12 can, for example, be displayed on the display 10c in real time or stored in a storage device.

Since various conventionally known configurations can be applied as the specific configuration of the inspection apparatus 10 described above (the detailed configuration of each processing unit), a detailed description is omitted. The embodiment of the knowledge creation device according to the present invention creates the feature amounts and feature amount parameters to be stored in the feature amount calculation parameter storage unit 13, and creates the rules to be stored in the fuzzy rule storage unit 14. FIG. 4 shows a schematic configuration of one embodiment of the knowledge creation device according to the present invention.

As shown in FIG. 4, the knowledge creation device 20 includes a waveform database 21 that stores sample data, and a pass/fail judgment algorithm generation unit 22 that, based on the sample data stored in the waveform database 21 (waveform data of normal data and abnormal data), generates the feature amounts, rules, and the like that the inspection apparatus 10 uses when making pass/fail determinations.

The waveform data stored in the waveform database 21 may be, for example, sound or vibration acquired by the sensor 3 while a sample product or the like is operated in the same way as in an actual sensory inspection (amplified as necessary, although not shown), or it may be downloaded from a separately prepared database. The waveform database 21 stores the actual waveform data in such a way that its type (normal or abnormal) can be identified. Each waveform datum may be stored in association with its type, or separate folders may be provided for normal data and abnormal data, with each waveform datum stored in the corresponding folder. In short, it is sufficient that the type of each stored waveform datum can be determined. The type may be assigned, for example, by an inspector who listens to the sound generated by the inspection object (sample product) while the data is being acquired by the sensor 3, or by an inspector who listens to a playback of waveform data that has already been stored. Alternatively, when the inspection objects used to obtain the sample data are already known to be good or defective, the type may be specified in advance so that it is associated with the waveform data automatically as the data is captured.

The pass/fail judgment algorithm generation unit 22 includes a data selection unit 23, a parameter optimization unit 24, and a rule creation unit 25. The data selection unit 23, which is the main part of the present invention, selects from the sample data stored in the waveform database 21 the data to be used in generating the pass/fail judgment algorithm. Among the sample data stored in the waveform database 21, in other words the sample data that can be prepared, samples of normal data usually far outnumber the rest. In principle, the more samples there are, the more accurate the pass/fail judgment algorithm that can be generated. However, a large number of samples lengthens the processing time needed, for example, for the optimization process in the subsequent parameter optimization unit 24 to determine the feature amounts and their parameters. The data selection unit 23 therefore thins out the normal data, which can be expected to be especially numerous, when selecting from the sample data stored in the waveform database 21. Because some normal sample data is then not used for (not reflected in) the generation of the pass/fail judgment algorithm, the optimization processing time in the subsequent parameter optimization unit 24 can be shortened. And, as described later, because the normal sample data is selected under certain conditions, the resulting pass/fail judgment algorithm is almost as accurate as one obtained using all of the sample data before selection.

The parameter optimization unit 24 determines the feature amounts (effective feature amounts) used for pass/fail determination and the parameters for those feature amounts. Examples of feature amounts include frequency components, the average, variance, maximum, and minimum values, the number of peaks exceeding a threshold, and the value of the N-th peak. Frequency components may be obtained using a low-pass filter, a band-pass filter, or various other filters, or by a waveform transformation such as FFT processing. The feature amount parameter in that case is the cutoff frequency for a low-pass filter, or the values delimiting the passband for a band-pass filter. With FFT processing, the data is transformed onto the frequency axis, so the presence of an abnormal component can be quantified by taking the component in an arbitrary frequency band as the feature amount; that is, the frequency band to be extracted after FFT processing is the parameter. Feature amounts such as the average and maximum values are usually obtained from the waveform data after filtering. When the number of peaks exceeding a threshold is the feature amount, the parameter is the threshold value. When the feature amount is the N-th largest peak value, the value or values of N are the parameter. Furthermore, for the number of peaks or peak values, the time span to be examined can be set as a parameter, and instead of a single threshold, two thresholds (a lower limit and an upper limit) can be set so that being within or outside a given range serves as the parameter. Needless to say, the feature amounts are not limited to those described above, nor are the parameters for the feature amounts listed here.
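To make the relation between a feature amount and its parameter concrete, the following is a minimal Python sketch of two of the features named above: the number of peaks exceeding a threshold (parameter: the threshold) and the N-th largest peak value (parameter: N). The function names, the definition of a peak as a strict local maximum, and the sample waveform are assumptions made for illustration; the patent does not specify them.

```python
from statistics import mean, pvariance

def find_peaks(samples):
    """Values of local maxima (strictly greater than both neighbours)."""
    return [samples[i] for i in range(1, len(samples) - 1)
            if samples[i - 1] < samples[i] > samples[i + 1]]

def peak_count_over(samples, threshold):
    """Feature: number of peaks above `threshold` (the parameter)."""
    return sum(1 for p in find_peaks(samples) if p > threshold)

def nth_largest_peak(samples, n):
    """Feature: value of the n-th largest peak (n is the parameter), or None."""
    peaks = sorted(find_peaks(samples), reverse=True)
    return peaks[n - 1] if 1 <= n <= len(peaks) else None

# Made-up waveform samples, for illustration only.
wave = [0.0, 0.8, 0.1, 1.5, 0.2, 0.4, 0.3, 2.1, 0.0]
features = {
    "mean": mean(wave),
    "variance": pvariance(wave),
    "peaks_over_1.0": peak_count_over(wave, 1.0),
    "2nd_peak": nth_largest_peak(wave, 2),
}
```

Changing the threshold or N changes the feature value computed from the same waveform, which is why these parameters are the object of the search described next.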

The parameter optimization unit 24 searches for the parameters for feature amount calculation (the feature amount calculation parameters and the weights of the evaluation values of each feature amount) that best separate non-defective products (normal data) from defective products (abnormal data). In the present embodiment, the effective feature amounts and their parameters can be determined using a GA (genetic algorithm), for example as in the invention disclosed in Japanese Patent Application Laid-Open No. 2004-079211. When a genetic algorithm is used as the parameter search algorithm, each individual parameter is regarded as a gene and a combination of all the parameters as an individual. New individuals are created by crossover and mutation of existing individuals (generational change), and low-rated individuals are replaced with the new ones. By retaining the better, highly rated individuals in this way, a near-optimal individual (parameter setting) is obtained.
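The generational loop described above can be sketched roughly as follows. This is only an illustration and not the method of the cited publication: the function name `evolve`, genes scaled to [0, 1], one-point crossover, Gaussian mutation, replacing the worst individual only when the child is better, and the toy fitness function are all assumptions. In the setting of this document the fitness would be the degree of separation between normal and abnormal data for the given parameter setting.

```python
import random

def evolve(fitness, n_genes, pop_size=20, generations=200, seed=0):
    """Tiny steady-state GA: genes are parameters, an individual is a full setting."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        a, b = rng.sample(pop, 2)                      # pick two parents
        cut = rng.randrange(1, n_genes) if n_genes > 1 else 0
        child = a[:cut] + b[cut:]                      # one-point crossover
        i = rng.randrange(n_genes)                     # mutate one gene
        child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
        worst = min(range(pop_size), key=lambda k: fitness(pop[k]))
        if fitness(child) > fitness(pop[worst]):       # replace a low-rated individual
            pop[worst] = child
    return max(pop, key=fitness)

def toy_fitness(genes):
    """Hypothetical stand-in for the degree of separation; peak at (0.3, 0.7)."""
    return -((genes[0] - 0.3) ** 2 + (genes[1] - 0.7) ** 2)

best = evolve(toy_fitness, n_genes=2)
```

Because the worst individual is replaced only by a strictly better child, the best individual in the population never gets worse from one generation to the next.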

The parameters and other values finally obtained are stored in the feature amount calculation parameter storage unit 13. In the illustrated example, the pass/fail judgment algorithm generation unit 22 (parameter optimization unit 24) directly accesses the feature amount calculation parameter storage unit 13 of the inspection apparatus 10 to store them, but direct storage is not essential. The feature amounts, parameters, and the like generated by the parameter optimization unit 24 may instead be held in a storage device (not shown) of the knowledge creation device 20 and then transferred by a predetermined method (online, by communication, or via a recording medium).

The optimization process in the parameter optimization unit 24, that is, the method of searching for parameters and the like, is not limited to the GA (genetic algorithm) described above. Various other methods can be used, such as an NN (neural network), an SVM (support vector machine), or an exhaustive (brute-force) search.

The feature amounts to be used by the inspection apparatus 10 and their parameters, as obtained by the parameter optimization unit 24, are passed to the rule creation unit 25. The rule creation unit 25 creates fuzzy rules based on the given feature amounts and parameters. Because the feature amounts to be used and the parameters for pass/fail determination are known, rules in "IF-THEN" form can easily be created by a known method, and membership functions can be created from them as well. The created rules and the like are stored in the fuzzy rule storage unit 14. In the illustrated example, the pass/fail judgment algorithm generation unit 22 (rule creation unit 25) directly accesses the fuzzy rule storage unit 14 of the inspection apparatus 10 to store them, but direct storage is not essential. The fuzzy rules and the like generated by the rule creation unit 25 may instead be held in a storage device (not shown) of the knowledge creation device 20 and then transferred by a predetermined method (online, by communication, or via a recording medium).

The function of the data selection unit 23, the main part of the present invention, will now be described. As described above, when a parameter search based on a genetic algorithm is used, each individual parameter is regarded as a gene and a combination of all the parameters as an individual, and a near-optimal individual (parameter setting) is obtained by creating new individuals through crossover and mutation while retaining the better ones. Even with a genetic algorithm, however, the search time becomes enormous as the number of data used for the parameter search and the number of search iterations increase. In principle, the more data and the more search iterations there are, the easier it is to obtain an optimal parameter setting. The data selection unit 23 of the present embodiment nevertheless reduces the data used for the search in an appropriate manner, so that the time required for parameter adjustment is reduced without impairing the optimality of the parameter adjustment result. In other words, it aims to yield a parameter adjustment result almost equivalent to the feature amount calculation parameter adjustment result obtained without data selection.

Here, a "good" parameter adjustment result means that good determination performance is obtained by using the feature amounts calculated under that parameter setting (the closer both the over-detection rate and the miss rate are to 0%, the better). However, the determination performance cannot be measured until the final inspection rules are fixed. During parameter adjustment and at the time its result is obtained, the quality of the feature amounts and their calculation parameters is therefore evaluated using the degree of separation.

A low degree of separation refers to a state in which, as in the scatter diagram of FIG. 5(a), the regions of the non-defective data group and the defective data group cannot be divided no matter how thresholds are set on the feature amounts that form the axes. In other words, it is difficult to discriminate between good and defective products from the feature amount calculation results as they stand. In this scatter diagram, white circles are normal data and black circles are abnormal data.

Conversely, a high degree of separation refers to a state in which, as shown in FIG. 5(b), the regions containing the non-defective data group and the defective data group can be divided by setting appropriate thresholds on the feature amounts that form the axes. This indicates that good and defective products can very likely be discriminated from the feature amount calculation results.

By changing the feature amounts used, or by adjusting the parameters of the same feature amounts, it may be possible to move from the state of FIG. 5(a) to the state of FIG. 5(b). Of course, some feature amounts have a low degree of separation however they are adjusted, which means those feature amounts or parameters are unsuitable for the analysis. The data selection unit 23 must avoid selecting data in such a way that the degree of separation deteriorates or that, as shown in FIG. 1, the resulting threshold (parameter) differs from the one that would be set using all of the original data and causes erroneous detection.

Under these conditions, data selection in the present embodiment is performed as follows. As a conceptual illustration, suppose that in a scatter diagram whose axes are the feature amounts PV and PN, all the normal data from non-defective products are plotted as white circles and the abnormal data from defective products as black circles, as in FIG. 6(a). As the figure makes clear, the normal data (white circles) greatly outnumber the abnormal data (black circles), so the two classes are unbalanced. Moreover, many of the normal data resemble one another (their positions, i.e. coordinate values, on the scatter diagram are close). The large number of samples makes generating the final pass/fail judgment algorithm time-consuming, yet many of the normal samples are arguably redundant, so learning is repeated to little purpose.

In such a case, as shown in FIG. 6(b), the data selection unit 23 selects from the normal data both representative normal data located in the central portion (white circles) and normal data located in the boundary region of the normal data (hatched circles). The selected normal data and all of the abnormal data are then sent to the parameter optimization unit 24 in the next stage for learning. Because the normal data in the boundary portion is retained, the threshold that distinguishes normal data from abnormal data can be kept the same as in FIG. 6(a), as FIG. 6(b) shows. The normal data drawn as wavy-line circles is not used for learning, so the number of samples is reduced and a pass/fail judgment algorithm of comparable quality can be generated in a shorter time.

In the present embodiment, the normal data in the boundary portion is selected by means of the Mahalanobis distance. In the example of FIG. 7, point A is farther away than point B in Euclidean distance, but because the data distribution enclosed by the ellipse has a larger variance in the direction of A, point B has the larger Mahalanobis distance. Normal data is therefore selected in order starting from the region with the larger Mahalanobis distance (the hatched portion in FIG. 7); this becomes the normal data of the boundary portion. From the white central region (here, the region other than the boundary portion), data is selected so as to be evenly spread; this becomes the representative normal data.

Here, the Mahalanobis distance is a distance measure that takes the spread of the data into account. Even for points at the same Euclidean distance on the scatter diagram, the Mahalanobis distance is small in a direction of large variance and large in a direction of small variance. In other words, the Mahalanobis distance is a measure of the distance between the data population and each datum.

For example, let the means of the feature amounts PV and PN be $\mu_{PV}$ and $\mu_{PN}$, and let the populations of the feature amounts PV and PN be $X_{PV}$ and $X_{PN}$. Writing the deviation from the means as

$$x = \begin{pmatrix} X_{PV} - \mu_{PV} \\ X_{PN} - \mu_{PN} \end{pmatrix}$$

and letting $\Sigma$ be the variance-covariance matrix of the calculated values of the feature amounts PV and PN and $\Sigma^{-1}$ its inverse, the Mahalanobis distance $d$ between the population and each datum is obtained, in its standard form, as

$$d = \sqrt{x^{\top} \Sigma^{-1} x}$$
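The situation of FIG. 7, in which point A is farther in Euclidean distance but point B is farther in Mahalanobis distance, can be reproduced with a small numeric sketch. The covariance matrix, the two points, and the function name below are made-up values chosen for illustration, and the 2x2 inverse is written out by hand so that only the Python standard library is needed.

```python
import math

def mahalanobis_2d(point, mean, cov):
    """Mahalanobis distance of a 2-D point, given the mean and a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))   # inverse of the 2x2 matrix
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    q = (dx * (inv[0][0] * dx + inv[0][1] * dy)
         + dy * (inv[1][0] * dx + inv[1][1] * dy))     # x^T Sigma^-1 x
    return math.sqrt(q)

mean = (0.0, 0.0)
cov = ((4.0, 0.0), (0.0, 0.25))    # large variance along the first axis (assumed)
A = (3.0, 0.0)                     # lies along the high-variance direction
B = (0.0, 2.0)                     # lies along the low-variance direction

euclid_A, euclid_B = math.hypot(*A), math.hypot(*B)   # 3.0 and 2.0
maha_A = mahalanobis_2d(A, mean, cov)                  # 1.5
maha_B = mahalanobis_2d(B, mean, cov)                  # 4.0
```

Here A is the more distant point by Euclidean distance, yet B is the more distant point by Mahalanobis distance, so B would be treated as lying nearer the boundary of the distribution.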

In practice, because each feature amount has a different span, normalization (standardization) is required. First, to put the variables on a common scale, each is standardized to, for example, mean 0 and variance 1. The Mahalanobis distance from the center of the data is then obtained using the expression above, and a fixed number of data are selected starting from the largest Mahalanobis distance. This extracts the normal data in the boundary portion. From the remaining data, a predetermined number of normal data (representative data) are extracted according to a predetermined condition. As a specific example of this processing, the flowchart shown in FIG. 8 is executed.

First, the data is divided into classes and the necessary data is acquired (S1). Here the data is classified into non-defective products (normal data) and defective products (abnormal data). In practice, the type is registered in association with each sample datum when it is stored in the waveform database 21, so the necessary data is acquired according to that registered type. In the present embodiment, a predetermined number is selected from the given data (the normal data is thinned out by a predetermined number).

Next, the setting of the number of data to select is accepted (S2). The user operates the input device or the like to enter the number of boundary normal data and the number of representative normal data, and the data selection unit 23 obtains these numbers via the input device. The numbers may be specified as absolute values or as relative values, such as a percentage of the whole. The way the number of boundary normal data is specified and the way the number of representative normal data is specified may be the same or different.

Feature amount calculation is then performed on all the data stored in the waveform database 21 (S3). The waveform data subjected to this calculation is at least the sample data of the type targeted for data selection, which in the present embodiment is all of the normal data. If the feature amounts have already been calculated, those values can be used.

Next, the feature amount calculation results obtained in step S3 for all the data in the class (non-defective products in the present embodiment) are standardized (S4). Specifically, the values of each feature amount are transformed so that their mean is 0 and their variance is 1. The Mahalanobis distance is then calculated from the standardized feature amount data of all the data in the class obtained in step S4 (S5). The Mahalanobis distance calculated here is the distance between the "reference space", computed from the standardized feature amount data of all the data in the class (here, the normal data class), and each individual datum in the class (its standardized feature amount data).
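Steps S4 and S5 can be sketched for the simplest case of two feature amounts. After standardization the covariance matrix reduces to [[1, r], [r, 1]], where r is the correlation coefficient, so the squared distance has the closed form (z1^2 - 2 r z1 z2 + z2^2) / (1 - r^2). The function names and sample values are assumptions for illustration; real data would have many more features and would need a general matrix inverse.

```python
import math
from statistics import mean, pstdev

def standardize(values):
    """Step S4: rescale one feature to mean 0 and (population) variance 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def mahalanobis_two_features(f1, f2):
    """Step S5 for two features: distance of each datum from the class itself."""
    z1, z2 = standardize(f1), standardize(f2)
    r = mean([a * b for a, b in zip(z1, z2)])   # correlation of the two features
    return [math.sqrt((a * a - 2 * r * a * b + b * b) / (1 - r * r))
            for a, b in zip(z1, z2)]

# Made-up feature values for five normal samples.
pv = [1.0, 2.0, 3.0, 4.0, 5.0]
pn = [2.0, 1.0, 4.0, 3.0, 6.0]
distances = mahalanobis_two_features(pv, pn)
```

A useful sanity check on such an implementation: when the covariance is estimated from the same data, the mean of the squared distances equals the number of features (2 here).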

As shown in FIG. 9, this reduces feature amount data of several tens of dimensions to a single one-dimensional distance. The closer a datum is to the class (non-defective products), the closer its calculated Mahalanobis distance is to 1; the farther it is from the characteristics of non-defective products, the larger the Mahalanobis distance. Since all the data in this case are normal data from non-defective products, data with a larger Mahalanobis distance can be said to lie in the boundary region.

Next, the data far from the class center is selected on the basis of the Mahalanobis distance (S6). Specifically, the non-defective data is sorted in descending order of Mahalanobis distance, and a predetermined ratio or number of data is selected from the top of the sorted list. This selects the data (normal data) located in the so-called boundary region. In the present embodiment, all of the top N data (or top X%) specified in step S2 are selected. Thus, in the present embodiment, a predetermined amount of normal data belonging to the boundary region is extracted consecutively from the top (largest Mahalanobis distance), although the extraction rule is not limited to this. Extracting a reasonable number of normal data with large Mahalanobis distances ensures that the normal data lying in the boundary region of the normal data distribution is reliably extracted, as shown for example in FIG. 6(b).
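Step S6 amounts to a sort followed by taking the head of the list. A minimal sketch follows; the function name `select_boundary`, its `count`/`percent` arguments, the rounding of the percentage, and the sample distances are assumptions for illustration.

```python
def select_boundary(distances, count=None, percent=None):
    """Return indices of the data with the largest Mahalanobis distances.

    Give either `count` (top N) or `percent` (top X% of all data).
    """
    order = sorted(range(len(distances)),
                   key=lambda i: distances[i], reverse=True)
    if count is None:
        count = round(len(distances) * percent / 100)
    return order[:count]

# Made-up distances for five normal samples.
d = [0.5, 3.2, 1.1, 2.7, 0.9]
top_two = select_boundary(d, count=2)         # indices of the two farthest data
top_40_percent = select_boundary(d, percent=40)
```

Returning indices rather than distance values keeps the link back to the original waveform data, which is what the next stage needs.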

In particular, whether data near this boundary region is judged correctly has a decisive effect on performance. Reliably selecting the normal data in the boundary region and using it for learning therefore allows the rules, thresholds, and the like for discriminating abnormal data from normal data to be created accurately.

Next, normal data meeting a predetermined condition (for example, at equal distance intervals or at every predetermined number of data) is selected from the remaining normal data on the basis of the Mahalanobis distance (S7). To generate a pass/fail judgment algorithm that accurately discriminates abnormal data (defective products) from normal data (non-defective products), the boundary region data is needed, as described above, but data scattered outside the boundary region, representative of each region, must also be selected in moderation and without bias. Normal data is therefore selected according to a predetermined condition that does not introduce bias. Selected without bias in this way, the representative data expresses the distribution characteristics of the whole class with fewer data. The predetermined condition is described later.

By also extracting a moderate amount of representative data in this way, the degree of separation can be determined from the distance between the distributions of the classes, that is, between the region of normal data (non-defective products) and the region of abnormal data (defective products), and the quality of the parameter adjustment can be evaluated.

The results obtained by the above processing are then output to the parameter optimization unit 24 in the next stage (S8). In the embodiment described above only the normal data is subject to data selection, but only the abnormal data may be subject to selection instead. In that case, the data acquired in step S1 is the abnormal data. For greater generality, normal data and abnormal data may be extracted separately in step S1, and the number to select and so on may be set for each of them in step S2.

Specific methods for selecting the representative data include, for example, a selection method based on the rank order of the distances, as shown in FIG. 10(a), and a selection method based on the distances themselves, as shown in FIG. 10(b). Of course, other methods may be used.

The selection method based on distance rank, shown in FIG. 10(a), is performed as follows. As a premise, the values set in processing step S2 are assumed to specify selecting 10 items from a total of 20 good-product data items, with boundary data making up 40% of the selected data. Accordingly, four normal data items are selected as boundary data, and six items are selected as representative data from the remaining 16 items.

処理ステップＳ３〜Ｓ５を実行後、算出したマハラノビス距離に基づいて正常データをソートする。これにより、図１０（ａ）に示すように、各データがマハラノビス距離を横軸にした一軸上に配置される。この状態で、距離の大きいものから４個（図中、ハッチングで示す円）を境界領域の正常データ（境界データ）として選択する。これが処理ステップＳ６の実行の具体例である。次いで、上述した境界データを除いた良品から等間隔個数ごと（この例では２つおき）に正常データを選択する。 After executing the processing steps S3 to S5, normal data is sorted based on the calculated Mahalanobis distance. Thereby, as shown in FIG. 10A, each data is arranged on one axis with the Mahalanobis distance as the horizontal axis. In this state, four items having the largest distance (circles indicated by hatching in the figure) are selected as normal data (boundary data) in the boundary region. This is a specific example of the execution of the processing step S6. Next, normal data is selected from the non-defective products excluding the boundary data described above at every equally spaced number (every two in this example).

More specifically, the flowchart shown in FIG. 11 is executed. First, the boundary data selection number Nsb and the representative data selection number Nsr are acquired (S11). If the values set in processing step S2 described above are concrete numbers, those values are simply acquired; if they are relative values, the numbers must be computed from the current total number of data N. For example, when the data selection number Ns and the boundary data ratio rb are set as described above, Nsb and Nsr are calculated by the following equations.
Nsb = Ns × rb / 100
Nsr = Ns − Nsb

Therefore, if Ns = 10 and rb = 40% as described above,
boundary data selection number Nsb = 10 × 40 / 100 = 4,
representative data selection number Nsr = 10 − 4 = 6.
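The calculation in S11 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `selection_counts` is hypothetical, and truncating a fractional Nsb toward zero is an assumption, since the text does not say how a non-integer result is handled.

```python
def selection_counts(ns: int, rb_percent: float) -> tuple[int, int]:
    """Return (Nsb, Nsr) from the selection number Ns and boundary ratio rb (percent)."""
    # Nsb = Ns x rb / 100; truncation of any fraction is an assumption of this sketch.
    nsb = int(ns * rb_percent / 100)
    # Nsr = Ns - Nsb
    nsr = ns - nsb
    return nsb, nsr


print(selection_counts(10, 40))  # prints "(4, 6)"
```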

次いで、正常データをマハラノビス距離の降順にソートする（Ｓ１２）。これにより、例えば図１２（ａ）に示すように、マハラノビス距離が大きい順に各データの順位付けがなされる。なお、この処理ステップＳ１２と、上述した処理ステップＳ１１の実行順序は、逆にしてももちろんよい。 Next, normal data is sorted in descending order of Mahalanobis distance (S12). As a result, for example, as shown in FIG. 12A, each data is ranked in descending order of Mahalanobis distance. Of course, the execution order of the processing step S12 and the processing step S11 described above may be reversed.

そして、１位からＮｓｂ位までを境界データとして選択する（Ｓ１３）。これにより、上記した具体例の場合には、１位から４位までの４つのデータが選択される（図１２（ｂ）参照）。 Then, the first to Nsb positions are selected as boundary data (S13). Thereby, in the case of the specific example described above, four data from the first place to the fourth place are selected (see FIG. 12B).

Next, the representative data selection interval Isr is calculated (S14). In the present embodiment, it is calculated by the following equation, where N is the total number of normal data.
Isr = f[(N − Nsb) / Nsr]
Here, the function f[X] can be, for example, the largest integer not exceeding X, or the integer obtained by rounding X to the nearest integer. In either case the data are selected at equal rank intervals (Y items at a time). When rounding happens to round X up, however, the two choices differ: with the former (largest integer not exceeding X), the representative data are drawn from the items with smaller Mahalanobis distances, whereas with the latter (rounding, here rounding up), the representative data extend to items with larger Mahalanobis distances.

For example, in the specific example described above,
representative data selection interval Isr = (20 − 4) / 6 = 2.666…
so the former method gives Isr = 2 and the latter method gives Isr = 3, and the two values differ.
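The two candidate definitions of f[X] can be compared with a short sketch. The function names are hypothetical. Rounding is implemented as floor(X + 0.5) because Python's built-in `round()` rounds halves to the nearest even integer, which would not match ordinary rounding of the first decimal place.

```python
import math


def interval_floor(x: float) -> int:
    """Largest integer not exceeding x."""
    return math.floor(x)


def interval_round(x: float) -> int:
    """Round half up (ordinary rounding of the first decimal place)."""
    return math.floor(x + 0.5)


x = (20 - 4) / 6  # 2.666...
print(interval_floor(x), interval_round(x))  # prints "2 3"
```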

Next, for i = 0, 1, …, Nsr − 1, the data at rank
(N − Isr × i)
are selected as representative data (S15), where N is the total number of normal data. Because the rank for i = 0 is selected, the normal data item with the largest rank number is always included; it has the shortest Mahalanobis distance, lies at or near the center of the normal data region, and can be said to best represent the characteristics of normal data. In the specific example described above, Isr = 2 selects the six data items at ranks 20, 18, 16, 14, 12, and 10 as representative data, and Isr = 3 selects the six data items at ranks 20, 17, 14, 11, 8, and 5 (see FIG. 12(c)).

このようにして、上述した処理ステップＳ１３，Ｓ１５を実行してそれぞれ選択された境界データと代表データをあわせて選択データとして確定する（Ｓ１６）（図１２（ｃ）参照）。 In this way, the above-described processing steps S13 and S15 are executed, and the selected boundary data and representative data are combined and determined as selection data (S16) (see FIG. 12C).
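Steps S11 to S16 can be put together in one sketch. It follows the worked example, which uses the total data count (20) when computing the interval and the selected ranks. The function name, the truncation of Nsb, and the use of list indices as data identifiers are assumptions made for illustration; the sketch also does not guard against Nsr being zero or a computed rank falling inside the boundary data.

```python
import math


def select_by_rank(distances, ns, rb_percent, round_half_up=False):
    """Return (boundary_indices, representative_indices, representative_ranks)."""
    n = len(distances)
    nsb = int(ns * rb_percent / 100)  # S11 (truncation is an assumption)
    nsr = ns - nsb
    # S12: sort in descending order of distance; order[0] holds rank 1.
    order = sorted(range(n), key=lambda k: distances[k], reverse=True)
    boundary = order[:nsb]  # S13: ranks 1..Nsb
    x = (n - nsb) / nsr  # S14
    isr = math.floor(x + 0.5) if round_half_up else math.floor(x)
    ranks = [n - isr * i for i in range(nsr)]  # S15
    representative = [order[r - 1] for r in ranks]
    return boundary, representative, ranks  # S16: both lists form the selection data


# 20 dummy distances arranged so that the item at index k has rank k + 1.
dists = [float(20 - k) for k in range(20)]
print(select_by_rank(dists, 10, 40)[2])
print(select_by_rank(dists, 10, 40, round_half_up=True)[2])
```

With the dummy data, the first call lists the ranks for Isr = 2 and the second the ranks for Isr = 3.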

The specific example above, following FIG. 10(a), selects four boundary data items. If instead two boundary data items are to be selected, the boundary data selection number Nsb and the representative data selection number Nsr are, respectively,
Nsb = 10 × 20 / 100 = 2
Nsr = 10 − 2 = 8

Accordingly, executing processing step S12 yields the state shown in FIG. 13(a), and executing processing step S13 selects the top two normal data items as boundary data (see FIG. 13(b)).

The representative data selection interval is computed from
(20 − 2) / 8 = 2.25
so Isr = 2 under either of the rules described above (execution of S14).

従って、処理ステップＳ１５を実行すると、ｉ＝０，１，２，……，（Ｎｓｒ−１）に対して、（２０−２×ｉ）位を代表データとして選択することになる。上述した具体例によれば、０，１，２，……，７に対応する順位、すなわち、２０位，１８位，……，６位を代表データとする（図１３（ｃ）参照）。 Therefore, when the processing step S15 is executed, the (20-2 × i) th rank is selected as representative data for i = 0, 1, 2,... (Nsr−1). According to the specific example described above, the ranks corresponding to 0, 1, 2,..., 7, that is, the 20th, 18th,..., 6th are used as representative data (see FIG. 13C).

次に、図１０（ｂ）に概念図を示した距離そのものをベースにした選択方法について説明する。１つの方法としては、上述したように、マハラノビス距離を算出し、それに基づいて正常データをソートする。そして、マハラノビス距離がしきい値を超えるものを境界データとして選択し、残ったデータに対し等間隔に区切って代表データを選択する。 Next, a selection method based on the distance shown in the conceptual diagram in FIG. 10B will be described. As one method, as described above, the Mahalanobis distance is calculated, and normal data is sorted based on the Mahalanobis distance. Then, the data whose Mahalanobis distance exceeds the threshold is selected as boundary data, and the representative data is selected by dividing the remaining data at equal intervals.

Specifically, the flowchart shown in FIG. 14 is executed. First, the selection number Ns and the boundary distance threshold th are acquired (S21). These values may be taken from the information set in processing step S2 described above; if some information is missing (for example, the boundary distance threshold), the user is prompted to enter it and the process waits for the input. The boundary distance threshold need not come from user input; it may instead be held in the device as an initial value and used as is.

Next, the normal data are sorted in descending order of Mahalanobis distance (S22). The execution order of processing step S22 and processing step S21 described above may of course be reversed. If processing step S22 is executed first, the maximum, mean, variance, and so on of the Mahalanobis distance are known, so the boundary distance threshold may be calculated from them.

そして、マハラノビス距離が境界距離閾値ｔｈを超える正常データを、境界データとして選択し、選択された正常データの数をＮｓｂに設定する（Ｓ２３）。これにより、図１０（ｂ）でいうと、ハッチングで示された３つの正常データが境界データとして選択される。たとえば、正常データが図１２（ａ）に示すものとし、境界距離閾値ｔｈが３．０とすると、１位から３位の３つの正常データが境界データとして選択される。 Then, normal data whose Mahalanobis distance exceeds the boundary distance threshold th is selected as boundary data, and the number of selected normal data is set to Nsb (S23). Thereby, in FIG. 10B, three normal data indicated by hatching are selected as boundary data. For example, assuming that the normal data is shown in FIG. 12A and the boundary distance threshold th is 3.0, the three normal data from the first to the third are selected as the boundary data.

Next, the maximum value Dmax and the minimum value Dmin of the Mahalanobis distance are obtained among the normal data excluding those selected as boundary data (S24). For example, if the normal data are as shown in FIG. 12(a) and the boundary distance threshold th is 3.0 as described above, the normal data remaining without being selected as boundary data are those ranked 4th and below, so Dmax = 2.9 and Dmin = 0.5.

Then the representative data selection number Nsr and the representative data interval distance Dsr are calculated (S25) by
Nsr = Ns − Nsb
Dsr = (Dmax − Dmin) / Nsr

Next, (Dmin + i × Dsr) is obtained for each i = 0, 1, …, Nsr − 1. This sets the positions that divide the remaining normal data at equal distances. The normal data item closest to each division position is then selected as representative data (S26). With this expression, the division position for i = 0 is Dmin, so the normal data item with the largest rank number is selected; it has the shortest Mahalanobis distance, lies at or near the center of the normal data region, and can be said to best represent the characteristics of normal data. The boundary data and representative data selected in processing steps S23 and S26 are then combined and fixed as the selection data (S27).

Of course, in the present invention the minimum value Dmin need not be used when obtaining the representative data interval distance Dsr as described above. For example, Dsr may be calculated by
Dsr = Dmax / Nsr
or obtained by various other calculations.
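The distance-based method of FIG. 14 (S21 to S27), using Dsr = (Dmax − Dmin) / Nsr, can be sketched as below. The function name and the sample distances are hypothetical. Skipping an item that has already been chosen, so that no item is selected twice, is an assumption of this sketch; the text only says to take the item nearest each division position.

```python
def select_by_distance(distances, ns, th):
    """Return (boundary_indices, representative_indices) for threshold th."""
    # S22: indices sorted in descending order of Mahalanobis distance.
    order = sorted(range(len(distances)), key=lambda k: distances[k], reverse=True)
    # S23: items whose distance exceeds th become boundary data.
    boundary = [k for k in order if distances[k] > th]
    rest = [k for k in order if distances[k] <= th]
    nsr = ns - len(boundary)  # S25: Nsr = Ns - Nsb
    if nsr <= 0 or not rest:
        return boundary, []
    dmax, dmin = distances[rest[0]], distances[rest[-1]]  # S24
    dsr = (dmax - dmin) / nsr  # S25
    representative = []
    for i in range(nsr):  # S26
        target = dmin + i * dsr
        candidates = [k for k in rest if k not in representative]
        if not candidates:
            break
        representative.append(min(candidates, key=lambda k: abs(distances[k] - target)))
    return boundary, representative  # S27: both lists form the selection data


sample = [3.5, 3.2, 3.1, 2.9, 2.5, 2.0, 1.5, 1.0, 0.5]  # made-up distances
print(select_by_distance(sample, ns=6, th=3.0))
```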

また、距離そのものをベースにした選択方法としては、上述した方法（境界データを閾値に基づいて決定する）ものに限ることはない。一例を示すと、例えば、まず、絶対値か相対的な値かは問わないが、境界データとして選択すべき量を決定し、マハラノビス距離の大きいものから順に上位の正常データを境界データとして選択する。そして、残ったデータの最大値（必要に応じて最小値も利用）と選択すべき数に基づいて等間隔で分割する距離を算出し、それに基づいて各分割する位置に近いデータを代表データとして選択するようにしてもよい。 The selection method based on the distance itself is not limited to the above-described method (boundary data is determined based on a threshold value). As an example, for example, regardless of whether it is an absolute value or a relative value, first, the amount to be selected as boundary data is determined, and the higher order normal data is selected as boundary data in descending order of the Mahalanobis distance. . Then, based on the maximum value of the remaining data (also using the minimum value if necessary) and the number to be selected, the distance to be divided at equal intervals is calculated, and based on this, the data close to the position to be divided is used as representative data. You may make it select.

図１５は、本発明の第２の実施の形態の要部を示している。すなわち、上述した第１の実施の形態の装置を用いて作成した良否判定アルゴリズムが、ユーザにとって必ずしも十分満足の行く性能が得られるとは限らない。そこで、本実施の形態では、十分な性能が得られなかった場合に再選択を行う機能を持たせている。 FIG. 15 shows a main part of the second embodiment of the present invention. In other words, the pass / fail judgment algorithm created using the apparatus of the first embodiment described above does not always provide performance that is sufficiently satisfactory for the user. Therefore, in this embodiment, a function for performing reselection when sufficient performance is not obtained is provided.

First, pass/fail determination is actually performed with the pass/fail judgment algorithm created by the present device, the miss rate and the overdetection rate are obtained, and it is determined whether the discrimination performance target has been achieved (S31). A "miss" is the case where a product that should be judged defective and discarded or otherwise kept from shipment is wrongly judged to be good. Such a situation must be avoided, so the miss rate needs to be 0% (0% is preferable). "Overdetection" is the case where a product that should be judged good and could be shipped as is gets judged defective. A high overdetection rate causes a needless drop in yield, raises product cost, and may squeeze profits, so the overdetection rate should also be as small as possible. In practice, however, it is difficult to bring both the miss rate and the overdetection rate to 0%, so some overdetection is generally tolerated. In the present embodiment, the targets are therefore a miss rate of 0% and an overdetection rate of 5%. These values are arbitrary.

Whether the discrimination performance has reached the target is determined, for example, by performing pass/fail determination on the sample data stored in the waveform database 21 (whose pass/fail results are known), checking whether each result produced by the pass/fail judgment algorithm is correct, and calculating the overdetection rate and the miss rate from that. In the present embodiment this determination is made by the data selection unit, but a separate determination means may of course be provided.

The actual pass/fail determination results are obtained from the output of the inspection apparatus 10 shown in FIG. 2. As one example of a specific procedure, all the sample data stored in the waveform database 21 are passed to the inspection apparatus 10. The inspection apparatus 10 performs pass/fail determination on the received sample data and passes the results to the knowledge creation device 20. Each sample data item carries a code that identifies it, so the inspection apparatus 10 associates the code with the determination result when passing it to the knowledge creation device 20. From the code, the knowledge creation device 20 can look up the type (good or defective) of the corresponding sample data stored in the waveform database 21, and can thus tell whether each determination result of the inspection apparatus is correct.
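The evaluation in S31 can be sketched as follows. The text does not give the denominators of the two rates, so this sketch assumes that the miss rate is the share of truly defective samples judged good and that the overdetection rate is the share of truly good samples judged defective. The label strings and function name are hypothetical.

```python
def error_rates(results):
    """results: list of (true_label, judged_label) pairs, each 'good' or 'defective'.

    Returns (miss_rate_percent, overdetection_rate_percent).
    """
    defective = [j for t, j in results if t == "defective"]
    good = [j for t, j in results if t == "good"]
    # Assumed definitions: each rate is taken relative to its own true class.
    miss = 100.0 * defective.count("good") / len(defective) if defective else 0.0
    over = 100.0 * good.count("defective") / len(good) if good else 0.0
    return miss, over


def target_achieved(miss, over, miss_target=0.0, over_target=5.0):
    """Targets of this embodiment: miss rate 0%, overdetection rate 5%."""
    return miss <= miss_target and over <= over_target
```

For instance, with 20 good samples of which one is judged defective and no missed defects, the rates are (0.0, 5.0) and the target of this embodiment is met.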

なお、検査装置１０と知識作成装置２０の間におけるデータの送受は、通信で行なっても良いし、所定の記録媒体を用いてオフラインで行なっても良い。また、同一のパソコンなどに組み込まれている場合には、そのパソコンの記憶装置に互いにアクセスすることで簡単にデータの送受が行える。 Note that data transmission / reception between the inspection apparatus 10 and the knowledge creation apparatus 20 may be performed by communication or may be performed offline using a predetermined recording medium. If they are incorporated in the same personal computer, data can be easily transmitted and received by accessing the storage devices of the personal computers.

If the performance has achieved the target (Yes at the branch in processing step S31), the process ends. If it has not (No at the branch in processing step S31), it is determined whether the overdetection rate and the miss rate are both converging (S32). That is, each time the branch in processing step S31 results in No, the overdetection rate and the miss rate obtained at that time are stored in memory (temporary storage means), and the past history is used to judge whether they are heading toward convergence. The rates are judged to be converging when the overdetection rate and the miss rate are gradually decreasing, including the case where they fluctuate somewhat but decrease overall.
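The text does not specify how "decreasing overall despite some fluctuation" is tested in S32. One possible criterion, given purely as an assumption, compares the mean of the most recent values in the stored history with the mean of the values just before them. The function names and the window size are hypothetical.

```python
def is_converging(history, window=3):
    """True if the mean of the last `window` values is below the mean of the
    `window` values before them (an assumed criterion, not from the source)."""
    if len(history) < 2 * window:
        return False  # too little history to judge
    recent = history[-window:]
    earlier = history[-2 * window:-window]
    return sum(recent) / window < sum(earlier) / window


def both_converging(miss_history, over_history, window=3):
    """S32 requires the miss rate and the overdetection rate to both converge."""
    return is_converging(miss_history, window) and is_converging(over_history, window)
```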

If they are converging, continuing the learning as is will likely achieve the discrimination performance target, so the number of generations is increased and discrimination rule creation is re-executed (S33). That is, as described in the first embodiment, the representative data selected by the data selection unit 23 are sent to the parameter optimization unit 24 in the next stage, where the parameters are optimized using a GA; this GA is continued (learning is re-executed), and at an appropriate time it is again determined whether the discrimination performance has reached the target (S31).

一方、過検出率及びまたは見逃し率が収束していない場合（処理ステップＳ３２の分岐判断でＮｏ）には、データ選択部２３で選択した代表データが適切でなかったおそれがあるので、データ選択部２３は、再度データ選択を行う。具体的には、まず、見逃し率が目標より大きいか否かを判断する（Ｓ３４）。そして、見逃し率が目標に達している（目標以下の）場合（処理ステップＳ３４の分岐判断はＮｏ）には、過検出率を抑制する必要があるため、過検出されたデータを追加するデータ選択の再実行処理を行う（Ｓ３５）。 On the other hand, when the over-detection rate and / or the oversight rate have not converged (No in the branching determination in processing step S32), the representative data selected by the data selection unit 23 may not be appropriate. 23 performs data selection again. Specifically, first, it is determined whether or not the miss rate is larger than the target (S34). When the miss rate has reached the target (below the target) (the branch determination in processing step S34 is No), it is necessary to suppress the over detection rate, so data selection for adding over-detected data is performed. Is re-executed (S35).

Specifically, as in the flowchart shown in FIG. 16, the previous data selection result is first read (S41), and the good-product data that were overdetected are extracted by referring to the inspection results (S42). This extraction can be done automatically by picking up the items for which the inspection apparatus, using the pass/fail judgment algorithm, returned "defective" while the result registered in the waveform database 21 is "good". Automatic extraction is possible if the determination results of the inspection apparatus are stored together with the corresponding waveform data (or information identifying the sample data) for all items, or at least if information identifying the sample data judged defective is stored.

抽出した良品データを前回のデータ選択結果（処理ステップＳ４１の実行により取得）に追加し（Ｓ４３）、追加後のデータ選択結果を新たな代表データとして出力する（Ｓ４４）。つまり、次段のパラメータ最適化部２４に渡す。 The extracted good product data is added to the previous data selection result (obtained by execution of processing step S41) (S43), and the data selection result after the addition is output as new representative data (S44). That is, it is passed to the parameter optimization unit 24 at the next stage.
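Steps S41 to S44 amount to a set-style update of the previous selection. The sketch below assumes that each sample is identified by a code, that the inspection results and the database labels are available as dictionaries keyed by that code, and that the labels are the strings shown; all of these names are hypothetical.

```python
def add_overdetected(previous_selection, inspection_results, true_labels):
    """Return the previous selection extended with overdetected good-product data.

    inspection_results: {sample_code: 'good' | 'defective'} from the inspection apparatus
    true_labels:        {sample_code: 'good' | 'defective'} registered in the database
    """
    # S42: judged defective by the apparatus but registered as good.
    overdetected = [
        code
        for code, judged in inspection_results.items()
        if judged == "defective" and true_labels.get(code) == "good"
    ]
    # S43: append to the previous selection result, avoiding duplicates.
    new_selection = list(previous_selection)
    for code in overdetected:
        if code not in new_selection:
            new_selection.append(code)
    return new_selection  # S44: passed on as the new representative data
```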

上述した処理ステップＳ４１からＳ４４を実行して、データ選択部２３が新たな代表データを選択すると、図１５の処理ステップＳ３６に進み、パラメータ最適化部２４がパラメータ調整を再実行し、続いて、ルール作成部２５が判別ルール作成を再実行する（Ｓ３７）。これらの処理を経て、新たな代表データに基づく良否判定アルゴリズムが生成され、それぞれのデータベースに格納される。そして、処理ステップ３１に戻り、その新たな代表データに基づく良否判定アルゴリズムを用いて検査装置を実行させ、判別性能が目標に達したか否かの判断を行なう。 When the processing steps S41 to S44 described above are executed and the data selection unit 23 selects new representative data, the process proceeds to the processing step S36 of FIG. 15, the parameter optimization unit 24 re-executes parameter adjustment, The rule creation unit 25 re-executes the discrimination rule creation (S37). Through these processes, a pass / fail judgment algorithm based on new representative data is generated and stored in each database. Then, returning to the processing step 31, the inspection apparatus is executed using the pass / fail judgment algorithm based on the new representative data, and it is judged whether or not the discrimination performance has reached the target.

If, on the other hand, the miss rate has not achieved the target value, the branch in processing step S34 results in Yes, so the process jumps to processing step S38, where the selection number is increased and the representative data are selected again (S38). When the miss rate does not reach the target, it is often the case that the data cannot be separated at the feature quantity stage in the first place: for example, defective products (abnormal data) lie within the distribution of good products (normal data), even near its center, or the distribution of the defects as a whole is spread over a wide range. This means the data selection was not appropriate, so the number of normal data selected by the data selection unit 23 for learning is increased in order to separate normal data from abnormal data on the basis of more sample data.

The number could simply be increased, but simply increasing the number of data has the drawback of lengthening the learning time. The number is therefore increased, for example, according to the flowchart shown in FIG. 17, so that it grows by an amount suited to the situation at hand. This improves the performance of the learning result while keeping the number of data as small as possible so that learning finishes in a short time.

Specifically, it is first determined which of the miss rate and the overdetection rate is worse (S51). When the miss rate has not reached its target value, the overdetection rate may or may not have reached its own target value, and when neither has reached its target, the cases can be further divided according to which of the two is worse. To handle these cases with a single test, the present embodiment evaluates
"miss rate > overdetection rate?"

そして、見逃し率が多い場合（処理ステップＳ５１の分岐判断がＹｅｓ）は、良品の境界データが不足していたと考えられるため、境界データ比率を増やす（Ｓ５２）。あるいは、境界データの選択数を増やす。一方、過検出率の方が多い場合（処理ステップＳ５１の分岐判断がＮｏ）は、良品の分布を広く捉えなおすためにデータ選択数を増やす（Ｓ５３）。 If the miss rate is high (the branch determination in processing step S51 is Yes), the boundary data ratio is increased because it is considered that the non-defective boundary data is insufficient (S52). Alternatively, the selection number of boundary data is increased. On the other hand, when the over-detection rate is larger (the branch determination in processing step S51 is No), the number of data selections is increased in order to broadly recognize the non-defective product distribution (S53).

そして、上述した各処理を実行した後、新たに設定された代表データとしての選択条件に従って、良品データの選択（新たな代表データの選択）を実行する（Ｓ５４）。なお、図１７に示したフローチャートでは、処理ステップ５１の分岐判断の結果に基づいて処理ステップＳ５２とＳ５３のいずれか一方を実行するようにしたが、たとえば、処理ステップＳ５２を実行した場合には、続いて処理ステップＳ５３も実行するというように両方の処理を実行するようにしても良い。 Then, after each of the above-described processes is executed, selection of non-defective product data (selection of new representative data) is executed in accordance with the selection conditions as newly set representative data (S54). In the flowchart shown in FIG. 17, either one of the processing steps S52 and S53 is executed based on the result of the branch determination in the processing step 51. For example, when the processing step S52 is executed, Subsequently, both processes may be executed such that the processing step S53 is also executed.
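The branch in S51 to S53 can be sketched as a small update of the selection conditions. The step sizes `rb_step` and `ns_step` are hypothetical, since the text does not say by how much the boundary data ratio or the data selection number is increased.

```python
def adjust_conditions(miss_rate, overdetection_rate, ns, rb_percent,
                      rb_step=10.0, ns_step=5):
    """Return the updated (Ns, rb) according to the S51 branch."""
    if miss_rate > overdetection_rate:  # S51: Yes
        # S52: boundary data were likely insufficient, so raise the boundary ratio.
        rb_percent = min(100.0, rb_percent + rb_step)
    else:  # S51: No
        # S53: capture the good-product distribution more widely.
        ns += ns_step
    return ns, rb_percent
```

The returned conditions would then be used for the new selection in S54. Running S53 after S52 as well, which the text allows, would simply mean applying both updates.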

上述した処理ステップＳ５１からＳ５４を実行して、データ選択部２３が新たな代表データを選択すると、図１５の処理ステップＳ３９に進み、パラメータ最適化部２４がパラメータ調整を再実行し、続いて、ルール作成部２５が判別ルール作成を再実行する（Ｓ４０）。これらの処理を経て、新たな代表データに基づく良否判定アルゴリズムが生成され、それぞれのデータベースに格納される。そして、処理ステップ３１に戻り、その新たな代表データに基づく良否判定アルゴリズムを用いて検査装置を実行させ、判別性能が目標に達したか否かの判断を行なう。 When the above-described processing steps S51 to S54 are executed and the data selection unit 23 selects new representative data, the process proceeds to processing step S39 in FIG. 15, the parameter optimization unit 24 re-executes parameter adjustment, The rule creation unit 25 re-executes the discrimination rule creation (S40). Through these processes, a pass / fail judgment algorithm based on new representative data is generated and stored in each database. Then, returning to the processing step 31, the inspection apparatus is executed using the pass / fail judgment algorithm based on the new representative data, and it is judged whether or not the discrimination performance has reached the target.

上述した各処理ステップを、適宜のルートで繰り返し実行することで、適切なデータ数の代表データに基づいて学習が行われ、目標とする品質を備えた良否判定アルゴリズムを生成することができる。 By repeatedly executing each processing step described above through an appropriate route, learning is performed based on representative data of an appropriate number of data, and a pass / fail judgment algorithm having a target quality can be generated.

FIG. 1 is a diagram explaining the conventional problem.
FIG. 2 is a diagram showing an example of an inspection system.
FIG. 3 is a block diagram mainly showing an example of the internal configuration of the inspection apparatus.
FIG. 4 is a block diagram showing an embodiment of the knowledge creation device according to the present invention.
FIG. 5 is a diagram explaining the degree of separation.
FIG. 6 is a diagram outlining the operating principle and operation of the present embodiment.
FIG. 7 is a diagram outlining the operating principle and operation of the present embodiment.
FIG. 8 is a flowchart showing an example of the function of the data selection unit.
FIG. 9 is a diagram explaining each feature quantity and the Mahalanobis distance in the present embodiment.
FIG. 10 is a diagram outlining the operating principle and operation of the present embodiment.
FIG. 11 is a flowchart showing a detailed example of the function of the data selection unit.
FIG. 12 is a diagram showing an example of the operating principle and operation of the present embodiment.
FIG. 13 is a diagram showing an example of the operating principle and operation of the present embodiment.
FIG. 14 is a flowchart showing a detailed example of the function of the data selection unit.
FIG. 15 is a diagram showing an example of a flowchart of the second embodiment.
FIG. 16 is a flowchart showing an example of a specific procedure of processing step S33.
FIG. 17 is a flowchart showing an example of a specific procedure of processing step S35.

Explanation of symbols

10 Inspection apparatus
20 Knowledge creation device
21 Waveform database
22 Pass/fail judgment algorithm generation unit
23 Data selection unit
24 Parameter optimization unit
25 Rule creation unit

Claims

1. A knowledge creation device that creates, from waveform data belonging to a same type and waveform data not belonging to the same type, pass/fail judgment knowledge used for pass/fail judgment in an inspection apparatus that extracts feature quantities from input measurement data of an inspection target and judges the inspection target as pass or fail based on the extracted feature quantities, the device comprising:
data selection means for selecting predetermined waveform data from among a plurality of acquired waveform data belonging to the same type; and
knowledge creation means for creating the pass/fail judgment knowledge using selection data, which is the waveform data selected by the data selection means, and the waveform data not belonging to the same type,
wherein the data selection means has
a selection function of selecting a predetermined number of boundary data belonging to a boundary region of the group to which the plurality of waveform data to be selected belong, and of selecting, as representative data, a part of the waveform data not belonging to the boundary region,
the selected boundary data and the representative data are combined to form the selection data, and
the selection function performs a provisional feature quantity calculation on all waveform data belonging to the same type, standardizes, for each waveform data, the plurality of feature quantities obtained from the provisional feature quantity calculation, obtains a Mahalanobis distance, selects the boundary data from a group having a large Mahalanobis distance, and selects the representative data from a group having a small Mahalanobis distance.
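As a reading aid, the selection function described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation of the patented device: the function names (`standardize`, `invert`, `mahalanobis_distances`, `split_by_threshold`) are hypothetical, the provisional feature quantities are assumed to be already available as one row of numbers per waveform, the population variance is used for standardization, and the squared distance is not divided by the number of feature quantities.

```python
import math


def standardize(rows):
    """Scale every feature column to zero mean and unit (population) variance."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    # A zero standard deviation is replaced by 1.0 to avoid division by zero.
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(dims)]
    return [[(r[j] - means[j]) / stds[j] for j in range(dims)] for r in rows]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def mahalanobis_distances(features):
    """Return one Mahalanobis distance per waveform (row of feature quantities)."""
    z = standardize(features)
    n, dims = len(z), len(z[0])
    # After standardization the covariance matrix equals the correlation matrix.
    corr = [[sum(r[i] * r[j] for r in z) / n for j in range(dims)]
            for i in range(dims)]
    inv = invert(corr)
    distances = []
    for r in z:
        d2 = sum(r[i] * inv[i][j] * r[j]
                 for i in range(dims) for j in range(dims))
        distances.append(math.sqrt(max(d2, 0.0)))
    return distances


def split_by_threshold(distances, threshold):
    """Indices at or above the threshold are boundary candidates; the rest are not."""
    boundary = [i for i, d in enumerate(distances) if d >= threshold]
    rest = [i for i, d in enumerate(distances) if d < threshold]
    return boundary, rest
```

In this sketch, waveforms with a large distance would be treated as boundary data, and the representative data would then be drawn from the remaining indices. The threshold value is a free parameter chosen by the user of the sketch.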

2. The knowledge creation device according to claim 1, wherein the data selection means selects, as the boundary data, waveform data whose Mahalanobis distance is equal to or greater than a set threshold value.

3. The knowledge creation device according to claim 1, wherein the data selection means selects a predetermined quantity of waveform data as the boundary data, in descending order of Mahalanobis distance.

4. The knowledge creation device according to claim 1, wherein the data selection means sorts the waveform data in order of Mahalanobis distance and selects the representative data by picking waveform data at every predetermined number of items.

5. The knowledge creation device according to claim 1, wherein the data selection means sorts the waveform data in order of Mahalanobis distance and selects the representative data by picking waveform data at predetermined distance intervals.
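The two ways of picking representative data described above (every predetermined number of items, or at a predetermined distance interval, after sorting by Mahalanobis distance) can be illustrated as follows. This is a hedged sketch: the function names are hypothetical, the input is assumed to be the list of distances of the waveforms that are not boundary data, and the reading of "distance interval" as "pick the next waveform once the distance has advanced by at least the interval" is an assumption made for illustration.

```python
def pick_every_nth(distances, step):
    """Sort indices by distance and keep every `step`-th one, starting with the smallest."""
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    return order[::step]


def pick_by_distance_interval(distances, interval):
    """Sort indices by distance and keep one whenever the distance has
    advanced by at least `interval` since the last kept waveform (assumed reading)."""
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    picked = []
    next_level = None
    for i in order:
        if next_level is None or distances[i] >= next_level:
            picked.append(i)
            next_level = distances[i] + interval
    return picked
```

Picking by count keeps more samples where the data are dense, while picking by distance interval spreads the samples evenly over the distance range. Which behaviour is preferable depends on the data.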

6. The knowledge creation device according to claim 1, further comprising:
judgment means for judging whether the result of a pass/fail judgment performed using the pass/fail judgment knowledge created by the knowledge creation means has reached a target value of discrimination performance; and
a function of increasing the data to be selected by the data selection means and selecting data again when the judgment result of the judgment means has not reached the target value.

7. The knowledge creating apparatus according to claim 6, wherein the process of increasing data to be selected by the data selection means includes at least one of the following (1) to (3).
(1) When waveform data belonging to the same type is erroneously judged as belonging to a different group, the erroneously judged waveform data is added as boundary data.
(2) When waveform data not belonging to the same type is erroneously judged as belonging to the same type, the total number of data to be selected is increased.
(3) When waveform data not belonging to the same type is erroneously judged as belonging to the same type, and waveform data belonging to the same type is erroneously judged as belonging to a different group, the amount selected as boundary data is increased.
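Items (1) to (3) above can be read as a small update rule applied before data is selected again. The following sketch is an illustration only: `widen_selection`, its parameters and the fixed increment `step` are hypothetical, and the sketch applies every item whose condition holds, whereas the claim only requires at least one of them.

```python
def widen_selection(boundary_ids, total_count, boundary_count,
                    missed_same_type, false_same_type, step=1):
    """Return updated (boundary_ids, total_count, boundary_count).

    boundary_ids     -- indices already chosen as boundary data
    total_count      -- total number of waveforms to select
    boundary_count   -- number of waveforms to select as boundary data
    missed_same_type -- same-type waveforms misjudged as a different group
    false_same_type  -- other-type waveforms misjudged as the same type
    step             -- assumed increment, not specified in the source
    """
    boundary_ids = set(boundary_ids)
    if missed_same_type:
        # (1) add the misjudged same-type waveforms as boundary data
        boundary_ids |= set(missed_same_type)
    if false_same_type:
        # (2) increase the total number of data to be selected
        total_count += step
    if missed_same_type and false_same_type:
        # (3) increase the amount selected as boundary data
        boundary_count += step
    return boundary_ids, total_count, boundary_count
```

The updated values would then be fed back into the data selection, and knowledge creation would be repeated until the target discrimination performance is reached.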

8. A knowledge creation method in a knowledge creation device that creates, from waveform data belonging to a same type and waveform data not belonging to the same type, pass/fail judgment knowledge used for pass/fail judgment in an inspection apparatus that extracts feature quantities from input measurement data of an inspection target and judges the inspection target as pass or fail based on the extracted feature quantities, the method comprising:
a data selection process of selecting predetermined waveform data from among a plurality of acquired waveform data belonging to the same type; and
a knowledge creation process of creating the pass/fail judgment knowledge using selection data, which is the waveform data selected by executing the data selection process, and the waveform data not belonging to the same type,
wherein the data selection process includes:
a selection process of selecting a predetermined number of boundary data belonging to a boundary region of the group to which the plurality of waveform data to be selected belong, and of selecting, as representative data, a part of the waveform data not belonging to the boundary region, and a process of combining the selected boundary data and the representative data to form the selection data, and
the selection process performs a provisional feature quantity calculation on all waveform data belonging to the same type, standardizes, for each waveform data, the plurality of feature quantities obtained from the provisional feature quantity calculation, obtains a Mahalanobis distance, selects the boundary data from a group having a large Mahalanobis distance, and selects the representative data from a group having a small Mahalanobis distance.