JP2014006613A - Neighborhood search method and similar image search method

Info

Publication number: JP2014006613A
Application number: JP2012140373A
Authority: JP
Inventors: Akira Matsumura; 明松村
Original assignee: Dainippon Screen Manufacturing Co Ltd
Current assignee: Dainippon Screen Manufacturing Co Ltd
Priority date: 2012-06-22
Filing date: 2012-06-22
Publication date: 2014-01-16

Abstract

PROBLEM TO BE SOLVED: To provide a technique that enables efficient neighborhood search or similar image search in a feature amount space, without requiring high computational capability, in a neighborhood search method for finding images near a predetermined reference position in the feature amount space and in a similar image search method. SOLUTION: For one feature amount X1, a predetermined range including the value of a reference image Is is set as a neighborhood range R1 for that feature amount, and images I1 and I2 within the range are selected as candidate images. For another feature amount X2, a neighborhood range R2 including the value of the reference image Is is set, and images I1 and I3 within the range are selected as candidate images. The candidate image I1, which is selected in common for both feature amounts, is specified as a neighborhood image of the reference image Is.

Description

The present invention relates to a neighborhood search method that, when images are represented by a plurality of types of feature amounts, searches for images in the vicinity of a predetermined position in the feature amount space, and to a similar image search method that uses the same principle.

For example, in manufacturing fields such as semiconductors and printed circuit boards, research has been conducted on imaging an object under evaluation through a microscope or the like, calculating a plurality of types of feature amounts for the resulting images, and classifying the images automatically, in order to detect, analyze, and evaluate defects contained in products. In this kind of classification technique, a neighborhood search is performed in a feature amount space whose coordinate axes are the respective feature amounts, in order to find mutually similar images among a plurality of images. Images located close to each other in the feature amount space can be said to share many features, so finding the images located near a given image in the feature amount space amounts to extracting images similar to that image.

One such neighborhood search technique is described in Patent Document 1. In that technique, the distance between two points in a multidimensional feature amount space is approximated by the octagon distance, and the neighborhood search is performed based on that value. Using the octagon distance avoids the high-precision arithmetic (for example, double-precision floating-point operations) that would be needed to calculate the Euclidean distance, which is the more exact distance.

Patent Document 1: JP 2011-086124 A

However, even in the above prior art, obtaining the octagon distance between two points requires a considerable amount of computation. In a multidimensional feature amount space in particular, the amount of computation increases exponentially with the number of dimensions, so the processing required for distance calculation becomes enormous when there are many types of feature amounts (that is, when the feature amount space has many dimensions) or when there are many points for which distances must be calculated. Consequently, although the prior art does not demand high arithmetic precision from the hardware, processing takes a very long time unless the hardware is capable of high-speed processing.

The present invention has been made in view of the above problem. Its object is to provide a technique that, in a neighborhood search method for searching for images near a predetermined reference position in a feature amount space and in a similar image search method, enables efficient neighborhood search or similar image search in the feature amount space without requiring high computational capability.

A first aspect of the present invention is a neighborhood search method in which an image is represented by a plurality of feature amounts and images located near a predetermined reference coordinate point are searched for in a multidimensional feature amount space in which each of the feature amounts serves as one coordinate axis. To achieve the above object, the method comprises: a candidate extraction step of executing, for a plurality of the coordinate axes, an extraction process that selects as candidate images, for one coordinate axis, those images whose value of the feature amount corresponding to that axis lies within a predetermined neighborhood range of the value of the reference coordinate point on that axis; and a neighborhood image specifying step of specifying, as neighborhood images of the reference coordinate point, the images selected as candidate images on all of the coordinate axes subjected to the candidate extraction step.

In the invention configured in this way, for each of the feature amounts representing an image, attention is paid only to the coordinate axis corresponding to that feature amount in the feature amount space, and images whose value of that feature amount lies within the neighborhood range of the reference coordinate point are taken as the candidate images for that axis. The only processing this requires is calculating the feature amounts and comparing simple scalar values for each type of feature amount. Then, from the candidate images selected for each feature amount, those common to all feature amounts are found and taken as the neighborhood images.
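The two steps described above (per-axis candidate extraction followed by taking the common candidates) might be sketched as follows. This is only an illustrative sketch, not the patented implementation: the function name `neighborhood_search`, the dictionary layout, and the use of a symmetric half-width per axis as the "predetermined neighborhood range" are all assumptions made for the example.

```python
from typing import Dict, Sequence, Set


def neighborhood_search(
    features: Dict[int, Sequence[float]],
    reference: Sequence[float],
    half_widths: Sequence[float],
) -> Set[int]:
    """Return the ids of images that are candidates on every axis.

    Assumption: the neighborhood range on each axis is the interval
    [reference - half_width, reference + half_width].
    """
    neighbors = None
    for axis, (ref_value, half_width) in enumerate(zip(reference, half_widths)):
        # Candidate extraction: a scalar comparison on this axis only.
        candidates = {
            image_id
            for image_id, values in features.items()
            if abs(values[axis] - ref_value) <= half_width
        }
        # Neighborhood image specification: keep only the common candidates.
        neighbors = candidates if neighbors is None else neighbors & candidates
    return neighbors if neighbors is not None else set()


# Hypothetical data mirroring the abstract: I1 and I2 are candidates on X1,
# I1 and I3 are candidates on X2, so only I1 is a neighborhood image.
features = {1: (0.50, 0.52), 2: (0.55, 0.90), 3: (0.10, 0.48)}
result = neighborhood_search(features, reference=(0.5, 0.5), half_widths=(0.1, 0.1))
print(result)
```

No distance in the multidimensional space is computed here; each axis contributes only one subtraction and one comparison per image, and the final result is a set intersection.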

A common way to judge the degree of similarity among a plurality of images has been to obtain the distance between the points corresponding to the images in the feature amount space; the Euclidean distance, the Manhattan distance, and the Mahalanobis distance, for example, are used for this purpose. However, various feature amounts are used to represent the features of defect images, and the feature amount space can have more than 100 dimensions. As the number of dimensions grows, the amount of computation needed to obtain these distances between the images becomes enormous, and a processor with high computational capability, such as floating-point processing, becomes essential.

In the processing of the present invention, by contrast, the degree of similarity between two defect images is judged by comparing values within a single feature amount, repeated once for each type of feature amount. The amount of computation is therefore greatly reduced compared with conventional methods, and high computational capability is not needed. That is, according to the present invention, a neighborhood search in the feature amount space can be performed efficiently without requiring high computational capability. The invention is therefore well suited to searching many images for those having a given feature and to finding images that have mutually similar features.

Note that, in the present invention, the images selected as candidate images for all of the feature amounts subjected to the candidate extraction step are taken as neighborhood images, but it is not essential that the candidate extraction step be executed for all of the types of feature amounts representing the images. That is, the concept of the present invention also includes a mode in which the candidate extraction step is executed for a subset of the many feature amounts, selected according to the purpose, and the candidate images common to those feature amounts are taken as neighborhood images.

In the present invention, when neighborhood images are searched for among a plurality of images, the neighborhood range for a coordinate axis may be set, for example, according to how the values of the feature amount corresponding to that axis are distributed across the images. How the values of a given feature amount are distributed across the images depends on the content of the collected images and is difficult to predict in advance. If a neighborhood range set without regard to the actual distribution is used, the extent regarded as the neighborhood may be judged inappropriately. Setting the neighborhood range according to the distribution of feature amount values in the collected images before performing the neighborhood search avoids this problem. Thus, by determining the neighborhood range from the distribution of feature amount values across the images rather than fixing it uniformly in advance, a neighborhood search suited to the actual image content becomes possible.

More specifically, for example, the numerical range that includes the maximum and minimum values, across the plurality of images, of the feature amount corresponding to a coordinate axis can be divided equally into a predetermined number of sections, and the numerical range of one section can be used as the neighborhood range for that axis. The distribution range of the feature amount, from its minimum to its maximum, is thereby divided equally into a plurality of sections, and whether an image should be a candidate image can be determined simply by determining which section its feature amount value falls in. This further simplifies the processing.
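The equal division of the range between the minimum and maximum values can be illustrated with a short sketch. The function name `quantize_axis` is hypothetical, and the handling of the maximum value (clamping it into the last section) and of an axis with zero spread are assumptions that the text does not specify.

```python
from typing import List, Sequence


def quantize_axis(values: Sequence[float], divisions: int) -> List[int]:
    """Map each feature amount value on one axis to a section index.

    The range from the minimum to the maximum over all images is split
    into `divisions` sections of equal width, indexed 0..divisions-1.
    """
    low, high = min(values), max(values)
    if high == low:
        # Assumption: with no spread, every image falls in section 0.
        return [0 for _ in values]
    width = high - low
    bins = []
    for value in values:
        index = int((value - low) * divisions / width)
        # Assumption: the maximum value is clamped into the last section.
        bins.append(min(index, divisions - 1))
    return bins


# Hypothetical values of one feature amount for five images.
example_bins = quantize_axis([0.0, 2.5, 5.0, 9.5, 10.0], divisions=4)
print(example_bins)
```

Once every image carries a section index for each axis, deciding whether two images are close on that axis reduces to comparing small integers.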

Further, for example, images whose coordinate value on the axis lies in the same section as the reference coordinate point may be taken as candidate images. In this way, images whose feature amount value is close to the coordinate value of the reference coordinate point are selected as candidate images. All that is needed is information on which section each image belongs to; candidate images can be selected without using the feature amount values themselves, which simplifies the processing still further.

Alternatively, for example, images whose coordinate value on the axis lies in any of a plurality of consecutive sections including the section to which the coordinate value of the reference coordinate point belongs may be taken as candidate images. If candidate images are limited to those in the same section as the reference coordinate point, it becomes difficult to find candidate images common to all feature amounts. This tendency is especially strong in a high-dimensional feature amount space, where only images whose position substantially coincides with the reference coordinate point may end up being judged as "nearby". Widening the range of candidate images to a plurality of sections including the section to which the coordinate value of the reference coordinate point belongs enables a neighborhood search better suited to the nature of the images.
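The difference between the same-section rule and the widened rule can be sketched as below. The name `candidates_by_section` and the parameter `spread` are hypothetical; treating the "consecutive sections" as a symmetric window of `spread` sections on each side of the reference section is an assumption for illustration only.

```python
from typing import Dict, Set


def candidates_by_section(
    sections: Dict[int, int], reference_section: int, spread: int = 0
) -> Set[int]:
    """Select candidate images on one axis from section indices alone.

    spread=0 keeps only images in the same section as the reference;
    spread=1 also accepts the adjacent section on either side (assumed
    symmetric window).
    """
    return {
        image_id
        for image_id, section in sections.items()
        if abs(section - reference_section) <= spread
    }


# Hypothetical section indices of four images on one axis.
sections = {1: 3, 2: 4, 3: 6, 4: 2}
same_section = candidates_by_section(sections, reference_section=3, spread=0)
widened = candidates_by_section(sections, reference_section=3, spread=1)
print(same_section, widened)
```

With many axes, the same-section rule tends to leave an empty intersection, which is the difficulty the paragraph above points out; the widened rule trades some selectivity for a usable set of common candidates.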

A second aspect of the present invention is a similar image search method for searching a plurality of images for images that are similar to one another. To achieve the above object, the method comprises: a reference setting step of selecting one of the plurality of images as a reference image and setting, as a reference coordinate point, the point at which the reference image is located in a multidimensional feature amount space in which each of the plurality of feature amounts representing the reference image serves as one coordinate axis; a neighborhood search step of searching, by any of the neighborhood search methods described above, for neighborhood images of the reference coordinate point among the images other than the reference image; and a similar image specifying step of specifying the images specified as neighborhood images of the reference coordinate point as similar images that are similar to the reference image.

In the invention configured in this way, when one of the plurality of images is taken as the reference image, neighborhood images are searched for by the neighborhood search method described above, using the point at which the reference image is located in the feature amount space as the reference coordinate point. Images that are near the reference image in the feature amount space, that is, images similar to the reference image, are thereby found efficiently, and, as with the neighborhood search method, high computational capability is not required.

In the present invention, for example, among the plurality of feature amounts, those for which the difference between the maximum and minimum values across the plurality of images exceeds a threshold set in advance for that feature amount may have their corresponding coordinate axes subjected to the candidate extraction step. As noted above, it is not essential to perform the candidate extraction step for all of the feature amounts representing the images. For a feature amount whose value hardly differs between images, for instance, it is clear without performing the candidate extraction step that every image would be a candidate image. Executing the candidate extraction step for such a feature amount serves no purpose and may even harm the result (for example, whether an image is nearby may end up being decided by only a minute difference in that feature amount). Restricting the candidate extraction step to feature amounts whose values differ to some extent between images avoids this problem and simplifies the processing.
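Selecting the axes to be searched by comparing each feature amount's spread with its threshold could look like the following sketch. The function name `select_axes`, the row-per-image matrix layout, and the threshold values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def select_axes(
    feature_matrix: Sequence[Sequence[float]], thresholds: Sequence[float]
) -> List[int]:
    """Return the indices of axes whose (max - min) exceeds the threshold.

    feature_matrix[p][n] is the value of feature amount n for image p
    (0-based indices in this sketch).
    """
    selected = []
    for axis, threshold in enumerate(thresholds):
        column = [row[axis] for row in feature_matrix]
        if max(column) - min(column) > threshold:
            selected.append(axis)
    return selected


# Hypothetical data: the first feature amount barely varies between images,
# so its axis is left out of the candidate extraction step.
matrix = [
    [1.0, 5.0, 0.30],
    [1.1, 9.0, 0.31],
    [0.9, 2.0, 0.90],
]
axes = select_axes(matrix, thresholds=[0.5, 1.0, 0.5])
print(axes)
```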

Also, for example, prior to the reference setting step, the plurality of feature amounts may be calculated for each of the plurality of images and, for each feature amount, a table may be created that associates values of that feature amount with the images having those values; in the neighborhood search step, the neighborhood images are then searched for based on the tables. A table that associates values with images for each feature amount is useful for finding the neighborhood images of the reference image whichever of the images is chosen as the reference image, because other images whose value is close to the reference image's value for that feature amount can be obtained immediately by referring to the table. The benefit is especially pronounced when the search is repeated while the reference image is changed in turn, since the same tables can be referred to for every reference image.
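One way to picture such tables is as a per-feature lookup from a value to the set of images having that value, built once and reused for every reference image. The sketch below assumes the values have already been reduced to integer section indices; that choice, the names `build_tables` and `neighbors_from_tables`, and the `spread` parameter are assumptions for illustration, not details confirmed by this passage.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Sequence, Set


def build_tables(
    quantized: Dict[int, Sequence[int]]
) -> List[DefaultDict[int, Set[int]]]:
    """Build, for each feature amount, a table: section index -> image ids."""
    n_features = len(next(iter(quantized.values())))
    tables: List[DefaultDict[int, Set[int]]] = [
        defaultdict(set) for _ in range(n_features)
    ]
    for image_id, sections in quantized.items():
        for axis, section in enumerate(sections):
            tables[axis][section].add(image_id)
    return tables


def neighbors_from_tables(tables, quantized, reference_id: int, spread: int = 1):
    """Look up candidates per axis in the tables and intersect them."""
    result = None
    for axis, table in enumerate(tables):
        ref_section = quantized[reference_id][axis]
        candidates: Set[int] = set()
        for section in range(ref_section - spread, ref_section + spread + 1):
            candidates |= table.get(section, set())
        result = candidates if result is None else result & candidates
    if result is None:
        return set()
    # The reference image itself is excluded from its own neighbors.
    return result - {reference_id}


# Hypothetical section indices for four images and two feature amounts.
quantized = {1: [2, 5], 2: [3, 5], 3: [2, 9], 4: [7, 4]}
tables = build_tables(quantized)
print(neighbors_from_tables(tables, quantized, reference_id=1))
print(neighbors_from_tables(tables, quantized, reference_id=4))
```

The tables are built once; changing the reference image only changes which table entries are read, which is the reuse the paragraph above describes.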

According to the present invention, candidate images within the neighborhood range are selected for each feature amount, and the images that are candidate images in common for the feature amounts are taken as neighborhood images of the reference coordinate point. A neighborhood search in the feature amount space can thus be performed efficiently without requiring high computational capability.

FIG. 1 is a diagram showing the concept of the defect images processed in this embodiment.
FIG. 2 is a flowchart outlining the flow of processing in this embodiment.
FIG. 3 is a diagram showing an example of the distribution of images in the feature amount space.
FIG. 4 is a diagram showing the extent of the neighborhood in this embodiment.
FIG. 5 is a flowchart showing the preprocessing for searching for neighborhood images.
FIG. 6 is a diagram showing the concept of normalizing and quantizing feature amount values.
FIG. 7 is a diagram showing an example of a table that associates quantized feature amounts with image numbers.
FIG. 8 is a flowchart showing the grouping of defect images.
FIG. 9 is a diagram showing the schematic configuration of an inspection system to which the present invention can be suitably applied.

A defect image presentation method according to one embodiment of the present invention is described below. This embodiment is executed as a preliminary stage of automatic defect classification, in which a plurality of defect images are classified into several defect categories by an automatic learning algorithm. More specifically, it is a process that supports the user's task of finding, among a plurality of defect images collected by imaging inspection objects such as substrates, typical images that contain typical defects and can serve as teacher images for the automatic learning. When mutually similar images among the many collected defect images are grouped and presented, the neighborhood search method and the similar image search method of the present invention are applied to the processing that associates images having similar features with one another. Before the embodiment is described in detail, the concepts underlying the embodiment and the following description are explained.

FIG. 1 is a diagram showing the concept of the defect images processed in this embodiment. As shown in FIG. 1(a), each defect image (hereinafter sometimes simply called an "image") belonging to a defect image group IG collected in advance is assigned a sequential, unique image number (1, 2, ...) as an identification code that distinguishes the individual images. In the following description, the total number of images in the defect image group IG is denoted by P (P is a natural number of 2 or more), and an arbitrary one of those images is denoted by image number p (p is a natural number, 1 ≤ p ≤ P).

The features of the defect contained in each image are expressed by a plurality of feature amounts. As shown in FIG. 1(b), in this embodiment the features of a defect are represented by N types of feature amounts (N is a natural number of 2 or more), which are denoted by the symbols X1, X2, ..., XN using an uppercase X. An arbitrary one of these feature amounts is denoted by Xn (n is a natural number, 1 ≤ n ≤ N).

The value of a feature amount for the defect image with image number p is denoted by xpn, using a lowercase x, where the first subscript p is the image number and the second subscript n identifies the feature amount. For example, the values of the N types of feature amounts for the defect image with image number 1 are denoted by x11, x12, ..., x1N, and the values of the feature amount X1 for image numbers 1, 2, ..., P are denoted by x11, x21, ..., xP1. When attention is paid to one feature amount Xn, the maximum of the values x1n, x2n, ..., xPn of that feature amount over the defect images is denoted by Mn, and the minimum by mn.
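The notation can be made concrete with a small sketch that stores the values as a P-by-N table and derives the per-feature maximum and minimum. The function name `axis_extrema` and the sample numbers are hypothetical, and the sketch uses 0-based indices where the text counts from 1.

```python
from typing import List, Sequence, Tuple


def axis_extrema(
    x: Sequence[Sequence[float]]
) -> Tuple[List[float], List[float]]:
    """Return ([M_1..M_N], [m_1..m_N]) for a P-by-N table of values.

    x[p][n] holds the value of feature amount n for image p
    (0-based here; the text numbers images and features from 1).
    """
    n_features = len(x[0])
    maxima = [max(row[n] for row in x) for n in range(n_features)]
    minima = [min(row[n] for row in x) for n in range(n_features)]
    return maxima, minima


# Hypothetical values: P = 3 images, N = 2 feature amounts.
x = [
    [1, 20],
    [4, 10],
    [2, 30],
]
maxima, minima = axis_extrema(x)
print(maxima, minima)
```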

FIG. 2 is a flowchart outlining the flow of processing in this embodiment. First, a plurality of mutually different defect images are acquired (step S101), and these defect images constitute the defect image group IG. Each defect image contains a defect, such as a pinhole or foreign matter, that appears on the exterior of the inspection object, for example a semiconductor substrate. The defect image group IG can be formed by collecting many images, such as images captured at different positions on the same substrate and images captured on different substrates.

For each of the defect images acquired in this way, the plurality of feature amounts X1, ..., XN are calculated (step S102). The features of the defect contained in each defect image are thereby expressed quantitatively by the N types of feature amounts. If a multidimensional feature amount space with the feature amounts as coordinate axes is assumed, each defect image can be plotted in it as a coordinate point whose coordinate values are the image's feature amount values. Images whose defects have similar features are located close to each other in the feature amount space. In other words, the defects in images located close to each other in the feature amount space have similar features (for example, shape and size), and they can be presumed likely to be defects of the same type of origin.

To perform automatic defect classification with a learning algorithm accurately, images containing typical defects (typical images) that will serve as teacher images must be found among the collected defect images. Even defects of the same kind manifest in subtly different ways, and no technique has been established for obtaining typical images automatically by computer image processing or the like. At present, therefore, the selection of typical images is left to the experience-based judgment of skilled users. The user must scrutinize a large number of images thoroughly, with no clues in advance, to pick out typical images, which is a very heavy workload. It would be helpful if the user were provided with information on defect tendencies and correlations that can be grasped automatically from the collected images.

In this embodiment, therefore, the defect images are grouped so that images located close to each other in the feature amount space fall into the same group (step S103). Images having similar features are thereby associated with one another. Then, for each group formed in this way, some of the images belonging to the group are presented to the user as candidates for typical images. Among the groups formed, presentation proceeds preferentially in order from the largest group, that is, the group with the most images belonging to it (step S104).

The number of images presented is arbitrary; it may be all of the images belonging to a group or only some of them. The images may be presented, for example, by displaying them on a display means such as a display, or by outputting a list of the image numbers of the defect images belonging to the same group. Information that indicates the size of the group, for example the number of defect images belonging to it, may also be added.

The reason for presenting groups with many images preferentially is as follows. The larger a group of images having similar features, the more the defect having those features is a "typical" one that occurs frequently. For learning based on teacher images, it is therefore desirable that typical images corresponding to such a group be included in the teacher images. For this reason, images are presented preferentially from the larger groups.
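The presentation order of step S104 might be sketched as follows, given groups that have already been formed. The name `presentation_order`, the parameter `per_group`, and the choice to show the lowest image numbers of each group are assumptions for illustration; the text leaves the number and choice of presented images open.

```python
from typing import Iterable, List, Set, Tuple


def presentation_order(
    groups: Iterable[Set[int]], per_group: int = 3
) -> List[Tuple[int, List[int]]]:
    """Order groups from largest to smallest.

    Returns, for each group, its size and up to `per_group` image numbers
    (assumption: the lowest image numbers are the ones shown).
    """
    ordered = sorted(groups, key=len, reverse=True)
    return [(len(group), sorted(group)[:per_group]) for group in ordered]


# Hypothetical groups of image numbers produced by the grouping step.
groups = [{7}, {1, 2, 3, 4}, {5, 6}]
for size, sample in presentation_order(groups):
    print(size, sample)
```

Returning the group size alongside the sample corresponds to the optional size information mentioned earlier, and a group of one image still appears at the end of the list rather than being dropped.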

With this presentation, only images having mutually similar features are narrowed down and presented from among a large number of unorganized defect images that have not been associated with one another at all. The user need only choose some typical images from those presented, which greatly reduces the workload of designating typical images. That is, much of the work of finally selecting typical images from an unorganized set of images is performed automatically and objectively. The number of typical images selected from the images belonging to one group is arbitrary; for example, all of the images in the group, or all except those that clearly appear to have a different origin, may be taken as typical images.

Moreover, the user need not check every one of the mutually similar defect images; the user can judge whether a defect is typical by looking at the characteristic portions evident from the several images presented. Meanwhile, an image containing an unusual defect, for which few images have similar features, is still presented as a group, albeit a small one, so such unusual defects are not overlooked and the user can decide whether to treat them as typical images.

When typical images are selected solely by the user's visual inspection, a designation may be made that contradicts the image features as determined quantitatively from the feature amounts, and learning based on such a designation may instead confuse the classification results. In this embodiment, however, similar and dissimilar images are sorted in advance, and the user can designate typical images with that information in hand, so a designation that contradicts the quantitative judgment is very unlikely.

Next, a method for finding groups of images with mutually similar features among the many collected defect images is described. As noted above, images located close to one another in the feature amount space can generally be said to be similar. The degree of similarity between images can therefore be estimated by calculating their mutual distances in the feature amount space. The Euclidean distance, the Manhattan distance, the Mahalanobis distance, and the like are known as parameters expressing such similarity, but the amount of computation required to obtain these distances between many images in a high-dimensional space is enormous. In addition, it is known as the "curse of dimensionality" that comparing distances between defect images in a high-dimensional feature amount space may be almost meaningless. This embodiment involves no distance calculation in the multidimensional space, so these problems do not arise, and the degree of similarity between images is estimated by a simpler method, which is described next.

FIG. 3 shows an example of the distribution of images in the feature amount space. A two-dimensional feature amount space formed by two feature amounts X1 and X2 is used here as an example, but the same idea can be extended to feature amount spaces of higher dimension. As an example, consider the case where 20 defect images (P = 20) are plotted, according to their feature amount values, in a two-dimensional feature amount space whose coordinate axes are the feature amounts X1 and X2. In FIG. 3, each circle containing a number indicates the position of a defect image in the feature amount space, and the number is the image number of that defect image.

The 20 images with image numbers 1 to 20 are placed at positions in the coordinate space according to their values of the feature amounts X1 and X2. As indicated by the broken lines in FIG. 3, the images as a whole can be divided into several groups by gathering together images located close to one another. For example, the images with image numbers 1, 4, 7, 13, 17, and 20 are close to one another and together form one cluster. With one defect image taken as a reference image, other defect images in its neighborhood are included in the same group as the reference image, and further defect images in the neighborhood of those other defect images are also placed in the same group as the reference image. An isolated image that is far from all other images (for example, image number 8) may be regarded as a group whose only member is that isolated image.

FIG. 4 shows the neighborhood range used in this embodiment. In this embodiment, when one defect image is taken as the reference image Is, a neighborhood range R1 is set as appropriate on the coordinate axis corresponding to the feature amount X1 and a neighborhood range R2 is set as appropriate on the coordinate axis corresponding to the feature amount X2, each centered on the position of the reference image in the feature amount space. Images lying within both ranges are searched for, and any such image is determined to be a neighborhood image close to the reference image.

In this example, the images I1 and I2 lie within the neighborhood range R1 of the reference image Is on the coordinate axis corresponding to the feature amount X1. Of these, only the image I1, which also lies within the neighborhood range R2 on the coordinate axis corresponding to the feature amount X2, is taken as a neighborhood image of the reference image. The image I2 is outside the range on the X2 axis and is therefore not regarded as a neighbor. The image I3 lies within the neighborhood range R2 on the X2 axis but is outside the range on the X1 axis, so it is likewise not regarded as a neighbor. The image I4, which is outside the range on both coordinate axes, is naturally not a neighbor.
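The per-axis test applied to I1 through I4 can be sketched as below. The coordinates and the half-widths standing in for R1 and R2 are hypothetical values chosen only to reproduce the four cases in the text; they are not taken from FIG. 4.

```python
def is_neighbor(ref, cand, half_widths):
    """Return True if cand lies within the neighborhood range of ref on every axis."""
    return all(abs(c - r) <= h for r, c, h in zip(ref, cand, half_widths))

# Hypothetical (X1, X2) coordinates; Is is the reference image.
Is = (5.0, 5.0)
half_widths = (1.0, 1.0)  # assumed half-widths of R1 and R2, centered on Is
images = {
    "I1": (5.5, 4.6),  # inside on both axes
    "I2": (4.4, 8.0),  # inside on X1 only
    "I3": (9.0, 5.3),  # inside on X2 only
    "I4": (9.0, 9.0),  # outside on both axes
}

neighbors = [name for name, pos in images.items() if is_neighbor(Is, pos, half_widths)]
print(neighbors)
```

Each axis is checked independently with a single comparison, so no distance in the full feature amount space is ever computed.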

A determination based on this criterion is arguably less rigorous than one based on the Euclidean distance or the like described above. However, it can be made by repeating an independent operation for each coordinate axis, and it has the advantage that the amount of computation does not grow very much, in contrast to the distance calculations above, whose amount of computation is described as increasing exponentially as the number of dimensions increases. Furthermore, the purpose of the determination here is to support the user's judgment by grouping and presenting images that appear to be related, and for the purpose of searching for neighborhood images to that end, this determination method provides sufficient accuracy. In other words, the advantage of a small processing load is given priority here over computational exactness.

FIG. 5 is a flowchart showing preprocessing for the neighborhood image search. In this processing, the relationships among the feature amount values of the defect images are organized in advance for each type of feature amount so that neighborhood relationships among many defect images can be searched smoothly. It corresponds to steps S101 and S102 and part of step S103 of the processing shown in FIG. 2. When a plurality of defect images are acquired (step S201), sequential image numbers are assigned to them as identifiers (step S202). Then, N types of feature amounts are calculated for each defect image (step S203).

Next, one feature amount Xn is selected from the N types, and the defect images are ordered according to the magnitudes of its values x1n, x2n, ..., xPn. Here they are assumed to be arranged in ascending order, for example (step S205). The actual defect images need not be rearranged; it suffices to obtain a sequence of image numbers arranged in order of feature amount value.
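Obtaining the sequence of image numbers without moving any image data can be sketched as an index sort. The function name and the sample values are hypothetical; image numbers are assumed to start at 1, matching the sequential numbering of step S202.

```python
def image_numbers_in_ascending_order(values):
    """values[p - 1] is the value of the selected feature amount for image number p."""
    return sorted(range(1, len(values) + 1), key=lambda p: values[p - 1])

# Hypothetical values of one feature amount Xn for five images.
x_n = [0.42, 0.10, 0.77, 0.10, 0.31]
order = image_numbers_in_ascending_order(x_n)
print(order)
```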

Then, the difference between the maximum value Mn and the minimum value mn of these feature amount values is obtained (step S206). This difference is compared with a predetermined threshold a (step S207). If it exceeds the threshold a, the following steps S208 to S211 are executed; if it is equal to or less than the threshold, these steps are skipped.

In steps S208 to S211, to make the feature amount values easy to compare between defect images, the values are normalized and quantized, and a table associating those values with image numbers is created. However, when the difference between the maximum value Mn and the minimum value mn of a feature amount is equal to or less than the threshold, the feature amount is regarded as showing no significant difference between images, and this work is omitted. The threshold a can be set, for example, to 1000 times the smallest positive number representable in double-precision floating point. When the difference between the maximum value Mn and the minimum value mn is equal to or less than a threshold a set in this way, the differences in value between individual defect images are even smaller, and on the coordinate axis corresponding to that feature amount the images can be regarded as extremely close to one another without any comparison between them. By excluding feature amounts that have almost no effect on the result in this way, the effective number of dimensions becomes smaller than the original N, and the amount of computation can be reduced further.
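The check of steps S206 and S207 might look like the sketch below. It assumes that "the smallest positive number representable in double precision" means the smallest positive normalized double, which Python exposes as `sys.float_info.min`; the description does not say whether subnormal values are intended. The feature names and values are hypothetical.

```python
import sys

# Assumption: threshold a = 1000 x the smallest positive normalized double.
THRESHOLD_A = 1000 * sys.float_info.min

def axis_is_informative(values, threshold=THRESHOLD_A):
    """Step S207: keep the axis only if (max - min) exceeds the threshold."""
    return (max(values) - min(values)) > threshold

# Hypothetical feature amounts; X2 is identical for every image.
features = {"X1": [0.2, 0.9, 0.5], "X2": [3.0, 3.0, 3.0]}
kept = [name for name, vals in features.items() if axis_is_informative(vals)]
print(kept)
```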

In step S208, the difference δ in feature amount value between each pair of images adjacent in the ascending-order arrangement is obtained, and the average difference Δn is calculated by summing the values of δ and dividing by the number of data items. This gives a measure of the spread of the feature amount values. When a value δ is smaller than a predetermined threshold bn, the difference in feature amount between the adjacent images is regarded as substantially zero; it is excluded from the sum and also from the number of data items. The threshold bn can be set, for example, to (Mn − mn)/1000. This makes it possible, in the quantization of the feature amount performed next, to apply an appropriate quantization step that is not unnecessarily fine, according to how the feature amount values vary between images.
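A minimal sketch of the averaging in step S208, assuming the values are already sorted in ascending order and that bn = (Mn − mn)/1000 as in the example given in the text. The function name and sample data are hypothetical.

```python
def mean_gap(sorted_values, ratio=1000):
    """Average difference between adjacent sorted values, ignoring gaps below bn."""
    m_n, M_n = sorted_values[0], sorted_values[-1]
    b_n = (M_n - m_n) / ratio
    gaps = [b - a for a, b in zip(sorted_values, sorted_values[1:])]
    # Gaps smaller than bn count as zero and are dropped from sum and count alike.
    significant = [g for g in gaps if g >= b_n]
    if not significant:
        raise ValueError("no significant gap; the axis should have been skipped")
    return sum(significant) / len(significant)

# Hypothetical sorted values; the first two images share the same value.
delta_n = mean_gap([0.0, 0.0, 1.0, 2.0, 3.0])
print(delta_n)
```

In this example the zero gap between the first two values is excluded, so the average is taken over three gaps rather than four.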

In step S209, the quantization division number Kn on the coordinate axis corresponding to the feature amount Xn is obtained. The quantization division number Kn can be the smallest integer satisfying the following relationship:
Mn ≤ Kn·Δn
Further, in step S210, based on the quantization division number thus determined, the value xpn of the feature amount Xn for each defect image is normalized and quantized by the following expression:
ypn = Int{(xpn − mn)/Δn + 0.5}
Here, Int{x} is a function that returns the integer part of the variable x. The value 0.5 is added before taking the integer part so that the result is rounded to the nearest integer.

As a result, the feature amount values xpn of the defect images, which are distributed over the range from the minimum value mn to the maximum value Mn on the coordinate axis corresponding to the feature amount Xn, are converted into normalized and quantized values ypn that take discrete values from 1 to Kn. Information about minute differences in feature amount value between images is lost as a result, but this information is unnecessary for an application such as that of this embodiment and has almost no effect on the result. A table is then created that associates each quantized feature amount value ypn with the image numbers of the defect images taking that value (step S211).
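Steps S210 and S211 can be sketched as follows. The code applies the expression ypn = Int{(xpn − mn)/Δn + 0.5} literally, so the minimum value maps to 0; the text speaks of values running from 1 to Kn, and whether an offset is intended is not assumed here. The function names, the sample values, and the choice Δn = 1.0 are hypothetical.

```python
from collections import defaultdict

def quantize(values, delta_n):
    """Apply ypn = Int{(xpn - mn) / delta_n + 0.5} to every value."""
    m_n = min(values)
    return [int((x - m_n) / delta_n + 0.5) for x in values]

def build_table(quantized):
    """Map each quantized value (partition number) to the image numbers taking it."""
    table = defaultdict(list)
    for image_number, y in enumerate(quantized, start=1):
        table[y].append(image_number)
    return dict(table)

# Hypothetical values of one feature amount for image numbers 1 to 5.
x_n = [2.0, 2.4, 3.1, 5.0, 2.1]
y_n = quantize(x_n, delta_n=1.0)
table = build_table(y_n)
print(y_n, table)
```

Keeping the table keyed by partition number is what later allows neighbors on one axis to be read out by key lookup instead of by comparing values.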

This processing is performed for all types of feature amounts (step S212). That is, the normalization and quantization of feature amount values and the creation of a table based on them are performed while the selected feature amount is changed in turn. Because some feature amounts may be excluded in step S207, tables are not necessarily created for all N types of feature amounts. In the following, the number of feature amounts for which a table has been created is denoted by N' (≤ N).

FIG. 6 illustrates the concept of normalizing and quantizing the feature amount values, and FIG. 7 shows an example of tables associating quantized feature amounts with image numbers. The defect image group used in these figures and their description is the same as that shown in FIG. 3. Quantizing the feature amount values in a two-dimensional feature amount space corresponds to moving the coordinate point representing each image to the nearest of the grid points arranged at equal intervals, as indicated by the broken lines in FIG. 6. Normalizing the feature amount values corresponds to scaling each coordinate axis so that the minimum quantized value on that axis is 1 and the maximum is Kn. In the example of FIG. 6, the quantization division number Kn is 9 on the horizontal axis corresponding to the feature amount X1 (n = 1, that is, K1 = 9) and 10 on the vertical axis corresponding to the feature amount X2 (n = 2, that is, K2 = 10). These are merely examples; as described above, the quantization division number is determined by the distribution of feature amount values on each coordinate axis.

The tables illustrated in FIG. 7 show, for each feature amount, where each of the defect images with image numbers 1 to 20 has been positioned in the feature amount space as a result of the normalization and quantization described above. FIG. 7(a) is the table for the feature amount X1, and FIG. 7(b) is the table for the feature amount X2. In the figure, "partition number" is synonymous with the normalized and quantized feature amount value, and "image numbers in the partition" means the image numbers of the defect images whose feature amount value equals the partition number after normalization and quantization. For example, FIG. 7(a) shows that three images, image numbers 4, 7, and 10, have a normalized and quantized value of 1 on the coordinate axis corresponding to the feature amount X1.

If a table associating feature amount values with image numbers is created for each feature amount in this way, then when one image number is designated as the reference image, the image numbers of other defect images whose feature amount values are close to that of the reference image can be read from the table immediately.

Specifically, the tables are used as follows. Consider, for example, searching the feature amount space for neighborhood images of the defect image with image number 1, taken as the reference image. When image number 1 is designated as the image number of the reference image, the images whose feature amount value lies in the neighborhood range of the reference image's value are identified from each table as candidate images. Here, the neighborhood range covers the same partition as the reference image and the partitions adjacent to it. For the feature amount X1, as is clear from FIG. 7(a), five images belong to the same partition as image number 1 (partition number 2), namely image numbers 1, 11, 13, 18, and 20, and five belong to the adjacent partitions (partition numbers 1 and 3), namely image numbers 4, 7, 10, 14, and 17. The candidate images for the feature amount X1 are therefore the images with image numbers 1, 4, 7, 10, 11, 13, 14, 17, 18, and 20.

For the feature amount X2, as can be seen from FIG. 7(b), image numbers 1, 7, and 16 belong to the same partition number 2 as the reference image, and image numbers 4, 13, 17, and 20 belong to the adjacent partition numbers 1 and 3. The candidate images for the feature amount X2 are therefore the images with image numbers 1, 4, 7, 13, 16, 17, and 20.

An image in the neighborhood within the feature amount space is an image that lies in the neighborhood range of the reference image on every coordinate axis (feature amount). Accordingly, taking the intersection of the candidate image sets for the feature amounts X1 and X2 and removing the reference image itself yields image numbers 4, 7, 13, 17, and 20, and the images with these numbers are the neighborhood images of the reference image (image number 1). The feature amount space in this case is the N'-dimensional space obtained from the original N-dimensional space by removing the coordinate axes for which the difference between the maximum value Mn and the minimum value mn was judged to be small (step S207 in FIG. 5). Using this principle, defect images are grouped in this embodiment as follows.
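The table lookup and intersection for reference image number 1 can be sketched as below. Only partitions 1 to 3 of each table are reproduced, from the image numbers quoted in the description. How image numbers 4, 13, 17, and 20 are split between partitions 1 and 3 of the X2 table is not stated, so the split used here is an assumption, as are the function names and the `radius` parameter.

```python
def candidates(table, ref, radius=1):
    """Image numbers in the partition of ref and in partitions within `radius` of it."""
    ref_partition = next(k for k, imgs in table.items() if ref in imgs)
    found = set()
    for k in range(ref_partition - radius, ref_partition + radius + 1):
        found.update(table.get(k, ()))
    return found

def neighbors(tables, ref, radius=1):
    """Candidates common to every table, with the reference image itself removed."""
    common = set.intersection(*(candidates(t, ref, radius) for t in tables))
    return common - {ref}

# Partitions 1 to 3 only, from the numbers quoted for FIG. 7(a) and FIG. 7(b).
table_x1 = {1: [4, 7, 10], 2: [1, 11, 13, 18, 20], 3: [14, 17]}
table_x2 = {1: [4, 13], 2: [1, 7, 16], 3: [17, 20]}  # split of 1 and 3 is assumed

result = sorted(neighbors([table_x1, table_x2], ref=1))
print(result)
```

With these inputs the result agrees with the image numbers 4, 7, 13, 17, and 20 given in the description, whichever way the X2 images are split between partitions 1 and 3.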

FIG. 8 is a flowchart showing the defect image grouping processing, which corresponds to step S103 in FIG. 2. First, a defect image that has not yet been grouped is chosen from the collected defect images and designated as the reference image (step S301). A suitable group label identifying the group is assigned to this reference image (step S302). So that the same label is never given to different groups, the group label used here is a new one not assigned to any existing group.

Next, neighborhood images of the reference image are searched for. Based on the principle described above, the N' tables created earlier are consulted, and for each feature amount the images whose value lies in the neighborhood range of the reference image's value are extracted as candidate images (step S303). The images extracted as candidates in all N' tables are identified (step S304), and those remaining after the reference image itself is removed are identified as the neighborhood images of the reference image (step S305). Among the neighborhood images thus identified, those not yet given a group label are assigned the same group label as the reference image (step S306). The reference image and its neighborhood images are thereby associated as one group.

In keeping with the intent of this embodiment, an image in the neighborhood of a neighborhood image must also be given the same group label as the reference image, even if it lies beyond the neighborhood range of the reference image. The search for neighborhood images and the label assignment are therefore continued for the neighborhood images of the reference image. Specifically, the image numbers of the neighborhood images obtained above are added to a stack (step S307), and those images are designated in turn as the new reference image (step S309) while the search for neighborhood images of that image and the assignment of the group label to the neighborhood images found are performed (steps S303 to S306). Any new neighborhood image found is added to the stack of the arithmetic processor (step S307). By repeating this until the stack is empty (step S308), associations between images through their mutual neighborhood images spread outward one after another from the first reference image, and the same group label is assigned to all of them.

When the stack becomes empty, no other images remain in the neighborhood of the group; that is, the spread of the group has converged. The processing then returns to step S301 (step S310), and the above processing is repeated with an image that has not yet been grouped as a new reference image. A label different from the previous group labels is assigned at this time. The processing is repeated until every defect image has been given a group label, so that all images are grouped. The method of presenting the grouping result to the user is as described earlier.
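The loop of FIG. 8 can be sketched with an explicit stack. This is an illustration under assumptions: the neighborhood search of steps S303 to S305 is replaced by a lookup in a hypothetical adjacency dictionary, integer labels are used, and only newly labeled images are pushed onto the stack so that the loop terminates.

```python
def group_images(image_numbers, find_neighbors):
    """Assign one group label to each image by spreading labels through neighbors."""
    labels = {}
    next_label = 0
    for seed in image_numbers:           # S301: pick an image not yet grouped
        if seed in labels:
            continue
        labels[seed] = next_label        # S302: assign a fresh group label
        stack = [seed]
        while stack:                     # S308: repeat until the stack is empty
            ref = stack.pop()            # S309: next reference image
            for nb in find_neighbors(ref):   # S303 to S305 (stubbed out here)
                if nb not in labels:
                    labels[nb] = next_label  # S306: same label as the reference
                    stack.append(nb)         # S307: examine its neighbors later
        next_label += 1
    return labels

# Hypothetical neighborhood relation; image 4 is isolated.
adjacency = {1: [2], 2: [1, 3], 3: [2], 4: [], 5: [6], 6: [5]}
labels = group_images(sorted(adjacency), lambda p: adjacency[p])
print(labels)
```

Image 3 ends up in the same group as image 1 even though it is a neighbor only of image 2, which is the chaining behavior the description calls for.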

In the above example, the "neighborhood range" was described as covering the same partition as the reference image and the partitions adjacent to it, in terms of the normalized and quantized feature amount values. The grouping result naturally depends on how this neighborhood range is set. If the neighborhood range is widened, images with relatively large differences are placed in the same group. Conversely, if it is narrowed, images are treated as separate groups even when their features differ only slightly. In a high-dimensional feature amount space in particular, small differences in appearance tend to become large distances in the feature amount space, leaving each image isolated. Although this varies with the number of defect images and the distribution of the feature amounts, the inventor has found that, for defect image groups with Kn of about 100, good results are obtained when about ±10 to ±50 partitions around the partition of the reference image are included in the neighborhood range.

FIG. 9 shows the schematic configuration of an inspection system to which the present invention can suitably be applied. This inspection system 1 inspects a semiconductor substrate S, the inspection target, for defects such as pinholes and foreign matter appearing on its exterior, and automatically classifies the detected defects. The inspection system 1 has an imaging device 2 that images an inspection target region on the substrate S, and a host computer 5 serving as a control unit. The host computer 5 has an inspection and classification function, which performs defect inspection based on image data from the imaging device 2 and, when a defect is detected, automatically classifies it into the category to which it should belong (ADC; automatic defect classification), and a function of controlling the overall operation of the inspection system 1. The imaging device 2 is incorporated in the manufacturing line for the substrate S, and the inspection system 1 is a so-called in-line system.

The imaging device 2 has an imaging unit 21 that acquires image data by imaging the inspection target region on the substrate S, a stage 22 that holds the substrate S, and a stage drive unit 23 that moves the stage 22 relative to the imaging unit 21. The imaging unit 21 has an illumination unit 211 that emits illumination light, an optical system 212 that guides the illumination light to the substrate S and receives light from the substrate S, and an imaging device 213 that converts the image of the substrate S formed by the optical system 212 into an electrical signal. The stage drive unit 23 is composed of a ball screw, guide rails, and a motor. A device control unit 501 provided in the host computer 5 controls the stage drive unit 23 and the imaging unit 21, whereby the inspection target region on the substrate S is imaged.

By executing a control program loaded in advance, the host computer 5 implements in software the functional blocks shown in FIG. 1. In addition to the device control unit 501 described above, the host computer 5 includes functional blocks such as a defect detection unit 502, a defect classification unit (ADC) 503, a feature amount calculation unit 504, a teaching unit 505, a determination unit 506, and a calculation unit 507. The host computer 5 further includes a storage unit 510 for storing various data, an input reception unit 511 such as a keyboard and mouse that accepts operation input from the user, and a display unit 512 that displays visual information for the user such as operating procedures and processing results. Although not shown, it also has a reading device that reads information from a computer-readable recording medium such as an optical disk, magnetic disk, or magneto-optical disk, and a communication unit that exchanges signals with other components of the inspection system 1, connected as appropriate via an interface (I/F) or the like.

The defect detection unit 502 detects defects by processing the image data of the inspection target region and finding anomalous areas within it. When the defect detection unit 502 detects a defect in the inspection target region, the image data of the defect and the various data used in the inspection are temporarily saved in the storage unit 510.

The defect classification unit 503 executes, in software, processing that classifies the detected defects using a learning algorithm such as an SVM (support vector machine), a neural network, a decision tree, or discriminant analysis. The feature amount calculation unit 504 calculates feature amounts characterizing a detected defect based on its image data. The teaching unit 505 supplies the defect classification unit 503 with teaching data for machine learning of the above algorithm.

The determination unit 506 and the calculation unit 507 each perform processing for calculating a certainty that indicates the validity of the classification result produced by the defect classification unit 503. Specifically, the determination unit 506 determines whether the value of each feature amount of a defect image fits each classification category, and the calculation unit 507 calculates the certainty from the determination results of the determination unit 506.

The present invention can be suitably applied to the inspection system configured as described above. Specifically, when the defect classification unit 503 performs machine learning using some of the many defect images captured by the imaging device 2 as typical images, the present invention can be used to assist the user in finding those typical images among the defect images. The candidates for typical images obtained by grouping can be presented to the user by displaying them on the display unit 512 of the host computer 5. When the user designates a typical image, the teaching unit 505 creates teaching data based on that information and supplies it to the defect classification unit 503.

As described above, in this embodiment, a plurality of defect images having mutually similar features are found by taking one defect image as a reference image and searching for images located in the neighborhood of that reference image in the feature amount space. The search for neighborhood images does not rely on distance calculation in the feature amount space. Instead, for each of the coordinate axes that make up the feature amount space (that is, for each feature amount), the images lying within a neighborhood range of the reference image are selected as candidate images, and the candidate images common to all of the feature amounts are taken as the neighborhood images.

With this search method, the candidate images for a given feature amount can be selected merely by comparing numerical values of that one feature amount, and the neighborhood images are identified by extracting the images common to the candidate sets of all feature amounts. The amount of processing is therefore greatly reduced compared with conventional search methods that involve distance calculation in the feature amount space, and high computational capability is not required. This advantage is especially pronounced when the feature amount space has many dimensions or when many images are collected.
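The per-axis selection and intersection described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation of the embodiment: the function names, the use of a symmetric half-width as the neighborhood range, and the sample feature values (modeled on the images Is, I1, I2, and I3 of the abstract) are all assumptions made for the example.

```python
def candidates_on_axis(axis_values, ref_value, half_width):
    """Select images whose value on ONE axis lies within ref_value +/- half_width.
    Only scalar comparisons are needed; no distance is computed."""
    return {img for img, v in axis_values.items() if abs(v - ref_value) <= half_width}


def find_neighbors(features, ref_id, half_widths):
    """features: {image_id: [x1, x2, ...]}; half_widths: one value per axis
    (hypothetical representation of the neighborhood ranges)."""
    ref = features[ref_id]
    common = None
    for axis, width in enumerate(half_widths):
        # Candidate extraction for this axis, excluding the reference image itself.
        axis_values = {img: vec[axis] for img, vec in features.items() if img != ref_id}
        cand = candidates_on_axis(axis_values, ref[axis], width)
        # Keep only images that have been candidates on every axis so far.
        common = cand if common is None else common & cand
        if not common:
            break  # no image can survive the remaining intersections
    return common or set()


# Assumed sample data: two feature amounts X1 and X2 per image.
features = {
    "Is": [0.50, 0.50],  # reference image
    "I1": [0.55, 0.45],  # near Is on both axes
    "I2": [0.45, 0.90],  # near Is on X1 only
    "I3": [0.90, 0.55],  # near Is on X2 only
}
neighbors = find_neighbors(features, "Is", half_widths=[0.1, 0.1])
print(neighbors)
```

In this sample, I1 and I2 are candidates on X1, I1 and I3 are candidates on X2, and only I1 is common to both, so it is the one neighborhood image. The work per axis is a single pass of comparisons, which is the source of the reduced processing load noted above.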

As explained above, in this embodiment, step S303 in FIG. 8 corresponds to the "candidate extraction step" of the neighborhood search method according to the present invention, and step S304 corresponds to the "neighborhood image specifying step". Together, these steps correspond to the "neighborhood search step" of the similar image search method according to the present invention, step S301 corresponds to the "reference setting step", and step S305 corresponds to the "similar image specifying step".

The present invention is not limited to the embodiment described above, and various modifications other than those described can be made without departing from its spirit. For example, in the above embodiment the value of each feature amount of the defect images is normalized and quantized before the neighborhood images of the reference image are searched for, but this is done for computational convenience; normalizing or quantizing the feature amount values is not an essential requirement of the present invention. Only one of these two processes may be performed.

Accordingly, the same effect can be obtained if, for example, for each feature amount the range spanning the minimum to the maximum value of that feature amount among the plurality of defect images is divided equally into a plurality of sections, and a table like that of FIG. 7 is created according to which of those sections the feature amount value of each image falls into.
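The equal-division variant just described can be illustrated with the following Python sketch. It is a sketch under stated assumptions rather than the table of FIG. 7 itself: the table layout (one mapping from section index to image IDs per feature amount), the function names, the `spread` parameter for including adjacent sections, and the sample values are all hypothetical.

```python
from collections import defaultdict


def section_index(value, lo, hi, num_sections):
    """Index of the equal-width section of [lo, hi] that contains value."""
    if hi == lo:  # every image has the same value on this axis
        return 0
    idx = int((value - lo) / (hi - lo) * num_sections)
    return min(idx, num_sections - 1)  # the maximum value belongs to the last section


def build_table(features, num_sections):
    """For each axis, map section index -> set of image ids in that section."""
    num_axes = len(next(iter(features.values())))
    bounds, table = [], []
    for axis in range(num_axes):
        column = [vec[axis] for vec in features.values()]
        lo, hi = min(column), max(column)
        sections = defaultdict(set)
        for img, vec in features.items():
            sections[section_index(vec[axis], lo, hi, num_sections)].add(img)
        bounds.append((lo, hi))
        table.append(sections)
    return bounds, table


def neighbors_from_table(features, bounds, table, ref_id, num_sections, spread=0):
    """Intersect, over all axes, the images in the reference image's section
    plus `spread` adjacent sections on each side."""
    common = None
    for axis, sections in enumerate(table):
        lo, hi = bounds[axis]
        ref_sec = section_index(features[ref_id][axis], lo, hi, num_sections)
        cand = set()
        for s in range(ref_sec - spread, ref_sec + spread + 1):
            cand |= sections.get(s, set())
        cand.discard(ref_id)
        common = cand if common is None else common & cand
    return common or set()


# Assumed sample data: two feature amounts with very different value ranges.
features = {
    "A": [0.0, 10.0],
    "B": [0.1, 12.0],
    "C": [0.9, 11.0],
    "D": [1.0, 50.0],
}
bounds, table = build_table(features, num_sections=4)
same_section = neighbors_from_table(features, bounds, table, "A", num_sections=4)
print(same_section)
```

Because each axis is divided over its own minimum-to-maximum range, the two feature amounts need no prior normalization. With `spread=0` only images in the same section as the reference image are candidates; a larger `spread` widens the neighborhood range to a run of contiguous sections.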

Although the above embodiment is an image classification apparatus that inspects and classifies defects of semiconductor substrates, the image classification apparatus to which the present invention is applied is not limited to an apparatus for inspecting semiconductor substrates. It may be an apparatus that inspects other objects, such as printed circuit boards or glass substrates, or a surface inspection apparatus that inspects the surface condition of various materials.

In the above embodiment, the present invention is applied to an apparatus that selects images having mutually similar features from defect images of semiconductor substrates and the like. The images targeted by the present invention are not limited to such defect images, however, and the present invention can be applied to techniques in general that handle various images evaluated on the basis of feature amounts.

The present invention is suitable for techniques that handle images represented by a plurality of feature amounts, and is particularly well suited to searching for images having predetermined features among many images or finding images having mutually similar features.

DESCRIPTION OF SYMBOLS
1 Inspection system
2 Imaging device
503 Defect classification unit
504 Feature amount calculation unit
505 Teaching unit
506 Determination unit
507 Calculation unit
512 Display unit
S301 Reference setting step
S303 Candidate extraction step, neighborhood search step
S304 Neighborhood image specifying step, neighborhood search step
S305 Similar image specifying step

Claims

A neighborhood search method for searching for an image located in the vicinity of a predetermined reference coordinate point in a multidimensional feature amount space in which an image is represented by a plurality of feature amounts and each of the plurality of feature amounts is a coordinate axis, the method comprising:
a candidate extraction step of executing, for each of a plurality of the coordinate axes, an extraction process that selects, as a candidate image, an image whose value of the feature amount corresponding to that coordinate axis is within a predetermined neighborhood range of the value of the reference coordinate point on that coordinate axis; and
a neighborhood image specifying step of specifying, as a neighborhood image of the reference coordinate point, an image selected as the candidate image on all of the plurality of coordinate axes targeted by the candidate extraction step.

The neighborhood search method according to claim 1, wherein the neighborhood image is searched for from among a plurality of images, and
the neighborhood range for one coordinate axis is set in accordance with the distribution, among the plurality of images, of the values of the feature amount corresponding to that coordinate axis.

The neighborhood search method according to claim 2, wherein a numerical range including the maximum value and the minimum value, among the plurality of images, of the feature amount corresponding to one coordinate axis is equally divided into a predetermined number of sections, and the numerical range of one such section is used as the neighborhood range for the coordinate axis corresponding to that feature amount.

The neighborhood search method according to claim 3, wherein an image having a coordinate value on the coordinate axis in the same section as the reference coordinate point is set as the candidate image.

The neighborhood search method according to claim 3, wherein an image having a coordinate value on the coordinate axis in any of a plurality of continuous sections including a section to which the coordinate value of the reference coordinate point belongs is used as the candidate image.

A similar image search method for searching for mutually similar images from among a plurality of images, the method comprising:
a reference setting step of selecting one of the plurality of images as a reference image and setting, as a reference coordinate point, the point at which the reference image is located in a multidimensional feature amount space in which each of a plurality of feature amounts representing the reference image is a coordinate axis;
a neighborhood search step of searching for a neighborhood image of the reference coordinate point from among the images other than the reference image in the plurality of images by the neighborhood search method according to any one of claims 1 to 5; and
a similar image specifying step of specifying the image specified as the neighborhood image of the reference coordinate point as a similar image that is similar to the reference image.

The similar image search method according to claim 6, wherein, among the plurality of feature amounts, the coordinate axis corresponding to a feature amount whose difference between the maximum value and the minimum value among the plurality of images is larger than a threshold value set in advance for that feature amount is made a target of the candidate extraction step.

The similar image search method according to claim 6, wherein, prior to the reference setting step, the plurality of feature amounts are calculated for each of the plurality of images and a table is created that associates, for each of the plurality of feature amounts, the feature amount values with the images having those values, and in the neighborhood search step the neighborhood image is searched for based on the table.