JP2002324206A

JP2002324206A - Data analyzing method and its device

Info

Publication number: JP2002324206A
Application number: JP2001127534A
Authority: JP
Inventors: Eidai Shirai; 英大白井; Hidetaka Tsuda; 英隆津田
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2001-04-25
Filing date: 2001-04-25
Publication date: 2002-11-08

Abstract

PROBLEM TO BE SOLVED: To provide a data analysis method capable of clarifying the degree of confounding among a plurality of explanatory variables. SOLUTION: The data analysis method comprises a step of preparing data results for explanatory variables and an objective variable, a step of calculating the degree of confounding and/or the degree of independence among the explanatory variables based on the data results, and a step of performing data mining using the degree of confounding and/or the degree of independence.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a method and an apparatus for performing data analysis that grasps the relationships among data handled widely in industry and extracts significant results that lead to industrially advantageous outcomes, and for evaluating the accuracy and the like of those results.

[0002] For example, the invention relates to data analysis that uses the equipment usage history, test results, design information, various measurement data, and the like acquired in a semiconductor manufacturing process to grasp how the yield fluctuates and thereby extract conditions advantageous for improving the yield, and to a method and an apparatus for evaluating the accuracy and the like of the results.

[0003] In particular, the invention relates to a method and an apparatus for obtaining more efficient and reliable analysis results by coping with cases in which a plurality of explanatory variables become confounded with one another (that is, are no longer independent), making significant differences difficult to extract.

[0004]

[Prior Art] The description proceeds using yield analysis of semiconductor data as an example. In particular, when reference data for deciding measures to improve quality and productivity is to be obtained from analysis results, as in process data analysis, the accuracy, reliability, and the like of those results are important; the present inventors have already filed an application on this point (Japanese Patent Application No. 2000-41896). To find the causes of yield reduction as quickly as possible and take countermeasures, data analysis is performed on the equipment history, test results, design information, various measurement data, and the like in order to find the factors affecting the yield and other factors affecting those factors. In data analysis, the quantity being analyzed, such as the yield value, is called the objective variable, and the equipment history, test results, design information, various measurement data, and the like that may be factors of the objective variable are called explanatory variables. Various statistical methods are applied in this analysis. Applying data mining as one of them makes it possible to extract valuable information and regularities that are hard to discern from a large volume of diverse data.

[0005] In practice, however, applying data mining does not always work well. In fields such as finance and distribution, the number of data records is enormous, in the millions, while the number of explanatory variables is at most several tens, so highly accurate analysis results have been obtained. In semiconductor process data analysis, by contrast, the number of data records is small, at most about 200 lots for the same product type, yet the number of explanatory variables reaches several hundred (equipment history, in-process inspection values, and so on). As a result, the explanatory variables are no longer independent of one another, and simply running data mining may not yield reliable results.

[0006]

[Problems to Be Solved by the Invention] The description uses yield analysis of semiconductor data as an example. In process data analysis where the explanatory variables (e.g., LSI manufacturing process data) are numerous relative to the number of data records (e.g., the number of lots), a plurality of explanatory variables often become confounded with one another (are no longer independent), and the problem frequently cannot be narrowed down sufficiently on the basis of statistically significant differences. Even when data mining (such as regression tree analysis) is applied, if this problem is present, considerable effort is needed to confirm the accuracy of the analysis results and the range over which they can be trusted.

[0007] FIG. 1 shows the relationship between the flow of lots and an abnormal manufacturing apparatus. A white circle "○" indicates a normal apparatus 101, and a black circle "●" indicates an abnormal apparatus 102. The arrows indicate the flow of lots. Analysis of differences between apparatuses in LSI manufacturing data extracts, from the data on the apparatus used by each lot in each process, which manufacturing apparatus in which manufacturing process affects the yield most.

[0008] FIG. 2 shows the yield distribution by apparatus (box-and-whisker plot) for one process according to the prior art. For each manufacturing process, the yield values of the lots are displayed as a box-and-whisker plot for each apparatus used; each process is checked in turn, and the process and apparatus showing the most pronounced difference are identified.

[0009] However, now that the number of processes has reached several hundred, this method requires a great many man-hours, and it is hard to reach a judgment when the differences are not clear-cut or when conditions are intertwined in a complicated way. A data mining technique based on regression tree analysis is effective for dealing with these cases; it divides the apparatuses used into a group for which the value of the objective variable is high and a group for which it is low. When lots are run with the apparatus used for each lot fixed, as shown in FIG. 3, the abnormal apparatus 102 indicated by the black circle "●" sometimes cannot be uniquely identified. That is, when the independence between explanatory variables is low, a variable whose two-way division of the set shows a large significant difference is not necessarily one whose significant difference is "truly large".

[0010] The above describes confounding among the apparatuses used in each semiconductor manufacturing process, but the same applies to confounding among the sets produced by the two-way division in a regression tree analysis. That is, the same holds for the sets consisting, for each process, of the group of apparatuses giving a high yield and the group of apparatuses giving a low yield. The same confounding of the two divided sets also arises when the explanatory variable is continuous-valued.

[0011] An object of the present invention is to provide a data analysis method and apparatus capable of clarifying the degree of confounding among a plurality of explanatory variables.

[0012]

[Means for Solving the Problems] According to one aspect of the present invention, there is provided a data analysis method comprising a step of preparing data results for explanatory variables and an objective variable, a step of calculating the degree of confounding and/or the degree of independence among a plurality of explanatory variables based on the data results, and a step of performing data mining using the degree of confounding and/or the degree of independence.

[0013] Calculating the degree of confounding and/or the degree of independence among a plurality of explanatory variables makes the extent of confounding among the explanatory variables clear. The degree of confounding can thus be evaluated quantitatively on the basis of the two-way set divisions of the regression tree analysis, and it becomes possible to identify the explanatory variables that need attention because they are confounded with the problematic explanatory variable showing a large significant difference at the first branch of the regression tree.

[0014]

[Embodiments of the Invention] FIG. 17 shows a data analysis apparatus incorporating data mining according to an embodiment of the present invention. A data mining unit 1703 extracts the features and regularities hidden in the data on the basis of the individual original data extracted from each database 1702 in an original data group 1701, and creates a rule file 1704. An analysis tool group 1705 includes a statistical analysis component 1706, a chart creation component 1707, and the like, and analyzes the individual original data extracted from the databases 1702 on the basis of the rule file 1704. The analysis results are fed back to the analysis tool group 1705 and the data mining unit 1703. The data mining unit 1703 performs data mining on the basis of the analysis results of the analysis tool group 1705 and the original data group 1701. The analysis tool group 1705 performs analysis on the basis of the rule file 1704, the individual original data extracted from the databases 1702, and its own analysis results. A decision-making unit 1708 makes decisions on the basis of the analysis results of the analysis tool group 1705.

[0015] When data mining is applied to yield data analysis, the data mining results are used to decide on measures for improving the yield, to judge whether a measure should be implemented, and to predict the effect of a measure. This requires quantitative evaluation of the data mining results and their accuracy.

[0016] Among decision tree analyses, which are one data mining technique, regression tree analysis is particularly effective. One advantage of regression tree analysis is that its results are output as easily understood rules, expressed in ordinary language or in a database language such as SQL. It therefore becomes possible to make effective use of the reliability and accuracy of these results, and to make effective decisions and take action (that is, implement countermeasures and the like) on the basis of them.

[0017] Regression tree analysis is briefly described here. It operates on a set of records, each consisting of explanatory variables representing a plurality of attributes and an objective variable affected by them, and determines the attributes and attribute values that most affect the objective variable. The data mining unit (regression tree analysis engine) outputs rules expressing the features and regularities of the data.

[0018] First, regression tree analysis is explained. The processing is realized by repeatedly dividing a set in two on the basis of the parameter value (attribute value) of each explanatory variable (attribute). At each division, let S0 be the sum of squares of the objective variable (yield) before the division, and S1 and S2 the sums of squares of the yield in the two sets after the division; the explanatory variable and parameter value by which the records are divided are chosen so that ΔS given by equation (1) is maximized.

[0019] ΔS = S0 − (S1 + S2) ... (1)
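As a minimal sketch of the split criterion in equation (1), the following Python code computes ΔS for a candidate division and searches the thresholds of one numeric explanatory variable for the largest ΔS. The function names (`sum_sq`, `delta_s`, `best_threshold_split`) and the midpoint threshold search are illustrative assumptions, not taken from the patent, and "sum of squares" is assumed to mean the sum of squared deviations from the set mean.

```python
from statistics import fmean


def sum_sq(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def delta_s(parent, left, right):
    """Equation (1): S0 - (S1 + S2), the reduction achieved by a split."""
    return sum_sq(parent) - (sum_sq(left) + sum_sq(right))


def best_threshold_split(x, y):
    """Try each midpoint between distinct sorted x values.

    Returns (threshold, delta_s) for the split with the largest delta_s,
    or (None, 0.0) if no split reduces the sum of squares.
    """
    best = (None, 0.0)
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        thr = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi <= thr]
        right = [yi for xi, yi in zip(x, y) if xi > thr]
        gain = delta_s(y, left, right)
        if gain > best[1]:
            best = (thr, gain)
    return best
```

A categorical explanatory variable such as the apparatus used would instead be searched over two-way partitions of its values; the ΔS computation itself is unchanged.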

[0020] The explanatory variable and parameter value obtained here correspond to a branch point in the regression tree. The same processing is then repeated on the divided sets to examine the influence of the explanatory variables on the objective variable. The above is the generally well-known regression tree analysis technique. To grasp in more detail how clear-cut each set division is, the following parameters (a) to (d) are used in addition to ΔS, for a plurality of top-ranked division candidates, as quantitative evaluations of the data mining results.

[0021] (a) S ratio
The reduction ratio of the sum of squares achieved by the set division; a parameter indicating how far the division reduced the sum of squares. The smaller this value, the greater the effect of the division and the more clearly the set has been divided, so the significant difference is large.

[0022] S ratio = ((S1 + S2) / 2) / S0 ... (2)
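Equation (2) is a one-line computation; the sketch below shows it with a guard for a degenerate parent set. The function name `s_ratio` and the error handling are assumptions added for illustration.

```python
def s_ratio(s0, s1, s2):
    """Equation (2): ((S1 + S2) / 2) / S0. Smaller means a cleaner split."""
    if s0 <= 0:
        raise ValueError("S0 must be positive (parent set has no variation)")
    return ((s1 + s2) / 2) / s0
```

For a perfect split both child sums of squares are zero, so the S ratio is 0; a split that leaves the children as scattered as the parent gives a value near 0.5.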

[0023] (b) t
The regression tree analysis engine divides the set in two; t is the value used to test the difference between the means (/X1, /X2) of the two divided sets. Here, "/" denotes an overline. The statistical t-test serves as a criterion for the significance of the difference between the mean values of the objective variable in the divided sets. For the same degrees of freedom, that is, the same number of data items, a larger t means the set has been divided more clearly and the significant difference is larger.

[0024] If there is no significant difference between the variances of the divided sets, t is obtained from equation (3); if there is a significant difference between the variances, t is obtained from equation (4). Here, N1 and N2 are the numbers of elements in divided set 1 and set 2, respectively. /X1 and /X2 are the means of the respective sets after division. S1 and S2 are the sums of squares of the objective variable in the respective sets obtained by dividing in two a set consisting of a plurality of original data items.

[0025]

[Math. 1] (formula image for equation (3))

[0026]

[Math. 2] (formula image for equation (4))
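The formula images for equations (3) and (4) are not included in this text. As a hedged reconstruction only, assuming that S1 and S2 are sums of squared deviations from the set means and that the equations are the standard pooled-variance (equal variances) and Welch (unequal variances) two-sample t statistics, they would take the following forms, writing the means /X1 and /X2 as $\bar{X}_1$ and $\bar{X}_2$. The patent's own formulas may differ in detail, for example by taking the absolute value of the difference.

```latex
% Equation (3), assumed form: variances not significantly different (pooled)
t = \frac{\bar{X}_1 - \bar{X}_2}
         {\sqrt{\dfrac{S_1 + S_2}{N_1 + N_2 - 2}
                \left(\dfrac{1}{N_1} + \dfrac{1}{N_2}\right)}}
\qquad (3)

% Equation (4), assumed form: variances significantly different (Welch)
t = \frac{\bar{X}_1 - \bar{X}_2}
         {\sqrt{\dfrac{S_1}{N_1 (N_1 - 1)} + \dfrac{S_2}{N_2 (N_2 - 1)}}}
\qquad (4)
```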

[0027] (c) Difference between the mean values of the objective variable in the divided sets
The larger this value, the larger the significant difference.

[0028] (d) Number of data items in each divided set
The smaller the difference between the two, the smaller the influence of abnormal values (noise).

[0029] FIG. 4 shows the format of example data input to the regression tree analysis. Records are per wafer number, and each record contains the apparatus used 411 in each manufacturing process, electrical characteristic data 412, and wafer yield 413. The explanatory variables 401 are the apparatus used 411, the electrical characteristic data 412, and so on. The objective variable 402 is the yield 413. Suppose, for example, that the apparatus used 411 and the electrical characteristic data 412 affect the yield. FIGS. 5 and 6 show the regression tree diagram and the list of evaluation statistics that result from regression tree analysis of this data.

[0030] FIG. 5 is the regression tree diagram resulting from the regression tree analysis. Root node n0 is divided into nodes n1 and n2. Node n1 is divided into nodes n3 and n4. Node n2 is divided into nodes n5 and n6. Node n6 is divided into nodes n7 and n8.

[0031] FIG. 6 shows the evaluation statistics for the explanatory variables at the first two-way division. For example, for the whole set, the mean Ave of the objective variable is 75, the standard deviation s is 12, and the number of data items N is 1000. Each of the list entries 601 to 604 shows, from left to right, the rank by significant difference, the S ratio, the t value, the difference between the mean values of the objective variable in the divided sets, the number of data items in each divided set, the attribute name (explanatory variable) by which the set is divided, and the attribute values (parameter values) of the two divided sets together with the relative magnitude of their objective variable. The entries 601 to 604 are candidate groupings ranked by the value of ΔS in equation (1) for the dividing attribute (explanatory variable), arranged in descending order of significant difference (ΔS). FIG. 5 is obtained by dividing node n0 into nodes n1 and n2 on the basis of the first candidate 601.

[0032] When the set n0 of all wafers in FIG. 5 is divided into the two sets n1 and n2 on the basis of the evaluation value ΔS of equation (1), what affects the yield most is whether AM1 or AM2 is used in process A, and the latter gives the better yield. Repeating the same set division on the divided sets yields this regression tree diagram. For the group of wafers that used AM2 in process A and CM2 in process C, the condition in which the electrical characteristic data RSP is 90 or less is the most effective (gives the highest yield).

[0033] FIG. 7 is equivalent to FIG. 5 and shows the correlation between the yield of the divided wafer sets and the apparatus used in specific processes and the electrical characteristic data. The higher an explanatory variable appears in the hierarchy of the regression tree diagram of FIG. 5, the greater its influence on the objective variable. The average yield of all wafers is 74.8%, but regression tree analysis automatically extracts the fact that, when the wafers are divided into several sets according to the apparatus used and the electrical characteristic data, such features and regularities exist, which provides clues for yield analysis.

[0034] In the regression tree diagram of FIG. 5, both of the top two levels are due to differences in the apparatus used, so in an analysis using all wafers, what affects the yield most, even taking compound conditions into account, is the difference in apparatus used. The electrical characteristic data appears to have little effect. However, FIGS. 5 and 7 show that, for the group of wafers that used AM2 in process A and CM2 in process C, RSP is what affects the yield most.

[0035] Next, an example of calculating the two-part confounding degree and the two-part independence degree is described. In regression tree analysis, the degree of confounding (the state of confounding, that is, the degree of non-independence) of each set division performed to find the explanatory variable most significant for the objective variable is grasped statistically, and the other explanatory variables confounded with the explanatory variable found to have a large significant difference are identified. The method of calculating the two-part confounding degree and the two-part independence degree is described with reference to FIG. 8.

[0036] (1) Among the explanatory variables, the one whose degree of confounding is to be evaluated is taken as the reference explanatory variable 801.

[0037] (2) The records form a table whose data value for each explanatory variable is "L" or "H". Here, H means that the record belongs to the set with the higher value of the objective variable when the set is divided in two in the regression tree analysis, and L means that it belongs to the set with the lower value. At the two-way division, L or H is determined for each explanatory variable of every record.

[0038] (3) As an evaluation value of how well the L and H values of each comparison explanatory variable 802 agree with those of the reference explanatory variable 801, let Na be the number of records whose L/H values match and N the total number of records, and define the two-part confounding degree DEP by equation (5). DEP ranges from −1 to 1: it is 1 if the variables are completely confounded, 0 if they are not confounded at all, and −1 if they are confounded in the opposite sense.

[0039] DEP = (2 × Na / N) − 1 ... (5)

[0040] On the basis of the two-part confounding degree DEP, the two-part independence degree IND is defined by equation (6). IND ranges from 0 to 1: it is 1 if the variables are completely independent and 0 if they are not independent at all.

[0041] IND = 1 − |DEP| ... (6)

[0042] (4) The two-part confounding degree DEP and the two-part independence degree IND are obtained between one reference explanatory variable 801 and each of the other explanatory variables 802, and are used as evaluation measures between explanatory variables. Which explanatory variable is taken as the reference is arbitrary, but in terms of usefulness it is effective to choose one found by the regression tree analysis to have a large significant difference with respect to the objective variable, particularly at the set division in the top level.

[0043] (5) Obtaining the two-part confounding degree DEP and the two-part independence degree IND makes it possible to evaluate quantitatively how much the assignment of each comparison explanatory variable 802 to the L and H sets differs from that of the reference explanatory variable 801.
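Steps (1) to (5) can be sketched in Python as follows, taking the H/L labels of the reference and comparison explanatory variables as equal-length sequences, one entry per record. The function names `dep` and `ind` and the input representation are assumptions made for illustration; the formulas are equations (5) and (6).

```python
def dep(reference, comparison):
    """Equation (5): DEP = 2 * Na / N - 1.

    reference, comparison: equal-length sequences of "H"/"L" labels,
    one per record. Na counts records whose labels agree.
    """
    n = len(reference)
    if n == 0 or n != len(comparison):
        raise ValueError("label sequences must be non-empty and equal length")
    na = sum(r == c for r, c in zip(reference, comparison))
    return 2 * na / n - 1


def ind(reference, comparison):
    """Equation (6): IND = 1 - |DEP|."""
    return 1 - abs(dep(reference, comparison))
```

Identical labelings give DEP = 1 and IND = 0, fully inverted labelings give DEP = −1 and IND = 0, and labelings that agree on exactly half the records give DEP = 0 and IND = 1.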

[0044] Obtaining the two-part confounding degree and/or the two-part independence degree makes it possible to evaluate the degree of confounding of explanatory variables quantitatively on the basis of the two-way set divisions of the regression tree analysis and, in combination with regression tree analysis, to extract automatically the other explanatory variables that are confounded with an explanatory variable found by the regression tree analysis to have a large significant difference.

[0045] The two-part confounding degree can be evaluated for any explanatory variable covered by the regression tree analysis. In terms of effectiveness, however, it is used to grasp statistically how strongly the explanatory variables ranked highest among the first division candidates of FIG. 6 (the reference explanatory variable, which appears in the list of evaluation statistics) are confounded with any other explanatory variable, and to extract the explanatory variables that need attention because they are confounded with an explanatory variable showing a large significant difference. The explanatory variable whose confounding with the reference explanatory variable 801 is to be analyzed is called the comparison explanatory variable 802; both are selected from the list of evaluation statistics of FIG. 6. An example of calculating the two-part confounding degree and the two-part independence degree is described with reference to FIG. 8.

[0046] In FIG. 8, the horizontal axis shows the wafer number 803, the comparison explanatory variables 802, the reference explanatory variable 801, and the yield 804, and the vertical axis shows the high-yield group 811 of the reference explanatory variable, the low-yield group 812 of the reference explanatory variable, the calculation formula 813 for the two-part confounding degree, the two-part confounding degree 814, and the two-part independence degree 815.

[0047] From among the top candidate items of FIG. 6 (the list of evaluation statistics), the item to serve as the basis for comparison is chosen as the reference explanatory variable 801. In FIG. 8, ST3 is the reference explanatory variable 801. The other explanatory variables are comparison explanatory variables 802; in FIG. 8 these are ST1, ST2, and WET2. Each comparison explanatory variable 802 is compared with the reference explanatory variable 801. For the explanatory variables ST1, ST2, ST3, and WET2, "L" (the low-yield group) is shown hatched and "H" (the high-yield group) is shown without hatching.

[0048] ST3, the reference explanatory variable 801, can be divided by its attribute value into the high-yield group 811 and the low-yield group 812. The high-yield group 811 is a set of 10 items, and the low-yield group 812 is also a set of 10 items.

[0049] Next, for each explanatory variable, the number of lots in its high-yield and low-yield groups that fall in the same group of the reference explanatory variable is counted and taken as Na. For example, for ST1, a comparison explanatory variable 802, 10 items in its high-yield group are in the high-yield group 811 of the reference explanatory variable and 2 items in its low-yield group are in the low-yield group 812 of the reference explanatory variable. That is, the number of items for which the comparison explanatory variable ST1 and the reference explanatory variable ST3 belong to the same group is Na = 10 + 2 = 12. The calculation formula 813 shows equation (5) with this Na substituted. Here, the number of data items N is 20. The result of the calculation is shown as the two-part confounding degree 814, and the value obtained from equation (6) is shown as the two-part independence degree 815. The two-part confounding degree 814 and the two-part independence degree 815 are shown at the bottom of each column of FIG. 8.
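As a worked check of the ST1 example, substituting the values stated in the text (Na = 12, N = 20) into equations (5) and (6) gives the following. The arithmetic is derived here from those two stated values; it is not read from FIG. 8, which is not reproduced in this text.

```latex
\mathrm{DEP} = \frac{2 \times N_a}{N} - 1
             = \frac{2 \times 12}{20} - 1
             = 1.2 - 1
             = 0.2

\mathrm{IND} = 1 - |\mathrm{DEP}|
             = 1 - 0.2
             = 0.8
```

On these numbers, ST1 would be only weakly confounded with the reference explanatory variable ST3 and largely independent of it.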

[0050] There are three basic ways of using the two-part confounding degree and the two-part independence degree. Explanatory variables that were previously hard to distinguish can be characterized with quantitative information as follows.

[0051] (1) Confirming the range of significant explanatory variables
Candidates confounded with a highly significant candidate are identified, and these too are judged to be significant explanatory variables. There is no particular threshold for the degree of confounding, but a judgment can be made by comparison with the values for the other explanatory variables. Also, when a candidate that need not be considered on technical grounds ranks high, the candidates confounded with it can be identified. Furthermore, meaningless candidates can be removed and the analysis repeated for confirmation.

【００５２】 (2) Confirming highly independent candidates, and applying them. The independence degree of every candidate with respect to the other candidates is checked. If a candidate is sufficiently independent of the others, it is clear that the yield difference attributable to this candidate exists independently of the other candidates. Furthermore, the same discrimination tree analysis is performed separately for each split group of this candidate and the results are compared. If both groups give similar results, the analysis result is highly reliable. Conversely, if the results differ, either some explanatory variable affects yield in combination with the candidate that was considered independent, or the analysis is being swayed by outlying data (for example, because the number of data items is small).

【００５３】 (3) Discrimination tree analysis for confounded candidates. When a candidate considered important is confounded with the first-split candidate, it is unlikely to appear in the splits below the first split. In that case, the data are divided by the split groups of another, highly independent candidate, discrimination tree analysis is performed on each group, and the results are compared. If the results are similar, the important candidate cannot be distinguished from the first candidate, but the analysis itself can be considered highly reliable. Conversely, if the candidate considered important does appear and the results differ, that result should also be taken into account, and further data analysis that can separate the important candidate from the first candidate is needed.

【００５４】 Next, a case is described in which regression tree analysis is performed with the apparatus history and the electrical characteristic values as explanatory variables and the wafer yield as the objective variable, and the two-part confounding degree and the two-part independence degree are obtained for the top 12 candidates for the first split of the regression tree analysis result.

【００５５】 FIGS. 9 and 10 show the regression tree diagram and the evaluation statistic list obtained in this embodiment. In FIG. 9, node n900 is split into nodes n901 to n914. FIG. 10 shows the evaluation statistics of the top 12 explanatory variables at the first two-way split. This yields the 12 set-split candidates 1001 to 1012.

【００５６】 FIG. 11 shows the two-part confounding degree 1111 and the two-part independence degree 1112 when ST1, listed as the top-ranked first candidate 1001 in FIG. 10, is taken as the reference explanatory variable 1101 and the other explanatory variables in the evaluation statistic list are taken as the comparison explanatory variables 1102.

【００５７】 FIG. 12 shows the two-part confounding degree 1211 and the two-part independence degree 1212 when ST3, listed as the third set-split candidate 1003 in FIG. 10, is taken as the reference explanatory variable 1201 and the other explanatory variables in the evaluation statistic list are taken as the comparison explanatory variables 1202.

【００５８】 In FIG. 11, the explanatory variables whose two-part confounding degree exceeds 0.75 are ST2, ST4, ST5, ST6, ST10, and WET2. These do not appear in the regression tree diagram of FIG. 9, but they may be factors with a large effect on yield. Conversely, FIG. 12 shows that ST3 has a high two-part independence degree.
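Picking out the variables that exceed the 0.75 threshold is a simple filter over one column of confounding degrees. The following is a minimal sketch, assuming the degrees against the reference variable are already available as a dictionary. The function name `confounded_with` and the numeric values are hypothetical placeholders and are not the values of FIG. 11.

```python
def confounded_with(reference, degrees, threshold=0.75):
    """List variables whose confounding degree with `reference` exceeds
    `threshold`, ordered from most to least confounded."""
    hits = [
        (name, degree)
        for name, degree in degrees.items()
        if name != reference and degree > threshold
    ]
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in hits]


# Hypothetical two-part confounding degrees against the reference variable ST1.
degrees_vs_st1 = {"ST2": 0.90, "ST3": 0.10, "ST4": 0.80, "ST7": 0.40, "WET2": 0.76}

flagged = confounded_with("ST1", degrees_vs_st1)
print(flagged)
```

Because the text notes that there is no fixed criterion for the confounding degree, the threshold is left as a parameter rather than hard-coded.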

【００５９】 FIG. 12 takes ST3, which FIG. 11 showed to have a high two-part independence degree, as the reference explanatory variable and shows the two-part confounding degree 1211 and the two-part independence degree 1212 with respect to the other 11 explanatory variables. It shows that ST3 is highly independent of every other explanatory variable.

【００６０】 FIGS. 13 and 14 show the two-part confounding degree and the two-part independence degree between each pair of the top 12 explanatory variables that the regression tree analysis of FIG. 10 found to have large significant differences, together with their average values, so that the relationships among the explanatory variables can be grasped at a glance. The bottom row of FIG. 13 shows the average two-part confounding degree 1301, and the bottom row of FIG. 14 shows the average two-part independence degree 1401.
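The per-variable averages in the bottom rows of FIGS. 13 and 14 can be sketched as column means of a pairwise matrix. The sketch below rests on assumptions that the text does not state: the matrix is symmetric, the diagonal (a variable against itself) is excluded from the average, and the 3-by-3 values are invented for illustration. The name `column_averages` is hypothetical.

```python
def column_averages(names, matrix, include_self=False):
    """Average each column of a pairwise-degree matrix.

    By default the diagonal entry is skipped (an assumption, since the
    text does not say whether a variable is averaged against itself)."""
    averages = {}
    for j, name in enumerate(names):
        values = [
            matrix[i][j]
            for i in range(len(names))
            if include_self or i != j
        ]
        averages[name] = sum(values) / len(values)
    return averages


# Hypothetical pairwise two-part confounding degrees (not the values of FIG. 13).
names = ["ST1", "ST2", "ST3"]
confounding = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]

avg = column_averages(names, confounding)
least_confounded = min(avg, key=avg.get)  # candidate to use for splitting the data
print(avg, least_confounded)
```

A variable with a low average confounding degree, such as ST3 in the embodiment, is a natural choice for dividing the data before repeating the analysis.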

【００６１】 Next, since the difference in the apparatus used in ST3 was found to affect yield independently of the other explanatory variables, the wafers are divided into those processed by the ST3 apparatus group that gives poor yield (defective wafer group: S3M2 and S3M3 used) and those processed by the ST3 apparatus group that gives good yield (good wafer group: S3M1 and S3M4 used), and regression tree analysis is performed on each group separately. The resulting regression tree diagrams are shown in FIGS. 15 and 16.
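The split-then-reanalyze procedure can be sketched as follows. This is a simplified stand-in and not the regression tree analysis of the embodiment: it searches only the first split, considers only "one machine versus the rest" partitions, and scores them by reduction in the sum of squared errors. The function names, the record layout, and the eight wafer records are all hypothetical.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_first_split(rows, variables, target="yield"):
    """Return (variable, machine, sse_reduction) for the best one-vs-rest split."""
    total = sse([r[target] for r in rows])
    best = None
    for var in variables:
        for machine in sorted({r[var] for r in rows}):
            left = [r[target] for r in rows if r[var] == machine]
            right = [r[target] for r in rows if r[var] != machine]
            if not left or not right:
                continue  # not a real split
            reduction = total - sse(left) - sse(right)
            if best is None or reduction > best[2]:
                best = (var, machine, reduction)
    return best


def analyse_by_group(rows, split_var, bad_machines, variables):
    """Divide wafers by the independent variable, then analyze each group alone."""
    bad = [r for r in rows if r[split_var] in bad_machines]
    good = [r for r in rows if r[split_var] not in bad_machines]
    others = [v for v in variables if v != split_var]
    return best_first_split(bad, others), best_first_split(good, others)


# Hypothetical wafer records: apparatus used in ST1, ST3, ST5 and the yield (%).
wafers = [
    {"ST3": "S3M2", "ST1": "S1M1", "ST5": "S5M1", "yield": 40},
    {"ST3": "S3M2", "ST1": "S1M2", "ST5": "S5M1", "yield": 60},
    {"ST3": "S3M3", "ST1": "S1M1", "ST5": "S5M2", "yield": 42},
    {"ST3": "S3M3", "ST1": "S1M2", "ST5": "S5M2", "yield": 62},
    {"ST3": "S3M1", "ST1": "S1M1", "ST5": "S5M1", "yield": 90},
    {"ST3": "S3M1", "ST1": "S1M2", "ST5": "S5M2", "yield": 80},
    {"ST3": "S3M4", "ST1": "S1M2", "ST5": "S5M1", "yield": 91},
    {"ST3": "S3M4", "ST1": "S1M1", "ST5": "S5M2", "yield": 81},
]

bad_split, good_split = analyse_by_group(
    wafers, "ST3", {"S3M2", "S3M3"}, ["ST1", "ST3", "ST5"]
)
print(bad_split, good_split)
```

In this invented data set the defective group splits first on ST1 while the good group splits first on ST5, which illustrates how a factor hidden in the combined data can surface once the data are divided by an independent variable.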

【００６２】 FIG. 15 is a regression tree diagram showing the result of regression tree analysis for the defective wafer group, and consists of nodes n1500 to n1506. FIG. 16 is a regression tree diagram showing the result of regression tree analysis for the good wafer group, and consists of nodes n1600 to n1606.

【００６３】 The first split for the defective wafer group in FIG. 15 is the same as that for all wafers in FIG. 9. Taken together with the fact that the defective wafer group at the top level of the regression tree diagram of FIG. 9 is small (n = 39), this suggests that the result is strongly influenced by wafers whose yield is extremely poor compared with the rest, which is one factor that makes the analysis difficult. For the good wafer group in FIG. 16, factors that had been obscured by the defective apparatus in process ST3 have newly come to light.

【００６４】 According to this embodiment, the two-part confounding degree and the two-part independence degree make the degree of confounding among explanatory variables clearer. Combined with regression tree analysis, they make it possible to identify the explanatory variables that need attention because they are confounded with the problematic explanatory variable that shows a large significant difference at the first split of the regression tree.

【００６５】 Furthermore, by applying the grouping given by a highly independent explanatory variable and performing regression tree analysis again, the accuracy (reliability) and efficiency of the regression tree analysis are improved, and a more detailed analysis becomes possible.

【００６６】 The embodiments of the present invention can be realized by a computer executing a program. Means for supplying the program to a computer, for example a recording medium such as a CD-ROM on which the program is recorded, or a transmission medium such as the Internet that transmits the program, can also be applied as embodiments of the present invention. The program, the recording medium, and the transmission medium described above fall within the scope of the present invention.

【００６７】 Each of the above embodiments is merely one concrete example of carrying out the present invention, and the technical scope of the present invention must not be interpreted restrictively on their basis. That is, the present invention can be implemented in various forms without departing from its technical idea or its main features.

【００６８】 The various embodiments of the present invention are summarized as follows.

(Supplementary Note 1) A data analysis method comprising: (a) a step of preparing data results of explanatory variables and an objective variable; (b) a step of calculating a degree of confounding and/or a degree of independence between a plurality of explanatory variables based on the data results; and (c) a step of performing data mining using the degree of confounding and/or the degree of independence.

(Supplementary Note 2) The data analysis method according to Supplementary Note 1, wherein the step (b) calculates the degree of confounding and/or the degree of independence in units of the sets divided into two by regression tree analysis.

(Supplementary Note 3) The data analysis method according to Supplementary Note 2, wherein the step (b) selects, by regression tree analysis, a plurality of explanatory variables that cause splits with large significant differences, and calculates the degree of confounding and/or the degree of independence between the plurality of explanatory variables.

(Supplementary Note 4) The data analysis method according to Supplementary Note 3, wherein, when calculating the degree of confounding and/or the degree of independence between a reference explanatory variable and the other explanatory variables, the step (b) calculates the degree of confounding and/or the degree of independence based on the ratio of matching to non-matching data between the explanatory variables within each set divided into two by regression tree analysis.

(Supplementary Note 5) The data analysis method according to Supplementary Note 4, wherein the step (c) performs data mining by selecting explanatory variables based on the degree of confounding and/or the degree of independence.

(Supplementary Note 6) A data analysis device comprising: calculation means for calculating a degree of confounding and/or a degree of independence between a plurality of explanatory variables based on data results of explanatory variables and an objective variable; and data mining means for performing data mining using the degree of confounding and/or the degree of independence.

(Supplementary Note 7) The data analysis device according to Supplementary Note 6, wherein the calculation means calculates the degree of confounding and/or the degree of independence in units of the sets divided into two by regression tree analysis.

(Supplementary Note 8) The data analysis device according to Supplementary Note 7, wherein the calculation means selects, by regression tree analysis, a plurality of explanatory variables that cause splits with large significant differences, and calculates the degree of confounding and/or the degree of independence between the plurality of explanatory variables.

(Supplementary Note 9) The data analysis device according to Supplementary Note 8, wherein, when calculating the degree of confounding and/or the degree of independence between a reference explanatory variable and the other explanatory variables, the calculation means calculates the degree of confounding and/or the degree of independence based on the ratio of matching to non-matching data between the explanatory variables within each set divided into two by regression tree analysis.

(Supplementary Note 10) The data analysis device according to Supplementary Note 9, wherein the data mining means performs data mining by selecting explanatory variables based on the degree of confounding and/or the degree of independence.

(Supplementary Note 11) A computer-readable recording medium on which is recorded a program for causing a computer to execute: (a) a procedure of preparing data results of explanatory variables and an objective variable; (b) a procedure of calculating a degree of confounding and/or a degree of independence between a plurality of explanatory variables based on the data results; and (c) a procedure of performing data mining using the degree of confounding and/or the degree of independence.

【００６９】[0069]

[Effects of the Invention] As described above, according to the present invention, calculating the degree of confounding and/or the degree of independence between a plurality of explanatory variables makes the degree of confounding among the explanatory variables clear. When regression tree analysis is performed on this basis, the confounding degree of the explanatory variables can be evaluated quantitatively from the two-way split of the set in the regression tree analysis. This makes it possible to identify the explanatory variables that need attention because they are confounded with the problematic explanatory variable that shows a large significant difference at the first split of the regression tree.

[Brief description of the drawings]

FIG. 1 is a diagram showing the relationship between the flow of lots and an abnormal manufacturing apparatus.

FIG. 2 is a diagram showing the yield distribution for each apparatus in a certain process according to the related art.

FIG. 3 is a diagram showing the relationship between the flow of lots and the confounding of abnormal manufacturing apparatuses.

FIG. 4 is a diagram showing an example of input data for regression tree analysis.

FIG. 5 is a diagram showing an example of a regression tree.

FIG. 6 is a diagram showing an example of an evaluation statistic list.

FIG. 7 is a diagram showing the relationship among the manufacturing apparatus used, the electrical characteristic data, and the yield value.

FIG. 8 is a diagram showing an example of calculating the two-part confounding degree and the two-part independence degree.

FIG. 9 is a diagram showing an example of a regression tree.

FIG. 10 is a diagram showing an example of an evaluation statistic list.

FIG. 11 is a diagram showing the confounding degree and the independence degree between each explanatory variable and the explanatory variable of the first candidate.

FIG. 12 is a diagram showing the confounding degree and the independence degree between each explanatory variable and the explanatory variable of the third candidate.

FIG. 13 is a diagram showing the confounding degree between all candidates and its average.

FIG. 14 is a diagram showing the independence degree between all candidates and its average.

FIG. 15 is a regression tree diagram showing the result of regression tree analysis for the defective wafer group.

FIG. 16 is a regression tree diagram showing the result of regression tree analysis for the good wafer group.

FIG. 17 is a diagram showing the configuration of a data analysis device.

[Explanation of symbols]

101 Normal apparatus
102 Abnormal apparatus
401 Explanatory variable
402 Objective variable
411 Apparatus used
412 Electrical characteristic data
413 Yield
801 Reference explanatory variable
802 Comparison explanatory variable
803 Wafer number
804 Yield
811 High-yield group of the reference explanatory variable
812 Low-yield group of the reference explanatory variable
813 Formula for the two-part confounding degree
814 Two-part confounding degree
815 Two-part independence degree
1701 Original data group
1702 Database
1703 Data mining unit
1704 Rule file
1705 Analysis tool group
1706 Statistical analysis component
1707 Chart creation component
1708 Decision making unit

Continuation of front page: (72) Inventor: Hidetaka Tsuda, c/o Fujitsu LSI Technology Co., Ltd., 3-2-1 Sakado, Takatsu-ku, Kawasaki-shi, Kanagawa. F-terms (reference): 5B056 BB00 HH00 5B075 ND03 NR12 NR16

Claims

[Claims]

1. A data analysis method comprising: (a) a step of preparing data results of explanatory variables and an objective variable; (b) a step of calculating a degree of confounding and/or a degree of independence between a plurality of explanatory variables based on the data results; and (c) a step of performing data mining using the degree of confounding and/or the degree of independence.

2. The data analysis method according to claim 1, wherein in the step (b), the confounding degree and / or the independence degree are calculated for each set divided into two by regression tree analysis.

3. The data analysis method according to claim 2, wherein the step (b) selects, by regression tree analysis, a plurality of explanatory variables that cause splits with large significant differences, and calculates the degree of confounding and/or the degree of independence between the plurality of explanatory variables.

4. The data analysis method according to claim 3, wherein, when calculating the degree of confounding and/or the degree of independence between a reference explanatory variable and the other explanatory variables, the step (b) calculates the degree of confounding and/or the degree of independence based on the ratio of matching to non-matching data between the explanatory variables within each set divided into two by regression tree analysis.

5. The data analysis method according to claim 4, wherein the step (c) performs data mining by selecting an explanatory variable based on the degree of confounding and / or degree of independence.

6. A data analysis device comprising: calculation means for calculating a degree of confounding and/or a degree of independence between a plurality of explanatory variables based on data results of explanatory variables and an objective variable; and data mining means for performing data mining using the degree of confounding and/or the degree of independence.

7. The data analysis device according to claim 6, wherein the calculation means calculates the confounding degree and / or the degree of independence for each set divided into two by regression tree analysis.

8. The data analysis device according to claim 7, wherein the calculation means selects, by regression tree analysis, a plurality of explanatory variables that cause splits with large significant differences, and calculates the degree of confounding and/or the degree of independence between the plurality of explanatory variables.

9. The data analysis device according to claim 8, wherein, when calculating the degree of confounding and/or the degree of independence between a reference explanatory variable and the other explanatory variables, the calculation means calculates the degree of confounding and/or the degree of independence based on the ratio of matching to non-matching data between the explanatory variables within each set divided into two by regression tree analysis.

10. The data analysis device according to claim 9, wherein the data mining means performs data mining by selecting explanatory variables based on the degree of confounding and/or the degree of independence.