JP4866520B2

JP4866520B2 - Data analysis method, data analysis program, and data analysis apparatus

Info

Publication number: JP4866520B2
Application number: JP2001338185A
Authority: JP
Inventors: 英隆津田; 英大白井
Original assignee: Fujitsu Semiconductor Ltd
Current assignee: Fujitsu Semiconductor Ltd
Priority date: 2001-11-02
Filing date: 2001-11-02
Publication date: 2012-02-01
Anticipated expiration: 2021-11-02
Also published as: JP2003142361A

Description

【０００１】
【発明の属する技術分野】
本発明は、広く産業界で取り扱われるデータ間の関連を把握し、産業上優位な結果をもたらすための有意性のある結果を抽出するデータ解析方法に関し、さらに、解析対象とするデータ値やその平均値だけに注目していては判別が困難である知識や情報を抽出するデータ解析方法に関する。また、解析結果の精度等を評価するデータ解析方法およびデータ解析装置に関する。
【０００２】
たとえば、半導体製造工程において取得される使用装置履歴、試験結果、設計情報または各種測定データ等をもって歩留りの変動状況を把握し、よって歩留り向上に有利な条件を抽出するためになされるデータ解析方法に関する。特に、計算機システムに蓄積されているオリジナルデータやその平均値だけでなく、それらオリジナルデータ等を編集することによって得られるデータ分布特徴を自動的かつ定量的に抽出して認識し、その特徴量に基づいて半導体等の低歩留り要因を抽出し、評価するデータ解析方法およびデータ解析装置に関する。
【０００３】
また、複数の説明変数が互いに交絡（独立でなくなる）してしまい、有意差の抽出が困難になる場合に対処し、より効率的かつ信頼性のある解析結果を得るためにデータ解析結果の精度等を評価するデータ解析方法およびデータ解析装置に関する。
【０００４】
【従来の技術】
半導体データの歩留り解析を例にとって進める。特に、プロセスデータ解析のように、その解析結果から品質、生産性向上の対策決定のための参考データを得ようとする場合には、その精度、信頼度等が重要であり、これについては本願発明者等により既に出願されている（出願番号：特願２００１−１２７５３４号）。歩留り低下要因をできるだけ速やかに見つけて対策を実施するために、装置履歴、試験結果、設計情報、各種測定データ等から歩留りに効いている要因やその要因に効いている別の要因を見つけるためのデータ解析がおこなわれる。
【０００５】
データ解析において、歩留り値のように解析対象となるものを目的変数、目的変数の要因となる装置履歴、試験結果、設計情報、各種測定データ等は説明変数といわれる。その際に各種統計学的手法が適用されるが、そのうちの一つとしてデータマイニングを適用することで、多種大量のデータから判別しにくい価値ある情報や規則性を抽出することができる。
【０００６】
半導体デバイスの不良要因を解析するためには、収集されたデータをより多面的に科学的根拠に基づいて解析し、より多くの有意差を抽出するのが重要である。そのため、従来は計算機システムに蓄積されたオリジナルデータの値やその平均値がよく活用されている。しかし、複雑に絡み合ったオリジナルデータ群から不良要因等を抽出するのが困難な場合もある。そのような場合、ウェーハ面内チップやロット内ウェーハの各種測定結果や歩留り等に関して特徴的なデータ分布が存在すれば、それに基づいて不良データの解析が進められる場合がある。
【０００７】
【発明が解決しようとする課題】
しかしながら、従来の計算機システムでは、たとえば歩留り値や電気的特性値等のオリジナルデータは蓄積されているが、ウェーハ面内の複数のチップやロット内の複数のウェーハにわたる特徴的なデータ分布はほとんど蓄積されていない。したがって、技術者はオリジナルデータの編集をおこない、各種統計解析ツールや図表作成ツール等を用いてデータ分布状況を取得する必要がある。そして、取得したデータ分布状況を、技術者が有する経験やノウハウ等に照らし合わせて、データの集計や傾向を認識する必要がある。そのため、大量のオリジナルデータの分布に関する特徴量を客観的に把握することは困難である。また、このように技術者の主観が入ったデータ分布の特徴量に基づいて解析を進めても正確な結果が得られないという問題点がある。
【０００８】
また、従来は、技術者が各種統計解析ツールや図表作成ツール等により得たデータ分布状況を見て、たとえばその分布に、ある特徴が「ある」か「ない」か、ある特徴の増減傾向が「増大である」か「減少である」か、ある特徴に２の周期性が「ある」か「ない」か、またはある特徴に３の周期性が「ある」か「ない」か、などというように、データ分布の特徴量を離散値で表している。そのため、ある特徴がどの程度ある（または、ない）のか、あるいはある特徴がどの程度増大傾向（または、減少傾向）にあるのか、というような程度を表す情報が欠落してしまう。また、たとえばある特徴が、ある程度の２の周期性とある程度の３の周期性を同時に有するような場合、より程度が強い方の周期性しか認識されないという問題点がある。
【０００９】
また、各種試験結果や測定結果、およびそれらの組み合わせまで考慮すると、想定されるデータ分布特徴の組み合わせは膨大になり、それらすべてについて調査するのは極めて困難である。しかも、抽出したデータ分布特徴に対応する不良要因は必ずしも既知のものではないし、また既知でない不良要因を判別するには多くの経験やノウハウが必要であるという問題点もある。
【００１０】
また、実際に、たとえば半導体データの歩留り解析にデータマイニングを適用してみても、うまく行かない場合がある。金融や流通などの分野での適用では、何百万件もの膨大なデータ件数があり、説明変数の数はせいぜい数十であるため、精度の高い分析結果が得られた。ところが半導体プロセスデータ解析の場合はデータ件数が少なく、同じ品種では多くても２００ロット程度であるにもかかわらず、説明変数の数は数百にも達し（装置履歴、工程内検査値等）、複数の説明変数が独立ではなくなってしまい、単純にデータマイニングをおこなっただけでは信頼できる結果が得られないことがある。以下に、これについて半導体データの歩留り解析を例にとって簡単に説明する。
【００１１】
データ数（例：ロット数）に比較して説明変数（例：ＬＳＩ製造工程データ）が多いプロセスデータ解析において、複数の説明変数が互いに交絡（独立でなくなる）してしまい、統計的有意差による問題点が十分絞り込めないことが多くある。データマイニング（回帰木分析など）を適用した場合においても、この問題がある場合には、かなり手間をかけて分析結果の精度、信頼できる範囲の確認が必要となる。
【００１２】
図４０は、ロットの流れと異常製造装置の関係を示す。白丸“○”は正常装置１０１を示し、黒丸“●”は異常装置１０２を示す。矢印はロットの流れを示す。ＬＳＩ製造データにおける装置間差の解析は、各ロットの工程ごとの使用装置データから、どの製造工程でどの製造装置を使用すると歩留りが最も影響を受けるかを抽出する。
【００１３】
図４１は、従来技術によるある工程での装置別歩留り分布（箱ヒゲ図）を示す。各製造工程ごとに使用した装置ごとにそのロットの歩留り値を箱ヒゲ図で表示し、各工程について確認していき、最も差が顕著な工程とその装置を同定する。
【００１４】
しかし、この手法では工程数が数百となった現在では大きな工数を要し、また差異が明確に出ない場合や条件が複雑に絡み合った場合などはなかなか判断が付きにくい。これらに対処するために回帰木分析によるデータマイニング手法が有効であり、目的変数の値が高くなる使用装置群と低くなる使用装置群に分割する。図４２のようにロットごとに使用される装置を固定してロットを流した場合、黒丸“●”で示す異常装置１０２が一意に同定できないことがある。すなわち、説明変数間の独立性が低い場合は、集合の２分割による有意差が大となるものが必ずしも“真に有意差が大”であるとは限らない。
【００１５】
以上が、半導体製造の各工程における使用装置における交絡であるが、回帰木分析結果として２分割された集合の交絡についても同様である。すなわち、各工程ごとに高歩留りが生じている装置群と低歩留りが生じている装置群からなる集合についても同じことがいえる。この２分割された集合の交絡については、説明変数が連続値である場合も同様である。
【００１６】
本発明は、上記問題点に鑑みてなされたものであって、オリジナルデータを編集して各種統計値等のデータ分布特徴量を抽出し、それを客観的に認識して活用することにより、不良要因等の抽出を自動的におこなうデータ解析方法を提供することを目的とする。また、複数の説明変数間の交絡の度合いを明確にすることができるデータ解析方法およびデータ解析装置を提供することを本発明の目的に含めることができる。
【００１７】
【課題を解決するための手段】
上記目的を達成するため、本発明は、計算機システムに蓄積されているオリジナルデータ群内に存在する種々のデータ分布特徴を自動的かつ定量的に評価して抽出し、その抽出された各特徴量を順次選択して解析をおこなうことにより、各特徴量が生じた要因を自動的かつ定量的に評価して抽出することを特徴とする。この発明によれば、オリジナルデータからデータ分布の傾向や特徴的パターンやデータ間の関連性などの多くの情報が抽出されるので、従来は多種多様なデータに埋もれて判別が困難であった関連性や有意差が効率的に科学的根拠に基づいて定量的に抽出される。
【００１８】
また、複数の説明変数間の交絡の度合いを明確にするため、説明変数および目的変数のデータ結果を準備するステップと、そのデータ結果を基に複数の説明変数間の交絡度および／または独立度を演算するステップと、交絡度および／または独立度を用いてデータマイニングをおこなうステップとを有するデータ解析方法が提供される。複数の説明変数間の交絡度および／または独立度を演算することにより、説明変数の交絡の度合いを明確に把握できる。これを基に回帰木分析をおこなえば、回帰木分析の集合の２分割結果に基づき、説明変数の交絡度を定量的に評価できるようになり、回帰木における最初の分岐の有意差が大きい問題となる説明変数に交絡している注意すべき説明変数を明確化することが可能となる。
【００１９】
【発明の実施の形態】
以下に、本発明の実施の形態１、２について図面を参照しつつ詳細に説明する。
【００２０】
（実施の形態１）
図１は、本発明の実施の形態１にかかるデータ解析方法の実施に供せられる計算機システムのハードウェア構成の一例を示す図である。この計算機システムは、図１に示すように、入力装置１、中央処理装置２、出力装置３および記憶装置４から構成される。
【００２１】
図２は、図１に示す構成の計算機システムにより実現されるデータ解析装置の機能構成の一例を示すブロック図である。このデータ解析装置は、図２に示すように、複数のオリジナルデータを含むデータベース４１からなるオリジナルデータ群４２を有する。このデータベース４１は、図１に示す計算機システムの記憶装置４において構築されている。
【００２２】
また、データ解析装置は、オリジナルデータ群４２の中に存在する１以上のデータ分布特徴を定量的に評価して抽出する手段２１、抽出した１以上のデータ分布特徴量の中から解析対象とする特徴量を選択する手段２２、解析対象に選択したデータ分布特徴量を目的変数として回帰木分析手法などによるデータマイニングをおこない、データ分布に潜む特徴や規則性などのルールファイル２４を抽出する手段２３、抽出したルールファイル２４を用いてオリジナルデータの分布特徴を解析する統計解析コンポーネント２５や図表作成コンポーネント２６などの解析ツール群２７を備える。
【００２３】
以上の各手段２１，２２，２３および解析ツール群２７は、それぞれの処理をおこなうためのプログラムを中央処理装置２で実行することにより実現される。抽出されたルールファイル２４は記憶装置４に記憶されるとともに、表示装置や印刷装置などの出力装置３により出力される。意思決定５は解析ツール群２７による解析結果に基づいてなされる。
【００２４】
また、上述したルールファイル２４を抽出する手段２３は、オリジナルデータ群４２の中のオリジナルデータ、データ分布特徴を抽出する手段２１により抽出されたデータ分布特徴、または解析ツール群２７による解析結果に対してもデータマイニングをおこなうようになっている。また、解析ツール群２７は、オリジナルデータ群４２の中のオリジナルデータ、データ分布特徴を抽出する手段２１により抽出されたデータ分布特徴、または解析ツール群２７の出力結果に対しても解析をおこなうようになっている。また、解析ツール群２７による解析結果は、解析対象となるデータ分布特徴量を選択する手段２２やオリジナルデータ群４２にフィードバックされる。また、オリジナルデータ群４２には、データ分布特徴を抽出する手段２１の出力がフィードバックされる。
【００２５】
図３は、たとえば図２のデータ分布特徴を抽出する手段２１により抽出されたデータ分布特徴量をＣＳＶ形式で出力した例を示す。各特徴量はレコードごとに独立して求められるので、独立して扱われる。たとえば図３に示すように、各特徴量はＣＳＶ形式で自動的に出力されるので、各特徴量ごとに効率的に有意な解析をおこなうことが可能となる。ここで、特徴量となるデータは、オリジナルデータ値やその平均値だけでなく、オリジナルデータの最大値、最小値、レンジまたは標準偏差値等でもよい。また、データの周期性や特定モデルとの類似度などを特徴量のデータとしてもよい。
【００２６】
ここで、対象とするデータ群の構造によって様々な特徴を抽出することができるが、目的に合わせてどのような特徴量を抽出するかという処理をあらかじめプログラムに組み込んでおくか、あるいは抽出する特徴量を定義したファイルを用意して、そのファイルを読むようにしてもよい。いずれの特徴量も、離散値ではなく、その特徴がどの程度の強さであるかという連続値で定義される。したがって、従来のように離散値化による情報の欠落が生じないので、より良好な解析結果が期待される。
【００２７】
データ分布特徴量の一例として、半導体データの歩留り解析におけるロット内データ分布の特徴について説明する。図４は、ウェーハの属性値の変動に着目した情報を示す一覧表である。ここでは、独立変数はウェーハ番号であり、従属変数は歩留り、カテゴリ歩留りまたは各種測定値等のオリジナルデータである。
【００２８】
特に限定しないが、図４に示す例では、（１）データ分布全体の中心、（２）データのばらつき、（３）ウェーハ番号に対するデータの相関、（４）一次近似した時のｙ軸切片、（５）ウェーハ番号に対するデータの傾き、（６）周期２(枚)の強さ、（７）周期３(枚)の強さ、（８）ロット内で最も強い周期、（９）前半ウェーハ−後半ウェーハの平均値の差、（１０）前半ウェーハ−後半ウェーハのばらつきの差、（１１）前半ウェーハ−後半ウェーハの相関の差、（１２）前半ウェーハ−後半ウェーハの一次近似ｙ軸切片の差、（１３）前半ウェーハ−後半ウェーハの傾きの差、（１４）後半ロットの周期２（枚）の強さ、（１５）後半ロットの周期３（枚）の強さ、および（１６）後半ロットの最も強い周期、の１６個の特徴項目が定義されている。各特徴項目の特徴量はロット単位で求められる。
【００２９】
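The 16 feature items defined here are briefly described. Feature (1) is the mean of the yield, measured values, or the like over all wafers in the same lot. Feature (2) is the standard deviation of those values over all wafers in the same lot. Feature (3) is the correlation coefficient between the wafer number and the yield, measured values, or the like of the wafers in the same lot; how this coefficient is calculated is decided in advance according to the target and purpose of the analysis. Feature (4) is the y-intercept obtained when, with the wafer number as x and the yield or measured value as y, the relation between x and y is approximated by the linear expression y = b·x + a.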
【００３０】
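Feature (5) is the population regression coefficient obtained when, with the wafer number as x and the yield or measured value as y, the relation between x and y for the wafers in the same lot is approximated by the linear expression y = b·x + a. Feature (6) is the ratio between the variance of the yield, measured values, or the like over all wafers in the same lot and the variance of those values over the wafer group with wafer numbers 1, 3, 5, ... or the wafer group with wafer numbers 2, 4, 6, .... Feature (7) is the ratio between the variance over all wafers in the same lot and the variance over the wafer group with wafer numbers 1, 4, 7, ..., the group with wafer numbers 2, 5, 8, ..., or the group with wafer numbers 3, 6, 9, .... Feature (8) is, for the wafers in the same lot, the period whose variance ratio is largest among the variance ratios for periods 2 and 3 obtained as in (6) and (7) and the variance ratios for periods 4, 5, ... obtained in the same way.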
【００３１】
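Feature (9) is obtained by dividing all wafers in the same lot (for example, 50) into a first half (for example, 25) and a second half (for example, 25); it is the difference between the mean of the yield, measured values, or the like of the first-half wafer group and that of the second-half wafer group. The lot is divided into halves in this way because the equipment history differs between them in the semiconductor manufacturing process. Feature (10) is the difference between the standard deviation of the first-half wafer group and that of the second-half wafer group. Feature (11) is the difference between the correlation coefficient of the first-half wafer group and that of the second-half wafer group. Feature (12) is the difference between the y-intercept of the linear approximation for the first-half wafer group and that for the second-half wafer group.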
【００３２】
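Feature (13) is the difference between the population regression coefficient of the first-half wafer group and that of the second-half wafer group. Feature (14) is the variance ratio for period 2, computed as in (6), for the second half of the lot. Feature (15) is the variance ratio for period 3, computed as in (7), for the second half of the lot. Feature (16) is the period whose variance ratio is largest, determined as in (8), for the second half of the lot. The period-2 strength, the period-3 strength, and the strongest period may likewise be defined for the first half of the lot, as in (14) to (16). The feature items are not limited to those illustrated here; various feature items are defined according to the target and purpose of the analysis.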
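As a rough illustration of feature items (1) to (7), (9), and (10), the following Python sketch computes lot-level features from a list of per-wafer values ordered by wafer number. The function names and dictionary keys are hypothetical. The period strength is taken here as the total variance divided by the mean variance of the phase groups; that is an assumption, since the text only states that the feature is a ratio of the two variances.

```python
from statistics import mean, pstdev, pvariance


def linear_fit(ys):
    """Least-squares fit y = b*x + a with x = wafer number 1..n; returns (a, b, r)."""
    xs = list(range(1, len(ys) + 1))
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    r = sxy / (sxx * syy) ** 0.5 if syy > 0 else 0.0
    return a, b, r


def period_strength(ys, k):
    """Assumed definition: total variance / mean variance of the k phase groups."""
    groups = [ys[i::k] for i in range(k)]
    within = mean([pvariance(g) for g in groups])
    return pvariance(ys) / within if within > 0 else float("inf")


def lot_features(ys):
    """Continuous-valued features for one lot (ys ordered by wafer number)."""
    half = len(ys) // 2
    first, second = ys[:half], ys[half:]
    a, b, r = linear_fit(ys)
    return {
        "ave": mean(ys),                        # (1) centre of the distribution
        "s": pstdev(ys),                        # (2) spread
        "r": r, "a": a, "b": b,                 # (3), (4), (5)
        "p2": period_strength(ys, 2),           # (6) strength of period 2
        "p3": period_strength(ys, 3),           # (7) strength of period 3
        "ave_d": mean(first) - mean(second),    # (9) first half minus second half
        "s_d": pstdev(first) - pstdev(second),  # (10)
    }
```

Because every entry is a continuous value, a lot that alternates between two levels receives a large `p2` and a small `p3` instead of a yes/no flag, which is the property the text relies on.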
【００３３】
上述したような特徴項目を定義して解析することによって、たとえば従来のように歩留り値等のオリジナルデータ値やその平均値を用いて解析しただけでは使用装置による有意差を抽出できない場合でも、使用装置による有意差を抽出することが可能となる場合がある。たとえば、複数のロットについてロット内ウェーハの歩留り値のばらつき（上記（２）に対応）に注目した例を図５に示す。
【００３４】
図５に示す例では、工程１で２１号機、２２号機、２４号機または２５号機を使用したロット群６（図５の一点鎖線の左側）と２８号機を使用したロット群７（図５の一点鎖線の右側）とでは、ウェーハ歩留りの平均値や全体の分布（ロット間ばらつき）はほとんど同じである。したがって、ウェーハ歩留りの値やその平均値を用いて解析しても明らかな有意差は認められない。それに対して、各ロット内でのウェーハ歩留り値のばらつきというデータ分布の特徴量に注目すると、２つのロット群６，７の間には明らかな有意差が認められる。注目する項目は上記（２）の項目に限らず、上記（１）の項目、上記（３）〜（１６）のいずれかの項目、またはその他の項目であってもよい。
【００３５】
上述したように各データ分布特徴量は各ロットの属性値として存在するので、前記データ分布特徴量を選択する手段２２は、各データ分布特徴量を順次目的変数として選択する。そして、データマイニングをおこなってルールファイル２４を抽出する手段２３は、各データ分布特徴量を順次目的変数として回帰木分析をおこなう。それによって、そのデータ分布特徴量が生じた要因の判別が可能となり、従来の解析方法よりも多くの不良要因を抽出することができる。その際、データ分布特徴量を順次選択する処理および回帰木分析処理はプログラムにしたがって自動的に実行されるので、技術者はどのデータ分布特徴量を目的変数に選択するかを考えずに済み、解析を効率的におこなうことができる。特に何を解析すべきかが不明な場合には有効である。
【００３６】
また、同一レコードに、周期２（枚）の強さと周期３（枚）の強さの両方が存在する場合のように、複数の特徴パターンが見られる場合でも、両方の特徴を評価することが可能となるので、情報量の欠落をなくしてより実状を反映した解析結果が得られる。
【００３７】
つぎに、本発明の実施の形態１にかかるデータ解析方法の流れについて説明する。図６は、本発明にかかるデータ解析方法の一例の概略を示すフローチャートである。図６に示すように、このデータ解析方法が開始されると、まずオリジナルデータ群４２の中から解析の対象とするデータ、たとえば歩留り値や各種測定値等が選択されて抽出される（ステップＳ１）。つづいて、抽出されたデータに対して１以上のデータ分布特徴を抽出する処理がおこなわれる（ステップＳ２）。
【００３８】
そして、解析対象とするデータ分布特徴量が選択され、それを目的変数として回帰木分析等のデータマイニングがおこなわれる（ステップＳ３）。ステップＳ２で抽出されたすべてのデータ分布特徴について回帰木分析が終了したら（ステップＳ４）、分析結果が出力され、技術者はその確認をおこなう（ステップＳ５）。そして、技術者は、分析結果に基づいて意思決定をおこなう（ステップＳ６）。
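The loop over steps S3 and S4 can be sketched as follows. This is only an outline under stated assumptions: `analyze_each_feature` and `largest_gap` are hypothetical names, and `largest_gap` is a deliberately simple stand-in for the regression tree step (it reports the gap between the best and worst per-tool mean), not the analysis described in the text.

```python
def analyze_each_feature(rows, feature_names, analyze):
    """Steps S3-S4: take every extracted feature in turn as the objective variable."""
    results = {}
    for name in feature_names:
        results[name] = analyze(rows, name)
    return results


def largest_gap(rows, target, variable="tool"):
    """Stand-in analysis: gap between the highest and lowest per-tool mean of target."""
    by_tool = {}
    for row in rows:
        by_tool.setdefault(row[variable], []).append(row[target])
    means = {tool: sum(v) / len(v) for tool, v in by_tool.items()}
    return max(means.values()) - min(means.values())
```

Passing the analysis step in as a function keeps the driver unchanged when the stand-in is replaced by a real regression tree routine.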
【００３９】
つぎに、本発明の特徴をより明らかにするため、データ分布特徴量を用いたデータ解析方法について具体例を挙げて説明する。一般に、同一ロット内のウェーハ群でもウェーハ番号が異なるとウェーハ単位の歩留り値や電気的特性値は異なり、それらの値はいろいろな変動パターンを示す。歩留り値や電気的特性値はウェーハ単位で保存されている。そのため、本実施の形態１では、このように複数のロットにわたってウェーハ番号に対する歩留り値等の変動パターンをデータ分布特徴として解析をおこなうことができる。ここでは、製品の性能に大きな影響を及ぼす重要な電気的特性であるテスト用代用Ｎｃｈトランジスタスレッシュホールド電圧ＶＴ＿Ｎ２（以下、単にＶＴ＿Ｎ２とする）について、多面的な解析をおこなう例を示す。なお、歩留りには各製造工程での使用装置履歴が効果があるとする。
【００４０】
図７は歩留りとＶＴ＿Ｎ２との関係を示す特性図であるが、同図より歩留りとＶＴ＿Ｎ２とは一見して無関係のように見える。また、図８はすべてのウェーハから得られたＶＴ＿Ｎ２データのヒストグラムであり、図９は全ＶＴ＿Ｎ２データをウェーハ番号ごとに表示した箱ヒゲ図である。これらの図に示す結果から統計的有意差を抽出するのは困難である。
【００４１】
また、図１０は、目的変数を各ロットにおけるＶＴ＿Ｎ２の平均値とし、説明変数を各工程で使用した装置名として回帰木分析をおこなった結果の例を示す図であり、図１１はこの回帰木分析の信頼度情報を表す評価用統計値リストの例を示す図である。この回帰木分析結果によれば、図１０に示すように、ＶＴ＿Ｎ２の変動に対して最も有意とされるのは第２配線＿装置として、１１号機または１３号機を使用したか、あるいは１２号機、１４号機、１７号機または１８号機を使用したかということである。全ＶＴ＿Ｎ２データを第２配線＿装置の使用装置名ごとに表示した箱ヒゲ図を図１２に示すが、同図においては顕著な有意差が見られない。なお、評価用統計値リストは回帰木図とともに出力されるが、これについては後述する。
【００４２】
また、図１３は、目的変数を各ウェーハのＶＴ＿Ｎ２の値とし、説明変数を各工程で使用した装置名として回帰木分析をおこなった結果の例を示す図であり、図１４はこの回帰木分析に対する評価用統計値リストの例を示す図である。この回帰木分析結果によれば、図１３に示すように、ＶＴ＿Ｎ２の変動に対して最も有意とされるのは２ＣＯＮ工程＿装置として、１１号機を使用したか、あるいは１２号機または１３号機を使用したかということである。図１５は、全ＶＴ＿Ｎ２データを２ＣＯＮ工程＿装置の使用装置名ごとに表示した箱ヒゲ図であるが、この図においても顕著な有意差は見られない。
【００４３】
それに対して、以下のようにＶＴ＿Ｎ２についてデータ分布特徴を抽出して解析をおこなうことにより不良要因の解明が可能となる。図１６は、ロットごとにＶＴ＿Ｎ２の各特徴量を定義したファイルの一例をＣＳＶ形式で示す図表である。このファイルは、図２に示す装置のデータ分布特徴を抽出する手段２１により出力される。
【００４４】
図１７は、図１６に示すＣＳＶ形式データに基づいてＶＴ＿Ｎ２の種々のロット内分布の特徴量を示すヒストグラムである。ここでは、ＶＴ＿Ｎ２について、図４に関連して説明した（１）〜（１６）の１６個の特徴項目のうち、（１）平均値（ＶＴ＿Ｎ２＿ａｖｅ）、（２）標準偏差値（ＶＴ＿Ｎ２＿ｓ）、（３）ウェーハ番号に対する相関係数（ＶＴ＿Ｎ２＿ｒ）、（４）一次近似式のｙ軸切片（ＶＴ＿Ｎ２＿ａ）、（５）母回帰係数（ＶＴ＿Ｎ２＿ｂ）、（６）ウェーハ番号の間隔２の周期性（ＶＴ＿Ｎ２＿２）、（７）ウェーハ番号の間隔３の周期性（ＶＴ＿Ｎ２＿３）、（９）前半ウェーハと後半ウェーハの平均値の差（ＶＴ＿Ｎ２＿ａｖｅ＿ｄ）、（１０）前半ウェーハと後半ウェーハの標準偏差値の差（ＶＴ＿Ｎ２＿ｓ＿ｄ）、（１１）前半ウェーハと後半ウェーハの相関係数の差（ＶＴ＿Ｎ２＿ｒ＿ｄ）、（１２）前半ウェーハと後半ウェーハの一次近似式のｙ軸切片の差（ＶＴ＿Ｎ２＿ａ＿ｄ）、（１３）前半ウェーハと後半ウェーハの母回帰係数の差（ＶＴ＿Ｎ２＿ｂ＿ｄ）、の１２個が抽出されている。
【００４５】
図１７より、いずれの特徴量もかなりばらついていることがわかる。したがって、各特徴量を目的変数として回帰木分析をおこなえば、それぞれの特徴量に有意差が生じた要因、すなわち不良要因等を解析することができる。
【００４６】
データ分布特徴を解析対象として効率的に解析結果を得るために、回帰木分析の入力データとして、図１８に示すように、ロットごとに、各工程での使用装置名と、抽出された特徴量とが定義されたファイルが作成される。このファイルは、歩留りの変動要因を回帰木分析で解析する際の入力データとして各工程での使用装置名とロット歩留りを定義したルールファイル２４（図１９参照）と、図１６に示すファイルとを同一ロット番号について結合したものである。
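Joining the two files on the lot number could look like the sketch below. The column names `lot`, `Field_Ox`, and `yield` and all data values are made up for illustration; only `VT_N2_ave` and `VT_N2_s` follow feature names that appear in the text. The actual layouts of the files in FIG. 16, FIG. 18, and FIG. 19 are not reproduced here.

```python
import csv
import io


def join_by_lot(history_csv, feature_csv, key="lot"):
    """Inner-join two CSV texts on the lot number column."""
    features = {row[key]: row for row in csv.DictReader(io.StringIO(feature_csv))}
    joined = []
    for row in csv.DictReader(io.StringIO(history_csv)):
        if row[key] in features:
            # Equipment history first, then the extracted feature values.
            joined.append({**row, **features[row[key]]})
    return joined


history = "lot,Field_Ox,yield\nA01,PM1,81.2\nA02,PM2,79.5\n"
feats = "lot,VT_N2_ave,VT_N2_s\nA01,0.80,0.08\nA02,0.81,0.23\nA03,0.79,0.10\n"
rows = join_by_lot(history, feats)
```

Lots that appear in only one file (here `A03`) are dropped, so every joined record carries both the explanatory variables and the candidate objective variables.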
【００４７】
図２０は、図１８に示すファイルに基づいて、上述した（２）標準偏差値（ＶＴ＿Ｎ２＿ｓ）を目的変数とし、各工程での使用装置名を説明変数として、ＶＴ＿Ｎ２のロット内で生じているばらつきの要因を抽出するためにおこなった回帰木分析結果を示す回帰木図である。また、図２１は、この回帰木分析に対する評価用統計値リストの例を示す図である。
【００４８】
図２０に示す回帰木図によれば、ＶＴ＿Ｎ２の標準偏差値（ＶＴ＿Ｎ２＿ｓ）の変動に対して最も有意とされるのはＦｉｅｌｄ＿Ｏｘ工程＿装置として、ＰＭ１号機またはＰＭ３号機を使用したか、あるいはＰＭ２号機を使用したかということである。これは、評価用統計値リストのＳ比およびｔ値等について、１番目に出てくるＦｉｅｌｄ＿Ｏｘ工程＿装置のそれぞれの値（Ｓ比＝０．３７６７、ｔ＝３．０８１）と、２番目以降に出てくる第２配線＿装置やＤＲＹ工程＿装置のそれぞれの値（Ｓ比＞０．４３、ｔ＜２．２）とを比較すると、明らかに有意差が見られることから、信頼度が高いと判断される。
【００４９】
これを確認するため、図２２に、Ｆｉｅｌｄ＿Ｏｘ工程で使用した装置ごとにＶＴ＿Ｎ２の値の分布を箱ヒゲ図で示す。図２２では、ＰＭ１号機またはＰＭ３号機と、ＰＭ２号機との間には明らかな有意差が確認される。つまり、オリジナルデータの分布特徴を用いて解析をおこなうという本発明方法の有効性が確認されたわけである。なお、評価用統計値リスト、Ｓ比およびｔ値については後述する。
【００５０】
図２３は、図２１および図２２に示す回帰木分析の結果、問題工程とされたＦｉｅｌｄ＿Ｏｘ工程でＰＭ１号機またはＰＭ３号機を使用した全ウェーハのＶＴ＿Ｎ２の分布を示すヒストグラムである。図２４〜図２６は、それぞれＦｉｅｌｄ＿Ｏｘ工程でＰＭ１号機またはＰＭ３号機を使用した別々の１ロット分のウェーハのＶＴ＿Ｎ２の分布を示すヒストグラムである。また、図２７は、Ｆｉｅｌｄ＿Ｏｘ工程でＰＭ２号機を使用した全ウェーハのＶＴ＿Ｎ２の分布を示すヒストグラムであり、図２８〜図３０は、それぞれＦｉｅｌｄ＿Ｏｘ工程でＰＭ２号機を使用した別々の１ロット分のウェーハのＶＴ＿Ｎ２の分布を示すヒストグラムである。
【００５１】
図２３および図２７に示すように、ＰＭ１号機またはＰＭ３号機を使用した全ウェーハのＶＴ＿Ｎ２の平均値（μ＝０．８５６０）と、ＰＭ２号機を使用した全ウェーハのＶＴ＿Ｎ２の平均値（μ＝０．７３０２）とは略同じである。そのため、従来のように平均値を用いて解析しても有意差を抽出するのは困難である。
【００５２】
しかし、ＰＭ１号機またはＰＭ３号機を使用した全ウェーハのＶＴ＿Ｎ２の標準偏差値（σ＝０．０８３５）と、ＰＭ２号機を使用した全ウェーハのＶＴ＿Ｎ２の標準偏差値（σ＝０．２３５１）とを比較すると明らかに有意差が見られる。したがって、実施の形態１のように、オリジナルデータのばらつき等のデータ分布特徴に着目することにより、オリジナルデータのみを解析対象としていたのでは抽出できなかった有意差をあらたに不良要因として抽出することが可能となる。
【００５３】
上述した解析結果に基づいて実際にＰＭ２号機について詳細な調査をおこなった結果、ＰＭ１号機およびＰＭ３号機に比べて炉内の温度分布差が大きいことが判明した。さらに、それは熱電対劣化に起因することがわかり、定期点検方法の最適化がおこなわれた。ところで、ロット歩留りを目的変数とし、各工程での使用装置名を説明変数として回帰木分析をおこなった結果では、ＰＭ２号機が歩留り低下要因であることは明らかにならなかった。つまり、歩留り値に明確に現れていなかった低歩留り要因が、ロット内の電気的特性値の標準偏差等に有意差が生じる要因を解析するという本発明方法により明らかにされたわけである。なお、実施の形態１では、蓄積されたデータの編集、回帰木分析の実行、独自な手法によるその結果の定量的な評価までが自動的に実行される。
【００５４】
図３１は、図１８に示すファイルに基づいて、上述した（６）ウェーハ番号の間隔２の周期性（ＶＴ＿Ｎ２＿２）を目的変数とし、各工程での使用装置名を説明変数として回帰木分析をおこなった結果を示す回帰木図である。また、図３２は、この回帰木分析に対する評価用統計値リストの例を示す図である。図３１に示す回帰木図によれば、ＶＴ＿Ｎ２＿２のロット内変動が２の周期性を有することに対して最も有意とされるのは、Ｆ拡散工程＿装置としてＦ７号機を使用したか、あるいはＦ５号機、Ｆ６号機、Ｆ８号機またはＦ９号機を使用したかということである。Ｆ７号機を使用した方が５０％程度強く２の周期性を示すことがわかる。
【００５５】
これを確認するため、図３３に、Ｆ拡散工程で使用した装置ごとに２の周期性の値（ＶＴ＿Ｎ２＿２）の分布を箱ヒゲ図で示す。図３３では、Ｆ７号機と、Ｆ５号機、Ｆ６号機、Ｆ８号機またはＦ９号機との間には明らかな有意差が確認される。なお、全ＶＴ＿Ｎ２データをウェーハ番号ごとに表示した図９の箱ヒゲ図からは、２の周期性を見ることはできない。この例でも、オリジナルデータの分布特徴を用いて解析をおこなうという本発明方法の有効性が確認されたわけである。
【００５６】
図３４〜図３６は、図３１および図３２に示す回帰木分析の結果、問題工程とされたＦ拡散工程でＦ７号機を使用した別々の１ロット分のウェーハについてＶＴ＿Ｎ２のロット内変動を示すヒストグラムである。図３７〜図３９は、Ｆ拡散工程でＦ５号機、Ｆ６号機、Ｆ８号機またはＦ９号機を使用した別々の１ロット分のウェーハについてＶＴ＿Ｎ２のロット内変動を示すヒストグラムである。上述した解析結果より、ＶＴ＿Ｎ２のロット内変動の要因が抽出され、ウェーハが交互に使用される装置であるＦ拡散工程の装置が注目され、実際に２つのチャンバーのうちの一方でパーティクルの発生が多いことが判明した。
【００５７】
ところで実施の形態１では、回帰木分析は説明変数を同じにして抽出された各特徴量を順次目的変数に選択して、自動的に回帰木分析をおこない、それによって各特徴量を左右する要因がそれぞれについて抽出される。特に何を解析すべきかが明確となっていない場合には、考えられるすべての特徴量を抽出し、それらを目的変数として回帰木分析を実行する。その結果、上述したように種々の解析結果が得られるので、その中で最も有意差が大とみなされる項目を歩留り改善のための対策項目の候補とする。このように従来の解析方法では容易に抽出されなかった多くの有意差が効率的に抽出される。
【００５８】
ここで、回帰木分析および評価用統計値リストについて説明する。まず、回帰木分析について簡単に説明する。回帰木分析は、複数の属性を示す説明変数とそれにより影響を受ける目的変数からなるレコードの集合を対象とし、その目的変数に最も影響を与える属性と属性値を判別するものである。データマイニングをおこなってルールファイル２４を抽出する手段２３（回帰木分析エンジン）からはデータの特徴や規則性を示すルールが出力される。
【００５９】
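Regression tree analysis proceeds by repeatedly splitting a set in two on the basis of the parameter value (attribute value) of each explanatory variable (attribute). At each split, let S0 be the sum of squares of the objective variable before the split, and S1 and S2 the sums of squares of the objective variable in the two sets after the split. The explanatory variable and the parameter value on which the records are split are chosen so that ΔS in equation (1) is maximized.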
【００６０】
ΔＳ＝Ｓ０−（Ｓ１＋Ｓ２）・・・（１）
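A minimal sketch of choosing the split that maximizes ΔS for categorical explanatory variables (such as equipment names) is given below. Enumerating every binary partition of the categories by brute force is an assumption made for brevity; the text does not say how candidate splits are generated. The names `sum_sq` and `best_split` are hypothetical.

```python
from itertools import combinations


def sum_sq(ys):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)


def best_split(records, target, variables):
    """Return (delta_s, variable, left_categories) maximizing S0 - (S1 + S2)."""
    s0 = sum_sq([r[target] for r in records])
    best = None
    for var in variables:
        cats = sorted({r[var] for r in records})
        for size in range(1, len(cats) // 2 + 1):
            for left in combinations(cats, size):
                g1 = [r[target] for r in records if r[var] in left]
                g2 = [r[target] for r in records if r[var] not in left]
                delta = s0 - (sum_sq(g1) + sum_sq(g2))
                if best is None or delta > best[0]:
                    best = (delta, var, set(left))
    return best
```

Applying `best_split` again to each of the two resulting subsets yields the lower branches of the tree.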
【００６１】
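The explanatory variable and parameter value obtained here correspond to a branch point in the regression tree. The same processing is then repeated on each of the split sets to examine the influence of the explanatory variables on the objective variable. This is the generally well-known regression tree method. To grasp in more detail how clear-cut each split is, the following parameters (a) to (d) are used, in addition to ΔS, as quantitative evaluations of the regression tree result for several top-ranked split candidates. These parameters are output as the evaluation statistics list.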
【００６２】
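(a) S ratio:
The reduction rate of the sum of squares due to the split; it indicates how far the split has reduced the sum of squares. The smaller this value, the greater the effect of the split and the more clear-cut the split, and hence the larger the significant difference.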
【００６３】
Ｓ比＝（（Ｓ１＋Ｓ２）／２）／Ｓ０・・・（２）
【００６４】
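(b) t value:
The regression tree engine splits the set in two; the t value is the statistic for testing the difference between the means (/X1, /X2) of the two resulting sets. Here "/" denotes an overline. The statistical t-test serves as the criterion for a significant difference between the means of the objective variable in the split sets. For the same degrees of freedom, that is, the same number of data, a larger t means a more clear-cut split and a larger significant difference.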
【００６５】
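When the variances of the split sets do not differ significantly, the t value is obtained from equation (3) below; when they do differ significantly, it is obtained from equation (4). N1 and N2 are the numbers of elements in split sets 1 and 2, respectively. /X1 and /X2 are the means of the respective sets after the split. S1 and S2 are the sums of squares of the objective variable in the respective sets after the split.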
【００６６】
【数１】

【００６７】
【数２】

【００６８】
(c) Difference between the means of the objective variable in the split sets:
The larger this value, the larger the significant difference.
【００６９】
(d) Number of data in each split set:
The smaller the difference between the two, the smaller the influence of outliers (noise).
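The evaluation statistics (a) to (d) can be sketched in Python as below. Equation (2) is taken from the text. Equations (3) and (4) are not reproduced in the text here, so the sketch assumes the standard pooled-variance form for (3) and the standard Welch form for (4), both written in terms of the sums of squares S1 and S2; the patent's exact expressions may differ. The function names are hypothetical.

```python
from math import sqrt


def sum_sq(ys):
    """Sum of squared deviations from the mean."""
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)


def split_statistics(group1, group2, equal_variance=True):
    """Evaluation statistics for one two-way split (group1, group2 are lists)."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = sum(group1) / n1, sum(group2) / n2
    s0 = sum_sq(group1 + group2)
    s1, s2 = sum_sq(group1), sum_sq(group2)
    if equal_variance:
        # Assumed form of equation (3): pooled variance.
        se = sqrt((s1 + s2) / (n1 + n2 - 2) * (1 / n1 + 1 / n2))
    else:
        # Assumed form of equation (4): Welch's unequal-variance form.
        se = sqrt(s1 / (n1 * (n1 - 1)) + s2 / (n2 * (n2 - 1)))
    return {
        "s_ratio": ((s1 + s2) / 2) / s0,  # (a), equation (2)
        "t": abs(m1 - m2) / se,           # (b)
        "mean_diff": abs(m1 - m2),        # (c)
        "n": (n1, n2),                    # (d)
    }
```

The sketch does not guard against degenerate splits: each group needs at least two elements with some spread, otherwise the standard error or S0 is zero.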
【００７０】
上述した実施の形態１によれば、従来のようにオリジナルデータやその平均値だけでなく、オリジナルデータのばらつきやロット内変動パターンなど、オリジナルデータ群内に存在する種々のデータ分布特徴を抽出し、各特徴量を順次目的変数に選択して解析をおこなうことにより、各特徴量が生じた要因を自動的かつ定量的に評価して抽出し、データをより多面的にみて多くの情報を抽出することができる。したがって、従来は多種多様なデータに埋もれて判別が困難であった関連性や有意差を、技術者の主観によらずに客観的に、また効率的に定量的に抽出することができる。
【００７１】
また、実施の形態１では、特徴量の抽出からその要因抽出までの一連の手順が自動的におこなわれるので、所定の設定をしておくことによって自動的に半導体製造ライン等の変動状況やその要因を絶えず監視することが可能となる。
【００７２】
なお、本発明は上述した実施の形態１に限らず、適用範囲が広い。たとえば新品種の立ち上げ時などで、悪化原因がたくさんあり、歩留りの悪いロットが多発している状況では、オリジナルデータやその平均値を用いた原因工程の調査だけでなく、ロットやウェーハ内のデータ分布特徴からの原因工程調査をおこなうことによって、隠れていた原因を見つけたり、原因の絞り込みをおこなうことが可能となる。
【００７３】
（実施の形態２）
図５６は、本発明の実施の形態２によるデータマイニングを導入したデータ解析装置の機能構成の一例を示す図である。データマイニング部１７０３は、オリジナルデータ群１７０１内の各データベース１７０２から抽出された個々のオリジナルデータに基づいて、データ内に潜む特徴や規則性の抽出処理を行い、ルールファイル１７０４を作成する。解析ツール群１７０５は、統計解析コンポーネント１７０６および図表作成コンポーネント１７０７等を有し、ルールファイル１７０４を基にデータベース１７０２から抽出された個々のオリジナルデータを解析する。
【００７４】
その解析結果は、解析ツール群１７０５およびデータマイニング部１７０３にフィードバックされる。データマイニング部１７０３は、解析ツール群１７０５の解析結果およびオリジナルデータ群１７０１を基にデータマイニングをおこなう。解析ツール群１７０５は、ルールファイル１７０４、データベース１７０２から抽出された個々のオリジナルデータ、および自己の解析結果を基に解析をおこなう。意思決定（部）１７０８は、解析ツール群１７０５の解析結果を基に意思決定をおこなう。
【００７５】
歩留りデータ解析においてデータマイニングを適用した場合、データマイニング結果に基づいて歩留り向上のための対策を決定したり、対策を実施すべきか否かの判定をおこなったり、対策効果の予測をおこなったりすることになる。そのためには、データマイニング結果の定量的な評価や精度が必要となる。
【００７６】
データマイニングの一手法である判別木分析のうち、回帰木分析は特に有効である。回帰木分析の利点の一つは、結果がわかりやすいルールとして出力されることであり、それは一般的な言語やＳＱＬ言語のようなデータベース言語であらわされる。したがって、これらの結果の信頼度、精度を有効に使い、その結果により有効な意思決定をおこなったり、行動（すなわち対策等）を起こすようにすることが可能となる。
【００７７】
図４３に回帰木分析の入力となるデータ例の形式を示す。レコードはウェーハ番号単位であり、各レコードは各製造工程での使用装置４１１、電気的特性データ４１２とウェーハ歩留り４１３を有する。説明変数４０１は、使用装置４１１および電気的特性データ４１２等である。目的変数４０２は、歩留り４１３である。たとえば、歩留りに効果があるのは、使用装置４１１と電気的特性データ４１２であるとする。このデータによる回帰木分析結果である回帰木図と評価用統計値リストを図４４、図４５に示す。
【００７８】
図４４は、回帰木分析結果である回帰木図である。ルートノードｎ０は、ノードｎ１およびｎ２に２分割される。ノードｎ１は、ノードｎ３およびｎ４に２分割される。ノードｎ２は、ノードｎ５およびｎ６に２分割される。ノードｎ６は、ノードｎ７およびｎ８に２分割される。
【００７９】
図４５は、第１の２分割時の説明変数の評価用統計値である。たとえば、全集合の目的変数の平均値Ａｖｅが７５であり、標準偏差ｓが１２であり、データ数Ｎが１０００である。リスト６０１〜６０４は、それぞれ左から有意差による順位、Ｓ比、ｔ値、分割された集合の目的変数の平均値の差、分割された各集合のデータ数、分割された集合の属性名（説明変数）、分割された２つの集合の属性値（パラメータ値）とその目的変数の大小関係を示す。このリスト６０１〜６０４は、分割する属性値（説明変数）の（１）式に示すΔＳの値によるグループ分けの候補であり、有意差（ΔＳ）の大きい順に並べてある。図４４は、第１候補６０１を基にノードｎ０をノードｎ１およびｎ２に分割したものである。
【００８０】
図４４の全ウェーハの集合ｎ０を式（１）のΔＳの評価値に基づいて２つの集合ｎ１およびｎ２に分割をおこなうと、歩留りに最も影響を及ぼすのは工程ＡでＡＭ１かＡＭ２のいずれかを使うかであり、後者の方が歩留りがよい。以下、分割された集合に対して、同様な集合分割を繰り返していくとこの回帰木図が得られる。工程ＡでＡＭ２かつ工程ＣでＣＭ２を使用したウェーハ群に対しては、電気的特性データＲＳＰが９０以下の状態が最も効果がある（歩留りが高い）。
【００８１】
図４６は図４４と等価であり、分割されたウェーハ集合の歩留りと特定工程の使用装置と電気的特性データとの相関を示す。図４４の回帰木図で上階層に現れる説明変数ほど、目的変数に対する影響は大きい。全ウェーハの平均歩留りは７４．８％であるが、使用装置や電気的特性データとの関連で幾つかの集合に分けてみるとこのような特徴、規則性があることを回帰木分析は自動的に抽出し、歩留り解析の手がかりとなる。
【００８２】
図４４の回帰木図において上位２階層はいずれも使用装置差によるものであるので、全ウェーハを使った解析では歩留りに影響の大きいのは複合条件を含めても使用装置差である。電気的特性データはあまり効いていないように見られる。しかし、工程ＡでＡＭ２かつ工程ＣでＣＭ２を使用したウェーハ群について歩留りに最も効くのはＲＳＰであることが図４４、図４６から読み取れる。
【００８３】
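Next, an example of calculating the two-split confounding degree and the two-split independence degree is described. In regression tree analysis, the degree of confounding (the state of confounding, that is, the degree of non-independence) of each set split performed to find the explanatory variable most significant for the objective variable is grasped statistically, and the other explanatory variables confounded with an explanatory variable judged to have a large significant difference are made explicit. The method of computing the two-split confounding degree and the two-split independence degree is explained with reference to FIG. 47.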
【００８４】
First, among the explanatory variables, the one whose confounding is to be evaluated is taken as the reference explanatory variable 801.
【００８５】
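Second, a table is built in which each record holds, for each explanatory variable, the data value "L" or "H". H means that the record belongs to the set in which the objective variable takes the higher value when the set is split in two in the regression tree analysis; L means that it belongs to the set in which the objective variable takes the lower value. At the time of the two-way split, L or H is thus determined for every explanatory variable of every record.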
【００８６】
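Third, as a measure of how well the L/H labels of each comparison explanatory variable 802 agree with those of the reference explanatory variable 801, let Na be the number of records whose L/H labels agree and N the total number of records, and define the two-split confounding degree DEP by equation (5). DEP ranges from -1 to 1: it is 1 if the variables are completely confounded, 0 if they are not confounded at all, and -1 if they are confounded in the opposite sense.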
【００８７】
ＤＥＰ＝（２×Ｎａ／Ｎ）−１・・・（５）
【００８８】
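The two-split independence degree IND is defined from DEP by equation (6). IND ranges from 0 to 1: it is 1 if the variables are completely independent and 0 if they are not independent at all.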
【００８９】
ＩＮＤ＝１−｜ＤＥＰ｜・・・（６）
【００９０】
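Fourth, DEP and IND are computed between one reference explanatory variable 801 and each of the other explanatory variables 802 and used as an evaluation measure between explanatory variables. Any explanatory variable may serve as the reference, but in terms of usefulness it is effective to choose one judged to have a large significant difference for the objective variable in the regression tree analysis, especially at the top-level split.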
【００９１】
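Fifth, computing DEP and IND makes it possible to evaluate quantitatively how far the assignment of each comparison explanatory variable 802 to the L and H sets differs from that of the reference explanatory variable 801.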
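Equations (5) and (6) translate directly into code. In the sketch below each variable is represented by a list of "L"/"H" labels, one per record; the function names and this list representation are assumptions made for illustration.

```python
def confounding_degree(reference, comparison):
    """DEP = 2 * Na / N - 1, where Na counts records whose L/H labels agree."""
    if len(reference) != len(comparison) or not reference:
        raise ValueError("label lists must be non-empty and of equal length")
    na = sum(1 for r, c in zip(reference, comparison) if r == c)
    return 2 * na / len(reference) - 1


def independence_degree(reference, comparison):
    """IND = 1 - |DEP|."""
    return 1 - abs(confounding_degree(reference, comparison))
```

Identical label lists give DEP = 1, fully opposite lists give DEP = -1, and lists that agree on exactly half of the records give DEP = 0 and IND = 1.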
【００９２】
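Computing the two-split confounding degree and/or the two-split independence degree makes it possible to evaluate the confounding of explanatory variables quantitatively from the two-way splits of the regression tree analysis. In combination with regression tree analysis, it becomes possible to extract automatically the other explanatory variables that are confounded with an explanatory variable found to have a large significant difference.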
【００９３】
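The two-split confounding degree can be evaluated for any explanatory variable covered by the regression tree analysis. In view of its usefulness, however, it is used to grasp statistically how far an explanatory variable ranked high among the candidates for the first split in FIG. 45 (the reference explanatory variable, which appears in the evaluation statistics list) is confounded with any other explanatory variable, and to extract the explanatory variables that need attention because they are confounded with one having a large significant difference. The explanatory variable whose confounding with the reference explanatory variable 801 is to be analyzed is called the comparison explanatory variable 802; both are selected from the evaluation statistics list of FIG. 45. A calculation example for the two-split confounding degree and the two-split independence degree is described with reference to FIG. 47.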
【００９４】
図４７は、横軸にウェーハ番号８０３、比較説明変数８０２、基準説明変数８０１、歩留り８０４を示し、縦軸に基準説明変数の高歩留りグループ８１１、基準説明変数の低歩留りグループ８１２、２分割交絡度の計算式８１３、２分割交絡度８１４、２分割独立度８１５を示す。
【００９５】
図４５の上位候補項目（評価用統計値リスト）の中から比較の基準とする項目を基準説明変数８０１として決める。図４７では、ＳＴ３が基準説明変数８０１である。その他の説明変数を比較説明変数８０２とする。図４７では、ＳＴ１，ＳＴ２，ＷＥＴ２が比較説明変数８０２である。各比較説明変数８０２と基準説明変数８０１とを比較する。説明変数であるＳＴ１，ＳＴ２，ＳＴ３，ＷＥＴ２では、低歩留りグループの“Ｌ”をハッチで示し、高歩留りグループの“Ｈ”をハッチなしで示す。
【００９６】
基準説明変数８０１であるＳＴ３は、その属性値により、基準説明変数の高歩留りグループ８１１と基準説明変数の低歩留りグループ８１２に分けることができる。基準説明変数の高歩留りグループ８１１は１０個の集合であり、基準説明変数の低歩留りグループ８１２も１０個の集合である。
【００９７】
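Next, for each explanatory variable, the number of lots whose high-yield or low-yield group (from its two-way split) coincides with the same group of the reference explanatory variable is counted and taken as Na. For example, for the comparison explanatory variable ST1, 10 of its high-yield members fall in the high-yield group 811 of the reference explanatory variable, and 2 of its low-yield members fall in the low-yield group 812 of the reference explanatory variable. That is, the number of cases in which the comparison explanatory variable ST1 and the reference explanatory variable ST3 belong to the same group is Na = 10 + 2 = 12.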
【００９８】
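The expression obtained by substituting this Na into equation (5) is shown as the calculation formula 813 for the two-split confounding degree, where the number of data N is 20. The result of the calculation is shown as the two-split confounding degree 814, and the value obtained from equation (6) is shown as the two-split independence degree 815. Both are shown at the bottom of each column in FIG. 47.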
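Working the ST1 example through equations (5) and (6) with Na = 12 and N = 20 gives the values below. They are computed here from the formulas and are not read from FIG. 47, which is not reproduced in the text.

```latex
\mathrm{DEP} = \frac{2 N_a}{N} - 1 = \frac{2 \times 12}{20} - 1 = 0.2,
\qquad
\mathrm{IND} = 1 - \lvert \mathrm{DEP} \rvert = 1 - 0.2 = 0.8
```

A value of 0.2 indicates that ST1 is only weakly confounded with the reference explanatory variable ST3.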
【００９９】
２分割交絡度および２分割独立度の基本的活用方法は次の３つである。従来は判別が難しかった説明変数が、以下のように定量的な情報として得られる。
【０１００】
（１）有意な説明変数の範囲を確認：
有意性の高い候補と交絡している候補を把握し、これらも有意な説明変数と判断する。交絡度に対する基準は特に無いが、他の説明変数の値と比較して判断できる。また、技術的に対象として考えなくてよい候補が上位に来た場合、この候補に交絡している候補を明確にできる。さらに、意味の無い候補を削除して再度分析して確認できる。
【０１０１】
（２）独立性の高い候補の確認とその応用：
すべての候補について他の候補との独立度を確認し、他の候補との独立度が十分高い候補がある場合、この候補による歩留り差は他の候補に独立して存在することが明確になる。さらに、この候補の分割グループごとに同様の判別木分析をおこなって比較し、どちらも同様の分析結果が得られる場合は分析結果の信頼性が高いことがわかる。逆に、分析結果が異なる場合は独立と考えられた候補との複合条件で歩留りを左右する説明変数があるか、または特異なデータに左右されている（データ数が少ないことなどが要因）と考えられる。
【０１０２】
（３）交絡している候補に関する判別木分析：
ある重要と考えられる候補が第１分岐候補に交絡している場合、第１分岐の下層の分岐には現れ難い。その際、他の独立度の高い候補の分割グループによってデータを分割して判別木分析をおこない、この分割グループの下での判別木分析結果を比較する。同様の結果であれば、その重要な候補は第１候補と区別できないが、分析自体は信頼性が高いと考えられる。逆にその重要と考えられる候補が現われ、異なった結果となった場合、この結果も考慮すべきであり、重要と考えられる候補と第１候補とを区別して分析できるデータ解析をさらにおこなう必要が有ると考えられる。
【０１０３】
つぎに、装置履歴、電気的特性値を説明変数、ウェーハ歩留りを目的変数とする回帰木分析をおこない、回帰木分析結果の第１分岐の上位１２候補について２分割交絡度および２分割独立度を求める場合を説明する。
【０１０４】
本実施の形態２で得られる回帰木図および評価用統計値リストを図４８および図４９に示す。図４８では、ノードｎ９００がノードｎ９０１〜ｎ９１４に分割される。図４９は、第１の２分割時の上位１２の説明変数の評価用統計値を示す。これにより、集合分岐の１２の候補１００１〜１０１２が挙がる。
【０１０５】
図５０は、図４９の最上位の第一候補１００１として挙がっているＳＴ１を基準説明変数１１０１とし、評価用統計値リストの他の説明変数を比較説明変数１１０２としたときの２分割交絡度１１１１および２分割独立度１１１２を示す。
【０１０６】
図５１は、図４９の集合分岐の第三候補１００３として挙がっているＳＴ３を基準説明変数１２０１とし、評価用統計値リストの他の説明変数を比較説明変数１２０２としたときの２分割交絡度１２１１および２分割独立度１２１２を示す。
【０１０７】
図５０に示す２分割交絡度が０．７５を超えているのはＳＴ２，ＳＴ４，ＳＴ５，ＳＴ６，ＳＴ１０，ＷＥＴ２であり、これらは図４８の回帰木図には現れてこないが、歩留りに大きく効いている要因である可能性がある。逆に、図５１により、ＳＴ３は、２分割独立度が高いことを示している。
【０１０８】
図５１は、図５０で２分割独立度が高いとされたＳＴ３を基準説明変数とし、他の１１の説明変数との２分割交絡度１２１１および２分割独立度１２１２を示している。ＳＴ３は他のいずれの説明変数とも独立度が高いことを示している。
【０１０９】
図５２および図５３は、図４９の回帰木分析で有意差が大きいとされた上位１２の説明変数同士の２分割交絡度、２分割独立度およびその平均値を示し、説明変数間の関連を一見に把握できる。図５２の最下欄は２分割交絡度の平均値１３０１を示し、図５３の最下欄は２分割独立度の平均値１４０１を示す。
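A table of the kind described for FIG. 52 and FIG. 53 can be produced by evaluating every ordered pair of explanatory variables. The sketch below is an assumption about how such a table might be computed: the function names are hypothetical, the labels in the usage example are made up, and the default threshold of 0.75 is borrowed from the value quoted for FIG. 50.

```python
def pairwise_dep(labels):
    """labels: {variable name: list of 'L'/'H' per record} -> DEP for each ordered pair."""
    names = list(labels)
    n = len(labels[names[0]])
    dep = {}
    for a in names:
        for b in names:
            if a != b:
                agree = sum(1 for x, y in zip(labels[a], labels[b]) if x == y)
                dep[(a, b)] = 2 * agree / n - 1
    return dep


def mean_independence(dep, name):
    """Average IND = 1 - |DEP| of one variable against all the others."""
    values = [1 - abs(v) for (a, _), v in dep.items() if a == name]
    return sum(values) / len(values)


def confounded_with(dep, reference, threshold=0.75):
    """Variables whose DEP against the reference exceeds the threshold."""
    return sorted(b for (a, b), v in dep.items() if a == reference and v > threshold)


labels = {
    "ST1": list("HHHHLLLL"),
    "ST2": list("HHHHLLLL"),
    "ST3": list("HLHLHLHL"),
}
dep = pairwise_dep(labels)
```

In this made-up example `confounded_with(dep, "ST1")` flags ST2, while ST3 has a mean independence of 1 against the other two variables.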
【０１１０】
つぎに、ＳＴ３での使用装置の差は他の説明変数と独立して歩留りに効いていることが判明したので、歩留りが不良となるＳＴ３での装置群によるウェーハ群（不良ウェーハ群：Ｓ３Ｍ２，Ｓ３Ｍ３を使用）と良好となるＳＴ３での装置群によるウェーハ群（良好ウェーハ群、Ｓ３Ｍ１，Ｓ３Ｍ４を使用）に分けて別個に回帰木分析をおこなう。その結果としての回帰木図を図５４、図５５に示す。
【０１１１】
図５４は、不良ウェーハ群による回帰木分析結果を示す回帰木図であり、ノードｎ１５００〜ｎ１５０６で構成される。図５５は、良好ウェーハ群による回帰木分析結果を示す回帰木図であり、ノードｎ１６００〜ｎ１６０６で構成される。
【０１１２】
図５４の不良ウェーハ群の第一分岐は図４８の全ウェーハ群によるものと同じであり、図４８の回帰木図の最上階層の不良ウェーハ群はｎ＝３９と少ないこともあわせ、歩留りが他に比べて極端に悪いウェーハによりかなり左右されると推察され、解析を困難にしている一因である。図５５の良好ウェーハ群では、ＳＴ３工程の不良装置により見えにくかった要因があらたに判明したことになる。
【０１１３】
本実施の形態２によれば、２分割交絡度および２分割独立度を用いて説明変数の交絡の度合いをより明確に把握できるようになり、回帰木分析と組み合わせて、回帰木における最初の分岐の有意差が大きい問題説明変数に交絡している注意すべき説明変数を明確化することが可能となる。
【０１１４】
さらに、独立性の高い説明変数のグループ分けを応用して再度回帰木分析することによって、回帰木分析の精度（信頼度）および解析効率を向上させ、また、より詳しい分析が可能となる。
【０１１５】
上述した実施の形態は、コンピュータがプログラムを実行することによって実現することができる。また、プログラムをコンピュータに供給するための手段、たとえばかかるプログラムを記録したＣＤ−ＲＯＭ等の記録媒体またはかかるプログラムを伝送するインターネット等の伝送媒体も本発明の実施の形態として適用することができる。上記のプログラム、記録媒体および伝送媒体は、本発明の範疇に含まれる。
【０１１６】
なお、上記実施の形態は、いずれも本発明を実施するにあたっての具体化のほんの一例を示したものに過ぎず、これらによって本発明の技術的範囲が限定的に解釈されてはならないものである。すなわち、本発明はその技術思想、またはその主要な特徴から逸脱することなく、様々な形で実施することができる。
【０１１７】
（付記１）オリジナルデータ値を編集して前記オリジナルデータ群内に存在する１以上のデータ分布特徴量を定量的に評価して抽出する工程と、
抽出された前記データ分布特徴量の中から任意のデータ分布特徴量を選択して解析をおこなう工程と、
得られた解析結果に基づいて意思決定をおこなう工程と、
を含んだことを特徴とするデータ解析方法。
【０１１８】
（付記２）オリジナルデータ値を編集して前記オリジナルデータ群内に存在する２以上のデータ分布特徴量を定量的に評価して抽出する工程と、
抽出された個々の前記データ分布特徴量を順次選択して解析をおこなう工程と、
得られた解析結果に基づいて意思決定をおこなう工程と、
を含んだことを特徴とするデータ解析方法。
【０１１９】
（付記３）前記データ分布特徴量は、特徴の程度を表す連続値で示されることを特徴とする付記１または２に記載のデータ解析方法。
【０１２０】
（付記４）各レコードに関して各データ分布特徴量は互いに独立であることを特徴とする付記１〜３のいずれか一つに記載のデータ解析方法。
【０１２１】
（付記５）前記データ分布特徴量を目的変数とするデータマイニングにより解析をおこなうことを特徴とする付記１〜４のいずれか一つに記載のデータ解析方法。
【０１２２】
（付記６）個々の前記データ分布特徴量をレコードごとにファイルに保存し、前記ファイルから、一部または全部のレコードに対して同じデータ分布特徴量を順次選択して目的変数とし、回帰木分析をおこなうことを特徴とする付記５に記載のデータ解析方法。
【０１２３】
（付記７）前記各工程は、順次おこなうように組まれたソフトウェアを計算機システムで実行することによって自動的におこなわれることを特徴とする付記１〜６のいずれか一つに記載のデータ解析方法。
【０１２４】
（付記８）前記データ分布特徴量の一つは、オリジナルデータの配列の順番をｘとし、オリジナルデータ値をｙとし、ｘとｙの関係を一次式ｙ＝ｂ・ｘ＋ａに近似した時のｙ軸切片の値ａであることを特徴とする付記１〜７のいずれか一つに記載のデータ解析方法。
【０１２５】
（付記９）前記データ分布特徴量の一つは、オリジナルデータの配列の順番をｘとし、オリジナルデータ値をｙとし、ｘとｙの関係を一次式ｙ＝ｂ・ｘ＋ａに近似した時の傾きの値ｂであることを特徴とする付記１〜７のいずれか一つに記載のデータ解析方法。
【０１２６】
（付記１０）前記データ分布特徴量の一つは、オリジナルデータの配列の順番に対するオリジナルデータ値の特定の周期性の強度であることを特徴とする付記１〜７のいずれか一つに記載のデータ解析方法。
【０１２７】
（付記１１）前記データ分布特徴量の一つは、オリジナルデータの配列の順番に対するオリジナルデータ値の最も強い周期性を示す値であることを特徴とする付記１〜７のいずれか一つに記載のデータ解析方法。
【０１２８】
（付記１２）（ａ）説明変数および目的変数のデータ結果を準備する工程と、
（ｂ）前記データ結果を基に複数の説明変数間の交絡度および／または独立度を演算する工程と、
（ｃ）前記交絡度および／または独立度を用いてデータマイニングをおこなう工程と、
を含んだことを特徴とするデータ解析方法。
【０１２９】
（付記１３）前記ステップ（ｂ）は、回帰木分析により２分割された集合単位で前記交絡度および／または独立度を演算することを特徴とする付記１２に記載のデータ解析方法。
【０１３０】
（付記１４）前記ステップ（ｂ）は、回帰木分析により有意差が大きい分割の要因となる複数の説明変数を選択し、該複数の説明変数間の交絡度および／または独立度を演算することを特徴とする付記１３に記載のデータ解析方法。
【０１３１】
（付記１５）前記ステップ（ｂ）は、基準となる説明変数とその他の説明変数との間の交絡度および／または独立度を演算する際、回帰木分析により２分割された各集合内の説明変数間のデータの一致と不一致との割合を基に交絡度および／または独立度を演算することを特徴とする付記１４に記載のデータ解析方法。
【０１３２】
（付記１６）前記ステップ（ｃ）は、前記交絡度および／または独立度を基に説明変数を取捨選択することによりデータマイニングをおこなうことを特徴とする付記１５に記載のデータ解析方法。
【０１３３】
（付記１７）説明変数および目的変数のデータ結果を基に複数の説明変数間の交絡度および／または独立度を演算する演算手段と、
前記交絡度および／または独立度を用いてデータマイニングをおこなうデータマイニング手段と、
を備えたことを特徴とするデータ解析装置。
【０１３４】
（付記１８）前記演算手段は、回帰木分析により２分割された集合単位で前記交絡度および／または独立度を演算することを特徴とする付記１７に記載のデータ解析装置。
【０１３５】
（付記１９）前記演算手段は、回帰木分析により有意差が大きい分割の要因となる複数の説明変数を選択し、該複数の説明変数間の交絡度および／または独立度を演算することを特徴とする付記１８に記載のデータ解析装置。
【０１３６】
（付記２０）前記演算手段は、基準となる説明変数とその他の説明変数との間の交絡度および／または独立度を演算する際、回帰木分析により２分割された各集合内の説明変数間のデータの一致と不一致との割合を基に交絡度および／または独立度を演算することを特徴とする付記１９に記載のデータ解析装置。
【０１３７】
（付記２１）前記データマイニング手段は、前記交絡度および／または独立度を基に説明変数を取捨選択することによりデータマイニングをおこなうことを特徴とする付記２０に記載のデータ解析装置。
【０１３８】
（付記２２）（ａ）説明変数および目的変数のデータ結果を準備する手順と、
（ｂ）前記データ結果を基に複数の説明変数間の交絡度および／または独立度を演算する手順と、
（ｃ）前記交絡度および／または独立度を用いてデータマイニングをおこなう手順と、
をコンピュータに実行させるためのプログラムを記録したコンピュータ読み取り可能な記録媒体。
【０１３９】
【発明の効果】
本発明によれば、計算機システムに蓄積されているオリジナルデータ群内に存在する種々のデータ分布特徴を抽出し、各特徴量を順次選択して解析をおこなうことにより、各特徴量が生じた要因を自動的かつ定量的に評価して抽出するため、データをより多面的にみて多くの情報（傾向、特徴的パターン、データ間の関連性等）を抽出することができる。したがって、従来は多種多様なデータに埋もれて判別が困難であった関連性や有意差を、技術者の主観によらずに客観的に、また効率的に定量的に抽出することができる。
【図面の簡単な説明】
【図１】本発明の実施の形態１において使用される計算機システムの一例を示す図である。
【図２】図１に示す構成の計算機システムにより実現されるデータ解析装置の機能構成の一例を示すブロック図である。
【図３】本発明の実施の形態１においてデータ分布特徴の抽出により抽出される各特徴量をＣＳＶ形式で表した図表である。
【図４】本発明の実施の形態１において半導体データの歩留り解析をおこなう際にロット内データ分布の特徴としてウェーハの属性値の変動に着目した情報を示す図表である。
【図５】複数のロットについてロット内ウェーハの歩留り値のばらつきの様子を示す図である。
【図６】本発明の実施の形態１にかかるデータ解析方法の一例の概略を示すフローチャートである。
【図７】具体例として歩留りとＶＴ＿Ｎ２との関係を示す特性図である。
【図８】具体例としてすべてのウェーハから得られたＶＴ＿Ｎ２データのヒストグラムを示す図である。
【図９】具体例として全ＶＴ＿Ｎ２データをウェーハ番号ごとに表示した箱ヒゲ図を示す図である。
【図１０】具体例として目的変数を各ロットにおけるＶＴ＿Ｎ２の平均値とし、説明変数を各工程で使用した装置名として回帰木分析をおこなった結果を示す図である。
【図１１】図１０に示す回帰木分析結果に対する評価用統計値リストの例を示す図である。
【図１２】具体例として全ＶＴ＿Ｎ２データを第２配線＿装置の使用装置名ごとに表示した箱ヒゲ図を示す図である。
【図１３】具体例として目的変数を各ウェーハのＶＴ＿Ｎ２の値とし、説明変数を各工程で使用した装置名として回帰木分析をおこなった結果を示す図である。
【図１４】図１３に示す回帰木分析結果に対する評価用統計値リストの例を示す図である。
【図１５】具体例として全ＶＴ＿Ｎ２データを２ＣＯＮ工程＿装置の使用装置名ごとに表示した箱ヒゲ図を示す図である。
【図１６】具体例としてロットごとにＶＴ＿Ｎ２の各特徴量を定義したファイルを示す図表である。
【図１７】具体例としてＶＴ＿Ｎ２のロット内分布の各特徴量のヒストグラムを示す図である。
【図１８】具体例としてＶＴ＿Ｎ２のロット内分布の特徴量について回帰木分析をおこなうためのファイルを示す図表である。
【図１９】具体例として歩留りの変動要因を回帰木分析で解析するための入力ファイルを示す図表である。
【図２０】具体例として目的変数を各ロットにおけるＶＴ＿Ｎ２の標準偏差値とし、説明変数を各工程で使用した装置名として回帰木分析をおこなった結果を示す図である。
【図２１】図２０に示す回帰木分析結果に対する評価用統計値リストの例を示す図である。
【図２２】具体例として全ＶＴ＿Ｎ２データをＦｉｅｌｄ＿Ｏｘ工程＿装置の使用装置名ごとに表示した箱ヒゲ図を示す図である。
【図２３】具体例としてＦｉｅｌｄ＿Ｏｘ工程でＰＭ１号機またはＰＭ３号機を使用した全ウェーハのＶＴ＿Ｎ２のヒストグラムを示す図である。
【図２４】具体例としてＦｉｅｌｄ＿Ｏｘ工程でＰＭ１号機またはＰＭ３号機を使用した１ロット分のウェーハのＶＴ＿Ｎ２のヒストグラムを示す図である。
【図２５】具体例としてＦｉｅｌｄ＿Ｏｘ工程でＰＭ１号機またはＰＭ３号機を使用した１ロット分のウェーハのＶＴ＿Ｎ２のヒストグラムを示す図である。
【図２６】具体例としてＦｉｅｌｄ＿Ｏｘ工程でＰＭ１号機またはＰＭ３号機を使用した１ロット分のウェーハのＶＴ＿Ｎ２のヒストグラムを示す図である。
【図２７】具体例としてＦｉｅｌｄ＿Ｏｘ工程でＰＭ２号機を使用した全ウェーハのＶＴ＿Ｎ２のヒストグラムを示す図である。
【図２８】具体例としてＦｉｅｌｄ＿Ｏｘ工程でＰＭ２号機を使用した１ロット分のウェーハのＶＴ＿Ｎ２のヒストグラムを示す図である。
【図２９】具体例としてＦｉｅｌｄ＿Ｏｘ工程でＰＭ２号機を使用した１ロット分のウェーハのＶＴ＿Ｎ２のヒストグラムを示す図である。
【図３０】具体例としてＦｉｅｌｄ＿Ｏｘ工程でＰＭ２号機を使用した１ロット分のウェーハのＶＴ＿Ｎ２のヒストグラムを示す図である。
【図３１】具体例として目的変数をウェーハ番号の間隔２の周期性の値とし、説明変数を各工程で使用した装置名として回帰木分析をおこなった結果を示す図である。
【図３２】図３１に示す回帰木分析結果に対する評価用統計値リストの例を示す図である。
【図３３】具体例としてウェーハ番号の間隔２の周期性の値をＦ拡散工程＿装置の使用装置名ごとに表示した箱ヒゲ図を示す図である。
【図３４】具体例としてＦ拡散工程でＦ７号機を使用した１ロット分のウェーハについてＶＴ＿Ｎ２のロット内変動を示す図である。
【図３５】具体例としてＦ拡散工程でＦ７号機を使用した１ロット分のウェーハについてＶＴ＿Ｎ２のロット内変動を示す図である。
【図３６】具体例としてＦ拡散工程でＦ７号機を使用した１ロット分のウェーハについてＶＴ＿Ｎ２のロット内変動を示す図である。
【図３７】具体例としてＦ拡散工程でＦ５号機、Ｆ６号機、Ｆ８号機またはＦ９号機を使用した１ロット分のウェーハについてＶＴ＿Ｎ２のロット内変動を示す図である。
【図３８】具体例としてＦ拡散工程でＦ５号機、Ｆ６号機、Ｆ８号機またはＦ９号機を使用した１ロット分のウェーハについてＶＴ＿Ｎ２のロット内変動を示す図である。
【図３９】具体例としてＦ拡散工程でＦ５号機、Ｆ６号機、Ｆ８号機またはＦ９号機を使用した１ロット分のウェーハについてＶＴ＿Ｎ２のロット内変動を示す図である。
【図４０】ロットの流れと異常製造装置の関係を示す図である。
【図４１】従来技術によるある工程での装置別歩留り分布を示す図である。
【図４２】ロットの流れと異常製造装置の交絡の関係を示す図である。
【図４３】回帰木分析入力データの例を示す図である。
【図４４】回帰木の例を示す図である。
【図４５】評価用統計値リストの例を示す図である。
【図４６】使用製造装置と電気的特性データと歩留り値の関係を示す図である。
【図４７】２分割交絡度および２分割独立度の算出例を示す図である。
【図４８】回帰木の例を示す図である。
【図４９】評価用統計値リストの例を示す図である。
【図５０】各説明変数と第１候補の説明変数との交絡度および独立度を示す図である。
【図５１】各説明変数と第３候補の説明変数との交絡度および独立度を示す図である。
【図５２】全候補同士の交絡度およびその平均を示す図である。
【図５３】全候補同士の独立度およびその平均を示す図である。
【図５４】不良ウェーハ群による回帰木分析結果を示す回帰木図である。
【図５５】良好ウェーハ群による回帰木分析結果を示す回帰木図である。
【図５６】データ解析装置の機能構成の一例を示す図である。
【符号の説明】
２１データ分布特徴を抽出する手段
２２解析対象とする特徴量を選択する手段
２３データマイニングをおこなう手段
２４ルールファイル
２５統計解析コンポーネント
２６図表作成コンポーネント
２７解析ツール群
４１データベース
４２オリジナルデータ群
１０１正常装置
１０２異常装置
４０１説明変数
４０２目的変数
４１１使用装置
４１２電気的特性データ
４１３歩留り
８０１基準説明変数
８０２比較説明変数
８０３ウェーハ番号
８０４歩留り
８１１基準説明変数の高歩留りグループ
８１２基準説明変数の低歩留りグループ
８１３２分割交絡度の計算式
８１４２分割交絡度
８１５２分割独立度
１７０１オリジナルデータ群
１７０２データベース
１７０３データマイニング部
１７０４ルールファイル
１７０５解析ツール群
１７０６統計解析コンポーネント
１７０７図表作成コンポーネント
１７０８意思決定部
[0001]
[Technical field of the invention]
The present invention relates to a data analysis method that grasps the relationships among data handled widely across industry and extracts significant results that lead to industrially advantageous outcomes. It further relates to a data analysis method that extracts knowledge and information that are difficult to discern when attention is paid only to the data values under analysis or their averages. It also relates to a data analysis method and a data analysis apparatus that evaluate the accuracy and the like of analysis results.
[0002]
For example, the invention relates to a data analysis method that uses equipment-usage history, test results, design information, various measurement data, and the like acquired in a semiconductor manufacturing process to grasp how yield fluctuates and thereby extract conditions advantageous for improving yield. In particular, it relates to a data analysis method and a data analysis apparatus that automatically and quantitatively extract and recognize not only the original data stored in a computer system and their averages but also the data distribution features obtained by editing the original data, and that extract and evaluate the causes of low yield in semiconductors and the like on the basis of those feature values.
[0003]
The invention further relates to a data analysis method and a data analysis apparatus that evaluate the accuracy and the like of data analysis results, in order to cope with cases where multiple explanatory variables become confounded with one another (cease to be independent) so that significant differences are hard to extract, and to obtain more efficient and reliable analysis results.
[0004]
[Prior art]
The discussion proceeds using yield analysis of semiconductor data as an example. In particular, when reference data for deciding measures to improve quality and productivity are to be obtained from analysis results, as in process data analysis, the accuracy and reliability of those results are important; the present inventors have already filed an application on this point (Japanese Patent Application No. 2001-127534). To find the causes of yield loss as quickly as possible and take countermeasures, data analysis is performed on equipment history, test results, design information, various measurement data, and the like to find the factors that affect yield and the further factors that affect those factors.
[0005]
In data analysis, the quantity under analysis, such as a yield value, is called the objective variable, and the equipment history, test results, design information, various measurement data, and the like that may be its causes are called explanatory variables. Various statistical methods are applied in the analysis; applying data mining as one of them makes it possible to extract valuable information and regularities that are hard to discern in large volumes of varied data.
[0006]
In order to analyze failure factors of a semiconductor device, it is important to analyze the collected data from multiple angles on a scientific basis and to extract more significant differences. For this purpose, the values of the original data stored in the computer system and their average values have conventionally been used. However, it is sometimes difficult to extract failure factors and the like from a group of intricately intertwined original data. In such cases, if a characteristic data distribution exists in the various measurement results, yields, and the like of the chips within a wafer surface or the wafers within a lot, the analysis of defect data can sometimes proceed based on that distribution.
[0007]
[Problems to be solved by the invention]
However, although conventional computer systems accumulate original data such as yield values and electrical characteristic values, they accumulate almost none of the characteristic data distributions across multiple chips in a wafer surface or multiple wafers in a lot. Therefore, the engineer needs to edit the original data and obtain the data distribution status using various statistical analysis tools, chart creation tools, and the like. The engineer then needs to recognize the aggregation and tendencies of the data by comparing the obtained distribution status with his or her own experience and know-how. It is therefore difficult to grasp objectively the feature quantities of the distribution of a large amount of original data. In addition, there is a problem that accurate results cannot be obtained even if analysis proceeds based on distribution feature quantities that include the engineer's subjectivity.
[0008]
Also, in the past, engineers looked at the data distribution status obtained with various statistical analysis tools, chart creation tools, and the like, and expressed the feature quantities of the data distribution as discrete values, for example whether a feature shows an “increase” or a “decrease”, whether it has a periodicity of 2 or not, or whether it has a periodicity of 3 or not. For this reason, information indicating degree, such as how strongly a certain feature is present (or absent) or how strongly it tends to increase (or decrease), is lost. In addition, when a feature has, for example, a certain degree of periodicity of 2 and a certain degree of periodicity of 3 at the same time, there is a problem that only the stronger periodicity is recognized.
[0009]
Also, considering various test results, measurement results, and combinations thereof, the possible combinations of data distribution features are enormous, and it is extremely difficult to investigate all of them. In addition, the failure factor corresponding to the extracted data distribution feature is not always known, and there is a problem that much experience and know-how are required to determine the unknown failure factor.
[0010]
In practice, applying data mining to, for example, yield analysis of semiconductor data does not always work. In fields such as finance and distribution, the number of data records runs to the millions while the number of explanatory variables is at most several tens, so highly accurate analysis results have been obtained. In semiconductor process data analysis, however, the number of data records is small (at most about 200 lots for the same product type), whereas the number of explanatory variables (device history, in-process inspection values, and so on) reaches several hundred. As a result, multiple explanatory variables are no longer independent of one another, and simple data mining may not provide reliable results. This is briefly described below, taking yield analysis of semiconductor data as an example.
[0011]
In process data analysis where the number of explanatory variables (for example, LSI manufacturing process data) is large compared with the number of data records (for example, the number of lots), multiple explanatory variables become confounded with each other (are no longer independent), and there are many cases where the problem cannot be narrowed down sufficiently by statistically significant differences. Even when data mining (regression tree analysis or the like) is applied, if this problem exists, considerable time is needed to check the accuracy of the analysis result and the range over which it can be trusted.
[0012]
FIG. 40 shows the relationship between the lot flow and an abnormal manufacturing apparatus. A white circle “◯” indicates a normal device 101, and a black circle “●” indicates an abnormal device 102. Arrows indicate the lot flow. In analyzing differences between devices in LSI manufacturing data, the data on the devices used for each lot in each process are examined to find which manufacturing device, used in which manufacturing step, affects the yield most.
[0013]
FIG. 41 shows a yield distribution (box-whisker diagram) by device in a certain process according to the prior art. For each device used in each manufacturing process, the yield value of the lot is displayed in a box-whisker chart and checked for each process, and the process and the apparatus with the most significant difference are identified.
[0014]
However, the number of processes now reaches several hundred, so this technique requires a large number of man-hours. It is also difficult to make a judgment when there is no clear difference or when the conditions are intricately intertwined. To deal with these problems, a data mining technique based on regression tree analysis is effective; it divides the devices used into a group for which the value of the objective variable is high and a group for which it is low. However, when the lots flow with the apparatus used for each lot fixed, as shown in FIG. 42, the abnormal apparatus 102 indicated by the black circle “●” may not be uniquely identified. That is, when the independence between explanatory variables is low, the variable that shows a large significant difference in the two-way division of the set is not necessarily the one with the “true significant difference”.
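The identification problem described above can be illustrated with a small sketch. It assumes a hypothetical `history` mapping (lot id to the device used in each step); the step names, device names, and lot ids are invented for illustration and do not come from FIG. 42. When two steps partition the lots identically, a two-way division on either step produces the same groups, so the data alone cannot say which step contains the abnormal device.

```python
def lots_by_device(history, process):
    """Group lot ids by the device each lot used in the given process."""
    groups = {}
    for lot_id, used in history.items():
        groups.setdefault(used[process], set()).add(lot_id)
    return groups


def same_partition(history, proc_a, proc_b):
    """True when both processes divide the lots into exactly the same groups."""
    parts_a = {frozenset(s) for s in lots_by_device(history, proc_a).values()}
    parts_b = {frozenset(s) for s in lots_by_device(history, proc_b).values()}
    return parts_a == parts_b


# Hypothetical data: in fixed_flow every lot that uses A1 also uses B1.
fixed_flow = {
    "L1": {"step1": "A1", "step2": "B1"},
    "L2": {"step1": "A1", "step2": "B1"},
    "L3": {"step1": "A2", "step2": "B2"},
    "L4": {"step1": "A2", "step2": "B2"},
}
# In mixed_flow the devices of the two steps are combined freely.
mixed_flow = {
    "L1": {"step1": "A1", "step2": "B1"},
    "L2": {"step1": "A1", "step2": "B2"},
    "L3": {"step1": "A2", "step2": "B1"},
    "L4": {"step1": "A2", "step2": "B2"},
}

confounded = same_partition(fixed_flow, "step1", "step2")
separable = not same_partition(mixed_flow, "step1", "step2")
```

With `fixed_flow`, any yield difference attributed to `step1` could equally be attributed to `step2`; with `mixed_flow`, the two steps can be told apart.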
[0015]
The above is the confounding in the device used in each process of semiconductor manufacturing, but the same applies to the confounding of the set divided into two as the regression tree analysis result. That is, the same can be said for a set of device groups in which a high yield is generated in each process and device groups in which a low yield is generated. The confounding of the two-divided sets is the same when the explanatory variable is a continuous value.
[0016]
The present invention has been made in view of the above problems, and an object thereof is to provide a data analysis method that edits original data to extract data distribution feature quantities such as various statistical values, and objectively recognizes and utilizes them to automatically extract factors and the like. A further object is to provide a data analysis method and a data analysis apparatus that can clarify the degree of confounding between a plurality of explanatory variables.
[0017]
[Means for Solving the Problems]
In order to achieve the above object, the present invention automatically and quantitatively evaluates and extracts the various data distribution features existing in an original data group accumulated in a computer system, and, by sequentially selecting and analyzing each of the extracted feature quantities, automatically and quantitatively evaluates and extracts the factors that cause each feature quantity. According to the present invention, a large amount of information, such as data distribution tendencies, characteristic patterns, and relationships between data, is extracted from the original data, so that regularities and significant differences that are hard to discern because they are buried in diverse data are extracted efficiently and quantitatively on a scientific basis.
[0018]
In addition, in order to clarify the degree of confounding between a plurality of explanatory variables, there is provided a data analysis method having a step of preparing data results for the explanatory variables and the objective variable, a step of calculating the degree of confounding and/or independence between a plurality of explanatory variables based on the data results, and a step of performing data mining using the degree of confounding and/or independence. Calculating the degree of confounding and/or independence between explanatory variables makes it possible to grasp clearly how strongly the explanatory variables are confounded. If regression tree analysis is performed on this basis, the degree of confounding of the explanatory variables can be evaluated quantitatively from the two-way division of the set produced by the regression tree analysis, and it becomes possible to identify the explanatory variables that deserve attention because they are confounded with the explanatory variable that shows a large significant difference at the first branch of the regression tree.
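This section does not give the formula for the degree of confounding, so the following is only an illustrative stand-in rather than the method of the invention. It assumes each explanatory variable has already been divided in two (for example by a regression tree split), with each lot labelled `True` for the high group and `False` for the low group, and scores how closely two such divisions coincide. The function name and the 0-to-1 scale are assumptions made for this sketch.

```python
def confounding_degree(split_a, split_b):
    """Agreement between two two-way divisions of the same lots.

    split_a and split_b map lot id -> True (high group) or False (low group).
    Returns 1.0 when the divisions coincide (or are exact mirror images)
    and 0.0 when they agree on exactly half of the lots.
    This is an assumed, illustrative measure.
    """
    lots = sorted(set(split_a) & set(split_b))
    if not lots:
        raise ValueError("the two splits share no lots")
    same = sum(split_a[lot] == split_b[lot] for lot in lots) / len(lots)
    return 2 * max(same, 1 - same) - 1


# Hypothetical splits of four lots on three explanatory variables.
split_x = {"L1": True, "L2": True, "L3": False, "L4": False}
split_y = {"L1": False, "L2": False, "L3": True, "L4": True}   # mirror image of x
split_z = {"L1": True, "L2": False, "L3": True, "L4": False}   # unrelated to x

degree_xy = confounding_degree(split_x, split_y)
degree_xz = confounding_degree(split_x, split_z)
```

A corresponding degree of independence could be taken as one minus this value, again only as an assumption for illustration.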
[0019]
[Detailed Description of the Invention]
Embodiments 1 and 2 of the present invention will be described below in detail with reference to the drawings.
[0020]
(Embodiment 1)
FIG. 1 is a diagram illustrating an example of a hardware configuration of a computer system that is used to implement the data analysis method according to the first embodiment of the present invention. As shown in FIG. 1, the computer system includes an input device 1, a central processing device 2, an output device 3, and a storage device 4.
[0021]
FIG. 2 is a block diagram showing an example of a functional configuration of the data analysis apparatus realized by the computer system having the configuration shown in FIG. As shown in FIG. 2, the data analyzing apparatus has an original data group 42 including a database 41 including a plurality of original data. The database 41 is constructed in the storage device 4 of the computer system shown in FIG.
[0022]
In addition, the data analysis apparatus includes means 21 for quantitatively evaluating and extracting one or more data distribution features existing in the original data group 42; means 22 for selecting a feature quantity to be analyzed from the extracted data distribution features; means 23 for extracting a rule file 24 of features, regularities, and the like latent in the data distribution by performing data mining by regression tree analysis with the selected data distribution feature quantity as the objective variable; and an analysis tool group 27, such as a statistical analysis component 25 and a chart creation component 26, for analyzing the distribution characteristics of the original data using the extracted rule file 24.
[0023]
Each of the above means 21, 22, 23 and the analysis tool group 27 is realized by executing a program for performing each processing in the central processing unit 2. The extracted rule file 24 is stored in the storage device 4 and is output by the output device 3 such as a display device or a printing device. Decision making 5 is made based on the analysis result by the analysis tool group 27.
[0024]
Further, the means 23 for extracting the rule file 24 also performs data mining on the original data in the original data group 42, on the data distribution features extracted by the means 21 for extracting data distribution features, and on the analysis results of the analysis tool group 27. The analysis tool group 27 likewise analyzes the original data in the original data group 42, the data distribution features extracted by the means 21, and the output results of the analysis tool group 27 itself. The analysis results of the analysis tool group 27 are fed back to the means 22 for selecting the data distribution feature quantity to be analyzed and to the original data group 42. The output of the means 21 for extracting data distribution features is also fed back to the original data group 42.
[0025]
FIG. 3 shows an example in which the data distribution feature amount extracted by the means 21 for extracting the data distribution feature shown in FIG. 2 is output in the CSV format. Since each feature amount is obtained independently for each record, it is handled independently. For example, as shown in FIG. 3, since each feature amount is automatically output in the CSV format, it is possible to efficiently perform a significant analysis for each feature amount. Here, the data serving as the feature amount may be not only the original data value and its average value, but also the maximum value, minimum value, range or standard deviation value of the original data. Further, the data of the feature amount may be data periodicity or similarity to a specific model.
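As a rough illustration of the CSV output described above, the following sketch writes one row of feature quantities per lot using Python's standard `csv` module. The layout of FIG. 3 is not reproduced here: the `lot` column, the lot ids, and the numeric values are assumptions, and the column names `VT_N2_ave` and `VT_N2_s` are borrowed from the feature names that appear later in this description.

```python
import csv
import io


def features_to_csv(rows, columns):
    """Serialize per-lot feature quantities as CSV text, one line per lot.

    rows maps lot id -> {feature name: value}; columns fixes the column order.
    """
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["lot"] + columns, lineterminator="\n")
    writer.writeheader()
    for lot_id in sorted(rows):
        record = {"lot": lot_id}
        record.update({name: rows[lot_id][name] for name in columns})
        writer.writerow(record)
    return buf.getvalue()


# Hypothetical feature quantities for two lots.
rows = {
    "LOT002": {"VT_N2_ave": 0.73, "VT_N2_s": 0.35},
    "LOT001": {"VT_N2_ave": 0.86, "VT_N2_s": 0.08},
}
csv_text = features_to_csv(rows, ["VT_N2_ave", "VT_N2_s"])
```

Because each feature quantity occupies its own column, any one of them can later be picked out as the objective variable without re-editing the original data.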
[0026]
Here, various features can be extracted depending on the structure of the target data group. Which feature quantities to extract for a given purpose may be built into the program in advance, or a file defining the feature quantities to be extracted may be prepared and read in. Each feature quantity is defined not as a discrete value but as a continuous value indicating how strong the feature is. Since no information is lost through discretization as in the prior art, better analysis results can be expected.
[0027]
As an example of the data distribution feature amount, the feature of the intra-lot data distribution in the yield analysis of semiconductor data will be described. FIG. 4 is a list showing information focusing on the variation of the attribute value of the wafer. Here, the independent variable is a wafer number, and the dependent variable is original data such as yield, category yield or various measured values.
[0028]
Although the invention is not particularly limited to this, the example shown in FIG. 4 defines the following 16 feature items: (1) the center of the entire data distribution, (2) the variation in the data, (3) the correlation of the data with the wafer number, (4) the y-axis intercept of the linear approximation, (5) the slope of the data with respect to the wafer number, (6) the strength of period 2 (wafers), (7) the strength of period 3 (wafers), (8) the strongest period in the lot, (9) the difference in average value between the first half wafers and the second half wafers, (10) the difference in variation between the first half wafers and the second half wafers, (11) the difference in correlation between the first half wafers and the second half wafers, (12) the difference in the y-axis intercept of the linear approximation between the first half wafers and the second half wafers, (13) the difference in slope between the first half wafers and the second half wafers, (14) the strength of period 2 (wafers) in the latter half of the lot, (15) the strength of period 3 (wafers) in the latter half of the lot, and (16) the strongest period in the latter half of the lot. The feature quantity of each feature item is obtained in lot units.
[0029]
The 16 feature items defined here are briefly described below. The feature quantity of (1) is the average value of the yield, various measured values, and the like of all wafers in the same lot. The feature quantity of (2) is the standard deviation of the yield, various measured values, and the like of all wafers in the same lot. The feature quantity of (3) is the correlation coefficient between the wafer number of the wafers in the same lot and the yield, various measured values, and the like; the method of calculating this correlation coefficient is determined in advance according to the object and purpose of the analysis. The feature quantity of (4) is the value of the y-axis intercept when the wafer number of the wafers in the same lot is x, the yield, various measured values, and the like are y, and the relationship between x and y is approximated by the linear expression y = b · x + a.
[0030]
The feature quantity of (5) is the population regression coefficient when the wafer number of the wafers in the same lot is x, the yield, various measured values, and the like are y, and the relationship between x and y is approximated by the linear expression y = b · x + a. The feature quantity of (6) is the ratio between the variance of the yield, various measured values, and the like of all wafers in the same lot and the variance of the yield, various measured values, and the like of the wafer group with wafer numbers 1, 3, 5, ... or the wafer group with wafer numbers 2, 4, 6, .... The feature quantity of (7) is the ratio between the variance of the yield, various measured values, and the like of all wafers in the same lot and the variance of the yield, various measured values, and the like of the wafer group with wafer numbers 1, 4, 7, ..., the wafer group with wafer numbers 2, 5, 8, ..., or the wafer group with wafer numbers 3, 6, 9, .... The feature quantity of (8) is the period value at which the variance ratio is largest among the variance ratios for periods 2 (wafers) and 3 (wafers), obtained for the wafers in the same lot as described in (6) and (7) above, and the variance ratios for periods 4 (wafers), 5 (wafers), ..., obtained in the same way.
[0031]
The feature quantity of (9) is obtained by dividing all wafers in the same lot (for example, 50 wafers) into a first half (for example, 25 wafers) and a second half (for example, 25 wafers), and is the difference between the average value of the yield, various measured values, and the like of the first half wafer group and that of the second half wafer group. The wafers are divided into a first half and a second half in this way because their device histories differ in the semiconductor manufacturing process. The feature quantity of (10) is the difference between the standard deviation of the yield, various measured values, and the like of the first half wafer group and that of the second half wafer group. The feature quantity of (11) is the difference between the correlation coefficient of the first half wafer group and the correlation coefficient of the second half wafer group. The feature quantity of (12) is the difference between the y-axis intercept of the linear approximation for the first half wafer group and that for the second half wafer group.
[0032]
The feature quantity of (13) is the difference between the population regression coefficient of the first half wafer group and the population regression coefficient of the second half wafer group. The feature quantity of (14) is the variance ratio for period 2, obtained as in (6) above, for the latter half of the lot. The feature quantity of (15) is the variance ratio for period 3, obtained as in (7) above, for the latter half of the lot. The feature quantity of (16) is the period value at which the variance ratio is largest for the latter half of the lot, as in (8) above. For the first half of the lot, the strength of period 2 (wafers), the strength of period 3 (wafers), and the strongest period may likewise be defined as in (14) to (16) above. In addition to the feature items exemplified here, various feature items are defined according to the analysis target and purpose.
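The feature items above can be sketched in code as follows, using only Python's standard `statistics` module. This is a minimal illustration under stated assumptions, not the patented implementation: the text does not fix which variance is the numerator of the ratio or how the wafer groups are combined, so `period_strength` assumes total variance divided by the mean variance of the wafer groups, and the function names, dictionary keys, and sample values are all hypothetical. Only items (1) to (10) are shown.

```python
from statistics import mean, pstdev, pvariance


def linear_fit(xs, ys):
    """Least-squares fit y = b*x + a; returns (a, b, r)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    r = sxy / (sxx * syy) ** 0.5 if syy > 0 else 0.0
    return a, b, r


def period_strength(values, period):
    """Assumed definition: variance of all wafers divided by the mean
    variance of the wafer groups formed by taking every `period`-th wafer."""
    groups = [values[k::period] for k in range(period)]
    within = mean(pvariance(g) for g in groups)
    return pvariance(values) / within if within > 0 else float("inf")


def lot_features(values, max_period=5):
    """values[i] is the measurement of wafer number i + 1 in one lot."""
    xs = list(range(1, len(values) + 1))
    a, b, r = linear_fit(xs, values)
    strengths = {p: period_strength(values, p) for p in range(2, max_period + 1)}
    half = len(values) // 2
    first, second = values[:half], values[half:]
    return {
        "ave": mean(values),                              # item (1)
        "s": pstdev(values),                              # item (2)
        "r": r,                                           # item (3)
        "a": a,                                           # item (4)
        "b": b,                                           # item (5)
        "p2": strengths[2],                               # item (6)
        "p3": strengths[3],                               # item (7)
        "strongest": max(strengths, key=strengths.get),   # item (8)
        "ave_d": mean(first) - mean(second),              # item (9)
        "s_d": pstdev(first) - pstdev(second),            # item (10)
    }


# Hypothetical lot of eight wafers whose values alternate between two levels.
vt = [1.0, 2.0, 1.1, 2.1, 0.9, 1.9, 1.0, 2.0]
features = lot_features(vt, max_period=3)
```

Every value is continuous, so a lot can show a strong period-2 component and a weaker period-3 component at the same time. Note that with this assumed ratio, multiples of a true period (for example period 4 when the data alternate with period 2) also score highly, which is why the usage example limits `max_period` to 3.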
[0033]
By defining and analyzing feature items as described above, it may become possible to extract significant differences between the devices used even when, as in the past, no such difference can be extracted simply from original data values such as yield values and their averages. For example, FIG. 5 shows, for a plurality of lots, an example that focuses on the variation in the yield value of the wafers within each lot (corresponding to (2) above).
[0034]
In the example shown in FIG. 5, lot group 6 (left of the dash-dot line in FIG. 5), which used the No. 21, No. 22, No. 24, or No. 25 machine in Step 1, and lot group 7 (right of the dash-dot line in FIG. 5), which used the No. 28 machine, have almost the same average wafer yield and overall distribution (variation between lots). Therefore, no obvious significant difference is recognized even if analysis is performed using the wafer yield value or its average value. In contrast, when attention is paid to the data distribution feature quantity that is the variation of the wafer yield value within each lot, an obvious significant difference is recognized between the two lot groups 6 and 7. The item of interest is not limited to item (2); it may be item (1), any of items (3) to (16), or another item.
[0035]
As described above, since each data distribution feature quantity exists as an attribute value of each lot, the means 22 for selecting the data distribution feature quantity sequentially selects each data distribution feature quantity as an objective variable. Then, the means 23 for extracting the rule file 24 by performing data mining performs regression tree analysis using each data distribution feature amount as an objective variable in order. As a result, it is possible to determine the cause of the data distribution feature amount, and it is possible to extract more failure factors than in the conventional analysis method. At that time, the process of sequentially selecting the data distribution feature quantity and the regression tree analysis process are automatically executed according to the program, so the engineer does not have to consider which data distribution feature quantity to select as the objective variable, Analysis can be performed efficiently. This is particularly effective when it is unclear what to analyze.
[0036]
Further, even when a single record exhibits several feature patterns, for example both a period-2 (wafers) strength and a period-3 (wafers) strength, both features can be evaluated. As a result, loss of information is avoided and an analysis result that reflects the actual situation can be obtained.
[0037]
Next, the flow of the data analysis method according to the first embodiment of the present invention will be described. FIG. 6 is a flowchart showing an outline of an example of the data analysis method according to the present invention. As shown in FIG. 6, when this data analysis method is started, the data to be analyzed, such as yield values and various measured values, are first selected and extracted from the original data group 42 (step S1). Subsequently, processing to extract one or more data distribution features is performed on the extracted data (step S2).
[0038]
Then, the data distribution feature quantity to be analyzed is selected, and data mining such as regression tree analysis is performed using it as an objective variable (step S3). When regression tree analysis is completed for all the data distribution features extracted in step S2 (step S4), the analysis result is output and the engineer confirms the result (step S5). Then, the engineer makes a decision based on the analysis result (step S6).
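The loop of steps S3 and S4 can be sketched as below. This is a simplified stand-in that finds only the first two-way split of a regression tree for each feature quantity, choosing the division of devices that most reduces the sum of squared deviations; a full regression tree analysis and the evaluation statistics mentioned later are not reproduced. The function names, the record layout, and the numeric values are assumptions; the device and process names are borrowed from the example later in this description.

```python
from statistics import mean


def sum_sq(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(lots, process, target):
    """Best two-way grouping of the devices of one process for one target.

    Devices are ordered by their mean target value and every cut point in
    that order is tried. Returns (gain, low_devices, high_devices), or None
    when the process has fewer than two devices.
    """
    by_device = {}
    for lot in lots:
        by_device.setdefault(lot[process], []).append(lot[target])
    devices = sorted(by_device, key=lambda d: mean(by_device[d]))
    total = sum_sq([lot[target] for lot in lots])
    best = None
    for k in range(1, len(devices)):
        low = [v for d in devices[:k] for v in by_device[d]]
        high = [v for d in devices[k:] for v in by_device[d]]
        gain = total - sum_sq(low) - sum_sq(high)
        if best is None or gain > best[0]:
            best = (gain, set(devices[:k]), set(devices[k:]))
    return best


def analyze(lots, processes, targets):
    """Take each feature quantity in turn as the objective variable (S3, S4)."""
    results = {}
    for target in targets:
        candidates = [(best_split(lots, p, target), p) for p in processes]
        candidates = [(s, p) for s, p in candidates if s is not None]
        (gain, low, high), process = max(candidates, key=lambda c: c[0][0])
        results[target] = {"process": process, "low": low, "high": high, "gain": gain}
    return results


# Hypothetical per-lot records: devices used and one feature quantity.
lots = [
    {"Field_Ox": "PM1", "F_diffusion": "F5", "VT_N2_s": 0.08},
    {"Field_Ox": "PM2", "F_diffusion": "F7", "VT_N2_s": 0.35},
    {"Field_Ox": "PM3", "F_diffusion": "F7", "VT_N2_s": 0.09},
    {"Field_Ox": "PM2", "F_diffusion": "F5", "VT_N2_s": 0.33},
    {"Field_Ox": "PM1", "F_diffusion": "F7", "VT_N2_s": 0.07},
    {"Field_Ox": "PM3", "F_diffusion": "F5", "VT_N2_s": 0.10},
]
results = analyze(lots, ["Field_Ox", "F_diffusion"], ["VT_N2_s"])
```

Passing several feature names in `targets` repeats the analysis for each one automatically, which corresponds to the point that the engineer does not have to decide in advance which feature quantity to examine.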
[0039]
Next, in order to clarify the features of the present invention, a data analysis method using data distribution feature quantities is described with a specific example. In general, even within the wafer group of a single lot, the yield value and electrical characteristic values differ from wafer to wafer, and these values show various variation patterns with respect to the wafer number. Yield values and electrical characteristic values are stored in wafer units. Therefore, in the first embodiment, variation patterns of, for example, the yield value with respect to the wafer number can be analyzed over a plurality of lots as data distribution features. Here, an example is shown in which multifaceted analysis is performed on the test-substitute Nch transistor threshold voltage VT_N2 (hereinafter simply referred to as VT_N2), an important electrical characteristic that greatly affects product performance. It is also assumed that the history of the devices used in each manufacturing process affects the yield.
[0040]
FIG. 7 is a characteristic diagram showing the relationship between the yield and VT_N2, but the yield and VT_N2 seem to be irrelevant at first glance. FIG. 8 is a histogram of VT_N2 data obtained from all wafers, and FIG. 9 is a box-and-whisker diagram showing all VT_N2 data for each wafer number. It is difficult to extract statistically significant differences from the results shown in these figures.
[0041]
FIG. 10 is a diagram showing an example of a result of regression tree analysis in which the objective variable is the average value of VT_N2 in each lot and the explanatory variables are the names of the devices used in each process, and an example of the evaluation statistical value list showing the reliability information of this analysis is shown in a separate figure. According to the result of this regression tree analysis, as shown in FIG. 10, what is most significant for the variation of VT_N2 is whether No. 11 or No. 13 was used as the second wiring_device, or whether No. 12, No. 14, No. 17, or No. 18 was used. FIG. 12 is a box-whisker diagram in which all VT_N2 data are displayed for each device used as the second wiring_device. No significant difference is observed in FIG. 12. The evaluation statistical value list, which is output together with the regression tree diagram, will be described later.
[0042]
FIG. 13 is a diagram showing an example of a result of regression tree analysis in which the objective variable is the value of VT_N2 of each wafer and the explanatory variables are the names of the devices used in each process, and FIG. 14 is a diagram showing an example of the evaluation statistical value list for this regression tree analysis. According to the result of this regression tree analysis, as shown in FIG. 13, what is most significant for the variation of VT_N2 is which of No. 11, No. 12, or No. 13 was used as the 2CON process_device. FIG. 15 is a box-whisker diagram in which all VT_N2 data are displayed for each device used as the 2CON process_device, but no significant difference is observed in this figure either.
[0043]
On the other hand, the cause of the defect can be clarified by extracting and analyzing the data distribution feature of VT_N2 as follows. FIG. 16 is a chart showing an example of a file in which each feature quantity of VT_N2 is defined for each lot in the CSV format. This file is output by means 21 for extracting data distribution features of the apparatus shown in FIG.
[0044]
FIG. 17 shows histograms of the feature quantities of the various intra-lot distributions of VT_N2, based on the CSV format data described above. Here, 12 of the 16 feature items (1) to (16) described in relation to FIG. 4 are extracted for VT_N2: (1) the average value (VT_N2_ave), (2) the standard deviation value (VT_N2_s), (3) the correlation coefficient with the wafer number (VT_N2_r), (4) the y-axis intercept of the linear approximation (VT_N2_a), (5) the population regression coefficient (VT_N2_b), (6) the periodicity at an interval of 2 wafer numbers (VT_N2_2), (7) the periodicity at an interval of 3 wafer numbers (VT_N2_3), (9) the difference in average value between the first half wafers and the second half wafers (VT_N2_ave_d), (10) the difference in standard deviation value between the first half wafers and the second half wafers (VT_N2_s_d), (11) the difference in correlation coefficient between the first half wafers and the second half wafers (VT_N2_r_d), (12) the difference in the y-axis intercept of the linear approximation between the first half wafers and the second half wafers (VT_N2_a_d), and (13) the difference in population regression coefficient between the first half wafers and the second half wafers (VT_N2_b_d).
[0045]
From FIG. 17, it can be seen that all the feature quantities vary considerably. Therefore, if regression tree analysis is performed using each feature quantity as an objective variable, a factor causing a significant difference in each feature quantity, that is, a failure factor or the like can be analyzed.
[0046]
In order to obtain analysis results efficiently with the data distribution features as the analysis target, a file that defines, for each lot, the names of the devices used in each process and the extracted feature quantities is created as input data for the regression tree analysis, as shown in FIG. 18. This file is obtained by combining, for the same lot number, the rule file 24 (see FIG. 19), which defines the names of the devices used in each process and the lot yield as input data for analyzing yield fluctuation factors by regression tree analysis, with the file that defines the feature quantities.
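Combining the two files on the lot number amounts to a key-based join, sketched below with plain dictionaries. The record layout, lot ids, and values are assumptions made for illustration and are not the contents of FIG. 18 or FIG. 19; lots that appear in only one of the inputs are dropped here, which is also an assumption.

```python
def join_by_lot(device_rows, feature_rows):
    """Merge per-lot device history and per-lot feature quantities.

    Both arguments map lot number -> {column name: value}. Only lots present
    in both inputs are kept.
    """
    joined = {}
    for lot_id in sorted(set(device_rows) & set(feature_rows)):
        joined[lot_id] = {**device_rows[lot_id], **feature_rows[lot_id]}
    return joined


# Hypothetical inputs: devices used and lot yield, and extracted feature quantities.
device_rows = {
    "LOT001": {"Field_Ox": "PM1", "yield": 81.2},
    "LOT002": {"Field_Ox": "PM2", "yield": 80.5},
    "LOT003": {"Field_Ox": "PM3", "yield": 82.0},
}
feature_rows = {
    "LOT001": {"VT_N2_ave": 0.86, "VT_N2_s": 0.08},
    "LOT002": {"VT_N2_ave": 0.73, "VT_N2_s": 0.35},
}
analysis_input = join_by_lot(device_rows, feature_rows)
```

Each joined record then carries both the explanatory variables (device names) and the candidate objective variables (feature quantities) for one lot.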
[0047]
FIG. 20 is a regression tree diagram showing the result of regression tree analysis performed, based on the file shown in FIG. 18, to extract the factors behind the variation of VT_N2 occurring within a lot, with the standard deviation value (VT_N2_s) of (2) above as the objective variable and the names of the devices used in each process as the explanatory variables. FIG. 21 is a diagram showing an example of an evaluation statistical value list for this regression tree analysis.
[0048]
According to the regression tree diagram shown in FIG. 20, what is most significant for the variation of the standard deviation value (VT_N2_s) of VT_N2 is whether PM1 or PM3 was used as the Field_Ox process_device, or whether PM2 was used. When the S ratio, t value, and other values in the evaluation statistical value list are compared between the Field_Ox process_device, which appears first (S ratio = 0.3767, t = 3.081), and the second wiring_device and DRY process_device, which appear second and later (S ratio > 0.43, t < 2.2), the significance of the Field_Ox process_device is judged to be high.
[0049]
In order to confirm this, FIG. 22 shows the distribution of the value of VT_N2 for each apparatus used in the Field_Ox process in a box-whisker diagram. In FIG. 22, a clear significant difference is confirmed between PM1 or PM3 and PM2. In other words, the effectiveness of the method of the present invention for performing analysis using the distribution characteristics of the original data was confirmed. The evaluation statistical value list, S ratio, and t value will be described later.
[0050]
FIG. 23 is a histogram showing the distribution of VT_N2 of all wafers that used the PM1 or PM3 machine in the Field_Ox process, which the regression tree analysis described above identified as the problem process. FIGS. 24 to 26 are histograms showing the distribution of VT_N2 of the wafers of different lots that used the PM1 or PM3 machine in the Field_Ox process. FIG. 27 is a histogram showing the distribution of VT_N2 of all wafers that used the PM2 machine in the Field_Ox process, and FIGS. 28 to 30 are histograms showing the distribution of VT_N2 of the wafers of different lots that used the PM2 machine in the Field_Ox process.
[0051]
As shown in FIGS. 23 and 27, the average value of VT_N2 of all wafers using the PM1 or PM3 machine (μ = 0.8560) and the average value of VT_N2 of all wafers using the PM2 machine (μ = 0.7302) are substantially the same. For this reason, it is difficult to extract a significant difference even if analysis is performed using an average value as in the prior art.
[0052]
However, when the standard deviation value of VT_N2 for all wafers using the PM1 or PM3 machine (σ = 0.0835) is compared with that for all wafers using the PM2 machine (σ = 0.351), there is clearly a significant difference. Therefore, by focusing on data distribution features such as the variation of the original data, as in the first embodiment, a significant difference that could not be extracted by analyzing the original data alone can be newly extracted as a failure factor.
[0053]
A detailed investigation of PM2 carried out on the basis of the analysis results described above showed that the temperature distribution difference in its furnace was larger than that of PM1 and PM3. It was further found that this was caused by thermocouple deterioration, and the periodic inspection method was optimized. Incidentally, regression tree analysis using the lot yield as the objective variable and the names of the devices used in each process as explanatory variables did not reveal that PM2 was the cause of the yield reduction. That is, a low-yield factor that did not appear clearly in the yield value was clarified by the method of the present invention, which analyzes the factors causing a significant difference in the standard deviation of the electrical characteristic value within a lot. In the first embodiment, editing of accumulated data, execution of regression tree analysis, and quantitative evaluation of the result by a unique method are executed automatically.
[0054]
FIG. 31 is a regression tree diagram showing the result of a regression tree analysis based on the file shown in FIG. 18, with (6) the periodicity of interval 2 in wafer number (VT_N2_2) as the objective variable and the names of the devices used in each process as the explanatory variables. FIG. 32 is a diagram showing an example of the evaluation statistical value list for this regression tree analysis. According to the regression tree diagram shown in FIG. 31, the most significant factor in whether the within-lot variation of VT_N2 has a periodicity of 2 is whether the F7 machine was used as the device in the F diffusion process or whether the F5, F6, F8, or F9 machine was used. It can be seen that wafers using the F7 machine show a periodicity of 2 that is stronger by about 50%.
[0055]
To confirm this, FIG. 33 shows the distribution of the periodicity-of-2 value (VT_N2_2) for each device used in the F diffusion process as a box-and-whisker plot. In FIG. 33, a clear significant difference is confirmed between the F7 machine and the F5, F6, F8, and F9 machines. Note that the periodicity of 2 cannot be seen in the box-and-whisker plot of FIG. 9, in which all VT_N2 data is displayed for each wafer number. This example also confirms the effectiveness of the method of the present invention, which performs analysis using the distribution features of the original data.
[0056]
FIGS. 34 to 36 are histograms each showing the within-lot variation of VT_N2 for one lot of wafers using the F7 machine in the F diffusion process, which is the problem process according to the regression tree analysis shown in FIGS. 31 and 32. FIGS. 37 to 39 are histograms showing the within-lot variation of VT_N2 for different lots of wafers using the F5, F6, F8, or F9 machine in the F diffusion process. From the above analysis results, the factor behind the within-lot variation of VT_N2 was extracted, and attention was focused on the F diffusion process device, in which wafers are processed alternately in two chambers. It was in fact found that one of the two chambers was generating many particles.
[0057]
Incidentally, in the first embodiment, each extracted feature quantity is selected in turn as the objective variable, with the same explanatory variables, and the regression tree analysis is executed automatically, so that the factors affecting each feature quantity are extracted. In particular, when it is not clear what should be analyzed, all conceivable feature quantities are extracted and a regression tree analysis is performed with each of them as the objective variable. Various analysis results are thereby obtained as described above, and the item regarded as having the largest significant difference among them is taken as a candidate countermeasure item for yield improvement. In this way, many significant differences that are not easily extracted by conventional analysis methods can be extracted efficiently.
[0058]
Here, the regression tree analysis and the evaluation statistical value list will be described. First, the regression tree analysis is described briefly. Regression tree analysis takes a set of records, each consisting of explanatory variables indicating a plurality of attributes and an objective variable affected by them, and identifies the attributes and attribute values that most affect the objective variable. The means 23 that performs data mining (the regression tree analysis engine) outputs rules indicating the characteristics and regularity of the data as the rule file 24.
[0059]
The regression tree analysis is realized by repeatedly dividing the set into two based on the parameter value (attribute value) of each explanatory variable (attribute). When a set is divided, let S0 be the sum of squares of the objective variable before the division and S1 and S2 be the sums of squares of the objective variable in the two sets after the division. The explanatory variable of the records to be divided and its parameter value are then chosen so that ΔS given by equation (1) is maximized.
[0060]
ΔS = S0− (S1 + S2) (1)
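The split search built on equation (1) can be sketched as follows. This is a minimal illustration, assuming a categorical explanatory variable, records held as dictionaries, and an exhaustive search over two-way groupings; the function names and the sample records are hypothetical, and a production regression tree engine would normally use a faster search.

```python
from itertools import combinations

def sum_of_squares(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def best_binary_split(records, attribute, target):
    """Return (delta_s, left_values) for the two-way grouping of `attribute`
    values that maximizes delta_s = S0 - (S1 + S2).

    Returns (0.0, None) when the attribute has fewer than two distinct values.
    """
    levels = sorted({r[attribute] for r in records})
    if len(levels) < 2:
        return (0.0, None)
    s0 = sum_of_squares([r[target] for r in records])
    best = (0.0, None)
    # Keep the first level on the left side so mirror-image splits are not repeated.
    first, rest = levels[0], levels[1:]
    for k in range(len(rest)):          # the right side always keeps at least one level
        for extra in combinations(rest, k):
            left = {first, *extra}
            y1 = [r[target] for r in records if r[attribute] in left]
            y2 = [r[target] for r in records if r[attribute] not in left]
            delta = s0 - (sum_of_squares(y1) + sum_of_squares(y2))
            if delta > best[0]:
                best = (delta, left)
    return best

# Hypothetical records: within-lot standard deviation per lot and the machine used.
records = [
    {"machine": "PM1", "sd": 0.08}, {"machine": "PM1", "sd": 0.09},
    {"machine": "PM2", "sd": 0.35}, {"machine": "PM2", "sd": 0.33},
    {"machine": "PM3", "sd": 0.07}, {"machine": "PM3", "sd": 0.10},
]
delta_s, left_group = best_binary_split(records, "machine", "sd")
print(delta_s, left_group)
```

Applying the same function recursively to each of the two resulting subsets would give the successive branch points of the tree.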
[0061]
The explanatory variable and parameter value obtained here correspond to a branch point in the regression tree. The same processing is then repeated for the divided sets to examine the influence of the explanatory variables on the objective variable. The above is the generally well-known method of regression tree analysis. To assess the clarity of the set division in more detail, the following parameters (a) to (d) are used in addition to ΔS for the quantitative evaluation of the regression tree analysis results. These parameters are output as the evaluation statistical value list.
[0062]
(a) S ratio:
This is the reduction rate of the sum of squares due to the set division, a parameter indicating how far the sum of squares was reduced by the division. The smaller this value, the greater the effect of the set division and the more clearly the set is divided, and therefore the larger the significant difference.
[0063]
S ratio = ((S1 + S2) / 2) / S0 (2)
[0064]
(b) t value:
The regression tree analysis engine divides a set into two, and the t value is used to test the difference between the means (/X1, /X2) of the two divided sets. Here, "/" indicates an overline. The statistical t-test serves as a measure of the significance of the difference in the mean of the objective variable between the divided sets. If the degrees of freedom, that is, the numbers of data, are the same, a larger t value means that the set is divided more clearly and the significant difference is larger.
[0065]
If there is no significant difference between the variances of the divided sets, the t value is obtained by equation (3); if there is a significant difference between the variances, the t value is obtained by equation (4). Here, N1 and N2 are the numbers of elements of set 1 and set 2, respectively, /X1 and /X2 are the means of the respective sets after the division, and S1 and S2 are the sums of squares of the objective variable of the respective sets after the division.
[0066]
[Expression 1]

[0067]
[Expression 2]
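Expressions 1 and 2 are not reproduced in this text. As a reading aid only, and assuming that they take the standard forms of the pooled-variance t statistic and of Welch's t statistic, with S1 and S2 being sums of squared deviations as defined above, equations (3) and (4) would read as follows. The absolute value in the numerator is also an assumption, made so that a larger t corresponds to a clearer division.

```latex
t = \frac{\left|\bar{X}_1 - \bar{X}_2\right|}
         {\sqrt{\dfrac{S_1 + S_2}{N_1 + N_2 - 2}\left(\dfrac{1}{N_1} + \dfrac{1}{N_2}\right)}}
\qquad (3)

t = \frac{\left|\bar{X}_1 - \bar{X}_2\right|}
         {\sqrt{\dfrac{S_1}{N_1 (N_1 - 1)} + \dfrac{S_2}{N_2 (N_2 - 1)}}}
\qquad (4)
```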

[0068]
(c) Difference in the mean value of the objective variable between the divided sets:
The greater this value, the greater the significant difference.
[0069]
(d) Number of data in each divided set:
The smaller the difference between the two numbers, the smaller the influence of abnormal values (noise).
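The evaluation values (a) to (d), together with ΔS, can be computed for a single two-way division as in the sketch below. The S ratio follows equation (2) as written. The two t formulas are the standard pooled-variance and Welch forms, which is an assumption because the expressions for equations (3) and (4) are not reproduced in this text. The function name, the dictionary keys, and the sample values are hypothetical.

```python
import math

def sum_of_squares(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def split_statistics(y1, y2, equal_variance=True):
    """Evaluation values for one two-way split of the objective variable.

    y1, y2: lists of objective-variable values in the two divided sets.
    Each list needs at least two values, and the data must not all be equal.
    """
    n1, n2 = len(y1), len(y2)
    m1, m2 = sum(y1) / n1, sum(y2) / n2
    s0 = sum_of_squares(y1 + y2)              # sum of squares before the division
    s1, s2 = sum_of_squares(y1), sum_of_squares(y2)
    if equal_variance:
        # Pooled-variance form (assumed form of equation (3)).
        se = math.sqrt((s1 + s2) / (n1 + n2 - 2) * (1 / n1 + 1 / n2))
    else:
        # Welch form (assumed form of equation (4)).
        se = math.sqrt(s1 / (n1 * (n1 - 1)) + s2 / (n2 * (n2 - 1)))
    return {
        "delta_s": s0 - (s1 + s2),            # equation (1)
        "s_ratio": ((s1 + s2) / 2) / s0,      # equation (2)
        "t": abs(m1 - m2) / se,
        "mean_diff": abs(m1 - m2),            # parameter (c)
        "counts": (n1, n2),                   # parameter (d)
    }

# Hypothetical yields (%) for the two sets produced by one division.
stats = split_statistics([70, 72, 68, 74], [80, 82, 78, 84])
print(stats)
```

Sorting candidate divisions by `delta_s` in descending order would give a ranking of the kind used in an evaluation statistical value list.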
[0070]
According to the first embodiment described above, not only the original data and its average value, as in the past, but also the various data distribution features present in the original data group, such as the variation of the original data and the within-lot variation pattern, are extracted. By selecting each feature quantity in turn as the objective variable and analyzing it, the factors that caused each feature quantity are automatically and quantitatively evaluated and extracted, so that much information can be extracted from a more multifaceted perspective. Therefore, relationships and significant differences that were buried in a wide variety of data and were conventionally difficult to discern can be extracted objectively, efficiently, and quantitatively.
[0071]
Further, in the first embodiment, the series of procedures from feature quantity extraction to factor extraction is performed automatically. Therefore, once predetermined settings have been made, the factors causing variation in a semiconductor manufacturing line or the like can be monitored automatically and constantly.
[0072]
The present invention is not limited to the first embodiment described above and has a wide range of application. For example, when a new product type is launched, there are many causes of deterioration and low-yield lots occur frequently. In such a case, investigating the causal process from the data distribution features, in addition to investigating it from the original data and its average value, makes it possible to find causes that had been hidden and to narrow down the causes.
[0073]
(Embodiment 2)
FIG. 56 is a diagram showing an example of a functional configuration of a data analysis apparatus into which data mining according to Embodiment 2 of the present invention is introduced. The data mining unit 1703 performs a process of extracting features and regularity hidden in the data based on the individual original data extracted from each database 1702 in the original data group 1701, and creates a rule file 1704. The analysis tool group 1705 includes a statistical analysis component 1706, a chart creation component 1707, and the like, and analyzes individual original data extracted from the database 1702 based on the rule file 1704.
[0074]
The analysis result is fed back to the analysis tool group 1705 and the data mining unit 1703. The data mining unit 1703 performs data mining based on the analysis result of the analysis tool group 1705 and the original data group 1701. The analysis tool group 1705 performs analysis based on the rule file 1704, the individual original data extracted from the database 1702, and its own analysis result. A decision-making unit 1708 makes a decision based on the analysis result of the analysis tool group 1705.
[0075]
When data mining is applied to yield data analysis, the data mining results are used to determine yield improvement measures, to judge whether the measures should be implemented, and to predict their effect. This requires quantitative evaluation and accuracy of the data mining results.
[0076]
Among discriminant tree analyses, which are one method of data mining, regression tree analysis is particularly effective. One advantage of regression tree analysis is that the results are output as easy-to-understand rules, which can be expressed in a general language or a database language such as SQL. Therefore, these results can be used effectively with their reliability and accuracy taken into account, enabling effective decision making and action (that is, countermeasures and the like) based on them.
[0077]
FIG. 43 shows an example of the format of the data used as input for the regression tree analysis. Each record corresponds to one wafer number and contains the device used 411 in each manufacturing process, the electrical characteristic data 412, and the wafer yield 413. The explanatory variables 401 are the device used 411 and the electrical characteristic data 412. The objective variable 402 is the yield 413. For example, it is assumed that the device used 411 and the electrical characteristic data 412 affect the yield. FIGS. 44 and 45 show the regression tree diagram and the evaluation statistical value list, which are the regression tree analysis results based on this data.
[0078]
FIG. 44 is a regression tree diagram showing the results of regression tree analysis. The root node n0 is divided into two nodes n1 and n2. Node n1 is divided into two nodes n3 and n4. Node n2 is divided into two nodes n5 and n6. Node n6 is divided into two nodes n7 and n8.
[0079]
FIG. 45 shows the evaluation statistical values of the explanatory variables at the time of the first two-way division. For example, for the whole set, the average value Ave of the objective variable is 75, the standard deviation s is 12, and the number of data N is 1000. Each of the lists 601 to 604 shows, from the left, the rank by significant difference, the S ratio, the t value, the difference in the mean value of the objective variable between the divided sets, the number of data in each divided set, the attribute name (explanatory variable) on which the set is divided, and the relationship between the attribute values (parameter values) of the two divided sets and their objective variable. The lists 601 to 604 are the candidates for the division, ranked by the value of ΔS in equation (1) for the explanatory variable to be divided on, and are arranged in descending order of significant difference (ΔS). In FIG. 44, the node n0 is divided into the nodes n1 and n2 based on the first candidate 601.
[0080]
When the set n0 of all the wafers in FIG. 44 is divided into the two sets n1 and n2 based on the evaluation value ΔS of equation (1), the factor with the most significant influence on the yield is whether AM1 or AM2 was used in process A, and the yield is better with the latter. Repeating the same set division on the divided sets yields this regression tree diagram. For the wafer group using AM2 in process A and CM2 in process C, an electrical characteristic value RSP of 90 or less is the most effective condition (high yield).
[0081]
FIG. 46 is equivalent to FIG. 44 and shows the relationship between the yield of each divided wafer set and the device used in the specific process or the electrical characteristic data. Explanatory variables appearing at higher levels of the regression tree diagram of FIG. 44 have a greater influence on the objective variable. The average yield of all wafers is 74.8%, but the regression tree analysis automatically extracts the fact that such characteristics and regularity exist when the wafers are divided into several sets according to the devices used and the electrical characteristic data, and this serves as a clue for the yield analysis.
[0082]
In the regression tree diagram of FIG. 44, the upper two levels are all due to differences in the devices used. Therefore, in the analysis using all the wafers, device differences have the largest effect on the yield even when compound conditions are included, and the electrical characteristic data appears to be less effective. However, FIGS. 44 and 46 show that, for the wafer group using AM2 in process A and CM2 in process C, RSP has the greatest effect on the yield.
[0083]
Next, an example of calculating the two-part confounding degree and the two-part independence degree will be described. In regression tree analysis, the degree of confounding (state of entanglement, degree of independence) among the set divisions made to find the explanatory variable most significant for the objective variable is grasped statistically, and other explanatory variables that are entangled with an explanatory variable found to have a significant difference are clarified. A method of calculating the two-part confounding degree and the two-part independence degree will be described with reference to FIG. 47.
[0084]
First, one of the explanatory variables is chosen as the reference explanatory variable 801 against which the degree of confounding is evaluated.
[0085]
Second, a table is constructed in which each record has "L" or "H" as the data value for each explanatory variable. Here, H means that the record belongs to the set in which the objective variable has the higher value when the set is divided into two in the regression tree analysis, and L means that it belongs to the set in which the objective variable has the lower value. When the set is divided into two, L or H is determined for every explanatory variable of every record.
[0086]
Third, with the reference explanatory variable 801 as the basis, the agreement of L and H with each comparative explanatory variable 802 is evaluated: letting Na be the number of records whose L or H value matches and N be the total number of records, the two-part confounding degree DEP is defined as in equation (5). DEP ranges from −1 to 1: it is 1 if the variables are completely confounded, 0 if they are not confounded at all, and −1 if they are completely confounded in the opposite sense.
[0087]
DEP = (2 × Na / N) −1 (5)
[0088]
Further, based on the two-part confounding degree DEP, the two-part independence degree IND is defined as in equation (6). IND ranges from 0 to 1: it is 1 if the variables are completely independent and 0 if they are not independent at all.
[0089]
IND = 1− | DEP | (6)
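Equations (5) and (6) can be sketched directly in code. The function names are hypothetical, and the L/H labels are assumed to be supplied as two equal-length sequences, one for the reference explanatory variable and one for the comparative explanatory variable.

```python
def two_part_confounding(reference, comparison):
    """DEP = 2 * Na / N - 1, where Na counts records whose L/H labels match."""
    if len(reference) != len(comparison) or not reference:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(reference)
    na = sum(1 for r, c in zip(reference, comparison) if r == c)
    return 2 * na / n - 1

def two_part_independence(dep):
    """IND = 1 - |DEP|."""
    return 1 - abs(dep)

# Three limiting cases on four hypothetical records.
for comparison in ("HHLL", "LLHH", "HLHL"):
    dep = two_part_confounding("HHLL", comparison)
    print(comparison, dep, two_part_independence(dep))
```

Identical labels give DEP = 1, fully reversed labels give DEP = −1, and labels that agree on exactly half of the records give DEP = 0 and therefore IND = 1.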
[0090]
Fourth, the two-part confounding degree DEP and the two-part independence degree IND described above are obtained between one reference explanatory variable 801 and another explanatory variable 802 and are used as an evaluation measure between the explanatory variables. Which explanatory variable is taken as the reference is arbitrary, but in terms of usefulness it is effective to choose one for which the regression tree analysis showed a large significant difference with respect to the objective variable, particularly in the set division at the top level.
[0091]
Fifth, obtaining the two-part confounding degree DEP and the two-part independence degree IND makes it possible to evaluate quantitatively how far the assignment of each comparative explanatory variable 802 to the L and H sets differs from that of the reference explanatory variable 801.
[0092]
By obtaining the two-part confounding degree and/or the two-part independence degree, the degree of confounding of the explanatory variables can be evaluated quantitatively from the results of the two-way divisions of the regression tree analysis. Combined with the regression tree analysis, this makes it possible to automatically extract other explanatory variables that are confounded with an explanatory variable found by the regression tree analysis to have a large significant difference.
[0093]
The two-part confounding degree can be evaluated for any explanatory variable covered by the regression tree analysis. In view of its effectiveness, however, an explanatory variable listed near the top of the first-division candidates in the evaluation statistical value list is taken as the reference explanatory variable, the degree to which each of the other explanatory variables is confounded with it is grasped statistically, and the noteworthy explanatory variables that are confounded with an explanatory variable having a large significant difference are extracted. An explanatory variable whose degree of confounding with the reference explanatory variable 801 is to be analyzed is a comparative explanatory variable 802, and both are selected from the evaluation statistical value list. A calculation example of the two-part confounding degree and the two-part independence degree is described next.
[0094]
In FIG. 47, the horizontal axis indicates the wafer number 803, the comparative explanatory variables 802, the reference explanatory variable 801, and the yield 804, and the vertical axis indicates the high yield group 811 of the reference explanatory variable, the low yield group 812 of the reference explanatory variable, the calculation formula 813 for the two-part confounding degree, the two-part confounding degree 814, and the two-part independence degree 815.
[0095]
The item serving as the basis for comparison is chosen as the reference explanatory variable 801 from among the top candidate items in the evaluation statistical value list. In FIG. 47, ST3 is the reference explanatory variable 801. The other explanatory variables are referred to as comparative explanatory variables 802; in FIG. 47, these are ST1, ST2, and WET2. Each comparative explanatory variable 802 is compared with the reference explanatory variable 801. For the explanatory variables ST1, ST2, ST3, and WET2, "L" (the low yield group) is indicated by hatching and "H" (the high yield group) by the absence of hatching.
[0096]
ST3, the reference explanatory variable 801, can be divided according to its attribute value into the high yield group 811 of the reference explanatory variable and the low yield group 812 of the reference explanatory variable. The high yield group 811 and the low yield group 812 each contain 10 elements.
[0097]
Next, Na is calculated by counting, for each explanatory variable, how many elements of its own two-way division into a high yield group and a low yield group fall in the same group as under the reference explanatory variable. For example, for ST1, a comparative explanatory variable 802, 10 elements of its high yield group are included in the high yield group 811 of the reference explanatory variable and 2 elements of its low yield group are included in the low yield group 812 of the reference explanatory variable. That is, the number of elements for which the comparative explanatory variable ST1 and the reference explanatory variable ST3 belong to the same group is Na = 10 + 2 = 12.
[0098]
The formula obtained by substituting Na into equation (5) is shown as the calculation formula 813 for the two-part confounding degree. Here, the number of data N is 20. The result of this calculation is shown as the two-part confounding degree 814, and the value obtained by equation (6) is shown as the two-part independence degree 815. The two-part confounding degree 814 and the two-part independence degree 815 are shown below each column in FIG. 47.
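As a check on the arithmetic for ST1 against ST3, substituting Na = 12 and N = 20 into equations (5) and (6) gives the values below. These are computed here from the two equations and the counts stated in the text; the entries printed in FIG. 47 itself are not reproduced.

```latex
\mathrm{DEP} = \frac{2 \times 12}{20} - 1 = 0.2,
\qquad
\mathrm{IND} = 1 - \left| 0.2 \right| = 0.8
```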
[0099]
There are three basic ways of using the two-part confounding degree and the two-part independence degree. Relationships among explanatory variables that were previously difficult to distinguish can be obtained as quantitative information as follows.
[0100]
(1) Confirming the range of significant explanatory variables:
Candidates that are confounded with a highly significant candidate are identified and are judged also to be significant explanatory variables. There is no fixed criterion for the degree of confounding, but it can be judged by comparison with the values for the other explanatory variables. In addition, when a candidate that need not be considered from a technical standpoint appears at the top, the candidates confounded with it can be made clear. This can be further confirmed by deleting the meaningless candidate and running the analysis again.
[0101]
(2) Confirming highly independent candidates and applying them:
The independence of every candidate from the other candidates is checked. If a candidate has a sufficiently high degree of independence from the others, it becomes clear that the yield difference due to this candidate exists independently of the other candidates. Further, the same discriminant tree analysis is performed on each of the groups divided by this candidate and the results are compared; if the same analysis result is obtained for both, the reliability of the analysis result is understood to be high. If the analysis results differ, it is conceivable either that there are explanatory variables that affect the yield under conditions combined with the candidate considered independent, or that the result depends on peculiar data (for example, because the number of data is small).
[0102]
(3) Discriminant tree analysis for confounded candidates:
When a candidate considered important is confounded with the first-branch candidate, it is unlikely to appear in the branches below the first branch. In that case, the data is divided by another candidate with a high degree of independence, the discriminant tree analysis is performed on each divided group, and the results are compared. If the results are similar, the important candidate cannot be separated from the first candidate, but the analysis itself is considered highly reliable. Conversely, if the candidate considered important appears and the results differ, this result should also be taken into account, and the data needs to be analyzed further so that the candidate considered important and the first candidate can be analyzed separately.
[0103]
Next, a case is described in which a regression tree analysis is performed with the device history and electrical characteristic values as the explanatory variables and the wafer yield as the objective variable, and the two-part confounding degree and the two-part independence degree are obtained for the top 12 candidates of the first branch of the regression tree analysis result.
[0104]
The regression tree diagram and the evaluation statistical value list obtained in the second embodiment are shown in FIGS. In FIG. 48, the node n900 is divided into nodes n901 to n914. FIG. 49 shows statistical values for evaluation of the top 12 explanatory variables at the time of the first two divisions. As a result, 12 candidates 1001 to 1012 of the set branch are listed.
[0105]
In FIG. 50, ST1, listed as the first candidate 1001 in FIG. 49, is the reference explanatory variable 1101, the other explanatory variables in the evaluation statistical value list are the comparative explanatory variables 1102, and the two-part confounding degree 1111 and the two-part independence degree 1112 are shown.
[0106]
In FIG. 51, ST3, listed as the third set branch candidate 1003 in FIG. 49, is the reference explanatory variable 1201, the other explanatory variables in the evaluation statistical value list are the comparative explanatory variables 1202, and the two-part confounding degree 1211 and the two-part independence degree 1212 are shown.
[0107]
The explanatory variables whose two-part confounding degree in FIG. 50 exceeds 0.75 are ST2, ST4, ST5, ST6, ST10, and WET2. Although these do not appear in the regression tree diagram, they may be factors affecting the yield. On the other hand, FIG. 51 shows that ST3 has a high two-part independence degree.
[0108]
FIG. 51 takes ST3, which has a high two-part independence degree in FIG. 50, as the reference explanatory variable and shows the two-part confounding degree 1211 and the two-part independence degree 1212 with respect to the other eleven explanatory variables. It shows that ST3 has a high degree of independence from every other explanatory variable.
[0109]
FIGS. 52 and 53 list the two-part confounding degree and the two-part independence degree, together with their average values, for the top twelve explanatory variables regarded as having a significant difference in the regression tree analysis, so that they can be grasped at a glance. The bottom row of FIG. 52 shows the average value 1301 of the two-part confounding degree, and the bottom row of FIG. 53 shows the average value 1401 of the two-part independence degree.
[0110]
Next, since the difference in the devices used in ST3 has been found to affect the yield independently of the other explanatory variables, the regression tree analysis is performed separately for the wafer group processed by the ST3 device group with poor yield (defective wafer group, using S3M2 and S3M3) and the wafer group processed by the ST3 device group with good yield (good wafer group, using S3M1 and S3M4). The resulting regression tree diagrams are described next.
[0111]
FIG. 54 is a regression tree diagram showing a regression tree analysis result based on a defective wafer group, and includes nodes n1500 to n1506. FIG. 55 is a regression tree diagram showing the results of regression tree analysis using a good wafer group, and includes nodes n1600 to n1606.
[0112]
The first branch for the defective wafer group in FIG. 54 is the same as that for the whole wafer group in FIG. 48. It is therefore presumed that the uppermost level of the regression tree diagram for the whole wafer group is considerably influenced by the extremely bad wafers in the defective wafer group, and this is one factor that makes the analysis difficult. In the good wafer group in FIG. 55, factors that had been difficult to see because of the defective device in the ST3 process were newly found.
[0113]
According to the second embodiment, using the two-part confounding degree and the two-part independence degree makes the degree of confounding of the explanatory variables clearer and, in combination with the regression tree analysis, makes it possible to identify the noteworthy explanatory variables that are confounded with the problem explanatory variable having a large significant difference at the first branch of the regression tree.
[0114]
Further, by applying the grouping of highly independent explanatory variables and performing the regression tree analysis again, the accuracy (reliability) and analysis efficiency of the regression tree analysis can be improved, and more detailed analysis can be performed.
[0115]
The above-described embodiment can be realized by a computer executing a program. Further, means for supplying the program to the computer, for example, a recording medium such as a CD-ROM recording such a program, or a transmission medium such as the Internet for transmitting such a program can also be applied as an embodiment of the present invention. The above program, recording medium, and transmission medium are included in the category of the present invention.
[0116]
The above-described embodiments are merely examples of implementation in carrying out the present invention, and the technical scope of the present invention should not be construed as being limited thereto. That is, the present invention can be implemented in various forms without departing from the technical idea or the main features thereof.
[0117]
(Supplementary Note 1) A data analysis method comprising:
a step of editing original data values to quantitatively evaluate and extract one or more data distribution feature quantities present in the original data group;
a step of selecting and analyzing an arbitrary data distribution feature quantity from the extracted data distribution feature quantities; and
a step of making a decision based on the obtained analysis result.
[0118]
(Supplementary Note 2) A data analysis method comprising:
a step of editing original data values to quantitatively evaluate and extract two or more data distribution feature quantities present in the original data group;
a step of sequentially selecting and analyzing the individual extracted data distribution feature quantities; and
a step of making a decision based on the obtained analysis result.
[0119]
(Supplementary Note 3) The data analysis method according to Supplementary Note 1 or 2, wherein the data distribution feature quantity is represented by a continuous value representing the degree of the feature.
[0120]
(Supplementary Note 4) The data analysis method according to any one of Supplementary Notes 1 to 3, wherein the data distribution feature quantities are mutually independent for each record.
[0121]
(Supplementary note 5) The data analysis method according to any one of supplementary notes 1 to 4, wherein the analysis is performed by data mining using the data distribution feature amount as an objective variable.
[0122]
(Supplementary Note 6) The data analysis method according to Supplementary Note 5, wherein the individual data distribution feature quantities are stored in a file for each record, the same data distribution feature quantity for some or all of the records is sequentially selected from the file as the objective variable, and a regression tree analysis is performed.
[0123]
(Supplementary Note 7) The data analysis method according to any one of Supplementary Notes 1 to 6, wherein the respective steps are performed automatically by executing, on a computer system, software assembled to carry them out in sequence.
[0124]
(Supplementary Note 8) The data analysis method according to any one of Supplementary Notes 1 to 7, wherein one of the data distribution feature quantities is the y-axis intercept a obtained when the order of arrangement of the original data is x, the original data value is y, and the relationship between x and y is approximated by the linear expression y = b · x + a.
[0125]
(Supplementary Note 9) The data analysis method according to any one of Supplementary Notes 1 to 7, wherein one of the data distribution feature quantities is the slope b obtained when the order of arrangement of the original data is x, the original data value is y, and the relationship between x and y is approximated by the linear expression y = b · x + a.
[0126]
(Supplementary Note 10) The data analysis method according to any one of Supplementary Notes 1 to 7, wherein one of the data distribution feature quantities is the strength of a specific periodicity of the original data values with respect to the order of arrangement of the original data.
[0127]
(Additional remark 11) One of the said data distribution feature-value is a value which shows the strongest periodicity of the original data value with respect to the order of the arrangement | sequence of original data, Any one of Additional remark 1-7 characterized by the above-mentioned. Data analysis method.
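The two periodicity feature amounts above can be illustrated with a simple stand-in measure. The notes here do not state how the strength of periodicity is computed, so this sketch assumes the autocorrelation coefficient at a given interval (lag) as the strength, and takes the interval with the largest coefficient as the strongest periodicity. The function names are hypothetical.

```python
def periodicity_strength(values, interval):
    """Autocorrelation coefficient of `values` at the given interval.

    Assumption: this coefficient stands in for the "strength of a
    specific periodicity"; the source does not fix the formula.
    """
    n = len(values)
    if not 1 <= interval < n:
        raise ValueError("interval must satisfy 1 <= interval < len(values)")
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    if var == 0:
        return 0.0  # constant data has no periodicity
    cov = sum(
        (values[i] - mean) * (values[i + interval] - mean)
        for i in range(n - interval)
    )
    return cov / var


def strongest_interval(values, max_interval):
    """Interval in 2..max_interval with the largest periodicity strength."""
    return max(
        range(2, max_interval + 1),
        key=lambda k: periodicity_strength(values, k),
    )


# Illustrative values that alternate every other position (interval 2).
series = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
strength_2 = periodicity_strength(series, 2)
strongest = strongest_interval(series, 4)
```

For the alternating series, the strength at interval 2 is 0.75 and interval 2 is reported as the strongest within the range 2 to 4.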
[0128]
(Supplementary note 12) A data analysis method comprising:
(a) a step of preparing data results of explanatory variables and an objective variable;
(b) a step of calculating a degree of confounding and/or a degree of independence between a plurality of explanatory variables based on the data results; and
(c) a step of performing data mining using the degree of confounding and/or independence.
[0129]
(Supplementary note 13) The data analysis method according to supplementary note 12, wherein the step (b) calculates the degree of confounding and/or independence in units of the sets divided into two by regression tree analysis.
[0130]
(Supplementary note 14) The data analysis method according to supplementary note 13, wherein the step (b) selects, by regression tree analysis, a plurality of explanatory variables that are factors of divisions having a significant difference, and calculates the degree of confounding and/or independence between the plurality of explanatory variables.
[0131]
(Supplementary note 15) The data analysis method according to supplementary note 14, wherein, when calculating the degree of confounding and/or independence between a reference explanatory variable and another explanatory variable, the step (b) calculates the degree of confounding and/or independence based on the ratio of matching to mismatching data between the explanatory variables in each of the sets divided into two by regression tree analysis.
[0132]
(Supplementary note 16) The data analysis method according to supplementary note 15, wherein the step (c) performs data mining by selecting explanatory variables based on the degree of confounding and/or independence.
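The idea of measuring confounding from matching and mismatching data can be sketched as below. The notes above say only that the degree is based on the ratio of matches to mismatches between the two-way divisions; the specific formulas used here (confounding = |matches − mismatches| / total, independence = 1 − confounding) and the function name are assumptions made for illustration, not the formulas of the invention.

```python
def confounding_and_independence(reference_side, comparison_side):
    """Compare two two-way divisions of the same records.

    Each argument lists, per record, which of the two sets (0 or 1) the
    record falls into when divided by that explanatory variable.

    Assumed formulas (not taken from the source):
      confounding  = |matches - mismatches| / total
      independence = 1 - confounding
    The absolute value makes the result independent of how the two
    sets happen to be labelled.
    """
    if len(reference_side) != len(comparison_side) or not reference_side:
        raise ValueError("inputs must be non-empty and of equal length")
    total = len(reference_side)
    matches = sum(r == c for r, c in zip(reference_side, comparison_side))
    mismatches = total - matches
    confounding = abs(matches - mismatches) / total
    return confounding, 1.0 - confounding


# Illustrative divisions of four records.
reference = [0, 0, 1, 1]
same_split = confounding_and_independence(reference, [0, 0, 1, 1])
swapped_labels = confounding_and_independence(reference, [1, 1, 0, 0])
unrelated = confounding_and_independence(reference, [0, 1, 0, 1])
```

Under these assumed formulas, an explanatory variable that divides the records exactly as the reference variable does is fully confounded with it (degree 1), while one that agrees on only half of the records is fully independent (degree 0). Explanatory variables could then be kept or dropped for data mining according to a threshold on either value.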
[0133]
(Supplementary note 17) A data analysis apparatus comprising:
calculation means for calculating a degree of confounding and/or a degree of independence between a plurality of explanatory variables based on data results of explanatory variables and an objective variable; and
data mining means for performing data mining using the degree of confounding and/or independence.
[0134]
(Supplementary note 18) The data analysis apparatus according to supplementary note 17, wherein the calculation means calculates the degree of confounding and/or independence in units of the sets divided into two by regression tree analysis.
[0135]
(Supplementary note 19) The data analysis apparatus according to supplementary note 18, wherein the calculation means selects, by regression tree analysis, a plurality of explanatory variables that are factors of divisions having a significant difference, and calculates the degree of confounding and/or independence between the plurality of explanatory variables.
[0136]
(Supplementary note 20) The data analysis apparatus according to supplementary note 19, wherein, when calculating the degree of confounding and/or independence between a reference explanatory variable and another explanatory variable, the calculation means calculates the degree of confounding and/or independence based on the ratio of matching to mismatching data between the explanatory variables in each of the sets divided into two by regression tree analysis.
[0137]
(Supplementary note 21) The data analysis apparatus according to supplementary note 20, wherein the data mining means performs data mining by selecting explanatory variables based on the degree of confounding and/or independence.
[0138]
(Supplementary note 22) A computer-readable recording medium on which is recorded a program for causing a computer to execute:
(a) a procedure for preparing data results of explanatory variables and an objective variable;
(b) a procedure for calculating a degree of confounding and/or a degree of independence between a plurality of explanatory variables based on the data results; and
(c) a procedure for performing data mining using the degree of confounding and/or independence.
[0139]
【Effect of the invention】
According to the present invention, the various data distribution features present in the original data group accumulated in the computer system are extracted, and each feature amount is sequentially selected and analyzed, so that the factors behind each feature amount are automatically and quantitatively evaluated and extracted. Examining the data from these multiple aspects makes it possible to extract more information (trends, characteristic patterns, relationships between data, and so on). Consequently, relationships and significant differences that were buried in large and varied data and were conventionally difficult to discern can be extracted objectively, efficiently, and quantitatively.
[Brief description of the drawings]
FIG. 1 is a diagram showing an example of a computer system used in Embodiment 1 of the present invention.
FIG. 2 is a block diagram showing an example of a functional configuration of a data analysis apparatus realized by the computer system having the configuration shown in FIG. 1.
FIG. 3 is a chart showing, in CSV format, the feature amounts obtained by extracting data distribution features in the first embodiment of the present invention.
FIG. 4 is a chart showing information focusing on variation of wafer attribute values as a feature of data distribution within a lot when yield analysis of semiconductor data is performed in the first embodiment of the present invention.
FIG. 5 is a diagram showing a variation in yield values of wafers in a lot for a plurality of lots.
FIG. 6 is a flowchart showing an outline of an example of a data analysis method according to the first embodiment of the present invention.
FIG. 7 is a characteristic diagram showing a relationship between yield and VT_N2 as a specific example.
FIG. 8 is a diagram showing a histogram of VT_N2 data obtained from all wafers as a specific example.
FIG. 9 is a diagram showing, as a specific example, a box-and-whisker plot in which all VT_N2 data is displayed for each wafer number.
FIG. 10 is a diagram showing, as a specific example, the result of regression tree analysis performed with the average value of VT_N2 in each lot as the objective variable and the device name used in each process as the explanatory variable.
FIG. 11 is a diagram showing an example of an evaluation statistical value list for the regression tree analysis result shown in FIG. 10.
FIG. 12 is a diagram showing, as a specific example, a box-and-whisker plot in which all VT_N2 data is displayed for each device name used in the second wiring_device.
FIG. 13 is a diagram showing, as a specific example, the result of regression tree analysis performed with the value of VT_N2 of each wafer as the objective variable and the device name used in each process as the explanatory variable.
FIG. 14 is a diagram showing an example of an evaluation statistical value list for the regression tree analysis result shown in FIG. 13.
FIG. 15 is a diagram showing, as a specific example, a box-and-whisker plot in which all VT_N2 data is displayed for each device name used in the 2CON process_device.
FIG. 16 is a chart showing, as a specific example, a file that defines each feature amount of VT_N2 for each lot.
FIG. 17 is a diagram illustrating a histogram of each feature amount of distribution in a lot of VT_N2 as a specific example.
FIG. 18 is a chart showing a file for performing regression tree analysis on the feature amount of distribution in VT_N2 as a specific example.
FIG. 19 is a chart showing, as a specific example, an input file for analyzing yield fluctuation factors by regression tree analysis.
FIG. 20 is a diagram showing, as a specific example, the result of regression tree analysis performed with the standard deviation value of VT_N2 in each lot as the objective variable and the device name used in each process as the explanatory variable.
FIG. 21 is a diagram showing an example of an evaluation statistical value list for the regression tree analysis result shown in FIG. 20.
FIG. 22 is a diagram showing, as a specific example, a box-and-whisker plot in which all VT_N2 data is displayed for each device name used in the Field_Ox process_device.
FIG. 23 is a diagram showing a histogram of VT_N2 of all wafers using the PM1 or PM3 machine in the Field_Ox process as a specific example.
FIG. 24 is a diagram showing a VT_N2 histogram of wafers for one lot using the PM1 or PM3 machine in the Field_Ox process as a specific example.
FIG. 25 is a diagram showing a histogram of VT_N2 of wafers for one lot using PM1 or PM3 in the Field_Ox process as a specific example.
FIG. 26 is a diagram showing a VT_N2 histogram of one lot of wafers using the PM1 or PM3 machine in the Field_Ox process as a specific example.
FIG. 27 is a diagram showing a VT_N2 histogram of all wafers using the PM2 machine in the Field_Ox process as a specific example.
FIG. 28 is a diagram showing a VT_N2 histogram of one lot of wafers using the PM2 machine in the Field_Ox process as a specific example.
FIG. 29 is a diagram showing a VT_N2 histogram of one lot of wafers using the PM2 machine in the Field_Ox process as a specific example.
FIG. 30 is a diagram showing a VT_N2 histogram of one lot of wafers using the PM2 machine in the Field_Ox process as a specific example.
FIG. 31 is a diagram showing, as a specific example, the result of regression tree analysis performed with the periodicity value of wafer number interval 2 as the objective variable and the device name used in each process as the explanatory variable.
FIG. 32 is a diagram showing an example of an evaluation statistical value list for the regression tree analysis result shown in FIG. 31.
FIG. 33 is a diagram showing, as a specific example, a box-and-whisker plot in which the periodicity value of wafer number interval 2 is displayed for each device name used in the F diffusion process_device.
FIG. 34 is a diagram showing in-lot fluctuation of VT_N2 for one lot of wafers using the F7 machine in the F diffusion process as a specific example.
FIG. 35 is a diagram showing in-lot variation of VT_N2 for one lot of wafers using the F7 machine in the F diffusion process as a specific example.
FIG. 36 is a diagram showing in-lot variation of VT_N2 for one lot of wafers using the F7 machine in the F diffusion process as a specific example.
FIG. 37 is a diagram showing variation in VT_N2 within a lot for one lot of wafers using No. F5, No. F6, No. F8 or No. F9 in the F diffusion process as a specific example.
FIG. 38 is a diagram showing variation in VT_N2 within a lot for one lot of wafers using No. F5, No. F6, No. F8 or No. F9 in the F diffusion process as a specific example.
FIG. 39 is a diagram showing in-lot variation of VT_N2 for one lot of wafers using No. F5, No. F6, No. F8 or No. F9 in the F diffusion process as a specific example.
FIG. 40 is a diagram illustrating a relationship between a lot flow and an abnormal manufacturing apparatus.
FIG. 41 is a diagram showing yield distribution by device in a certain process according to the prior art.
FIG. 42 is a diagram showing the relationship between the lot flow and the confounding of the abnormal manufacturing apparatus.
FIG. 43 is a diagram illustrating an example of regression tree analysis input data.
FIG. 44 is a diagram illustrating an example of a regression tree.
FIG. 45 is a diagram illustrating an example of an evaluation statistical value list.
FIG. 46 is a diagram showing a relationship among a manufacturing apparatus used, electrical characteristic data, and a yield value.
FIG. 47 is a diagram illustrating a calculation example of the degree of confounding and the degree of independence for a division into two.
FIG. 48 is a diagram illustrating an example of a regression tree.
FIG. 49 is a diagram showing an example of an evaluation statistical value list.
FIG. 50 is a diagram illustrating the degree of confounding and independence between each explanatory variable and the first candidate explanatory variable.
FIG. 51 is a diagram showing the degree of confounding and independence between each explanatory variable and the third candidate explanatory variable.
FIG. 52 is a diagram showing the degree of confounding and the average of all candidates.
FIG. 53 is a diagram showing the degree of independence and the average of all candidates.
FIG. 54 is a regression tree diagram showing the results of regression tree analysis using a defective wafer group.
FIG. 55 is a regression tree diagram showing the results of regression tree analysis using a good wafer group.
FIG. 56 is a diagram illustrating an example of a functional configuration of the data analysis apparatus.
[Explanation of symbols]
21 Means to extract data distribution features
22 Means for selecting feature quantity to be analyzed
23 Data mining means
24 Rule file
25 Statistical analysis components
26 Charting components
27 Analysis tools
41 Database
42 Original data group
101 Normal device
102 Abnormal equipment
401 Explanatory variable
402 Objective variable
411 Equipment used
412 Electrical characteristics data
413 Yield
801 Reference explanatory variable
802 Comparison explanatory variable
803 Wafer number
804 Yield
811 High yield group of the reference explanatory variable
812 Low yield group of the reference explanatory variable
813 Formula for calculating the degree of confounding
814 Degree of confounding for the division into two
815 Degree of independence for the division into two
1701 Original data group
1702 Database
1703 Data mining unit
1704 Rule file
1705 Analysis tools
1706 Statistical analysis components
1707 Chart creation component
1708 Decision-making unit

Claims

A data analysis method in which a computer, having a central processing unit and a storage device that stores an original data group measured in a semiconductor manufacturing process, performs data analysis on failure factors in the semiconductor manufacturing process,
wherein the computer performs:
an extraction process in which the central processing unit extracts a data group to be analyzed from the storage device and stores it in the storage device;
a process in which the central processing unit obtains a data distribution feature amount of the analysis target from the data group to be analyzed extracted by the extraction process and stores it in the storage device;
an execution process in which the central processing unit executes a regression tree analysis, with the data distribution feature amount as the objective variable and the identification information of the apparatus used in a process selected from among the unselected processes in the process group of the semiconductor manufacturing process as the explanatory variable, and stores the execution result in the storage device, the regression tree analysis dividing the objective variable and the explanatory variable into two sets at a process selected from among the unselected processes in the process group, and further dividing the divided objective variable and explanatory variable into two sets at a process selected from among the unselected processes in the process group;
a calculation process in which the central processing unit calculates, for each division in the execution process, an evaluation value between the apparatus that is one explanatory variable after the division and the apparatus that is the other explanatory variable, based on the objective variable before the division and the two objective variables after the division, and stores the evaluation value in the storage device; and
an output process of outputting the evaluation value calculated for each division by the calculation process.

The data analysis method according to claim 1, wherein the data distribution feature amount is represented by a continuous value representing a degree of the feature.

A data analysis program that causes a computer, having a central processing unit and a storage device that stores an original data group measured in a semiconductor manufacturing process, to perform data analysis on failure factors in the semiconductor manufacturing process,
the program causing the computer to execute:
an extraction process in which the central processing unit extracts a data group to be analyzed from the storage device and stores it in the storage device;
a process in which the central processing unit obtains a data distribution feature amount of the analysis target from the data group to be analyzed extracted by the extraction process and stores it in the storage device;
an execution process in which the central processing unit executes a regression tree analysis, with the data distribution feature amount as the objective variable and the identification information of the apparatus used in a process selected from among the unselected processes in the process group of the semiconductor manufacturing process as the explanatory variable, and stores the execution result in the storage device, the regression tree analysis dividing the objective variable and the explanatory variable into two sets at a process selected from among the unselected processes in the process group, and further dividing the divided objective variable and explanatory variable into two sets at a process selected from among the unselected processes in the process group;
a calculation process in which the central processing unit calculates, for each division in the execution process, an evaluation value between the apparatus that is one explanatory variable after the division and the apparatus that is the other explanatory variable, based on the objective variable before the division and the two objective variables after the division, and stores the evaluation value in the storage device; and
an output process of outputting the evaluation value calculated for each division by the calculation process.

A data analysis device, being a computer having a central processing unit and a storage device that stores an original data group measured in a semiconductor manufacturing process, that performs data analysis on failure factors in the semiconductor manufacturing process, the data analysis device performing:
an extraction process in which the central processing unit extracts a data group to be analyzed from the storage device and stores it in the storage device;
a process in which the central processing unit obtains a data distribution feature amount of the analysis target from the data group to be analyzed extracted by the extraction process and stores it in the storage device;
an execution process in which the central processing unit executes a regression tree analysis, with the data distribution feature amount as the objective variable and the identification information of the apparatus used in a process selected from among the unselected processes in the process group of the semiconductor manufacturing process as the explanatory variable, and stores the execution result in the storage device, the regression tree analysis dividing the objective variable and the explanatory variable into two sets at a process selected from among the unselected processes in the process group, and further dividing the divided objective variable and explanatory variable into two sets at a process selected from among the unselected processes in the process group;
a calculation process in which the central processing unit calculates, for each division in the execution process, an evaluation value between the apparatus that is one explanatory variable after the division and the apparatus that is the other explanatory variable, based on the objective variable before the division and the two objective variables after the division, and stores the evaluation value in the storage device; and
an output process of outputting the evaluation value calculated for each division by the calculation process.