JP5696354B2

JP5696354B2 - Reliability judgment device

Info

Publication number: JP5696354B2
Application number: JP2009191925A
Authority: JP
Inventors: 浩中
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2009-08-21
Filing date: 2009-08-21
Publication date: 2015-04-08
Anticipated expiration: 2029-08-21
Also published as: JP2011044592A

Description

The present invention relates to a technique for judging the reliability of an estimation model obtained by a statistical method.

In recent years, efforts have been made to build models for estimating a process (hereinafter, "estimation models") by applying statistical methods to the various sensor data of a process apparatus during processing. Specifically, an estimation model is a prediction formula obtained by preparing in advance teacher data (combinations of objective variables and explanatory variables) consisting of values actually measured on a large number of process evaluation lots, and then applying a statistical method (such as multivariate analysis, multiple regression analysis, or regression analysis) to that teacher data. With an estimation model, the result of a machining process can be judged without actually inspecting it on inspection equipment, and the number of test products put through the line can also be reduced, so such models are expected to remain in wide use.

Conventionally, an estimation model built at a mass-production site is constructed from objective and explanatory variables collected over a fixed, short period. Consequently, while products are being manufactured with the help of the estimation model, it has been necessary to judge at suitable times whether the model remains in an appropriate state.

For example, in an environment where a new product is introduced every year, if the objective-variable and explanatory-variable data for model generation were collected over six months or a year, there would be no opportunity left to use the estimation model by the time it was complete. Explanatory variables that correlate strongly with the objective variable must therefore be selected, and the estimation model built, within a short period such as one month. As a result, the accuracy of the estimation model may degrade if some explanatory variable fluctuates over a period longer than the data-collection period, or if the relationship among the highly correlated explanatory variables changes because of maintenance or the like. It would of course be ideal to rebuild the estimation model from the latest data whenever needed, but in practice teacher data is often hard to collect. It has therefore been necessary to check the reliability of the estimation model itself periodically.

In a typical process estimation model, the reliability of the model itself is not checked. Instead, the difference between the value predicted by the model and the value obtained by actual measurement is calculated, and an alarm is raised when that difference is equal to or greater than a threshold. For example, Patent Document 1 describes building a model that predicts polishing time in a semiconductor CMP (Chemical Mechanical Polishing) apparatus and making an error judgment when the difference between the predicted and measured values is equal to or greater than a threshold. Patent Document 2 discloses a method for extracting correlations among various kinds of data.

Patent Document 1: JP 2008-258510 A
Patent Document 2: JP 2006-86403 A

However, the technique disclosed in Patent Document 1 merely looks at the difference between the predicted and measured values and does not judge the reliability of the estimation model. The problem therefore reduces to one of threshold setting, and it is difficult to tell a genuine alarm from a false one. The technique disclosed in Patent Document 2 aims to find relationships within data collected under various conditions; it only divides and extracts groups of data and cannot judge the reliability of an estimation model.

In view of the above circumstances, an object of the present invention is to provide a reliability judgment device capable of judging the reliability of an estimation model.

One aspect of the present invention is a reliability judgment device comprising: an interval dividing unit that, for an estimation model obtained statistically from a plurality of measured values, divides the range of values of an explanatory variable in the plurality of measured values into a plurality of intervals; an interval determination unit that, for an estimation target whose measured value has been obtained in advance, determines which of the plurality of intervals the value of the explanatory variable of that estimation target belongs to; and a judgment unit that judges the reliability of the estimation model based on the plurality of measured values belonging to the interval determined by the interval determination unit and on the estimation result of the estimation model in that interval.

One aspect of the present invention is a reliability judgment method comprising: an interval dividing step in which an information processing apparatus, for an estimation model obtained statistically from a plurality of measured values, divides the range of values of an explanatory variable in the plurality of measured values into a plurality of intervals; an interval determination step in which the information processing apparatus, for an estimation target whose measured value has been obtained in advance, determines which of the plurality of intervals the value of the explanatory variable of that estimation target belongs to; and a judgment step in which the information processing apparatus judges the reliability of the estimation model based on the plurality of measured values belonging to the interval determined in the interval determination step and on the estimation result of the estimation model in that interval.

One aspect of the present invention is a computer program for reliability judgment that causes a computer to execute: an interval dividing step of dividing, for an estimation model obtained statistically from a plurality of measured values, the range of values of an explanatory variable in the plurality of measured values into a plurality of intervals; an interval determination step of determining, for an estimation target whose measured value has been obtained in advance, which of the plurality of intervals the value of the explanatory variable of that estimation target belongs to; and a judgment step of judging the reliability of the estimation model based on the plurality of measured values belonging to the interval determined in the interval determination step and on the estimation result of the estimation model in that interval.

According to the present invention, the reliability of an estimation model can be judged.

FIG. 1 is a schematic block diagram showing the functional configuration of the reliability judgment device.
FIG. 2 is a diagram showing an example of an objective variable, an explanatory variable, and an estimation model.
FIG. 3 is a flowchart showing the flow of processing of the reliability judgment device.
FIG. 4 is a flowchart showing the flow of the distribution coefficient calculation process performed by the distribution coefficient calculation unit.

FIG. 1 is a schematic block diagram showing the functional configuration of the reliability judgment device 1. The reliability judgment device 1 includes a CPU (Central Processing Unit), a memory, an auxiliary storage device, and the like connected by a bus. By executing a reliability judgment computer program, it functions as a device comprising an interval dividing unit 101, a distribution coefficient calculation unit 102, an interval determination unit 103, a determination unit 104, a contribution rate calculation unit 105, a variance ratio calculation unit 106, a comparison unit 107, and an output unit 108. All or some of the functions of the reliability judgment device 1 may instead be implemented in hardware such as an ASIC (Application Specific Integrated Circuit) or a PLD (Programmable Logic Device).

The estimation model is a prediction formula obtained by preparing in advance a plurality of teacher data items (combinations of objective variables and explanatory variables) consisting of values actually measured on a large number of process evaluation lots, and applying a statistical method (such as multivariate analysis, multiple regression analysis, or regression analysis) to them. An estimation model obtained in this way is used, for example, to characterize how the process changes as production continues, such as pass/fail judgments or the state of wear. After the estimation model has been obtained, data on test evaluation lots (hereinafter, "test data") is obtained periodically. The test data comprises the values of the explanatory variables for a test evaluation lot and the measured value of the objective variable obtained by actually measuring that lot. The reliability judgment device 1 receives as input the estimation model built by a statistical method such as multivariate analysis or multiple regression analysis, all the teacher data used to build it, and the test data; it judges the reliability of the estimation model based on the teacher data and the test data, and outputs the result. To perform this processing, the functional units of the reliability judgment device 1 operate as follows.

The interval dividing unit 101 divides the range of values of the explanatory variable used when the estimation model was generated, forming a plurality of intervals. The distribution coefficient calculation unit 102 calculates the distribution coefficient of each interval by executing the distribution coefficient calculation process. The interval determination unit 103 determines which of the intervals formed by the interval dividing unit 101 the input test data belongs to. The determination unit 104 determines whether the interval to which the test data belongs had a sufficient data distribution when the estimation model was generated. The contribution rate calculation unit 105 calculates the contribution rate of the interval to which the test data belongs. The variance ratio calculation unit 106 calculates the variance ratio of the interval to which the test data belongs. The comparison unit 107 compares the contribution rate with the variance ratio for the interval to which the test data belongs and judges the reliability. The output unit 108 outputs the determination result of the determination unit 104, or the judgment result of the comparison unit 107, to the user of the reliability judgment device 1 through an audio output unit or a display unit. The output unit 108 may also be configured to output data representing the judgment result to another device rather than to a user.

FIG. 2 is a diagram showing an example of an objective variable, an explanatory variable, and an estimation model. The objective and explanatory variables that serve as teacher data are the black points in FIG. 2, and are obtained, for example, when a semiconductor manufacturing apparatus processes process evaluation lots. The estimation model is the straight line in FIG. 2, and is obtained from the objective and explanatory variables by multivariate analysis, multiple regression analysis, regression analysis, or the like. Although FIG. 2 shows one kind of objective variable and one kind of explanatory variable, there may be more than one of each. The term "estimation model" here means a line or curve, linear or nonlinear, that approximates the measured values.
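For the single-variable case shown in FIG. 2, the straight-line model a + bX can be obtained by ordinary least squares. The following is a minimal sketch under that assumption; the function names `fit_line` and `predict` are illustrative and do not appear in the specification, which also allows other statistical methods and nonlinear models.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b).

    Assumes one explanatory variable and one objective variable,
    as in the example of FIG. 2.
    """
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("explanatory variable has no spread")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx           # slope
    a = y_mean - b * x_mean  # intercept
    return a, b


def predict(model, x):
    """Predicted objective value a + b * x for a fitted (a, b) pair."""
    a, b = model
    return a + b * x
```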

FIG. 3 is a flowchart showing the flow of processing of the reliability judgment device 1. First, the interval dividing unit 101 divides the data range of the explanatory variable of the estimation model into a plurality of intervals according to a predetermined method (step S101). The resulting intervals may all have the same width, as in FIG. 3, or the width may differ from interval to interval. Adjacent intervals may be set so that they do not overlap, as in FIG. 3, or so that they partially overlap.
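Step S101 leaves the dividing method open. As one possible reading, the sketch below splits the observed range of the explanatory variable into equal-width, non-overlapping intervals; the function name `split_into_intervals` and the choice of equal widths are assumptions made for illustration only.

```python
def split_into_intervals(xs, n_intervals):
    """Split [min(xs), max(xs)] into n equal-width (lo, hi) intervals.

    Equal width and no overlap are assumptions; the specification also
    permits varying widths and partially overlapping intervals.
    """
    if n_intervals < 1:
        raise ValueError("n_intervals must be at least 1")
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_intervals
    # Build the edges explicitly and pin the last edge to hi so that
    # floating-point rounding cannot leave the maximum value outside.
    edges = [lo + k * width for k in range(n_intervals)] + [hi]
    return list(zip(edges[:-1], edges[1:]))
```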

Next, the distribution coefficient calculation unit 102 calculates the distribution coefficient of each interval by executing the distribution coefficient calculation process (step S102). The distribution coefficient indicates whether the number of teacher data items available when the estimation model was built was sufficient, and whether those items were spread out rather than concentrated at a single point.

FIG. 4 is a flowchart showing the flow of the distribution coefficient calculation process performed by the distribution coefficient calculation unit 102. First, the distribution coefficient calculation unit 102 selects the interval for which the distribution coefficient is to be calculated (step S201). The intervals may be selected in any order, for example in ascending order of the explanatory variable.

Next, the distribution coefficient calculation unit 102 selects the data points in the target interval one at a time, in order from the end of the interval, as the reference point, and selects the data point that follows the reference point in that order as the adjacent point (step S202). A data point is a point representing one teacher data item, that is, one of the black points in FIG. 2. The order mentioned above may be ascending along the explanatory-variable axis, ascending along the objective-variable axis or along both axes, descending along any of these axes, or any other order.

Next, the distribution coefficient calculation unit 102 calculates the distance between the reference point and the adjacent point (hereinafter, the "inter-adjacent distance") (step S203). It then determines whether the calculated inter-adjacent distance is greater than a preset threshold (step S204). If the inter-adjacent distance is equal to or less than the threshold (step S204-NO), the distribution coefficient calculation unit 102 ignores the current adjacent point and selects the next data point in the order as the new adjacent point (step S205). It then calculates the inter-adjacent distance for the new adjacent point (step S203) and compares it with the threshold (step S204).

If, on the other hand, the inter-adjacent distance is greater than the threshold in step S204 (step S204-YES), the distribution coefficient calculation unit 102 adds it to the running total of inter-adjacent distances for the target interval (step S206). It then increments a counter (step S207). The counter value represents the number of times an inter-adjacent distance has been added to the total, in other words the number of reference-point and adjacent-point pairs that contributed to the total.

Next, the distribution coefficient calculation unit 102 determines whether steps S202 to S207 have been performed with every data point in the target interval as the reference point (step S208), and repeats steps S202 to S207 until no data point remains that has not served as the reference point (step S208-NO). When processing returns to step S202, the distribution coefficient calculation unit 102 takes the data point that follows the previous reference point in the order as the new reference point, and the data point that follows the new reference point as the new adjacent point.

Once no data point remains that has not served as the reference point (step S208-YES), the distribution coefficient calculation unit 102 calculates the distribution coefficient by dividing the latest running total of inter-adjacent distances by the latest counter value (step S209).

Next, the distribution coefficient calculation unit 102 determines whether steps S201 to S209 have been completed for every interval formed by the interval dividing unit 101 (step S210), and repeats steps S201 to S209 until a distribution coefficient has been calculated for every interval (step S210-NO). When processing returns to step S201, the distribution coefficient calculation unit 102 resets the running total of inter-adjacent distances and the counter to zero before executing step S201 and the subsequent steps.

Once a distribution coefficient has been calculated for every interval (step S210-YES), the distribution coefficient calculation unit 102 ends the distribution coefficient calculation process.
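The loop of steps S202 to S209 for a single interval can be sketched as follows. Several details are assumptions because the text leaves them open: points are ordered by ascending explanatory variable, the distance is Euclidean in the (explanatory, objective) plane, interval bounds are inclusive, a reference point with no sufficiently distant successor contributes nothing, and an empty total yields 0.0. The names `distribution_coefficient` and `min_gap` are illustrative.

```python
import math


def distribution_coefficient(points, interval, min_gap):
    """Mean inter-adjacent distance over one interval (steps S202-S209).

    points   : iterable of (x, y) teacher data points
    interval : (lo, hi) bounds on x, treated as inclusive (assumption)
    min_gap  : threshold compared against each distance in step S204
    """
    lo, hi = interval
    # Assumed order: ascending explanatory variable (step S202).
    pts = sorted(p for p in points if lo <= p[0] <= hi)
    total = 0.0  # running total of inter-adjacent distances (S206)
    count = 0    # number of pairs added to the total (S207)
    for i, ref in enumerate(pts):
        for neighbour in pts[i + 1:]:
            d = math.dist(ref, neighbour)  # S203
            if d > min_gap:                # S204-YES
                total += d
                count += 1
                break
            # S204-NO: ignore this neighbour and try the next one (S205)
    # S209; returning 0.0 when nothing was accumulated is an assumption.
    return total / count if count else 0.0
```

Under these assumptions, a tight cluster of points adds little to the total, so a low coefficient reflects either too few teacher data items or items bunched together.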

When the distribution coefficient calculation process ends, the interval determination unit 103 determines which of the intervals formed by the interval dividing unit 101 the test data belongs to (step S103). Next, the determination unit 104 determines whether the distribution coefficient of the interval determined in step S103 is equal to or greater than a threshold (step S104). If the distribution coefficient is less than the threshold (step S104-NO), that is, if a sufficient number of teacher data items did not exist when the estimation model was generated, the determination unit 104 concludes that the reliability of the model cannot be judged because too little teacher data was available at generation time (step S105). In this case, the output unit 108 outputs the determination unit 104's result, "model reliability cannot be judged".
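Steps S103 to S105 amount to an interval lookup followed by a threshold test. The sketch below assumes intervals are given as (lo, hi) pairs with inclusive bounds and that a test value outside every interval is also reported as not judgeable; that last case, and the names `find_interval` and `check_sufficiency`, are assumptions not stated in the text.

```python
def find_interval(x, intervals):
    """Index of the first interval containing x, or None (step S103)."""
    for k, (lo, hi) in enumerate(intervals):
        if lo <= x <= hi:
            return k
    return None


def check_sufficiency(x_test, intervals, coefficients, threshold):
    """Return (interval index, status) for one test explanatory value.

    coefficients[k] is the distribution coefficient of intervals[k].
    Status is "proceed" when the coefficient is at or above the
    threshold (S104-YES), otherwise the S105 outcome.
    """
    k = find_interval(x_test, intervals)
    if k is None or coefficients[k] < threshold:
        return k, "model reliability cannot be judged"
    return k, "proceed"
```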

If, on the other hand, the distribution coefficient is equal to or greater than the threshold (step S104-YES), that is, if a sufficient number of teacher data items existed when the estimation model was generated, the contribution rate calculation unit 105 calculates the contribution rate of the estimation model in the interval determined in step S103 (step S106). Specifically, the contribution rate calculation unit 105 calculates the contribution rate of the estimation model according to Equation 1 below. In Equation 1, i identifies each teacher data item. Yi_pre is the predicted value calculated by the estimation model when the explanatory variable is Xi, and equals a + bXi. Yi is the measured value when the explanatory variable is Xi, and is obtained from the teacher data. Y_ave is the mean of the measured values. The explanatory variables Xi used in Equation 1 are all those, among the Xi present in the teacher data, that belong to the interval determined in step S103. Y_ave is therefore the mean of the measured objective-variable values over all teacher data items whose explanatory variable Xi belongs to the interval determined in step S103.

In Equation 1, the contribution rate is obtained by dividing the sum of the squared differences between each measured value and the mean, taken over the teacher data belonging to the interval determined in step S103, by the sum of the squared differences between each predicted value and the mean. The contribution rate is a value indicating how well the estimation model fits. The method of calculating the contribution rate is not limited to Equation 1. Any other method may be used as long as it yields a value representing how well the predicted values fit the measured values; for example, the difference between the measured and predicted values may be used as the contribution rate.
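The drawing of Equation 1 is not reproduced in this text. Read literally from the description above, it would take roughly the following form, where the sums run over the teacher data items whose Xi falls in the interval determined in step S103 and Y_{i,pre} stands for Yi_pre. This is a reconstruction from the prose, not the published formula. As worded, the ratio is the reciprocal of the conventional coefficient of determination of a least-squares fit.

```latex
\[
  \text{contribution rate}
  = \frac{\displaystyle\sum_{i} \left( Y_i - Y_{\mathrm{ave}} \right)^2}
         {\displaystyle\sum_{i} \left( Y_{i,\mathrm{pre}} - Y_{\mathrm{ave}} \right)^2},
  \qquad
  Y_{i,\mathrm{pre}} = a + b X_i
\]
```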

Next, the variance ratio calculation unit 106 calculates, relative to the center of the teacher data used when the estimation model was generated in the interval determined in step S103, the ratio between the variances of the predicted values and the measured values of the test evaluation lots. Specifically, the variance ratio calculation unit 106 calculates the variance ratio according to Equation 2 below. Yt_pre is the predicted value calculated by the estimation model when the explanatory variable is the explanatory variable Xt of the test data, and equals a + bXt. Yt is the measured value when the explanatory variable is Xt, and is obtained from the test data. Y_ave is the mean of the measured values. The explanatory variables Xt used in Equation 2 are all those, among the Xt present in the test data, that belong to the interval determined in step S103. Y_ave is therefore the mean of the measured objective-variable values over all test data items whose explanatory variable Xt belongs to the interval determined in step S103.

In Equation 2, the variance ratio is obtained by dividing the sum of the squared differences between each measured value and the mean, taken over the test data belonging to the interval determined in step S103, by the sum of the squared differences between each predicted value and the mean. The variance ratio is a value indicating how well the estimation model fits. The method of calculating the variance ratio is not limited to Equation 2. Any other method may be used as long as it yields a value representing how well the predicted values fit the measured values; for example, the difference between the measured and predicted values may be used as the variance ratio.
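The drawing of Equation 2 is likewise not reproduced in this text. Following the description above, it would have the same structure as the contribution rate but with sums over the test data items whose Xt falls in the interval determined in step S103, and with Y_ave taken as the mean of those test measurements. Y_{t,pre} stands for Yt_pre. This is a reconstruction from the prose, not the published formula.

```latex
\[
  \text{variance ratio}
  = \frac{\displaystyle\sum_{t} \left( Y_t - Y_{\mathrm{ave}} \right)^2}
         {\displaystyle\sum_{t} \left( Y_{t,\mathrm{pre}} - Y_{\mathrm{ave}} \right)^2},
  \qquad
  Y_{t,\mathrm{pre}} = a + b X_t
\]
```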

The comparison unit 107 compares the contribution rate calculated by the contribution rate calculation unit 105 with the variance ratio calculated by the variance ratio calculation unit 106 (step S108). If the variance ratio is greater than the contribution rate (step S108-YES), the comparison unit 107 judges that the reliability of the estimation model is high and that the model does not need to be rebuilt (step S109). If the variance ratio is equal to or less than the contribution rate (step S108-NO), the comparison unit 107 judges that the reliability of the estimation model is low and that the model needs to be rebuilt (step S110). After step S109 or step S110, the output unit 108 outputs the judgment result.
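The comparison of steps S106 to S110 can be sketched for a straight-line model as follows. Both ratios are computed exactly as the prose describes them (measured-value sum of squares divided by predicted-value sum of squares, each about the mean of the measured values in that data set). The inputs are assumed to be already restricted to the interval determined in step S103, and the names `ss_ratio` and `judge_reliability` are illustrative.

```python
def ss_ratio(actuals, predictions):
    """Sum of squares of actuals about their mean, divided by the sum
    of squares of predictions about that same mean."""
    mean = sum(actuals) / len(actuals)
    numerator = sum((y - mean) ** 2 for y in actuals)
    denominator = sum((p - mean) ** 2 for p in predictions)
    if denominator == 0:
        raise ValueError("predictions have no spread about the mean")
    return numerator / denominator


def judge_reliability(model, teacher, test):
    """Return (verdict, contribution_rate, variance_ratio).

    model   : (a, b) of the line y = a + b * x
    teacher : (x, y) teacher data in the interval from step S103
    test    : (x, y) test data in the same interval
    """
    a, b = model
    contribution = ss_ratio([y for _, y in teacher],
                            [a + b * x for x, _ in teacher])   # S106
    variance_ratio = ss_ratio([y for _, y in test],
                              [a + b * x for x, _ in test])
    # S108: "high" only when the variance ratio exceeds the contribution rate.
    verdict = "high" if variance_ratio > contribution else "low"
    return verdict, contribution, variance_ratio
```

A "low" verdict corresponds to step S110, where the model is judged to need rebuilding.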

The reliability judgment device 1 may be implemented as an arithmetic unit inside an actual manufacturing apparatus, or as a stand-alone information processing apparatus independent of any individual manufacturing apparatus.

With the reliability judgment device 1 configured in this way, the stability of an estimation model that has already been built can be determined easily, without preparing new teacher data from new process evaluation lots, and the reliability of the estimation model can be checked easily.

In addition, the reliability judgment device 1 judges the reliability of an already-built estimation model separately for each interval of the teacher data, and for an interval with little teacher data it reports "cannot be judged" instead of judging the reliability. This prevents false alarms from being output for intervals with little teacher data.

In addition, the reliability determination device 1 does not judge the reliability of the estimation model using a large amount of teacher data from as many process evaluation lots as would be needed to reconstruct the model. Instead, it makes the judgment from a small amount of test data obtained from a small number of test evaluation lots. This reduces the number of test evaluation lots that must actually be measured, and improves the efficiency of processing actual products rather than test evaluation lots.

The reliability determination device 1 also enables the following. An estimation model constructed by selecting parameters that correlate strongly over a short period may omit correlated parameters that would prove necessary if a longer period were examined. However, because the device can judge promptly that reconstruction is needed once the reliability drops, even an estimation model built from teacher data collected over such a short period can be put into practical operation. This avoids a situation in which production of the product has already ended by the time the estimation model is complete.

One application example of the reliability determination device 1 is a semiconductor apparatus that estimates process contents using the apparatus's sensor data.

<Modification>
The comparison unit 107 may judge that the estimation model needs to be reconstructed when the variance ratio of the evaluation lot has been smaller than the contribution rate in step S108 a predetermined number of times or more.
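The modification can be sketched as a small counter that requests reconstruction only after repeated low results. The class name `RepeatedLowCounter` and the `threshold` parameter are hypothetical. The text does not say whether the occurrences must be consecutive, so this sketch assumes a cumulative count.

```python
class RepeatedLowCounter:
    """Counts how often the variance ratio fell below the contribution rate.

    Assumption: occurrences are counted cumulatively, not consecutively.
    """

    def __init__(self, threshold: int) -> None:
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold  # the predetermined number of times
        self.low_count = 0

    def update(self, contribution_rate: float, variance_ratio: float) -> bool:
        """Record one step-S108 comparison; return True once reconstruction
        of the estimation model is judged necessary."""
        if variance_ratio < contribution_rate:
            self.low_count += 1
        return self.low_count >= self.threshold


if __name__ == "__main__":
    counter = RepeatedLowCounter(threshold=2)
    for ratio in (0.7, 0.9, 0.6):
        print(counter.update(0.8, ratio))
```

Requiring several low results before reconstruction makes the judgment less sensitive to a single outlying evaluation lot, at the cost of a slower reaction to a genuine drop in reliability.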
The embodiment of the present invention has been described in detail with reference to the drawings. However, the specific configuration is not limited to this embodiment, and includes designs and the like that do not depart from the gist of the present invention.

DESCRIPTION OF SYMBOLS
1 ... reliability determination device, 101 ... section dividing unit, 102 ... distribution coefficient calculation unit, 103 ... section determination unit, 104 ... determination unit, 105 ... contribution rate calculation unit, 106 ... variance ratio calculation unit, 107 ... comparison unit (judgment unit), 108 ... output unit

Claims

A reliability determination device comprising:
a section dividing unit that, for an estimation model obtained statistically from a plurality of actual measurement values, divides the range of values of an explanatory variable in the plurality of actual measurement values into a plurality of sections;
a section determination unit that determines, for an estimation target whose actual measurement value has been obtained in advance, which of the plurality of sections the value of the explanatory variable of the estimation target belongs to;
a determination unit that determines the reliability of the estimation model based on the plurality of actual measurement values belonging to the section determined by the section determination unit and on the estimation result of the estimation model in that section; and
a distribution coefficient calculation unit that calculates a distribution coefficient for evaluating the number or distribution of the actual measurement values used to obtain the estimation model in the section determined by the section determination unit,
wherein the determination unit does not determine the reliability when the distribution coefficient does not satisfy a predetermined criterion.

A reliability determination device comprising:
a section dividing unit that, for an estimation model obtained statistically from a plurality of actual measurement values, divides the range of values of an explanatory variable in the plurality of actual measurement values into a plurality of sections;
a section determination unit that determines, for an estimation target whose actual measurement value has been obtained in advance, which of the plurality of sections the value of the explanatory variable of the estimation target belongs to; and
a determination unit that calculates a contribution rate and a variance ratio between the plurality of actual measurement values belonging to the section determined by the section determination unit and the estimation result of the estimation model in that section, and that determines the reliability of the estimation model to be higher the larger the variance ratio is relative to the contribution rate, and lower the smaller the variance ratio is relative to the contribution rate,
wherein the contribution rate is a value representing the degree of fit of predicted values calculated by the estimation model to actual measurement values when the values of the explanatory variable are teacher data used to construct the estimation model, and
the variance ratio is a value representing the degree of fit of predicted values calculated by the estimation model to actual measurement values when the values of the explanatory variable are test data used after the estimation model is constructed.