CN116992760A

CN116992760A - Method and system for predicting and explaining film thickness process of wafer deposition

Info

Publication number: CN116992760A
Application number: CN202310890677.6A
Authority: CN
Inventors: 陈一宁; 史雨萌; 蔡宇; 高大为
Original assignee: Zhejiang University ZJU
Current assignee: Zhejiang University ZJU
Priority date: 2023-07-19
Filing date: 2023-07-19
Publication date: 2023-11-03

Abstract

The application relates to a method and a system for predicting and explaining a film thickness process of wafer deposition. The method comprises the following steps: acquiring production line data and preprocessing it to generate sample data; inputting the sample data into a CatBoost model for prediction and outputting a prediction result; and processing the sample data with a SHAP model and, in combination with the prediction result, analyzing the degree of influence of the features in the sample data on the prediction result from the two perspectives of local interpretability and global interpretability to generate an analysis result. The method and the system enhance the credibility of the model prediction result, so that a person skilled in the art can adjust production equipment parameters in time according to the prediction result, thereby reducing deviation in the wafer process, saving wafer process cost, and optimizing the wafer process flow.

Description

Method and system for predicting and explaining film thickness process of wafer deposition

Technical Field

The present application relates to the field of integrated circuit chip fabrication, and more particularly, to a method and system for predicting and interpreting wafer deposited film thickness processes.

Background

With the development of the information age and the semiconductor integrated circuit industry, integrated circuit chips have become one of the most important core and base devices in the information age. Wafer fabrication is a core technology in the field of integrated circuit chip fabrication. In the wafer manufacturing process, the wafer is affected by a plurality of process parameters, so that defects can be generated on the surface of the wafer in the process manufacturing process, and the quality and the yield of chips are affected. In order to ensure the quality of the chip, the quality condition of each key stage of the wafer needs to be fed back through actual measurement of the product, but in the actual operation process, the related parameters of the key product are difficult to directly obtain, the traditional measurement equipment needs higher economic cost, the measurement period is longer, and the feedback is not comprehensive and timely enough, so that the quality of the wafer is affected. In the existing measurement process, virtual measurement can be established by utilizing wafer process data, and important characteristic parameters are predicted in the wafer manufacturing process, so that production equipment parameters can be timely adjusted to reduce process deviation.

At present, a growing number of machine learning algorithms applicable to integrated circuit chip manufacturing processes have been proposed. However, complex tree models and deep learning models are in effect black-box models. Because the prediction results generated by these models are not subjected to interpretability analysis, the credibility of the prediction results is poor, production equipment parameters at each key stage cannot be adjusted in time, and process deviation increases.

Disclosure of Invention

In view of the above drawbacks and shortcomings of the prior art, the present application aims to solve the technical problem that, in the prior art, the prediction results generated by machine learning models in the process are not subjected to interpretability analysis, so that the credibility of the prediction results is poor.

In order to solve the technical problems, the main technical scheme adopted by the application comprises the following steps:

in a first aspect, embodiments of the present application provide a method for wafer deposited film thickness process prediction and interpretation, the method having the steps of:

acquiring production line data, preprocessing the production line data, and generating sample data;

inputting the sample data into a Catboost model for prediction, and outputting a prediction result;

and processing the sample data by using a SHAP model, and combining the prediction result, analyzing the influence degree of the characteristics in the sample data on the prediction result from the two angles of local interpretability and global interpretability to generate an analysis result.

In one embodiment of the present application, the step of acquiring production line data, preprocessing the production line data, and generating sample data includes:

deleting production line samples with missing parameters, and merging a plurality of pieces of data of the same wafer to obtain a sample set;

extracting the features in the sample set, and carrying out normalization processing on the features to obtain a feature set;

performing feature screening on the feature set to obtain an optimal feature set;

and combining the sample set and the optimal feature set, screening samples with the optimal feature set from the sample set, and generating sample data.

In one embodiment of the application, the normalization of the features uses the following formula:

$$X^{*} = \frac{X - \mathrm{Min}}{\mathrm{Max} - \mathrm{Min}}$$

where $X^{*}$ is the normalization result, $X$ is the feature quantity, Min is the minimum value of the feature quantity, and Max is the maximum value of the feature quantity.

In one embodiment of the present application, the screening method for performing feature screening is an RFECV method.

In one embodiment of the present application, the step of analyzing the degree of influence of the features in the sample data on the prediction result from two viewpoints of local interpretability and global interpretability further includes:

processing the sample data by using a SHAP model, and calculating a SHAP value of each feature in the sample data;

performing feature importance analysis from the perspective of global interpretability, and performing feature importance ranking by calculating an average value of absolute values of SHAP values of each feature in the sample data as an importance measurement standard of the feature to quantify the influence degree of the feature on the prediction result;

performing feature density scatter diagram analysis from the perspective of global interpretability, generating a feature density scatter diagram by sorting features, feature values, SHAP values and feature importance in the sample data, and visualizing the influence degree of the features on the prediction result;

performing feature interaction analysis from the global interpretability point of view, generating a feature interaction graph through the feature value, the SHAP value and the feature value of the maximum interaction feature, and visualizing interaction among different features;

and performing SHAP decision graph analysis from the perspective of local interpretability, generating a SHAP decision graph from the base value of the deposited film thickness, the predicted value of the deposited film thickness and the feature values in the model, and visualizing the degree of influence of individual features on the prediction result.

In a second aspect, embodiments of the present application provide a system for wafer deposited film thickness process prediction and interpretation, the system having the following modules:

the data preprocessing module is used for acquiring production line data, preprocessing the production line data and generating sample data;

the prediction module is used for inputting the sample data into a Catboost model for prediction and outputting a prediction result;

and the interpretable analysis module is used for processing the sample data by utilizing the SHAP model, and combining the prediction result, analyzing the influence degree of the characteristics in the sample data on the prediction result from the two angles of local interpretability and global interpretability, and generating an analysis result.

In a third aspect, the present application provides an electronic device comprising: a processor and a memory for storing a computer program capable of running on the processor, wherein the processor is adapted to perform the method according to any of the preceding embodiments when the computer program is run.

In a fourth aspect, the present application provides a computer readable storage medium having stored thereon a computer program, characterized in that the computer program, when executed by a processor, implements a method according to any of the preceding embodiments.

Compared with the prior art, the technical scheme of the application has the following advantages: the method and the system strengthen the credibility of the model prediction result, and a person skilled in the art can adjust the production equipment parameters in time according to the prediction result, thereby reducing deviation in the wafer process, saving wafer process cost, and optimizing the wafer process flow.

The application is not limited to the prediction of the thickness of the deposited film of the wafer; it can also be applied to the prediction and explanation of other integrated circuit manufacturing processes, and may serve as a reference for related research.

In order to make the above objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.

Drawings

In order that the contents of the application may be more readily understood, a further detailed description of the application will be presented below with reference to specific embodiments of the application taken in conjunction with the accompanying drawings, it being understood that the following drawings illustrate only certain embodiments of the application and therefore should not be considered limiting in scope, and that other related drawings may be obtained from these drawings by one of ordinary skill in the art without undue effort.

FIG. 1 is a flow chart of a method and system for predicting and interpreting a film thickness process for wafer deposition in accordance with the present application;

FIG. 2 is a diagram of a feature importance analysis of a method and system for wafer deposited film thickness process prediction and interpretation provided by the present application;

FIG. 3 is a SHAP decision chart of a method and system for predicting and interpreting a thickness of a deposited film of a wafer in accordance with the present application.

Detailed Description

The present application will be further described with reference to the accompanying drawings and specific examples, which are not intended to be limiting, so that those skilled in the art will better understand the application and practice it.

The method and system of the present application are further described below in conjunction with specific embodiments.

In a first aspect, as shown in FIG. 1, the present application provides a method for wafer deposited film thickness process prediction and interpretation, the method having the steps of:

S1, acquiring production line data, preprocessing the production line data, and generating sample data;

In this embodiment, the step of acquiring production line data, preprocessing the production line data, and generating sample data includes:

S11, deleting production line samples with missing parameters, and merging a plurality of pieces of data of the same wafer to obtain a sample set;

S12, extracting features in the sample set, and carrying out normalization processing on the features to obtain a feature set;

In step S12, the normalization processing of the features uses the following formula:

$$X^{*} = \frac{X - \mathrm{Min}}{\mathrm{Max} - \mathrm{Min}}$$

where $X^{*}$ is the normalization result, $X$ is the feature quantity, Min is the minimum value of the feature quantity, and Max is the maximum value of the feature quantity.
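As a minimal sketch of the normalization in step S12, assuming it is the usual min-max scaling X* = (X - Min) / (Max - Min) applied to each feature separately, the computation could look as follows in Python. The function name `min_max_normalize`, the handling of a constant column, and the sample values are illustrative assumptions and do not come from the application.

```python
from typing import List


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale one feature column to [0, 1] with X* = (X - Min) / (Max - Min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column is mapped to 0.0 to avoid dividing by
        # zero. The source text does not cover this case.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical readings of one process parameter for five wafers.
pressure = [2.0, 4.0, 6.0, 8.0, 10.0]
normalized = min_max_normalize(pressure)
```

After scaling, the smallest reading maps to 0 and the largest to 1, so features recorded in different units become comparable.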

S13, screening the characteristics of the characteristic set to obtain an optimal characteristic set;

In step S13, feature screening is performed on the feature set using the RFECV method with CatBoost as the estimator: feature importance is ranked by recursive feature elimination, and after ranking, the optimal number of features is selected by cross-validation, yielding an optimal feature set that contains 19 features.

S14, combining the sample set and the optimal feature set, screening samples with the optimal feature set from the sample set, and generating sample data.

In this embodiment, in combination with 19 features in the optimal feature set, a sample having the optimal feature set in the sample set is extracted, and sample data having the optimal feature set is generated.
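Steps S11 and S14 can be sketched as below. The application does not state how several records of one wafer are merged, so this sketch assumes they are averaged; the field names (`wafer_id`, `temp`, `pressure`, `flow`), the helper `build_samples` and all values are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List


def build_samples(records: List[dict], selected: List[str]) -> Dict[str, Dict[str, float]]:
    """Drop incomplete records, merge records per wafer, keep selected features."""
    grouped = defaultdict(list)
    for rec in records:
        # S11: discard any record that has a missing parameter.
        if any(value is None for value in rec.values()):
            continue
        grouped[rec["wafer_id"]].append(rec)

    samples = {}
    for wafer_id, recs in grouped.items():
        # Assumption: records of the same wafer are merged by averaging.
        # S14: only the features of the optimal feature set are kept.
        samples[wafer_id] = {
            name: mean([r[name] for r in recs]) for name in selected
        }
    return samples


# Hypothetical production line records (illustrative values only).
records = [
    {"wafer_id": "W1", "temp": 400.0, "pressure": 2.0, "flow": 10.0},
    {"wafer_id": "W1", "temp": 402.0, "pressure": 2.2, "flow": 12.0},
    {"wafer_id": "W2", "temp": 398.0, "pressure": None, "flow": 11.0},
    {"wafer_id": "W3", "temp": 405.0, "pressure": 2.1, "flow": 9.0},
]
samples = build_samples(records, selected=["temp", "flow"])
```

Here wafer W2 is dropped because one of its parameters is missing, and the two records of W1 collapse into a single sample.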

S2, inputting sample data into a Catboost model for prediction, and outputting a prediction result;

In this embodiment, the prediction model used in the process is determined as follows: first, a plurality of prediction models are screened and the one with the highest prediction accuracy is selected; that model is then trained, and once a training termination condition is met, the trained prediction model is applied to the wafer process and outputs a prediction result.

In this embodiment, the specific steps of screening a plurality of prediction models include:

S21, inputting the sample data into a plurality of tree models to obtain corresponding preliminary prediction results;

In step S21, the sample data is divided into a training set and a test set in a ratio of 75% to 25%; the training set is used to train the plurality of tree models, grid search is used to optimize the hyperparameters of the tree models, and the preliminary prediction result of each model is output.

The tree models comprise a prediction model based on the RF algorithm, a prediction model based on the XGBoost algorithm, a prediction model based on the LightGBM algorithm and a prediction model based on the CatBoost algorithm.
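The 75%/25% split and the grid search of step S21 can be illustrated with the standard library alone. This is a sketch under stated assumptions: `train_test_split` and `grid_search` are hypothetical helpers written here, and `toy_error` merely stands in for "train a tree model with these hyperparameters and return its validation error"; no real tree model is trained.

```python
import itertools
import random
from typing import Callable, Dict, List, Sequence, Tuple


def train_test_split(samples: Sequence, train_fraction: float = 0.75,
                     seed: int = 0) -> Tuple[List, List]:
    """Shuffle the samples and split them into a training and a test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def grid_search(param_grid: Dict[str, List],
                score: Callable[[dict], float]) -> Tuple[dict, float]:
    """Evaluate every parameter combination and return the lowest-error one."""
    names = list(param_grid)
    best_params, best_error = None, float("inf")
    for combo in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, combo))
        error = score(params)
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error


train, test = train_test_split(range(100))


def toy_error(params: dict) -> float:
    # Hypothetical error surface with its minimum at depth=6, learning_rate=0.1.
    return (params["depth"] - 6) ** 2 + abs(params["learning_rate"] - 0.1)


best, err = grid_search(
    {"depth": [4, 6, 8], "learning_rate": [0.03, 0.1, 0.3]}, toy_error
)
```

Grid search is exhaustive, so its cost grows with the product of the grid sizes; this is one reason a model that performs well with default parameters is attractive in practice.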

S22, checking the accuracy of the preliminary prediction result, and taking the tree model with the highest prediction accuracy as a target prediction model.

In step S22, the test set is used to verify the preliminary prediction results, and the mean square error (MSE), the mean absolute percentage error (MAPE), and the coefficient of determination R² are used as evaluation indexes of the target prediction model.

An MSE of 0, reached when the predicted value equals the true value, indicates a perfect model; the smaller the difference between the predicted result and the true result, the smaller the MSE and the better the model fit. A MAPE of 0% indicates a perfect model and a MAPE of 100% indicates an inferior model; the closer the MAPE is to 0%, the better the model fit. An R² of 0 indicates a poor fit and an R² of 1 indicates a perfect model; the closer R² is to 1, the better the model fit.
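The three evaluation indexes can be computed as in the sketch below, which uses their standard textbook definitions; the application itself does not print the formulas, and the thickness values are hypothetical.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean square error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error in percent; assumes no true value is 0."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted film thicknesses (illustrative only).
y_true = [1000.0, 1020.0, 980.0, 1010.0]
y_pred = [1005.0, 1015.0, 985.0, 1010.0]

scores = {"MSE": mse(y_true, y_pred),
          "MAPE": mape(y_true, y_pred),
          "R2": r2(y_true, y_pred)}
```

Because MSE is expressed in squared thickness units while MAPE is relative, a model can show a large MSE together with a small MAPE when the thickness values themselves are large.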

Table 1. Comparison of the prediction results of the RF model, XGBoost model, LightGBM model and CatBoost model

As shown in Table 1, compared with the other prediction models, the prediction model based on the CatBoost algorithm has the highest prediction accuracy on the test set, with a mean square error (MSE) of 366.68, a mean absolute percentage error (MAPE) of 0.250%, and a coefficient of determination R² of 0.788.

In this embodiment, the CatBoost model is not subjected to hyperparameter optimization; even when predicting with default parameters it obtains a prediction result superior to those of the other models. The model therefore depends little on hyperparameter optimization, is more general, can greatly reduce time cost, and is better suited to industrial application.

S3, processing sample data by using a SHAP model, and combining a prediction result, analyzing the influence degree of characteristics in the sample data on the prediction result from two angles of local interpretability and global interpretability to generate an analysis result;

In this embodiment, SHAP is one way to address model interpretability. The sample data is input into the SHAP model for SHAP analysis and SHAP values are calculated, a SHAP value being a numerical value assigned to each feature in the sample data. The degree of influence of the optimal feature set on the prediction result is then analyzed from the two perspectives of local interpretability and global interpretability, an analysis result is generated, and the accuracy of the prediction result is determined.
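To make the notion of a SHAP value concrete, the sketch below computes exact Shapley values for a small hand-written model by enumerating all feature subsets. It is an illustration only: practical SHAP tooling for tree models uses far more efficient algorithms, and `toy_model`, the background samples and the sample `x` are hypothetical and unrelated to the CatBoost model of the application.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence, Tuple


def shapley_values(model: Callable[[List[float]], float],
                   x: Sequence[float],
                   background: Sequence[Sequence[float]]) -> Tuple[float, List[float]]:
    """Exact Shapley values of one sample x relative to a background set."""
    n = len(x)

    def value(subset: set) -> float:
        # Expected model output when the features in `subset` take the values
        # of x and the remaining features come from the background samples.
        total = 0.0
        for b in background:
            mixed = [x[i] if i in subset else b[i] for i in range(n)]
            total += model(mixed)
        return total / len(background)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contribution = 0.0
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for subsets of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                contribution += weight * (value(s | {i}) - value(s))
        phi.append(contribution)
    return value(set()), phi


def toy_model(f: List[float]) -> float:
    # Hypothetical thickness model with one interaction term.
    temp, pressure, flow = f
    return 900.0 + 20.0 * temp - 10.0 * pressure + 5.0 * temp * flow


background = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.5, 0.2, 0.8], [0.2, 0.9, 0.4]]
x = [0.9, 0.3, 0.7]
base, phi = shapley_values(toy_model, x, background)
```

The base value plus the sum of the per-feature values reproduces the model output for `x`, which is the property that allows a single prediction to be decomposed feature by feature.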

In step S3, the step of analyzing the influence degree of the optimal feature set on the prediction result from the local interpretability and the global interpretability further includes:

S31, processing the sample data by using a SHAP model, and calculating the SHAP value of each feature in the sample data;

S32, analyzing the importance of the features, and calculating the average of the absolute values of the SHAP values of each feature in the sample data to serve as an importance measure of the feature, wherein the larger the average absolute SHAP value is, the larger the influence of the feature on the prediction result is, so that the degree of influence of the feature on the prediction result can be quantified and global interpretation of the prediction model is realized;

In step S32, as shown in FIG. 2, the features are ranked by importance according to the average absolute SHAP value of each feature; the higher a feature ranks, the greater its influence on the prediction result.
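The ranking of step S32 reduces to a column-wise mean of absolute values followed by a sort. A minimal sketch, assuming the SHAP values are already available as one row per sample and one column per feature; the matrix and feature names below are hypothetical.

```python
from typing import List, Sequence, Tuple


def rank_features(shap_matrix: Sequence[Sequence[float]],
                  feature_names: Sequence[str]) -> List[Tuple[str, float]]:
    """Rank features by the mean absolute SHAP value over all samples."""
    n_samples = len(shap_matrix)
    importance = {
        name: sum(abs(row[j]) for row in shap_matrix) / n_samples
        for j, name in enumerate(feature_names)
    }
    # Most influential feature first.
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


# Hypothetical SHAP values: 3 samples (rows) by 3 features (columns).
shap_matrix = [
    [4.0, -1.0, 0.5],
    [-6.0, 2.0, 0.5],
    [5.0, -3.0, -0.5],
]
ranking = rank_features(shap_matrix, ["temp", "pressure", "flow"])
```

Taking the absolute value before averaging matters: a feature whose contributions are large but change sign between samples would otherwise appear unimportant.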

S33, analyzing a feature density scatter diagram: the features are sorted by the importance obtained in step S32, the 19 features are arranged along the ordinate in descending order of importance, and the SHAP values and feature values of the different features in each prediction sample are plotted along the horizontal direction to form the feature density scatter diagram;

In step S33, most of the features approximately satisfy a linear relationship with the prediction result, that is, as the feature value increases, the SHAP value either increases or decreases.

When the SHAP value is less than 0, the feature has a negative influence on the prediction result, that is, it makes the predicted deposited film thickness smaller; when the SHAP value is 0, the feature has no influence on the prediction result; when the SHAP value is greater than 0, the feature has a positive influence on the prediction result, that is, it makes the predicted deposited film thickness larger.

In this embodiment, interpretability analysis is performed by using the SHAP method and the feature density scatter diagram is drawn to visualize the degree of influence of the features on the prediction result, which intuitively reflects the influence of the features on the thickness of the deposited film of the wafer and reveals the global interpretability of the deposited film thickness prediction.

S34, performing feature interaction analysis: interaction among features is observed by drawing a feature dependency graph, in which the abscissa is the feature value, the left vertical axis is the SHAP value of the feature, and the right vertical axis is the value of the feature having the strongest interaction with that feature; the graph visualizes the interaction among different features and realizes global interpretation of the prediction model;

In step S34, the magnitude of the feature value in the feature dependency graph determines the sign of the SHAP value: a positive SHAP value favors film deposition, and a negative SHAP value has a negative effect on the deposited film thickness. When the feature value is fixed, the value of the interacting feature affects the sign of the SHAP value and thereby the prediction result.

S35, performing SHAP decision graph analysis: as shown in FIG. 3, the vertical straight line in the graph is the base value of the deposited film thickness in the prediction model, and the broken line is the predicted value of the deposited film thickness in the prediction model. The broken line shows whether each feature moves the output to a value higher or lower than the average predicted value, and the feature values are given beside the prediction line for reference. Starting from the bottom of the SHAP decision graph, the prediction line shows how the deposited film thickness accumulates from the base value to the final model prediction at the top of the graph.

In this embodiment, the SHAP decision graph is a single-sample decision graph that shows how the complex model arrives at its prediction, so that the magnitude and direction of the main influences can be easily identified; the graph remains clear and intuitive even when there are many features.

In this embodiment, local interpretability analysis provides prediction detail, focusing on explaining individual predictions, that is, how individual features affect a single decision of the model.
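The prediction line of a SHAP decision graph is a running sum: it starts at the base value and adds one SHAP value per feature until it reaches the model output. The sketch below computes the points of that line for one sample. It assumes that features are ordered from least to most influential (bottom to top), which is an ordering choice made here for illustration; the helper `decision_path` and all numbers are hypothetical.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def decision_path(base_value: float,
                  shap_values: Sequence[float],
                  feature_names: Sequence[str]) -> List[Tuple[str, float]]:
    """Return (feature, cumulative prediction) pairs from bottom to top."""
    # Assumption: the least influential feature is drawn at the bottom.
    order = sorted(range(len(shap_values)), key=lambda j: abs(shap_values[j]))
    running = list(accumulate((shap_values[j] for j in order), initial=base_value))
    # running[0] is the base value; running[k + 1] is the value after feature k.
    return [(feature_names[j], running[k + 1]) for k, j in enumerate(order)]


# Hypothetical base value and SHAP values of a single wafer.
path = decision_path(1000.0, [12.0, -5.0, 1.5], ["temp", "pressure", "flow"])
final_prediction = path[-1][1]
```

Each step to the right corresponds to a positive SHAP value and each step to the left to a negative one, so the direction and size of every feature's contribution can be read directly from the line.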

In a second aspect, the present application provides a system for predicting and interpreting a thickness process of a deposited film of a wafer, comprising a data preprocessing module, a prediction module, and an interpretable analysis module.

The data preprocessing module is used for acquiring production line data, preprocessing the production line data and generating sample data.

The prediction module is used for inputting the sample data into the Catboost model for prediction and outputting a prediction result.

The interpretable analysis module is used for processing the sample data by utilizing the SHAP model, and combining the prediction results, analyzing the influence degree of the characteristics in the sample data on the prediction results from the two angles of local interpretability and global interpretability, and generating analysis results.

The effects of the above system when the above method is applied may be referred to the description in the foregoing method embodiment, and will not be repeated here.

In a third aspect, the present application provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the method for wafer deposited film thickness process prediction and interpretation according to any one of claims 1 to 5.

In a fourth aspect, the present application provides a non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the steps of the method for wafer deposited film thickness process prediction and interpretation according to any one of claims 1 to 5.

It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

It is apparent that the above examples are given by way of illustration only and do not limit the embodiments. Other variations and modifications of the present application will be apparent to those of ordinary skill in the art in light of the foregoing description. It is neither necessary nor possible to enumerate all embodiments here. Obvious variations or modifications derived therefrom are contemplated as falling within the scope of the present application.

Claims

1. A method for wafer deposited film thickness process prediction and interpretation, the method comprising the steps of:

2. The method for wafer deposited film thickness process prediction and interpretation of claim 1, wherein the step of obtaining production line data, preprocessing the production line data, and generating sample data comprises:

3. A method for wafer deposited film thickness process prediction and interpretation according to claim 2, characterized in that the normalization of the features uses the following formula:

$$X^{*} = \frac{X - \mathrm{Min}}{\mathrm{Max} - \mathrm{Min}}$$

where $X^{*}$ is the normalization result, $X$ is the feature quantity, Min is the minimum value of the feature quantity, and Max is the maximum value of the feature quantity.

4. The method for predicting and interpreting a thickness of a deposited film of a wafer as recited in claim 2, wherein said screening method for feature screening is an RFECV method.

5. The method for wafer deposited film thickness process prediction and interpretation according to claim 1, wherein the step of analyzing the degree of influence of features in the sample data on the prediction result from both a local interpretability and a global interpretability perspective further comprises:

6. A system for wafer deposited film thickness process prediction and interpretation, wherein the system has the following modules:

7. An electronic device, the electronic device comprising: a processor and a memory for storing a computer program capable of running on the processor, wherein the processor is adapted to perform the method of any one of claims 1 to 5 when the computer program is run.

8. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the method of any one of claims 1 to 5.