CN117520741A - Method for predicting and improving yield of semiconductor factory based on big data - Google Patents

Method for predicting and improving yield of semiconductor factory based on big data

Info

Publication number
CN117520741A
Authority
CN
China
Prior art keywords
data
wafer
yield
test
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311313568.4A
Other languages
Chinese (zh)
Inventor
陈一宁
郭庞
高大为
陈鼎崴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2023-10-11
Filing date
2023-10-11
Publication date
2024-02-06
Application filed by Zhejiang University ZJU
Priority to CN202311313568.4A
Publication of CN117520741A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/10 Pre-processing; Data cleansing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0475 Generative networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/094 Adversarial learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/126 Evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/04 Manufacturing

Abstract

A method for predicting and improving the yield of a semiconductor factory based on big data. The method preprocesses the wafer acceptance test (WAT) data and wafer probe test (CP) data collected in the semiconductor factory to address class imbalance, performs dimension reduction on the high-dimensional data to enhance the robustness and interpretability of the model, uses a machine learning model to correlate the data with wafer yield, and analyzes the test factors that cause yield loss. Models are built and analyzed on the processed WAT and CP data, and root cause analysis of yield is carried out according to the model results. This can greatly improve the efficiency of root cause analysis and the economic benefit of the manufacturing plant, and the method offers comprehensive processing of the WAT and CP data, accurate prediction, and fast and reliable analysis.

Description

Method for predicting and improving yield of semiconductor factory based on big data
Technical Field
The invention relates to the technical field of integrated circuits, in particular to a method for predicting and improving yield of a semiconductor factory based on big data.
Background
In integrated circuit manufacturing, yield is directly tied to the profitability of the production unit. The manufacturing process generates a large amount of test data, including inline data (mainly fault detection and classification (FDC) data), defect data, wafer acceptance test (WAT) data, and wafer probe test (CP) data. WAT data is mainly used to test the electrical performance of the wafer in order to monitor process fluctuations that occur during production, and a wafer undergoes the CP test only after it passes the WAT test. However, WAT data typically contains tens to hundreds of test variables, including the voltage, resistance, and capacitance of transistors, so the amount of data is very large. Conventional analysis uses t-tests, analysis of variance, and comparison of means to determine whether a test variable is problematic. These methods have limitations. First, they cannot be correlated with wafer yield. Second, because the data is very high-dimensional, engineers must analyze it manually, which is time-consuming and labor-intensive. Third, if the process parameters change, the analysis method must change as well, and continuing to rely on the old analysis wastes production resources. In short, conventional WAT data analysis is difficult to relate to wafer yield, is laborious because of the high dimensionality of the data, and may even produce erroneous results.
Disclosure of Invention
In view of the problems and technical needs of the prior art, the invention aims to provide a method for predicting and improving the yield of a semiconductor factory based on big data. The method preprocesses the WAT data and CP data collected in the semiconductor factory to address class imbalance, reduces the dimensionality of the high-dimensional data to enhance the robustness and interpretability of the model, and finally uses a machine learning model to correlate the data with wafer yield and to analyze the test factors that lead to yield loss.
To achieve the above object, the invention adopts the following technical solution:
A method for predicting and improving the yield of a semiconductor factory based on big data comprises the following steps:
Step 1: Collect wafer acceptance test (WAT) data and wafer probe test (CP) data in a semiconductor factory, form a data set as raw data, and store the raw data in a storage system;
Step 2: Preprocess the collected WAT data and CP data, where the preprocessing comprises outlier processing, missing value processing, and data normalization;
Step 3: Merge the WAT data and the CP data, and divide the data set into a training set, a test set, and a validation set;
Step 4: Perform sample enhancement on the training set of the WAT data to generate a balanced sample set;
Step 5: Perform dimension reduction on the generated balanced sample set, using feature variable screening to select the parameter set with the best predictive performance;
Step 6: Build a wafer yield prediction model on the processed WAT data and CP data;
Step 7: Calculate the importance of each feature in the wafer yield prediction model, output the model importance of each feature, and sort the features in descending order of importance;
Step 8: Calculate the SHAP value of each feature, and output the SHAP values and analysis charts of each feature;
Step 9: Perform root cause analysis of yield loss according to the importance ranking of the features, the SHAP values, and the analysis charts.
The WAT data collected in step 1 includes the resistance, capacitance, and inductance of the transistors, the threshold voltage, saturation current, and subthreshold current, and the capacitance, resistance, and inductance of the metal interconnect layers.
The CP data collected in step 1 includes electrical test data for each die on the wafer and yield data for the wafer.
The preprocessing in step 2 comprises the following steps:
Step 21: Perform outlier processing on the WAT data and CP data to remove anomalies arising during testing;
Step 22: Perform missing value processing on the outlier-processed WAT data and CP data to handle values that were not saved because of errors during testing;
Step 23: Normalize the WAT data and CP data after missing value processing to eliminate inconsistencies in the units and scales of the test parameters.
The outlier processing uses the box plot method, the Z-score method, and the mean square error analysis method.
The missing value processing uses missing value removal, interpolation, and mean filling.
The data normalization uses Min-Max normalization, mean-variance normalization, and batch normalization.
The sample enhancement in step 4 uses a generative adversarial network (GAN).
The feature screening in step 5 uses an algorithm that combines the Boruta algorithm with a genetic algorithm.
In step 6, the wafer yield prediction modeling uses a CatBoost model, a random forest model, a decision tree model, an XGBoost model, and a support vector machine model.
Compared with the prior art, the invention has the following beneficial effects:
In the method for predicting and improving the yield of a semiconductor factory based on big data, effective data enhancement and dimension reduction are applied to the WAT data and CP data collected in the semiconductor factory. This improves the quality of the data and the robustness and interpretability of the model, reduces the difficulty and duration of analysis, and provides a reliable basis for subsequent analysis. Models are then built and analyzed on the processed WAT and CP data, and root cause analysis of yield is carried out according to the model results. This can greatly improve the efficiency of root cause analysis and the economic benefit of the manufacturing plant, and the method offers comprehensive processing of the WAT and CP data, accurate prediction, and fast and reliable analysis.
The foregoing is only an overview of the technical solution of the invention. So that the invention can be understood more clearly, it is described in more detail below with reference to the accompanying drawings.
The above and other objects, features and advantages of the present invention will become more apparent to those skilled in the art from the following detailed description of the specific embodiments of the present invention taken in conjunction with the accompanying drawings, which are not to be construed as limiting the invention.
Drawings
FIG. 1 is a framework diagram of the method of the present invention.
FIG. 2 is a framework diagram of the sample enhancement of the present invention.
Detailed Description
The present invention will be described more fully hereinafter in order to facilitate an understanding of the present invention. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
The following provides a detailed description of embodiments of the present invention.
As shown in FIG. 1, a method for predicting and improving the yield of a semiconductor factory based on big data comprises the following steps:
Step 1: Collect wafer acceptance test (WAT) data and wafer probe test (CP) data in a semiconductor factory, form a data set as raw data, and store the raw data in a storage system;
Step 2: Preprocess the collected WAT data and CP data, where the preprocessing comprises outlier processing, missing value processing, and data normalization;
Step 3: Merge the WAT data and the CP data, and divide the data set into a training set, a test set, and a validation set;
Step 4: Perform sample enhancement on the training set of the WAT data to generate a balanced sample set;
Step 5: Perform dimension reduction on the generated balanced sample set, using feature variable screening to select the parameter set with the best predictive performance;
Step 6: Build a wafer yield prediction model on the processed WAT data and CP data;
Step 7: Calculate the importance of each feature in the wafer yield prediction model, output the model importance of each feature, and sort the features in descending order of importance;
Step 8: Calculate the SHAP value of each feature, and output the SHAP values and analysis charts of each feature;
Step 9: Perform root cause analysis of yield loss according to the importance ranking of the features, the SHAP values, and the analysis charts.
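Steps 7 to 9 rank features by importance and by SHAP value. As a minimal sketch of what a SHAP value measures, the following computes exact Shapley values for one sample by enumerating feature coalitions. The linear `toy_yield_model`, its `weights`, the `wafer` sample, and the all-zero `baseline` are hypothetical and are not taken from the source; practical SHAP tooling approximates this computation or exploits the model structure instead of enumerating every coalition.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Features outside a coalition are replaced by their baseline value.
    The cost grows exponentially with the number of features.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical linear yield model over three normalized WAT parameters.
weights = [0.5, -0.3, 0.1]


def toy_yield_model(features):
    return 0.8 + sum(w * f for w, f in zip(weights, features))


wafer = [1.0, 2.0, -1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_yield_model, wafer, baseline)

# Rank features by absolute contribution, largest first (as in step 7).
ranked = sorted(range(len(phi)), key=lambda i: abs(phi[i]), reverse=True)
print(phi, ranked)
```

For a linear model each Shapley value reduces to the weight times the deviation from the baseline, and the values sum to the difference between the prediction for the sample and the prediction for the baseline, which is what makes the ranking usable for attributing yield loss to individual test parameters.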
The WAT data collected in step 1 includes the resistance, capacitance, and inductance of the transistors, the threshold voltage, saturation current, and subthreshold current, and the capacitance, resistance, and inductance of the metal interconnect layers.
The CP data collected in step 1 includes electrical test data for each die on the wafer and yield data for the wafer.
The wafer acceptance test typically includes tens to hundreds of test items, and the CP data typically includes tens of test categories, each comprising different electrical functional tests.
The preprocessing in step 2 comprises the following steps:
Step 21: Perform outlier processing on the WAT data and CP data to remove anomalies arising during testing, such as poor contact between the probe and the wafer or errors in saving test values;
Step 22: Perform missing value processing on the outlier-processed WAT data and CP data to handle values that were not saved because of errors during testing;
Step 23: Normalize the WAT data and CP data after missing value processing to eliminate inconsistencies in the units and scales of the test parameters.
The outlier processing uses methods such as the box plot method, the Z-score method, and the mean square error analysis method.
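The box plot and Z-score rules can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the invention: the function names, the multiplier `k=1.5`, the thresholds, and the `vth_readings` values are all hypothetical, and the mean square error analysis method is not shown.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Box plot rule: indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


def zscore_outliers(values, threshold=3.0):
    """Z-score rule: indices of values more than `threshold` std devs from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # all values identical, nothing to flag
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]


# Hypothetical threshold-voltage readings (V); the last one mimics a bad probe contact.
vth_readings = [0.70, 0.71, 0.69, 0.72, 0.70, 0.68, 0.71, 0.70, 0.69, 1.50]

print(iqr_outliers(vth_readings))
# With n points, a single point's population z-score is at most sqrt(n - 1),
# so for 10 readings it can never exceed 3; a lower threshold is used here.
print(zscore_outliers(vth_readings, threshold=2.5))
```

The box plot rule depends only on quartiles, so it is less affected by the outlier itself than the Z-score rule, whose mean and standard deviation are both pulled toward extreme values.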
The missing value processing uses methods such as missing value removal, interpolation, and mean filling.
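The three missing value strategies can be sketched as follows, with `None` standing for a value that was not saved. The function names and the sample `readings` are hypothetical, and linear interpolation is only one possible interpolation scheme; the source does not say which one is used.

```python
import statistics


def drop_missing(rows):
    """Removal: keep only the rows that contain no missing value."""
    return [row for row in rows if all(v is not None for v in row)]


def mean_fill(values):
    """Mean filling: replace missing entries with the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    mean = statistics.fmean(observed)
    return [mean if v is None else v for v in values]


def linear_interpolate(values):
    """Interpolation: fill interior gaps linearly; edge gaps copy the nearest value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no observed values to interpolate from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            t = (i - left) / (right - left)
            filled[i] = values[left] + t * (values[right] - values[left])
    return filled


# Hypothetical sequence of one test parameter with three unsaved values.
readings = [1.0, None, 3.0, None, None, 6.0]
print(linear_interpolate(readings))
print(mean_fill(readings))
print(drop_missing([[1.0, 2.0], [None, 3.0]]))
```

Removal is the simplest choice but discards whole records, whereas filling keeps every wafer at the cost of introducing estimated values.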
The data normalization uses methods such as Min-Max normalization, mean-variance normalization, and batch normalization.
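Min-Max and mean-variance normalization can be sketched per test parameter as follows. The function names and the sample values are hypothetical, and batch normalization, which is applied inside a neural network during training, is not shown.

```python
import statistics


def min_max_normalize(values):
    """Min-Max normalization: rescale values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # constant column carries no information
    return [(v - low) / (high - low) for v in values]


def standardize(values):
    """Mean-variance normalization: shift to zero mean and scale to unit variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical sheet-resistance readings in ohms per square.
sheet_resistance = [10.0, 20.0, 30.0]
print(min_max_normalize(sheet_resistance))
print(standardize(sheet_resistance))
```

Either transform removes the unit of the original measurement, so parameters recorded in volts, ohms, or farads become comparable inputs to the model.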
When the data is merged and divided in step 3, the ratio of the training set, test set, and validation set is set to 7:2:1.
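A 7:2:1 division can be sketched as a seeded shuffle followed by slicing. The function name, the fixed `seed`, and the use of integer wafer identifiers as rows are assumptions made for illustration.

```python
import random


def split_dataset(rows, ratios=(0.7, 0.2, 0.1), seed=0):
    """Shuffle rows and divide them into (training, test, validation) sets."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = round(len(shuffled) * ratios[0])
    n_test = round(len(shuffled) * ratios[1])
    training = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # remainder goes to validation
    return training, test, validation


# Hypothetical data set of 100 merged WAT/CP records, identified by index.
training, test, validation = split_dataset(range(100))
print(len(training), len(test), len(validation))
```

Assigning the remainder to the validation set guarantees that every record lands in exactly one of the three sets even when the sizes do not divide evenly.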
The sample enhancement in step 4 uses a generative adversarial network (GAN).
In step 4, the distribution of wafer yield generally differs across the development stages of a product in the semiconductor factory. During the development stage, high-yield wafers are generally far fewer than low-yield wafers, whereas during mature mass production high-yield wafers far outnumber low-yield wafers. For these different yield distributions, data enhancement is needed to improve the reliability of the subsequent machine learning or deep learning model.
As shown in FIG. 2, the sample enhancement method feeds the minority-class samples into the model for learning and generates additional minority-class samples until their number matches that of the majority-class samples. Typically, the generative adversarial network comprises a generator, which generates minority-class samples, and a discriminator, which distinguishes the generated minority-class samples from the original samples. A balanced data set is obtained through adversarial training.
The steps of the generative adversarial network are as follows: noise is first applied to the minority-class samples and input to the generator; the discriminator is then trained to distinguish the generated minority-class samples from the original samples; with a suitable loss function, training continues until the discriminator can no longer tell the generated minority-class samples from the original ones; finally, a balanced data set is generated.
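The source does not specify the loss function. As an illustration only, the adversarial training described above is commonly formalized with the standard GAN minimax objective, in which G is the generator, D is the discriminator, p_min is the distribution of the original minority-class samples, and p_z is the noise distribution. These symbols are assumptions introduced here and do not appear in the source.

```latex
\min_{G} \max_{D} \; V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{min}}(x)} \bigl[ \log D(x) \bigr]
  + \mathbb{E}_{z \sim p_{z}(z)} \bigl[ \log \bigl( 1 - D(G(z)) \bigr) \bigr]
```

Under this objective, the stopping condition "the discriminator cannot distinguish generated from original samples" corresponds to D outputting 1/2 for every input, which is the optimum reached when the generated distribution matches p_min.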
The feature screening in step 5 uses an algorithm that combines the Boruta algorithm with a genetic algorithm. The Boruta algorithm first screens out the set of feature variables most relevant to the dependent variable, and the genetic algorithm then selects from them the set of feature variables that gives the best model performance. The feature variables include parameters such as the resistance, capacitance, and inductance of the transistors.
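The first screening stage can be illustrated with a simplified shadow-feature test in the spirit of Boruta. This sketch is not the Boruta algorithm itself: it swaps the random forest importance for the absolute Pearson correlation, replaces the statistical test on hit counts with the requirement to win every round, and omits the genetic algorithm stage entirely. The function names and the synthetic `vth`, `noise_param`, and `wafer_yield` data are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def shadow_screen(columns, target, rounds=20, seed=0):
    """Keep features that beat the best shuffled ("shadow") feature in every round."""
    rng = random.Random(seed)
    real = {name: abs(pearson(col, target)) for name, col in columns.items()}
    hits = {name: 0 for name in columns}
    for _ in range(rounds):
        best_shadow = 0.0
        for col in columns.values():
            shadow = list(col)
            rng.shuffle(shadow)  # shuffling destroys any link to the target
            best_shadow = max(best_shadow, abs(pearson(shadow, target)))
        for name, score in real.items():
            if score > best_shadow:
                hits[name] += 1
    return [name for name, count in hits.items() if count == rounds]


# Synthetic data: yield depends on vth only; noise_param is unrelated.
data_rng = random.Random(1)
vth = [data_rng.gauss(0.7, 0.02) for _ in range(200)]
noise_param = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
wafer_yield = [0.9 - 5.0 * (v - 0.7) + data_rng.gauss(0.0, 0.005) for v in vth]

selected = shadow_screen({"vth": vth, "noise_param": noise_param}, wafer_yield)
print(selected)
```

The shadow features give an empirical estimate of how large an importance score can get by chance, so a real feature is kept only when it clearly exceeds that level.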
In step 6, the wafer yield prediction modeling uses a CatBoost model, a random forest model, a decision tree model, an XGBoost model, and a support vector machine model.
The wafer yield prediction model predicts the wafer yield, failed test items, failure categories, and the like.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered within the scope of this description.
The above embodiments merely represent implementations of the invention. They are described in some detail but are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art could make several variations and insubstantial modifications without departing from the spirit of the invention, and these would still fall within the scope of the invention. Accordingly, the scope of the invention is defined by the appended claims.

Claims (10)

1. A method for predicting and improving the yield of a semiconductor factory based on big data, characterized by comprising the following steps:
Step 1: Collect wafer acceptance test (WAT) data and wafer probe test (CP) data in a semiconductor factory, form a data set as raw data, and store the raw data in a storage system;
Step 2: Preprocess the collected WAT data and CP data, where the preprocessing comprises outlier processing, missing value processing, and data normalization;
Step 3: Merge the WAT data and the CP data, and divide the data set into a training set, a test set, and a validation set;
Step 4: Perform sample enhancement on the training set of the WAT data to generate a balanced sample set;
Step 5: Perform dimension reduction on the generated balanced sample set, using feature variable screening to select the parameter set with the best predictive performance;
Step 6: Build a wafer yield prediction model on the processed WAT data and CP data;
Step 7: Calculate the importance of each feature in the wafer yield prediction model, output the model importance of each feature, and sort the features in descending order of importance;
Step 8: Calculate the SHAP value of each feature, and output the SHAP values and analysis charts of each feature;
Step 9: Perform root cause analysis of yield loss according to the importance ranking of the features, the SHAP values, and the analysis charts.
2. The method for predicting and improving the yield of a semiconductor factory based on big data according to claim 1, wherein the WAT data collected in step 1 includes the resistance, capacitance, and inductance of the transistors, the threshold voltage, saturation current, and subthreshold current, and the capacitance, resistance, and inductance of the metal interconnect layers.
3. The method for predicting and improving the yield of a semiconductor factory based on big data according to claim 1, wherein the CP data collected in step 1 includes electrical test data for each die on the wafer and yield data for the wafer.
4. The method for predicting and improving the yield of a semiconductor factory based on big data according to claim 1, wherein the preprocessing in step 2 comprises the following steps:
Step 21: Perform outlier processing on the WAT data and CP data to remove anomalies arising during testing;
Step 22: Perform missing value processing on the outlier-processed WAT data and CP data to handle values that were not saved because of errors during testing;
Step 23: Normalize the WAT data and CP data after missing value processing to eliminate inconsistencies in the units and scales of the test parameters.
5. The method for predicting and improving the yield of a semiconductor factory based on big data according to claim 4, wherein the outlier processing uses the box plot method, the Z-score method, and the mean square error analysis method.
6. The method for predicting and improving the yield of a semiconductor factory based on big data according to claim 4, wherein the missing value processing uses missing value removal, interpolation, and mean filling.
7. The method for predicting and improving the yield of a semiconductor factory based on big data according to claim 4, wherein the data normalization uses Min-Max normalization, mean-variance normalization, and batch normalization.
8. The method for predicting and improving the yield of a semiconductor factory based on big data according to claim 1, wherein the sample enhancement in step 4 uses a generative adversarial network (GAN).
9. The method for predicting and improving the yield of a semiconductor factory based on big data according to claim 1, wherein the feature screening in step 5 uses an algorithm that combines the Boruta algorithm with a genetic algorithm.
10. The method for predicting and improving the yield of a semiconductor factory based on big data according to claim 1, wherein the wafer yield prediction modeling in step 6 uses a CatBoost model, a random forest model, a decision tree model, an XGBoost model, and a support vector machine model.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202311313568.4A | 2023-10-11 | 2023-10-11 | Method for predicting and improving yield of semiconductor factory based on big data

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202311313568.4A | 2023-10-11 | 2023-10-11 | Method for predicting and improving yield of semiconductor factory based on big data

Publications (1)

Publication Number | Publication Date
CN117520741A | 2024-02-06

Family

ID=89746472

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202311313568.4A, Pending, CN117520741A (en) | Method for predicting and improving yield of semiconductor factory based on big data | 2023-10-11 | 2023-10-11

Country Status (1)

Country | Link
CN (1) | CN117520741A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination