CN117113291A - Analysis method for importance of production parameters in semiconductor manufacturing - Google Patents

Analysis method for importance of production parameters in semiconductor manufacturing

Info

Publication number
CN117113291A
Authority
CN
China
Prior art keywords
production
importance
model
parameters
production parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311371783.XA
Other languages
Chinese (zh)
Other versions
CN117113291B (en)
Inventor
陈双武
李晨杰
李江明
金东�
杨坚
谢箭
陶煜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei Zheta Technology Co ltd
Institute of Artificial Intelligence of Hefei Comprehensive National Science Center
Original Assignee
Hefei Zheta Technology Co ltd
Institute of Artificial Intelligence of Hefei Comprehensive National Science Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei Zheta Technology Co ltd, Institute of Artificial Intelligence of Hefei Comprehensive National Science Center
Priority to CN202311371783.XA
Publication of CN117113291A
Application granted
Publication of CN117113291B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/27 Regression, e.g. linear or logistic regression
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/04 Manufacturing
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Abstract

The application relates to the field of semiconductor manufacturing, and in particular to a method for analyzing the importance of production parameters in semiconductor manufacturing. The method divides a data set into several groups of training sets and corresponding test sets, and uses them to train and verify a decision tree regression model in a rotation verification manner, obtaining a plurality of rotation verification models. Using these rotation verification models, an important production parameter set is selected from three angles: frequency feature importance, arrangement feature importance and constant replacement feature importance. A plurality of general regression models are then fitted using the important production parameter set, the general regression model with the minimum mean square error of the prediction result is selected, and the production parameters used to train that model are taken as the production parameters of highest importance in semiconductor manufacturing. The application captures the interrelationships among production parameters more comprehensively, improves the generalization ability of the model, and adapts better to the various complex conditions of the production process.

Description

Analysis method for importance of production parameters in semiconductor manufacturing
Technical Field
The application relates to the field of semiconductor manufacturing, in particular to a method for analyzing importance of production parameters in semiconductor manufacturing.
Background
As global demand for semiconductors grows, semiconductor manufacturing becomes increasingly important, and improving the yield of semiconductor production has become critical. However, because the semiconductor production process is precise and complicated, every step must be strictly controlled to ensure the final yield.
Taking wafer manufacturing as an example, the process is very complex: multiple chambers must cooperate closely, and the wafer is produced according to strict manufacturing steps. During wafer production, various sensors are installed in each chamber to monitor environmental and equipment state parameters such as temperature, humidity and pressure, and accurate control of these production parameters is essential to the yield of wafer production. To improve the yield and reduce production cost, the key production parameters that strongly influence the yield must be located quickly so that the production flow can be optimized.
In recent years, engineers have widely adopted analysis methods based on mathematical modeling and simulation to analyze the importance of production parameters in semiconductor manufacturing and to improve the accuracy of that analysis. These methods rely on complex mathematical modeling combined with statistical principles, so that production parameters can be modeled, simulated and analyzed in depth, the various conditions of the production process can be simulated, and the influence of different production parameters on the final yield can be evaluated.
However, analysis methods based on mathematical modeling and simulation generally model and simulate specific production parameters in depth, whereas the production parameters involved in semiconductor production are numerous and interrelated, and a change in one production parameter may affect the behavior of others. If the model focuses too narrowly on a particular production parameter, the influence of the other production parameters may be ignored, so that the model generalizes poorly and cannot adapt to the various complex conditions of the production process.
Disclosure of Invention
In order to solve the above problems, the present application provides a method for analyzing the importance of production parameters in semiconductor manufacturing.
The method locates the several production parameters of highest importance in the semiconductor manufacturing process, and comprises the following steps:
Step one, preprocessing the historical semiconductor production data and performing feature selection to obtain a production data set $D$ consisting of $N$ samples, where each sample corresponds to all production parameters of one semiconductor product during production together with the quality detection result of that product, and the production parameters of each sample form an $M$-dimensional vector, each dimension corresponding to one category of production parameter;
Step two, dividing the production data set $D$ into $K$ training sets and corresponding test sets, selecting a decision tree regression model, and training the selected decision tree regression model on each of the $K$ training sets to obtain $K$ rotation verification models; calculating the mean square error of the prediction result of each rotation verification model using the test set corresponding to its training set, where the mean square error of the prediction result of the $k$-th rotation verification model $f_k$ is $e_k$, $k = 1, \dots, K$;
Step three, using the $K$ rotation verification models to calculate the frequency feature importance of each production parameter, obtaining a frequency feature importance ranking queue $Q_F$ of length $M$;
Step four, using the $K$ rotation verification models to calculate the arrangement feature importance of each production parameter, obtaining an arrangement feature importance ranking queue $Q_P$ of length $M$;
Step five, using the $K$ rotation verification models to calculate the constant replacement feature importance of each production parameter, obtaining a constant replacement feature importance ranking queue $Q_C$ of length $M$;
Step six, selecting the $n$ most important production parameters from each of the frequency feature importance ranking queue $Q_F$, the arrangement feature importance ranking queue $Q_P$ and the constant replacement feature importance ranking queue $Q_C$, forming a production parameter set of length $3n$, and de-duplicating it to obtain the important production parameter set $S$;
Step seven, extracting from the production data set $D$ the data corresponding to the production parameters in the important production parameter set $S$ as fitting data, and preparing a fitting test set based on the production data set $D$; selecting a general regression model, dividing the fitting data into a plurality of fitting subsets, and training the general regression model on each fitting subset to obtain a plurality of importance pre-estimation models; calculating the mean square error of the prediction result of each importance pre-estimation model on the fitting test set, and taking all production parameters included in the fitting subset used by the importance pre-estimation model with the minimum mean square error as the production parameters of highest importance in semiconductor manufacturing.
Further, the second step specifically includes:
Step two A, dividing the production data set $D$ into $K$ subsets such that the difference in data distribution between subsets is smaller than a threshold;
Step two B, taking each subset in turn as a test set, removing that subset from the $K$ subsets, and combining the remaining $K-1$ subsets as the training set corresponding to that subset;
Step two C, selecting a decision tree regression model;
Step two D, training the decision tree regression model on the $k$-th training set to obtain the $k$-th rotation verification model $f_k$, $k = 1, \dots, K$;
Step two E, calculating, on the test set corresponding to the $k$-th training set, the mean square error $e_k$ of the prediction result of the $k$-th rotation verification model $f_k$:
$$e_k = \frac{1}{|T_k|} \sum_{(x_i, y_i) \in T_k} \left( y_i - f_k(x_i) \right)^2$$
where $x_i$ denotes the production parameters of the $i$-th semiconductor product during production, $y_i$ denotes the quality detection result label of the $i$-th semiconductor product, $(x_i, y_i)$ denotes the $i$-th sample, $T_k$ denotes the test set corresponding to the $k$-th training set, and $f_k(x_i)$ denotes the quality detection result predicted by the $k$-th rotation verification model for the input production parameters;
Step two F, training and verifying the decision tree regression model with each of the $K$ training sets and its corresponding test set to obtain the $K$ rotation verification models and the mean square error of the prediction result of each rotation verification model.
Further, the decision tree regression model in step two C specifically refers to a lightweight gradient boosting tree model (LightGBM).
Further, the third step specifically includes:
The frequency feature importance $F_i$ of the $i$-th production parameter is computed from the following quantities:
$x_i$ denotes the $i$-th production parameter, $x_j$ denotes the $j$-th production parameter, $c_k(\cdot)$ denotes the number of times the production parameter in brackets is used during the construction of the $k$-th rotation verification model, and $\exp(\cdot)$ denotes the natural exponential function;
the $K$ rotation verification models are used to calculate the frequency feature importance of each production parameter, obtaining a frequency feature importance ranking queue $Q_F$ of length $M$.
Further, the fourth step specifically includes:
The arrangement feature importance $P_i$ of the $i$-th production parameter is computed from the following quantities:
$\exp(\cdot)$ denotes the natural exponential function; $\tilde{x}^{(i)}$ denotes the samples after the order of the feature corresponding to the $i$-th production parameter has been shuffled, and $\tilde{x}^{(j)}$ denotes the samples after the order of the feature corresponding to the $j$-th production parameter has been shuffled; $\mathrm{MSE}(\cdot)$ denotes the mean square error calculation; $\mathrm{MSE}(f_k(\tilde{x}^{(i)}))$ denotes the mean square error of the quality detection results predicted by the $k$-th rotation verification model $f_k$ after the order of the feature corresponding to the $i$-th production parameter has been shuffled, and $\mathrm{MSE}(f_k(\tilde{x}^{(j)}))$ denotes the corresponding mean square error for the $j$-th production parameter;
the $K$ rotation verification models are used to calculate the arrangement feature importance of each production parameter, obtaining an arrangement feature importance ranking queue $Q_P$ of length $M$.
Further, the fifth step specifically includes:
The constant replacement feature importance $C_i$ of the $i$-th production parameter is computed from the following quantities:
$\hat{e}_{k,i}$ denotes the mean square error of the prediction result of the $k$-th rotation verification model after the feature value corresponding to the $i$-th production parameter is replaced by a constant, and $\hat{e}_{k,j}$ denotes the mean square error of the prediction result of the $k$-th rotation verification model after the feature value corresponding to the $j$-th production parameter is replaced by a constant;
the $K$ rotation verification models are used to calculate the constant replacement feature importance of each production parameter, obtaining a constant replacement feature importance ranking queue $Q_C$ of length $M$.
Further, the seventh step specifically includes:
Step seven A, defining the number of production parameters in the important production parameter set $S$ as $m$; extracting from the production data set $D$ the data corresponding to the production parameters in $S$ as fitting data; sampling the production data set $D$ with the bootstrap (self-help) method to construct a fitting test set;
Step seven B, selecting from the fitting data, each time, the data corresponding to $r$ production parameters as a fitting subset, obtaining $R$ fitting subsets in total, where $C_m^r$ denotes the number of all possible ways of selecting $r$ production parameters from $m$;
Step seven C, selecting a general regression model;
Step seven D, training the general regression model on each of the $R$ fitting subsets to obtain $R$ importance pre-estimation models;
Step seven E, calculating the mean square error of the prediction result of each of the $R$ importance pre-estimation models on the fitting test set, and selecting the importance pre-estimation model with the minimum mean square error as the optimal importance pre-estimation model;
Step seven F, taking all production parameters included in the fitting subset used to train the optimal importance pre-estimation model as the production parameters of highest importance in semiconductor manufacturing.
Further, the general regression model in step seven C specifically refers to a lightweight gradient boosting tree model (LightGBM).
One or more technical solutions provided in the embodiments of the present application at least have the following technical effects or advantages:
The application does not focus on single production parameters only. By jointly analyzing frequency feature importance, arrangement feature importance and constant replacement feature importance, it lets the model learn the production parameters of the semiconductor production process from several angles, which captures the interrelationships among production parameters more comprehensively, improves the generalization ability of the model, and adapts better to the various complex conditions of the production process.
Drawings
FIG. 1 is a flow chart of a method for analyzing importance of production parameters in semiconductor manufacturing according to an embodiment of the present application.
Detailed Description
The present application is described in detail below with reference to the drawings and specific embodiments. Before the technical solutions of the embodiments are described in detail, the terms involved are explained. In this specification, components with the same names or the same reference numerals represent similar or identical structures and are defined for illustrative purposes only.
The application collects the original production parameters, together with the quality detection results obtained by testing the corresponding semiconductor products, from the semiconductor production engineering database, preprocesses them to generate a data set, and divides the data set into several groups of training sets and corresponding test sets. A decision tree regression model is selected and is trained and verified on these training sets and test sets in a rotation verification manner, yielding a plurality of rotation verification models and improving their robustness and generalization ability.
The plurality of rotation verification models are used to evaluate the importance of the production parameters from three angles: frequency feature importance, arrangement feature importance and constant replacement feature importance. An important production parameter set is selected by combining the importance evaluated from the three angles. A plurality of general regression models are then fitted again using the important production parameter set, the general regression model with the minimum mean square error of the prediction result is selected, and the production parameters used to train that model are taken as the production parameters of highest importance in semiconductor manufacturing.
The steps of the present application are shown in fig. 1. The application is explained below using wafer production as an example, and the detailed steps of this example are as follows:
1. Training data preparation
The wafer production process is complex and long. The production parameters of the process and the quality detection results of the produced wafers are both recorded in an engineering database. The production parameters read from the engineering database are taken as the original data, and the quality detection result of the wafer corresponding to those production parameters is taken as the label of the original data.
The original data collected from the engineering database often contain noisy, missing and inconsistent records. The original data are therefore preprocessed, including filling missing values, de-duplication and noise reduction, to improve data quality and support efficient data mining.
The original data is subjected to data preprocessing to obtain preprocessed data.
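The preprocessing described above can be illustrated with a minimal sketch. The patent does not specify how missing values are filled or how duplicates are detected, so filling with the column mean and dropping exact duplicate rows are assumptions made only for illustration; the function name `preprocess` and the toy rows are hypothetical, and noise reduction is not shown.

```python
from statistics import mean

def preprocess(rows):
    """Fill missing values (None) with the column mean, then drop exact duplicate rows."""
    n_cols = len(rows[0])
    # Column means computed from the observed (non-missing) values only.
    col_means = [mean(r[c] for r in rows if r[c] is not None) for c in range(n_cols)]
    filled = [
        tuple(col_means[c] if r[c] is None else r[c] for c in range(n_cols))
        for r in rows
    ]
    # Keep the first occurrence of each row, preserving the original order.
    seen, unique = set(), []
    for r in filled:
        if r not in seen:
            seen.add(r)
            unique.append(r)
    return unique

# Hypothetical records: (temperature, pressure, quality label).
raw = [
    (350.0, 1.2, 0.91),
    (350.0, 1.2, 0.91),  # exact duplicate of the first record
    (None, 1.4, 0.88),   # missing temperature reading
    (360.0, 1.0, 0.95),
]
clean = preprocess(raw)
```

The order (fill first, then de-duplicate) follows the order in which the text lists the operations; a different order would change the imputed value.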
2. Feature selection
The feature selection improves the performance and accuracy of the model by removing redundant and irrelevant features, reduces the complexity of the model and improves the running speed of the model. The present application uses pearson correlation coefficients for feature selection.
After feature selection, a production data set $D$ consisting of $N$ samples is obtained. Each sample represents all production parameters of one wafer during production together with the quality detection result of that wafer. The production data set $D$ is expressed as:
$$D = \{(x_i, y_i)\}_{i=1}^{N}$$
where $x_i$ denotes the production parameters of the $i$-th wafer during production and is an $M$-dimensional vector, each dimension corresponding to one category of production parameter; $y_i$ denotes the quality detection result label of the $i$-th wafer.
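As an illustration of feature selection with Pearson correlation coefficients, the sketch below keeps the features whose absolute correlation with the quality label reaches a threshold. The text does not say whether correlation is measured against the label or between features, nor what threshold is used, so both choices here are assumptions; `pearson`, `select_features` and the toy data are hypothetical names and values.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear information
    return cov / (sx * sy)

def select_features(X, y, threshold):
    """Return the indices of features whose |correlation with y| is at least threshold."""
    kept = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        if abs(pearson(column, y)) >= threshold:
            kept.append(j)
    return kept

# Toy data: feature 0 tracks the label, feature 1 is constant, feature 2 is weakly related.
X = [[1.0, 5.0, 7.0], [2.0, 5.0, 3.0], [3.0, 5.0, 8.0], [4.0, 5.0, 1.0]]
y = [2.0, 4.0, 6.0, 8.0]
selected = select_features(X, y, threshold=0.6)
```

Pearson correlation only captures linear relationships, so a feature with a strong nonlinear effect on the label could be discarded by a filter of this kind.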
3. Modeling based on the lightweight gradient boosting tree model
A decision tree regression model is selected to learn the relationship between the production parameters and the quality detection results; this embodiment uses a lightweight gradient boosting tree model (LightGBM) to explain the application. The production data set $D$ is divided into several parts; each time, one part is selected as the test set and the rest as the training set, and the decision tree regression model is trained and verified on each training set and test set to improve its robustness and generalization ability.
The production data set $D$ is divided into $K$ subsets such that the difference in data distribution between subsets is smaller than a threshold. Each subset is taken in turn as a test set; that subset is removed from the $K$ subsets and the remaining $K-1$ subsets are combined as the training set corresponding to it, which yields $K$ training sets and corresponding test sets. LightGBM is trained on the $k$-th training set to obtain the rotation verification model $f_k$, $k = 1, \dots, K$. Training sets, test sets and rotation verification models are in one-to-one correspondence, and $k$ is both the index of the training set or test set and the index of the rotation verification model. The prediction accuracy of each rotation verification model is evaluated on its corresponding test set, giving the mean square errors of the prediction results of the $K$ rotation verification models. The mean square error $e_k$ of the prediction result of the $k$-th rotation verification model $f_k$ is:
$$e_k = \frac{1}{|T_k|} \sum_{(x_i, y_i) \in T_k} \left( y_i - f_k(x_i) \right)^2$$
where $T_k$ denotes the test set corresponding to the $k$-th training set, $f_k(x_i)$ denotes the quality detection result predicted by the $k$-th rotation verification model for the input wafer production parameters, and $(x_i, y_i)$ denotes the $i$-th sample.
Considering that the hyper-parameters of the rotation verification model have a great influence on the fitting effect of the rotation verification model, the hyper-parameters of the rotation verification model are set by using the grid search parameter adjustment method, so that the prediction accuracy of the rotation verification model is improved.
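The rotation verification procedure of this section (split into subsets, hold each one out in turn, train on the rest, score on the held-out part) can be sketched as follows. To stay self-contained the sketch uses a trivial `MeanRegressor` as a stand-in for LightGBM, and it splits by seeded random shuffling, which does not by itself enforce the distribution-difference threshold mentioned in the text; all names here are hypothetical.

```python
import random

def split_into_folds(n_samples, k, seed=0):
    """Shuffle the sample indices and deal them into k disjoint subsets."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def mse(y_true, y_pred):
    """Mean square error between labels and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

class MeanRegressor:
    """Stand-in for the decision tree regression model: predicts the training mean."""
    def fit(self, X, y):
        self.value = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.value for _ in X]

def rotation_verification(X, y, k, make_model=MeanRegressor):
    """Train one model per held-out subset and record its test mean square error."""
    folds = split_into_folds(len(X), k)
    models, errors = [], []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = make_model().fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        predictions = model.predict([X[i] for i in test_idx])
        errors.append(mse([y[i] for i in test_idx], predictions))
        models.append(model)
    return models, errors, folds

X = [[float(i)] for i in range(9)]
y = [2.0 * i for i in range(9)]
models, errors, folds = rotation_verification(X, y, k=3)
```

In practice `make_model` would construct the chosen tree model with hyper-parameters found by grid search; any object with `fit` and `predict` methods of this shape fits the sketch.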
4. Production parameter importance analysis
In order to comprehensively evaluate the importance of the production parameters, the application defines the importance of the frequency characteristics, the importance of the arrangement characteristics and the importance of the constant substitution characteristics, and evaluates the importance of the production parameters from three angles.
4.1 Frequency feature importance
Frequency feature importance measures how frequently the feature corresponding to a production parameter is used during the construction of the rotation verification models; the more frequently a feature is used, the greater its contribution to the fit of the rotation verification model. The $K$ rotation verification models are used to calculate the frequency feature importance of each production parameter. The frequency feature importance $F_i$ of the $i$-th production parameter is computed from the following quantities:
$x_i$ denotes the $i$-th production parameter, $x_j$ denotes the $j$-th production parameter, $c_k(\cdot)$ denotes the number of times the production parameter in brackets is used during the construction of the $k$-th rotation verification model, and $\exp(\cdot)$ denotes the natural exponential function.
The larger the calculated frequency feature importance of a production parameter, the greater the contribution of the corresponding feature to the fit of the rotation verification model.
The $K$ rotation verification models are used to calculate the frequency feature importance of each production parameter, obtaining a frequency feature importance ranking queue $Q_F$ of length $M$.
4.2 Arrangement feature importance
Arrangement feature importance shuffles the order of the feature corresponding to a single production parameter and measures the importance of that production parameter by the resulting change in the mean square error of the prediction result of the rotation verification model. The arrangement feature importance $P_i$ of the $i$-th production parameter is computed from the following quantities:
$\exp(\cdot)$ denotes the natural exponential function; $\tilde{x}^{(i)}$ denotes the samples after the order of the feature corresponding to the $i$-th production parameter has been shuffled, and $\tilde{x}^{(j)}$ denotes the samples after the order of the feature corresponding to the $j$-th production parameter has been shuffled; $\mathrm{MSE}(\cdot)$ denotes the mean square error calculation; $\mathrm{MSE}(f_k(\tilde{x}^{(i)}))$ denotes the mean square error of the quality detection results predicted by the $k$-th rotation verification model $f_k$ after the order of the feature corresponding to the $i$-th production parameter has been shuffled, and $\mathrm{MSE}(f_k(\tilde{x}^{(j)}))$ denotes the corresponding mean square error for the $j$-th production parameter.
If the arrangement feature importance calculated for the feature corresponding to a production parameter is positive, that feature improves the fit of the rotation verification model; if it is 0, the feature does not contribute to the fit; if it is negative, the feature degrades the fit.
The $K$ rotation verification models are used to calculate the arrangement feature importance of each production parameter, obtaining an arrangement feature importance ranking queue $Q_P$ of length $M$.
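The shuffling idea behind arrangement feature importance (commonly called permutation importance) can be sketched as below. The sketch reports the raw increase in mean square error after shuffling one feature column, averaged over the supplied models; it does not reproduce the exponential-based expression of the patent, which is not legible in this text. The models are plain callables, and `toy_model`, the seed and the data are hypothetical.

```python
import random

def mse(y_true, y_pred):
    """Mean square error between labels and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def arrangement_importance(models, X, y, feature, seed=0):
    """Average increase in MSE over the models after shuffling one feature column."""
    column = [row[feature] for row in X]
    random.Random(seed).shuffle(column)
    # Rebuild every row with the shuffled value in place of the original one.
    X_shuffled = [row[:feature] + [v] + row[feature + 1:] for row, v in zip(X, column)]
    changes = []
    for predict in models:
        baseline = mse(y, [predict(row) for row in X])
        shuffled = mse(y, [predict(row) for row in X_shuffled])
        changes.append(shuffled - baseline)
    return sum(changes) / len(changes)

def toy_model(row):
    """Hypothetical fitted model that depends only on feature 0."""
    return 2.0 * row[0]

X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0], [6.0, 60.0]]
y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
scores = [arrangement_importance([toy_model], X, y, j) for j in range(2)]
```

Because `toy_model` ignores feature 1, shuffling that column leaves its predictions unchanged and the score for feature 1 is exactly zero, matching the "no contribution" case described above.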
4.3 Constant replacement feature importance
Constant replacement feature importance replaces the feature value corresponding to a single production parameter with a constant and measures the importance of that production parameter by the resulting change in the mean square error of the prediction result of the rotation verification model. If the change in mean square error after the replacement does not exceed a threshold, the feature corresponding to the production parameter has little influence on the fit of the rotation verification model, that is, the importance of the production parameter is low; if the change exceeds the threshold, the feature has a large influence on the fit, that is, the importance of the production parameter is high. The constant replacement feature importance $C_i$ of the $i$-th production parameter is computed from the following quantities:
$\hat{e}_{k,i}$ denotes the mean square error of the prediction result of the $k$-th rotation verification model after the feature value corresponding to the $i$-th production parameter is replaced by a constant, and $\hat{e}_{k,j}$ denotes the mean square error of the prediction result of the $k$-th rotation verification model after the feature value corresponding to the $j$-th production parameter is replaced by a constant.
The larger the calculated constant replacement feature importance of a production parameter, the greater the contribution of the corresponding feature to the fit of the rotation verification model.
The $K$ rotation verification models are used to calculate the constant replacement feature importance of each production parameter, obtaining a constant replacement feature importance ranking queue $Q_C$ of length $M$.
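A minimal sketch of the constant replacement idea follows. It reports the raw change in mean square error after one feature column is overwritten with a constant, averaged over the supplied models. The text does not state which constant is used, so the column mean is an assumption, and the normalized expression of the patent is not reproduced; `toy_model` and the data are hypothetical.

```python
def mse(y_true, y_pred):
    """Mean square error between labels and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def constant_replacement_importance(models, X, y, feature, constant=None):
    """Average change in MSE over the models after replacing one feature by a constant."""
    if constant is None:
        # Assumption: use the column mean when no constant is supplied.
        constant = sum(row[feature] for row in X) / len(X)
    X_const = [row[:feature] + [constant] + row[feature + 1:] for row in X]
    changes = []
    for predict in models:
        baseline = mse(y, [predict(row) for row in X])
        replaced = mse(y, [predict(row) for row in X_const])
        changes.append(replaced - baseline)
    return sum(changes) / len(changes)

def toy_model(row):
    """Hypothetical fitted model that depends only on feature 0."""
    return 2.0 * row[0]

X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
y = [2.0, 4.0, 6.0, 8.0]
scores = [constant_replacement_importance([toy_model], X, y, j) for j in range(2)]
# Ranking queue: feature indices ordered from most to least important.
queue = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
```

Unlike shuffling, constant replacement is deterministic for a given constant, so repeated runs give the same ranking.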
4.4 Multivariate importance analysis
From each of the frequency feature importance ranking queue $Q_F$, the arrangement feature importance ranking queue $Q_P$ and the constant replacement feature importance ranking queue $Q_C$, the $n$ most important production parameters are selected, forming a production parameter set of length $3n$. This set is de-duplicated to obtain the important production parameter set $S$.
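Merging the three ranking queues can be sketched in a few lines, assuming each queue is a list of parameter names ordered from most to least important. The function name and the example parameter names are hypothetical.

```python
def important_parameter_set(queues, n):
    """Take the top n entries of every ranking queue and drop repeated parameters."""
    merged = []
    for queue in queues:
        merged.extend(queue[:n])
    # dict.fromkeys keeps the first occurrence of each parameter, in order.
    return list(dict.fromkeys(merged))

frequency_queue = ["temperature", "pressure", "humidity", "flow"]
arrangement_queue = ["pressure", "flow", "temperature", "humidity"]
constant_queue = ["temperature", "flow", "pressure", "humidity"]
important = important_parameter_set(
    [frequency_queue, arrangement_queue, constant_queue], n=2
)
```

With three queues and `n=2` the merged list has six entries before de-duplication, so the resulting set can contain anywhere from two to six parameters.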
5. Predicting production parameter importance
Defining important production parameter setsThe number of production parameters in (a) is->From the production data set->Extract important production parameter set->The data corresponding to the production parameters in (a) are used as fitting data, and the production data set is +.>Self-service sampling was used to construct a fitting test set. Every time select +.>The data corresponding to the individual production parameters are used as a fitting subset,is co-obtained->Fitting subset->Representing from->Selecting->The number of all possible ways of producing the parameters.
A general regression model was selected, and in this embodiment, a LightGBM was selected as the general regression model to specifically explain the present application. By means ofTraining the general regression model by using the fitting subsets to obtain +.>An importance prediction model, calculating +.>And selecting the importance pre-estimation model with the minimum mean square error of the prediction result as the optimal importance pre-estimation model. And taking all production parameters included in the fitting subset used by the optimal importance prediction model training as the production parameters with the highest importance in the wafer manufacturing.
In this embodiment, the hyperparameters of the importance prediction model are set by grid search.
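Grid search simply evaluates every combination of candidate hyperparameter values and keeps the best one. The sketch below is generic: the grid values are hypothetical, `num_leaves` and `learning_rate` are used only as familiar LightGBM parameter names, and `toy_score` stands in for the validation mean square error of a model trained with the given settings.

```python
import itertools

def grid_search(param_grid, score_fn):
    """Evaluate score_fn on every combination in param_grid.
    Returns (best_params, best_score), where a lower score is better."""
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical grid of candidate values.
param_grid = {"num_leaves": [15, 31, 63], "learning_rate": [0.01, 0.05, 0.1]}

def toy_score(params):
    """Toy objective standing in for a validation mean square error."""
    return (params["num_leaves"] - 31) ** 2 + (params["learning_rate"] - 0.05) ** 2

best_params, best_score = grid_search(param_grid, toy_score)
```

In practice `score_fn` would train the importance prediction model with `params` and return its error on held-out data.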
The above embodiments are merely illustrative of the preferred embodiments of the present application and are not intended to limit the scope of the present application, and various modifications and improvements made by those skilled in the art to the technical solution of the present application should fall within the protection scope defined by the claims of the present application without departing from the design spirit of the present application.

Claims (8)

1. A method for analyzing the importance of production parameters in semiconductor manufacturing, used to locate the several production parameters of highest importance in the semiconductor manufacturing production process, characterized by comprising the following steps:
step one, preprocessing the historical semiconductor production data and performing feature selection to obtain a production data set consisting of a number of samples, wherein each sample corresponds to all production parameters of one semiconductor product during production and the quality inspection result of that semiconductor product, and each sample is a vector in which each dimension corresponds to one category of production parameter;
step two, dividing the production data set into several training sets and corresponding test sets, selecting a decision tree regression model, training the selected decision tree regression model on each training set to obtain one rotation verification model per training set, and, for each rotation verification model, calculating the mean square error of its predictions using the test set corresponding to its training set;
step three, using the rotation verification models to calculate the frequency feature importance of each production parameter, obtaining a frequency feature importance ranking queue;
step four, using the rotation verification models to calculate the permutation feature importance of each production parameter, obtaining a permutation feature importance ranking queue;
step five, using the rotation verification models to calculate the constant substitution feature importance of each production parameter, obtaining a constant substitution feature importance ranking queue;
step six, selecting the most important production parameters from each of the frequency feature importance ranking queue, the permutation feature importance ranking queue, and the constant substitution feature importance ranking queue, combining them into one production parameter set, and then performing de-duplication to obtain the important production parameter set;
step seven, extracting from the production data set the data corresponding to the production parameters in the important production parameter set as fitting data, and preparing a fitting test set based on the production data set; selecting a general regression model, dividing the fitting data into a plurality of fitting subsets, and training the general regression model on each fitting subset to obtain a plurality of importance prediction models; calculating the mean square error of each importance prediction model's predictions based on the fitting test set; and taking all production parameters included in the fitting subset used by the importance prediction model with the smallest mean square error as the production parameters of highest importance in semiconductor manufacturing.
2. The method of claim 1, wherein step two specifically comprises:
step two A, dividing the production data set into several subsets, wherein the difference in data distribution between the subsets is smaller than a threshold value;
step two B, taking each subset in turn as a test set, and combining the remaining subsets, with that subset removed, as the training set corresponding to that test set;
step two C, selecting a decision tree regression model;
step two D, training the decision tree regression model on a given training set to obtain the corresponding rotation verification model;
step two E, calculating the mean square error of the predictions of that rotation verification model based on the test set corresponding to that training set,
wherein the mean square error is the average, over the samples of the test set, of the squared difference between the quality inspection result label of each semiconductor product and the quality inspection result predicted by the rotation verification model from the production parameters of that semiconductor product;
step two F, training and verifying the decision tree regression model with each training set and its corresponding test set in turn, to obtain every rotation verification model and the mean square error of the predictions of each rotation verification model.
3. The method according to claim 2, wherein the decision tree regression model in step two C is specifically a light gradient-boosting tree model (LightGBM).
4. The method of claim 1, wherein step three specifically comprises:
calculating the frequency feature importance of each production parameter from the following quantities: the number of times each production parameter is used for branching (splitting) during the construction of each rotation verification model, and the natural exponential function;
using the rotation verification models to calculate the frequency feature importance of each production parameter, obtaining a frequency feature importance ranking queue.
5. The method of claim 1, wherein step four specifically comprises:
calculating the permutation feature importance of each production parameter from the following quantities: the samples in which the order of the feature values corresponding to a given production parameter has been shuffled, the mean square error of the quality inspection results predicted by each rotation verification model on those shuffled samples, and the natural exponential function;
using the rotation verification models to calculate the permutation feature importance of each production parameter, obtaining a permutation feature importance ranking queue.
6. The method of claim 1, wherein step five specifically comprises:
calculating the constant substitution feature importance of each production parameter from the mean square error of the predictions of each rotation verification model after the feature value corresponding to a given production parameter in the samples has been replaced by a constant;
using the rotation verification models to calculate the constant substitution feature importance of each production parameter, obtaining a constant substitution feature importance ranking queue.
7. The method of claim 1, wherein step seven specifically comprises:
step seven A, defining the number of production parameters in the important production parameter set; extracting from the production data set the data corresponding to the production parameters in the important production parameter set as fitting data; and constructing a fitting test set from the production data set by bootstrap (self-help) sampling;
step seven B, selecting from the fitting data, each time, the data corresponding to a chosen number of production parameters as a fitting subset, so that one fitting subset is obtained for every possible way of choosing that number of production parameters from the important production parameter set;
step seven C, selecting a general regression model;
step seven D, training the general regression model on each fitting subset to obtain one importance prediction model per fitting subset;
step seven E, calculating the mean square error of each importance prediction model's predictions using the fitting test set, and selecting the importance prediction model with the minimum mean square error as the optimal importance prediction model;
step seven F, taking all production parameters included in the fitting subset used to train the optimal importance prediction model as the production parameters of highest importance in semiconductor manufacturing.
8. The method according to claim 7, wherein the general regression model in step seven C is a light gradient-boosting tree model (LightGBM).
CN202311371783.XA 2023-10-23 2023-10-23 Analysis method for importance of production parameters in semiconductor manufacturing Active CN117113291B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311371783.XA CN117113291B (en) 2023-10-23 2023-10-23 Analysis method for importance of production parameters in semiconductor manufacturing

Publications (2)

Publication Number Publication Date
CN117113291A true CN117113291A (en) 2023-11-24
CN117113291B CN117113291B (en) 2024-02-09

Family

ID=88800511

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311371783.XA Active CN117113291B (en) 2023-10-23 2023-10-23 Analysis method for importance of production parameters in semiconductor manufacturing

Country Status (1)

Country Link
CN (1) CN117113291B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120096244A1 (en) * 2010-10-14 2012-04-19 Bin Zhang Method, system, and product for performing uniformly fine-grain data parallel computing
CN104504373A (en) * 2014-12-18 2015-04-08 电子科技大学 Feature selection method for FMRI (Functional Magnetic Resonance Imaging) data
CN105447844A (en) * 2014-08-15 2016-03-30 大连达硕信息技术有限公司 New method for characteristic selection of complex multivariable data
CN106971240A (en) * 2017-03-16 2017-07-21 河海大学 The short-term load forecasting method that a kind of variables choice is returned with Gaussian process
CN107330555A (en) * 2017-06-30 2017-11-07 红云红河烟草(集团)有限责任公司 It is a kind of that power method is assigned based on the Primary Processing parameter that random forest is returned
US20190079472A1 (en) * 2015-10-15 2019-03-14 Accenture Global Services Limited System and method for selecting controllable parameters for equipment operation safety
CN110728331A (en) * 2019-10-28 2020-01-24 国网上海市电力公司 Harmonic emission level evaluation method of improved least square support vector machine
AU2020100709A4 (en) * 2020-05-05 2020-06-11 Bao, Yuhang Mr A method of prediction model based on random forest algorithm
CN111488713A (en) * 2020-04-14 2020-08-04 中国交通建设股份有限公司吉林省分公司 Method, system and storage medium for predicting early carbonization of concrete

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
JOAO PEREIRA 等: "Covered information disentanglement:model transparency via Unbiased permutation importance", 《ARXIV:2111.09744V2》 *
刘继辉;许磊;马晓龙;李达;林鸿佳;杨洋;杨晶津;李兴绪;王慧;: "基于随机森林回归的制丝过程参数影响权重分析", 烟草科技, no. 02 *
陈双武 等: "基于迁移学习的跨域异常流量检测", 《北京邮电大学学报》 *
陈蕊 等: "基于随机森林特征重要性和区间偏最小二乘法的 近红外光谱波长筛选方法", 《光谱学与光谱分析》 *

Also Published As

Publication number Publication date
CN117113291B (en) 2024-02-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant