CN111896038B

CN111896038B - Semiconductor process data correction method based on correlation entropy and shallow neural network

Info

Publication number: CN111896038B
Application number: CN202010591258.9A
Authority: CN
Inventors: 谢磊; 吴小菲; 徐浩杰; 陈启明; 苏宏业
Original assignee: Zhejiang University ZJU
Current assignee: Zhejiang University ZJU
Priority date: 2020-06-24
Filing date: 2020-06-24
Publication date: 2021-08-31
Anticipated expiration: 2040-06-24
Also published as: CN111896038A

Abstract

The invention discloses a semiconductor process data correction method based on correlation entropy and a shallow neural network, which comprises the following steps: (1) collecting the output signals of the process variable sensors corresponding to the variables to be corrected; (2) inputting each variable into the established shallow neural network model, extracting the correlation information among the variables layer by layer, and passing the output of each layer through a set function; collecting the variable output of the last layer of the model, comparing it with the input variable, and establishing a regression model; (3) saving the parameter weights of the current model and calculating the final objective function value; if the stop condition is not met, updating the parameter weights and repeating step (2) until the stop condition is reached; (4) changing the number of network layers and repeating steps (2) to (3) until the maximum number of network layers is reached; (5) selecting the number of network layers with the best correction result, storing the parameters of each layer, and calculating on new data to be corrected to obtain a correction value. With this method, a data correction result with lower error can be obtained.

Description

Semiconductor process data correction method based on correlation entropy and shallow neural network

Technical Field

The invention relates to the field of process monitoring in industrial systems, and in particular to a semiconductor process data correction method based on correlation entropy and a shallow neural network.

Background

In recent years, data-driven methods such as process monitoring and soft sensing have become established as powerful process control tools in the semiconductor industry. The reliability and accuracy of measured process data are therefore critical to the efficiency, profitability and safe operation of a plant. However, because of process variability and the limitations of measurement technology, data measured online are often corrupted by random errors and gross errors. Improving the raw data set can significantly improve process performance and maintenance efficiency. Data rectification, which mitigates the effects of errors in raw data, has therefore become an important area of research in data analysis.

In the semiconductor industry, data rectification is also referred to as bias estimation. Researchers have combined statistical information (distribution, variance, etc.) with known models to select efficient estimation methods, improve on the original mean square error objective function, and eliminate bias. Although these methods perform well in engineering processes, all of them are model-based techniques, and the key to effective data correction is a good process model. If the model does not faithfully represent the process, the corrected data will be distorted by the model mismatch, and for some real industrial processes an accurate process model is difficult to obtain. Gross errors, on the other hand, are usually handled by preprocessing before the model is applied, but such preprocessing considers only the statistics of a single variable and ignores the relationships among the other variables in the whole process, which can lead to improper correction results.

Based on the above background, the aim is to find a method that mines the relationships in the data from the acquired raw samples and uses those relationships as the basis for correction to obtain better correction values, which in turn further improves the estimated relationships. Such a loop helps to obtain a relatively accurate final data correction value.

Disclosure of Invention

The invention discloses a semiconductor process data correction method based on correlation entropy and a shallow neural network, which is applicable to process measurements containing random errors and gross errors, requires only routine operating data, and needs no prior knowledge or preprocessing.

A semiconductor process data correction method based on correlation entropy and a shallow neural network comprises the following steps:

(1) for a control process in which disturbances exist, collecting the output signal of the process variable sensor corresponding to the variable to be corrected;

(2) directly inputting the collected variables into the constructed shallow neural network model, extracting the relevant information in the variables layer by layer, and transmitting the output of each layer through a set function;

collecting the variable output of the last layer of the model, comparing the variable output with the numerical value of the input variable, and establishing a regression model;

(3) saving the parameter weights of the current shallow neural network model and calculating the final objective function value, wherein the objective function is a correlation entropy function; if the stop condition is not met, updating the parameter weights and repeating step (2) until the stop condition is reached;

(4) changing the network layer number, and repeating the steps (2) to (3) until the maximum network layer number is reached;

(5) selecting the network layer number with the best correction result; and storing the parameter values of each layer, inputting the new data to be corrected into the shallow neural network model, recalculating and obtaining a variable correction value.

The invention can reduce the interference of random errors and gross errors, improve the raw data, and significantly improve process performance and maintenance efficiency, thereby reducing production losses; it therefore has important practical value for improving economic benefit.

Unlike traditional model-based methods, the method does not rely on the accuracy of prior knowledge: it directly builds a model to mine the relationships in the data and uses those relationships to adjust data errors, and the well-corrected data in turn yield an accurate variable model. Unlike traditional preprocessing methods, the method corrects the data while acquiring the relationships and introduces the correlation entropy directly into the objective function, so it accounts for the relationships among all variables more effectively, does not depend on the characteristics of any single variable, and thus obtains a better correction result.

In step (1), the acquired output signals contain random errors and gross errors, and can be passed to the neural network model in step (2) without any preprocessing.

In the step (2), the input and output of the model are all measured variables, so as to obtain the relationship between the variables.

The specific process of the step (2) is as follows:

(2-1) Let x_0 ∈ R^D, with x_0 = [x_{0,1}, x_{0,2}, …, x_{0,D}]^T, denote the D-dimensional input of the variables containing errors, and let x_{l-1} (l = 1, 2, …, L) denote the input of the l-th layer, where L is the total number of layers in the model. The number of nodes in each layer equals the input dimension D. Two weight matrices and two bias vectors (symbols not reproduced here) define the parameters of the linear and nonlinear functions, respectively, in the transfer function of the network. The output of the l-th layer is then obtained as follows (layer formulas not reproduced here): the hidden nodes of the network are passed through the corresponding linear function (ψ) and nonlinear activation function (φ) until the output vector x_l of the l-th layer is obtained;

(2-2) after successive iteration through the layers, the output of the last layer of the neural network model is the correction value x_L:

x_1 = F(x_0)

x_2 = F(x_1)

…

x_L = F(x_{L-1})

Here the function F represents the hidden-node computation and the corresponding linear and nonlinear activation function operations shown in (2-1);

(2-3) comparing the output of the neural network model with the value of the input variable to establish a regression model.
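The layer recursion in (2-1) and (2-2) can be sketched in a few lines. This is only an illustration under stated assumptions, not the patented formula: the layer equations are not reproduced in this text, so the form F(x) = W_lin · φ(W_nl · x + b_nl) + b_lin, the use of a sigmoid for φ (the embodiment selects a sigmoid), and every name and number below (`layer_F`, `forward`, `W_nl`, `W_lin`, the example values) are assumptions made here. What the sketch does take from the description is that each layer has D nodes and that the same function F is applied L times.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, x, b):
    """Return W @ x + b for a list-of-rows matrix W and list vectors x, b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def layer_F(x, W_nl, b_nl, W_lin, b_lin):
    """One layer in the ASSUMED form W_lin * sigmoid(W_nl * x + b_nl) + b_lin."""
    hidden = [sigmoid(h) for h in affine(W_nl, x, b_nl)]  # nonlinear part (phi)
    return affine(W_lin, hidden, b_lin)                   # linear part (psi)

def forward(x0, params, num_layers):
    """Apply the same layer num_layers times: x_L = F(F(...F(x_0)))."""
    x = list(x0)
    for _ in range(num_layers):
        x = layer_F(x, *params)
    return x

# Hypothetical parameters for D = 2, shared by every layer.
params = (
    [[0.5, -0.2], [0.1, 0.3]], [0.0, 0.1],  # W_nl, b_nl
    [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0],   # W_lin, b_lin
)
x0 = [0.8, -0.4]                            # measured values (with errors)
x2 = forward(x0, params, num_layers=2)      # candidate correction for L = 2
```

Because the parameters are passed in once and reused in every iteration of the loop, changing the depth only changes `num_layers`, which is the property the method relies on when it later searches over the number of layers.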

In step (3), the correlation entropy function is expressed as:

ε_d = x_{0,d} - x_{L,d}

where k_{σ_d}(·) is the correlation entropy function, σ_d is the adjustable parameter of the d-th variable in the correlation entropy function, and ε_d is the difference between the measured value and the corrected value of the d-th variable.
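To make the roles of σ_d and ε_d concrete, the sketch below compares a correlation entropy objective with the mean square error on data with and without a gross error. The objective formula itself is not reproduced in this text, so the sketch assumes the form common in the correntropy literature: a Gaussian kernel k_σ(ε) = exp(-ε² / (2σ²)) averaged over the D variables. The function names and the sample values are hypothetical.

```python
import math

def mse(measured, corrected):
    """Mean square error between measured and corrected values."""
    return sum((a - b) ** 2 for a, b in zip(measured, corrected)) / len(measured)

def correntropy(measured, corrected, sigma):
    """Mean Gaussian kernel of the residuals (ASSUMED form of the objective).

    sigma holds one adjustable bandwidth per variable; larger is better,
    and the value lies between 0 and 1.
    """
    return sum(
        math.exp(-((a - b) ** 2) / (2.0 * s ** 2))
        for a, b, s in zip(measured, corrected, sigma)
    ) / len(measured)

corrected = [1.0, 2.0, 3.0, 4.0]
clean = [1.1, 1.9, 3.1, 3.9]      # random errors only
gross = [1.1, 1.9, 3.1, 54.0]     # last sensor carries a gross error
worse = [1.1, 1.9, 3.1, 504.0]    # the same gross error, ten times larger
sigma = [1.0, 1.0, 1.0, 1.0]

mse_ratio = mse(gross, corrected) / mse(clean, corrected)
drop = correntropy(clean, corrected, sigma) - correntropy(gross, corrected, sigma)
```

Under this assumed form, one corrupted variable can lower the objective by at most 1/D, however large the error is, whereas the mean square error grows with the square of the error. This bounded influence is the usual reason a correntropy-type objective is preferred when gross errors are present.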

In step (3), the parameter weights are trained and updated by a gradient descent method with a fixed learning rate α, where α > 0 (the update formula and the gradient formulas are not reproduced here). The parameter gradients are obtained using element-wise multiplication together with the partial derivatives of the objective function with respect to the output value of the last layer. The remaining part of the recursion is expressed through the partial derivatives of the objective function with respect to the hidden nodes of each layer.

In step (3), the stop condition is as follows: the objective function reaches its maximum value, or the number of iterations reaches the set maximum number of iterations.
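The update rule and stop condition can be illustrated with a small sketch. The update and gradient formulas are not reproduced in this text, so everything here is an assumption: the objective is the Gaussian-kernel form J = (1/D) Σ_d exp(-ε_d² / (2σ_d²)) with ε_d = x_{0,d} - x_{L,d}, and because the stop condition speaks of the objective reaching its maximum, the step is written as an ascent on J (equivalently, descent on -J). For brevity the sketch updates the last-layer output directly instead of back-propagating to the weights, so it shows only the derivative of the objective with respect to the last-layer output and the two stopping criteria. The names `grad_wrt_output` and `ascend` and the tolerance `tol` are hypothetical.

```python
import math

def correntropy(x0, xL, sigma):
    """ASSUMED objective: mean Gaussian kernel of eps_d = x0_d - xL_d."""
    return sum(
        math.exp(-((a - b) ** 2) / (2.0 * s ** 2))
        for a, b, s in zip(x0, xL, sigma)
    ) / len(x0)

def grad_wrt_output(x0, xL, sigma):
    """dJ/dxL_d = k(eps_d) * eps_d / (D * sigma_d^2) for the assumed objective."""
    D = len(x0)
    return [
        math.exp(-((a - b) ** 2) / (2.0 * s ** 2)) * (a - b) / (D * s ** 2)
        for a, b, s in zip(x0, xL, sigma)
    ]

def ascend(theta, objective, gradient, alpha, max_iter, tol=1e-12):
    """Fixed-step ascent; stops when the objective stalls or max_iter is hit."""
    value = objective(theta)
    for _ in range(max_iter):
        theta = [t + alpha * g for t, g in zip(theta, gradient(theta))]
        new_value = objective(theta)
        stalled = abs(new_value - value) < tol
        value = new_value
        if stalled:
            break
    return theta, value

x0 = [1.0, 2.0, 3.0]
sigma = [1.0, 1.0, 1.0]
start = [0.5, 2.5, 2.0]
J_start = correntropy(x0, start, sigma)
x_end, J_end = ascend(
    start,
    lambda x: correntropy(x0, x, sigma),
    lambda x: grad_wrt_output(x0, x, sigma),
    alpha=0.5,
    max_iter=1000,
)
```

In the method itself the quantity being updated is the set of shared weights and biases, and the derivative above is only the first factor that is propagated back through the layers; the unconstrained toy problem here simply drives the output toward the measurement.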

In step (4), for new data to be corrected, a new variable correction value is obtained through network iteration.

Compared with the prior art, the invention has the following beneficial effects:

1. The invention defines both the input and the output as the variables themselves and effectively extracts the relationships among the variables with a black-box model; these relationships replace the original prior knowledge and serve as constraints among the variables for better data correction.

2. In the invention, the corrected data obtained further promote a more accurate expression of the model relationships, which improves the accuracy of the correction result to a certain extent.

3. In the invention, the number of nodes in each layer of the neural network equals the dimension of the input variable, and the weight matrices and bias vectors built into the linear and nonlinear functions are shared across layers, which has the advantage of reducing model complexity.

4. For model selection, because the weight matrices and bias vectors are shared across layers and every layer has the same number of hidden nodes, only the number of layers needs to be adjusted, which reduces the parameter-tuning burden.

5. The invention further optimizes the objective function by adopting an estimation method based on correlation entropy, so that the objective function can also handle gross errors.

6. The invention can automatically adjust the built-in parameters through an efficient gradient-based method.

7. The invention is entirely data-driven: it needs no prior knowledge of the process and no filter designed in advance.
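The complexity argument in points 3 and 4 can be made concrete with a parameter count. The sketch assumes two D×D weight matrices and two length-D bias vectors per layer, which follows from the description of D nodes per layer but is not stated as a formula in the text; the function names are hypothetical.

```python
def shared_param_count(D):
    """Parameters when one set of weights and biases is reused by all layers."""
    return 2 * D * D + 2 * D

def unshared_param_count(D, L):
    """Parameters if each of the L layers had its own weights and biases."""
    return L * shared_param_count(D)

D, L = 10, 5
shared = shared_param_count(D)          # does not depend on L
unshared = unshared_param_count(D, L)   # grows linearly with L
```

With sharing, the number of trainable parameters is independent of the depth, so the number of layers is the only structural choice left to tune.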

Drawings

FIG. 1 is a flow chart of a semiconductor process data correction method based on correlation entropy and shallow neural network according to the present invention;

FIG. 2 is a schematic diagram of a model according to an embodiment of the present invention;

FIG. 3 is a diagram of the measured process values containing errors and the correction values output by the model, according to an embodiment of the present invention.

Detailed Description

The invention will be described in further detail below with reference to the drawings and examples, which are intended to facilitate the understanding of the invention without limiting it in any way.

The following takes the estimation of deposition process results at a factory in China as an example, in which the height of a wafer passing through a multi-stage chemical process is measured virtually.

In manufacturing, chemical vapor deposition, a process that applies a solid thin-film coating to a surface, is often used in the semiconductor industry. The process is complex because it involves many chemical reactions, and the reactors in a multi-reactor system are independently controlled to deposit films in the process chamber under a variety of conditions. The chemical vapor deposition apparatus is equipped with a considerable number of sensors. Because of unstable production environments and unreliable measuring instruments, these measurements contain random errors and gross errors. An accurate model that yields reliable measurements therefore helps to optimize operation and the subsequent series of control actions.

Step 1, for a control process with disturbance, acquiring an output signal of a process variable sensor corresponding to a variable to be corrected.

Step 2, directly inputting each variable into the established shallow neural network model, extracting correlation information in the variables layer by layer, and transmitting each layer of output through a set function; collecting the variable output of the last layer of the model, comparing the variable output with the input variable value, and establishing a regression model;

As shown in FIG. 2, the modeling steps are as follows:

(2-1) Let x_0 ∈ R^D, with x_0 = [x_{0,1}, x_{0,2}, …, x_{0,D}]^T, denote the D-dimensional input of the variables containing errors, and let x_{l-1} (l = 1, 2, …, L) denote the input of the l-th layer, where L is the total number of layers in the model. The number of nodes in each layer equals the input dimension D. Two weight matrices and two bias vectors are used (symbols and layer formulas not reproduced here), and the hidden nodes of the network are passed through the corresponding linear function (ψ) and nonlinear activation function (φ; the sigmoid function is selected here) until the output vector x_l of the l-th layer is obtained.

(2-2) After successive iteration through the layers, the last layer of the neural network outputs the correction value x_L:

x_1 = F(x_0)

x_2 = F(x_1)

…

x_L = F(x_{L-1})

Here the function F represents the hidden-node computation and the corresponding linear and nonlinear activation function operations shown in (2-1).

(2-3) In addition to random errors, gross errors occur in industrial processes because of additional disturbance variables, and the conventional mean square error objective function is sensitive to such errors, so mean square error is not suitable as the objective function here. An objective function based on correlation entropy is therefore introduced, which can be expressed as follows (objective formula not reproduced here):

ε_d = x_{0,d} - x_{L,d}

where k_{σ_d}(·) is the correlation entropy function, σ_d is the adjustable parameter of the d-th variable in the correlation entropy function, and ε_d is the difference between the measured value and the corrected value of the d-th variable.

(2-4) The parameters are trained and updated by a gradient descent method with a fixed learning rate α, where α > 0 (the update formula and the gradient formulas are not reproduced here). The parameter gradients are obtained using element-wise multiplication together with the partial derivatives of the objective function with respect to the output value of the last layer, and the remaining part of the recursion is expressed by a further formula.

Step 3, saving the parameter weights of the current model and calculating the final objective function value; if the stop condition is not met, updating the parameter weights and repeating Step 2 until the stop condition is reached;

Step 4, changing the number of network layers and repeating Steps 2 to 3 until the maximum number of network layers is reached;

Step 5, selecting the number of network layers with the best correction result, storing the parameter values of each layer, and recalculating with the new data to be corrected to obtain the correction value.
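Steps 4 and 5 amount to a search over the number of layers, which can be sketched as follows. The text does not say how the "best correction result" is measured, so this sketch assumes that training returns a score for which larger is better (for example, the final objective value); `select_num_layers`, `train_fn` and the stand-in scores are hypothetical.

```python
def select_num_layers(train_fn, max_layers):
    """Train one model per depth 1..max_layers and keep the best-scoring one.

    train_fn(num_layers) is a hypothetical callable that runs Steps 2 and 3
    for a given depth and returns (params, score), where a larger score is
    ASSUMED to mean a better correction result.
    """
    best = None
    for num_layers in range(1, max_layers + 1):
        params, score = train_fn(num_layers)
        if best is None or score > best[2]:
            best = (num_layers, params, score)
    return best

# Stand-in for a real training routine, used only to exercise the loop.
fake_scores = {1: 0.70, 2: 0.90, 3: 0.85}

def fake_train(num_layers):
    return "params-%d" % num_layers, fake_scores[num_layers]

best_layers, best_params, best_score = select_num_layers(fake_train, max_layers=3)
```

The stored parameters of the selected depth are then reused unchanged to correct new measurements, as Step 5 describes.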

In this example, the result is shown in FIG. 3. The proposed method performs well: it corrects random errors, and it also detects gross errors and obtains the corresponding correction values.

The embodiments described above are intended to illustrate the technical solutions and advantages of the present invention, and it should be understood that the above-mentioned embodiments are only specific embodiments of the present invention, and are not intended to limit the present invention, and any modifications, additions and equivalents made within the scope of the principles of the present invention should be included in the scope of the present invention.

Claims

1. A semiconductor process data correction method based on correlation entropy and a shallow neural network is characterized by comprising the following steps:

the correlation entropy function is expressed as follows (objective formula not reproduced here):

ε_d = x_{0,d} - x_{L,d}

where k_{σ_d}(·) is the correlation entropy function, σ_d is the adjustable parameter of the d-th variable in the correlation entropy function, and ε_d is the difference between the measured value and the corrected value of the d-th variable;

in step (3), the parameter weights are trained and updated by a gradient descent method with a fixed learning rate α, where α > 0 (the update formula, the parameter gradient formulas, and the definitions of their symbols are not reproduced here);

the stop condition is as follows: the objective function reaches its maximum value, or the number of iterations reaches the set maximum number of iterations.

2. The semiconductor process data correction method based on correlation entropy and a shallow neural network as claimed in claim 1, wherein in step (1), the collected output signal contains random errors and gross errors.

3. The semiconductor process data correcting method based on the correlation entropy and the shallow neural network as claimed in claim 1, wherein the specific process of the step (2) is as follows:

[The sub-steps and formulas of this claim (weight matrices, bias vectors, and layer outputs) are not reproduced here.]

4. The semiconductor process data correction method based on correlation entropy and a shallow neural network as claimed in claim 1, wherein in step (4), for new data to be corrected, new variable correction values are obtained through network iteration.