CN112581412A

CN112581412A - Atomic force microscope image restoration method based on long short-term memory network

Info

Publication number: CN112581412A
Application number: CN202011575092.8A
Authority: CN
Inventors: 胡佳成; 陶镛泽; 施玉书; 张树; 颜迪新
Original assignee: China Jiliang University
Current assignee: China Jiliang University
Priority date: 2020-12-28
Filing date: 2020-12-28
Publication date: 2021-03-30

Abstract

The invention discloses an atomic force microscope (AFM) image restoration method based on a long short-term memory (LSTM) network. First, simulation samples are obtained using the dilation operation of mathematical morphology. The simulation samples are then normalized and input into an LSTM model for training; the model is refined using an improved MAE loss function and the Adam optimization algorithm, and regularization and inverse normalization are applied to the training samples. Finally, the actual AFM image is input into the trained model to obtain a restored AFM image, which is processed with an adaptive stepwise-interpolation algorithm so that the positions of feature points in the sample are obtained accurately. The method is stable and robust and can effectively restore AFM images, thereby improving measurement accuracy.

Description

Atomic force microscope image restoration method based on long short-term memory network

Technical Field

The invention relates to the value assignment (calibration) of reference standards such as gratings and step heights in nanometrology, and in particular to an atomic force microscope image restoration method based on a long short-term memory (LSTM) network.

Background

An atomic force microscope (AFM) forms images from the interaction force between a very fine probe and the measured object, and is therefore widely used in nanometrology. The AFM obtains the relevant geometric parameters of a nanometrology reference standard through high-resolution imaging and calibrates them, thereby transferring the quantity value from the reference standard to working measuring instruments. During AFM scanning, because of the influence of the tip, the scanned image is the result of the combined action of the probe and the sample rather than a true description of the sample topography; this phenomenon is the "artifact effect" of the AFM. It reduces the accuracy of the measurement of the reference standard and affects quantity-value transfer and traceability in nanometrology.

From the viewpoint of improving instrument accuracy, the "artifact effect" can be reduced by using a probe with a smaller radius of curvature, but such probes tend to have poor durability and high cost, and the effect cannot be eliminated in principle. Because nanomaterials have high rigidity and hardness and do not deform or wear easily, the AFM uses contact-mode measurement when scanning the sample. With this measurement mode, the true surface of the sample can be recovered from its image by a suitable algorithm. However, this deconvolution process is non-deterministic, and the image cannot be restored with an exact mathematical expression, so the choice of algorithm is demanding.

At present, the main algorithms for restoring AFM images include restoration algorithms based on radial basis function neural networks and multilayer perceptrons, blind tip reconstruction algorithms based on mathematical morphology, and back-propagation neural network algorithms. Blind tip reconstruction often depends excessively on the measured AFM image, so the reconstructed tip shape has poor robustness, which degrades AFM image restoration. Radial basis function and back-propagation neural networks are limited by the maximum depth of the network during training, and training often fails because of vanishing or exploding gradients.

Disclosure of Invention

Aiming at the defects of the prior art, the invention provides a method for restoring AFM images based on a long short-term memory (LSTM) recurrent neural network.

The technical scheme adopted by the invention is that,

an atomic force microscope image restoration method based on a long short-term memory network comprises the following steps:

step 1: obtaining simulation samples using the dilation operation of mathematical morphology;

Assume that the cross-sectional profile of the probe, with the tip at the origin of the coordinate system, is s(x), and that the cross-sectional profile of the sample is f(x). The position of the probe on the scan line is defined by the abscissa of the tip. If the abscissa of the tip is x₀, the ordinate h of the tip is the measured height of the sample at scanning point x₀. Denoting the cross-sectional profile of the probe at this position by t(x), the relationship between s(x) and t(x) satisfies the expression

t(x)＝s(x)+h (1)

The relationship between f (x) and t (x) satisfies the expression

AFM contact measurement is equivalent to a dilation operation in mathematical morphology, the mathematical expression of which is

Thus, according to equation (3), the expression for h can be written as

By combining formula (1) and formula (2), formula (3) can be converted into

Since h and x are in one-to-one correspondence, h can be regarded as a function of x. h(x) represents the height curve measured by the probe along one horizontal scan line of the sample, that is, one row of the simulated image. h(x) thus captures the relationship among the sample profile, the tip profile and the measured image.
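The dilation-based simulation of one scan line can be sketched as follows. This is a minimal illustration in Python rather than the Matlab simulation used by the invention. It assumes that the tip profile s(u) is sampled at integer offsets u around the apex, with s(0) = 0 and s(u) ≥ 0 elsewhere, so that the recorded height is h(x₀) = max over u of [f(x₀ + u) − s(u)]. The function name `simulate_scan_line`, the parabolic tip and the rectangular line profile are hypothetical.

```python
import numpy as np

def simulate_scan_line(f, s):
    """Simulate one AFM contact-mode scan line by grey-scale dilation.

    f : 1-D array, true sample profile f(x) (heights in nm, 1 nm/pixel).
    s : 1-D array of odd length, tip profile s(u) with apex s[centre] == 0
        and s >= 0 elsewhere (assumed convention).
    Returns h with h[x0] = max_u ( f[x0 + u] - s[u] ).
    """
    f = np.asarray(f, dtype=float)
    s = np.asarray(s, dtype=float)
    half = len(s) // 2
    # Pad with -inf so positions outside the sample never support the tip.
    padded = np.pad(f, half, mode="constant", constant_values=-np.inf)
    h = np.empty_like(f)
    for x0 in range(len(f)):
        window = padded[x0:x0 + len(s)]   # f(x0 + u) for u = -half .. +half
        h[x0] = np.max(window - s)        # lowest tip height without penetration
    return h

# Hypothetical example: a 10 nm high, 20 nm wide line scanned by a parabolic tip.
x = np.arange(140)
f = np.where((x >= 60) & (x < 80), 10.0, 0.0)
u = np.arange(-15, 16)
s = u**2 / (2 * 10.0)                     # parabola approximating a 10 nm apex radius
h = simulate_scan_line(f, s)
```

In this sketch the measured line `h` is never lower than the true profile `f`, and the line appears wider than it really is, which is the artifact effect that the network is trained to remove.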

During the simulation, a nano-grid standard is selected as the measured sample and a silicon nitride tip as the scanning device. The simulated scanning process yields 2000 groups of sample images, of which 1600 groups are used as the training set and 400 groups as the test set. The training set must contain all probe types but need not contain all nano-grid types.

Step 2: carrying out normalization operation on the simulation sample image obtained in the step 1;

the AFM image is decomposed line by line to reduce its dimensionality, so that each row vector becomes a training sample of the network.

Since the AFM simulation image is a 140 × 140 matrix, both the input data and the output result are 1 × 140 row vectors. To suppress non-convergence and overfitting during training, and to further improve the stability and performance of the neural network, the input data are normalized, namely

x*＝(x − x_min)/(x_max − x_min) (6)

In formula (6), x is a value in the input matrix, x_max and x_min are respectively the maximum and minimum values in the input matrix, and x* is the normalized input.
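The row decomposition and min-max normalization described above can be sketched as follows. The helper names `normalize` and `to_row_samples`, the random test image and the guard for a constant image are illustrative assumptions, not part of the source.

```python
import numpy as np

def normalize(image):
    """Min-max normalize a matrix to [0, 1]; also return (x_min, x_max)."""
    image = np.asarray(image, dtype=float)
    x_min, x_max = image.min(), image.max()
    if x_max == x_min:
        # Guard (an assumption, not in the source): a constant image maps to zeros.
        return np.zeros_like(image), (x_min, x_max)
    return (image - x_min) / (x_max - x_min), (x_min, x_max)

def to_row_samples(image):
    """Split an H x W image into H training samples, each a 1 x W row vector."""
    return [row.reshape(1, -1) for row in np.asarray(image)]

# Hypothetical 140 x 140 simulated image with heights between 0 and 30 nm.
rng = np.random.default_rng(0)
image = rng.uniform(0.0, 30.0, size=(140, 140))
normalized, (x_min, x_max) = normalize(image)
rows = to_row_samples(normalized)
```

Returning `x_min` and `x_max` alongside the normalized data keeps the values needed later to map the network output back to physical heights.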

Step 3: inputting the data obtained in step 2 into an LSTM model for training;

the LSTM model comprises four parts: a forget gate F_t, an input gate I_t, an output gate O_t and a candidate memory cell C̃_t. The gate inputs of the LSTM are the current time-step input X_t and the previous time-step hidden state H_t-1, which matches the fact that the gradient at a point on an AFM scan line is determined by the height values of adjacent points. The gate outputs are computed by fully connected layers with a sigmoid activation function σ; in the standard LSTM form, with W and b denoting the weights and biases of each layer, the expressions are

F_t＝σ(X_t·W_xf + H_t-1·W_hf + b_f)
I_t＝σ(X_t·W_xi + H_t-1·W_hi + b_i)
O_t＝σ(X_t·W_xo + H_t-1·W_ho + b_o) (7)

So that the candidate memory cell has a larger gradient where the AFM scan-line input is near 0 and the model converges faster, the tanh activation function is used, expressed as

C̃_t＝tanh(X_t·W_xc + H_t-1·W_hc + b_c) (8)

Through its memory cells, the LSTM can store the information of higher-dimensional AFM scan-line vectors and thereby capture the degree of correlation between scan points. For example, points in flat regions of the scan line are weakly correlated, whereas points in regions where the scan-line height varies strongly are highly correlated. Let the memory cell of the previous time step be C_t-1 and that of the current time step be C_t. When C_t is computed, F_t and I_t control the flow of information between the points of the scan line through the Hadamard (element-wise) product ⊙, where C̃_t is the candidate memory cell:

C_t＝F_t⊙C_t-1 + I_t⊙C̃_t (9)

If the elements of F_t are always approximately 1 and the elements of I_t are always approximately 0, the past memory cell is preserved over time and passed on to the current time step. This allows useful information in the AFM scan line to be stored over a long range, addresses the gradient attenuation caused by the increasing dimension of the scan-line vector, and improves training reliability. It also captures the dependencies between points that are far apart on the scan line, enabling image restoration in regions where the artifact effect is severe.

After C_t is obtained, O_t controls the flow of information from the memory cell to the current time-step hidden state H_t:

H_t＝O_t⊙tanh(C_t) (10)

For a nano-grid scan line, when the elements of O_t are 0 or approximately 0, the information in C_t is usually only that of a flat region of the nano-grid; it is not the required feature information and is generally retained within the memory cell for its own use. When the elements of O_t are far from 0, C_t contains feature information of the nano-grid, such as the position of a step, and this information is passed to H_t for use by the output layer.
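The gate mechanism described above can be sketched as a single LSTM time step in NumPy. This is a minimal illustration that uses the standard LSTM cell equations. The weight names (`W_xf`, `W_hf`, `b_f` and so on), the hidden size of 8, the random initialization and the choice of feeding one height value per time step are assumptions made for the example, not details confirmed by the source.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step.

    x_t    : (batch, d) input at the current time step
    h_prev : (batch, k) hidden state of the previous time step
    c_prev : (batch, k) memory cell of the previous time step
    params : dict of weights W_x* (d, k), W_h* (k, k) and biases b_* (k,)
    """
    p = params
    f_t = sigmoid(x_t @ p["W_xf"] + h_prev @ p["W_hf"] + p["b_f"])      # forget gate
    i_t = sigmoid(x_t @ p["W_xi"] + h_prev @ p["W_hi"] + p["b_i"])      # input gate
    o_t = sigmoid(x_t @ p["W_xo"] + h_prev @ p["W_ho"] + p["b_o"])      # output gate
    c_tilde = np.tanh(x_t @ p["W_xc"] + h_prev @ p["W_hc"] + p["b_c"])  # candidate cell
    c_t = f_t * c_prev + i_t * c_tilde   # Hadamard products mix old and new memory
    h_t = o_t * np.tanh(c_t)             # hidden state, formula (10)
    return h_t, c_t

def init_params(d, k, rng):
    """Small random weights and zero biases (hypothetical initialization)."""
    params = {}
    for gate in "fioc":
        params[f"W_x{gate}"] = rng.normal(0.0, 0.1, size=(d, k))
        params[f"W_h{gate}"] = rng.normal(0.0, 0.1, size=(k, k))
        params[f"b_{gate}"] = np.zeros(k)
    return params

# Feed one normalized 140-point scan line point by point (assumed input layout).
rng = np.random.default_rng(0)
params = init_params(d=1, k=8, rng=rng)
line = rng.uniform(0.0, 1.0, size=140)
h_t = np.zeros((1, 8))
c_t = np.zeros((1, 8))
for height in line:
    h_t, c_t = lstm_step(np.array([[height]]), h_t, c_t, params)
```

With the forget gate saturated near 1 and the input gate near 0, `c_t` stays equal to `c_prev`, which is the long-range storage behaviour described in the text.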

Step 4: continuously refining the LSTM model using an improved MAE loss function and the Adam optimization algorithm;

the recovery effect of AFM images was evaluated using Modified Mean Absolute Error (MMAE) as a loss function and using a 1-norm in the summation process. The loss function is expressed as

In formula (11), n represents the number of training samples, l represents the vector dimension of the network output, ŷ_i represents the estimated value of each line of the AFM restored image, and y_i represents the true value of each line of the AFM restored image.
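A loss of this kind can be sketched as follows. Formula (11) itself is not reproduced in this text, so the exact normalization is an assumption: the sketch takes the 1-norm of each row-vector error, sums over the n samples and divides by n·l. The function name `mmae_loss` and the sample values are hypothetical.

```python
import numpy as np

def mmae_loss(y_hat, y):
    """Assumed form of the modified MAE: sum over the n samples of the 1-norm
    of each row-vector error, divided by n * l (l = output dimension)."""
    y_hat = np.asarray(y_hat, dtype=float)
    y = np.asarray(y, dtype=float)
    n, l = y.shape
    per_sample_l1 = np.abs(y_hat - y).sum(axis=1)   # 1-norm of each row error
    return per_sample_l1.sum() / (n * l)

# Two hypothetical 3-point rows: true values and network estimates.
y = np.array([[0.0, 0.5, 1.0], [1.0, 1.0, 0.0]])
y_hat = np.array([[0.1, 0.5, 0.8], [1.0, 0.7, 0.0]])
loss = mmae_loss(y_hat, y)
```

Because each training sample is a whole scan line, the error is accumulated per row vector before averaging, rather than per scalar output.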

The learning rate parameter adaptive formula of the Adam algorithm is

In formula (12), β⁰ denotes the initially given hyperparameter and βⁱ denotes the hyperparameter after i iterations.
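For orientation, one Adam update can be sketched as below. Formula (12) is not reproduced in this text, so the sketch uses the standard Adam update, in which the decay hyperparameters raised to the power of the iteration count i appear in the bias correction. Whether this matches formula (12) exactly is an assumption, and the function name `adam_step`, the default values and the toy objective are illustrative.

```python
import numpy as np

def adam_step(theta, grad, m, v, i, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard Adam update at iteration i (i starts at 1)."""
    m = beta1 * m + (1.0 - beta1) * grad        # moving average of the gradient
    v = beta2 * v + (1.0 - beta2) * grad**2     # moving average of its square
    m_hat = m / (1.0 - beta1**i)                # bias correction uses beta^i
    v_hat = v / (1.0 - beta2**i)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy problem (hypothetical): minimize (theta - 3)^2 starting from theta = 0.
theta = np.array([0.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for i in range(1, 2001):
    grad = 2.0 * (theta - 3.0)
    theta, m, v = adam_step(theta, grad, m, v, i, lr=0.05)
```

The bias-correction factors matter mainly in the first iterations, when the moving averages are still close to their zero initialization.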

Step 5: applying regularization and inverse normalization to the training samples;

For the output layer of the LSTM model, Dropout regularization is applied: some output features of the layer are randomly discarded during training, which reduces overfitting and makes the trained model applicable to more types of AFM scan images.

During LSTM training all sample data lie in the interval [0,1], whereas the final restored image must match the scale of the actual sample, so all data are inverse-normalized before the output result is obtained.
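Both operations can be sketched in a few lines. The dropout shown here is the common "inverted" variant, which rescales the surviving features. The drop probability of 0.5 and the helper names `dropout` and `denormalize` are assumptions, and the inverse normalization assumes that the forward normalization was the min-max mapping onto [0,1].

```python
import numpy as np

def dropout(features, drop_prob, rng, training=True):
    """Inverted dropout: zero each feature with probability drop_prob during
    training and rescale the rest so the expected value is unchanged."""
    if not training or drop_prob == 0.0:
        return features
    keep = rng.random(features.shape) >= drop_prob
    return features * keep / (1.0 - drop_prob)

def denormalize(x_star, x_min, x_max):
    """Invert min-max normalization: map [0, 1] back to physical heights."""
    return x_star * (x_max - x_min) + x_min

rng = np.random.default_rng(0)
features = np.ones((4, 8))
dropped = dropout(features, drop_prob=0.5, rng=rng)
restored = denormalize(np.array([0.0, 0.5, 1.0]), x_min=0.0, x_max=30.0)
```

At inference time `training=False` leaves the features untouched, so dropout only affects the training phase.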

Step 6: inputting the actual AFM image into the trained model to obtain an AFM restoration image;

the AFM restored image is processed with an adaptive stepwise-interpolation algorithm, so that the positions of the feature points in the sample can be obtained more accurately.

The invention has the following beneficial effects: AFM simulation images are acquired by simulated scanning of samples, the AFM image deconvolution process is learned through LSTM model training, and the model is finally used to restore actual AFM images. The method avoids the cost and the loss of image quality caused by tip wear when AFM images are acquired in large numbers. Using the LSTM model for the deconvolution helps to repair the regions of the AFM image where the artifact effect is severe, and at the same time addresses vanishing or exploding gradients during training.

Drawings

FIG. 1 is a schematic diagram of AFM simulation images acquired in the present invention.

FIG. 2 is a three-dimensional view of a simulated needle tip according to the method of the present invention.

FIG. 3 is a three-dimensional diagram of a simulated nano-grating according to the method of the present invention.

FIG. 4 is a three-dimensional view of the simulated nano-grating scan result of the present invention.

FIG. 5 is an algorithmic flow chart of the method of the present invention.

FIG. 6 is a diagram of the LSTM model architecture of the method of the present invention.

FIG. 7 is a graph of loss value versus iteration number in neural network training for the method of the present invention.

FIG. 8 is a top view of a nano-grating in the test results of the method of the present invention.

Detailed Description

The method obtains neural network training samples by simulating, in Matlab, the AFM scanning of a nano-grid standard used in nanometrology. Each line-scan height curve in the AFM measurement image of the nano-grid is then treated as a time-series sample and input into the constructed LSTM framework for training to obtain a prediction model. The test data are input into the model, and the predictions are compared with the actual results. This shows that the model can effectively restore the actual topography of the sample, which is of practical significance for processing AFM scan data and improving measurement accuracy, and the model can be interfaced with related AFM imaging software.

The invention comprises the following steps:

The scanning process of AFM contact measurement does not need to take into account the small deformation caused by the interaction force between the probe and the sample. Assume that the cross-sectional profile of the probe, with the tip at the origin of the coordinate system, is s(x), and that the cross-sectional profile of the sample is f(x). The position of the probe on the scan line is defined by the abscissa of the tip. If the abscissa of the tip is x₀, the ordinate h of the tip is the measured height of the sample at scanning point x₀. Denoting the cross-sectional profile of the probe at this position by t(x), the relationship between s(x) and t(x) satisfies the expression

t(x)＝s(x)+h (1)

The relationship between f (x) and t (x) satisfies the expression

Thus, according to equation (3), the expression for h can be written as

By combining formula (1) and formula (2), formula (3) can be converted into

In the simulation process, a nano-grid standard device is selected as a tested sample, and a silicon nitride needle tip is selected as a scanning device.

To better capture the features of the nano-grid and the probe during training, their parameters can be set as follows: the line width W of the nano-grid ranges from 20 to 40 nm in steps of 5 nm and the height H from 10 to 30 nm in steps of 5 nm, giving 25 nano-grid types in total, with a corresponding matrix size of 140 × 140 and a resolution of 1 nm/pixel; the radius of curvature R of the silicon nitride tip ranges from 10 to 30 nm in steps of 5 nm and the cone angle θ from 5° to 80° in steps of 5°, giving 80 tip types in total, with a corresponding matrix size of 30 × 30 and a resolution of 1 nm/pixel. The simulated scanning process yields 2000 groups of sample images (25 nano-grid types × 80 tip types), of which 1600 groups are used as the training set and 400 groups as the test set. In addition, since in actual scanning the probe types are limited whereas the nano-grid topographies are unlimited, the training set needs to include all probe types but does not need to include all nano-grid types.
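The parameter grid and the train/test split can be checked with a short sketch. It assumes a cone-angle step of 5°, which is the reading that yields 80 tip types, and it uses a hypothetical helper `split_keeping_all_tips` that reserves one sample per tip type before filling the training set. The source states only that all probe types must appear in the training set, not how the split is made.

```python
import itertools
import random

widths = range(20, 41, 5)     # line width W: 20, 25, ..., 40 nm (5 values)
heights = range(10, 31, 5)    # height H: 10, 15, ..., 30 nm (5 values)
radii = range(10, 31, 5)      # tip radius R: 10, 15, ..., 30 nm (5 values)
angles = range(5, 81, 5)      # cone angle: 5, 10, ..., 80 degrees (16 values)

grids = list(itertools.product(widths, heights))    # nano-grid types
tips = list(itertools.product(radii, angles))       # tip types
samples = list(itertools.product(grids, tips))      # one sample per (grid, tip) pair

def split_keeping_all_tips(samples, n_train, seed=0):
    """Random split in which every tip type appears in the training set."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    train, rest, seen = [], [], set()
    # First pass: reserve one sample per tip type for the training set.
    for grid, tip in shuffled:
        if tip not in seen:
            seen.add(tip)
            train.append((grid, tip))
        else:
            rest.append((grid, tip))
    # Second pass: fill the training set up to n_train; the remainder is the test set.
    need = n_train - len(train)
    train.extend(rest[:need])
    return train, rest[need:]

train, test = split_keeping_all_tips(samples, n_train=1600)
```

The counts reproduce the figures in the text: 25 nano-grid types, 80 tip types and 2000 samples, split into 1600 training and 400 test samples.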

Because the AFM image obtained in step 1 is equivalent to a set of row vectors, it can be decomposed line by line to reduce its dimensionality, so that each row vector becomes a training sample of the network. Since the AFM simulation image is a 140 × 140 matrix, the input data and the output result are both 1 × 140 row vectors. To suppress non-convergence and overfitting during training, and to further improve the stability and performance of the neural network, the input data must be normalized according to formula (6).

The convolution in AFM scanning arises from gradient changes in the sample topography and is not related to the absolute height of the sample, so deconvolution of the AFM image focuses on how the scan height curve of the sample varies. If the positions of the data in the input matrix are changed, the values in the output matrix themselves change, not merely their spatial positions. The data in the input matrix are therefore ordered and can be regarded as a time series, and since the input and output matrices have the same size, the LSTM model is suitable for this training task.

The LSTM model comprises four parts: a forget gate, an input gate, an output gate and a candidate memory cell. The gate inputs of the LSTM are the current time-step input X_t and the previous time-step hidden state H_t-1, which matches the fact that the gradient at a point on an AFM scan line is determined by the height values of adjacent points. The gate outputs are computed by a fully connected layer with a sigmoid activation function, as expressed in formula (7).

After C_t is obtained, O_t is needed to control the flow of information from the memory cell to the current time-step hidden state H_t:

H_t＝O_t⊙tanh(C_t) (10)

When the LSTM model is adjusted by gradient descent, the outputs are AFM scan lines in row-vector form, so the restoration quality of the AFM image is evaluated using the modified mean absolute error (MMAE) as the loss function, with the 1-norm used in the summation. The loss function is given by formula (11).

In the gradient-descent iterations, the Adam optimization algorithm is used. Adam applies an exponentially weighted moving average to the mini-batch stochastic gradient, applies bias correction to some of the variables, and uses an adaptive learning rate to continuously optimize the network parameters. Compared with stochastic gradient descent, it updates the network weights more effectively and ultimately accelerates convergence. The adaptive learning-rate formula of the Adam algorithm is given by formula (12).

The invention is further illustrated by the following figures and examples.

Example:

Following the AFM simulation image schematic shown in FIG. 1, the parameters are set as follows: the line width W of the nano-grid is 20-40 nm in steps of 5 nm and the height H is 10-30 nm in steps of 5 nm, with a corresponding matrix size of 140 × 140 and a resolution of 1 nm/pixel; the radius of curvature R of the silicon nitride tip is 10-30 nm in steps of 5 nm and the cone angle θ is 5-80° in steps of 5°, with a corresponding matrix size of 30 × 30 and a resolution of 1 nm/pixel. Of the 2000 groups of sample images obtained in the simulated scanning process, 1600 groups are used as the training set and 400 groups as the test set, and the training set needs to include all probe types. The simulated tip model, the simulated nano-grating model and the simulated AFM image are shown in FIG. 2, FIG. 3 and FIG. 4 respectively.

After the AFM simulation images are obtained, the network is trained on the samples; the flow of the algorithm used is shown in FIG. 5.

Step 2: carrying out the normalization operation on the simulation sample images obtained in step 1.

Step 3: inputting the data obtained in step 2 into the LSTM model for training.

The LSTM model comprises four parts: a forget gate, an input gate, an output gate and a candidate memory cell. The gate inputs of the LSTM are the current time-step input and the previous time-step hidden state, and the gate outputs are obtained from a fully connected layer with a sigmoid function. In the memory layer, the memory cell of the current time step is obtained through the Hadamard product of matrices, and the hidden state of the current time step is obtained through the tanh function, as shown in FIG. 6.

Step 4: continuously refining the LSTM model using the improved MAE loss function and the Adam optimization algorithm.

A graph of loss values versus number of iterations in the training is shown in fig. 7.

For the output layer of the LSTM model, Dropout regularization is applied: some output features of the layer are randomly discarded during training, which reduces overfitting and makes the trained model applicable to more types of AFM images.

The output obtained by inputting the actual AFM image into the trained model is processed with an adaptive stepwise-interpolation algorithm, so that the positions of the feature points in the sample can be obtained more accurately.

The test results of the method of the invention are shown in FIG. 8: (a) is the top view of the actual nano-grating sample, (b) is the top view of the actual nano-grating sample after AFM scanning, and (c) is the top view of the AFM restored image obtained by the trained model. Atomic force microscope image restoration is thus achieved.

The above description is only a preferred embodiment of the method of the present invention, but the scope of the method of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the method of the present invention are included in the scope of the method of the present invention. Therefore, the protection scope of the method of the present invention shall be subject to the protection scope of the claims.

Claims

1. An atomic force microscope image restoration method based on a long short-term memory network, characterized by comprising the following steps:

step 2: carrying out normalization operation on the simulation sample;

step 3: inputting the normalized simulation sample into an LSTM model for training;

the LSTM model comprises a forget gate, an input gate, an output gate and a candidate memory cell; the gate inputs of the LSTM are the current time-step input and the previous time-step hidden state, and the gate outputs are obtained from a fully connected layer with a sigmoid function;

in a memory layer, memory cells in the current time step are obtained through the Hadamard product of the matrix, so that the flow of information in the row vector of the AFM image is controlled, and the hidden state in the current time step is obtained through a tanh function;

step 6: and inputting the actual AFM image into the trained model to obtain an AFM restoration image.

2. The atomic force microscope image restoration method based on the long-short term memory network as claimed in claim 1, wherein the process of obtaining the simulation sample in step 1 is as follows:

converting a three-dimensional process of AFM contact measurement into a two-dimensional process, and converting a binary function representing the process into a univariate function; using s (x) to represent the cross section outline shape of the probe when the needle point is at the original point position in the coordinate system, f (x) the cross section outline shape of the sample, and then obtaining the relation between the height h (x) of the sample obtained by AFM scanning and f (x) and s (x);

setting parameters of a simulation nano grid and a simulation silicon nitride needle tip;

performing the simulation using the dilation operation of mathematical morphology to obtain a plurality of groups of training samples.

3. The atomic force microscope image restoration method based on the long-short term memory network as claimed in claim 1, wherein the normalization process in the step 2 is as follows:

the simulation sample obtained in the step 1 is equivalent to a set of row vectors, and the AFM image is decomposed line by line to realize dimension reduction, so that the row vectors become training samples of the network; in this process, both the input data and the output result are row vectors of 1 × 140, and normalization processing is applied to the input data.

4. The atomic force microscope image restoration method based on the long-short term memory network as claimed in claim 1, wherein in the step 4, to accommodate output results in vector form, the LSTM model is continuously refined using an improved MAE loss function (MMAE) and the Adam optimization algorithm; the MMAE loss function is expressed as

where n represents the number of training samples, l represents the vector dimension of the network output, ŷ_i represents the estimated value of each line of the AFM restored image, and y_i represents the true value of each line of the AFM restored image;

the learning rate parameter adaptive formula of the Adam algorithm is

where β⁰ denotes the initially given hyperparameter and βⁱ denotes the hyperparameter after i iterations.

5. The atomic force microscope image restoration method based on the long-short term memory network as claimed in claim 1, wherein the regularization and de-normalization processing of the training samples in step 5 is as follows:

for the output layer passing through the LSTM model, a Dropout regularization method is adopted, and some output characteristics of the layer are abandoned randomly in the training process, so that the overfitting phenomenon is reduced, and the model obtained through training is suitable for more types of scanning sample images;

in order to make the AFM restored image close to the actual sample, all data are subjected to an inverse normalization operation before an output result is obtained.