CN113222031A - Photolithographic hot zone detection method based on federal personalized learning - Google Patents

Publication number
CN113222031A
Authority
CN
China
Prior art keywords
nodes
parameters
node
model
global
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110545686.2A
Other languages
Chinese (zh)
Other versions
CN113222031B (en
Inventor
卓成
林学忠
徐金明
孟文超
朱建新
黄炎
朱泽晗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202110545686.2A priority Critical patent/CN113222031B/en
Publication of CN113222031A publication Critical patent/CN113222031A/en
Application granted granted Critical
Publication of CN113222031B publication Critical patent/CN113222031B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Exposure And Positioning Against Photoresist Photosensitive Materials (AREA)

Abstract

The invention discloses a lithography hotspot detection method based on federated personalized learning. A central server aggregates the global model parameters returned by the nodes in order to fuse the features the nodes have in common, updates the global model parameters, and feeds the latest global model parameters back to each node. Each node downloads the global model parameters from the central server and then trains its local model parameters on local data to find the best local model parameters under the current global model parameters, which overcomes model heterogeneity and data heterogeneity across nodes. After the local model parameters have been fine-tuned, each node trains all parameters on local data to find the best values of the current parameters, which are used to capture the common features of the different nodes. The method addresses the model overfitting caused by too little local data, keeps the data of chip design manufacturers private, and improves the stability and overall accuracy of the federated personalized learning model in heterogeneous environments.

Description

Photolithographic hot zone detection method based on federal personalized learning
Technical Field
The invention belongs to the field of machine learning, and in particular relates to a lithography hotspot detection method based on federated personalized learning.
Background
A lithography hotspot is a region of an integrated circuit layout that is prone to manufacturing defects. Detecting lithography hotspots quickly and accurately is a problem that still needs to be solved. Current hotspot detection methods fall into the following four categories:
1. Lithography simulation. Fast planar lithography simulation is performed on a one-dimensional chip layout by exploiting the partial coherence of the light source in the lithography system and the properties of one-dimensional chip patterns. The simulation method consists of a lookup-table method for one-dimensional primitive patterns, a minimal lookup table with its edge extension, and large-area layout simulation without cutting. Traditional lithography hotspot detection relies heavily on lithography simulation, which detects the hotspots in a layout with very high accuracy. However, its computational complexity is high and its runtime is long, so it is ill suited to fast hotspot detection in the test stage and is generally reserved for the final verification stage of manufacturing.
2. Pattern recognition. A pattern-recognition-based hotspot detection framework uses important design rules to characterize the topological features of lithography hotspots and a tangent-space distance metric to analyze and classify hotspot patterns. Pattern recognition detects lithography hotspots quickly and accurately, but its accuracy is unsatisfactory on previously unseen hotspot patterns.
3. Centralized machine learning. Data from all nodes is collected on a central server, and a convolutional neural network is trained on the pooled data until it converges, yielding a lithography hotspot detection model; fig. 1 shows a schematic of this training process. The method extracts deep features of the layout and markedly improves detection efficiency, but it needs a large amount of training data. Because chip design manufacturers do not share data with one another for privacy reasons, that data is hard to obtain, and the model is prone to overfitting.
4. Federated learning. A basic federated learning framework has a central server and several nodes (clients). Each node keeps its own unshared local lithography hotspot data, uses it to train a convolutional neural network hotspot detection model, and uploads the model to the central server. The central server organizes the local training of the nodes, aggregates the node models, and sends the aggregated model back to the nodes; this constitutes one round. The nodes then continue training from the aggregated model, and the rounds repeat until the model converges to a single unified lithography hotspot detection model; fig. 2 shows a schematic of this training process. The method addresses the data-island problem caused by the lack of data sharing among chip design manufacturers, but federated learning often performs poorly under high data heterogeneity and asynchronous participation, and has difficulty meeting the required detection accuracy.
Disclosure of Invention
The invention aims to provide a lithography hotspot detection method based on federated personalized learning that addresses the deficiencies of the prior art.
This aim is achieved by the following technical solution: a lithography hotspot detection method based on federated personalized learning, comprising the following steps:
S1. Based on the lithography hotspot data of each node, construct for each node a convolutional neural network with the same architecture but different parameters;
S2. Determine the global model parameters and the local model parameters according to the similarity of the convolutional neural network parameters across nodes:
Compare the parameter distances of the same layer across the convolutional neural networks trained by the different nodes. For the j-th layer of the network, compute the difference between each node's j-th layer parameters and the mean of the j-th layer parameters over all nodes. If the sum of the 2-norms of all these differences is less than or equal to a distance threshold δ, the j-th layer parameters are treated as common feature parameters of the nodes and are designated global model parameters to be aggregated; otherwise they are treated as incompatible, node-specific features and are designated local model parameters to be fine-tuned locally;
S3. Establish the lithography hotspot federated personalized learning model:
$$\min_{w_{global},\,W_{local}} F(w_{global}, W_{local}) = \sum_{k=1}^{N} p_k F_k(w_{global}, w^{k}_{local})$$
where $w_{global}$ denotes the global model parameters shared by all nodes; $W_{local} = [w^{1}_{local}, \dots, w^{N}_{local}]$ is the matrix whose k-th column $w^{k}_{local}$ holds the local model parameters of the k-th node; N is the number of nodes; $p_k = n_k / n$ is the probability with which node k is selected, with $p_k \ge 0$ and $\sum_{k=1}^{N} p_k = 1$; $n_k$ is the number of samples of node k; and $n = \sum_{k=1}^{N} n_k$ is the total number of samples over all nodes. In this model, F is the overall empirical loss function, and $F_k(\cdot)$ is the local empirical loss function of node k on its lithography hotspot data distribution $\mathcal{D}_k$; $F_k(\cdot)$ is non-convex. Suppose the k-th node holds $n_k$ lithography hotspot training samples $x_{k,1}, x_{k,2}, \dots, x_{k,n_k}$. Then $F_k(\cdot)$ can be defined as:
$$F_k(w_{global}, w^{k}_{local}) = \frac{1}{n_k} \sum_{j=1}^{n_k} l(w_{global}, w^{k}_{local}; x_{k,j})$$
where $l(\cdot)$ is the loss function on a single sample;
S4. Iteratively update the parameters of the lithography hotspot federated personalized learning model. Round t proceeds as follows:
First, the central server broadcasts the latest global model parameters $w_{t,global}$ to all nodes.
Second, let the lithography hotspot federated personalized learning model of the k-th node at the start of the round be $(w_{t,global}, w^{k}_{t,0,local})$. The node performs $E_{local}$ ($E_{local} \ge 1$) fine-tuning steps on its local model parameters, for $i = 0, \dots, E_{local}-1$:
$$w^{k}_{t,i+1,local} = w^{k}_{t,i,local} - \eta_t \nabla_{w_{local}} F_k(w_{t,global}, w^{k}_{t,i,local}; \xi_k)$$
where $\eta_t$ is the learning rate and $\xi_k$ is a sample drawn uniformly from the local lithography hotspot data. The model of node k is now $(w_{t,global}, w^{k}_{t,E_{local},local})$.
Third, the node performs E ($E \ge 1$) updates of all parameters, starting from $w^{k}_{t,0,global} = w_{t,global}$, for $i = 0, \dots, E-1$:
$$(w^{k}_{t,i+1,global}, w^{k}_{t,E_{local}+i+1,local}) = (w^{k}_{t,i,global}, w^{k}_{t,E_{local}+i,local}) - \eta_t \nabla F_k(w^{k}_{t,i,global}, w^{k}_{t,E_{local}+i,local}; \xi_k)$$
Finally, the central server aggregates the global model parameters of the nodes' lithography hotspot models to generate the new global model parameters $w_{t+1,global}$;
S5. Repeat the iterative updates for multiple rounds until the lithography hotspot federated personalized learning model converges, and use the converged model to detect lithography hotspots.
Further, the convolutional neural network of each node is constructed as follows:
The convolutional neural network of each node consists of two convolution units followed by two fully connected layers, connected in sequence;
Each convolution unit comprises two convolutional layers, a ReLU layer and a max-pooling layer connected in sequence. In each convolution, a set of convolution kernels is convolved with the input lithography hotspot data tensor; the ReLU layer activates the output of the convolutional layer so that the network is nonlinear and sparse; the max-pooling layer downsamples the output of the ReLU layer by 2 × 2 and serves as the output layer of the convolution unit;
The two convolution units are followed by two fully connected layers. During training, dropout is applied to the first fully connected layer to mitigate overfitting. The second fully connected layer is the output layer of the network and has two output channels, which give the predicted probabilities of the hotspot and non-hotspot classes.
Further, in S4, when all nodes participate in the aggregation (synchronous aggregation), the process is as follows:
The new global parameters $w_{t+1,global}$ are generated from the global parameters $w^{k}_{t,E,global}$ ($k = 1, \dots, N$) of the lithography hotspot models of all nodes, where $w^{k}_{t,E,global}$ denotes the global parameters held by node k after its E all-parameter updates in round t. After all nodes have finished their parameter updates for the round, they send their global parameters to the central server, which aggregates them with weights $p_k = n_k / n$:
$$w_{t+1,global} = \sum_{k=1}^{N} p_k\, w^{k}_{t,E,global}$$
Further, in S4, when only some of the nodes participate in the aggregation (asynchronous aggregation), the process is as follows:
Set a threshold K ($1 \le K < N$) on the number of aggregated nodes, and let the central server collect the outputs of the first K nodes to respond. Once the outputs of K nodes have been collected, the central server stops waiting for the remaining nodes; in this update, the (K+1)-th to N-th nodes are treated as stragglers. Let $S_t$ ($|S_t| = K$) be the set of the first K responding nodes in the t-th iteration. The aggregation formula is:
$$w_{t+1,global} = \sum_{k \in S_t} \frac{n_k}{n_K}\, w^{k}_{t,E,global}$$
where $n_K = \sum_{k \in S_t} n_k$ is the total number of samples of the first K nodes, and $w^{k}_{t,E,global}$ denotes the global parameters held by node k after its E all-parameter updates in round t.
The invention has the following beneficial effects:
1. Because chip design manufacturers do not share data, researchers have difficulty obtaining enough data for model training. The method breaks through this data-island problem and addresses the model overfitting caused by too little local data.
2. The data of chip design manufacturers stays protected, so privacy is preserved.
3. The method addresses the low accuracy of traditional federated learning models and effectively improves model accuracy.
4. The method addresses the data heterogeneity and asynchrony problems faced by traditional federated learning: it effectively overcomes data heterogeneity and supports online dynamic updating of each node.
Drawings
FIG. 1 is a schematic diagram of centralized machine learning;
FIG. 2 is a schematic diagram of conventional federated learning;
FIG. 3 is a diagram illustrating relative distances between neural network layers in an embodiment of the present invention;
FIG. 4 is a schematic diagram of a lithography hotspot detection method based on federated personalized learning in an embodiment of the present invention;
FIG. 5 and FIG. 6 show, respectively, the experimental results and the accuracy comparison for synchronous training with the two data sets divided into 2 nodes;
FIG. 7 and FIG. 8 show, respectively, the experimental results and the accuracy comparison for synchronous training with the two data sets divided into 4 nodes;
FIG. 9 and FIG. 10 show, respectively, the experimental results and the accuracy comparison for synchronous training with the two data sets divided into 10 nodes;
FIG. 11 and FIG. 12 show, respectively, the experimental results and the accuracy comparison for asynchronous training with the two data sets divided into 4 nodes;
FIG. 13 and FIG. 14 show, respectively, the experimental results and the accuracy comparison for asynchronous training with the two data sets divided into 10 nodes.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
When lithography-related adjustments are made to an integrated circuit design layout, some layout regions are less robust to the adjustments and are more likely to cause open-circuit or short-circuit failures during manufacturing; such failure-prone regions are defined as lithography hotspots. The main goal of lithography hotspot detection is to maximize detection accuracy and minimize the detection error rate. Training a hotspot detection model that performs well typically requires a large amount of data. However, because the data is private, factories that hold lithography hotspot data do not share it with one another. Each factory therefore has difficulty obtaining enough data for model training, which easily leads to overfitting. Federated learning is introduced to lithography hotspot detection in order to learn the data characteristics of every node while protecting data privacy. Individual factories generate and collect lithography hotspot data in highly heterogeneous ways, and the amount of hotspot data can vary greatly from factory to factory. Such data violates the independent and identically distributed (IID) assumption commonly made in federated learning, and conventional federated learning cannot cope with this statistical heterogeneity. The invention therefore introduces a lithography hotspot detection method based on federated personalized learning, whose goal is factory-specific personalized modeling, which is generally a more effective way of handling statistical heterogeneity in the data.
Firstly, the neural network architecture of each node
Each node uses a local convolutional neural network (CNN), an architecture that performs very well in image classification. A CNN consists of several convolution units that extract features and several fully connected layers that produce the class probabilities of a sample.
In the invention, the convolutional neural network of each node consists of two convolution units followed by two fully connected layers, and each convolution unit comprises two convolutional layers, a ReLU layer and a max-pooling layer connected in sequence. In each convolution, a set of convolution kernels performs the following operation on the input lithography hotspot data tensor X:
Y=conv(W,X)+b
where W is the weight matrix of the convolutional layer, b is the bias, and Y is the output of the convolutional layer. In this embodiment all convolution kernels are 3 × 3, and the two convolutional layers of each convolution unit have 16 and 32 output channels, respectively. ReLU is an activation function applied to each element y of the convolutional layer output Y; it ensures that the network is nonlinear and sparse, and is given by:
$$\mathrm{ReLU}(y) = \max(0, y)$$
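The convolution, ReLU and 2 × 2 max-pooling operations described above can be illustrated with a minimal pure-Python sketch. It is a simplification under stated assumptions, not the patent's implementation: it handles a single input channel and a single kernel, uses no padding, and the function names `conv2d_single`, `relu` and `max_pool_2x2` are hypothetical.

```python
def conv2d_single(x, w, b):
    """Slide one kernel w over one 2-D input x (no padding) and add bias b."""
    kh, kw = len(w), len(w[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    return [
        [
            sum(w[i][j] * x[r + i][c + j] for i in range(kh) for j in range(kw)) + b
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(y):
    """Element-wise ReLU(y) = max(0, y)."""
    return [[max(0.0, v) for v in row] for row in y]


def max_pool_2x2(y):
    """Non-overlapping 2 x 2 max pooling (halves each spatial dimension)."""
    return [
        [
            max(y[r][c], y[r][c + 1], y[r + 1][c], y[r + 1][c + 1])
            for c in range(0, len(y[0]) - 1, 2)
        ]
        for r in range(0, len(y) - 1, 2)
    ]


# Toy 6 x 6 input and a 3 x 3 kernel that picks out the centre pixel.
x = [[float(r - c) for c in range(6)] for r in range(6)]
w = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
b = -1.0

pooled = max_pool_2x2(relu(conv2d_single(x, w, b)))
print(pooled)  # [[0.0, 0.0], [2.0, 0.0]]
```

The 6 × 6 input shrinks to 4 × 4 after the unpadded 3 × 3 convolution and to 2 × 2 after pooling; negative responses are zeroed by the ReLU, which is the sparsity the text refers to.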
The max-pooling layer downsamples the output of the previous layer by 2 × 2 and serves as the output layer of the convolution unit. The two convolution units are followed by two fully connected layers with 250 and 2 output channels, respectively. During training, dropout with a probability of 50% is applied to the first fully connected layer to mitigate overfitting. The second fully connected layer is the output layer of the network; its two output channels give the predicted probabilities of the hotspot and non-hotspot classes. The model configuration is detailed in table 1.
TABLE 1 Node neural network model configuration
[Table 1: image not reproduced]
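To get a feel for how the parameters are distributed over the layers described above, the following sketch counts the weights and biases per layer. The layer widths (3 × 3 kernels, 16 and 32 output channels per convolution unit, fully connected layers of 250 and 2 outputs) come from the text; the input shape, the "same" padding of the convolutions and the helper names are assumptions made only for illustration, since the table image is not available.

```python
def conv_params(c_in, c_out, k=3):
    """Weights plus one bias per output channel for a k x k convolution."""
    return c_in * c_out * k * k + c_out


def fc_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def count_parameters(in_channels, height, width):
    """Per-layer parameter counts for the two-unit CNN sketched in the text.

    Assumes 'same' padding, so only the 2 x 2 pooling changes the spatial size.
    """
    counts = {}
    c, h, w = in_channels, height, width
    for unit in (1, 2):
        for idx, c_out in enumerate((16, 32), start=1):
            counts[f"conv{unit}_{idx}"] = conv_params(c, c_out)
            c = c_out
        h, w = h // 2, w // 2  # 2 x 2 max pooling at the end of each unit
    counts["fc1"] = fc_params(c * h * w, 250)
    counts["fc2"] = fc_params(250, 2)
    return counts


# Hypothetical input: one channel, 32 x 32 pixels.
counts = count_parameters(in_channels=1, height=32, width=32)
for name, n in counts.items():
    print(f"{name}: {n}")
print("fc2 share of all parameters:", counts["fc2"] / sum(counts.values()))
```

Whatever the input shape, fc2 has 250 × 2 + 2 = 502 parameters, so it is always a small fraction of the model; the exact share depends on the input size assumed here.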
Secondly, determining the global model parameters and the local model parameters
The similarity of the nodes' data can be inferred from the similarity of their neural network model parameters. Following this idea, the model parameters are used as a means of extracting common features: the parameter distances of the same layer are compared across the neural networks trained by the different nodes. For a given layer (say, the j-th layer), the difference between each node's j-th layer parameters and the mean of the j-th layer parameters is computed. If the sum of the 2-norms of all these differences is less than or equal to a distance threshold δ, the j-th layer parameters are treated as common feature parameters of the nodes and are designated global model parameters to be fused, that is:
$$\sum_{k=1}^{N} \left\| W_{k,j} - \bar{W}_j \right\|_2 \le \delta$$
where $W_{k,j}$ is the j-th layer parameter of the k-th node's model and $\bar{W}_j$ is the mean of the j-th layer parameters of the N models, $\bar{W}_j = \frac{1}{N} \sum_{k=1}^{N} W_{k,j}$.
If the sum of the 2-norms of all the differences is greater than δ, the parameters of the layer are treated as incompatible, node-specific features. They are kept as local parameters and updated locally: after the global model parameters have been aggregated, the local model parameters are fine-tuned to improve the performance of the local model, that is:
$$\sum_{k=1}^{N} \left\| W_{k,j} - \bar{W}_j \right\|_2 > \delta$$
Applying this method gives the distance of every neural network layer across all nodes. In this embodiment, the first fully connected layer fc1 has the smallest distance, which is taken as the reference for the relative distances of the other layers, as shown in fig. 3. The distance of the last layer fc2 is significantly larger than that of the other layers, so fc2 is chosen as the locally fine-tuned layer of the convolutional neural network.
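The layer-wise distance test can be sketched in a few lines of Python. This is an illustrative sketch under assumptions: each layer's parameters are flattened into a plain list of floats, and the names `layer_distance` and `split_layers`, the toy parameter values and the threshold are hypothetical.

```python
import math


def layer_distance(layer_params):
    """Sum over nodes of the 2-norm of (node's layer parameters - layer mean)."""
    n_nodes = len(layer_params)
    dim = len(layer_params[0])
    mean = [sum(p[d] for p in layer_params) / n_nodes for d in range(dim)]
    return sum(
        math.sqrt(sum((p[d] - mean[d]) ** 2 for d in range(dim)))
        for p in layer_params
    )


def split_layers(node_models, delta):
    """Split layer names into global (distance <= delta) and local (> delta).

    node_models: one dict per node, mapping layer name -> flat parameter list.
    """
    global_layers, local_layers = [], []
    for name in node_models[0]:
        distance = layer_distance([model[name] for model in node_models])
        (global_layers if distance <= delta else local_layers).append(name)
    return global_layers, local_layers


# Two toy nodes: fc1 agrees across nodes, fc2 differs strongly.
node_models = [
    {"fc1": [1.0, 2.0], "fc2": [0.0, 0.0]},
    {"fc1": [1.0, 2.0], "fc2": [6.0, 8.0]},
]
print(split_layers(node_models, delta=1.0))  # (['fc1'], ['fc2'])
```

In this toy case fc2 has a distance of 10 (each node is 5 away from the mean), so it ends up as a local layer, mirroring the choice of fc2 in the embodiment.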
Thirdly, the lithography hotspot federated personalized learning model is established as follows:
$$\min_{w_{global},\,W_{local}} F(w_{global}, W_{local}) = \sum_{k=1}^{N} p_k F_k(w_{global}, w^{k}_{local})$$
where $w_{global}$ denotes the global model parameters shared by all nodes; $W_{local} = [w^{1}_{local}, \dots, w^{N}_{local}]$ is the matrix whose k-th column $w^{k}_{local}$ holds the local model parameters of the k-th node; N is the number of nodes; $p_k = n_k / n$ is the probability with which node k is selected, with $p_k \ge 0$ and $\sum_{k=1}^{N} p_k = 1$; $n_k$ is the number of samples of node k; and $n = \sum_{k=1}^{N} n_k$ is the total number of samples over all nodes. In this model, F is the overall empirical loss function, and $F_k(\cdot)$ is the local empirical loss function of node k on its lithography hotspot data distribution $\mathcal{D}_k$; $F_k(\cdot)$ is non-convex. Suppose the k-th node holds $n_k$ lithography hotspot training samples $x_{k,1}, x_{k,2}, \dots, x_{k,n_k}$. Then $F_k(\cdot)$ can be defined as:
$$F_k(w_{global}, w^{k}_{local}) = \frac{1}{n_k} \sum_{j=1}^{n_k} l(w_{global}, w^{k}_{local}; x_{k,j})$$
where $l(\cdot)$ is the loss function on a single sample.
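As a small worked illustration of this objective, the sketch below computes the node weights from the sample counts and evaluates the weighted sum of local empirical losses. The scalar parameters, the squared-error toy loss and the function names are assumptions chosen for brevity; the patent's model is a CNN with a classification loss.

```python
def node_weights(sample_counts):
    """p_k = n_k / n, where n is the total number of samples over all nodes."""
    n = sum(sample_counts)
    return [n_k / n for n_k in sample_counts]


def local_empirical_loss(loss_fn, w_global, w_local, samples):
    """F_k: mean per-sample loss over the samples held by one node."""
    return sum(loss_fn(w_global, w_local, x) for x in samples) / len(samples)


def overall_loss(loss_fn, w_global, local_params, node_samples):
    """F: sum over nodes of p_k * F_k(w_global, w_local^k)."""
    weights = node_weights([len(samples) for samples in node_samples])
    return sum(
        p_k * local_empirical_loss(loss_fn, w_global, w_k, samples)
        for p_k, w_k, samples in zip(weights, local_params, node_samples)
    )


def toy_loss(w_global, w_local, x):
    # Hypothetical per-sample loss: squared distance of the sample from
    # the sum of the shared and the node-specific parameter.
    return (w_global + w_local - x) ** 2


node_samples = [[1.0, 3.0], [2.0]]  # node 1 holds 2 samples, node 2 holds 1
local_params = [0.0, 1.0]           # one local parameter per node
print(node_weights([2, 1]))         # weights of roughly 2/3 and 1/3
print(overall_loss(toy_loss, 1.0, local_params, node_samples))
```

Node 1 has a local loss of 2.0 and node 2 a local loss of 0.0, so the overall loss is 2/3 × 2.0 ≈ 1.33: nodes with more samples weigh more heavily in the objective.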
Fourthly, the parameters of the lithography hotspot federated personalized learning model are updated iteratively as follows:
In round t, the central server first broadcasts the latest global model parameters $w_{t,global}$ to all nodes. Next, let the lithography hotspot federated personalized learning model of the k-th node at the start of the round be $(w_{t,global}, w^{k}_{t,0,local})$. The node performs $E_{local}$ ($E_{local} \ge 1$) updates of its local model parameters, for $i = 0, \dots, E_{local}-1$:
$$w^{k}_{t,i+1,local} = w^{k}_{t,i,local} - \eta_t \nabla_{w_{local}} F_k(w_{t,global}, w^{k}_{t,i,local}; \xi_k)$$
where $\eta_t$ is the learning rate (also called the step size) and $\xi_k$ is a sample drawn uniformly from the local lithography hotspot data. The model of node k is now $(w_{t,global}, w^{k}_{t,E_{local},local})$.
The node then performs E ($E \ge 1$) updates of all parameters, starting from $w^{k}_{t,0,global} = w_{t,global}$, for $i = 0, \dots, E-1$:
$$(w^{k}_{t,i+1,global}, w^{k}_{t,E_{local}+i+1,local}) = (w^{k}_{t,i,global}, w^{k}_{t,E_{local}+i,local}) - \eta_t \nabla F_k(w^{k}_{t,i,global}, w^{k}_{t,E_{local}+i,local}; \xi_k)$$
Finally, the central server aggregates the global model parameters of the nodes' lithography hotspot models to generate the new global model parameters $w_{t+1,global}$. Two cases are distinguished: all nodes participate in the aggregation, or only some of them do. Full participation is the ideal case; in real deployments, often only some of the nodes can participate (asynchronous aggregation).
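The two-phase node update of one round (fine-tune the local parameters with the global ones frozen, then update all parameters) can be sketched on a toy model. Everything model-specific here is an assumption for illustration: a scalar linear model `w_global * x + w_local` with squared-error loss stands in for the CNN, and the names `grad` and `node_round` are hypothetical.

```python
import random


def grad(w_global, w_local, sample):
    """Gradient of 0.5 * (w_global * x + w_local - y)^2 for one sample."""
    x, y = sample
    err = w_global * x + w_local - y
    return err * x, err  # (d/dw_global, d/dw_local)


def node_round(w_global, w_local, data, lr, e_local, e_all, rng):
    """One round on one node, starting from the broadcast global parameters."""
    # Phase 1: e_local SGD steps on the local parameters only.
    for _ in range(e_local):
        _, g_local = grad(w_global, w_local, rng.choice(data))
        w_local -= lr * g_local
    # Phase 2: e_all SGD steps on all parameters.
    for _ in range(e_all):
        g_global, g_local = grad(w_global, w_local, rng.choice(data))
        w_global -= lr * g_global
        w_local -= lr * g_local
    # w_global is what the node would send back for aggregation;
    # w_local stays on the node.
    return w_global, w_local


rng = random.Random(0)
data = [(1.0, 3.0)]  # a single (x, y) sample keeps the run deterministic
print(node_round(1.0, 0.0, data, lr=0.5, e_local=1, e_all=1, rng=rng))  # (1.5, 1.5)
```

With `e_all=0` the global parameter comes back unchanged, which makes the separation between the fine-tuning phase and the all-parameter phase easy to check.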
All nodes participate in the aggregation. The new global model parameters $w_{t+1,global}$ are generated from the global model parameters $w^{k}_{t,E,global}$ ($k = 1, \dots, N$) of the lithography hotspot models of all nodes, where $w^{k}_{t,E,global}$ denotes the global parameters held by node k after its E all-parameter updates in round t. After all nodes have finished their parameter updates for the round, they send their global model parameters to the central server, which aggregates them with weights $p_k = n_k / n$:
$$w_{t+1,global} = \sum_{k=1}^{N} p_k\, w^{k}_{t,E,global}$$
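Synchronous aggregation is a sample-weighted average of the nodes' global parameters. A minimal sketch, assuming each node's global parameters are flattened into a list of floats and using the hypothetical name `aggregate_all`:

```python
def aggregate_all(node_globals, sample_counts):
    """Weighted average of the nodes' global parameters with p_k = n_k / n."""
    n = sum(sample_counts)
    dim = len(node_globals[0])
    return [
        sum(n_k / n * params[d] for params, n_k in zip(node_globals, sample_counts))
        for d in range(dim)
    ]


# Two nodes; the first holds three times as many samples as the second.
node_globals = [[1.0, 2.0], [3.0, 6.0]]
sample_counts = [3, 1]
print(aggregate_all(node_globals, sample_counts))  # [1.5, 3.0]
```

The result lies closer to the first node's parameters because that node contributes 75% of the samples.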
Unfortunately, in practice, requiring all nodes to participate in the aggregation suffers from a severe "straggler effect": every node has to wait for the slowest one. For example, in a federated learning system with thousands of user devices, a small fraction of the devices is always offline. Full device participation means that the central server must wait for these stragglers, which is clearly impractical.
Only some of the nodes participate in the aggregation. This strategy is more practical because it does not require all nodes to be online at the same time (it is asynchronous). A threshold K ($1 \le K < N$) is set on the number of aggregated nodes, and the central server collects the outputs of the first K nodes to respond. Once the outputs of K nodes have been collected, the central server stops waiting for the remaining nodes; in this update, the (K+1)-th to N-th nodes are treated as stragglers. Let $S_t$ ($|S_t| = K$) be the set of the first K responding nodes in the t-th iteration. The aggregation formula is:
$$w_{t+1,global} = \sum_{k \in S_t} \frac{n_k}{n_K}\, w^{k}_{t,E,global}$$
where $n_K = \sum_{k \in S_t} n_k$ is the total number of samples of the first K nodes, and $w^{k}_{t,E,global}$ denotes the global parameters held by node k after its E all-parameter updates in round t.
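The first-K rule can be sketched as follows. The response times are simulated numbers and the tuple layout and the name `aggregate_first_k` are assumptions for illustration; a real server would collect responses as they arrive rather than sort them afterwards.

```python
def aggregate_first_k(responses, k):
    """Aggregate the global parameters of the first k responding nodes.

    responses: list of (node_id, response_time, n_k, global_params) tuples.
    Returns (new_global_params, straggler_ids).
    """
    ordered = sorted(responses, key=lambda r: r[1])  # earliest response first
    selected, stragglers = ordered[:k], ordered[k:]
    n_K = sum(n_k for _, _, n_k, _ in selected)  # samples held by the first k nodes
    dim = len(selected[0][3])
    new_global = [
        sum(n_k / n_K * params[d] for _, _, n_k, params in selected)
        for d in range(dim)
    ]
    return new_global, [node_id for node_id, _, _, _ in stragglers]


responses = [
    ("node_a", 0.9, 3, [0.0, 4.0]),
    ("node_b", 0.2, 1, [4.0, 0.0]),
    ("node_c", 5.0, 100, [9.0, 9.0]),  # slow node, skipped in this round
]
new_global, stragglers = aggregate_first_k(responses, k=2)
print(new_global)   # [1.0, 3.0]
print(stragglers)   # ['node_c']
```

Note that the weights are renormalized by the sample count of the selected nodes only, so the slow node's 100 samples have no influence on this round.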
As shown in fig. 4, the lithography hotspot detection method based on federated personalized learning consists of three main parts:
Central server aggregation: the central server aggregates the global model parameters returned by the nodes in order to fuse their common features, updates the global model parameters, and feeds the latest global model parameters back to each node.
Fine-tuning of the nodes' local model parameters: each node downloads the global model parameters from the central server and then trains its local model parameters on local data to find the best local model parameters under the current global model parameters, which overcomes model heterogeneity and data heterogeneity across nodes.
Updating of all node parameters: after the local model parameters have been fine-tuned, each node trains all parameters on local data to find the best values of the current parameters, which are used to capture the common features of the different nodes.
The lithography hotspot detection method based on federated personalized learning provided by the invention addresses the heterogeneity challenge both in theory and in experiments. Its key idea is that the global model parameters, which represent the features common to the nodes, are fused by federated aggregation, while the local model parameters, which represent the features specific to each node, are fine-tuned locally to adapt to each node's heterogeneity. This improves the stability and overall accuracy of federated personalized learning in heterogeneous (non-IID) environments.
This embodiment uses two data sets: the ICCAD 2012 contest data set and an industrial data set, asml1. Both are highly representative data sets in the field of lithography hotspot detection; their basic information is given in table 2, whose four columns list the numbers of lithography hotspots (hotspots) and non-hotspots (non-hotspots) in the training set and the test set, respectively.
TABLE 2 Basic information of the data sets
[Table 2: image not reproduced]
The experimental details are as follows:
The ICCAD and asml1 data sets are each divided into the same number of subsets, with each subset assigned to one node, and synchronous and asynchronous training are carried out. Specifically, ICCAD and asml1 are each divided into 1, 2 or 5 subsets, giving 2, 4 or 10 subsets in total, which correspond to 2, 4 and 10 nodes; each configuration is tested separately.
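The node layout used in the experiments can be sketched as below. The text does not say how samples were assigned to subsets, so the round-robin split, the placeholder sample lists and the name `split_into_nodes` are assumptions for illustration only.

```python
def split_into_nodes(datasets, shards_per_dataset):
    """Split every data set into the same number of shards, one shard per node.

    Assumption: round-robin assignment of samples to shards.
    """
    nodes = {}
    for name, samples in datasets.items():
        for s in range(shards_per_dataset):
            nodes[f"{name}_{s}"] = samples[s::shards_per_dataset]
    return nodes


# Placeholder sample IDs standing in for the two data sets.
datasets = {"iccad": list(range(10)), "asml1": list(range(100, 110))}
for shards in (1, 2, 5):
    nodes = split_into_nodes(datasets, shards)
    print(shards, "shard(s) per data set ->", len(nodes), "nodes")
```

Splitting each of the two data sets into 1, 2 or 5 shards yields the 2-, 4- and 10-node configurations, and no node ever mixes samples from both data sets.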
The algorithms used for comparison are:
FedAvg (traditional federated learning)
FedProx (an improved variant of traditional federated learning)
Local train (each node only builds and trains its own convolutional neural network for detection, without federation)
Federated Personalized Learning (FPL, the method of the invention, with the last layer fc2 fine-tuned as the local parameters)
First, the synchronous case (all nodes participate in training and parameter aggregation):
The results of synchronous training with the two data sets divided into 2 nodes are shown in fig. 5, and the comparison of overall accuracy (acc) in fig. 6. For each of the 4 algorithms, the left plot shows the true positive rate (TPR, the proportion of all positive samples that are correctly identified as positive) and the right plot shows the false positive rate (FPR, the proportion of all negative samples that are incorrectly identified as positive). A higher TPR is better, and a lower FPR is better. Comparing overall accuracy, the FPL of the invention performs markedly better.
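The three reported metrics follow directly from the confusion counts. A minimal sketch, assuming labels are encoded as 1 for hotspot and 0 for non-hotspot and using the hypothetical name `detection_metrics`:

```python
def detection_metrics(y_true, y_pred):
    """TPR, FPR and overall accuracy for binary labels (1 = hotspot)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "tpr": tp / (tp + fn),        # correctly detected hotspots / all hotspots
        "fpr": fp / (fp + tn),        # false alarms / all non-hotspots
        "acc": (tp + tn) / len(pairs),
    }


y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
print(detection_metrics(y_true, y_pred))  # {'tpr': 0.75, 'fpr': 0.25, 'acc': 0.75}
```

Because hotspot data sets are usually imbalanced, TPR and FPR are worth reading alongside overall accuracy rather than in place of it.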
The results of synchronous training with the two data sets divided into 4 nodes are shown in fig. 7, and the comparison of overall accuracy (acc) in fig. 8. The FPL of the invention still outperforms the other methods.
The results of synchronous training with the two data sets divided into 10 nodes are shown in fig. 9, and the comparison of overall accuracy (acc) in fig. 10. The FPL of the invention still performs significantly better than traditional federated learning (FedAvg) and FedProx. Compared with purely local training, FPL achieves both a higher true positive rate and a lower false positive rate. This shows that when a node has few samples, FPL can successfully use the information of the other nodes to improve that node's performance.
Second, the asynchronous case (i.e. only some of the nodes take part in training and parameter aggregation in each round):
The experimental results for the two data sets divided across 4 nodes with asynchronous training are shown in fig. 11, and the comparison of overall accuracy (acc) is shown in fig. 12. In each round, half of the nodes are randomly selected for training and aggregation. The results are similar to those of 4-node synchronous training. Overall, the FPL of the invention performs better than the other methods.
The experimental results for the two data sets divided across 10 nodes with asynchronous training are shown in fig. 13, and the comparison of overall accuracy (acc) is shown in fig. 14. The FPL of the invention still converges faster than the other methods and performs clearly better, achieving a higher True Positive Rate and a lower False Positive Rate at the same time. These experiments show that the FPL can still use information from other nodes to improve local performance in the asynchronous case.
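The per-round selection used in these asynchronous experiments (half of the nodes chosen at random) can be sketched as below. The helper name select_participants, the fixed seed and the node numbering are assumptions for illustration; the patent text does not state how the random choice was implemented.

```python
import random


def select_participants(node_ids, fraction=0.5, rng=None):
    """Pick a random subset of nodes to train and aggregate in one round."""
    rng = rng or random.Random()
    # Always keep at least one participant so a round is never empty.
    k = max(1, int(len(node_ids) * fraction))
    return sorted(rng.sample(node_ids, k))


rng = random.Random(0)   # seeded only to make the sketch repeatable
nodes = list(range(10))  # hypothetical node ids 0..9
rounds = [select_participants(nodes, 0.5, rng) for _ in range(3)]
```

Nodes left out of a round simply keep their current local parameters until they are selected again.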
According to the above experimental results, it can be found that:
1. The invention adopts federated personalized learning: nodes do not need to share their lithography hot zone data; by sharing only the common feature parameters, each node can learn the knowledge it needs from the global model and achieve satisfactory results.
2. TPR and FPR in lithography hot zone detection are well balanced, and the final detection accuracy is improved.
3. The number of locally fine-tuned parameters (fc2) is very small, accounting for only 0.68% of the parameters of the global model.
4. The global model parameters are very good pre-training parameters: they contain the common features of the lithography hot zone data of multiple nodes, so only a few local updates starting from them are needed to reach the target model accuracy.
5. The algorithm can learn global knowledge and also adapt to each node's local lithography hot zone data set, so it is highly tolerant of data heterogeneity in the lithography hot zone field.
6. Compared with the existing methods (FL, local train, FedProx), the method of the invention handles both model heterogeneity and data heterogeneity in lithography hot zone detection well; as long as the neural network parameters used for global aggregation are kept consistent, it achieves better and more stable personalized performance, improving accuracy by about 10% over FL.
7. In the asynchronous case, the method of the invention supports nodes dynamically joining or leaving during training.
Hereinbefore, specific embodiments of the present invention are described with reference to the drawings. However, those skilled in the art will appreciate that various modifications and substitutions can be made to the specific embodiments of the present invention without departing from the spirit and scope of the invention. Such modifications and substitutions are intended to be included within the scope of the present invention as defined by the appended claims.

Claims (4)

1. A lithography hot zone detection method based on federated personalized learning, characterized by comprising the following steps:
S1, constructing, for each node, a convolutional neural network with the same architecture but different parameters, based on the lithography hot zone data of that node;
S2, determining the global model parameters and the local model parameters according to the similarity of the convolutional neural network parameters across nodes:
comparing the parameter distances of the same layer of the convolutional neural networks trained by different nodes: for the j-th layer, calculating the difference between each node's j-th-layer parameters and the average of the j-th-layer parameters over all nodes; if the sum of the 2-norms of all the differences is less than or equal to a distance threshold δ, treating the j-th-layer parameters as common feature parameters of the nodes and designating them as global model parameters to be aggregated; otherwise, treating the j-th-layer parameters as incompatible features of the nodes and designating them as local model parameters to be fine-tuned locally;
S3, establishing a federated personalized learning model for lithography hot zones:
Figure FDA0003073542740000011
wherein w_global is the global model parameter shared by all nodes,
Figure FDA0003073542740000012
whose k-th column is the local model parameter of the k-th node, N is the number of nodes, and the probability of each node being selected is
Figure FDA0003073542740000013
and
Figure FDA0003073542740000014
n_k is the number of samples on node k,
Figure FDA0003073542740000015
is the total number of samples over all nodes; in this model, F is the overall empirical loss function, and F_k(·) is the local empirical loss function of node k on its lithography hot zone data distribution
Figure FDA0003073542740000016
F_k(·) is non-convex. Assuming that the k-th node holds n_k lithography hot zone training samples:
Figure FDA0003073542740000017
then F_k(·) can be defined as:
Figure FDA0003073542740000018
where l(·) is the loss function for a single sample;
S4, iteratively updating the parameters of the lithography hot zone federated personalized learning model, wherein the update process of round t is as follows:
first, the central server broadcasts the latest lithography hot zone global model parameter w_{t,global} to all nodes;
second, assuming that the lithography hot zone federated personalized learning model of the k-th node is
Figure FDA0003073542740000019
the node executes E_local (≥ 1) rounds of fine-tuning of the local model parameters of the lithography hot zone model:
Figure FDA0003073542740000021
wherein η_t is the learning rate and ξ_k is a sample selected uniformly from the local lithography hot zone data; at this point, the model of node k has been updated to
Figure FDA0003073542740000022
and E (≥ 1) updates of all parameters of the lithography hot zone model are then performed:
Figure FDA0003073542740000023
finally, the central server aggregates the global model parameters of the nodes' lithography hot zone models to generate the new global model parameter w_{t+1,global};
S5, repeating the iterative update for a plurality of rounds until the lithography hot zone federated personalized learning model converges, and using the converged model for lithography hot zone data detection.
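The layer partition rule of step S2 in claim 1 can be sketched as follows: for each layer, sum the 2-norms of every node's deviation from the across-node mean, and compare that sum with the threshold δ. The function name partition_layers, the representation of a layer as a flat list of floats, and the toy parameter values are assumptions made for illustration only.

```python
import math


def partition_layers(node_params, delta):
    """Split layer names into (global, local) by across-node parameter distance.

    node_params: one dict per node, mapping layer name -> flat list of floats.
    delta: distance threshold; a layer is global if its summed distance <= delta.
    """
    global_layers, local_layers = [], []
    num_nodes = len(node_params)
    for layer in node_params[0]:
        vectors = [params[layer] for params in node_params]
        # Element-wise mean of this layer's parameters over all nodes.
        mean = [sum(col) / num_nodes for col in zip(*vectors)]
        # Sum over nodes of the 2-norm of (node parameters - mean).
        total = sum(
            math.sqrt(sum((v - m) ** 2 for v, m in zip(vec, mean)))
            for vec in vectors
        )
        (global_layers if total <= delta else local_layers).append(layer)
    return global_layers, local_layers


# Two hypothetical nodes: conv1 is similar across nodes, fc2 is not.
node_params = [
    {"conv1": [0.10, 0.20], "fc2": [1.0, -1.0]},
    {"conv1": [0.12, 0.18], "fc2": [-1.0, 1.0]},
]
shared, personal = partition_layers(node_params, delta=0.5)
```

With these toy values the similar layer ends up in the aggregated set and the divergent layer is kept local, which mirrors the description's choice of fc2 as the locally fine-tuned layer.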
2. The lithography hot zone detection method based on federated personalized learning according to claim 1, wherein the convolutional neural network of each node is constructed as follows:
the convolutional neural network of each node consists of two convolution units and two fully-connected layers connected in sequence;
each convolution unit comprises two convolutional layers, a ReLU layer and a max-pooling layer connected in sequence; in each convolution, a series of convolution kernels performs the convolution operation on the underlying lithography hot zone data tensor; the ReLU layer activates the output of the convolutional layers so that the whole neural network is nonlinear and sparse; the max-pooling layer performs 2×2 downsampling on the output of the ReLU layer and serves as the output layer of the current convolution unit;
the two convolution units are followed by the two fully-connected layers; during training, a dropout operation is applied to the first fully-connected layer to mitigate overfitting; the second fully-connected layer is the output layer of the whole neural network and has two output channels, which give the predicted probabilities of lithography hot zone and non-hot zone respectively.
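To make the layer layout of claim 2 concrete, the sketch below traces the tensor size through two convolution units and two fully-connected layers and counts the trainable parameters per layer. Every size in it is hypothetical: the 1×64×64 input, the 3×3 kernels with size-preserving padding, the channel widths (16, 32) and the 250-unit first fully-connected layer are not given in the claim, so the resulting fc2 share is not expected to reproduce the 0.68% figure reported in the description.

```python
def conv_params(c_in, c_out, k=3):
    """Weights plus biases of a k x k convolution (size-preserving padding assumed)."""
    return c_in * c_out * k * k + c_out


def fc_params(n_in, n_out):
    """Weights plus biases of a fully-connected layer."""
    return n_in * n_out + n_out


def build_summary(in_ch=1, size=64, channels=(16, 32), hidden=250, classes=2):
    """Return ({layer name: parameter count}, flattened feature length)."""
    layers = {}
    c = in_ch
    for i, c_out in enumerate(channels, start=1):
        # Each convolution unit: two convolutions, ReLU, then 2x2 max pooling.
        layers[f"conv{i}_1"] = conv_params(c, c_out)
        layers[f"conv{i}_2"] = conv_params(c_out, c_out)
        c = c_out
        size //= 2  # 2x2 pooling halves each spatial dimension
    flat = c * size * size
    layers["fc1"] = fc_params(flat, hidden)   # dropout adds no parameters
    layers["fc2"] = fc_params(hidden, classes)  # two outputs: hot zone / non-hot zone
    return layers, flat


layers, flat = build_summary()
total = sum(layers.values())
fc2_share = layers["fc2"] / total
```

Whatever the real sizes are, the output layer is small relative to fc1, which is consistent with the description's point that keeping only fc2 local adds little per-node state.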
3. The lithography hot zone detection method based on federated personalized learning according to claim 1, wherein in step S4, when all nodes participate in aggregation, i.e. synchronous aggregation, the process is as follows:
based on the global parameters of the lithography hot zone models of all nodes
Figure FDA0003073542740000024
a new global parameter w_{t+1,global} is generated; after all nodes have updated their parameters in each round, all nodes send their global parameters to the central server for aggregation, and the aggregation formula is as follows:
Figure FDA0003073542740000025
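The synchronous aggregation of claim 3 can be sketched as a sample-count-weighted average of the nodes' global parameters. The formula image is not reproduced in this text, so the weighting n_k / n used here is an assumption inferred from the definitions of n_k and the total sample count in claim 1; the function name aggregate_global and the toy numbers are likewise illustrative.

```python
def aggregate_global(node_globals, sample_counts):
    """Weighted average of per-node global parameter vectors.

    node_globals: list of flat parameter lists, one per node.
    sample_counts: list of n_k, the number of samples held by each node.
    """
    total = sum(sample_counts)  # n, the total number of samples over all nodes
    dim = len(node_globals[0])
    return [
        sum(n_k / total * w[i] for w, n_k in zip(node_globals, sample_counts))
        for i in range(dim)
    ]


# Two hypothetical nodes holding 100 and 300 samples.
node_globals = [[1.0, 2.0], [3.0, 4.0]]
sample_counts = [100, 300]
new_global = aggregate_global(node_globals, sample_counts)
```

Only the layers designated as global would be passed to such a function; the local parameters never leave the node.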
4. The lithography hot zone detection method based on federated personalized learning according to claim 1, wherein in step S4, when only part of the nodes participate in aggregation, i.e. asynchronous aggregation, the process is as follows:
setting a threshold K (1 ≤ K < N) on the number of aggregated nodes, so that the central server collects the outputs of the first K responding nodes; after collecting the outputs of K nodes, the central server stops waiting for the remaining nodes; in this update, the (K+1)-th to N-th nodes are regarded as straggler nodes; letting S_t (|S_t| = K) be the set of the first K responding nodes in the t-th iteration, the aggregation formula is as follows:
Figure FDA0003073542740000031
wherein n_K is the total number of samples of the first K nodes,
Figure FDA0003073542740000032
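The asynchronous aggregation of claim 4 can be sketched as the same weighted average restricted to the first K responders, with the remaining nodes treated as stragglers for that round. The weights n_k / n_K are an assumption inferred from the definition of n_K above, since the formula images are not reproduced in this text; the function name aggregate_first_k, the tuple layout and the toy responses are illustrative.

```python
def aggregate_first_k(responses, k):
    """Aggregate the global parameters of the first k responding nodes.

    responses: list of (node_id, n_k, global_params) in arrival order.
    Returns (new_global_params, straggler_node_ids).
    """
    if not 1 <= k <= len(responses):
        raise ValueError("k must be between 1 and the number of responses")
    first_k = responses[:k]  # the server stops waiting after k outputs
    n_K = sum(n for _, n, _ in first_k)  # samples held by the first k nodes
    dim = len(first_k[0][2])
    new_global = [
        sum(n / n_K * w[i] for _, n, w in first_k) for i in range(dim)
    ]
    stragglers = [node_id for node_id, _, _ in responses[k:]]
    return new_global, stragglers


# Hypothetical arrival order: node 2 first, then node 0, then node 1.
responses = [
    (2, 100, [1.0, 0.0]),
    (0, 300, [3.0, 4.0]),
    (1, 50, [9.0, 9.0]),
]
new_global, stragglers = aggregate_first_k(responses, k=2)
```

In this sketch a straggler's output is simply ignored for the round; how late results are handled afterwards is not specified in the claim.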
CN202110545686.2A 2021-05-19 2021-05-19 Photolithographic hot zone detection method based on federal personalized learning Active CN113222031B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110545686.2A CN113222031B (en) 2021-05-19 2021-05-19 Photolithographic hot zone detection method based on federal personalized learning

Publications (2)

Publication Number Publication Date
CN113222031A true CN113222031A (en) 2021-08-06
CN113222031B CN113222031B (en) 2022-04-12

Family

ID=77093154

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110545686.2A Active CN113222031B (en) 2021-05-19 2021-05-19 Photolithographic hot zone detection method based on federal personalized learning

Country Status (1)

Country Link
CN (1) CN113222031B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101751502A (en) * 2008-12-18 2010-06-23 睿初科技公司 Method and system for correcting window maximized optic proximity effect in photoetching process
CN110263921A (en) * 2019-06-28 2019-09-20 深圳前海微众银行股份有限公司 A kind of training method and device of federation's learning model
CN111324812A (en) * 2020-02-20 2020-06-23 深圳前海微众银行股份有限公司 Federal recommendation method, device, equipment and medium based on transfer learning
CN111754000A (en) * 2020-06-24 2020-10-09 清华大学 Quality-aware edge intelligent federal learning method and system
EP3742229A1 (en) * 2019-05-21 2020-11-25 ASML Netherlands B.V. Systems and methods for adjusting prediction models between facility locations
CN112181971A (en) * 2020-10-27 2021-01-05 华侨大学 Edge-based federated learning model cleaning and equipment clustering method, system, equipment and readable storage medium
CN112219270A (en) * 2018-06-05 2021-01-12 科磊股份有限公司 Active learning for defect classifier training
WO2021056043A1 (en) * 2019-09-23 2021-04-01 Presagen Pty Ltd Decentralised artificial intelligence (ai)/machine learning training system
US20210117214A1 (en) * 2019-10-18 2021-04-22 Facebook, Inc. Generating Proactive Content for Assistant Systems
US20210117780A1 (en) * 2019-10-18 2021-04-22 Facebook Technologies, Llc Personalized Federated Learning for Assistant Systems
CN112818394A (en) * 2021-01-29 2021-05-18 西安交通大学 Self-adaptive asynchronous federal learning method with local privacy protection

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
H. BRENDAN MCMAHAN ET AL.: "Communication-Efficient Learning of Deep Networks", 《ARTIFICIAL INTELLIGENCE AND STATISTIC》 *
JINZE WU ET AL.: "Hierarchical Personalized Federated Learning for User Modeling", 《WWW '21》 *
MANOJ GHUHAN ARIVAZHAGAN ET AL.: "Federated Learning with Personalization Layers", 《ARXIV.ORG》 *
TIAN LI ET AL.: "FEDERATED OPTIMIZATION IN HETEROGENEOUS NETWORKS", 《ARXIV.ORG》 *
YUTAO HUANG ET AL.: "Personalized Cross-Silo Federated Learning on Non-IID Data", 《ARXIV.ORG》 *
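邓宇: "Lithography hotspot detection technology for circuit layouts based on deep learning", 《China Outstanding Doctoral and Master's Theses Full-text Database (Master's), Information Science and Technology series》 *
郭求是: "Research on lithography hotspot detection technology based on deep learning", 《China Outstanding Doctoral and Master's Theses Full-text Database (Master's), Information Science and Technology series》 *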

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113674235A (en) * 2021-08-15 2021-11-19 上海立芯软件科技有限公司 Low-cost photoetching hotspot detection method based on active entropy sampling and model calibration
CN113674235B (en) * 2021-08-15 2023-10-10 上海立芯软件科技有限公司 Low-cost photoetching hot spot detection method based on active entropy sampling and model calibration
CN113869528A (en) * 2021-12-02 2021-12-31 中国科学院自动化研究所 De-entanglement individualized federated learning method for consensus characterization extraction and diversity propagation
WO2023098790A1 (en) * 2021-12-02 2023-06-08 中国科学院自动化研究所 Disentangled personalized federated learning method for consensus representation extraction and diversity propagation
CN116756764A (en) * 2023-05-04 2023-09-15 浙江大学 Model blocking aggregation privacy protection method for lithography hotspot detection
CN117196070A (en) * 2023-11-08 2023-12-08 山东省计算中心(国家超级计算济南中心) Heterogeneous data-oriented dual federal distillation learning method and device
CN117196070B (en) * 2023-11-08 2024-01-26 山东省计算中心(国家超级计算济南中心) Heterogeneous data-oriented dual federal distillation learning method and device

Also Published As

Publication number Publication date
CN113222031B (en) 2022-04-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant