CN116756764A

CN116756764A - Model blocking aggregation privacy protection method for lithography hotspot detection

Info

Publication number: CN116756764A
Application number: CN202310489532.5A
Authority: CN
Inventors: 田哲; 朱泽晗; 徐金明; 黄炎; 闫昌智; 林学忠
Original assignee: Zhejiang University ZJU
Current assignee: Zhejiang University ZJU
Priority date: 2023-05-04
Filing date: 2023-05-04
Publication date: 2023-09-15
Anticipated expiration: 2043-05-04
Also published as: CN116756764B

Abstract

The invention discloses a model blocking aggregation privacy protection method for lithography hotspot detection. Each client node trains local model parameters on its local lithography hotspot data set, partitions the parameters into blocks according to their characteristics, and sends the blocks to a plurality of central servers for parameter aggregation. The central servers then return the aggregated global model parameter blocks to each client node, and each client node splices the blocks locally for the next iteration. The invention protects the local model update parameters of each client in a lithography hotspot detection system based on federated learning, solves the problem of privacy leakage of the update parameters, defends against reconstruction attacks on local lithography hotspot image data, and improves the aggregation and communication speed of model parameters at the central servers.

Description

Model blocking aggregation privacy protection method for lithography hotspot detection

Technical Field

The invention belongs to the field of machine learning, and particularly relates to a model blocking aggregation privacy protection method for lithography hotspot detection.

Background

A chip lithography hotspot is a local hotspot phenomenon that occurs during the chip lithography process. Lithography hotspots are prone to short circuits and pattern distortion and seriously affect chip production, so techniques are needed to detect them. In a lithography hotspot detection system based on federated learning, each chip manufacturer can learn from the data features of the other clients without uploading its own lithography hotspot images. Because the training data never leaves the local client, the federated learning framework provides a certain degree of privacy protection for local data. However, a malicious attacker can infer attributes of a client, or even the client's private data, from the model parameters and gradients exchanged in the federated learning framework, which creates a serious data security risk for lithography hotspot detection systems based on federated learning. To better protect the data privacy of each chip manufacturer while keeping the lithography hotspot detection model highly usable, the lithography hotspot detection system must be given privacy protection.

The federated learning algorithms in the prior art are as follows. In a basic federated learning framework there is one central server and a number of client nodes; the computation is performed by the clients, and the updated parameters are sent to the central server for parameter aggregation. The federated learning algorithms in common use at present are mainly the FedAvg algorithm and the FedProx algorithm:

FedAvg algorithm: FedAvg is the earliest classical federated learning algorithm, serves as the baseline in current federated learning research, and is widely applied. As shown in FIG. 1, the FedAvg algorithm implements a distributed online learning system consisting of a central server and a plurality of clients. FedAvg allows multiple local clients to cooperatively learn a shared model while keeping all training data local. Let the number of clients participating in federated learning training be $N$, and let the data set of the $i$-th client contain $n_i$ samples. The global loss function of FedAvg can be defined as:

$$\min_{w} F(w) = \sum_{i=1}^{N} \frac{n_i}{n} F_i(w)$$

where $n = \sum_{i=1}^{N} n_i$ is the total number of samples in the data sets of the $N$ clients participating in federated learning training, and $F_i(w)$ is the local loss function of the $i$-th client.
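The sample-weighted form of the FedAvg global loss can be illustrated with a minimal Python sketch. The function name `fedavg_global_loss` and the three-client numbers are hypothetical and chosen only for illustration; the sketch assumes each client reports its local loss value and its sample count.

```python
from typing import Sequence


def fedavg_global_loss(local_losses: Sequence[float],
                       sample_counts: Sequence[int]) -> float:
    """Weighted global loss: sum over clients of (n_i / n) * F_i(w)."""
    if len(local_losses) != len(sample_counts):
        raise ValueError("one loss and one sample count per client are required")
    n_total = sum(sample_counts)  # n = n_1 + ... + n_N
    return sum(n_i / n_total * f_i
               for f_i, n_i in zip(local_losses, sample_counts))


# Hypothetical values: three clients with different data set sizes.
losses = [0.9, 0.5, 0.1]
counts = [100, 300, 600]
global_loss = fedavg_global_loss(losses, counts)
```

Clients with more samples contribute proportionally more to the global loss, which is why the largest client dominates the result in this toy case.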

FedProx algorithm: FedProx is a federated learning algorithm that improves the robustness of FedAvg's convergence by allowing each client participating in federated learning training to perform a variable number of local training steps, as shown in FIG. 2. FedAvg requires all participating clients to perform the same amount of model training and excludes from the aggregation update any client that cannot complete its training within the specified time, whereas FedProx accepts the updates of such clients. FedProx can therefore converge more robustly than FedAvg. The objective function of FedProx can be defined as:

$$\min_{w} H_i(w; w^t) = F_i(w) + \frac{\mu}{2} \left\| w - w^t \right\|^2$$

where $H_i(w; w^t)$ is the local objective function of the $i$-th client, $\mu$ is the coefficient of the regularization term, and $w^t$ is the global parameter obtained by model aggregation in round $t$. The term involving $w^t$ is added to each client's original loss function as a regularizer to limit the update of the local model. FedProx also introduces a new parameter $\gamma_i^t$, which the $i$-th client can adjust adaptively according to its own system constraints and which limits the number of local model updates of that client. Local updating stops once the local model parameters satisfy

$$\left\| \nabla H_i(w; w^t) \right\| \le \gamma_i^t \left\| \nabla H_i(w^t; w^t) \right\|,$$

at which point the local model parameters are $w_i^{t+1}$. The local model parameters are then sent to the central server for aggregation, and the aggregated model is distributed back to all clients. This constitutes one round, and the rounds repeat until the final model converges.
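The effect of the regularization term can be sketched in Python under the usual published form of the FedProx local objective, that is, the client's own loss plus $(\mu/2)$ times the squared distance to the current global parameters. The names `fedprox_objective` and `toy_loss` and all numeric values are hypothetical.

```python
from typing import Callable, Sequence


def fedprox_objective(local_loss: Callable[[Sequence[float]], float],
                      w: Sequence[float],
                      w_global: Sequence[float],
                      mu: float) -> float:
    """Local loss plus the proximal penalty (mu / 2) * ||w - w_global||^2."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(w, w_global))
    return local_loss(w) + 0.5 * mu * sq_dist


# Hypothetical quadratic local loss with its minimum at (1, 1).
def toy_loss(w):
    return sum((x - 1.0) ** 2 for x in w)


w_t = [0.0, 0.0]  # global parameters of round t (illustrative)
at_global = fedprox_objective(toy_loss, w_t, w_t, mu=0.1)            # no penalty
at_local_opt = fedprox_objective(toy_loss, [1.0, 1.0], w_t, mu=0.1)  # penalty only
```

A larger `mu` keeps the local model closer to the global parameters; with `mu = 0` the objective reduces to the client's original loss.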

At present, a classical privacy attack on lithography hotspot detection based on federated learning is deep leakage from gradients. Gradient exchange is a common communication mode in federated learning architectures, and it was long believed that gradients could be shared safely, that is, that exchanging gradients would not leak training data. However, studies have shown that private training data can be recovered from shared gradients. Researchers named this attack Deep Leakage from Gradients (DLG), i.e. the DLG attack. The DLG attack has been validated on computer vision and natural language processing tasks, and the results show that on a variety of data set tasks it can fully recover training data within only a few gradient-fitting steps. Because lithography hotspot images have monotonous colors and simple lines, a DLG attack needs only a small number of fitting iterations to recover a lithography hotspot image at the pixel level, as shown in FIG. 3. Compared with other attacks (attribute inference attacks, membership inference attacks, model inversion attacks, etc.), DLG attacks pose a more serious challenge to lithography hotspot detection systems based on a federated learning architecture.

To mount a DLG attack, the attacker (a malicious client) first randomly generates a set of input data and labels locally and then passes the generated data through the neural network to compute a local virtual gradient. The attacker updates the input data and labels according to the L2 distance between the local virtual gradient vector and the real gradient vector of another client, so as to reduce the distance between the virtual gradient and the real gradient. When the attack finishes, the private lithography hotspot image data is completely exposed. The core principle of DLG is to minimize the gradient matching error:

$$x'^{*}, y'^{*} = \arg\min_{x', y'} \left\| \nabla W' - \nabla W \right\|^{2}$$

where $x'$ and $y'$ are a set of random virtual input data and labels. Feeding these virtual data into the model yields the virtual gradient $\nabla W'$, and $\nabla W$ is the real gradient obtained from a certain client. The training updates bring the virtual gradient close to the real gradient and thereby bring the virtual input data close to the client's real training data.
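The gradient matching objective can be made concrete with a deliberately small Python sketch. It assumes a linear model with squared loss instead of the convolutional network of the embodiment, so the gradient has a closed form; the names `grad_wrt_weights` and `dlg_objective` and all values are hypothetical. The sketch only evaluates the objective and does not run the attacker's optimization loop.

```python
def grad_wrt_weights(w, x, y):
    """Gradient of 0.5 * (w . x - y)^2 with respect to the weights w."""
    residual = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [residual * xi for xi in x]


def dlg_objective(w, x_dummy, y_dummy, true_grad):
    """Squared L2 distance between the virtual gradient and the real gradient."""
    dummy_grad = grad_wrt_weights(w, x_dummy, y_dummy)
    return sum((g1 - g2) ** 2 for g1, g2 in zip(dummy_grad, true_grad))


w = [0.5, -0.2, 0.1]                 # shared model weights (illustrative)
x_real, y_real = [1.0, 2.0, 3.0], 1.0
shared_grad = grad_wrt_weights(w, x_real, y_real)  # what the victim uploads

mismatch_at_start = dlg_objective(w, [0.0, 0.0, 0.0], 0.0, shared_grad)
mismatch_at_truth = dlg_objective(w, x_real, y_real, shared_grad)
```

The attacker would lower this mismatch by gradient steps on the virtual input and label; the objective vanishes when the virtual data reproduces the shared gradient.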

The traditional defense against deep leakage from gradients is a secure multiparty computation federated learning method based on model parameter splitting and aggregation:

To defend against DLG attacks, a conventional federated learning method based on model parameter splitting and aggregation (Federated Learning of Parameter Splitting and Aggregation, FL-PSA) is constructed as shown in FIG. 4. During the $t$-th round of aggregation training, the $i$-th client computes its local model parameters $w_i^t$ after local training is completed. The client then performs a randomized split of the model parameters, i.e.

$$w_i^t = w_{i,1}^t + w_{i,2}^t$$

where $w_{i,1}^t$ is the model parameter vector sent by the $i$-th client to Server1 and $w_{i,2}^t$ is the model parameter vector sent by the $i$-th client to Server2. The two central servers aggregate the model parameters transmitted by the clients, namely

$$w_1^t = \sum_{i=1}^{N} w_{i,1}^t \quad \text{and} \quad w_2^t = \sum_{i=1}^{N} w_{i,2}^t.$$

After completing its aggregation, Server2 sends its aggregate to Server1, which averages the parameters and thereby removes the influence of the randomization to obtain the global model parameters of round $t+1$:

$$w^{t+1} = \frac{1}{N} \left( w_1^t + w_2^t \right)$$

The central server then issues the aggregated model parameters $w^{t+1}$ to each client, and each client uses them for the next round of local model updating until convergence.

In the FL-PSA training process, before transmitting its model parameters, each client splits them into two parameter vectors of the same dimension by means of a random number vector, which protects client data privacy in the federated learning architecture to a certain extent. The FL-PSA algorithm has two problems: in every round of federated learning training and aggregation the client must generate a random number vector and use it in the computation, which greatly increases the client's local computation; and each client transmits model parameters of the full dimension to both central servers, which increases the communication overhead between the clients and the central servers.
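A minimal Python sketch of the FL-PSA idea follows. It assumes an additive split (one share is a random vector, the other is the parameters minus that vector) and summation on each server, which is one common reading of the description above rather than a confirmed detail of the prior-art method. The names `split_additively` and `fl_psa_round` are hypothetical.

```python
import random


def split_additively(w, rng):
    """Split w into two full-dimension shares whose sum is w."""
    r = [rng.uniform(-1.0, 1.0) for _ in w]      # fresh random vector
    share_1 = [wi - ri for wi, ri in zip(w, r)]  # goes to Server1
    return share_1, r                            # r goes to Server2


def fl_psa_round(client_params, rng):
    """One aggregation round: each server sums its shares, Server1 averages."""
    n_clients = len(client_params)
    dim = len(client_params[0])
    sum_1 = [0.0] * dim
    sum_2 = [0.0] * dim
    for w in client_params:
        s1, s2 = split_additively(w, rng)
        sum_1 = [a + b for a, b in zip(sum_1, s1)]
        sum_2 = [a + b for a, b in zip(sum_2, s2)]
    # Server2 forwards sum_2 to Server1; the random vectors cancel in the sum.
    return [(a + b) / n_clients for a, b in zip(sum_1, sum_2)]


clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
w_next = fl_psa_round(clients, random.Random(0))
```

The sketch also makes the two drawbacks visible: every client draws a random vector each round, and both shares have the full parameter dimension, so the upload volume doubles.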

Disclosure of Invention

The invention aims to provide a model blocking aggregation privacy protection method for lithography hotspot detection, aiming at the defects of the prior art.

The object of the invention is achieved by the following technical solution: a model blocking aggregation privacy protection method for lithography hotspot detection, comprising the following steps:

S1, in the $t$-th round of updating the federated learning model for lithography hotspot detection, each client trains on its local lithography hotspot data set to obtain local model parameter information $w_i^t$ and performs model blocking processing on the model parameter information;

S2, after model blocking is completed, each client sends the resulting model parameter blocks to a plurality of central servers respectively;

S3, each central server receives the model parameter blocks transmitted by the clients and aggregates them;

S4, the central servers send the aggregated model parameter blocks to each client, and each client locally splices the model parameter blocks to obtain the model parameters of round $t+1$;

S5, each client continues to update its local parameters on its local lithography hotspot data set and repeats the above steps; after a number of rounds of iterative updating, when the federated learning model has converged, each client locally splices the final lithography hotspot detection model parameters.

Further, the model blocking processing is specifically:

$$w_i^t = \left[ w_{i,1}^t,\; w_{i,2}^t \right], \quad w_{i,1}^t \in \mathbb{R}^{d_1}, \quad w_{i,2}^t \in \mathbb{R}^{d_2}$$

where $w_{i,1}^t$ is the model parameter block that the $i$-th client sends to the first central server in round $t$, $w_{i,2}^t$ is the model parameter block that the $i$-th client sends to the second central server in round $t$, and $d$ is the dimension of each client's local model parameters, satisfying $d = d_1 + d_2$.

Further, the central servers aggregate the model parameter blocks transmitted by the clients specifically as:

$$\bar{w}_1^t = \frac{1}{N} \sum_{i=1}^{N} w_{i,1}^t, \qquad \bar{w}_2^t = \frac{1}{N} \sum_{i=1}^{N} w_{i,2}^t$$

where $\bar{w}_1^t$ is the model parameter block obtained by the first central server by aggregation in round $t$, $\bar{w}_2^t$ is the model parameter block obtained by the second central server by aggregation in round $t$, and $N$ is the number of clients participating in federated learning training.

Further, each client locally splices the model parameter blocks specifically as:

$$w^{t+1} = \left[ \bar{w}_1^t,\; \bar{w}_2^t \right]$$

where $w^{t+1}$ is the global model parameter of round $t+1$ obtained by each client by splicing, $\bar{w}_1^t$ is the model parameter block obtained by the first central server by aggregation in round $t$, and $\bar{w}_2^t$ is the model parameter block obtained by the second central server by aggregation in round $t$.
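The blocking, aggregation and splicing steps can be checked against ordinary averaging with a short derivation. It assumes that each central server takes the unweighted mean of its block over the $N$ clients, writes $w_{i,1}^t$ and $w_{i,2}^t$ for the two blocks of client $i$, and writes $\bar{w}_1^t$ and $\bar{w}_2^t$ for the two aggregated blocks (this notation is chosen for illustration):

```latex
\begin{aligned}
w^{t+1}
  &= \left[ \bar{w}_1^t,\; \bar{w}_2^t \right]
   = \left[ \frac{1}{N}\sum_{i=1}^{N} w_{i,1}^t,\;
            \frac{1}{N}\sum_{i=1}^{N} w_{i,2}^t \right] \\
  &= \frac{1}{N}\sum_{i=1}^{N} \left[ w_{i,1}^t,\; w_{i,2}^t \right]
   = \frac{1}{N}\sum_{i=1}^{N} w_i^t .
\end{aligned}
```

Under this assumption the spliced result equals the mean that a single central server would have computed from the full parameter vectors, although neither server ever holds more than $d_1$ or $d_2$ of the $d$ coordinates.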

Further, the federated learning model includes a convolutional neural network model in each client, which is constructed as follows:

The convolutional neural network model used by each client consists of two convolutional parts and two fully connected layers, where each convolutional part comprises two convolutional layers and one max-pooling layer; the nonlinear activation function applied after convolution is the ReLU function.

Further, the methods for performing model blocking processing on the model parameter information include blocking according to the order of the neural network layers and blocking by randomly selected network layers:

Blocking according to the order of the neural network layers: each client node divides its model equally into n blocks in forward-propagation order and distributes the n blocks to the central servers, where the number of central servers is n;

Blocking by randomly selected network layers: in each round, a central server first generates a random number seed and sends it to every client; each client uses the same random number seed to divide the neural network layers randomly into n blocks and sends the n blocks to the n central servers respectively.

Further, blocking according to the order of the neural network layers is specifically: each client sends the parameters of layers 1-3 to the first central server as the first model parameter block and the parameters of layers 4-6 to the second central server as the second model parameter block; the central servers aggregate the parameters and return the aggregated parameters to each client node for the next iteration.

Further, when the number of central servers is 2, the methods for performing model blocking processing on the model parameter information further include blocking by odd and even layers: each client sends the odd-numbered layers and the even-numbered layers to the two central servers respectively.

Further, in the case of a convolutional neural network, the methods for performing model blocking processing on the model parameter information further include blocking according to the attribute of the neural network layer: in a convolutional neural network, the only layers with trainable parameters are the convolutional layers and the fully connected layers; the convolutional layer parameters and the fully connected layer parameters are placed in separate blocks and sent to separate central servers.

The beneficial effects of the invention are as follows:

1. In a lithography hotspot detection system based on federated learning, the method ensures that neither the local update parameters of each client nor the final global model parameters are leaked to any single central server, thereby resisting DLG attacks and preventing the lithography hotspot images in local data sets from being exposed.

2. The number of parameters that a single central server must aggregate is reduced, which increases the aggregation speed of the central server.

3. The method requires no communication or aggregation between the two central servers, which reduces the communication traffic.

Drawings

FIG. 1 is a schematic diagram of FedAvg algorithm;

FIG. 2 is a schematic diagram of FedProx algorithm;

FIG. 3 is a diagram of a DLG attack carried out under the federated learning framework;

FIG. 4 is a diagram of the conventional secure multiparty computation federated learning framework;

FIG. 5 is a diagram of the secure multiparty computation federated learning framework based on model block aggregation provided by an embodiment of the present invention;

FIG. 6 is a diagram of the effectiveness of various secure multiparty computation defense strategies provided by an embodiment of the present invention;

FIG. 7 is a graph of time spent by various privacy preserving algorithms in one training round.

Detailed Description

The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

In a lithography hotspot detection framework based on federated learning, when local update parameters are uploaded to a central server for aggregation, each client easily leaks its real parameters to an untrusted central server, which causes privacy problems.

Federated learning based on secure multiparty computation is therefore introduced. Its aim is that the central server obtains the correct aggregation result without obtaining any information beyond that result, such as the real update parameters of each client. The traditional secure multiparty computation method based on parameter splitting cannot prevent two colluding central servers from stealing the clients' private information; in addition, all parameters must be communicated between the central servers for a second aggregation, which greatly reduces model training efficiency. The invention therefore introduces a model blocking aggregation privacy protection method for lithography hotspot detection, also called Federated Learning of Model Block and Aggregation (FL-MBA). The method aims to protect the parameter privacy of each client in a lithography hotspot detection system based on federated learning and to remedy the shortcomings of the traditional method in communication and aggregation speed. The embodiment of the invention comprises the following specific contents:

1. Neural network architecture of each client

The invention adopts the convolutional neural network, currently the best-performing type of model for image classification tasks, as the basic classification model. The convolutional neural network model used by each client node consists of two convolutional parts and two fully connected layers, where each convolutional part contains two convolutional layers and one max-pooling layer.

The main function of a convolutional layer is to extract features from the lithography hotspot image data by convolution. Convolution is a linear image filtering process. First, convolution kernels of a specific size are selected; the number of convolution kernels determines the number of channels of the feature map after convolution. Each convolution kernel then scans the two-dimensional digital image from left to right and from top to bottom with a set stride, multiplies the values of the kernel by the pixel values of the lithography hotspot image at the corresponding scanned positions, and sums the products. The result is taken as the pixel value at the corresponding position of the lithography hotspot feature map, which is output once the convolution is completed:

$$y_{m_1, m_2} = \sum_{i=1}^{I} \sum_{j=1}^{J} w_{i,j} \, x_{m_1 + i - 1,\; m_2 + j - 1} + b$$

where $x$ represents the input image of size $(M_1, M_2)$, $w$ represents the convolution kernel parameters of size $(I, J)$, $b$ is a bias constant, and $y$ is the lithography hotspot feature map output by the convolution. All convolution kernels used in the model of the present invention are 3 × 3 in size, and the numbers of convolution kernels in the two convolutional parts are 16 and 32, respectively.
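The scanning procedure described for the convolutional layer can be sketched in plain Python for a single channel. The sketch assumes no padding, which the description does not specify, and the function name `conv2d_valid` and the 4 × 4 example image are hypothetical.

```python
def conv2d_valid(x, w, b=0.0, stride=1):
    """Slide kernel w over image x (no padding) and return the feature map."""
    m1, m2 = len(x), len(x[0])      # image size (M1, M2)
    k1, k2 = len(w), len(w[0])      # kernel size (I, J)
    feature_map = []
    for r in range(0, m1 - k1 + 1, stride):          # top to bottom
        row = []
        for c in range(0, m2 - k2 + 1, stride):      # left to right
            acc = b
            for i in range(k1):
                for j in range(k2):
                    acc += w[i][j] * x[r + i][c + j]  # multiply and sum
            row.append(acc)
        feature_map.append(row)
    return feature_map


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[1, 1, 1],
          [1, 1, 1],
          [1, 1, 1]]               # 3 x 3 kernel, as in the embodiment
feature_map = conv2d_valid(image, kernel)
# feature_map == [[54.0, 63.0], [90.0, 99.0]]
```

With a 3 × 3 kernel, stride 1 and no padding, a 4 × 4 input yields a 2 × 2 feature map; a layer with 16 kernels would repeat this once per kernel to produce 16 channels.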

The activation function applies a nonlinear mapping to the lithography hotspot feature map. Applied after a convolutional layer, it enhances the nonlinear expressive capacity of the convolutional neural network and gives the model a better fitting ability. The invention uses the ReLU function as the activation function, computed as:

$$\mathrm{ReLU}(Y) = \max\{Y, 0\}$$

The pooling layer, also called the downsampling layer, divides the lithography hotspot feature map obtained by convolution into a series of non-overlapping rectangular regions and applies a pooling operation to each region to obtain its feature sample value. This reduces the dimension of the lithography hotspot feature map and thereby the risk of model overfitting. A 2 × 2 pooling kernel is used.
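The activation and pooling steps can be sketched together in plain Python. The function names `relu` and `max_pool_2x2` and the example feature map are hypothetical; the pooling sketch assumes non-overlapping 2 × 2 regions and drops a trailing odd row or column.

```python
def relu(feature_map):
    """Element-wise max{Y, 0}."""
    return [[max(v, 0.0) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2 x 2 region."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [[max(feature_map[r][c], feature_map[r][c + 1],
                 feature_map[r + 1][c], feature_map[r + 1][c + 1])
             for c in range(0, cols - 1, 2)]
            for r in range(0, rows - 1, 2)]


fm = [[-1.0, 2.0, 0.0, -3.0],
      [4.0, -5.0, 6.0, 1.0],
      [0.0, 0.0, -2.0, -1.0],
      [3.0, -4.0, -7.0, -8.0]]
activated = relu(fm)
pooled = max_pool_2x2(activated)
# pooled == [[4.0, 6.0], [3.0, 0.0]]
```

Each 2 × 2 pooling step halves both spatial dimensions, so the 4 × 4 map above becomes 2 × 2.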

The fully connected layers form the final part of the convolutional neural network. They combine the convolutional features of the lithography hotspot feature map and finally obtain the classification probability of the input lithography hotspot image by weighted computation. The invention uses two fully connected layers after the convolutional parts. The first has 250 output channels and applies dropout to reduce overfitting during model training; the second has 2 output channels, is the last layer of the model, and outputs the result of the binary classification problem, namely the predicted probability of whether the input is a lithography hotspot.

The configuration parameters of the model are shown in Table 1; the neural network layers containing trainable parameters are numbered 1-6.

Table 1 client neural network configuration
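To relate the six trainable layers to the block dimensions used later, the following Python sketch counts parameters per layer. Several values are assumptions that are not stated in the description and are used only for illustration: a single input channel, both convolutional layers of a part sharing that part's kernel count (16 and 32), one bias per output channel, and a hypothetical flattened feature size of 2048 entering the first fully connected layer.

```python
def conv_params(in_channels, out_channels, kernel_size=3):
    """Kernel weights plus one bias per output channel."""
    return in_channels * out_channels * kernel_size * kernel_size + out_channels


def fc_params(in_features, out_features):
    """Weights plus one bias per output unit."""
    return in_features * out_features + out_features


IN_CHANNELS = 1        # assumption: single-channel layout clips
FLAT_FEATURES = 2048   # assumption: flattened size entering layer 5

layer_params = {
    1: conv_params(IN_CHANNELS, 16),
    2: conv_params(16, 16),
    3: conv_params(16, 32),
    4: conv_params(32, 32),
    5: fc_params(FLAT_FEATURES, 250),
    6: fc_params(250, 2),
}

d = sum(layer_params.values())                   # full parameter dimension
d_1 = sum(layer_params[k] for k in (1, 2, 3))    # block of layers 1-3
d_2 = sum(layer_params[k] for k in (4, 5, 6))    # block of layers 4-6
```

Under these assumed sizes the first fully connected layer dominates the count, so a split by layer order produces blocks of very different size; the actual balance depends on the real input resolution.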

2. In federated learning, the local update method for client parameters is as follows:

In round $t$, the $i$-th client first executes $E_{local}$ ($\geq 1$) local updates:

$$w_i^{t+1} = w_i^{t} - \eta_t \nabla F_i\left(w_i^{t}\right)$$

where $w_i^t$ is the local model parameter of the $i$-th client in round $t$, $\eta_t$ is the learning rate (also called step size) of round $t$, and $F_i(w_i^t)$ is the loss function based on the lithography hotspot samples of the $i$-th client in round $t$.

At this point, client $i$ obtains the local model parameters of round $t+1$, $w_i^{t+1}$.
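The repeated local update can be sketched in Python as plain gradient descent. This is a simplification: the experiments in this document use the Adam optimizer, and the name `local_update`, the toy quadratic loss and all numeric values are hypothetical.

```python
def local_update(w, grad_fn, learning_rate, e_local):
    """Apply e_local gradient steps w = w - learning_rate * grad(w)."""
    if not e_local >= 1:
        raise ValueError("e_local must be at least 1")
    w = list(w)
    for _ in range(e_local):
        g = grad_fn(w)
        w = [wi - learning_rate * gi for wi, gi in zip(w, g)]
    return w


# Toy local loss 0.5 * ||w - target||^2, whose gradient is w - target.
target = [1.0, -1.0]


def toy_grad(w):
    return [wi - ti for wi, ti in zip(w, target)]


w_local = local_update([0.0, 0.0], toy_grad, learning_rate=0.5, e_local=3)
# w_local == [0.875, -0.875]
```

Each step halves the distance to the toy optimum, so three steps move the parameters from 0 to 0.875 of the way to the target.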

3. The method by which the client splits the model into parameter blocks is as follows:

In the $t$-th round of federated learning model updating, each client has its own lithography hotspot data set and trains on it to obtain local model parameter information $w_i^t$. Before sending the model parameter information, each client first performs model blocking processing on it:

$$w_i^t = \left[ w_{i,1}^t,\; w_{i,2}^t \right], \quad w_{i,1}^t \in \mathbb{R}^{d_1}, \quad w_{i,2}^t \in \mathbb{R}^{d_2}$$

where $w_{i,1}^t$ is the model parameter block sent by the $i$-th client to Server1 in round $t$, $w_{i,2}^t$ is the model parameter block sent by the $i$-th client to Server2 in round $t$, and $d$ is the dimension of each client's local model parameters, satisfying $d = d_1 + d_2$.

Let the number of network layers of the client node model be $k$ and the number of central servers be $n$; in this example, $k = 6$ and $n = 2$. The methods by which each client node partitions its local model parameters are specifically:

Blocking according to the order of the neural network layers: each client node divides its model equally into $n$ blocks in forward-propagation order and distributes the $n$ blocks to the central servers. In this example, each client sends the parameters of layers 1-3 to Server1 as $w_{i,1}^t$ and the parameters of layers 4-6 to Server2 as $w_{i,2}^t$; the central servers aggregate the parameters and return them to each client node for the next iteration.

Blocking by odd and even layers: when the number of central servers is 2, as in this example, each client sends the parameters of the odd layers, i.e. layers 1, 3 and 5, to Server1 as $w_{i,1}^t$ and the parameters of the even layers, i.e. layers 2, 4 and 6, to Server2 as $w_{i,2}^t$; the central servers aggregate the parameters and return them to each client node for the next iteration.

Blocking according to the attribute of the neural network layer: typically, the neural network layers with trainable parameters include convolutional layers, fully connected layers, recurrent layers, and the like. In this example, each client sends the parameters of all convolutional layers, i.e. layers 1-4, to Server1 as $w_{i,1}^t$ and the parameters of all fully connected layers, i.e. layers 5-6, to Server2 as $w_{i,2}^t$; the central servers aggregate the parameters and return them to each client node for the next iteration.

Blocking by randomly selected network layers: in each round, a central server first generates a random number seed and sends it to every client; each client uses the same random number seed to divide the neural network layers randomly into $n$ blocks and sends the $n$ blocks to the $n$ central servers respectively. In this example, the parameters of 3 randomly selected layers are sent to Server1 as $w_{i,1}^t$ and the parameters of the remaining 3 layers are sent to Server2 as $w_{i,2}^t$; the central servers aggregate the parameters and return them to each client node for the next iteration.
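The four blocking strategies can be sketched in Python as functions that return, for each central server, the list of layer numbers it receives. All function names are hypothetical, layers are numbered from 1 as in this example, and the equal-size split assumes that the number of layers is divisible by the number of servers.

```python
import random


def split_by_layer_order(k, n):
    """Contiguous, equally sized blocks in forward-propagation order."""
    if k % n != 0:
        raise ValueError("this sketch assumes k is divisible by n")
    size = k // n
    return [list(range(s * size + 1, (s + 1) * size + 1)) for s in range(n)]


def split_odd_even(k):
    """Odd layers to the first server, even layers to the second."""
    return [list(range(1, k + 1, 2)), list(range(2, k + 1, 2))]


def split_by_attribute(layer_types):
    """Convolutional layers to the first server, fully connected to the second."""
    conv = [i for i, t in enumerate(layer_types, start=1) if t == "conv"]
    fc = [i for i, t in enumerate(layer_types, start=1) if t == "fc"]
    return [conv, fc]


def split_randomly(k, n, seed):
    """Seeded shuffle, so every client derives the same random blocks."""
    if k % n != 0:
        raise ValueError("this sketch assumes k is divisible by n")
    layers = list(range(1, k + 1))
    random.Random(seed).shuffle(layers)
    size = k // n
    return [sorted(layers[s * size:(s + 1) * size]) for s in range(n)]


by_order = split_by_layer_order(6, 2)                       # [[1, 2, 3], [4, 5, 6]]
by_parity = split_odd_even(6)                               # [[1, 3, 5], [2, 4, 6]]
by_type = split_by_attribute(["conv"] * 4 + ["fc"] * 2)     # [[1, 2, 3, 4], [5, 6]]
by_seed = split_randomly(6, 2, seed=2023)
```

Because the random split depends only on the shared seed, all clients send the same layers to the same server, which is what allows each server to aggregate its block coordinate by coordinate.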

4. The two central servers aggregate the model parameter blocks transmitted by the clients as follows:

$$\bar{w}_1^t = \frac{1}{N} \sum_{i=1}^{N} w_{i,1}^t, \qquad \bar{w}_2^t = \frac{1}{N} \sum_{i=1}^{N} w_{i,2}^t$$

where $\bar{w}_1^t$ is the model parameter block obtained by Server1 by aggregation in round $t$, $\bar{w}_2^t$ is the model parameter block obtained by Server2 by aggregation in round $t$, and $N$ is the number of clients participating in federated learning training.

5. The central servers send the aggregation results to the clients for splicing as follows:

The two central servers send the aggregated model parameter blocks $\bar{w}_1^t$ and $\bar{w}_2^t$ to each client, and each client then splices the model parameter blocks locally to obtain the complete model parameters of round $t+1$:

$$w^{t+1} = \left[ \bar{w}_1^t,\; \bar{w}_2^t \right]$$

Each client continues to update its local model parameters on its local lithography hotspot data set and repeats the above steps, finally obtaining the globally converged model $w_{global}$.
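One complete round of blocking, per-server aggregation and client-side splicing can be sketched in Python as follows. The sketch assumes that each client's parameters are stored as a mapping from layer number to a flat list of values and that each server takes an unweighted mean; the function names and the two-client, four-layer example are hypothetical.

```python
def send_blocks(params, partition):
    """Client side: cut the layer dictionary into one block per server."""
    return [{layer: params[layer] for layer in block} for block in partition]


def aggregate_block(blocks):
    """Server side: unweighted mean of the same block from all clients."""
    n = len(blocks)
    return {layer: [sum(vals) / n for vals in zip(*(b[layer] for b in blocks))]
            for layer in blocks[0]}


def splice(aggregated_blocks):
    """Client side: merge the returned blocks back into one model."""
    merged = {}
    for block in aggregated_blocks:
        merged.update(block)
    return dict(sorted(merged.items()))


def fl_mba_round(client_params, partition):
    uploads = [send_blocks(p, partition) for p in client_params]
    aggregated = [aggregate_block([u[s] for u in uploads])
                  for s in range(len(partition))]   # servers never exchange data
    return splice(aggregated)


client_a = {1: [1.0], 2: [2.0], 3: [3.0], 4: [4.0]}
client_b = {1: [3.0], 2: [4.0], 3: [5.0], 4: [6.0]}
partition = [[1, 3], [2, 4]]                         # odd / even split
global_params = fl_mba_round([client_a, client_b], partition)
# global_params == {1: [2.0], 2: [3.0], 3: [4.0], 4: [5.0]}
```

In this sketch each server only ever receives the layers assigned to it, and no step requires the two servers to communicate with each other.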

As shown in FIG. 5, the model blocking aggregation privacy protection method for lithography hotspot detection mainly comprises the following three stages:

Client model blocking stage: each client partitions its locally updated model parameters into blocks and distributes them to the two central servers, so that neither central server can obtain the client's complete parameter information.

Central server model aggregation stage: the two central servers each average the model parameter blocks transmitted by the clients and distribute the result to every client.

Client model splicing stage: each client splices the model parameter blocks downloaded from the two central servers to obtain the complete aggregated model parameters.

The model blocking aggregation privacy protection method for lithography hotspot detection provided by the invention addresses, in theory and in practice, the challenge of protecting the privacy of client update parameters and improves the communication and aggregation efficiency of the central servers. The key idea of the invention is to split the model parameters by neural network layer, so that neither of the two central servers can obtain the complete client model parameters, which protects client privacy. Furthermore, since each central server does not have to aggregate all model parameters, aggregation and communication speed is improved.

1. Embodiment parameter settings:

This embodiment uses 2 data sets: the ICCAD 2012 Contest data set and the industrial data set asml1. Both are highly representative data sets in the lithography hotspot field, and their basic information is shown in Table 2. The four columns of the table list the total numbers of lithography hotspots (hotspot) and non-lithography hotspots (non-hotspot) in the training set and the test set, respectively.

Table 2 data set basic information

The experimental details of this embodiment are as follows:

The embodiment of the invention implements the federated learning privacy protection algorithm with the PyTorch library and trains and tests on a platform equipped with a Xeon E5-2650 CPU and an Nvidia 1080Ti GPU. To verify the actual privacy protection effect of the method in the lithography hotspot detection framework based on federated learning, the embodiment selects the ICCAD 2012 Contest data set and the Industry data set to verify the proposed method. Both are highly representative data sets in the lithography hotspot field, and their basic information is shown in Table 3. The ICCAD 2012 Contest data set contains five benchmark data sets (ICCAD1-ICCAD5). Because the data volume of a single benchmark data set is small and cannot verify the scalability of the algorithm, the five benchmark data sets are merged into one unified data set, ICCAD. In addition, to test the performance of the CNN model in actual industrial production, this embodiment also uses Industry, an industrial data set from the integrated circuit industry, which contains more complex circuit layout data and covers a wider range of circuit layout design patterns. Each data set is divided into a training data set, used to train the neural network model, and a test data set, used to test the performance of the model.

TABLE 3 number of lithography hotspots and non-lithography hotspots in training set and test set for each dataset

Lithography hotspot images and non-hotspot images from the ICCAD 2012 Contest data set (28 nm manufacturing process) and the Industry data set (20 nm manufacturing process) are used as the original training and test data sets. In the ICCAD 2012 Contest data set, the original training data set contains 1204 lithography hotspot images and 17096 non-hotspot images, and the original test data set contains 2524 lithography hotspot images and 138848 non-hotspot images. In the Industry data set, the original training data set contains 3629 lithography hotspot images and 80299 non-hotspot images, and the original test data set contains 942 lithography hotspot images and 20412 non-hotspot images. The lithography hotspot images in the two data sets have the following characteristics: each image has monotonous colors, simple lines and regular shapes; and each image embodies a chip manufacturer's unique design patterns and is therefore highly private.

To verify the effectiveness of the proposed privacy protection method in the lithography hotspot detection framework based on federated learning, the embodiment of the invention deploys the privacy protection methods in two federated learning algorithms, FedAvg and FedProx, and tests the privacy protection effect of the proposed method. Because the research focus of the invention is the privacy protection strategy for lithography hotspot data, only the scenario in which all federated learning algorithms train synchronously is considered, that is, all clients are selected to participate in model aggregation in every round. The neural network model used by each client is a CNN model, whose configuration parameters are shown in Table 1.

All federated learning algorithms in the embodiment use the Adam optimizer with a learning rate of 0.001. The ICCAD 2012 Contest training set and the Industry training set are each distributed across 5 clients as local data. Each client uses a batch size of 64 for local training and 256 for local testing, and all clients run 3 epochs of local training. In the DLG attack experiments, the attack learning rate is set to 0.01, the number of attack iterations is set to 100, and each attack recovers 1 lithography hotspot image. The number of central servers is set to 2 when building the secure multi-party computation federated learning framework.
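The settings listed above can be gathered in one place for reference. The following is a minimal sketch: the class, field, and function names are hypothetical and do not come from the original implementation, while the values are those stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Experimental settings from the embodiment; field names are illustrative."""
    optimizer: str = "Adam"
    learning_rate: float = 0.001
    num_clients: int = 5
    train_batch_size: int = 64
    test_batch_size: int = 256
    local_epochs: int = 3
    dlg_learning_rate: float = 0.01
    dlg_iterations: int = 100
    dlg_images_per_attack: int = 1
    num_central_servers: int = 2


def local_training_steps(num_local_samples: int, cfg: ExperimentConfig) -> int:
    """Optimizer steps a client runs in its local training (partial last batch counted)."""
    batches_per_epoch = -(-num_local_samples // cfg.train_batch_size)  # ceiling division
    return batches_per_epoch * cfg.local_epochs


config = ExperimentConfig()
```

For instance, a client holding 129 local samples would run three batches per epoch, so `local_training_steps(129, config)` gives 9 steps over the 3 local epochs.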

To evaluate the privacy protection performance of the proposed algorithm fairly, the two secure multi-party computation federated learning privacy protection methods are compared under the same experimental settings. The evaluation metrics are the model parameter computation time T_c, the time T_ss required for model parameters to be communicated between the two central servers, the time T_cs required for model parameters to be communicated between a central server and a client, the model accuracy (ACC), and the mean squared error (MSE) between the virtual gradient and the real gradient. The MSE is calculated according to the following formula:

$$\mathrm{MSE} = \frac{1}{d}\,\lVert \nabla W' - \nabla W \rVert^{2}$$

where $\nabla W'$ is the virtual gradient vector set up by the DLG attack, $\nabla W$ is the true gradient vector of the client, $d$ is the dimension of the gradient, and $\lVert\cdot\rVert$ is the L2 norm. The MSE reflects the degree of privacy protection provided by the algorithm: the smaller the MSE, down to values approaching 0, the weaker the privacy protection; the larger the MSE, the stronger the privacy protection.
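The MSE metric described above can be illustrated with a short sketch. It assumes the usual form of the metric, the squared L2 norm of the difference between the two gradient vectors divided by the dimension d; the function name and the toy vectors are hypothetical.

```python
from typing import Sequence


def gradient_mse(virtual_grad: Sequence[float], true_grad: Sequence[float]) -> float:
    """MSE between a DLG virtual gradient and the client's true gradient.

    Assumed form: (1/d) * ||virtual - true||_2^2, with d the gradient dimension.
    """
    if len(virtual_grad) != len(true_grad):
        raise ValueError("gradients must have the same dimension")
    d = len(true_grad)
    if d == 0:
        raise ValueError("gradients must be non-empty")
    squared_l2 = sum((v - g) ** 2 for v, g in zip(virtual_grad, true_grad))
    return squared_l2 / d


# Toy values chosen only for illustration.
true_grad = [0.5, -1.0, 2.0, 0.0]
matched = gradient_mse(true_grad, true_grad)                 # attack matched the gradient
mismatched = gradient_mse([0.0, 0.0, 0.0, 0.0], true_grad)   # attack failed to match
```

An MSE of 0 means the attacker's virtual gradient coincides with the true gradient, which is the case in which the training image is most exposed.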

2. Example procedure and results:

The invention verifies the effect of the lithography hotspot data privacy protection method based on the federated learning framework through numerical experiments implemented in Python. The experiments are designed as follows:

Example 1: Numerical experiments verify the performance of the secure multi-party computation methods in federated learning privacy protection. Experiment 1 compares the FL-PSA algorithm, the FL-MBA algorithm, and a federated learning algorithm without defensive measures on the ICCAD 2012 Contest dataset and the Industry dataset. The random vector added by the FL-PSA algorithm has mean 0 and variance 0.1. The FL-MBA algorithm numbers and blocks the shared model parameters layer by layer. The neural network used in Experiment 1 is a CNN model with four convolutional layers and two fully connected layers, whose model parameter blocks are numbered 1 to 6. Each client divides the model parameter blocks into two parts according to the method provided by the invention; one part is sent to Server1 for aggregation and the other is sent to Server2 for aggregation. The effect of each secure multi-party computation strategy against DLG attacks is shown in Fig. 6, where "Original" denotes the federated learning algorithm without defensive measures. Without defensive measures, the privacy of lithography hotspot image data under the federated learning framework cannot be guaranteed: an attacker can fully recover the training data after a small number of attack iterations. By contrast, both the FL-PSA algorithm and the FL-MBA algorithm effectively defend against DLG attacks: the MSE between the attacker's virtual gradient and the real gradient reaches as high as 1.15, which prevents the attacker from stealing the client's lithography hotspot image data.
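The block, aggregate, and splice flow of FL-MBA described above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, layers are represented as short lists of floats, and equal-weight averaging across clients is an assumption, since the aggregation rule is not restated in this passage.

```python
from typing import Dict, List, Tuple

LayerParams = Dict[int, List[float]]  # layer number -> flattened layer parameters


def split_blocks(params: LayerParams,
                 first_block_layers: List[int]) -> Tuple[LayerParams, LayerParams]:
    """Split per-layer parameters into the block for Server1 and the block for Server2."""
    first = {k: v for k, v in params.items() if k in first_block_layers}
    second = {k: v for k, v in params.items() if k not in first_block_layers}
    return first, second


def aggregate(blocks: List[LayerParams]) -> LayerParams:
    """Average one block across clients (equal weights are an assumption)."""
    n = len(blocks)
    return {
        layer: [sum(vals) / n for vals in zip(*(b[layer] for b in blocks))]
        for layer in blocks[0]
    }


def splice(first: LayerParams, second: LayerParams) -> LayerParams:
    """Rebuild the full model on the client from the two aggregated blocks."""
    return dict(sorted({**first, **second}.items()))


# Toy example: 2 clients, 6 layers with 2 parameters each.
clients = [
    {layer: [float(layer), 1.0] for layer in range(1, 7)},
    {layer: [float(layer) + 2.0, 3.0] for layer in range(1, 7)},
]
to_server1, to_server2 = zip(*(split_blocks(c, [1, 2, 3]) for c in clients))
global_model = splice(aggregate(list(to_server1)), aggregate(list(to_server2)))
```

In this sketch each server only ever receives three of the six layers from any client, and the complete model exists only on the client after splicing.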

Experiment 1 further deploys the different secure multi-party computation methods in the federated learning algorithms FedAvg and FedProx and compares their model accuracy. The experimental results are shown in Table 4. Under the same federated learning algorithm, the FL-PSA algorithm and the FL-MBA algorithm maintain the model accuracy of the federated learning algorithm without defensive measures: the accuracy differs by no more than 0.4%, which is within normal fluctuation. Both algorithms therefore achieve model accuracy consistent with the undefended federated learning algorithm while effectively protecting the privacy of the model.

Table 4 Model accuracy comparison of different secure multi-party computation methods deployed in federated learning algorithms

Example 2: Numerical experiments next study the time performance of the FL-PSA algorithm and the FL-MBA algorithm. In each round of federated learning training, the computation time T_c of the FL-PSA algorithm includes the time to compute the gradient and the time to split the model parameters; its communication time includes T_cs and T_ss, where T_cs is the time to upload and download model parameters between a client and the two central servers, and T_ss is the time for model parameters to be communicated between the two central servers. The computation time T_c of the FL-MBA algorithm includes the time to compute the gradient and the time to block the model parameters; its communication time consists only of T_cs, the time to upload and download model parameter blocks between a client and the two central servers. The time spent by both algorithms in each round therefore comprises computation time and communication time. Both the client-to-central-server links and the link between the central servers have a bandwidth of 100 Mbps. The experimental results are shown in Fig. 7, where "Original" refers to the federated learning algorithm without defensive measures.
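The time decomposition above can be sketched as a simple model. The sketch assumes 4-byte parameters and ignores latency and protocol overhead; the function names and the model size are hypothetical, and only the 100 Mbps bandwidth is taken from the text.

```python
def transfer_time_ms(num_params: int,
                     bandwidth_mbps: float = 100.0,
                     bytes_per_param: int = 4) -> float:
    """Time to move num_params parameters over one link, ignoring latency."""
    bits = num_params * bytes_per_param * 8
    return bits / (bandwidth_mbps * 1e6) * 1e3


def round_time_ms(t_c: float, t_cs: float, t_ss: float = 0.0) -> float:
    """Per-round time as computation time plus communication time.

    t_ss defaults to 0, which corresponds to a scheme with no
    server-to-server communication.
    """
    return t_c + t_cs + t_ss


# Hypothetical model size, chosen only for illustration.
NUM_PARAMS = 100_000
full_model_ms = transfer_time_ms(NUM_PARAMS)        # whole model over one link
half_block_ms = transfer_time_ms(NUM_PARAMS // 2)   # one of two equal-sized blocks
```

Under these assumptions, a block half the size of the model takes half as long to transfer, so sending two such blocks to two servers moves the same total volume as sending the whole model once.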

As can be seen from Fig. 7, the "Original" algorithm takes the least time, only 61.49 ms, because a federated learning algorithm without defensive measures incurs no additional computation or communication overhead. The FL-PSA algorithm takes the longest, 118.94 ms in total. Parameter splitting adds computation overhead, and the algorithm must send two full-size parameter models to the two central servers, which must in turn communicate with each other to aggregate in every round. Compared with the "Original" algorithm, this produces about 2 times the communication overhead; as the number of training rounds and the model scale grow, the communication time increases substantially, which limits the practicality of the algorithm. The FL-MBA algorithm takes only 67.29 ms. Parameter blocking adds only a small computation overhead, and because the algorithm halves the size of the parameters sent to each server, sending the parameter blocks to two central servers does not significantly increase the communication overhead. In addition, the algorithm does not allow the two central servers to communicate, which largely prevents privacy leakage caused by collusion between the central servers and better protects the data privacy of the clients.
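The relative cost of each method can be read off the reported per-round times with a short calculation. The times below are the ones quoted from Fig. 7; the dictionary and function names are hypothetical.

```python
# Per-round times in milliseconds as reported in the text.
ROUND_TIME_MS = {"Original": 61.49, "FL-PSA": 118.94, "FL-MBA": 67.29}


def relative_overhead(method: str, baseline: str = "Original") -> float:
    """Extra per-round time of a method relative to the baseline, as a fraction."""
    return ROUND_TIME_MS[method] / ROUND_TIME_MS[baseline] - 1.0


psa_overhead = relative_overhead("FL-PSA")  # roughly 0.93: about 93% more time per round
mba_overhead = relative_overhead("FL-MBA")  # roughly 0.09: about 9% more time per round
```

These ratios refer to total per-round time, that is, computation and communication together.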

The above embodiments are intended to illustrate the present invention, not to limit it. Any modifications and variations made within the spirit of the invention and the scope of the appended claims fall within the scope of protection of the invention.

Claims

1. The model blocking aggregation privacy protection method for lithography hotspot detection is characterized by comprising the following steps of:

2. The method for protecting privacy of model blocking aggregation for lithography hotspot detection according to claim 1, wherein the model blocking processing specifically comprises:

wherein the first block is the model parameter block that the i-th client sends to the first central server in the t-th round, the second block is the model parameter block that the i-th client sends to the second central server in the t-th round, and d is the local model parameter dimension of each client, satisfying d = d₁ + d₂.

3. The method for protecting privacy by model block aggregation for lithography hotspot detection according to claim 2, wherein the central server aggregates model parameter blocks transmitted from each client specifically:

4. The method for protecting privacy by model blocking and aggregation for lithography hotspot detection according to claim 1, wherein the locally spliced model parameter block of each client is specifically:

5. The method for protecting privacy by model blocking and aggregation for lithography hotspot detection according to claim 1, wherein the federated learning model comprises a convolutional neural network model in each client, and the specific construction method comprises the following steps:

6. The method for protecting privacy of model partitioning and aggregation for lithography hotspot detection according to claim 1, wherein the model blocking processing of the model parameter information comprises: blocking according to the order of the neural network layers, and blocking by randomly selecting network layers:

7. The method for protecting privacy by model blocking and aggregation for lithography hotspot detection according to claim 6, wherein the blocking according to the order of the neural network layers is specifically as follows: each client sends the parameters of layers 1-3 as a first model parameter block to a first central server and the parameters of layers 4-6 as a second model parameter block to a second central server, and the central servers aggregate the parameters and return the aggregated parameters to each client node for the next iteration.

8. The method for protecting privacy of model partitioning and aggregation for lithography hotspot detection according to claim 6, wherein, in the case that the number of central servers is 2, the model blocking processing of the model parameter information further comprises blocking by odd and even layers: each client sends the odd-numbered layers to one central server and the even-numbered layers to the other central server.

9. The method for protecting privacy by model blocking and aggregation for lithography hotspot detection according to claim 6, wherein, in the case of a convolutional neural network, the model blocking processing of the model parameter information further comprises blocking according to the type of neural network layer: in a convolutional neural network, only the convolutional layers and the fully connected layers have trainable parameters; the convolutional layer parameters and the fully connected layer parameters are divided into separate blocks and sent separately to the central servers.
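The layer-partitioning options named in claims 6 to 9 can be illustrated with a short sketch for the two-server case. All function names are hypothetical, the six-layer layout follows the CNN of the embodiment (four convolutional layers followed by two fully connected layers), and the equal-sized random split is an assumption, since the claims do not state how many layers the random selection picks.

```python
import random
from typing import Dict, List, Sequence, Tuple

Partition = Tuple[List[int], List[int]]  # (layers for server 1, layers for server 2)


def split_sequential(layers: Sequence[int]) -> Partition:
    """First half of the layers to server 1, second half to server 2."""
    half = len(layers) // 2
    return list(layers[:half]), list(layers[half:])


def split_odd_even(layers: Sequence[int]) -> Partition:
    """Odd-numbered layers to server 1, even-numbered layers to server 2."""
    return [n for n in layers if n % 2 == 1], [n for n in layers if n % 2 == 0]


def split_by_type(layer_types: Dict[int, str]) -> Partition:
    """Convolutional layers to server 1, fully connected layers to server 2."""
    conv = [n for n, kind in layer_types.items() if kind == "conv"]
    fc = [n for n, kind in layer_types.items() if kind == "fc"]
    return conv, fc


def split_random(layers: Sequence[int], seed: int) -> Partition:
    """Randomly chosen half of the layers to server 1, the rest to server 2."""
    shuffled = list(layers)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return sorted(shuffled[:half]), sorted(shuffled[half:])


LAYERS = list(range(1, 7))
LAYER_TYPES = {1: "conv", 2: "conv", 3: "conv", 4: "conv", 5: "fc", 6: "fc"}
```

With this layout, the sequential split yields layers 1-3 and 4-6, the odd/even split yields layers 1, 3, 5 and 2, 4, 6, and the split by type yields layers 1-4 and 5-6.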