CN115329885A

CN115329885A - Personalized federal learning method and device based on privacy protection

Info

Publication number: CN115329885A
Application number: CN202211014315.2A
Authority: CN
Inventors: 陈晋音; 李明俊; 刘涛
Original assignee: Zhejiang University of Technology ZJUT
Current assignee: Zhejiang University of Technology ZJUT
Priority date: 2022-08-23
Filing date: 2022-08-23
Publication date: 2022-11-11

Abstract

The invention discloses a privacy protection-based personalized federal learning method and a device, wherein a client performs each round of training to obtain trained local model matrix parameters and uploads the trained local model matrix parameters to a corresponding personalized model; calculating a threshold value of each client according to the variable quantity of the matrix parameters of the local model trained in the current round, and then calculating the frequency sparsity of each client; clustering the clients according to the frequency sparsity set; averaging local model parameters uploaded by clients in the same cluster, and issuing the average value as a new global model matrix parameter to the clients in the same cluster; and repeating until the global model is converged, and finishing the training of the personalized federal learning. According to the invention, the user privacy is always protected in the process of calculating the frequency sparsity, the clients are clustered through the similarity of the sparsity to form K clusters, and the client distribution of the K clusters is subjected to aggregation operation, so that the effects of collaborative training and personalized federal learning are achieved.

Description

Personalized federal learning method and device based on privacy protection

Technical Field

The invention belongs to the technical field of federal learning, and particularly relates to a personalized federal learning method and a personalized federal learning device based on privacy protection.

Background

Deep learning is now a great concern in both academic and industrial areas. Because the performance of the deep learning is greatly improved compared with that of the traditional algorithm, the deep learning is widely applied to various fields, such as machine translation, image recognition, unmanned driving, natural language processing and the like. Deep learning is changing our lifestyle. Its success depends on powerful computers and the availability of large amounts of data. However, a serious privacy problem is caused by a learning system which needs to input all data into a learning model running on a central server, and with the rise of the internet of things and edge computing, big data is not always bound to a single whole but distributed in many aspects, and how to safely and effectively update and share the model among multiple places is a new challenge faced by various computing methods. Due to the consideration of data privacy and safety, data owners cannot share data directly to achieve a common training deep learning model. People are beginning to seek a way federal learning has emerged as an extremely potential solution to address data islanding problems and user privacy. Federal Learning (FL), which has been proposed and has attracted significant attention because of its ability to collaboratively train a shared global model in a decentralized data environment, has now become a popular distributed machine learning paradigm.

A large number of distributed clients collectively participate in the learning process by uploading the gradient of their local models (or model weights) to the server over multiple iterations without sharing the raw data between the clients. At the beginning of the FL task, the server initializes the global model. In each learning iteration, the server distributes current global model matrix parameters to selected clients. Each selected client proceeds to independently train the received model using its local data by following a predefined learning protocol. At the end of each learning iteration, the server collects and aggregates updates from the client using gradient aggregation rules (e.g., fedAvg). This mechanism for protecting user privacy has found widespread application in the practical deployment of FL in recent years, such as loan status prediction, health assessment, and next word prediction. While FL has proven effective in generating a better federal model, it may not be the best solution for each client, as the data distribution for the client may be Non-IID. Personalized operations need to be considered in order to better adapt to their unique data distribution. Existing research has noted the problem of data heterogeneity and proposed some personalized approaches to solve this problem. The method comprises the steps of using a fine-tuning federal model to achieve schemes of personalization, multi-task learning, knowledge extraction and the like. While these approaches may facilitate personalization to some extent, they have a significant drawback in that the personalization process is limited to a single device, which may present some bias or overfitting problems because the data in the device is extremely limited. Meanwhile, due to relevant regulations of privacy laws, the privacy protection of a client side of the personalized federal learning framework needs to be fully considered during design. Therefore, when the Non-IID data distribution is faced, how to better guarantee the main performance and safety of federal learning becomes the focus of attention of people.

Disclosure of Invention

The invention aims to provide a personalized federal learning method and a personalized federal learning device based on privacy protection aiming at the defects of the prior art.

The purpose of the invention is realized by the following technical scheme: a personalized federal learning method based on privacy protection comprises the following steps:

(1) Initializing a federal learning training environment;

(2) The server sets a corresponding personalized model for each client at the cloud, each personalized model issues the global model matrix parameter of the personalized model to the corresponding client, and federal learning training is started;

(3) The client side participating in the training carries out the t round of training to obtain the trained local model matrix parameters and uploads the trained local model matrix parameters to the corresponding personalized model; calculating the threshold value alpha of each client according to the variable quantity of the matrix parameters of the local model trained in the current round _i,t And updating the matrix of each client [ (wcf-matrix) _i,t ](ii) a According to the matrix of each client [ (wcf-matrix) _i,t ]Calculating the frequency sparsity epsilon of each client _i,t To obtain a frequency sparsity set { epsilon _1,t ,ε _2,t ,…,ε _i,t ,…,ε _k,t }；

(4) From a set of frequency sparsities { epsilon _1,t ,ε _2,t ,…,ε _i,t ,…,ε _k,t Performing K-Means clustering operation, and collecting frequency sparsity { epsilon } _1,t ,ε _2,t ,…,ε _i,t ,…,ε _k,t Dividing the frequency sparsity into K clusters; and the frequency sparsity epsilon in the same cluster _i,t The represented clients are divided into the same cluster;

(5) Averaging local model parameters uploaded by clients in the same cluster, and issuing the average value as a new global model matrix parameter to the clients in the same cluster;

(6) And (5) repeating the steps (3) to (5) until the global model is converged, and finishing the training of the personalized federal learning model.

Further, the step (1) is specifically: and setting an overall training round E, local data D and an overall client number k participating in federal learning.

Further, the step (2) specifically includes the following sub-steps:

(2.1) the Server is at the cloud for each client p _i Setting a corresponding personalized model N _i And initializing each personalized model N _i Obtaining initialized global model matrix parameters

i =1,2, … i,. K; the initialized global model matrix parameters

The matrix size of (a) is W × H;

(2.2) Each personalized model N _i Global model matrix parameters to be initialized

Is sent to the corresponding client p _i And starting federal learning training.

Further, the step (3) specifically includes the following sub-steps:

(3.1) client participating in training p _i Not sharing data, locally performing the issued global model weightLocal model training:

for the t-th round of training, obtaining the matrix parameters of the trained local model

And uploaded to the corresponding personalized model N _i ；

(3.2) calculating each client p after the t round of local model training _i Threshold value alpha of _i,t The calculation formula is as follows:

wherein alpha is _i,t Representing each client p after the t-th round of local model training _i A threshold value of (a);

representing local model matrix parameters

Subparameters of the nth row and the vth column;

representing initialized global model matrix parameters

Subparameters of the nth row and the vth column;

(3.3) if

Then the matrix is updated](wcf-matrix) _i,t ]Subparameters of the u-th row and the v-th column:

if it is

Then update the matrix [ (wcf-matrix) _i,t ]Subparameters of the u-th row and the v-th column:

wherein, [ (wcf-matrix) _i,t ] _u,v Representation matrix [ (wcf-matrix) _i,t ]Subparameters of the nth row and the vth column;

repeating the above steps to update the whole matrix [ (wcf-matrix) _i,t ]；

(3.4) by the matrix of each client [ (wcf-matrix) _i,t ]Calculating the frequency sparsity epsilon of each client _i,t And obtaining a frequency sparsity set [ epsilon ] _1,t ,ε _2,t ,…,ε _i,t ,…,ε _k,t Frequency sparsity ε _i,t The calculation formula of (c) is as follows:

further, the step (4) specifically includes the following sub-steps:

(4.1) from the set of frequency sparsity { ε _1,t ,ε _2,t ,…,ε _i,t ,…,ε _k,t Performing K-Means clustering operation, setting the number of clusters as K, wherein the K-Means clustering operation specifically comprises the following steps:

from the set of frequency sparsity [. Epsilon. ] _1,t ,ε _2,t ,…,ε _i,t ,…,ε _k,t Randomly selecting K frequency sparsity as an initial centroid; calculating sparsity epsilon of each frequency _i,t Distance to each centroid, frequency sparsity ε _i,t Dividing the image into clusters corresponding to centroids closest to each other; calculating all frequency sparsity epsilon in each cluster _i,t And updating the centroid of the cluster using the mean to achieve the maximum number of iterations, finally forming K clusters: s ₁ 、S ₂ …S _K ；

(4.2) frequency sparsity ε to be in the same cluster _i,t The represented clients are divided into the same cluster.

The invention also provides a personalized federal learning device based on privacy protection, which comprises one or more processors and is used for realizing the personalized federal learning method based on privacy protection.

The present invention also provides a computer-readable storage medium having a program stored thereon, where the program, when executed by a processor, is configured to implement the above-mentioned personalized federal learning method based on privacy protection.

The invention has the beneficial effects that: compared with the traditional federal learning, the method calculates the variable quantity of the matrix parameters of the trained local model in each round, the frequency sparsity of each client is calculated, the frequency sparsity belongs to statistics and does not relate to local data of the clients, so that the privacy of users is always protected in the process, the clients are clustered through the similarity of the frequency sparsity to form K clusters, and the client distribution of the K clusters is subjected to aggregation operation to achieve the effect of the personalized federal learning.

Drawings

FIG. 1 is a schematic diagram of a local and cloud device according to the present invention;

FIG. 2 is a schematic diagram of a local and cloud device according to the present invention;

fig. 3 is a schematic device diagram of a personalized federal learning device based on privacy protection provided by the invention.

Detailed Description

For purposes of promoting an understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description of the embodiments taken in conjunction with the accompanying drawings, it being understood that the specific embodiments described herein are illustrative of the invention and are not intended to be exhaustive. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, are within the scope of the present invention.

The present invention uses a means of privacy protection to attempt collaborative training of similarly distributed clients. Compared with the traditional federal learning, the method has the advantages that the change frequency of the weight of the local training model uploaded by the client is collected through the server, the sparsity of the frequency is calculated, the sparsity of the frequency belongs to statistics and does not relate to local data of the client, so that the privacy of a user is always protected in the process, the clients are clustered through the similarity of the sparsity to form four clusters, and the client distribution of the four clusters is subjected to aggregation operation to achieve cooperative training so as to achieve the effect of personalized federal learning.

Example 1

As shown in fig. 1 and fig. 2, the present invention provides a personalized federal learning method based on privacy protection, which comprises the following steps:

(1) Acquiring local data D:

in this embodiment, an MNIST dataset and an ImageNet dataset are used as local data; the MNIST data set comprises 60000 Zhang Huidu images, the size of each image is 28 × 28, and the images are divided into 10 types; 50000 pieces of data are taken as local data. The ImageNet data sets are 1000 types in total, each type comprises 1000 samples, each picture is an RGB color image, and the size of each sample is 224 × 224; from each class, 30% of pictures were randomly drawn as local data.

(2) Initializing a federated learning training environment: and setting an overall training round E, local data D and the number k of overall equipment participating in federal learning.

(3) The server sets a corresponding personalized model for each client at the cloud, each personalized model issues the global model matrix parameter of the personalized model to the corresponding client, and federal learning training is started.

The step (3) includes the substeps of:

(3.1) the Server is in the cloud for each client p _i Setting up corresponding personalized model N _i And initializing each personalized model N _i Obtaining initialized global model matrix parameters

i =1,2, … i,. K; the initialized global model matrix parameters

Comprises the following steps:

the matrix size is W x H, wherein,

global model matrix parameters for initialization

Subparameters of the nth row and the vth column; u =1,2, … u.. H, v =1,2, … v.. W; (ii) a

(3.2) Each personalized model N _i Global model matrix parameters to be initialized

(4) The client side participating in the training carries out the training of the t round to obtain the matrix parameters of the trained local model and uploads the matrix parameters to the corresponding personalized model; calculating the threshold value alpha of each client according to the variable quantity of the matrix parameters of the local model trained in the current round _i,t And updating the matrix of each client [ (wcf-matrix) _i,t ](ii) a According to the matrix of each client [ (wcf-matrix) _i,t ]Calculating the frequency sparsity epsilon of each client _i,t To obtain a frequency sparsity set { epsilon _1,t ,ε _2,t ,…,ε _i,t ,…,ε _k,t }。

The step (4) comprises the following substeps:

(4.1) client participating in training p _i Carrying out the training of the t-th round without data sharing, carrying out the local model training on the issued global model weight locally to obtain the trained local model matrix parameters

And uploaded to the corresponding personalized model N _i (ii) a The trained local model matrix parameters

Comprises the following steps:

wherein the content of the first and second substances,

for local model parameters

Subparameters of the nth row and the vth column;

(4.2) calculating each client p after the t-th round of local model training according to the variable quantity of the matrix parameters of the local model trained in the current round _i Threshold value alpha of _i,t Threshold value alpha _i,t The calculation formula of (a) is as follows:

wherein alpha is _i,t Representing each client p after the t-th round of local model training _i A threshold value of (d);

(4.3) if

if it is

repeating the above steps to update the whole matrix [ (wcf-matrix) _i,t ]The matrix [ (wcf-matrix) _i,t ]Is composed of

(4.4) according to the matrix of each client [ (wcf-matrix) _i,t ]Calculating the frequency sparsity epsilon of each client _i,t And obtaining a frequency sparsity set { epsilon _1,t ,ε _2,t ,…,ε _i,t ,…,ε _k,t }, frequency sparsity ε _i,t The calculation formula of (a) is as follows:

(5) From a set of frequency sparsities { epsilon _1,t ,ε _2,t ,…,ε _i,t ,…,ε _k,t Performing K-Means clustering operation, and collecting frequency sparsity { epsilon } _1,t ,ε _2,t ,…,ε _i,t ,…,ε _k,t Dividing the frequency sparsity into K clusters; and the frequency sparsity epsilon in the same cluster _i,t The represented clients are divided into the same cluster.

The step (5) comprises the following substeps:

(5.1) from the set of frequency sparsity [. Epsilon. ] _1,t ,ε _2,t ,…,ε _i,t ,…,ε _k,t Performing K-Means clustering operation, setting the number of clusters as K, wherein the K-Means clustering operation specifically comprises the following steps:

(5.2) frequency sparsity ε to be in the same cluster _i,t The represented clients are divided into the same cluster.

(6) Averaging local model parameters uploaded by clients in the same cluster, and issuing the average value as a new global model matrix parameter to the clients in the same cluster;

(7) And (5) repeating the step (4) to the step (6) until the global model is converged, and finishing the training of the personalized federal learning model.

The method is applied to an image classification task for testing, training of a Convolutional Neural network (CN) is respectively carried out on MNIST and ImageNet, and an MNIST data set comprises 70000 handwritten digital images with gray scales; the ImageNet dataset contains 1000 classes of RGB color images, and the CNNs in the experiment are LeNet and VGG16 models, respectively, and a cross-entropy loss function is used. The client-side local updating uses a mini-batch random gradient descent (mini-batch SGD) training method, and data in the client-side is distributed independently on data distribution. In this simulation experiment, for the MNIST dataset, the batch size thereof is set to 50, and the learning rate is 0.01, e =100, k =50, n =50, k =4. For the ImageNet dataset, its batch size was set to 64, and the learning rate was 0.001, e =100, k =50, n =50, k =4.

The model precision of the personalized Federal learning model obtained through training and the success rate of gradient reversal attack facing privacy leakage are tested through experiments, and the usability of the method is tested. Experiments were performed on both data sets to demonstrate the effectiveness of privacy protection and the utility of the model.

Table 1 shows the utility results of the global models of different clusters, and clusters each client into different clusters and performs global model aggregation, respectively, so as to achieve personalization, and the success rate is 0% in the face of gradient reversal attack, and the privacy protection effect exists.

Table 1: model accuracy for personalized federal learning models

Corresponding to the embodiment of the personalized federal learning method based on privacy protection, the invention also provides an embodiment of a personalized federal learning device based on privacy protection.

Referring to fig. 3, an individualized federal learning apparatus based on privacy protection provided in an embodiment of the present invention includes one or more processors, and is configured to implement the individualized federal learning method based on privacy protection in the foregoing embodiment.

The embodiment of the personalized federal learning device based on privacy protection can be applied to any equipment with data processing capability, such as computers and other equipment or devices. The device embodiments may be implemented by software, or by hardware, or by a combination of hardware and software. The software implementation is taken as an example, and as a logical device, the device is formed by reading corresponding computer program instructions in the nonvolatile memory into the memory for running through the processor of any device with data processing capability. In terms of hardware, as shown in fig. 3, a hardware structure diagram of any device with data processing capability where the privacy protection-based personalized federal learning apparatus of the present invention is located is shown, except for the processor, the memory, the network interface, and the nonvolatile memory shown in fig. 3, in the embodiment, any device with data processing capability where the apparatus is located may also include other hardware generally according to the actual function of the any device with data processing capability, which is not described again.

The implementation process of the functions and actions of each unit in the above device is specifically described in the implementation process of the corresponding step in the above method, and is not described herein again.

For the device embodiment, since it basically corresponds to the method embodiment, reference may be made to the partial description of the method embodiment for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the invention. One of ordinary skill in the art can understand and implement without inventive effort.

An embodiment of the present invention further provides a computer-readable storage medium, on which a program is stored, where the program, when executed by a processor, implements the personalized federal learning method based on privacy protection in the foregoing embodiments.

The computer readable storage medium may be an internal storage unit, such as a hard disk or a memory, of any data processing capability device described in any of the foregoing embodiments. The computer readable storage medium can be any device with data processing capability, such as a plug-in hard disk, a Smart Media Card (SMC), an SD Card, a Flash memory Card (Flash Card), etc. provided on the device. Further, the computer readable storage medium may include both an internal storage unit and an external storage device of any data processing capable device. The computer-readable storage medium is used for storing the computer program and other programs and data required by the arbitrary data processing capable device, and may also be used for temporarily storing data that has been output or is to be output.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims

1. A personalized federal learning method based on privacy protection is characterized by comprising the following steps:

(1) Initializing a federal learning training environment;

2. The personalized federal learning method based on privacy protection as claimed in claim 1, wherein the step (1) is specifically: and setting an overall training round E, local data D and an overall client number k participating in federal learning.

3. The personalized federal learning method as claimed in claim 2, wherein the step (2) specifically comprises the following sub-steps:

(2.1) the Server is at the cloud for each client p _i Setting up corresponding personalized model N _i And initializing each personalized model N _i Obtaining initialized global model matrix parameters

The initialized global model matrix parameters

The matrix size of (a) is W × H;

4. A personalized federal learning method as claimed in claim 3, wherein the step (3) specifically comprises the following sub-steps:

(3.1) client p participating in training _i And (3) not carrying out data sharing, and carrying out local model training on the issued global model weight locally:

And uploaded to the corresponding personalized model N _i ；

representing local model matrix parameters

Subparameters of the nth row and the vth column;

representing initialized global model matrix parameters

Subparameters of the nth row and the vth column;

(3.3) if

Then update the matrix [ (wcf-matrix) _i,t ]Sub-parameters of the u-th row and the v-th column:

if it is

repeating the above steps to update the whole matrix [ (wcf-matrix) _i,t ]；

(3.4) by the matrix of each client [ (wcf-matrix) _i,t ]Calculating the frequency sparsity epsilon of each client _i,t And obtaining a frequency sparsity set { epsilon _1,t ,ε _2,t ,…,ε _i,t ,…,ε _k,t Frequency sparsity ε _i,t The calculation formula of (a) is as follows:

5. the personalized federal learning method as claimed in claim 4, wherein the step (4) specifically comprises the following sub-steps:

6. A personalized federal learning apparatus based on privacy protection, comprising one or more processors for implementing the personalized federal learning method based on privacy protection of any one of claims 1 to 5.

7. A computer readable storage medium having a program stored thereon, wherein the program, when executed by a processor, is adapted to implement the personalized federal learning methodology for privacy protection based on any of claims 1-5.