CN115495771A - Data privacy protection method and system based on self-adaptive adjustment weight - Google Patents


Info

Publication number
CN115495771A
Authority
CN
China
Prior art keywords
model
local
data
sample
global
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210798075.3A
Other languages
Chinese (zh)
Inventor
陈益强
何雨婷
杨晓东
于汉超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS
Priority to CN202210798075.3A
Publication of CN115495771A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Human Resources & Organizations (AREA)
  • Multimedia (AREA)
  • Bioethics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Game Theory and Decision Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Computer Security & Cryptography (AREA)
  • Molecular Biology (AREA)
  • Computer Hardware Design (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a data privacy protection method and system based on adaptive weight adjustment. It addresses the degraded model performance and slower convergence caused by non-independent and identically distributed (Non-IID) data, and belongs to the technical field of federated learning applications. The method comprises the following steps: at the start of each round of federated communication, the server evaluates the class-level credibility of the global model using an auxiliary data set, and sends the credibility matrix and the global model parameters to the clients participating in that round; each client evaluates the sample-level credibility of the global model on its local private data set, weights the knowledge distillation by the class credibility and the sample credibility to dynamically guide the training of its local model, and uploads the updated local model parameters to the server; and the server performs weighted aggregation of the local model parameters to update the global model.

Description

Data privacy protection method and system based on self-adaptive adjustment weight
Technical Field
The invention relates to the technical field of federated learning and data security, and in particular to a federated learning method and system based on client-side selective knowledge distillation.
Background
Conventional machine learning techniques have been successfully applied in computer vision, natural language processing, recommendation systems, and automatic control. As artificial intelligence spreads across industries, concern about user privacy and data security keeps growing. Countries continue to strengthen the protection of data security and privacy: for example, the European Union's General Data Protection Regulation (GDPR) formally took effect in 2018, and China passed the Personal Information Protection Law of the People's Republic of China in 2021. Because of the privacy restrictions imposed by these laws and regulations, data in fields such as medicine, enterprise, and the military is scattered across isolated data islands. Federated Learning (FL), which has emerged in recent years, enables secure sharing of multi-party data by transmitting model parameters instead of raw data. On the one hand, user privacy and data security are well protected because the data never leaves the local site; on the other hand, joint training can make full use of the local private data of every client, which solves the data island problem.
The environments of the participating clients (different users, devices, organizations, and so on) are inherently heterogeneous, so the data in federated learning is non-independent and identically distributed (Non-IID). Non-IID data is a pressing frontier problem in federated learning, and label distribution shift is particularly significant in practical federated learning scenarios. Data heterogeneity can cause a client's local training to deviate severely from the global objective, so that the updates diverge (weight divergence). One of the research challenges of Non-IID federated learning therefore lies in constraining the direction of model updates during each client's local training, so that the model learns from local private data while preserving the knowledge of the global model. One prior approach adds a correction term to the local loss function so that the local update does not deviate too far from the global model; the correction term is the L2 distance between the local model and the global model of the previous round. Another approach builds on the observation that a global model trained on the complete data set learns better representations than a local model trained on a skewed subset: it adds a contrastive learning loss term to the local loss function, which reduces the distance between the representation learned by the local model and that learned by the global model, and increases the distance between the representation learned by the local model and that learned by the previous local model. A third approach adopts Elastic Weight Consolidation (EWC) to mitigate catastrophic forgetting in federated learning, adding a penalty term to the local loss function that discourages changes to the local model parameters that are important for the global task.
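The correction-term idea described above can be illustrated with a minimal sketch. It assumes the squared L2 distance scaled by `mu / 2`, as in FedProx; the names `proximal_term`, `corrected_loss`, and the hyperparameter `mu` are hypothetical and do not appear in the source.

```python
from typing import Sequence


def proximal_term(local_w: Sequence[float],
                  global_w: Sequence[float],
                  mu: float) -> float:
    """Correction term (mu / 2) * ||w_local - w_global||^2 (assumed FedProx-style form)."""
    squared_distance = sum((l - g) ** 2 for l, g in zip(local_w, global_w))
    return 0.5 * mu * squared_distance


def corrected_loss(task_loss: float,
                   local_w: Sequence[float],
                   global_w: Sequence[float],
                   mu: float) -> float:
    """Local objective: task loss plus a correction term with a fixed weight mu."""
    return task_loss + proximal_term(local_w, global_w, mu)


# Toy values: the fixed weight mu is what such methods must tune by hand.
penalty = proximal_term([1.0, 2.0], [0.0, 0.0], mu=0.1)
total = corrected_loss(0.7, [1.0, 2.0], [0.0, 0.0], mu=0.1)
```

In this sketch `mu` stays constant for every class, sample, and round, which is the limitation the adaptive weighting in this document is meant to address.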
The above methods all use the global model to constrain the update direction of the client's local model, so that the updated local model does not differ too much from the global model. However, they share two drawbacks. First, they do not adaptively adjust the relative weight of the correction term and the task loss term in the local objective function. If the weight of the correction term is too large, the current round cannot learn new knowledge; if it is too small, the optimization direction of the current round deviates from the global objective. The weight must therefore be tuned very carefully to reach the ideal model, and this trial-and-error tuning costs a great deal of time and effort. Second, they do not consider that a poorly performing global model can mislead the local model into updating in the wrong direction, especially for clients that did not participate in the previous round. Because the set of participating clients changes dynamically from round to round and client data is Non-IID, the aggregated global model has different representation capabilities for different classes. At the beginning of federated learning, the global model has not yet learned a good representation, so local training should focus on learning from local private data rather than on retaining global knowledge. In the middle and later stages of federated communication, the global model performs better than the local model on certain classes, and the local model should selectively retain global knowledge at the class level and the sample level.
Most existing federated learning algorithms are based on the classical Federated Averaging (FedAvg) algorithm, which uses a traditional client-server (C-S) architecture to split distributed training into multiple iterations of client-side local training and server-side parameter aggregation. As shown in Fig. 1, during local training each client downloads the model from the server and then trains it for multiple epochs on its local private data set; during parameter aggregation the server receives the updated model parameters from the clients and averages them, weighting each client by its total sample size. Assume there are N clients in total, whose local private data sets are $D = \{D_1, D_2, \dots, D_N\}$. The objective function $L$ of the global model $w$ is the weighted average of the objective functions $L_i$ of the clients' local models:
$$L(w) = \sum_{i=1}^{N} q_i L_i(w), \qquad q_i = \frac{|D_i|}{\sum_{j=1}^{N} |D_j|}$$
where $q_i$ is the aggregation weight of client $i$ and $|D_i|$ is the total sample size of the local private data set of client $i$.
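The sample-size-weighted averaging described above can be sketched as follows. This is a minimal illustration, assuming each model is represented as a dictionary of flat parameter lists; the function name `fedavg_aggregate` and this representation are illustrative choices, not part of the source.

```python
from typing import Dict, List


def fedavg_aggregate(client_params: List[Dict[str, List[float]]],
                     client_sizes: List[int]) -> Dict[str, List[float]]:
    """Average client parameters with weights q_i = |D_i| / sum_j |D_j|."""
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("client_sizes must sum to a positive number")
    weights = [n / total for n in client_sizes]

    aggregated: Dict[str, List[float]] = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        aggregated[name] = [
            sum(q * params[name][j] for q, params in zip(weights, client_params))
            for j in range(length)
        ]
    return aggregated


# Two toy clients holding 100 and 300 samples, so q = (0.25, 0.75).
clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 6.0]}]
sizes = [100, 300]
global_params = fedavg_aggregate(clients, sizes)
```

The client with more samples pulls the aggregate further toward its own parameters, which is why skewed local data can bias the global model.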
In practical federated learning scenarios, the local data of the clients is usually non-independent and identically distributed (Non-IID); in particular, the label distribution P(y) of the data may differ between clients. On the one hand, when each client trains on its own skewed local data set, the directions in which the models fit the samples are inconsistent, so each local model deviates from the global objective and suffers "catastrophic forgetting" of global knowledge, which in turn makes the aggregated global model deviate too far from the ideal model. At the same time, the global model's ability to extract features differs between classes: it is strong for the majority classes in the local private data of the currently selected clients, so the credibility of the global model's output logit (the input to the softmax activation function that produces the network's prediction) is higher on the majority-class channels than on the minority-class channels. On the other hand, the output logit of the global model also depends on the specific training sample. When a training sample belongs to one of these minority classes, or is a majority-class sample with unusual features, the global model cannot extract its features effectively, so the credibility of the output logit is low on every class channel.
In summary, how to selectively retain global knowledge and adaptively adjust the local update direction during local training, thereby improving the generalization and convergence speed of the model, is the focus of this work.
Disclosure of Invention
The invention aims to overcome the defects of existing federated learning methods when facing Non-IID data, namely that they cannot adaptively adjust the update direction of the local model and do not consider changes in global model performance, performance differences between classes, or performance differences between samples. To this end it provides a data privacy protection method based on adaptive weight adjustment, comprising:
step 1, inputting an auxiliary data set annotated with class labels into a global model to obtain the classification accuracy of the global model on each class of data, and using it as a class credibility matrix;
step 2, at least one client acquires the global model from a cloud and initializes its local model with it, the client holding a local private data set whose samples carry class labels; inputting the local private data set into the local model to obtain the output of the local model and the classification loss; inputting the local private data set into the global model held locally by the client to obtain the output of the global model and the classification accuracy on each sample of the local private data, the latter serving as a sample credibility matrix; obtaining the distillation loss from the sample credibility matrix and the class credibility matrix; training the local model with a total loss consisting of the classification loss and the distillation loss; and uploading the trained parameters of the local model to the cloud;
step 3, the cloud performs weighted aggregation of the received model parameters to obtain a new model.
In the data privacy protection method based on adaptive weight adjustment, step 3 further comprises: replacing the global model with the new model and repeating steps 1 to 3 until the total loss converges or a preset number of iterations is reached, which completes the training and updating of the global model; the current global model is then saved as the final model and used to classify or predict specified data.
In the data privacy protection method based on adaptive weight adjustment, the distillation loss is
$$L_{SSD,i} = \mathbb{E}_{x \in D_i}\left[\left\lVert M(x) \odot \left(z_g - z\right)\right\rVert^2\right]$$
where $M(x)$ is a class-related weight vector, $\odot$ denotes element-wise multiplication, $z_g$ is the output of the global model, and $z$ is the output of the local model;
for a sample $x$ belonging to class $k_2$, the value at position $k_1$ of the weight vector $M$ is
$$M(x)[k_1] = M_{max} \cdot \left[M_{class}[k_1]\, M_{sample}(x) - 0.1\right]_+$$
$$M_{class}[k_1] = \frac{A_{k_1,k_1}}{\sum_{k} A_{k,k_1}}$$
$$M_{sample}(x) = 1 - \left(1 - p_g(x)[k_2]\right)^{0.5}$$
$$[\cdot]_+ = \max(\cdot, 0)$$
where $A_{k_1,k_1}$ is the recall of class $k_1$, $A_{k,k_1}$ is the probability that class $k$ is mispredicted as class $k_1$, $p_g(x)[k_2]$ is the probability with which the global model correctly predicts that sample $x$ belongs to class $k_2$, and $M_{max}$ is the upper limit of the weight of the distillation loss term.
In the data privacy protection method based on adaptive weight adjustment, the client is a data center of a medical institution or of a financial institution, and the new model is used to classify input images or to predict the risk of transaction data.
The invention also provides a data privacy protection system based on adaptive weight adjustment, comprising:
a cloud, configured to input the auxiliary data set annotated with class labels into the global model to obtain the classification accuracy of the global model on each class of data as a class credibility matrix, and to perform weighted aggregation of the received model parameters to obtain a new model;
a client, configured to acquire the global model from the cloud and initialize its local model with it, the client holding a local private data set whose samples carry class labels; to input the local private data set into the local model to obtain the output of the local model and the classification loss; to input the local private data set into the global model held locally by the client to obtain the output of the global model and the classification accuracy on each sample of the local private data, the latter serving as a sample credibility matrix; to obtain the distillation loss from the sample credibility matrix and the class credibility matrix; to train the local model with a total loss consisting of the classification loss and the distillation loss; and to upload the trained parameters of the local model to the cloud.
In the data privacy protection system based on adaptive weight adjustment, the cloud is further configured to replace the global model with the new model, save the current global model as the final model, and classify or predict specified data.
In the data privacy protection system based on adaptive weight adjustment, the distillation loss is
$$L_{SSD,i} = \mathbb{E}_{x \in D_i}\left[\left\lVert M(x) \odot \left(z_g - z\right)\right\rVert^2\right]$$
where $M(x)$ is a class-related weight vector, $\odot$ denotes element-wise multiplication, $z_g$ is the output of the global model, and $z$ is the output of the local model;
for a sample $x$ belonging to class $k_2$, the value at position $k_1$ of the weight vector $M$ is
$$M(x)[k_1] = M_{max} \cdot \left[M_{class}[k_1]\, M_{sample}(x) - 0.1\right]_+$$
$$M_{class}[k_1] = \frac{A_{k_1,k_1}}{\sum_{k} A_{k,k_1}}$$
$$M_{sample}(x) = 1 - \left(1 - p_g(x)[k_2]\right)^{0.5}$$
$$[\cdot]_+ = \max(\cdot, 0)$$
where $A_{k_1,k_1}$ is the recall of class $k_1$, $A_{k,k_1}$ is the probability that class $k$ is mispredicted as class $k_1$, $p_g(x)[k_2]$ is the probability with which the global model correctly predicts that sample $x$ belongs to class $k_2$, and $M_{max}$ is the upper limit of the weight of the distillation loss term.
In the data privacy protection system based on adaptive weight adjustment, the client is a data center of a medical institution or of a financial institution, and the new model is used to classify input images or to predict the risk of transaction data.
The invention also provides a storage medium storing a program for executing the data privacy protection method based on adaptive weight adjustment.
The invention also provides a client for use in any of the above data privacy protection systems based on adaptive weight adjustment.
As the above scheme shows, the invention has the following advantages:
The method introduces a client-side selective knowledge distillation strategy into the local training process of federated learning. The credibility of the global model is evaluated at the class level and the sample level, and global knowledge is selectively distilled into the local model according to that credibility. Each client's local model can therefore learn from its local private data without deviating from the global model, which reduces the number of federated communication rounds, improves model performance, and accelerates model convergence.
Drawings
FIG. 1 is a flow chart of federated learning in the prior art;
FIG. 2 is a flow chart of the federated learning of the present invention;
FIG. 3 is a schematic diagram of data distribution of clients;
FIG. 4 is a graph of comparative results on a reference data set;
FIG. 5 is a diagram of a confusion matrix for a global model over a test set.
Detailed Description
Before describing embodiments of the present invention in detail, some of the terms used therein will be explained as follows:
A client is a node that uses the services provided by the server. The clients may be a large number of mobile or Internet of Things devices, or different organizations (e.g., government departments, medical institutions, financial institutions, geographically distributed data centers, etc.); the local private data is stored on the clients. The client in the embodiments of the present invention is not limited to any application scenario.
The server (cloud) is a node that provides services to the clients. The central server is connected to every client and coordinates the clients to build a model jointly without leaking their raw data.
The deep learning model is a deep neural network formed by connecting a plurality of processing units. The model updated on the server is called the global model, and the model updated on a client is called a local model.
The auxiliary data set is an open data set used to assist federated training. It may consist of a public data set related to the training task or of a small amount of local data published by each client.
In view of the limitations and challenges of prior approaches, the present invention proposes a federated learning method based on client-side selective knowledge distillation. Its key idea is to selectively distill the knowledge of the global model into the local model, adaptively adjusting the weight of the distillation loss during local training according to the class-level and sample-level credibility of the global model, thereby improving model performance and accelerating convergence. Because M_class and M_sample are recomputed in every federated round, and the distillation loss weight M is determined by both, the method is called adaptive weight adjustment.
The invention comprises the following key technical points:
Key point 1: a federated learning method based on client-side knowledge distillation is introduced to address catastrophic forgetting during local training. The technical effect is that the knowledge of the global model is preserved during local updates.
Key point 2: a client-side selective knowledge distillation strategy is introduced to address the global model's performance differences between classes and between samples. The class-level credibility of the global model is evaluated on the server using the auxiliary data set, the sample-level credibility of the global model is evaluated locally on each sample of the client's private data, and the knowledge distillation is weighted by the class credibility and the sample credibility. The training of the local model is thus dynamically guided by the strategy of "following what is good and correcting what is not". The technical effect is that the local update process is adjusted adaptively.
In order to make the aforementioned features and effects of the present invention more comprehensible, embodiments accompanied with figures are described in detail below.
To address the above problems, the invention provides a federated learning method based on client-side selective self-distillation (Federated Selective Self-Distillation, FedSSD). It treats the global model as the teacher and the local model as the student, and selectively learns global representations from the global model, so that the local update of the model parameters is corrected adaptively and new knowledge is learned from local private data without forgetting global knowledge.
According to one embodiment of the invention, the auxiliary data set $D_V$ is constructed either from a small portion of data that the clients share with the server or from a public data set. For example, a breast cancer histological image classification task may use a small portion of the public Camelyon17 data set as the auxiliary data set. In the $t$-th round of federated communication, the server first uses the auxiliary data set $D_V$ to evaluate the class-level credibility of the global model, represented by the confusion matrix $A^t \in \mathbb{R}^{K \times K}$ obtained from the accuracy of the global model $w^t$ on $D_V$, where $K$ is the number of classes and $A_{k_1,k_2}$ is the probability that the global model predicts class $k_1$ as class $k_2$. The credibility matrix $A^t$ and the global model $w^t$ are then sent to the clients participating in this round. A client may be a mobile phone terminal or an organization such as a hospital, a research institute, or a company.
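The class-level credibility matrix described above can be sketched as a row-normalized confusion matrix computed from the global model's predictions on the auxiliary data set. The function name `class_credibility_matrix` and the handling of classes that are absent from the auxiliary set are assumptions made for illustration.

```python
from typing import List, Sequence


def class_credibility_matrix(y_true: Sequence[int],
                             y_pred: Sequence[int],
                             num_classes: int) -> List[List[float]]:
    """Row-normalized confusion matrix: A[k1][k2] = P(predicted k2 | true class k1)."""
    counts = [[0] * num_classes for _ in range(num_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        counts[true_label][predicted_label] += 1

    matrix: List[List[float]] = []
    for row in counts:
        row_total = sum(row)
        # Assumption: a class with no auxiliary samples gets an all-zero row.
        matrix.append([c / row_total if row_total else 0.0 for c in row])
    return matrix


# Toy auxiliary set with two classes; the predictions stand in for the global model's output.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
predictions = [0, 0, 0, 1, 1, 1, 0, 0]
A = class_credibility_matrix(labels, predictions, num_classes=2)
```

With this normalization each diagonal entry `A[k][k]` is the recall of class `k`, matching the way the matrix entries are used in the weight definitions.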
According to one embodiment of the invention, client $i$ receives the credibility matrix $A^t$ and the global model $w^t$ sent by the server. It first initializes its local model $w_i^t$ with the global model $w^t$, and then optimizes its local objective function $L_i$ on the local private data $D_i$ using the stochastic gradient descent (SGD) algorithm. To prevent local training from causing catastrophic forgetting of the global model, both the classification loss and the distillation loss are optimized: $L_i = L_{CE,i} + L_{SSD,i}$.
According to one embodiment of the invention, for each input sample $x$, the output logit of the global model is denoted $z_g$ and the output logit of the local model is denoted $z$. The predicted probabilities after the softmax layer are $p_g = \mathrm{softmax}(z_g)$ and $p = \mathrm{softmax}(z)$. The classification loss $L_{CE,i}$ is generally the cross-entropy loss:
$$L_{CE,i} = -\mathbb{E}_{(x,y) \in D_i}\left[\log p(x)[y]\right]$$
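As a small illustration of the quantities just defined, the sketch below converts a logit vector into softmax probabilities and evaluates the per-sample cross-entropy. The function names are illustrative, and the max-subtraction inside `softmax` is a standard numerical-stability step that the source does not mention.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert logits z into probabilities p; subtracting the max avoids overflow."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Per-sample classification loss: -log p[label]."""
    return -math.log(softmax(logits)[label])


# Toy local-model logits for a three-class problem, true class 0.
z = [2.0, 0.0, 0.0]
p = softmax(z)
loss = cross_entropy(z, label=0)
```

Averaging `cross_entropy` over the samples of a client's local data set gives the classification term of the local objective.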
The distillation loss is built from the global model's output logit $z_g$ ($p_g$ after softmax), the local model's output logit $z$, and a weight $M$ composed of $M_{sample}$ and $M_{class}$. The output logit of the global model on the local data represents the global knowledge and can be regarded as an absolute estimate of how strongly the sample belongs to each class; after the softmax layer, the logit becomes a relative estimate across classes. To decouple the global model's prediction capability for each class, a weighted mean squared error (MSE) is used to align the logit vectors output by the local model and the global model, instead of using the Kullback-Leibler (KL) divergence to align their predicted distributions. The selective distillation loss is therefore defined as
$$L_{SSD,i} = \mathbb{E}_{x \in D_i}\left[\left\lVert M(x) \odot \left(z_g - z\right)\right\rVert^2\right]$$
where $M(x)$ is a class-related weight vector, $\mathbb{E}$ denotes the expectation, and $\odot$ denotes element-wise multiplication. Suppose that sample $x$ belongs to class $k_2$. The value at position $k_1$ of the vector $M$ is
$$M(x)[k_1] = M_{max} \cdot \left[M_{class}[k_1]\, M_{sample}(x) - 0.1\right]_+$$
$$M_{class}[k_1] = \frac{A_{k_1,k_1}}{\sum_{k} A_{k,k_1}}$$
$$M_{sample}(x) = 1 - \left(1 - p_g(x)[k_2]\right)^{0.5}$$
where $[\cdot]_+ = \max(\cdot, 0)$ guarantees that each element of $M$ is greater than or equal to 0. $A_{k_1,k_1}$ is the recall of class $k_1$, and $A_{k,k_1}$ is the probability that class $k$ is mispredicted as class $k_1$. $p_g(x)[k_2]$ is the probability with which the global model correctly predicts that sample $x$ belongs to class $k_2$. $M_{max}$ determines the upper limit of the weight of the distillation loss term.
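The weighting scheme can be sketched for a single sample as follows. This is a sketch under stated assumptions: the class weight is taken as the diagonal entry of the confusion matrix divided by its column sum, and the per-sample loss as the squared norm of the weighted logit difference. Both forms are inferred from the symbol definitions in the text rather than confirmed by it, and all function names are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def class_weight(A: Sequence[Sequence[float]], k1: int) -> float:
    """Assumed form: M_class[k1] = A[k1][k1] / sum_k A[k][k1]."""
    column_sum = sum(row[k1] for row in A)
    return A[k1][k1] / column_sum if column_sum > 0 else 0.0


def sample_weight(p_g: Sequence[float], k2: int) -> float:
    """M_sample(x) = 1 - (1 - p_g(x)[k2]) ** 0.5 for a sample of true class k2."""
    return 1.0 - (1.0 - p_g[k2]) ** 0.5


def distill_weights(A, z_g: Sequence[float], label: int, m_max: float = 1.0) -> List[float]:
    """M(x)[k1] = M_max * max(M_class[k1] * M_sample(x) - 0.1, 0)."""
    m_sample = sample_weight(softmax(z_g), label)
    return [m_max * max(class_weight(A, k1) * m_sample - 0.1, 0.0)
            for k1 in range(len(z_g))]


def ssd_loss(A, z_g: Sequence[float], z: Sequence[float], label: int,
             m_max: float = 1.0) -> float:
    """Per-sample weighted squared error between global and local logits (assumed norm)."""
    weights = distill_weights(A, z_g, label, m_max)
    return sum((m * (g - l)) ** 2 for m, g, l in zip(weights, z_g, z))


A = [[0.9, 0.1], [0.4, 0.6]]                               # toy credibility matrix
confident = distill_weights(A, [2.0, 0.0], label=0)        # global model is right
misleading = distill_weights(A, [-3.0, 3.0], label=0)      # global model is wrong
loss = ssd_loss(A, [2.0, 0.0], [0.5, 0.5], label=0)
```

When the global model assigns almost no probability to the true class, the thresholding drives every weight to zero, so that sample contributes no distillation loss and the local model learns from its label alone.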
The framework of the overall process is shown in Fig. 2. The process can be summarized as follows:
1. The server initializes the global model $w^0$ and computes its class-level credibility on the auxiliary data set;
2. The server randomly selects $C \times N$ clients $S_t$ to participate in the current round of federated training, and sends the global model parameters $w^t$ and the credibility matrix $A^t$ to these clients;
3. Each selected client computes the sample-level credibility of the global model $w^t$ on its local private data set $D_i$, then trains and updates its local model $w_i^t$ by minimizing $L_i = L_{CE,i} + L_{SSD,i}$;
4. Each client uploads its updated model parameters $w_i^t$ to the server;
5. The server performs weighted aggregation of the received model parameters, $w^{t+1} = \sum_{i \in S_t} \frac{|D_i|}{\sum_{j \in S_t} |D_j|} w_i^t$, and computes the credibility matrix $A^{t+1}$ of the updated global model $w^{t+1}$ on the auxiliary data set.
Steps 2 to 5 are repeated until the model converges.
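The control flow of the steps above can be sketched as a short loop. It assumes the model is a flat list of floats and that local training and credibility evaluation are supplied as callbacks; `run_rounds`, `local_update`, `evaluate_credibility`, and the toy callbacks are hypothetical names, and the toy update rule only stands in for real local training.

```python
import random
from typing import Callable, Dict, List


def run_rounds(global_w: List[float],
               client_sizes: Dict[int, int],
               local_update: Callable[[int, List[float], object], List[float]],
               evaluate_credibility: Callable[[List[float]], object],
               fraction: float,
               rounds: int,
               seed: int = 0) -> List[float]:
    rng = random.Random(seed)
    client_ids = sorted(client_sizes)
    credibility = evaluate_credibility(global_w)            # step 1
    for _ in range(rounds):
        # Step 2: sample C x N clients and send them the model and credibility matrix.
        num_selected = max(1, round(fraction * len(client_ids)))
        selected = rng.sample(client_ids, num_selected)
        # Steps 3 and 4: each client trains from a copy of the global model and uploads.
        updates = {i: local_update(i, list(global_w), credibility) for i in selected}
        # Step 5: aggregate with sample-size weights, then re-evaluate credibility.
        total = sum(client_sizes[i] for i in selected)
        global_w = [
            sum(client_sizes[i] / total * updates[i][j] for i in selected)
            for j in range(len(global_w))
        ]
        credibility = evaluate_credibility(global_w)
    return global_w


# Toy callbacks: each client pulls the weights halfway toward its own target.
targets = {0: [0.0], 1: [4.0]}


def toy_local_update(client_id: int, w: List[float], credibility: object) -> List[float]:
    return [v + 0.5 * (t - v) for v, t in zip(w, targets[client_id])]


def toy_credibility(w: List[float]) -> object:
    return None  # placeholder for the credibility matrix computed on the auxiliary set


final_w = run_rounds([0.0], {0: 50, 1: 50}, toy_local_update, toy_credibility,
                     fraction=1.0, rounds=20)
```

A fixed number of rounds replaces the convergence check here to keep the sketch short; a real run would stop once the loss or accuracy stabilizes.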
According to an embodiment of the invention, in practical application scenarios the final goal is to obtain a global model that generalizes well, which each client can download from the server for local inference. For example, in open-government applications, the method can break down the data islands between government departments and enable the secure sharing of cross-department and social data; in biomedical applications, the method can effectively bring together multiple hospitals for tasks such as disease prediction, medical image recognition, drug discovery, and gene sequencing; in financial applications, the method can train a joint credit risk control model without leaking user information, enabling more precise financial risk control.
According to one embodiment of the present invention, a Dirichlet distribution ($P_k \sim Dir(\delta)$) is used to construct Non-IID data distribution scenarios, where $P_{k,i}$ denotes the proportion of the class-$k$ samples that is owned by client $i$. $\delta$ is a hyper-parameter that controls the degree of heterogeneity among clients: the smaller its value, the more uneven the data distribution across clients and the greater the heterogeneity. The effectiveness of the method is verified on three public data sets: CIFAR10, CIFAR100 and TinyImageNet. $\delta = 0.5$ is the default value, and an example of the resulting data distribution of each client is shown in Fig. 3, where the abscissa is the client ID (10 clients by default), the ordinate is the class ID (the three data sets have 10, 100 and 200 classes, respectively), each rectangle represents the number of samples of that class owned by the client, and a darker color indicates a larger number of samples. In addition, the data partitioning method of the FedAvg algorithm is also adopted, in which K classes are randomly assigned to each client; this setting is denoted here as #K = K. For CIFAR10, a model consistent with FedAvg is used, namely a simple CNN (two convolutional layers and two fully connected layers); ResNet50 is used for CIFAR100 and TinyImageNet.
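A Dirichlet-based Non-IID split of this kind can be sketched as below. The sketch assumes the common construction in which, for every class, a proportion vector over clients is drawn from a symmetric Dirichlet distribution with concentration δ and the shuffled sample indices of that class are cut accordingly; the function name `dirichlet_partition` and the toy label array are hypothetical.

```python
import numpy as np

def dirichlet_partition(labels, num_clients, delta, rng):
    """Split sample indices across clients, class by class, with P_k ~ Dir(delta)."""
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for k in np.unique(labels):
        idx = np.flatnonzero(labels == k)
        rng.shuffle(idx)
        # proportions[i] plays the role of P_{k,i}: client i's share of class k.
        proportions = rng.dirichlet(np.full(num_clients, delta))
        cut_points = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client_id, part in enumerate(np.split(idx, cut_points)):
            client_indices[client_id].extend(part.tolist())
    return client_indices

rng = np.random.default_rng(0)
labels = np.repeat(np.arange(10), 100)  # toy data: 10 classes, 100 samples each
parts = dirichlet_partition(labels, num_clients=10, delta=0.5, rng=rng)
sizes = [len(p) for p in parts]
```

Lowering `delta` concentrates each class on fewer clients, which matches the statement that smaller values give a more uneven distribution.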
The method is compared with five related works: the baseline method FedAvg and four similar methods that add a regularization term to the local objective function, namely FedProx ("Federated Optimization in Heterogeneous Networks" by Li et al.), FedCurv ("Overcoming Forgetting in Federated Learning on Non-IID Data" by Shoham et al.), MOON ("Model-Contrastive Federated Learning" by Li et al.) and SCAFFOLD ("SCAFFOLD: Stochastic Controlled Averaging for Federated Learning" by Karimireddy et al.). The comparison results on the benchmark data sets are shown in Fig. 4, where the first row shows the accuracy curves of the global model on the test set and the second row shows the average accuracy curves of the clients' local models on the same test set. It can be seen that the proposed method, FedSSD, outperforms the other methods on CIFAR10 and TinyImageNet; on CIFAR100 its global test accuracy is slightly lower than that of MOON, but its average local accuracy is higher. On the other hand, in the first few rounds of federated communication the average local accuracy of FedSSD differs little from that of the baseline method, because the global model has not yet learned good feature representations. Compared with the other methods, FedSSD substantially improves the average local accuracy on all three data sets, which shows that FedSSD can effectively retain global knowledge while learning from local private data. In addition to the above comparative experiments, the influence of the degree of data heterogeneity on FedSSD is also analyzed, and the global test accuracy is shown in Table 1. It can be seen that FedSSD performs well under data distributions with different degrees of heterogeneity.
Table 1
Figure BDA0003732923840000091
Figure BDA0003732923840000101
To further illustrate the effectiveness of the present invention, the confusion matrix on the test set of the global model issued in a certain round is visualized in Fig. 5(b), and a client is randomly selected whose data distribution is shown in Fig. 5(a). The confusion matrices on the test set of the models obtained after training on the local private data with the baseline method FedAvg and with the FedSSD method proposed by the present invention are shown in Fig. 5(c). After the FedAvg local update, the model performs poorly on classes 2, 6 and 7, which indicates that the global model's knowledge of these classes is forgotten during local updating, whereas FedSSD effectively retains this knowledge and learns knowledge of classes 3, 4, 5 and 8 from the local private data.
The following is a system embodiment corresponding to the above method embodiment, and the two can be implemented in cooperation with each other. The technical details mentioned in the above embodiment remain valid in this embodiment and, to avoid repetition, are not described again here. Likewise, the technical details mentioned in this embodiment also apply to the above embodiment.
The invention also provides a data privacy protection system based on adaptive adjustment weight, comprising:
the cloud end is used for inputting the auxiliary data set marked with the category label into the global model to obtain the classification precision of the global model on each category of data as a category credibility matrix; obtaining a new model by performing weighted aggregation on the received model parameters;
the client side is used for acquiring the global model from the cloud side, initializing a local model of the client side, wherein the client side is provided with a local private data set locally, and samples in the local private data set correspond to category labels; inputting the local private data set into the local model to obtain the output of the local model and the classification loss; inputting the local private data set into the global model local to the client to obtain the output of the global model and the classification precision of each sample in the local private data as a sample credibility matrix; obtaining distillation loss according to the sample reliability matrix and the class reliability matrix; training the local model based on a total loss consisting of the classification loss and the distillation loss; and uploading the trained model parameters of the local model to the cloud.
In the data privacy protection system based on adaptive adjustment weight, the cloud is further configured to: replace the global model with the new model, save the current global model as the final model, and classify or predict specified data.
In the data privacy protection system based on adaptive adjustment weight, the distillation loss is
Figure BDA0003732923840000102
where
Figure BDA0003732923840000103
is the class weight vector, and $\odot$ denotes element-wise multiplication; $z_g$ is the output of the global model, and $z$ is the output of the local model;
the value of the weight vector $M$ at position $k_1$ is:
$M(x)[k_1] = M_{max} \cdot [M_{class}[k_1] \, M_{sample}(x) - 0.1]_+$
Figure BDA0003732923840000111
$M_{sample}(x) = 1 - (1 - p_g(x)[k_2])^{0.5}$
where sample $x$ belongs to class $k_2$,
Figure BDA0003732923840000112
$A_{k_1,k_1}$ is the recall of class $k_1$, and $A_{k,k_1}$ denotes the probability that class $k$ is mispredicted as class $k_2$; $p_g(x)[k_2]$ denotes the probability with which the global model correctly predicts that sample $x$ belongs to class $k_2$, and $M_{max}$ is the upper limit of the distillation loss term.
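The two weight formulas that are given in text form can be evaluated as in the following minimal sketch. It assumes that the bracket with subscript "+" denotes clipping at zero, which is consistent with the statement that every element of the weight vector is non-negative. Because the definition of the class-level credibility is given only as a figure, the vector `m_class` is treated here as an input; the function name `adaptive_weight` and the numeric values are hypothetical.

```python
import numpy as np

def adaptive_weight(m_class, p_true, m_max):
    """Weight vector M(x) from class-level and sample-level credibility.

    m_class: per-class credibility M_class, shape (K,), supplied by the caller
    p_true:  p_g(x)[k2], the global model's probability for the true class of x
    m_max:   upper limit M_max of the distillation loss term
    """
    # Sample-level credibility: M_sample(x) = 1 - (1 - p_g(x)[k2])^0.5
    m_sample = 1.0 - np.sqrt(1.0 - p_true)
    # [.]_+ is read as max(., 0), so low-credibility entries get zero weight.
    return m_max * np.maximum(np.asarray(m_class) * m_sample - 0.1, 0.0)

# Illustrative values only: three classes, global model 75% sure of the true class.
M = adaptive_weight(m_class=[0.9, 0.5, 0.1], p_true=0.75, m_max=2.0)
```

Here the sample-level credibility is 0.5, so the third class, whose product falls below the 0.1 threshold, receives a weight of exactly zero and would not contribute to distillation.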
The data privacy protection system based on the self-adaptive weight adjustment is characterized in that the client is a medical institution data center or a financial institution data center, and the new model is used for classifying input images or predicting risks of transaction data.
The invention also provides a storage medium for storing a program for executing the data privacy protection method based on the self-adaptive weight adjustment.
The invention also provides a client used for any data privacy protection system based on the self-adaptive weight adjustment.

Claims (10)

1. A data privacy protection method based on self-adaptive weight adjustment is characterized by comprising the following steps:
step 1, inputting an auxiliary data set marked with category labels into a global model to obtain the classification precision of the global model on each category of data, and using the classification precision as a category credibility matrix;
step 2, at least one client acquires the global model from a cloud, the client initializes a local model of the client, the client locally has a local private data set, and samples in the local private data set correspond to category labels; inputting the local private data set into the local model to obtain the output of the local model and the classification loss; inputting the local private data set into the global model local to the client to obtain the output of the global model and the classification precision of each sample in the local private data as a sample credibility matrix; obtaining distillation loss according to the sample credibility matrix and the class credibility matrix; training the local model based on a total loss consisting of the classification loss and the distillation loss; uploading the trained model parameters of the local model to the cloud;
step 3, the cloud performs weighted aggregation on the received model parameters to obtain a new model.
2. The data privacy protection method based on adaptive adjustment weight according to claim 1, wherein step 3 comprises: replacing the global model with the new model, and cyclically executing steps 1 to 3 until the total loss converges or a preset number of iterations is reached, thereby completing the training and updating of the global model; saving the current global model as the final model; and classifying or predicting specified data.
3. The data privacy protection method based on adaptive adjustment weight according to claim 1, wherein the distillation loss is
Figure FDA0003732923830000011
where
Figure FDA0003732923830000012
is the class weight vector, and $\odot$ denotes element-wise multiplication; $z_g$ is the output of the global model, and $z$ is the output of the local model;
the value of the weight vector $M$ at position $k_1$ is:
$M(x)[k_1] = M_{max} \cdot [M_{class}[k_1] \, M_{sample}(x) - 0.1]_+$
Figure FDA0003732923830000013
$M_{sample}(x) = 1 - (1 - p_g(x)[k_2])^{0.5}$
where sample $x$ belongs to class $k_2$,
Figure FDA0003732923830000014
$A_{k_1,k_1}$ is the recall of class $k_1$, and $A_{k,k_1}$ denotes the probability that class $k$ is mispredicted as class $k_2$; $p_g(x)[k_2]$ denotes the probability with which the global model correctly predicts that sample $x$ belongs to class $k_2$, and $M_{max}$ is the upper limit of the distillation loss term.
4. The adaptive weight adjustment based data privacy protection method according to claim 1, wherein the client is a medical institution data center or a financial institution data center, and the new model is used for classifying input images or performing risk prediction on transaction data.
5. A data privacy protection system based on self-adaptive weight adjustment is characterized by comprising:
the cloud end is used for inputting the auxiliary data set marked with the category label into the global model to obtain the classification precision of the global model on each category of data, and the classification precision is used as a category credibility matrix; obtaining a new model by performing weighted aggregation on the received model parameters;
the client side acquires the global model from the cloud side, initializes a local model of the client side, and locally has a local private data set, wherein samples in the local private data set correspond to the category label; inputting the local private data set into the local model to obtain the output of the local model and the classification loss; inputting the local private data set into the global model local to the client to obtain the output of the global model and the classification precision of each sample in the local private data as a sample credibility matrix; obtaining distillation loss according to the sample reliability matrix and the class reliability matrix; training the local model based on a total loss consisting of the classification loss and the distillation loss; and uploading the trained model parameters of the local model to the cloud.
6. The data privacy protection system based on adaptive adjustment weight according to claim 5, wherein the cloud is configured to: replace the global model with the new model, save the current global model as the final model, and classify or predict specified data.
7. The data privacy protection system based on adaptive adjustment weight according to claim 5, wherein the distillation loss is
Figure FDA0003732923830000021
where
Figure FDA0003732923830000022
is the class weight vector, and $\odot$ denotes element-wise multiplication; $z_g$ is the output of the global model, and $z$ is the output of the local model;
the value of the weight vector $M$ at position $k_1$ is:
$M(x)[k_1] = M_{max} \cdot [M_{class}[k_1] \, M_{sample}(x) - 0.1]_+$
Figure FDA0003732923830000023
$M_{sample}(x) = 1 - (1 - p_g(x)[k_2])^{0.5}$
where sample $x$ belongs to class $k_2$,
Figure FDA0003732923830000024
$A_{k_1,k_1}$ is the recall of class $k_1$, and $A_{k,k_1}$ denotes the probability that class $k$ is mispredicted as class $k_2$; $p_g(x)[k_2]$ denotes the probability with which the global model correctly predicts that sample $x$ belongs to class $k_2$, and $M_{max}$ is the upper limit of the distillation loss term.
8. The adaptive weight adjustment based data privacy protection system of claim 5, wherein the client is a medical institution data center or a financial institution data center, and the new model is used for classifying input images or performing risk prediction on transaction data.
9. A storage medium storing a program for executing the adaptive weight-based data privacy protection method according to any one of claims 1 to 4.
10. A client terminal for use in the data privacy protection system based on the adaptive adjustment weight in any one of claims 5 to 8.
CN202210798075.3A 2022-07-06 2022-07-06 Data privacy protection method and system based on self-adaptive adjustment weight Pending CN115495771A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210798075.3A CN115495771A (en) 2022-07-06 2022-07-06 Data privacy protection method and system based on self-adaptive adjustment weight

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210798075.3A CN115495771A (en) 2022-07-06 2022-07-06 Data privacy protection method and system based on self-adaptive adjustment weight

Publications (1)

Publication Number Publication Date
CN115495771A true CN115495771A (en) 2022-12-20

Family

ID=84466175

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210798075.3A Pending CN115495771A (en) 2022-07-06 2022-07-06 Data privacy protection method and system based on self-adaptive adjustment weight

Country Status (1)

Country Link
CN (1) CN115495771A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11930022B2 (en) * 2019-12-10 2024-03-12 Fortinet, Inc. Cloud-based orchestration of incident response using multi-feed security event classifications
CN116541769A (en) * 2023-07-05 2023-08-04 北京邮电大学 Node data classification method and system based on federal learning
CN116935136A (en) * 2023-08-02 2023-10-24 深圳大学 Federal learning method for processing classification problem of class imbalance medical image
CN117350373A (en) * 2023-11-30 2024-01-05 艾迪恩(山东)科技有限公司 Personalized federal aggregation algorithm based on local self-attention mechanism
CN117350373B (en) * 2023-11-30 2024-03-01 艾迪恩(山东)科技有限公司 Personalized federal aggregation algorithm based on local self-attention mechanism

Similar Documents

Publication Publication Date Title
CN115495771A (en) Data privacy protection method and system based on self-adaptive adjustment weight
Li et al. Improved techniques for training adaptive deep networks
Luo et al. Adapt to adaptation: Learning personalization for cross-silo federated learning
CN111460528B (en) Multi-party combined training method and system based on Adam optimization algorithm
CN113378959B (en) Zero sample learning method for generating countermeasure network based on semantic error correction
CN114154643A (en) Federal distillation-based federal learning model training method, system and medium
Wang et al. Distributed stochastic consensus optimization with momentum for nonconvex nonsmooth problems
Idrissi et al. Fedbs: Learning on non-iid data in federated learning using batch normalization
Che et al. FedTriNet: A pseudo labeling method with three players for federated semi-supervised learning
CN117634594A (en) Self-adaptive clustering federal learning method with differential privacy
Zhao et al. Optimizing widths with PSO for center selection of Gaussian radial basis function networks
CN117077765A (en) Electroencephalogram signal identity recognition method based on personalized federal incremental learning
CN108921281A (en) A kind of field adaptation method based on depth network and countermeasure techniques
CN113705724B (en) Batch learning method of deep neural network based on self-adaptive L-BFGS algorithm
CN111353534A (en) Graph data category prediction method based on adaptive fractional order gradient
US11899765B2 (en) Dual-factor identification system and method with adaptive enrollment
Tran et al. Personalized privacy-preserving framework for cross-silo federated learning
Di et al. Variance-aware regret bounds for stochastic contextual dueling bandits
CN117113274A (en) Heterogeneous network data-free fusion method and system based on federal distillation
Shi et al. Efficient federated learning with enhanced privacy via lottery ticket pruning in edge computing
Guo et al. Dual class-aware contrastive federated semi-supervised learning
CN116611535A (en) Edge federation learning training method and system for heterogeneous data
Zhang et al. Going deeper, generalizing better: An information-theoretic view for deep learning
Tun et al. Federated learning with intermediate representation regularization
Wang et al. Logit calibration for non-iid and long-tailed data in federated learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination