CN115495771A - Data privacy protection method and system based on self-adaptive adjustment weight - Google Patents
- Publication number
- CN115495771A (CN202210798075.3A)
- Authority
- CN
- China
- Prior art keywords
- model
- local
- data
- sample
- global
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Business, Economics & Management (AREA)
- General Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Databases & Information Systems (AREA)
- Human Resources & Organizations (AREA)
- Multimedia (AREA)
- Bioethics (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Economics (AREA)
- Strategic Management (AREA)
- Entrepreneurship & Innovation (AREA)
- General Business, Economics & Management (AREA)
- Development Economics (AREA)
- Marketing (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- Game Theory and Decision Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Computer Security & Cryptography (AREA)
- Molecular Biology (AREA)
- Computer Hardware Design (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention provides a data privacy protection method and system based on adaptive weight adjustment. It addresses the degraded model performance and slower convergence caused by non-independent and identically distributed (Non-IID) data, and belongs to the technical field of federated learning applications. The method comprises the following steps: at the start of each round of federated communication, the server evaluates the class-level credibility of the global model on an auxiliary data set and sends the credibility matrix and the global model parameters to the clients participating in that round; each client evaluates the sample-level credibility of the global model on its local private data set, weights the knowledge distillation by the class credibility and the sample credibility to dynamically guide the training of its local model, and uploads the updated local model parameters to the server; and the server performs weighted aggregation of the local model parameters to update the global model.
Description
Technical Field
The invention relates to the technical fields of federated learning and data security, and in particular to a federated learning method and system based on client-side selective knowledge distillation.
Background
Conventional machine learning techniques have been successfully applied in computer vision, natural language processing, recommendation systems, and automatic control. As artificial intelligence spreads across industries, concern about user privacy and data security keeps growing. Countries continue to strengthen the protection of data security and privacy: for example, the European Union's General Data Protection Regulation (GDPR) formally came into force in 2018, and China passed the Personal Information Protection Law of the People's Republic of China in 2021. Because of the privacy restrictions imposed by these laws and regulations, data in fields such as medicine, business, and the military exist as isolated islands. Federated Learning (FL), which has emerged in recent years, enables secure sharing of multi-party data by transmitting model parameters instead of raw data. On the one hand, because the data never leave the local site, user privacy and data security are well protected; on the other hand, joint training can make full use of the local private data of every client, which solves the data island problem.
The environments of the participating clients (different users, devices, and organizations) are inherently heterogeneous, so the data in federated learning are non-independent and identically distributed (Non-IID). Non-IID data has long been a pressing frontier problem in federated learning, and label distribution shift is particularly significant in practical application scenarios. Client data heterogeneity can cause local training to deviate seriously from the global objective, so that the update process diverges (weight divergence). One of the research challenges of Non-IID data in federated learning is therefore to constrain the direction of model updates during each client's local training, so that the local model learns from local private data while preserving the knowledge of the global model. One prior approach adds a correction term to the local loss function so that the local update does not deviate too far from the global model; the correction term is computed as the L2 distance between the local model and the global model of the previous round. Another approach starts from the observation that a global model trained on the complete data set learns better representations than a local model trained on a skewed subset; it adds a contrastive learning loss term to the local loss function, which reduces the distance between the representations learned by the local model and by the global model and increases the distance between the representations learned by the current local model and by the previous local model. A third approach uses Elastic Weight Consolidation (EWC) to mitigate catastrophic forgetting in federated learning: a penalty term added to the local loss function discourages changes to the local model parameters that are important for the global task.
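The correction-term idea described above can be illustrated with a minimal Python sketch. It assumes a squared L2 penalty scaled by a fixed coefficient; the function names and the coefficient `mu` are illustrative assumptions and do not come from the patent or from any specific library.

```python
def proximal_term(local_w, global_w, mu=0.01):
    """Return (mu / 2) * squared L2 distance between local and global parameters.

    local_w and global_w are flat lists of floats; mu is a hypothetical,
    hand-tuned weight of the correction term.
    """
    sq_dist = sum((l - g) ** 2 for l, g in zip(local_w, global_w))
    return 0.5 * mu * sq_dist


def corrected_loss(task_loss, local_w, global_w, mu=0.01):
    """Task loss plus the correction term; mu stays fixed during training."""
    return task_loss + proximal_term(local_w, global_w, mu)
```

Because `mu` is a constant here, the balance between the task loss and the correction term has to be chosen by hand before training starts.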
All of the above methods use the global model to constrain the update direction of the client's local model, so that the updated local model does not differ too much from the global model. However, they share two drawbacks. First, they do not adaptively adjust the weight between the correction term and the task loss term in the local objective function. If the weight of the correction term is too large, no new knowledge can be learned in that round; if it is too small, the optimization direction in that round deviates from the global objective. The weight therefore has to be tuned very carefully to approach the ideal model, and this trial-and-error tuning consumes a great deal of time and effort. Second, they do not consider that a poorly performing global model can mislead the local model into optimizing in the wrong direction, especially for clients that did not participate in the previous round. Because the set of participating clients changes dynamically from round to round and client data are Non-IID, the aggregated global model has different representation capabilities for different classes. At the beginning of federated learning the global model has not yet learned good representations, so local training should focus more on learning from local private data than on retaining global knowledge. In the middle and later stages of federated communication, the global model performs better than the local model on specific classes, and the local model should selectively retain global knowledge at the class level and the sample level.
Most existing federated learning algorithms are based on the classical Federated Averaging (FedAvg) algorithm, which uses a traditional Client-Server (C-S) architecture and splits distributed training into multiple iterations of client-side local training and server-side parameter aggregation. As shown in FIG. 1, during client local training each client downloads the model from the server and then performs multiple rounds of training on its local private data set; during server-side parameter aggregation, the server receives the updated model parameters from the clients and averages them, using each client's total sample size as its weight. Assume there are N clients in total, with local private data sets $D = \{D_1, D_2, \dots, D_N\}$. The objective function $L$ of the global model $w$ is the weighted average of the objective functions $L_i$ of the clients' local models: $L(w) = \sum_{i=1}^{N} q_i L_i(w)$ with $q_i = |D_i| / \sum_{j=1}^{N} |D_j|$, where $q_i$ is the aggregation weight of client $i$ and $|D_i|$ is the total sample size of the local private data set of client $i$.
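The server-side aggregation described above can be sketched in a few lines of Python. This is a minimal illustration that assumes each client's parameters are a flat list of floats and that the aggregation weight is the client's share of the total sample count; the function name `fedavg_aggregate` is a hypothetical choice.

```python
def fedavg_aggregate(client_params, client_sizes):
    """Average parameter vectors, weighting client i by its sample-count share.

    client_params: list of equal-length lists of floats (one per client).
    client_sizes:  list of local sample counts |D_i| (one per client).
    """
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("client_sizes must sum to a positive number")
    dim = len(client_params[0])
    aggregated = [0.0] * dim
    for params, size in zip(client_params, client_sizes):
        q = size / total  # aggregation weight of this client
        for j in range(dim):
            aggregated[j] += q * params[j]
    return aggregated
```

For example, two clients holding 1 and 3 samples contribute with weights 0.25 and 0.75, so the larger client dominates the averaged parameters.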
In practical federated learning scenarios, the local data distributions of the clients are usually Non-IID; in particular, the label distribution P(y) may differ between clients. On the one hand, when each client trains on its locally skewed data set, the directions in which the models fit the samples are inconsistent, so each local model deviates from the global objective and suffers "catastrophic forgetting" of global knowledge, which in turn leaves the aggregated global model too far from the ideal model. Moreover, the global model's ability to extract features differs between classes: it is strong for the majority classes in the local private data of the currently selected clients, so the credibility of its output logits (the inputs to the softmax activation function with which the neural network produces its predictions) is higher on the channels of majority classes than on those of minority classes. On the other hand, the output logits of the global model also depend on the specific training sample. When a training sample belongs to one of the minority classes, or when a majority-class sample has unusual features, the global model cannot extract the sample's features effectively, and the credibility of its output logits is low on every class channel.
In summary, how to selectively retain global knowledge and adaptively adjust the local update direction during local training, and thereby improve the generalization and convergence speed of the model, is the focus of this work.
Disclosure of Invention
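The invention aims to overcome the shortcomings of existing federated learning methods on Non-IID data, namely that they cannot adaptively adjust the update direction of the local model and do not account for changes in global model performance or for performance differences between classes and between samples. To this end, it provides a data privacy protection method based on adaptive weight adjustment, comprising the following steps:
step 1, the cloud inputs an auxiliary data set labeled with class labels into the global model and obtains the classification precision of the global model on each class of data as a class credibility matrix;
step 2, the client obtains the global model from the cloud and initializes its local model, the client holding a local private data set whose samples carry class labels; the client inputs the local private data set into the local model to obtain the output of the local model and the classification loss; inputs the local private data set into its local copy of the global model to obtain the output of the global model and its classification precision on each sample of the local private data as a sample credibility matrix; obtains the distillation loss from the sample credibility matrix and the class credibility matrix; trains the local model with a total loss composed of the classification loss and the distillation loss; and uploads the model parameters of the trained local model to the cloud;
step 3, the cloud performs weighted aggregation of the received model parameters to obtain a new model.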
In the above data privacy protection method based on adaptive weight adjustment, step 3 further comprises: replacing the global model with the new model and repeating steps 1 to 3 until the total loss converges or a preset number of iterations is reached, whereupon training and updating of the global model is complete; the current global model is saved as the final model and used to classify or predict specified data.
In the above data privacy protection method based on adaptive weight adjustment, the distillation loss is $L_{SSD,i} = \mathbb{E}_{x \in D_i}\left[\lVert M(x) \odot (z_g - z) \rVert^2\right]$, where $M(x)$ is a class-dependent weight vector, $\mathbb{E}$ denotes expectation, and $\odot$ denotes element-wise multiplication; $z_g$ is the output of the global model and $z$ is the output of the local model;
the value of the weight vector $M$ at position $k_1$ is:
$M(x)[k_1] = M_{max} \cdot [M_{class}[k_1] \, M_{sample}(x) - 0.1]_+$,
$M_{sample}(x) = 1 - (1 - p_g(x)[k_2])^{0.5}$;
where sample $x$ belongs to class $k_2$; the class-level weight $M_{class}[k_1]$ is computed from the class credibility matrix $A$, in which $A_{k_1,k_1}$ is the recall of class $k_1$ and $A_{k,k_1}$ is the probability that class $k$ is mispredicted as class $k_1$; $p_g(x)[k_2]$ is the probability that the global model correctly predicts sample $x$ as class $k_2$; and $M_{max}$ is the upper limit of the weight of the distillation loss term.
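The weight formulas above can be sketched in Python as follows. The sketch takes the class-level weights $M_{class}$ as a given input, since their exact expression is not recoverable from the text; the function names and the default `m_max=1.0` are illustrative assumptions, while the 0.1 offset and the 0.5 exponent come from the formulas.

```python
def sample_credibility(p_g_true):
    """M_sample(x) = 1 - (1 - p_g(x)[k2]) ** 0.5.

    p_g_true is the global model's softmax probability for the true class k2.
    """
    return 1.0 - (1.0 - p_g_true) ** 0.5


def distillation_weights(m_class, p_g_true, m_max=1.0, offset=0.1):
    """M(x)[k1] = M_max * max(M_class[k1] * M_sample(x) - 0.1, 0) for each class k1.

    m_class is a list with one class-level weight per class (assumed given).
    """
    m_sample = sample_credibility(p_g_true)
    return [m_max * max(c * m_sample - offset, 0.0) for c in m_class]
```

With `p_g_true = 0.75` the sample-level credibility is 0.5, so a class with weight 1.0 receives a positive distillation weight while a class with weight 0.1 is clipped to zero and contributes nothing to the distillation loss.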
In the above data privacy protection method based on adaptive weight adjustment, the client is a data center of a medical institution or of a financial institution, and the new model is used to classify input images or to predict the risk of transaction data.
The invention also provides a data privacy protection system based on adaptive weight adjustment, comprising:
a cloud, configured to input an auxiliary data set labeled with class labels into the global model to obtain the classification precision of the global model on each class of data as a class credibility matrix, and to obtain a new model by performing weighted aggregation of the received model parameters;
a client, configured to obtain the global model from the cloud and initialize its local model, the client holding a local private data set whose samples carry class labels; to input the local private data set into the local model to obtain the output of the local model and the classification loss; to input the local private data set into the client's local copy of the global model to obtain the output of the global model and its classification precision on each sample of the local private data as a sample credibility matrix; to obtain the distillation loss from the sample credibility matrix and the class credibility matrix; to train the local model with a total loss composed of the classification loss and the distillation loss; and to upload the model parameters of the trained local model to the cloud.
In the above data privacy protection system based on adaptive weight adjustment, the cloud is further configured to replace the global model with the new model, save the current global model as the final model, and use it to classify or predict specified data.
In the above data privacy protection system based on adaptive weight adjustment, the distillation loss is $L_{SSD,i} = \mathbb{E}_{x \in D_i}\left[\lVert M(x) \odot (z_g - z) \rVert^2\right]$, where $M(x)$ is a class-dependent weight vector, $\mathbb{E}$ denotes expectation, and $\odot$ denotes element-wise multiplication; $z_g$ is the output of the global model and $z$ is the output of the local model;
the value of the weight vector $M$ at position $k_1$ is:
$M(x)[k_1] = M_{max} \cdot [M_{class}[k_1] \, M_{sample}(x) - 0.1]_+$,
$M_{sample}(x) = 1 - (1 - p_g(x)[k_2])^{0.5}$;
where sample $x$ belongs to class $k_2$; the class-level weight $M_{class}[k_1]$ is computed from the class credibility matrix $A$, in which $A_{k_1,k_1}$ is the recall of class $k_1$ and $A_{k,k_1}$ is the probability that class $k$ is mispredicted as class $k_1$; $p_g(x)[k_2]$ is the probability that the global model correctly predicts sample $x$ as class $k_2$; and $M_{max}$ is the upper limit of the weight of the distillation loss term.
In the above data privacy protection system based on adaptive weight adjustment, the client is a data center of a medical institution or of a financial institution, and the new model is used to classify input images or to predict the risk of transaction data.
The invention also provides a storage medium storing a program for executing the above data privacy protection method based on adaptive weight adjustment.
The invention also provides a client for use in any of the above data privacy protection systems based on adaptive weight adjustment.
With the above scheme, the invention has the following advantages:
By introducing a client-side selective knowledge distillation strategy into the local training process of federated learning, the method evaluates the credibility of the global model at the class level and the sample level and selectively distills global knowledge into the local model according to that credibility. Each client's local model can thus learn from local private data without deviating from the global model, which reduces the number of federated communication rounds, improves model performance, and accelerates convergence.
Drawings
FIG. 1 is a flow chart of federated learning in the prior art;
FIG. 2 is a flow chart of federated learning according to the present invention;
FIG. 3 is a schematic diagram of data distribution of clients;
FIG. 4 is a graph of comparative results on a reference data set;
FIG. 5 is a diagram of a confusion matrix for a global model over a test set.
Detailed Description
Before describing embodiments of the present invention in detail, some of the terms used therein will be explained as follows:
A client is a node that receives services from the server. Clients may be a large number of mobile or Internet of Things devices, or different organizations (e.g., government departments, medical institutions, financial institutions, or geographically distributed data centers); the local private data are stored on the clients. The client in the embodiments of the present invention is not limited to any particular application scenario.
The server (cloud) is a node that provides services to the clients. The central server is connected to each client and coordinates multiple clients to build a model jointly without leaking raw data.
The deep learning model is a deep neural network formed by connecting multiple processing units. The model updated on the server is called the global model, and the model updated on a client is called the local model.
The auxiliary data set is a public data set used to assist federated training. It may consist of a public data set related to the training task or of a small amount of local data published by each client.
In view of the limitations and challenges of prior approaches, the present invention proposes a federated learning method based on client-side selective knowledge distillation. Its key idea is to distill the knowledge of the global model selectively into the local model and to adaptively adjust the weight of the distillation loss during local training according to the credibility of the global model at the class level and the sample level, thereby improving model performance and accelerating convergence. Because $M_{class}$ and $M_{sample}$ must be recomputed in every federated round and the distillation loss weight $M$ is determined by both, the scheme is called adaptive weight adjustment.
The invention comprises the following key technical points:
Key point 1: to address catastrophic forgetting during local training, a federated learning method based on client-side knowledge distillation is introduced. Its technical effect is that the knowledge of the global model is preserved during local updates.
Key point 2: to address the global model's performance differences between classes and between samples, a client-side selective knowledge distillation strategy is introduced. The class-level credibility of the global model is evaluated on the server with the auxiliary data set, the sample-level credibility is evaluated on each client with every sample of its local private data, and knowledge distillation is weighted by the class credibility and the sample credibility. The training of the local model is thus dynamically guided by the principle of "follow what is good and correct what is not". The technical effect is that the local update process is adaptively adjusted.
In order to make the aforementioned features and effects of the present invention more comprehensible, embodiments accompanied with figures are described in detail below.
To address the above problems, the invention provides a client-based Selective Self-Distillation federated learning method (FedSSD), which treats the global model as the teacher and the local model as the student and selectively learns global representations from the global model, so that the local update of the model parameters is adaptively corrected and new knowledge is learned from local private data without forgetting global knowledge.
According to one embodiment of the invention, the auxiliary data set $D_V$ is constructed either from a small portion of data shared by the clients with the server or from a public data set. For example, a breast cancer histological image classification task may use a small portion of the public data set Camelyon17 as the auxiliary data set. In the t-th round of federated communication, the server first uses the auxiliary data set $D_V$ to evaluate the class-level credibility of the global model $w_t$, represented by a confusion matrix $A_t \in \mathbb{R}^{K \times K}$ that records the precision of $w_t$ on $D_V$, where $K$ is the number of classes and $A_{k_1,k_2}$ is the probability that the global model predicts class $k_1$ as class $k_2$. The credibility matrix $A_t$ and the global model $w_t$ are then sent to the clients participating in that round. A client may be a mobile phone terminal or an organization such as a hospital, a research institute, or a company.
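A minimal Python sketch of the class-level credibility evaluation follows. It assumes the matrix is a row-normalized confusion matrix built from the true labels of the auxiliary data set and the global model's predicted labels; the function name and the handling of classes absent from the auxiliary set are illustrative assumptions.

```python
def credibility_matrix(true_labels, predicted_labels, num_classes):
    """Row-normalized confusion matrix.

    Entry [k1][k2] estimates the probability that a sample of class k1
    is predicted as class k2; the diagonal entry [k][k] is the recall of k.
    """
    counts = [[0] * num_classes for _ in range(num_classes)]
    for y, y_hat in zip(true_labels, predicted_labels):
        counts[y][y_hat] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Assumption: a class missing from the auxiliary set gets an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix
```

In this sketch the predicted labels would come from running the global model on the auxiliary data set; only integer class indices in the range 0 to `num_classes - 1` are supported.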
According to one embodiment of the invention, client $i$ receives the credibility matrix $A_t$ and the global model $w_t$ issued by the server. It first initializes its local model with the global model $w_t$, and then optimizes its local objective function $L_i$ on the local private data $D_i$ using the Stochastic Gradient Descent (SGD) algorithm. To prevent local training from causing catastrophic forgetting of the global model, both the classification loss and the distillation loss are optimized: $L_i = L_{CE,i} + L_{SSD,i}$.
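According to one embodiment of the invention, for each input sample $x$, the logit output of the global model is denoted $z_g$, the logit output of the local model is denoted $z$, and the predicted probability after the softmax layer is $p = \mathrm{softmax}(z)$. The classification loss $L_{CE,i}$ is generally the cross-entropy loss, $L_{CE,i} = -\mathbb{E}_{(x,y) \in D_i}\left[\log p(x)[y]\right]$, where $y$ is the class label of sample $x$.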
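The relationship between logits, softmax probabilities, and the cross-entropy classification loss can be illustrated with the following Python sketch for a single sample. It uses the standard definitions of softmax and cross-entropy; the function names are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability that softmax(logits) assigns to the true class."""
    return -math.log(softmax(logits)[label])
```

Averaging `cross_entropy` over the samples of a client's local data set gives an empirical estimate of the classification loss for that client.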
The distillation loss is built from the logit output $z_g$ of the global model ($p_g$ after softmax), the logit output $z$ of the local model, and the weight $M$, which is composed of $M_{sample}$ and $M_{class}$. The logits that the global model outputs on local data represent the global knowledge; they can be regarded as an absolute estimate of how strongly the sample belongs to each class, and after the softmax layer they become a relative estimate across classes. To decouple the prediction capability of the global model on each class, a weighted Mean Square Error (MSE) is used to align the logit vectors output by the local model and the global model, instead of using the Kullback-Leibler (KL) divergence to align their predicted distributions. The selective distillation loss is therefore defined as $L_{SSD,i} = \mathbb{E}_{x \in D_i}\left[\lVert M(x) \odot (z_g - z) \rVert^2\right]$, where $M(x)$ is a class-dependent weight vector, $\mathbb{E}$ denotes expectation, and $\odot$ denotes element-wise multiplication. Suppose sample $x$ belongs to class $k_2$; the value of the vector $M$ at position $k_1$ is $M(x)[k_1] = M_{max} \cdot [M_{class}[k_1] \, M_{sample}(x) - 0.1]_+$, with $M_{sample}(x) = 1 - (1 - p_g(x)[k_2])^{0.5}$, where $[\cdot]_+ = \max(\cdot, 0)$ guarantees that each element of $M$ is greater than or equal to 0. $M_{class}[k_1]$ is computed from the credibility matrix, in which $A_{k_1,k_1}$ is the recall of class $k_1$ and $A_{k,k_1}$ is the probability that class $k$ is mispredicted as class $k_1$. $p_g(x)[k_2]$ is the probability that the global model correctly predicts sample $x$ as class $k_2$. $M_{max}$ determines the upper limit of the weight of the distillation loss term.
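One reading of the weighted MSE alignment described above is sketched below in Python for a single sample. It assumes the weight vector multiplies the logit difference element-wise before squaring and summing; whether the weight is applied before or after squaring is an assumption of this sketch, and the function names are illustrative.

```python
def selective_distillation_loss(z_global, z_local, weights):
    """Squared norm of the element-wise weighted logit difference for one sample.

    z_global, z_local and weights are lists with one entry per class.
    Assumption: the weight is applied to the difference before squaring.
    """
    return sum((m * (zg - z)) ** 2 for m, zg, z in zip(weights, z_global, z_local))


def total_local_loss(ce_loss, z_global, z_local, weights):
    """Per-sample total loss: classification loss plus selective distillation loss."""
    return ce_loss + selective_distillation_loss(z_global, z_local, weights)
```

A class whose weight is zero contributes nothing, so the local model is free to move away from the global logits on classes or samples where the global model is judged unreliable.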
The framework of the overall process is shown in fig. 2. The process can be summarized as follows:
1. the server initializes the global model $w_0$ and computes the class-level credibility of the model on the auxiliary data set;
2. the server randomly selects a set $S_t$ of C×N clients to participate in the current round of federated training and sends the global model parameters $w_t$ and the credibility matrix $A_t$ to these clients;
3. each selected client computes the sample-level credibility of the global model $w_t$ on its local private data set $D_i$, then trains and updates its local model by minimizing $L_i = L_{CE,i} + L_{SSD,i}$;
4. each selected client uploads the updated local model parameters to the server;
5. the server performs weighted aggregation of the received model parameters to obtain the updated global model $w_{t+1}$ and computes its credibility matrix $A_{t+1}$ on the auxiliary data set.
Steps 2-5 are repeated until the model converges.
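The control flow of one communication round can be sketched in Python as follows. This is a skeleton under stated assumptions, not the patented implementation: the callbacks `evaluate`, `local_train`, and `aggregate`, the `"size"` key on each client record, and the function name `run_round` are all hypothetical. The credibility is computed at the start of the round here, which is equivalent to computing it at the end of the previous round.

```python
import random


def run_round(global_w, clients, fraction, evaluate, local_train, aggregate, rng=None):
    """Run one federated round and return the new global parameters.

    global_w:    list of floats, current global parameters.
    clients:     list of dicts; each needs a "size" entry (hypothetical key).
    fraction:    share C of the N clients selected per round.
    evaluate:    callable(global_w) -> class-level credibility (server side).
    local_train: callable(client, global_w_copy, credibility) -> updated params.
    aggregate:   callable(list_of_params, list_of_sizes) -> new global params.
    """
    rng = rng or random.Random()
    credibility = evaluate(global_w)
    num_selected = max(1, int(fraction * len(clients)))
    selected = rng.sample(clients, num_selected)
    # Each client starts from its own copy of the broadcast global model.
    updates = [local_train(c, list(global_w), credibility) for c in selected]
    sizes = [c["size"] for c in selected]
    return aggregate(updates, sizes)
```

Calling `run_round` repeatedly with the returned parameters as the next `global_w` corresponds to repeating the round until a convergence criterion chosen by the caller is met.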
According to an embodiment of the invention, in practical application scenarios the ultimate goal is to obtain a global model with strong generalization, which each client can download from the server for local inference. For example, in open-government applications the method can break down the data islands between government departments and enable secure sharing of cross-department and social data; in biomedical applications it can effectively unite multiple hospitals for tasks such as disease prediction, medical image recognition, drug discovery, and gene sequencing; in financial applications it can train a joint credit risk-control model without leaking user information, enabling finer and more accurate financial risk control.
According to one embodiment of the present invention, a Dirichlet distribution $P_k \sim \mathrm{Dir}(\delta)$ is used to construct Non-IID data distribution scenarios, where $P_{k,i}$ is the proportion of the samples of class $k$ that is owned by client $i$. $\delta$ is a hyper-parameter that controls the degree of heterogeneity between clients: the smaller its value, the more uneven the client data distribution and the greater the heterogeneity. The effectiveness of the method is verified on three public data sets: CIFAR10, CIFAR100, and TinyImageNet. $\delta = 0.5$ is the default value, and an example of the data distribution of each client is shown in FIG. 3, where the abscissa is the client ID (10 clients by default), the ordinate is the class ID (the three data sets have 10, 100, and 200 classes, respectively), each rectangle represents the number of samples of that class owned by the client, and a darker color indicates more samples. In addition, the data partitioning method of the FedAvg algorithm is also adopted, in which K classes are randomly allocated to each client; this setting is denoted #K = K here. For CIFAR10, a model consistent with FedAvg is used, namely a simple CNN (two convolutional layers and two fully connected layers); ResNet50 is used for CIFAR100 and TinyImageNet.
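A Dirichlet-based label partition of the kind described above can be sketched with the Python standard library. The sketch draws, for each class, a proportion vector over clients from a symmetric Dirichlet distribution (by normalizing Gamma draws) and splits that class's sample indices accordingly; the function names, the rounding of split points, and the seeding are illustrative assumptions rather than the procedure used in the experiments.

```python
import random


def dirichlet_sample(delta, n, rng):
    """Draw one sample from a symmetric Dirichlet(delta) of dimension n."""
    draws = [rng.gammavariate(delta, 1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


def dirichlet_partition(labels, num_clients, delta, seed=0):
    """Split sample indices across clients, class by class.

    Returns a list with one list of sample indices per client.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    clients = [[] for _ in range(num_clients)]
    for y in sorted(by_class):
        indices = by_class[y]
        rng.shuffle(indices)
        proportions = dirichlet_sample(delta, num_clients, rng)
        start, cumulative = 0, 0.0
        for i, p in enumerate(proportions):
            cumulative += p
            # The last client takes the remainder so every index is assigned once.
            if i == num_clients - 1:
                end = len(indices)
            else:
                end = round(cumulative * len(indices))
            clients[i].extend(indices[start:end])
            start = end
    return clients
```

Smaller values of `delta` concentrate each class on fewer clients, while larger values spread each class more evenly.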
The method is compared with five related works: the baseline method FedAvg and four similar methods that add regularization terms to the local objective function, namely FedProx ("Federated Optimization in Heterogeneous Networks", Li et al.), FedCurv ("Overcoming Forgetting in Federated Learning on Non-IID Data", Shoham et al.), MOON ("Model-Contrastive Federated Learning", Li et al.), and SCAFFOLD ("SCAFFOLD: Stochastic Controlled Averaging for Federated Learning", Karimireddy et al.). The comparison results on the benchmark data sets are shown in FIG. 4, where the first row plots the accuracy of the global model on the test set and the second row plots the average accuracy of the clients' local models on the same test set. FedSSD outperforms the other methods on CIFAR10 and TinyImageNet; on CIFAR100 its global test accuracy is slightly lower than that of MOON, but its average local accuracy is higher. In the first few rounds of federated communication, the average local accuracy of FedSSD differs little from that of the baseline method, because the global model has not yet learned good feature representations. Compared with the other methods, the average local accuracy of FedSSD is greatly improved on all three data sets, which shows that FedSSD effectively retains global knowledge while learning from local private data. In addition to the above comparative experiments, the influence of the degree of data heterogeneity on FedSSD was analyzed; the global test accuracy is shown in Table 1. FedSSD performs well under data distributions with different degrees of heterogeneity.
To further illustrate the effectiveness of the present invention, the confusion matrix on the test set of the global model issued in a certain round is visualized in fig. 5 (b); a client is randomly selected, and its data distribution is shown in fig. 5 (a). The confusion matrices on the test set of the models obtained after training on the local private data with the reference method FedAvg and with the FedSSD method proposed by the present invention are shown in fig. 5 (c). The model after the FedAvg local update performs poorly on categories 2, 6 and 7, which means that the global model's knowledge of these categories is forgotten during the local update, whereas FedSSD effectively retains this knowledge and learns knowledge about categories 3, 4, 5 and 8 from the local private data.
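A confusion matrix of the kind visualized above can be computed as in the sketch below. It is only an illustration: the helper name `confusion_matrix`, the row normalization and the toy labels are assumptions, and the text does not state that the embodiment computes the matrix in exactly this way.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """A[i, j] = fraction of true class-i samples that are predicted as class j."""
    counts = np.zeros((num_classes, num_classes), dtype=float)
    np.add.at(counts, (np.asarray(y_true), np.asarray(y_pred)), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Classes absent from the evaluation set keep an all-zero row.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

# Toy predictions (hypothetical) for a 3-class problem.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
A = confusion_matrix(y_true, y_pred, num_classes=3)
recall = np.diag(A)  # per-class recall sits on the diagonal
```

A low diagonal entry for a category after a local update would correspond to the forgetting observed for FedAvg in the comparison above.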
The following are system examples corresponding to the above method examples, and this embodiment can be implemented in cooperation with the above embodiments. The related technical details mentioned in the above embodiments are still valid in this embodiment, and are not described herein again in order to reduce repetition. Accordingly, the related technical details mentioned in the present embodiment can also be applied to the above embodiments.
The invention also provides a data privacy protection system based on self-adaptive weight adjustment, which comprises:
the cloud end is used for inputting the auxiliary data set marked with the category label into the global model to obtain the classification precision of the global model on each category of data as a category credibility matrix; obtaining a new model by performing weighted aggregation on the received model parameters;
the client side is used for acquiring the global model from the cloud side, initializing a local model of the client side, wherein the client side is provided with a local private data set locally, and samples in the local private data set correspond to category labels; inputting the local private data set into the local model to obtain the output of the local model and the classification loss; inputting the local private data set into the global model local to the client to obtain the output of the global model and the classification precision of each sample in the local private data as a sample credibility matrix; obtaining distillation loss according to the sample reliability matrix and the class reliability matrix; training the local model based on a total loss consisting of the classification loss and the distillation loss; and uploading the trained model parameters of the local model to the cloud.
In the data privacy protection system based on the adaptive adjustment weight, the cloud is further used for: replacing the global model with the new model, saving the current global model as the final model, and classifying or predicting the specified data.
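The weighted aggregation performed by the cloud can be sketched as below. The text only states that the received model parameters are aggregated with weights; weighting each client by its number of local samples, as in FedAvg, is an assumption made for this example, and the parameter names and the helper `weighted_aggregate` are hypothetical.

```python
import numpy as np

def weighted_aggregate(client_params, client_weights):
    """Weighted average of per-client parameter dicts (name -> ndarray)."""
    w = np.asarray(client_weights, dtype=float)
    w = w / w.sum()  # normalize so the weights sum to 1
    return {
        name: sum(wi * params[name] for wi, params in zip(w, client_params))
        for name in client_params[0]
    }

# Two hypothetical clients uploading the same parameter layout.
clients = [
    {"fc.weight": np.array([1.0, 2.0]), "fc.bias": np.array([0.0])},
    {"fc.weight": np.array([3.0, 4.0]), "fc.bias": np.array([1.0])},
]
num_samples = [100, 300]  # assumed weighting: local sample counts
new_global = weighted_aggregate(clients, num_samples)
```

The resulting `new_global` dictionary plays the role of the new model that replaces the global model before the next round.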
In the data privacy protection system based on the adaptive adjustment weight, the distillation loss is defined in terms of a class-wise weight vector $M$, the output $z_g$ of the global model and the output $z$ of the local model, where $\odot$ indicates element-wise multiplication.

The value of the weight vector $M$ at position $k_1$ is:

$$M(x)[k_1] = M_{max} \cdot \left[ M_{class}[k_1] \, M_{sample}(x) - 0.1 \right]_+ ,$$

$$M_{sample}(x) = 1 - \left( 1 - p_g(x)[k_2] \right)^{0.5} ;$$

where sample $x$ belongs to class $k_2$; $A_{k_1,k_1}$ is the recall of class $k_1$; $A_{k,k_1}$ indicates the probability of class $k$ being mispredicted as class $k_2$; $p_g(x)[k_2]$ is the probability with which the global model correctly predicts that sample $x$ belongs to class $k_2$; and $M_{max}$ is the upper limit of the distillation loss term.
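The weight rule $M(x)[k_1] = M_{max} \cdot [M_{class}[k_1] M_{sample}(x) - 0.1]_+$ with $M_{sample}(x) = 1 - (1 - p_g(x)[k_2])^{0.5}$ can be illustrated with the sketch below. Several points are assumptions: the 0.5 is read as an exponent, $[\cdot]_+$ is read as clipping negative values to zero, and the entries of `m_class` are made-up numbers, because the expression for $M_{class}$ is not reproduced in this text. The function names are hypothetical.

```python
import numpy as np

def sample_credibility(p_g_true, power=0.5):
    """M_sample(x) = 1 - (1 - p_g(x)[k2]) ** power (0.5 read as an exponent)."""
    return 1.0 - (1.0 - p_g_true) ** power

def distillation_weights(m_class, p_g_true, m_max=1.0, threshold=0.1):
    """M(x)[k1] = M_max * max(M_class[k1] * M_sample(x) - threshold, 0)."""
    m_sample = sample_credibility(p_g_true)
    return m_max * np.maximum(np.asarray(m_class) * m_sample - threshold, 0.0)

# Hypothetical class credibilities for 3 classes; the global model assigns
# probability 0.75 to the true class of sample x.
m_class = np.array([0.9, 0.5, 0.1])
M = distillation_weights(m_class, p_g_true=0.75, m_max=2.0)
```

Here $M_{sample}(x) = 0.5$, so classes whose combined credibility falls below the 0.1 threshold receive a weight of zero and are excluded from distillation for this sample.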
The data privacy protection system based on the self-adaptive weight adjustment is characterized in that the client is a medical institution data center or a financial institution data center, and the new model is used for classifying input images or predicting risks of transaction data.
The invention also provides a storage medium for storing a program for executing the data privacy protection method based on the self-adaptive weight adjustment.
The invention also provides a client used for any data privacy protection system based on the self-adaptive weight adjustment.
Claims (10)
1. A data privacy protection method based on self-adaptive weight adjustment is characterized by comprising the following steps:
step 1, inputting an auxiliary data set marked with category labels into a global model to obtain the classification precision of the global model on each category of data, and using the classification precision as a category credibility matrix;
step 2, at least one client acquires the global model from a cloud, the client initializes a local model of the client, the client locally has a local private data set, and samples in the local private data set correspond to category labels; inputting the local private data set into the local model to obtain the output of the local model and the classification loss; inputting the local private data set into the global model local to the client to obtain the output of the global model and the classification precision of each sample in the local private data as a sample credibility matrix; obtaining distillation loss according to the sample credibility matrix and the class credibility matrix; training the local model based on a total loss consisting of the classification loss and the distillation loss; uploading the trained model parameters of the local model to the cloud;
step 3, the cloud performs weighted aggregation on the received model parameters to obtain a new model.
2. The data privacy protection method based on adaptive adjustment weight according to claim 1, wherein the step 3 comprises: replacing the global model with the new model, and repeating steps 1 to 3 until the total loss converges or a preset number of iterations is reached, thereby finishing the training and updating of the global model; saving the current global model as the final model; and classifying or predicting specified data.
3. The data privacy protection method based on adaptive adjustment weight according to claim 1, characterized in that the distillation loss is defined in terms of a class-wise weight vector $M$, the output $z_g$ of the global model and the output $z$ of the local model, where $\odot$ indicates element-wise multiplication;

the value of the weight vector $M$ at position $k_1$ is:

$$M(x)[k_1] = M_{max} \cdot \left[ M_{class}[k_1] \, M_{sample}(x) - 0.1 \right]_+ ,$$

$$M_{sample}(x) = 1 - \left( 1 - p_g(x)[k_2] \right)^{0.5} ;$$

where the sample $x$ belongs to class $k_2$; $A_{k_1,k_1}$ is the recall of class $k_1$; $A_{k,k_1}$ indicates the probability of class $k$ being mispredicted as class $k_2$; $p_g(x)[k_2]$ is the probability with which the global model correctly predicts that sample $x$ belongs to class $k_2$; and $M_{max}$ is the upper limit of the distillation loss term.
4. The adaptive weight adjustment based data privacy protection method according to claim 1, wherein the client is a medical institution data center or a financial institution data center, and the new model is used for classifying input images or performing risk prediction on transaction data.
5. A data privacy protection system based on self-adaptive weight adjustment is characterized by comprising:
the cloud end is used for inputting the auxiliary data set marked with the category label into the global model to obtain the classification precision of the global model on each category of data, and the classification precision is used as a category credibility matrix; obtaining a new model by performing weighted aggregation on the received model parameters;
the client side acquires the global model from the cloud side, initializes a local model of the client side, and locally has a local private data set, wherein samples in the local private data set correspond to the category label; inputting the local private data set into the local model to obtain the output of the local model and the classification loss; inputting the local private data set into the global model local to the client to obtain the output of the global model and the classification precision of each sample in the local private data as a sample credibility matrix; obtaining distillation loss according to the sample reliability matrix and the class reliability matrix; training the local model based on a total loss consisting of the classification loss and the distillation loss; and uploading the trained model parameters of the local model to the cloud.
6. The adaptive weight adjustment based data privacy protection system of claim 5, wherein the cloud is configured to: replace the global model with the new model, save the current global model as the final model, and classify or predict the specified data.
7. The adaptive adjustment weight-based data privacy protection system of claim 5, wherein the distillation loss is defined in terms of a class-wise weight vector $M$, the output $z_g$ of the global model and the output $z$ of the local model, where $\odot$ indicates element-wise multiplication;

the value of the weight vector $M$ at position $k_1$ is:

$$M(x)[k_1] = M_{max} \cdot \left[ M_{class}[k_1] \, M_{sample}(x) - 0.1 \right]_+ ,$$

$$M_{sample}(x) = 1 - \left( 1 - p_g(x)[k_2] \right)^{0.5} ;$$

where the sample $x$ belongs to class $k_2$; $A_{k_1,k_1}$ is the recall of class $k_1$; $A_{k,k_1}$ indicates the probability that class $k$ is mispredicted as class $k_2$; $p_g(x)[k_2]$ is the probability with which the global model correctly predicts that sample $x$ belongs to class $k_2$; and $M_{max}$ is the upper limit of the distillation loss term.
8. The adaptive weight adjustment based data privacy protection system of claim 5, wherein the client is a medical institution data center or a financial institution data center, and the new model is used for classifying input images or performing risk prediction on transaction data.
9. A storage medium storing a program for executing the adaptive weight-based data privacy protection method according to any one of claims 1 to 4.
10. A client terminal for use in the data privacy protection system based on the adaptive adjustment weight in any one of claims 5 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210798075.3A CN115495771A (en) | 2022-07-06 | 2022-07-06 | Data privacy protection method and system based on self-adaptive adjustment weight |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115495771A true CN115495771A (en) | 2022-12-20 |
Family
ID=84466175
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210798075.3A Pending CN115495771A (en) | 2022-07-06 | 2022-07-06 | Data privacy protection method and system based on self-adaptive adjustment weight |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115495771A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11930022B2 (en) * | 2019-12-10 | 2024-03-12 | Fortinet, Inc. | Cloud-based orchestration of incident response using multi-feed security event classifications |
CN116541769A (en) * | 2023-07-05 | 2023-08-04 | 北京邮电大学 | Node data classification method and system based on federal learning |
CN116935136A (en) * | 2023-08-02 | 2023-10-24 | 深圳大学 | Federal learning method for processing classification problem of class imbalance medical image |
CN117350373A (en) * | 2023-11-30 | 2024-01-05 | 艾迪恩(山东)科技有限公司 | Personalized federal aggregation algorithm based on local self-attention mechanism |
CN117350373B (en) * | 2023-11-30 | 2024-03-01 | 艾迪恩(山东)科技有限公司 | Personalized federal aggregation algorithm based on local self-attention mechanism |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN115495771A (en) | Data privacy protection method and system based on self-adaptive adjustment weight | |
Li et al. | Improved techniques for training adaptive deep networks | |
Luo et al. | Adapt to adaptation: Learning personalization for cross-silo federated learning | |
CN111460528B (en) | Multi-party combined training method and system based on Adam optimization algorithm | |
CN113378959B (en) | Zero sample learning method for generating countermeasure network based on semantic error correction | |
CN114154643A (en) | Federal distillation-based federal learning model training method, system and medium | |
Wang et al. | Distributed stochastic consensus optimization with momentum for nonconvex nonsmooth problems | |
Idrissi et al. | Fedbs: Learning on non-iid data in federated learning using batch normalization | |
Che et al. | FedTriNet: A pseudo labeling method with three players for federated semi-supervised learning | |
CN117634594A (en) | Self-adaptive clustering federal learning method with differential privacy | |
Zhao et al. | Optimizing widths with PSO for center selection of Gaussian radial basis function networks | |
CN117077765A (en) | Electroencephalogram signal identity recognition method based on personalized federal incremental learning | |
CN108921281A (en) | A kind of field adaptation method based on depth network and countermeasure techniques | |
CN113705724B (en) | Batch learning method of deep neural network based on self-adaptive L-BFGS algorithm | |
CN111353534A (en) | Graph data category prediction method based on adaptive fractional order gradient | |
US11899765B2 (en) | Dual-factor identification system and method with adaptive enrollment | |
Tran et al. | Personalized privacy-preserving framework for cross-silo federated learning | |
Di et al. | Variance-aware regret bounds for stochastic contextual dueling bandits | |
CN117113274A (en) | Heterogeneous network data-free fusion method and system based on federal distillation | |
Shi et al. | Efficient federated learning with enhanced privacy via lottery ticket pruning in edge computing | |
Guo et al. | Dual class-aware contrastive federated semi-supervised learning | |
CN116611535A (en) | Edge federation learning training method and system for heterogeneous data | |
Zhang et al. | Going deeper, generalizing better: An information-theoretic view for deep learning | |
Tun et al. | Federated learning with intermediate representation regularization | |
Wang et al. | Logit calibration for non-iid and long-tailed data in federated learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||