CN116882524A - Federated learning method and system for meeting personalized privacy protection requirements of participants - Google Patents
Federated learning method and system for meeting personalized privacy protection requirements of participants
- Publication number: CN116882524A
- Application number: CN202310707082.2A
- Authority: CN (China)
- Prior art keywords: privacy, participants, server, budgets, parameters
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
Abstract
A federated learning method and system for meeting the personalized privacy protection requirements of participants, relating to the technical field of network and information security. The method solves the problem of participant privacy leakage caused by an untrusted server in the federated learning scenario. The method comprises the following steps: each participant selects a privacy budget, encrypts it, and sends it to the server; the server sums the received encrypted budgets and, jointly with the participants, decrypts the result to obtain the sum of the privacy budgets, which it sends back to the participants; each participant divides its own privacy budget by the sum to obtain its aggregation weight; the server sends the global model parameters to the participants, which train on them to obtain local models; each participant multiplies its local gradient by its aggregation weight and then performs gradient clipping; the clipped parameters are perturbed and sent to the server; the server receives the gradient parameters and aggregates them to generate a global model. The method is applied in the field of private-data protection.
Description
Technical Field
The invention relates to the technical field of network and information security, in particular to a federated learning method for meeting the personalized privacy protection requirements of participants.
Background
With the rapid development of big data and artificial intelligence, AI has entered many aspects of our lives, such as finance, medical care, and autonomous driving. Machine learning is the core technology of artificial intelligence, and its progress is what has driven AI's rapid development. At the same time, privacy problems in machine learning have received widespread attention: as privacy concerns increase, users become less willing to share data. Paradoxically, artificial intelligence depends on large-scale data collection and fusion; if complete and rich information cannot be obtained to train models and develop the technology, the development of AI applications is severely limited.
Federated learning arose against this background, in which the contradiction between data silos and the need for data fusion has become increasingly prominent. Its core idea is to train a machine learning model on separate datasets distributed across different devices or parties, which protects local data privacy to some extent: participants upload only model parameters or gradients and never expose potentially sensitive local data. Although federated learning is an effective means of protecting private information in machine learning, a risk of privacy disclosure remains. Studies have shown that the gradients or model parameters uploaded by participants can still leak privacy: an attacker can analyze differences between the original model information and the updated model information, using attack techniques such as differential attacks and model inversion attacks, to extract specific private information from the model.
To address the privacy concerns of federated learning, researchers have proposed several schemes, such as federated learning based on homomorphic encryption or on secure multiparty computation. Homomorphic encryption is expensive and requires a large amount of computation, so it is impractical for iterative model training over large-scale data. Federated learning based on secure multiparty computation likewise struggles with computational efficiency, since completing one round of training in that framework consumes substantial computational resources. By contrast, federated learning based on differential privacy has lower communication and computation overhead and is therefore widely used to protect federated learning privacy. Current work on differentially private federated learning falls into two types: 1) local differential privacy, in which each user perturbs its model parameters before uploading them; 2) centralized differential privacy, in which a central aggregation server perturbs the aggregated gradients. A drawback of local differential privacy, however, is that it provides a uniform level of privacy protection for all users.
Because users differ in cultural values, income, age, and legal, national, or professional background, a uniform level of privacy protection is impractical. When a dataset contains many users with different privacy expectations, the usefulness of plain local differential privacy is limited. One option is to set the global privacy level high enough for the most demanding user, but this may inject an unacceptable amount of noise into the analysis output and yield poor utility; setting a lower privacy level, on the other hand, leaves the more privacy-sensitive users underprotected. Furthermore, a single privacy level wastes a large amount of privacy budget for some clients, which often degrades model accuracy. Plain local differential privacy thus fails to account for users' differing privacy requirements, and protecting everyone at one uniform level is unrealistic.
Disclosure of Invention
The invention provides a federated learning method that meets the personalized privacy protection requirements of participants, aiming to solve the privacy leakage suffered by participants when the server in a federated learning scenario is untrusted. The scheme is as follows:
A federated learning method for meeting the personalized privacy protection requirements of participants, the method comprising:
S1: two or more participants each select a privacy budget according to their privacy requirements, encrypt it, and send the encrypted privacy budget to the server;
S2: the server receives the encrypted privacy budgets and sums them; the server and the participants then jointly decrypt the summed ciphertext to obtain the sum of the privacy budgets, which the server sends to the participants;
S3: each participant divides its own privacy budget by the sum of the privacy budgets to obtain its aggregation weight;
S4: the server sends the global model parameters to the participants, and each participant trains locally on the received parameters to obtain a local model;
S5: each participant multiplies the parameters of its local model by its aggregation weight and then performs gradient clipping;
S6: personalized noise is added to the parameters clipped in step S5 for perturbation, and the perturbed parameters are sent to the server;
S7: the server receives the parameters sent by the participants in step S6 and aggregates them to generate a global model, which is used to predict and analyze the specified problems in the privacy-protection scenario.
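For illustration only, the following minimal Python sketch walks one round of steps S1-S7 end to end. It is not the patented construction: the homomorphic summation of the budgets in S1-S2 is elided (the sketch computes the total in the clear and only hands the server the sum), and the least-squares model, clipping threshold, step size, and Laplace calibration 2·clip/ε_i are all assumptions for the example. Embodiment eleven below gives the lattice-based realization of the encrypted budget aggregation.

```python
import numpy as np

rng = np.random.default_rng(42)

class Participant:
    def __init__(self, eps, x, y):
        self.eps = eps            # privacy budget chosen by this participant (S1)
        self.x, self.y = x, y     # private local dataset
        self.weight = None        # aggregation weight w_i, set in S3

    def perturbed_update(self, w_global, clip=1.0):
        # S4: local gradient of a least-squares model on the local data
        grad = 2 * self.x.T @ (self.x @ w_global - self.y) / len(self.y)
        g = grad * self.weight                                   # S5: scale by w_i
        g = g * min(1.0, clip / (np.linalg.norm(g, 1) + 1e-12))  # S5: gradient clipping
        noise = rng.laplace(0.0, 2 * clip / self.eps, g.shape)   # S6: personalized noise
        return g + noise

parts = [Participant(e, rng.normal(size=(50, 3)), rng.normal(size=50))
         for e in (0.5, 1.0, 4.0)]

# S1-S2: in the real scheme the budgets are summed under homomorphic
# encryption so the server never sees an individual eps_i; this sketch
# just computes the total in the clear.
eps_sum = sum(p.eps for p in parts)
for p in parts:
    p.weight = p.eps / eps_sum                                   # S3

w_global = np.zeros(3)
for _ in range(10):                                              # training rounds
    updates = [p.perturbed_update(w_global) for p in parts]      # S4-S6
    w_global = w_global - 0.1 * sum(updates)                     # S7: aggregate
print(w_global)
```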
Further, in a preferred mode, step S1 specifically comprises:
S11: there are m participants {c_1, c_2, …, c_m}, each holding its own original dataset from {d_1, d_2, …, d_m}; each participant selects its own privacy budget according to its own privacy requirements;
S12: each participant encrypts its privacy budget by homomorphic encryption under its own key and sends the ciphertext to the server.
Further, in a preferred mode, step S2 comprises:
S21: the server collects the model parameters of the participants and aggregates them for distributed training;
S22: the server receives the encrypted privacy budgets and sums them; without learning any participant's individual privacy budget, the server and the participants jointly decrypt the sum, obtain the plaintext sum of the privacy budgets, and send it to the participants.
Further, in a preferred mode, step S4 comprises:
the server sends the global model parameters to the participants; after receiving them, each participant trains locally, performing multiple iterations on its local dataset with stochastic gradient descent to obtain a local model.
Based on the same inventive concept, the invention also provides a federated learning system that meets the personalized privacy protection requirements of participants, the system comprising:
an encryption module, by which two or more participants each select a privacy budget according to their privacy requirements, encrypt it, and send the encrypted privacy budget to the server;
a decryption module, by which the server receives the encrypted privacy budgets and sums them, jointly decrypts the sum with the participants, and sends the resulting sum of privacy budgets to the participants;
an aggregation-weight acquisition module, by which each participant divides its own privacy budget by the sum of the privacy budgets to obtain its aggregation weight;
a training module, by which the server sends the global model parameters to the participants and each participant trains locally on the received parameters to obtain a local model;
a clipping module, by which each participant multiplies the parameters of its local model by its aggregation weight and then performs gradient clipping;
a parameter updating module, by which personalized noise is added to the parameters clipped by the clipping module for perturbation and the perturbed parameters are sent to the server;
an output module, by which the server receives the parameters sent by the participants via the parameter updating module and aggregates them to generate a global model, which is used to predict and analyze the specified problems in the privacy-protection scenario.
Further, in a preferred mode, the encryption module is specifically such that:
there are m participants {c_1, c_2, …, c_m}, each holding its own original dataset from {d_1, d_2, …, d_m}; each participant selects its own privacy budget according to its own privacy requirements;
each participant encrypts its privacy budget by homomorphic encryption under its own key and sends the ciphertext to the server.
Further, in a preferred mode, the decryption module comprises:
the server collects the model parameters of the participants and aggregates them for distributed training;
the server receives the encrypted privacy budgets and sums them; without learning any participant's individual privacy budget, the server and the participants jointly decrypt the sum, obtain the plaintext sum of the privacy budgets, and send it to the participants.
Further, in a preferred mode, the training module comprises:
the server sends the global model parameters to the participants; after receiving them, each participant trains locally, performing multiple iterations on its local dataset with stochastic gradient descent to obtain a local model.
Based on the same inventive concept, the invention further provides a computer-readable storage medium storing a computer program that executes any one of the above federated learning methods for meeting the personalized privacy protection requirements of participants.
Based on the same inventive concept, the invention also provides a computer device comprising a memory and a processor, the memory storing a computer program; when the processor runs the computer program stored in the memory, it executes any one of the above federated learning methods for meeting the personalized privacy protection requirements of participants.
The invention has the following advantages:
The method and system solve the problem of participant privacy leakage caused by an untrusted server in the federated learning scenario.
To remedy the drawback of federated learning protected by plain local differential privacy, the invention introduces personalized privacy protection built on the participants' differing privacy requirements. During federated training, each participant selects its own privacy budget according to its own privacy requirements and adds random noise to its local model parameters to perturb them; the server then aggregates the collected parameters, and the process repeats for many rounds until the model converges. The resulting model can predict and analyze the specified problems in the privacy-protection scenario while better protecting user privacy.
The invention provides a personalized privacy-preserving federated learning framework based on local differential privacy, federated learning, and related techniques, which withstands privacy attacks by an untrusted server. Participants freely select their own privacy budgets, achieving personalized privacy protection; the server cannot learn any participant's specific privacy budget, which prevents an untrusted server from mounting privacy attacks against participants with large budgets. The invention also gives the server a better aggregation result and hence a better global model.
The invention addresses the fact that users differ in their attitudes toward privacy: personalized privacy protection lets different users configure protection according to their own requirements. This improves data utility and also increases user willingness to participate, since users who can control their privacy are more likely to contribute their data for analysis.
The method and system are applied in the field of private-data protection.
Drawings
FIG. 1 is a flowchart of the federated learning method for meeting personalized privacy protection requirements of participants according to embodiment one;
FIG. 2 is a sequence diagram of the steps according to embodiment eleven.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments.
Embodiment one. This embodiment is described with reference to FIG. 1. The federated learning method of this embodiment, which meets the personalized privacy protection requirements of participants, comprises:
S1: two or more participants each select a privacy budget according to their privacy requirements, encrypt it, and send the encrypted privacy budget to the server;
S2: the server receives the encrypted privacy budgets and sums them; the server and the participants then jointly decrypt the summed ciphertext to obtain the sum of the privacy budgets, which the server sends to the participants;
S3: each participant divides its own privacy budget by the sum of the privacy budgets to obtain its aggregation weight;
S4: the server sends the global model parameters to the participants, and each participant trains locally on the received parameters to obtain a local model;
S5: each participant multiplies the parameters of its local model by its aggregation weight and then performs gradient clipping;
S6: personalized noise is added to the parameters clipped in step S5 for perturbation, and the perturbed parameters are sent to the server;
S7: the server receives the parameters sent by the participants in step S6 and aggregates them to generate a global model, which is used to predict and analyze the specified problems in the privacy-protection scenario.
In this embodiment the aggregation weights account for factors such as each participant's data contribution, computing power, and trust level, so after the gradient parameters are multiplied by the aggregation weights, the result represents the participant's contribution. The server can then aggregate the participants' perturbed parameters into new global model parameters without learning any individual participant's parameters. This protects data privacy and security while evening out differences between participants, achieving a better federated learning result.
To remedy the drawback of federated learning protected by plain local differential privacy, the method of this embodiment introduces personalized privacy protection built on the participants' differing privacy requirements. During federated training, each participant selects its own privacy budget according to its own privacy requirements and adds random noise to its local model parameters to perturb them; the server then aggregates the collected parameters, and the process repeats for many rounds until the model converges. The resulting model can predict and analyze the specified problems in the privacy-protection scenario while better protecting user privacy.
Embodiment two. This embodiment further defines the federated learning method of embodiment one, wherein step S1 specifically comprises:
S11: there are m participants {c_1, c_2, …, c_m}, each holding its own original dataset from {d_1, d_2, …, d_m}; each participant selects its own privacy budget according to its own privacy requirements;
S12: each participant encrypts its privacy budget by homomorphic encryption under its own key and sends the ciphertext to the server.
Steps S11 and S12 of this embodiment protect the privacy and security of the participants' data while minimizing the risk of that data being disclosed or attacked. Specifically, the benefits of these operations include the following:
Protecting data privacy: the participants' data often contain personally sensitive information or trade secrets, and exposing such data directly to a third party may cause privacy leakage. By homomorphically encrypting the data before sending it to the server, the participants protect data privacy and reduce the risk of leakage.
Protecting data security: the participants' data may also face security threats such as hacking or malicious tampering. Homomorphic encryption secures the encryption and transmission process against interception or tampering.
Improving controllability: participants configure their privacy budgets according to their own needs, flexibly balancing utility against the degree of privacy protection, which increases their control over the federated learning process.
Protecting privacy budgets: the privacy budget, a measure of how sensitively data is treated, is important in federated learning. With homomorphic encryption, a participant can take part in the computation without exposing its privacy budget, preventing hackers or hostile participants from learning it.
Maintaining responsibility for data management: by encrypting the transmitted data homomorphically, each participant minimizes the server's handling of its data and retains better control over its own data resources. This also helps participants manage data risk and keep complete control over their own data.
Embodiment three. This embodiment further defines the federated learning method of embodiment one, wherein step S2 comprises:
S21: the server collects the model parameters of the participants and aggregates them for distributed training;
S22: the server receives the encrypted privacy budgets and sums them; without learning any participant's individual privacy budget, the server and the participants jointly decrypt the sum, obtain the plaintext sum of the privacy budgets, and send it to the participants.
In step S21 of this embodiment, by aggregating the limited training information uploaded by the participants in a predetermined way, each participating unit learns from its own data while the model never directly holds the data of the other units. Step S22 handles the encryption mechanism of federated learning: the encrypted privacy budgets of all participants are decrypted only as an aggregated total. S22 operates inside S21 and guarantees data privacy through the encryption mechanism. Specifically, the aggregator computes the sum of the ciphertexts uploaded by the participants and never directly decrypts any individually uploaded encrypted value; the participants compute and upload their encrypted privacy budgets before uploading their perturbed information, so the aggregator learns only the total budget while the local models are updated under encryption. This preserves data privacy and minimizes the damage an attacker could cause through leakage of data or model information.
Embodiment four. This embodiment further defines the federated learning method of embodiment one, wherein step S4 comprises:
the server sends the global model parameters to the participants; after receiving them, each participant trains locally, performing multiple iterations on its local dataset with stochastic gradient descent to obtain a local model.
In this embodiment, fusing the global model parameters into the local model lets each participant refine its local model through local iteration, improving the performance and accuracy of the shared model as a whole. During local training a participant can adjust the learning rate, regularization, and other hyperparameters to its own needs, increasing its control over the model; local training also lets a participant exploit the characteristics of its local data more effectively, optimizing the local model and improving the training result. Finally, local training guarantees that a participant's raw data is never uploaded, further strengthening data privacy protection and potentially reducing the burden of data transmission and storage.
Embodiment five. The federated learning system of this embodiment, which meets the personalized privacy protection requirements of participants, comprises:
an encryption module, by which two or more participants each select a privacy budget according to their privacy requirements, encrypt it, and send the encrypted privacy budget to the server;
a decryption module, by which the server receives the encrypted privacy budgets and sums them, jointly decrypts the sum with the participants, and sends the resulting sum of privacy budgets to the participants;
an aggregation-weight acquisition module, by which each participant divides its own privacy budget by the sum of the privacy budgets to obtain its aggregation weight;
a training module, by which the server sends the global model parameters to the participants and each participant trains locally on the received parameters to obtain a local model;
a clipping module, by which each participant multiplies the parameters of its local model by its aggregation weight and then performs gradient clipping;
a parameter updating module, by which personalized noise is added to the parameters clipped by the clipping module for perturbation and the perturbed parameters are sent to the server;
an output module, by which the server receives the parameters sent by the participants via the parameter updating module and aggregates them to generate a global model, which is used to predict and analyze the specified problems in the privacy-protection scenario.
Embodiment six. This embodiment further defines the federated learning system of embodiment five, wherein the encryption module is specifically such that:
there are m participants {c_1, c_2, …, c_m}, each holding its own original dataset from {d_1, d_2, …, d_m}; each participant selects its own privacy budget according to its own privacy requirements;
each participant encrypts its privacy budget by homomorphic encryption under its own key and sends the ciphertext to the server.
Embodiment seven. This embodiment further defines the federated learning system of embodiment five, wherein the decryption module comprises:
the server collects the model parameters of the participants and aggregates them for distributed training;
the server receives the encrypted privacy budgets and sums them; without learning any participant's individual privacy budget, the server and the participants jointly decrypt the sum, obtain the plaintext sum of the privacy budgets, and send it to the participants.
Embodiment eight. This embodiment further defines the federated learning system of embodiment five, wherein the training module comprises:
the server sends the global model parameters to the participants; after receiving them, each participant trains locally, performing multiple iterations on its local dataset with stochastic gradient descent to obtain a local model.
Embodiment nine. The computer-readable storage medium of this embodiment stores a computer program that executes the federated learning method meeting the personalized privacy protection requirements of participants according to any one of embodiments one to four.
Embodiment ten. The computer device of this embodiment comprises a memory and a processor, the memory storing a computer program; when the processor runs the computer program stored in the memory, it executes the federated learning method meeting the personalized privacy protection requirements of participants according to any one of embodiments one to four.
Embodiment eleven. This embodiment is described with reference to FIG. 2. It provides a concrete example of the federated learning method of embodiment one, which meets the personalized privacy protection requirements of participants, and also serves to illustrate embodiments two to four. Specifically:
Step 1: each participant selects its own privacy budget according to its own privacy requirements, encrypts it, and sends the ciphertext to the server;
Step 1 comprises the following steps:
Step 1.1: each participant selects its own privacy budget; the smaller the privacy budget, the higher the degree of privacy protection, and vice versa. For a given security parameter λ, set the dimension of the lattice problem to n, the ciphertext modulus to q, the key distribution to x, and the error distribution on R to y. Generate a random vector a and publish the common parameters (n, q, x, y, a). Each participant samples a key s_i ← x and an error vector e_i ← y^d, then computes its public key b_i = -s_i·a + e_i (mod q). All participants cooperatively compute the aggregated public key b = Σ_{i=1}^m b_i (mod q).
Step 1.2: each participant encrypts its privacy budget with its own key. Let ε_i ∈ R be the plaintext privacy budget and a = a[0]. Sample v_i ← x and errors e_0, e_1 ← y, compute the ciphertext of the privacy budget ct_i = (c_0^i, c_1^i), where c_0^i = v_i·b + e_0 + ε_i (mod q) and c_1^i = v_i·a + e_1 (mod q), and send it to the server.
Step 2: the server sums the ciphertext privacy budgets uploaded by the participants, then jointly decrypts the result with the participants to obtain the sum of the privacy budgets, and sends it to the participants;
Step 2 comprises the following steps:
Step 2.1: after receiving the encrypted privacy budgets sent by the participants, the server sums them homomorphically to obtain C_sum = Σ_{i=1}^m ct_i = (c_0, c_1).
Step 2.2: all participants jointly decrypt the ciphertext: each participant c_i samples a decryption error e_i* ← y, computes its decryption share D_i = s_i·c_1 + e_i* (mod q), and sends D_i to the server.
Step 2.3: after receiving the decryption shares from all participants, the server recovers the plaintext sum of the privacy budgets ε = c_0 + Σ_{i=1}^m D_i (mod q) and sends ε to the participants.
Step 3: each participant divides its own privacy budget by the sum of the privacy budgets to obtain its aggregation weight w_i = ε_i/ε.
Step 4: the server sends the global model parameters to the participators, and the participators perform local training according to the parameters sent by the server to obtain a local model;
the step 4 comprises the following steps:
step 4.1: the server sends the parameter g of the global model to the participants, and the participants perform local training after receiving the parameter g. Training with local data set in each local iteration, generating local model after training with random gradient descent method, wherein learning rate is eta, and gradient of the obtained local model is g i 。
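A minimal sketch of step 4.1, assuming a synthetic least-squares task and illustrative hyperparameters (η, epoch count, batch size); the participant starts from the received global parameter g and reports its local update g_i:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))                 # local dataset d_i (synthetic)
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=200)

def local_train(g_global, eta=0.05, epochs=5, batch=32):
    w = g_global.copy()                        # start from the global parameter g
    for _ in range(epochs):
        idx = rng.permutation(len(y))
        for s in range(0, len(y), batch):
            j = idx[s:s + batch]
            grad = 2 * X[j].T @ (X[j] @ w - y[j]) / len(j)
            w -= eta * grad                    # one stochastic gradient step
    return w - g_global                        # local update g_i reported upward

g_i = local_train(np.zeros(10))
```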
Step 5: multiplying the parameters of the model by the aggregation weight by each participant, then performing gradient clipping, adding personalized noise to perform disturbance, and then sending to a server;
said step 5 comprises the steps of:
step 5.1: local gradient g i Multiplied by the aggregate weight g' i =g i ·(ε i Epsilon), gradient clipping of local gradients using a threshold σ of gradient clipping, g=clip (g' i ,σ)。
Step 5.2: setting the privacy budget epsilon according to the participants themselves i Adding random noise, such as Laplacian noise or Gaussian noise, and then perturbing the local gradientAnd sending the data to a server.
Step 6: the server aggregates the model parameters sent by the participants to generate a total office model, namely
While the preferred embodiments of the present disclosure have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the disclosure.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present disclosure without departing from the spirit or scope of the disclosure. Thus, the present disclosure is intended to include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.
It will be appreciated by those skilled in the art that embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that the above embodiments merely illustrate the technical solutions of the present disclosure and do not limit its scope. Although the present disclosure has been described in detail with reference to the above embodiments, those of ordinary skill in the art will understand that various alterations, modifications, and equivalents of the specific embodiments, which would occur to persons skilled in the art upon reading the disclosure, are intended to fall within the scope of the appended claims.
Claims (10)
1. A federated learning method for meeting the personalized privacy protection requirements of participants, the method comprising:
S1: two or more participants each select a privacy budget according to their privacy requirements, encrypt it, and send the encrypted privacy budget to the server;
S2: the server receives the encrypted privacy budgets and sums them; the server and the participants then jointly decrypt the summed ciphertext to obtain the sum of the privacy budgets, which the server sends to the participants;
S3: each participant divides its own privacy budget by the sum of the privacy budgets to obtain its aggregation weight;
S4: the server sends the global model parameters to the participants, and each participant trains locally on the received parameters to obtain a local model;
S5: each participant multiplies the parameters of its local model by its aggregation weight and then performs gradient clipping;
S6: personalized noise is added to the parameters clipped in step S5 for perturbation, and the perturbed parameters are sent to the server;
S7: the server receives the parameters sent by the participants in step S6 and aggregates them to generate a global model, which is used to predict and analyze the specified problems in the privacy-protection scenario.
2. The federated learning method for meeting the personalized privacy protection requirements of participants according to claim 1, wherein step S1 specifically comprises:
S11: there are m participants {c_1, c_2, …, c_m}, each holding its own original dataset from {d_1, d_2, …, d_m}; each participant selects its own privacy budget according to its own privacy requirements;
S12: each participant encrypts its privacy budget by homomorphic encryption under its own key and sends the ciphertext to the server.
3. The federated learning method for meeting the personalized privacy protection requirements of participants according to claim 1, wherein step S2 comprises:
S21: the server collects the model parameters of the participants and aggregates them for distributed training;
S22: the server receives the encrypted privacy budgets and sums them; without learning any participant's individual privacy budget, the server and the participants jointly decrypt the sum, obtain the plaintext sum of the privacy budgets, and send it to the participants.
4. The federated learning method for meeting the personalized privacy protection requirements of participants according to claim 1, wherein step S4 comprises:
the server sends the global model parameters to the participants; after receiving them, each participant trains locally, performing multiple iterations on its local dataset with stochastic gradient descent to obtain a local model.
5. A federated learning system for meeting the personalized privacy protection requirements of participants, the system comprising:
an encryption module, by which two or more participants each select a privacy budget according to their privacy requirements, encrypt it, and send the encrypted privacy budget to the server;
a decryption module, by which the server receives the encrypted privacy budgets and sums them, jointly decrypts the sum with the participants, and sends the resulting sum of privacy budgets to the participants;
an aggregation-weight acquisition module, by which each participant divides its own privacy budget by the sum of the privacy budgets to obtain its aggregation weight;
a training module, by which the server sends the global model parameters to the participants and each participant trains locally on the received parameters to obtain a local model;
a clipping module, by which each participant multiplies the parameters of its local model by its aggregation weight and then performs gradient clipping;
a parameter updating module, by which personalized noise is added to the parameters clipped by the clipping module for perturbation and the perturbed parameters are sent to the server;
an output module, by which the server receives the parameters sent by the participants via the parameter updating module and aggregates them to generate a global model, which is used to predict and analyze the specified problems in the privacy-protection scenario.
6. The federated learning system for meeting the personalized privacy protection requirements of participants according to claim 5, wherein the encryption module is specifically such that:
there are m participants {c_1, c_2, …, c_m}, each holding its own original dataset from {d_1, d_2, …, d_m}; each participant selects its own privacy budget according to its own privacy requirements;
each participant encrypts its privacy budget by homomorphic encryption under its own key and sends the ciphertext to the server.
7. The federated learning system for meeting the personalized privacy protection requirements of participants according to claim 5, wherein the decryption module comprises:
the server collects the model parameters of the participants and aggregates them for distributed training;
the server receives the encrypted privacy budgets and sums them; without learning any participant's individual privacy budget, the server and the participants jointly decrypt the sum, obtain the plaintext sum of the privacy budgets, and send it to the participants.
8. The federated learning system for meeting the personalized privacy protection requirements of participants according to claim 5, wherein the training module comprises:
the server sends the global model parameters to the participants; after receiving them, each participant trains locally, performing multiple iterations on its local dataset with stochastic gradient descent to obtain a local model.
9. A computer-readable storage medium storing a computer program that executes the federated learning method for meeting the personalized privacy protection requirements of participants according to any one of claims 1-4.
10. A computer device comprising a memory and a processor, the memory storing a computer program, wherein when the processor runs the computer program stored in the memory, the processor executes the federated learning method for meeting the personalized privacy protection requirements of participants according to any one of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310707082.2A (CN116882524A) | 2023-06-14 | 2023-06-14 | Federated learning method and system for meeting personalized privacy protection requirements of participants
Publications (1)
Publication Number | Publication Date |
---|---|
CN116882524A (en) | 2023-10-13
Family
ID=88255805
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310707082.2A | Federated learning method and system for meeting personalized privacy protection requirements of participants | 2023-06-14 | 2023-06-14
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116882524A (en) |
2023
- 2023-06-14: CN application CN202310707082.2A, publication CN116882524A (en), status Pending
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117155569A (en) * | 2023-10-30 | 2023-12-01 | 天清数安(天津)科技有限公司 | Privacy calculation method and system for fine-tuning pre-training model |
CN117155569B (en) * | 2023-10-30 | 2024-01-09 | 天清数安(天津)科技有限公司 | Privacy calculation method and system for fine-tuning pre-training model |
CN117910047A (en) * | 2024-03-20 | 2024-04-19 | 广东电网有限责任公司 | Multi-key federal learning method, device, terminal equipment and medium |
CN118586041A (en) * | 2024-08-02 | 2024-09-03 | 国网安徽省电力有限公司信息通信分公司 | Data-heterogeneity-resistant electric power federal learning privacy enhancement method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |