CN116739079A - Self-adaptive privacy protection federal learning method - Google Patents

Self-adaptive privacy protection federal learning method

Info

Publication number
CN116739079A
CN116739079A (application CN202310518209.6A)
Authority
CN
China
Prior art keywords
privacy
model
gradient
training
budget
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310518209.6A
Other languages
Chinese (zh)
Other versions
CN116739079B (en)
Inventor
王志波
胡佳慧
申永生
任奎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou City Brain Co ltd
Zhejiang University ZJU
Original Assignee
Hangzhou City Brain Co ltd
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou City Brain Co ltd, Zhejiang University ZJU filed Critical Hangzhou City Brain Co ltd
Priority to CN202310518209.6A
Publication of CN116739079A
Application granted
Publication of CN116739079B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/098 Distributed learning, e.g. federated learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577 Assessing vulnerabilities and evaluating computer system security
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioethics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Aiming at the vulnerability of federated learning to gradient attacks, the method completes federated learning model training under privacy protection, provides adaptive privacy-protection strength for different communication rounds, ensures that the resulting model remains highly usable while resisting gradient attacks, and protects the security of clients' training data. The invention reveals that gradient attacks are heterogeneous across communication rounds and accordingly proposes a leakage-risk-aware privacy decomposition scheme. The method quantifies the privacy leakage risk of the parameters shared in the current communication round and adaptively allocates the privacy budget, thereby balancing data privacy and model usability across communication rounds. In the client training stage, the invention provides an adaptive differentially private stochastic gradient descent method that dynamically attenuates the noise and clipping coefficients, effectively mitigating the negative effect of the differential privacy mechanism on model training and improving model accuracy and convergence.

Description

Self-adaptive privacy protection federal learning method
Technical Field
The invention relates to the fields of federated learning (Federated Learning) security and data privacy, and in particular to an adaptive privacy-preserving federated learning method.
Background
Federated learning is a decentralized machine learning technique in which multiple client devices jointly learn a neural network model without sending their local data to a cloud server. Compared with centralized machine learning, federated learning significantly alleviates clients' privacy concerns because it does not require their private data to be centralized. However, researchers have found that federated learning still faces various data security and privacy problems: the model parameters and gradient information exchanged by the participants may leak data privacy, so that an attacker can reconstruct a client's local training data from this information.
To address this privacy leakage problem, privacy protection techniques for federated learning fall mainly into two categories. 1) Encryption-based privacy protection mainly applies cryptographic techniques; the mainstream methods are homomorphic encryption and secure multiparty computation. Such methods preserve the accuracy of the original model training and achieve strong privacy guarantees, but they incur significant communication and computation costs and are therefore ill suited to federated learning scenarios with many participants and large amounts of data. 2) The dominant perturbation-based approach is differential privacy (DP). Differential privacy perturbs the gradient by adding calibrated noise to the gradient information, thereby preventing privacy attacks that rely on leakage of the true gradient. Although its efficiency and light weight make differential privacy well suited to federated learning, existing DP-based federated learning methods cannot effectively balance privacy against model accuracy; in particular, when a highly usable model is required, the privacy of the data cannot be fully guaranteed.
Therefore, in view of the problems of the prior art, there is an urgent need to study how differential privacy can be used to realize an efficient, lightweight and flexible adaptive privacy-preserving federated learning method that meets national requirements for data security protection.
Disclosure of Invention
The invention remedies the deficiencies of the prior art and provides an adaptive privacy-preserving federated learning method that both preserves the accuracy of joint model training and resists gradient attacks, so that client training data cannot be recovered. It is realized by the following technical scheme:
the invention discloses a self-adaptive privacy protection federal learning method, which comprises the following steps:
1): an initialization stage;
1.1) Collecting public samples related to the federated learning task as an evaluation dataset;
1.2 Initializing a global model;
1.3 Initializing the total privacy budget (a measure of privacy strength in a differential privacy mechanism);
2): evaluating privacy disclosure risk of the current communication turn according to the evaluation data set and the global model;
3): decomposing the total privacy budget according to the estimated privacy disclosure risk to obtain the privacy budget of the current communication round;
4): according to the privacy budget and the current global model, performing local model training by adopting a self-adaptive differential privacy random gradient descent method (ADP-SGD) to obtain a new local model;
4.1) Converting the allocated budget into the zero-concentrated differential privacy (zCDP) paradigm and initializing the local model with the global model;
4.2 Calculating the noise level used by the current model training period according to the privacy budget and the current model training period;
4.3 Updating the remaining privacy budget in accordance with the noise level;
4.4 Calculating gradient clipping coefficients used in the current model iteration period according to the current model iteration period;
4.5 Calculating the gradient of the iteration cycle of the current model according to the back propagation;
4.6) Clipping the gradient generated in the current model iteration according to the gradient clipping coefficient;
4.7 Calculating noise disturbance parameters according to the noise level and the gradient clipping coefficients, and adding Gaussian noise on the clipped gradient according to the noise disturbance parameters to obtain a post-disturbance gradient;
4.8) Updating the local model (initialized to the global model) by gradient descent with the perturbed gradient;
Iterate 4.2) through 4.8) until the remaining privacy budget computed in 4.3) falls below 0 or the training epochs end;
5) Aggregating the local models to obtain a new global model;
Steps 2) to 5) are executed cyclically until the global model converges (the model accuracy no longer improves, or the improvement is smaller than a threshold of 10^-4), and the federated learning model training then stops.
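To make the control flow of steps 2) to 5) concrete, the following is a minimal Python sketch of one possible orchestration; the helper names (evaluate_leakage_risk, allocate_budget, adp_sgd_train, aggregate, evaluate_accuracy) and the client objects exposing a dataset are hypothetical placeholders for the procedures detailed below, not functions defined in the patent.

# Minimal sketch of the adaptive privacy-preserving federated training loop.
# All helper functions are hypothetical placeholders for the procedures
# described in the surrounding text.
def federated_training(global_model, clients, eval_set,
                       total_budget, total_rounds, converge_tol=1e-4):
    consumed_budget = 0.0
    prev_acc = None
    for t in range(1, total_rounds + 1):
        # Step 2: evaluate the privacy leakage risk of the current round.
        risk = evaluate_leakage_risk(global_model, eval_set)
        # Step 3: decompose the total budget to obtain this round's budget.
        round_budget = allocate_budget(total_budget, consumed_budget,
                                       risk, t, total_rounds)
        consumed_budget += round_budget
        # Step 4: selected clients train locally with ADP-SGD.
        local_models = [adp_sgd_train(global_model, client, round_budget)
                        for client in clients]
        # Step 5: aggregate the local models into a new global model.
        global_model = aggregate(local_models,
                                 [len(client.dataset) for client in clients])
        # Stop when the accuracy gain falls below the convergence threshold.
        acc = evaluate_accuracy(global_model, eval_set)
        if prev_acc is not None and acc - prev_acc < converge_tol:
            break
        prev_acc = acc
    return global_model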
As a further improvement, evaluating the privacy leakage risk of the current communication round is specifically: the server tests the accuracy of the current global model on the collected evaluation dataset and uses it as a privacy leakage risk indicator representing the risk of the current communication round. Considering that the global model accuracy may fluctuate slightly within a certain range, and to reduce the influence of such fluctuations on the risk evaluation, let S_t denote the accuracy of the global model at communication round t; the model accuracy increment is then expressed as:
(formula omitted)
When training has just started, valid global-model accuracy data are lacking, so the corresponding increment values are set to 1.
As a further improvement, the privacy budget decomposition is specifically: after the privacy leakage risk measurement is completed, the privacy budget ε_t of the current communication round is calculated from the privacy risk of that round:
(formula omitted)
where ε is the total privacy budget, ε_c is the budget already consumed, T is the preset total number of communication rounds, and the global-model accuracy increment of the current round is used to assess the privacy leakage risk.
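The allocation formula above is likewise not reproduced; the sketch below only captures the stated behaviour (unconsumed budget spread over the remaining rounds, shrunk when the accuracy increment and thus the leakage risk is large) and should be read as an illustration under that assumption.

def allocate_budget(total_eps, consumed_eps, acc_increment, t, total_rounds):
    """Illustrative sketch only: spread the unconsumed budget over the remaining
    rounds and reduce this round's share when the accuracy increment (i.e. the
    estimated leakage risk) is large.  Not the patent's exact formula."""
    remaining = max(total_eps - consumed_eps, 0.0)
    per_round = remaining / max(total_rounds - t + 1, 1)
    risk_weight = 1.0 / (1.0 + acc_increment)  # higher risk -> smaller budget
    return per_round * risk_weight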
As a further improvement, the adaptive differentially private stochastic gradient descent method (ADP-SGD) of the present invention is specifically: each client performs local model training on its local dataset, based on the issued global model and the privacy budget allocated according to the privacy leakage risk;
In ADP-SGD, the allocated budget ε_t is first converted into the zero-concentrated differential privacy (zCDP) paradigm; ε_t is therefore converted as
ρ_t = ε_t^2 / (4·log(1/δ))
where ρ_t denotes the privacy budget under the zCDP paradigm and (ε_t, δ) are the privacy parameters under the DP paradigm. For each training epoch (Epoch), the currently remaining privacy budget ρ_left is first updated and the noise level of the current epoch is determined (consuming part of the privacy budget); to improve convergence, the noise level should decay with the training epoch, so the noise level σ_e of each training epoch is
(formula omitted)
where ρ_t is the privacy budget allocated for the current communication round, β controls the initial noise level, k_σ is the noise-level decay rate, E is the preset number of training epochs (local training stops when they end), and e is the current epoch. After the noise level of each local training epoch is determined, the remaining privacy budget is
(formula omitted)
where ρ_left is initialized to ρ_t, so that the consumed budget does not exceed the allocated ρ_t; once the privacy budget is exhausted (ρ_left ≤ 0), training stops.
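A minimal sketch of the budget conversion and per-epoch accounting follows. The ε-to-ρ conversion matches the formula above; the exponential decay form of σ_e and the per-epoch zCDP cost (1/(2σ²) per Gaussian-mechanism step, ignoring subsampling amplification) are assumptions made for illustration, since those formulas are omitted above.

import math

def to_zcdp(eps_t, delta):
    # rho_t = eps_t^2 / (4 * log(1/delta)), as stated in the text above.
    return eps_t ** 2 / (4.0 * math.log(1.0 / delta))

def epoch_noise_level(beta, k_sigma, e):
    # Assumed exponential decay of the noise level with the epoch index e.
    return beta * math.exp(-k_sigma * e)

def epoch_zcdp_cost(sigma_e, steps_per_epoch):
    # Common zCDP accounting for the Gaussian mechanism: 1/(2*sigma^2) per step,
    # composed over all steps of the epoch (subsampling amplification ignored).
    return steps_per_epoch / (2.0 * sigma_e ** 2)

# Example: consume the round's budget epoch by epoch until it is exhausted.
rho_left = to_zcdp(eps_t=1.0, delta=1e-5)
for e in range(10):  # E = 10 preset epochs
    sigma_e = epoch_noise_level(beta=2.0, k_sigma=0.05, e=e)
    rho_left -= epoch_zcdp_cost(sigma_e, steps_per_epoch=100)
    if rho_left <= 0:
        break  # privacy budget exhausted, stop local training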
At each model iteration τ (Iteration), a batch of samples B is selected from the client dataset by random shuffling; for each sample (x_k, y_k) ∈ B, the per-sample gradient is computed by backpropagation and then clipped,
(formula omitted)
where L denotes the loss function, F the model being trained, and w the model parameters; g(x_k) and ḡ(x_k) denote the original and clipped gradients respectively, and C_τ denotes the clipping coefficient. If the L2 norm of the gradient, ||g(x_k)||_2, exceeds the clipping value C_τ, the clipped gradient is scaled to have L2 norm C_τ; otherwise the gradient is left unchanged.
Considering that gradient values become smaller as training proceeds, the method dynamically attenuates the clipping coefficient C_τ:
C_τ = C_0 · exp(k_c · τ)
where C_0 is the initial clipping coefficient, typically taken as the mean or median of the batch gradient L2 norms, and k_c is the clipping-coefficient decay rate.
After gradient clipping is completed, the model parameters are updated with the noise-perturbed gradient, where the noise follows a Gaussian distribution
(formula omitted)
After the model update, the next model iteration (τ+1) begins. When the local dataset has been traversed once, one local training epoch is complete and training proceeds to the next epoch (e+1). When all epochs are finished (e = E) or the privacy budget is exhausted, local training ends and the local model is uploaded to the server.
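The per-batch procedure (per-sample clipping, Gaussian perturbation, gradient-descent update) can be sketched as follows with NumPy. The noise standard deviation σ_e·C_τ and its scaling by the batch size follow the standard DP-SGD pattern and are assumptions here, since the perturbation formula is omitted above.

import numpy as np

def clip_coefficient(c0, k_c, tau):
    # C_tau = C_0 * exp(k_c * tau); a negative k_c attenuates the clipping
    # value as the iteration index tau grows, as described in the text.
    return c0 * np.exp(k_c * tau)

def adp_sgd_step(params, per_sample_grads, c_tau, sigma_e, lr):
    """One ADP-SGD iteration on a batch: clip each per-sample gradient to L2
    norm C_tau, average, add Gaussian noise and take a gradient-descent step.
    The noise scale sigma_e * C_tau / batch_size is an assumption following
    the standard DP-SGD recipe."""
    clipped = []
    for g in per_sample_grads:
        norm = np.linalg.norm(g)
        clipped.append(g if norm <= c_tau else g * (c_tau / norm))
    batch_grad = np.mean(clipped, axis=0)
    noise = np.random.normal(0.0, sigma_e * c_tau / len(per_sample_grads),
                             size=batch_grad.shape)
    return params - lr * (batch_grad + noise)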
As a further improvement, the local model aggregation is specifically: in step 5) the server aggregates the uploaded local models to obtain a new round of global model; for example, with average aggregation the new global model is updated as:
(formula omitted)
where D_k denotes the local dataset of client k, and the model term denotes the local model uploaded by client k in this round.
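The average aggregation described above corresponds to the standard FedAvg weighting by local dataset size; a minimal sketch follows (representing local models as NumPy parameter vectors is an assumption for illustration).

import numpy as np

def aggregate(local_models, dataset_sizes):
    """Weighted average of the uploaded local models, with each client's weight
    proportional to its local dataset size |D_k| (standard FedAvg)."""
    total = float(sum(dataset_sizes))
    new_global = np.zeros_like(local_models[0])
    for w_k, n_k in zip(local_models, dataset_sizes):
        new_global += (n_k / total) * w_k
    return new_global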
The invention has the advantages that:
the invention provides a self-adaptive privacy protection federal learning method, which completes federal learning model training, provides self-adaptive privacy protection for different communication rounds, ensures that a generated model is highly available, resists gradient attack, and protects the safety of client training data. Aiming at the problem that a federal learning method based on encryption privacy protection generates a large amount of communication cost and calculation cost, the invention adopts a method for carrying out gradient disturbance on a client to protect gradient privacy safety, the redundant communication cost is zero-increased on the original federal learning method, and the noise generation calculation cost of the gradient disturbance is far lower than encryption and decryption calculation. Aiming at the problem that the existing federal learning based on the differential privacy technology cannot balance the privacy and accuracy of the model, the invention discloses that the gradient attack has the characteristic of communication round heterogeneous, thereby providing a privacy decomposition scheme revealing risk perception. The scheme quantifies the privacy disclosure risk of the sharing parameters of the current communication round and adaptively allocates privacy budget, so that the data privacy and model availability of different communication rounds are balanced. In addition, in the training stage of the client, the invention provides a self-adaptive differential privacy random gradient descent method, which dynamically attenuates noise and clipping coefficients, effectively relieves the negative effect of a differential privacy mechanism on model training, and improves the accuracy and convergence of the model. The method has the characteristics of high availability of the model, light weight calculation and measurable privacy, so that the method has higher practical value in the federal learning field. The efficient, lightweight and flexible self-adaptive data privacy protection method meets the national data hierarchical protection requirements, and has important theoretical significance and application value for federal learning application and popularization.
Drawings
FIG. 1 is a block diagram of an adaptive privacy preserving federal learning approach;
FIG. 2 is a flow chart of an adaptive differential privacy random gradient descent method (ADP-SGD);
FIG. 3 is a graph comparing the gradient-attack-resistance indicators of the present invention with those of existing federated learning methods;
FIG. 4 is a graph comparing the model training accuracy of the present invention with that of existing dynamic differential privacy machine learning methods;
FIG. 5 is a graph comparing model training accuracy of the present invention with existing federal learning methods.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
As shown in fig. 1 and 2, the adaptive privacy preserving federal learning method includes the steps of:
1) Before issuing the global model, the server evaluates the privacy leakage risk of the current round and realizes leakage-risk-aware privacy decomposition. The invention first identifies that the information value contained in the gradient gradually decreases as model training progresses, which increases the difficulty of gradient attacks. The scheme therefore uses the accuracy increment of the global model over the previous round as an indicator of training progress (i.e., how close training is to convergence), and the server quantifies the privacy leakage risk of each communication round based on this indicator. The server tests the accuracy of the current global model on its own public dataset, which is consistent with the federated training task. Considering that the global model accuracy may fluctuate slightly within a certain range, and to reduce the influence of such fluctuations on the risk assessment, let S_t denote the accuracy of the global model at communication round t; the model accuracy increment is then expressed as:
(formula omitted)
2) The privacy budget of the current communication round (a measure of privacy strength in a differential privacy mechanism) is allocated according to the estimated privacy leakage risk, realizing privacy decomposition. The greater the privacy leakage risk, the stronger the required privacy protection and the smaller the allocated privacy budget; conversely, the lower the leakage risk, the more the privacy budget can be increased, reducing the impact of gradient perturbation on model accuracy. The privacy budget ε_t of the current communication round is calculated from the privacy risk of that round:
(formula omitted)
where ε is the total privacy budget, ε_c is the budget already consumed, T is the preset total number of communication rounds, and the global-model accuracy increment of the current round is used to assess the privacy leakage risk.
3) The server randomly selects K of the N clients to participate in this round of model training, and sends the global model and the privacy budget to the selected clients.
4) Based on the issued model and its local training set, each client performs local model training using adaptive differentially private stochastic gradient descent (ADP-SGD) and the allocated privacy budget. To improve model accuracy and convergence, ADP-SGD adaptively and dynamically distributes the privacy budget across the corresponding local training epochs. After finishing local training, the client uploads its local model to the server.
5) In ADP-SGD, the allocated budget ε_t is first converted into the zero-concentrated differential privacy (zCDP) paradigm. Sequential composition under zCDP incurs a lower privacy cost than traditional DP when the same amount of noise is added; therefore, under a given privacy budget, performing privacy accounting with zCDP allows the model to train for more rounds and improves model accuracy. ε_t is thus converted as
ρ_t = ε_t^2 / (4·log(1/δ))
where ρ_t denotes the privacy budget under the zCDP paradigm and (ε_t, δ) are the privacy parameters under the DP paradigm.
6) For each training epoch (Epoch), the currently remaining privacy budget ρ_left is first updated and the noise level of the current epoch is determined (consuming part of the privacy budget). To promote convergence, the noise level should decay with the training epoch, so the noise level σ_e of each training epoch is
(formula omitted)
where ρ_t is the privacy budget allocated for the current communication round, β controls the initial noise level, k_σ is the noise-level decay rate, E is the preset number of training epochs (local training stops when they end), and e is the current epoch. After the noise level of each local training epoch is determined, the remaining privacy budget is
(formula omitted)
where ρ_left is initialized to ρ_t, so that the consumed budget does not exceed the allocated ρ_t; once the privacy budget is exhausted (ρ_left ≤ 0), training stops.
7) At each model iteration τ (Iteration), a batch of samples B is selected from the client dataset by random shuffling. For each sample (x_k, y_k) ∈ B in the batch, the per-sample gradient is computed by backpropagation and then clipped.
(formula omitted)
where L denotes the loss function, F the model being trained, and w the model parameters; g(x_k) and ḡ(x_k) denote the original and clipped gradients respectively, and C_τ denotes the clipping coefficient. If the L2 norm of the gradient, ||g(x_k)||_2, exceeds the clipping value C_τ, the clipped gradient is scaled to have L2 norm C_τ; otherwise the gradient is left unchanged.
Considering that gradient values become smaller as training proceeds, the scheme dynamically attenuates the clipping coefficient C_τ:
C_τ = C_0 · exp(k_c · τ)
where C_0 is the initial clipping coefficient, typically taken as the mean or median of the batch gradient L2 norms, and k_c is the clipping-coefficient decay rate.
8) After gradient clipping is completed, the model parameters are updated with the noise-perturbed gradient, where the noise follows a Gaussian distribution
(formula omitted)
After the model update, the next model iteration (τ+1) begins. When the local dataset has been traversed once, one local training epoch is complete and training proceeds to the next epoch (e+1). When all epochs are finished (e = E) or the privacy budget is exhausted, local training ends and the local model is uploaded to the server.
9) The server aggregates the uploaded local models to obtain a new round of global model. If average aggregation is used, the new global model is updated as follows:
(formula omitted)
where D_k denotes the local dataset of client k, and the model term denotes the local model uploaded by client k in this round.
The invention evaluates model training effectiveness by Accuracy: the higher the accuracy, the more usable the model. The quality of images reconstructed by gradient attacks is evaluated with SSIM (structural similarity), PSNR (peak signal-to-noise ratio) and MSE (mean squared error), which together measure the method's resistance to gradient attacks. SSIM is a number between 0 and 1; the greater the difference between the reconstructed image and the original image, the stronger the resistance to gradient attack, and SSIM = 1 when the two images are identical. PSNR likewise compares the similarity between a reconstructed image and the corresponding original image; smaller values indicate poorer reconstruction quality and less privacy leakage. MSE is the pixel-wise difference between the reconstructed and original images; the larger the difference, the better the privacy protection. The total number of federated training communication rounds is set to 15, evenly divided into early, middle and late phases. Fig. 3 shows the results of the present invention and existing federated learning methods on these attack-resistance indicators: judged by SSIM, PSNR and MSE, the present invention exhibits stronger resistance to gradient attacks and a lower degree of privacy leakage throughout the whole federated learning process than the other methods.
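For reference, the three attack-resistance indicators can be computed with scikit-image as sketched below; the library choice is an assumption for illustration (grayscale images assumed) and is not part of the patent.

from skimage.metrics import (structural_similarity, peak_signal_noise_ratio,
                             mean_squared_error)

def attack_resistance_metrics(original, reconstructed):
    """SSIM, PSNR and MSE between an original training image and the image a
    gradient attack reconstructs; lower SSIM/PSNR and higher MSE indicate
    stronger resistance to the attack (less privacy leakage)."""
    data_range = original.max() - original.min()
    ssim = structural_similarity(original, reconstructed, data_range=data_range)
    psnr = peak_signal_noise_ratio(original, reconstructed, data_range=data_range)
    mse = mean_squared_error(original, reconstructed)
    return ssim, psnr, mse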
The invention provides strong privacy protection with little impact on model accuracy. Fig. 4 compares the adaptive differentially private stochastic gradient descent method of the present invention with existing local training methods in terms of model accuracy on the MNIST dataset. Fig. 5 compares the present invention with existing federated learning methods in terms of model accuracy on the MNIST dataset. The present invention achieves higher accuracy than local training or federated learning methods with added privacy protection. Under moderate noise (σ ≤ 2, ε ≥ 6), the model accuracy exceeds 90%, close to the performance of local training or federated learning without privacy protection.
Finally, it should also be noted that the above list is only a few specific embodiments of the present invention. Obviously, the invention is not limited to the above embodiments, but many variations are possible. All modifications directly derived or suggested to one skilled in the art from the present disclosure should be considered as being within the scope of the present invention.

Claims (5)

1. An adaptive privacy preserving federal learning method, comprising the steps of:
1): an initialization stage;
1.1) Collecting public samples related to the federated learning task as an evaluation dataset;
1.2 Initializing a global model;
1.3 Initializing an overall privacy budget;
2): evaluating privacy disclosure risk of the current communication turn according to the evaluation data set and the global model;
3): decomposing the total privacy budget according to the estimated privacy disclosure risk to obtain the privacy budget of the current communication round;
4): according to the privacy budget and the current global model, a self-adaptive differential privacy random gradient descent method is adopted to carry out local model training to obtain a new local model;
4.1) Converting the allocated budget into the zero-concentrated differential privacy paradigm and initializing the local model with the global model;
4.2 Calculating the noise level used by the current model training period according to the privacy budget and the current model training period;
4.3 Updating the remaining privacy budget in accordance with the noise level;
4.4 Calculating gradient clipping coefficients used in the current model iteration period according to the current model iteration period;
4.5 Calculating the gradient of the iteration cycle of the current model according to the back propagation;
4.6) Clipping the gradient generated in the current model iteration according to the gradient clipping coefficient;
4.7 Calculating noise disturbance parameters according to the noise level and the gradient clipping coefficients, and adding Gaussian noise on the clipped gradient according to the noise disturbance parameters to obtain a post-disturbance gradient;
4.8) Updating the local model (initialized to the global model) by gradient descent with the perturbed gradient;
Iterate 4.2) through 4.8) until the remaining privacy budget computed in 4.3) falls below 0 or the training epochs end;
5) Aggregating the local models to obtain a new global model;
Steps 2) to 5) are executed cyclically until the global model converges, at which point the federated learning model training stops.
2. The adaptive privacy preserving federal learning method of claim 1, wherein evaluating the privacy leakage risk of the current communication round is specifically: the server tests the accuracy of the current global model on the collected evaluation dataset and uses it as a privacy leakage risk indicator representing the risk of the current communication round. Considering that the global model accuracy may fluctuate slightly within a certain range, and to reduce the influence of such fluctuations on the risk evaluation, let S_t denote the accuracy of the global model at communication round t; the model accuracy increment is then expressed as:
(formula omitted)
When training has just started, valid global-model accuracy data are lacking, so the corresponding increment values are set to 1.
3. The adaptive privacy preserving federal learning method of claim 1, wherein the decomposition of the privacy budget is specifically: after the privacy leakage risk measurement is completed, the privacy budget ε_t of the current communication round is calculated from the privacy risk of that round:
(formula omitted)
where ε is the total privacy budget, ε_c is the budget already consumed, T is the preset total number of communication rounds, and the global-model accuracy increment of the current round is used to assess the privacy leakage risk.
4. The adaptive privacy preserving federal learning method of claim 1, wherein the adaptive differentially private stochastic gradient descent method ADP-SGD is specifically: each client performs local model training on its local dataset, based on the issued global model and the privacy budget allocated according to the privacy leakage risk;
In ADP-SGD, the allocated budget ε_t is first converted into the zero-concentrated differential privacy (zCDP) paradigm; ε_t is therefore converted as
ρ_t = ε_t^2 / (4·log(1/δ))
where ρ_t denotes the privacy budget under the zCDP paradigm and (ε_t, δ) are the privacy parameters under the DP paradigm. For each training epoch (Epoch), the currently remaining privacy budget ρ_left is first updated and the noise level of the current epoch is determined; to improve convergence, the noise level should decay with the training epoch, so the noise level σ_e of each training epoch is
(formula omitted)
where ρ_t is the privacy budget allocated for the current communication round, β controls the initial noise level, k_σ is the noise-level decay rate, E is the preset number of training epochs, and e is the current epoch. After the noise level of each local training epoch is determined, the remaining privacy budget is
(formula omitted)
where ρ_left is initialized to ρ_t, so that the consumed budget does not exceed the allocated ρ_t; once the privacy budget is exhausted (ρ_left ≤ 0), training stops.
At each model iteration τ (Iteration), a batch of samples B is selected from the client dataset by random shuffling; for each sample (x_k, y_k) ∈ B, the per-sample gradient is computed by backpropagation and then clipped,
(formula omitted)
where L denotes the loss function, F the model being trained, and w the model parameters; g(x_k) and ḡ(x_k) denote the original and clipped gradients respectively, and C_τ denotes the clipping coefficient. If the L2 norm of the gradient, ||g(x_k)||_2, exceeds the clipping value C_τ, the clipped gradient is scaled to have L2 norm C_τ; otherwise the gradient is left unchanged.
Considering that gradient values become smaller as training proceeds, the method dynamically attenuates the clipping coefficient C_τ:
C_τ = C_0 · exp(k_c · τ)
where C_0 is the initial clipping coefficient, typically taken as the mean or median of the batch gradient L2 norms, and k_c is the clipping-coefficient decay rate; after gradient clipping is completed, the model parameters are updated with the noise-perturbed gradient, where the noise follows a Gaussian distribution
(formula omitted)
After the model update, the next model iteration (τ+1) begins. When the local dataset has been traversed once, one local training epoch is complete and training proceeds to the next epoch (e+1). When all epochs are finished (e = E) or the privacy budget is exhausted, local training ends and the local model is uploaded to the server.
5. The adaptive privacy preserving federal learning method of claim 1, wherein the local model aggregation is specifically: in step 5) the server aggregates the uploaded local models to obtain a new round of global model; for example, with average aggregation the new global model is updated as:
(formula omitted)
where D_k denotes the local dataset of client k, and the model term denotes the local model uploaded by client k in this round.
CN202310518209.6A 2023-05-10 2023-05-10 Self-adaptive privacy protection federal learning method Active CN116739079B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310518209.6A CN116739079B (en) 2023-05-10 2023-05-10 Self-adaptive privacy protection federal learning method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310518209.6A CN116739079B (en) 2023-05-10 2023-05-10 Self-adaptive privacy protection federal learning method

Publications (2)

Publication Number Publication Date
CN116739079A true CN116739079A (en) 2023-09-12
CN116739079B CN116739079B (en) 2024-02-09

Family

ID=87901959

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310518209.6A Active CN116739079B (en) 2023-05-10 2023-05-10 Self-adaptive privacy protection federal learning method

Country Status (1)

Country Link
CN (1) CN116739079B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117094382A (en) * 2023-10-19 2023-11-21 曲阜师范大学 Personalized federal learning method, device and medium with privacy protection
CN117313869A (en) * 2023-10-30 2023-12-29 浙江大学 Large model privacy protection reasoning method based on model segmentation
CN117932686A (en) * 2024-03-22 2024-04-26 成都信息工程大学 Federal learning privacy protection method, system and medium in meta universe based on excitation mechanism

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220188649A1 (en) * 2020-11-05 2022-06-16 Zhejiang University Decision tree-oriented vertical federated learning method
CN114841364A (en) * 2022-04-14 2022-08-02 北京理工大学 Federal learning method capable of meeting personalized local differential privacy requirements
CN115496198A (en) * 2022-08-05 2022-12-20 广州大学 Gradient compression framework for adaptive privacy budget allocation based on federal learning

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220188649A1 (en) * 2020-11-05 2022-06-16 Zhejiang University Decision tree-oriented vertical federated learning method
CN114841364A (en) * 2022-04-14 2022-08-02 北京理工大学 Federal learning method capable of meeting personalized local differential privacy requirements
CN115496198A (en) * 2022-08-05 2022-12-20 广州大学 Gradient compression framework for adaptive privacy budget allocation based on federal learning

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117094382A (en) * 2023-10-19 2023-11-21 曲阜师范大学 Personalized federal learning method, device and medium with privacy protection
CN117094382B (en) * 2023-10-19 2024-01-26 曲阜师范大学 Personalized federal learning method, device and medium with privacy protection
CN117313869A (en) * 2023-10-30 2023-12-29 浙江大学 Large model privacy protection reasoning method based on model segmentation
CN117313869B (en) * 2023-10-30 2024-04-05 浙江大学 Large model privacy protection reasoning method based on model segmentation
CN117932686A (en) * 2024-03-22 2024-04-26 成都信息工程大学 Federal learning privacy protection method, system and medium in meta universe based on excitation mechanism
CN117932686B (en) * 2024-03-22 2024-05-31 成都信息工程大学 Federal learning privacy protection method, system and medium in meta universe based on excitation mechanism

Also Published As

Publication number Publication date
CN116739079B (en) 2024-02-09

Similar Documents

Publication Publication Date Title
CN116739079B (en) Self-adaptive privacy protection federal learning method
US20210158216A1 (en) Method and system for federated learning
CN113591145B (en) Federal learning global model training method based on differential privacy and quantization
CN114297722B (en) Privacy protection asynchronous federal sharing method and system based on block chain
CN113762530B (en) Precision feedback federal learning method for privacy protection
CN113361694A (en) Layered federated learning method and system applying differential privacy protection
CN112906903A (en) Network security risk prediction method and device, storage medium and computer equipment
CN114841364B (en) Federal learning method for meeting personalized local differential privacy requirements
CN116306910B (en) Fair privacy calculation method based on federal node contribution
CN115563650A (en) Privacy protection system for realizing medical data based on federal learning
CN109063502A (en) Data encryption, data analysing method and device
CN113691594B (en) Method for solving data imbalance problem in federal learning based on second derivative
CN112600697B (en) QoS prediction method and system based on federal learning, client and server
CN108008632A (en) A kind of method for estimating state and system of the time lag Markov system based on agreement
CN114169543A (en) Federal learning algorithm based on model obsolescence and user participation perception
CN117574429A (en) Federal deep learning method for privacy enhancement in edge computing network
Xu et al. Agic: Approximate gradient inversion attack on federated learning
CN107465571B (en) Tactical network simulation training background business flow generation method based on statistical characteristics
CN115510472B (en) Multi-difference privacy protection method and system for cloud edge aggregation system
Jiao et al. A blockchain federated learning scheme based on personalized differential privacy and reputation mechanisms
CN115329388B (en) Privacy enhancement method for federally generated countermeasure network
CN114862416A (en) Cross-platform credit evaluation method under federated learning environment
CN114723071A (en) Federal learning method and device based on client classification and information entropy
CN114553869A (en) Method and device for determining resource contribution degree based on joint learning and electronic equipment
CN117113389A (en) Privacy protection method and device for distributed learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant