CN116340986A - Block chain-based privacy protection method and system for resisting federal learning gradient attack - Google Patents

Block chain-based privacy protection method and system for resisting federal learning gradient attack

Info

Publication number
CN116340986A
Authority
CN
China
Prior art keywords
gradient
noise
value
hash value
blockchain
Prior art date
Legal status
Pending
Application number
CN202211516395.1A
Other languages
Chinese (zh)
Inventor
晏宗明 (Yan Zongming)
文建良 (Wen Jianliang)
郭正涛 (Guo Zhengtao)
Current Assignee
Unicom Lingjing Video Jiangxi Technology Co ltd
Original Assignee
Unicom Lingjing Video Jiangxi Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Unicom Lingjing Video Jiangxi Technology Co ltd
Priority to CN202211516395.1A
Publication of CN116340986A


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G06F21/62: Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218: Protecting access to data via a platform to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245: Protecting personal data, e.g. for financial or medical purposes
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Abstract

A blockchain-based privacy protection method and system for resisting federal learning gradient attacks, relating to the technical field of data privacy protection. The key technical points of the invention include: when a participant finishes training and uploads its gradient data, a smart contract automatically adds a noise value derived from the gradient's hash value, and the blockchain stores only the noise-added gradient, so that user data cannot be recovered directly by gradient-leakage methods; when the central server aggregates the data, each participant calls the smart contract with the hash value of its gradient data as a parameter, and the contract verifies the uploaded gradient data before the next operation proceeds; the smart contract then recalculates the noise value, randomly divides it into several shares, and stores each share in the blockchain; when the central server requests the shared noise values of all participants, the smart contract returns only the noise sum; finally, the latest model parameters are calculated. The invention effectively solves the problem of user privacy leakage and mitigates the loss of model accuracy.

Description

Block chain-based privacy protection method and system for resisting federal learning gradient attack
Technical Field
The invention relates to the technical field of data privacy protection, in particular to a privacy protection method and system for resisting federal learning gradient attack based on a blockchain.
Background
With the further development of network communication, the Internet demands higher levels of security and stronger privacy protection. Machine learning requires large amounts of training data, but data and privacy protection make it difficult for data to circulate, creating data islands and preventing the greater value of the data from being released. Traditional machine learning based on a central server faces serious privacy and security challenges and cannot realize ubiquitous, secure artificial intelligence for future networks. Furthermore, due to the huge overhead of centralized data aggregation and processing, conventional centralized machine learning schemes may not be suitable for ubiquitous artificial intelligence.
Federal learning is an emerging distributed machine learning scheme, and provides a new solution to the privacy and security problems faced by machine learning. In federal learning, the participating devices co-train the shared model with their local data, unlike traditional machine learning schemes, only upload model updates rather than raw data to a centralized parameter server. Although federal learning provides a different solution for the development of artificial intelligence, significantly improving privacy-sensitive applications, federal learning is still in an early stage of development, facing new challenges.
Unlike traditional machine learning schemes, federal learning uploads model updates instead of the original data to a centralized parameter server. This provides tighter privacy protection for machine learning: the original data always stays local, avoiding direct leakage of privacy. However, according to recent research, the model-updating process still carries a risk of privacy disclosure: a malicious party can infer user data by acquiring the gradient updates produced during a user's training, posing a great threat to the data security and user privacy of federal learning.
To solve the privacy problem caused by gradient leakage in federal learning, the prior art offers two main solutions: 1. Encrypt the updated model parameters with a cryptographic scheme such as homomorphic encryption before uploading them to the central server, where only the central server can decrypt and aggregate them into a new model. This scheme ensures that other participants cannot easily obtain a user's gradient, but the central server can still directly obtain the user's gradient parameters, so the possibility of revealing user privacy remains. 2. Add noise to the model: when a user uploads gradient parameters, suitable noise is added to the data so that the other participants cannot directly obtain the training gradient of each round and the data is difficult to recover. Experiments show that adding noise protects privacy to a certain extent, but it seriously affects the training accuracy and training speed of federal learning.
Disclosure of Invention
Accordingly, the present invention is directed to a blockchain-based privacy protection method and system that resists federal learning gradient attacks in an attempt to solve or at least mitigate at least one of the above-identified problems.
According to one aspect of the invention, a blockchain-based privacy protection method for resisting federal learning gradient attacks is provided. Gradient updating is performed in the training process of a federal learning model according to the following process, so as to protect the privacy of each participant's local training data:
the blockchain system receives a first gradient and a first hash value uploaded by each participant; the first gradient is obtained by each participant training on its local training data using the model parameters shared by the central server; the first hash value is calculated from the first gradient;
for each first gradient, the blockchain system calculates a second hash value corresponding to the first gradient; generates a first noise value using the second hash value as a random number seed; and adds the first noise value to the first gradient to obtain a noise-added first gradient;
for each first hash value, the blockchain system generates a second noise value using the first hash value as a random number seed, and subtracts the second noise value from the noise-added first gradient to obtain a second gradient; it calculates the hash value of the second gradient and judges whether it is the same as the first hash value; if so, the first gradient uploaded by the participant has not been changed; the second noise value corresponding to each participant is then divided into a plurality of parts;
the blockchain system sums the second noise values corresponding to the participants and transmits the resulting noise value sum to the central server;
the central server calculates an average gradient from the noise value sum and the noise-added first gradients, and performs a gradient update according to the average gradient.
Further, the calculation formula of the first gradient is:
$$\nabla W_{t,i} = \frac{\partial L\bigl(F(x_{t,i},\,W_t),\,y_{t,i}\bigr)}{\partial W_t}$$
wherein $W_t$ represents the model weight parameters of the t-th round in the training process of the federal learning model; $x_{t,i}$, $y_{t,i}$ respectively represent the data input of participant i at the t-th round and the corresponding label; $F(\cdot)$ represents the predicted value of the input data under the current model parameters; $L(\cdot)$ represents the loss function constructed using the predicted value and the real label.
Further, the calculation formula of the average gradient is:
$$\nabla \bar{W}_t = \frac{1}{N}\left(\sum_{i=1}^{N} \nabla \widetilde{W}_{t,i} - R\right)$$
wherein $N$ represents the total number of participants; $\nabla \widetilde{W}_{t,i}$ represents the noise-added first gradient; $R$ represents the noise value sum.
Further, the formula for gradient update according to the average gradient is:
$$W_{t+1} = W_t - \eta\,\nabla \bar{W}_t$$
where $\eta$ represents the learning rate.
According to another aspect of the present invention, there is provided a blockchain-based privacy protection system against federal learning gradient attacks, the system comprising a plurality of participants, a central server, and a blockchain system; each of the plurality of participants trains on its local training data using the model parameters shared by the central server to obtain a first gradient; calculates a first hash value from the first gradient; and uploads the first gradient and the first hash value to the blockchain system;
the blockchain system comprises an added noise intelligent contract module, a shared noise value intelligent contract module and a noise and intelligent contract module; wherein, the liquid crystal display device comprises a liquid crystal display device,
the noise adding intelligent contract module is used for: receiving a first gradient and a first hash value uploaded by each participant; for each first gradient, calculating a second hash value corresponding to the first gradient; generating a first noise value by taking the second hash value as a random number seed, adding the first noise value and the first gradient, and obtaining a first gradient after noise addition;
the shared noise value intelligent contract module is used for: generating a second noise value for each first hash value by taking the first hash value as a random number seed, and subtracting the second noise value from the first gradient after noise addition to obtain a second gradient; calculating a hash value of the second gradient, judging whether the hash value of the second gradient is the same as the first hash value, and if so, indicating that the first gradient uploaded by the participant is not changed; dividing the second noise value corresponding to each participant into a plurality of parts;
the noise and intelligence contract module is to: summing a plurality of second noise values corresponding to a plurality of participants, and transmitting the obtained noise value sums to the central server;
the central server is used for calculating an average gradient according to the noise value and the first gradient after adding noise, and carrying out gradient update according to the average gradient.
Further, the calculation formula of the first gradient is:
$$\nabla W_{t,i} = \frac{\partial L\bigl(F(x_{t,i},\,W_t),\,y_{t,i}\bigr)}{\partial W_t}$$
wherein $W_t$ represents the model weight parameters of the t-th round in the training process of the federal learning model; $x_{t,i}$, $y_{t,i}$ respectively represent the data input of participant i at the t-th round and the corresponding label; $F(\cdot)$ represents the predicted value of the input data under the current model parameters; $L(\cdot)$ represents the loss function constructed using the predicted value and the real label.
Further, the calculation formula of the average gradient is:
$$\nabla \bar{W}_t = \frac{1}{N}\left(\sum_{i=1}^{N} \nabla \widetilde{W}_{t,i} - R\right)$$
wherein $N$ represents the total number of participants; $\nabla \widetilde{W}_{t,i}$ represents the noise-added first gradient; $R$ represents the noise value sum.
Further, the formula for gradient update according to the average gradient is:
$$W_{t+1} = W_t - \eta\,\nabla \bar{W}_t$$
where $\eta$ represents the learning rate.
The beneficial technical effects of the invention are as follows:
the invention provides a privacy protection method and a privacy protection system for resisting federal learning gradient attack based on a blockchain, which effectively solve the problem of user privacy leakage caused by federal learning model gradient leakage and improve the problem of model precision reduction caused by the existing noise adding scheme; further, the identity security of the user is protected through the blockchain intelligent contract, and malicious vandalism in the training process is avoided.
Drawings
The invention may be better understood by reference to the following description taken in conjunction with the accompanying drawings, which are included to provide a further illustration of the preferred embodiments of the invention and to explain the principles and advantages of the invention, together with the detailed description below.
FIG. 1 is a block diagram of a block chain based privacy protection method for resisting federal learning gradient attacks in accordance with an embodiment of the present invention;
FIG. 2 is a schematic diagram of a threat model in an embodiment of the invention;
FIG. 3 is a flow chart of a blockchain-based privacy protection method against federal learning gradient attacks in accordance with an embodiment of the present invention;
FIG. 4 is a diagram showing an example of data after gradient recovery for an existing federal learning model in accordance with an embodiment of the present invention;
FIG. 5 is a graph showing an example of data obtained by gradient recovery of a federal learning model trained based on the method of the present invention in an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, exemplary embodiments or examples of the present invention will be described below with reference to the accompanying drawings. It is apparent that the described embodiments or examples are only implementations or examples of a part of the invention, not all. All other embodiments or examples, which may be made by one of ordinary skill in the art without undue burden, are intended to be within the scope of the present invention based on the embodiments or examples herein.
The embodiment of the invention provides a blockchain-based privacy protection method for resisting federal learning gradient attacks, which performs gradient updating in the training process of a federal learning model according to the following process, so as to protect the privacy of each participant's local training data:
the blockchain system receives a first gradient and a first hash value uploaded by each participant; the first gradient is obtained by each participant training on its local training data using the model parameters shared by the central server; the first hash value is calculated from the first gradient;
for each first gradient, the blockchain system calculates a second hash value corresponding to the first gradient; generates a first noise value using the second hash value as a random number seed; and adds the first noise value to the first gradient to obtain a noise-added first gradient;
for each first hash value, the blockchain system generates a second noise value using the first hash value as a random number seed, and subtracts the second noise value from the noise-added first gradient to obtain a second gradient; it calculates the hash value of the second gradient and judges whether it is the same as the first hash value; if so, the first gradient uploaded by the participant has not been changed; the second noise value corresponding to each participant is then divided into a plurality of parts;
the blockchain system sums the second noise values corresponding to the participants and transmits the resulting noise value sum to the central server;
the central server calculates an average gradient from the noise value sum and the noise-added first gradients, and performs a gradient update according to the average gradient.
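As a concrete illustration of the flow above, the following is a minimal Python sketch that emulates the contract roles in a single process. All function names and values are hypothetical; integer-valued noise is used so that adding and removing it is exact in floating point, and a real system would run the noise logic inside on-chain intelligent contracts rather than locally.

```python
import hashlib
import numpy as np

def grad_hash(g: np.ndarray) -> str:
    # Hash value of a gradient (the "first"/"second" hash values).
    return hashlib.sha256(g.tobytes()).hexdigest()

def noise_from_hash(h: str, size: int) -> np.ndarray:
    # Deterministic noise seeded by the gradient's hash value; integer-valued
    # so that adding and then removing it is bit-exact in floating point.
    rng = np.random.default_rng(int(h, 16) % 2**32)
    return rng.integers(-1000, 1000, size=size).astype(float)

# True local gradients of three participants (toy values).
grads = [np.array([0.5, -1.0]), np.array([1.5, 2.0]), np.array([-0.5, 0.25])]

# Noise-adding contract: hash each gradient, seed noise with the hash,
# keep only the noise-added gradient "on chain".
hashes = [grad_hash(g) for g in grads]                 # first hash values
noisy = [g + noise_from_hash(h, g.size) for g, h in zip(grads, hashes)]

# Shared-noise-value contract: regenerate each noise value from the hash,
# verify the upload was not altered, and accumulate the noise sum.
R = np.zeros(2)
for h, ng in zip(hashes, noisy):
    r = noise_from_hash(h, ng.size)                    # second noise value
    assert grad_hash(ng - r) == h                      # gradient unchanged
    R += r

# The server sees only the noisy gradients and the noise sum, yet the
# average gradient it computes is unbiased.
avg = (sum(noisy) - R) / len(grads)
```

Note that because the noise is regenerated deterministically from the hash, the second noise value equals the first, so subtracting the noise sum cancels the perturbation exactly.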
In this embodiment, preferably, the calculation formula of the first gradient is:
$$\nabla W_{t,i} = \frac{\partial L\bigl(F(x_{t,i},\,W_t),\,y_{t,i}\bigr)}{\partial W_t}$$
wherein $W_t$ represents the model weight parameters of the t-th round in the training process of the federal learning model; $x_{t,i}$, $y_{t,i}$ respectively represent the data input of participant i at the t-th round and the corresponding label; $F(\cdot)$ represents the predicted value of the input data under the current model parameters; $L(\cdot)$ represents the loss function constructed using the predicted value and the real label.
In this embodiment, preferably, the calculation formula of the average gradient is:
$$\nabla \bar{W}_t = \frac{1}{N}\left(\sum_{i=1}^{N} \nabla \widetilde{W}_{t,i} - R\right)$$
wherein $N$ represents the total number of participants; $\nabla \widetilde{W}_{t,i}$ represents the noise-added first gradient; $R$ represents the noise value sum.
In this embodiment, preferably, the formula for gradient update according to the average gradient is:
$$W_{t+1} = W_t - \eta\,\nabla \bar{W}_t$$
where $\eta$ represents the learning rate.
Another embodiment of the invention provides a blockchain-based privacy protection method for resisting federal learning gradient attacks.
In federal learning, the participating devices cooperatively train a shared model with their local data; unlike traditional machine learning schemes, only model updates, rather than the original data, are uploaded to a centralized parameter server. The core idea is to perform distributed model training among multiple data sources holding local data and to construct a global model based on virtually fused data, without exchanging local individual or sample data, only exchanging model parameters or intermediate results. This achieves a balance between data privacy protection and data sharing computation, namely the new application paradigm of "data available but invisible" and "the data stays put while the model moves".
In the training process of the machine learning model, the machine learning model is optimized by continuously updating the gradient so as to obtain an optimal model. The gradient calculation method comprises the following steps:
$$\nabla W_{t,i} = \frac{\partial L\bigl(F(x_{t,i},\,W_t),\,y_{t,i}\bigr)}{\partial W_t} \qquad (1)$$
wherein $W_t$ is the model weight parameter of the t-th round; $x_{t,i}$, $y_{t,i}$ are the input data and label of participant i at the t-th round; $F(\cdot)$ is the predicted value of the input data under the current model parameters; $L(\cdot)$ is the loss function constructed using the predicted value and the real label.
Under the condition that the learning rate is eta, each round of weight updating is as follows:
$$W_{t+1} = W_t - \eta\,\nabla W_t \qquad (2)$$
in federal learning, the gradients calculated separately for each participant need to be aggregated:
$$\nabla W_t = \frac{1}{N}\sum_{i=1}^{N} \nabla W_{t,i} \qquad (3)$$
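For a concrete sense of these formulas, the following toy sketch (a hypothetical linear model $f(x) = w \cdot x$ with squared loss; all values invented) computes a per-participant gradient, aggregates the gradients, and performs one weight update:

```python
import numpy as np

# Toy model f(x) = w.x with loss L = 0.5 * (f(x) - y)^2, so the
# per-participant gradient is dL/dw = (w.x - y) * x.
w_t = np.array([0.2, -0.4])             # current shared weights W_t
data = [(np.array([1.0, 2.0]), 1.0),    # (x_{t,i}, y_{t,i}) per participant
        (np.array([-1.0, 0.5]), 0.0),
        (np.array([2.0, -1.0]), -1.0)]

grads = [(w_t @ x - y) * x for x, y in data]   # gradient formula above
avg_grad = sum(grads) / len(grads)             # aggregation over participants
eta = 0.1
w_next = w_t - eta * avg_grad                  # weight update step
```

The same three operations (local gradient, aggregation, update) recur throughout the rest of the description, with noise inserted between the first two.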
to recover data from the gradient, a virtual input x 'and a tag input y' are first randomly initialized; these "virtual data" are then input into the model and "virtual gradients" are obtained "
Figure BDA0003972059350000064
Figure BDA0003972059350000065
Optimizing near-original false gradients also causes the dummy data to approach real training data. Given the gradient of a step, training data is obtained by minimizing the following objectives:
Figure BDA0003972059350000066
normal distance in objective function
Figure BDA0003972059350000067
Is differentiable, the virtual input x 'and the label y' can be optimized using standard gradient-based methods. Through repeated iterative optimization, the method can recoverAnd outputting the real data.
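The iterative attack above applies to general models; for a single linear neuron with a bias term the leakage is even exact in closed form, since the weight gradient is the residual times the input and the bias gradient is the residual itself, so dividing the two recovers the input. The following toy sketch (all values hypothetical) demonstrates this special case:

```python
import numpy as np

rng = np.random.default_rng(42)
w = rng.normal(size=4)                  # shared weights of a linear neuron
b = 0.3                                 # bias term
x_private = rng.normal(size=4)          # participant's private input
y_private = 1.5                         # participant's private label

# One training step with L = 0.5 * (w.x + b - y)^2 leaks these gradients:
e = w @ x_private + b - y_private       # residual
grad_w = e * x_private                  # dL/dw = e * x
grad_b = e                              # dL/db = e

# Anyone observing the gradients recovers the private input exactly:
x_recovered = grad_w / grad_b
```

This illustrates why uploading raw gradients, as in the baseline federal learning scheme, directly exposes training data.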
The above procedure illustrates how the user's training data is subject to privacy leakage, i.e., gradient leakage, through gradient updating.
Therefore, the embodiment of the invention introduces a blockchain and provides a blockchain-based privacy protection method for resisting federal learning gradient attacks, i.e., attacks that obtain the privacy of user content through gradients in machine learning. As shown in fig. 1, the system includes three parts: a blockchain network, a plurality of participants, and a central server. The blockchain network provides a secure and reliable execution environment, guarantees the reliable operation of the intelligent contracts, stores data, and ensures data security; the participants are the providers of training data, which they use to train and obtain model parameters; the central server is responsible for aggregating the model gradients and updating the model weights.
In the blockchain, users collectively create a common ledger for block-by-block verification and recording of transactions. Decentralization is the most essential feature of a blockchain, with each node backing up the complete ledger; the consensus mechanism enables mutually distrusting nodes to verify and confirm the data in the network, thereby generating trust and reaching consensus; cryptographic algorithms underpin the blockchain's anonymity and tamper-resistance, and are the baseline for judging whether a chain is trustworthy and has basic security; an intelligent contract is a digitally defined contract capable of automatically executing its terms. It places the contract on the blockchain as code, which executes automatically when the agreed conditions are met. The immutable and traceable nature of the blockchain provides a secure, trusted operating environment for the intelligent contract.
An intelligent contract executes automatically once its trigger conditions are met and is not subject to human control, which largely prevents malicious behavior from interfering with its normal execution. Intelligent contracts based on blockchain technology thus combine cost efficiency with resistance to malicious interference. Although adding noise to the gradient parameters of a training model can enhance privacy protection during machine learning, the noise seriously affects the training result: too little noise fails to provide effective privacy protection, while too much noise severely degrades training accuracy.
While blockchains provide a secure and trusted computing environment, user data and privacy are still subject to attack. Figure 2 illustrates the threats faced by the system. Inside the system, a participant may send erroneous data; the central server may infer user privacy from the gradient data uploaded by the participants; malicious nodes may disrupt the machine learning training process, making training difficult and exposing the identities and data privacy of the data owners; and a blockchain administrator may recover participant data from the data stored in the blockchain, expose user identities from blockchain transaction records, and so on.
The privacy protection method for resisting federal learning gradient attack based on the blockchain provided by the embodiment of the invention is used for solving the problems.
In the method, after a participant finishes training and uploads its gradient data, an intelligent contract automatically adds a noise value derived from the gradient's hash value, and the noise-added gradient is stored in the blockchain, so that the user's data cannot be recovered directly by gradient-leakage methods. When the central server aggregates the data, each participant calls the intelligent contract with the hash value of its gradient data as a parameter; only after the intelligent contract verifies the uploaded gradient data can the next operation proceed. The intelligent contract then recalculates the noise value, randomly divides it into several shares, and saves each share in the blockchain. A malicious node cannot know how many times the noise value was divided, nor which participant a noise share stored in the blockchain belongs to, and therefore cannot obtain the real noise value. When the central server needs the noise sum to aggregate the data, the intelligent contract computes the shared noise values of all participants and returns only their sum. Finally, the latest model parameters are calculated. All data operations in the blockchain are executed through intelligent contracts, whose execution cannot be interfered with by humans, which greatly improves the system's resistance to malicious nodes.
In the method, the gradient $\nabla W_{t,i}$ updated by each participant in every round has noise $R_{t,i}$ added before it is sent to the central server; the gradient $\nabla \widetilde{W}_{t,i}$ received by the central server is:
$$\nabla \widetilde{W}_{t,i} = \nabla W_{t,i} + R_{t,i} \qquad (6)$$
The central server aggregates the received gradients:
$$\nabla \widetilde{W}_t = \frac{1}{N}\sum_{i=1}^{N} \nabla \widetilde{W}_{t,i} \qquad (7)$$
The noise generated by each party is sent to the other participants using secure multiparty computation $F_s(\cdot)$:
$$r_1, r_2, \ldots, r_s = F_s(R_{t,i}) \qquad (8)$$
The participants jointly calculate the sum of all random numbers:
$$R = \sum_{i=1}^{N} R_{t,i} \qquad (9)$$
The central server restores the gradient:
$$\nabla W_t = \frac{1}{N}\left(\sum_{i=1}^{N} \nabla \widetilde{W}_{t,i} - R\right) = \frac{1}{N}\sum_{i=1}^{N} \nabla W_{t,i} \qquad (10)$$
It can be seen that the gradient recovered by the central server is consistent with the noise-free aggregated gradient, so the noise does not affect training accuracy. The added noise is transmitted via secure multiparty computation; neither any single party nor the central server can directly recover a noise value, so the true gradient cannot be deduced and user privacy is guaranteed.
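The secure multiparty step above can be emulated with simple additive secret sharing: each noise value is split into random shares that sum back to it, so the participants can jointly reconstruct only the total. The following is a minimal sketch under that assumption (toy values; a real $F_s(\cdot)$ would distribute the shares across distinct parties):

```python
import numpy as np

rng = np.random.default_rng(7)

def share(value: float, s: int) -> list:
    # Additive secret sharing: s random shares that sum to `value`;
    # no proper subset of the shares reveals the secret.
    parts = rng.normal(size=s - 1)
    return list(parts) + [value - parts.sum()]

noise_values = [0.8, -1.3, 2.05]            # R_{t,i} of three participants
all_shares = [share(r, 4) for r in noise_values]

# Jointly, only the SUM of all noise values is reconstructed.
R = sum(sum(sh) for sh in all_shares)
```

Because each individual share is random, an observer holding fewer than all shares of a given noise value learns nothing about it, while the total $R$ is exactly what the server needs to de-noise the aggregate.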
As shown in fig. 3, the flow of the method specifically includes:
step 1, a plurality of participants start training;
Each participant uses the model parameters shared by the central server to train on its own images by machine learning, and calculates the gradient $\nabla W_{t,i}$ from its own data.
Step 2, uploading gradients by a plurality of participants;
Each participant i calls the add-noise intelligent contract module with its gradient $\nabla W_{t,i}$ as a parameter. The module calculates the hash value of the gradient:
$$H_i = Hash(\nabla W_{t,i})$$
Using the hash value $H_i$ of the gradient $\nabla W_{t,i}$ as a random number seed, it generates the noise value $R_i = Random(H_i)$ and calculates the noise-added gradient:
$$\nabla \widetilde{W}_{t,i} = \nabla W_{t,i} + R_i$$
and stores $\nabla \widetilde{W}_{t,i}$ in the blockchain.
Step 3, verifying data and sharing noise values;
Each participant i calls the shared-noise-value intelligent contract module with the hash value $H_i$ of its gradient $\nabla W_{t,i}$ as a parameter. The module verifies whether the noise-added gradient $\nabla \widetilde{W}_{t,i}$ stored in the blockchain corresponds to $H_i$. It first calculates the noise value $R_i' = Random(H_i)$ from the hash value $H_i$, then uses $R_i'$ and the noise-added gradient $\nabla \widetilde{W}_{t,i}$ to calculate the gradient value $\nabla W_{t,i}' = \nabla \widetilde{W}_{t,i} - R_i'$, and checks whether the hash value $H_i' = Hash(\nabla W_{t,i}')$ is consistent with $H_i$. If they are identical, the gradient $\nabla W_{t,i}$ uploaded by participant i has not been altered; otherwise, the noise-added gradient stored in the blockchain is inconsistent with the gradient uploaded by the participant, meaning either the data was corrupted during upload or the participant is a malicious node, and the participant is removed from the system.
After the data is verified as consistent, $R_i'$ is randomly divided into $n_i$ parts, giving $n_i$ random numbers $r_i^1, r_i^2, \ldots, r_i^{n_i}$ with $\sum_{j=1}^{n_i} r_i^j = R_i'$, and the random numbers $r_i^1, \ldots, r_i^{n_i}$ are stored in the blockchain.
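Step 3 can be sketched as follows (hypothetical helper names; integer-valued noise keeps the add/remove round trip bit-exact, so the hash comparison is meaningful):

```python
import hashlib
import numpy as np

def grad_hash(g: np.ndarray) -> str:
    return hashlib.sha256(g.tobytes()).hexdigest()

def noise_from_hash(h: str, size: int) -> np.ndarray:
    # Deterministic noise regenerated from the gradient's hash (the seed).
    rng = np.random.default_rng(int(h, 16) % 2**32)
    return rng.integers(-1000, 1000, size=size).astype(float)

# On-chain state produced in step 2 (toy values).
grad = np.array([0.5, -1.0, 2.0])
H_i = grad_hash(grad)
noisy_grad = grad + noise_from_hash(H_i, grad.size)

# Step 3: regenerate R_i' from H_i, strip it, and check the hash.
R_prime = noise_from_hash(H_i, grad.size)
recovered = noisy_grad - R_prime
verified = grad_hash(recovered) == H_i

# Randomly split R_i' into n_i shares that sum back to R_i'.
rng = np.random.default_rng(0)
n_i = 3
partial = rng.normal(size=(n_i - 1, grad.size))
shares = np.vstack([partial, R_prime - partial.sum(axis=0)])
```

If a tampered `noisy_grad` were stored on chain, `recovered` would hash to a different value and `verified` would be False, triggering the removal of the offending participant.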
Step 4, obtaining a noise value sum;
To aggregate the gradients, the central server first needs to obtain the noise sum, which it acquires by calling the noise-sum intelligent contract module. The module sums all the noise shares recorded in the ledger, $R = \sum_{i} \sum_{j=1}^{n_i} r_i^j$, and returns $R$ to the central server.
Step 5, generating a deviation-free gradient;
The central server obtains the noise-added gradients $\nabla \widetilde{W}_{t,i}$ from the blockchain, calculates the average gradient $\nabla \bar{W}_t = \frac{1}{N}\bigl(\sum_{i=1}^{N} \nabla \widetilde{W}_{t,i} - R\bigr)$, and obtains the new model parameters $W_{t+1} = W_t - \eta\,\nabla \bar{W}_t$.
Further, after sharing the new model parameters to each participant, the next training is performed according to the steps until the federal learning model meeting the requirements is obtained.
In the embodiment of the invention, in the gradient-uploading stage, a participant invokes an intelligent contract to add noise to its gradient, so the gradient stored in the blockchain is not the true gradient. In the noise-value-sharing stage, participants call the intelligent contract with the gradient hash value as a parameter; the contract first verifies, from the gradient hash value, that the noise-added gradient stored in the blockchain is consistent with the uploaded gradient, and then randomly divides the noise value into several shares. A malicious attacker cannot determine which participant a divided noise share belongs to, and because all divided shares are stored in the same account, the true noise values cannot be recovered; consequently the real gradient cannot be reconstructed from the noise-added gradient and the divided noise shares, which further protects users' privacy. In the noise-sum-acquisition stage, the central server obtains only the noise sum through the intelligent contract, preventing the individual noise values from being obtained directly and the real gradients from being restored. In the gradient-aggregation stage, the central server calculates the average gradient using the noise-added gradients and the noise sum, and updates the model parameters. Compared with other existing noise-adding schemes, the method loses no model-parameter precision, guaranteeing the machine learning training effect; and because a blockchain network is introduced, the parties need no additional transmission channels for exchanging information, which greatly enhances transmission security and reduces network traffic.
Another embodiment of the present invention provides a blockchain-based privacy protection system resisting federal learning gradient attacks, the system comprising a plurality of participants, a central server, and a blockchain system. Each of the plurality of participants trains with its local training data according to the model parameters shared by the central server to obtain a first gradient, obtains a first hash value by calculation from the first gradient, and uploads the first gradient and the first hash value to the blockchain system.
The blockchain system comprises a noise-adding smart contract module, a shared-noise-value smart contract module and a noise-sum smart contract module, wherein:
the noise-adding smart contract module is used for: receiving the first gradient and the first hash value uploaded by each participant; for each first gradient, calculating a second hash value corresponding to the first gradient; and generating a first noise value with the second hash value as a random-number seed and adding the first noise value to the first gradient to obtain a noise-added first gradient;
the shared-noise-value smart contract module is used for: for each first hash value, generating a second noise value with the first hash value as a random-number seed, and subtracting the second noise value from the noise-added first gradient to obtain a second gradient; calculating the hash value of the second gradient and judging whether it is the same as the first hash value, sameness indicating that the first gradient uploaded by the participant has not been changed; and dividing the second noise value corresponding to each participant into a plurality of parts;
the noise-sum smart contract module is used for: summing the plurality of second noise values corresponding to the plurality of participants and transmitting the obtained noise value sum to the central server;
the central server is used for calculating an average gradient according to the noise value sum and the noise-added first gradients, and performing a gradient update according to the average gradient.
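The noise-splitting step of the shared-noise-value module can be illustrated with additive secret sharing. This is a hedged sketch, not the patent's on-chain code: scalar noise values, the ledger list, and the function name `split_noise` are assumptions for illustration only:

```python
import random

def split_noise(noise: int, parts: int, rng: random.Random) -> list[int]:
    """Split one noise value into `parts` random additive shares."""
    shares = [rng.randint(-10**9, 10**9) for _ in range(parts - 1)]
    shares.append(noise - sum(shares))         # last share makes the sum exact
    return shares

rng = random.Random(42)
second_noise_values = [123456, -98765, 555]    # one scalar noise per participant
ledger = []                                    # shares stored on-chain, unattributed
for v in second_noise_values:
    ledger.extend(split_noise(v, parts=4, rng=rng))
rng.shuffle(ledger)                            # an attacker cannot tell whose share is whose

# Noise-sum contract: the sum of all shares equals the sum of all noise values,
# which is the only quantity handed to the central server.
R = sum(ledger)
assert R == sum(second_noise_values)
```

Any single share (or any subset smaller than a participant's full set of shares) is statistically uninformative about that participant's noise value, which is why the splitting prevents recovery of the true gradient.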
In this embodiment, preferably, the calculation formula of the first gradient is:
∇W_{t,i} = ∂L(f(x_{t,i}; W_t), y_{t,i}) / ∂W_t

wherein W_t represents the model weight parameters of the t-th round in the training process of the federal learning model; x_{t,i} and y_{t,i} respectively represent the data input of participant i at the t-th round and the corresponding label; f(·) represents the predicted value of the input data under the current model parameters; and L(·) represents the loss function constructed using the predicted value and the real label.
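As a concrete instance of the first-gradient formula above, assume (purely for illustration) a scalar linear model f(x; W) = W·x with squared loss; the partial derivative with respect to W can then be written out and checked numerically:

```python
def predict(W, x):
    return W * x                    # f(x; W): linear model, an illustrative choice

def loss(W, x, y):
    return 0.5 * (predict(W, x) - y) ** 2    # L(f(x; W), y): squared loss

def first_gradient(W, x, y):
    # dL/dW = (f(x; W) - y) * x for the squared loss and linear model above
    return (predict(W, x) - y) * x

W_t, x_ti, y_ti = 2.0, 3.0, 5.0
g = first_gradient(W_t, x_ti, y_ti)          # (2*3 - 5) * 3 = 3.0

# Sanity check against a central-difference numerical derivative of the loss.
eps = 1e-6
g_num = (loss(W_t + eps, x_ti, y_ti) - loss(W_t - eps, x_ti, y_ti)) / (2 * eps)
assert abs(g - g_num) < 1e-4
```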
In this embodiment, preferably, the calculation formula of the average gradient is:
ḡ_t = (1/N) · (Σ_{i=1}^{N} ∇W′_{t,i} − R)

wherein N represents the total number of participants; ∇W′_{t,i} represents the noise-added first gradient; and R represents the noise value sum.
In this embodiment, preferably, the formula for gradient update according to the average gradient is:
W_{t+1} = W_t − η · ḡ_t

where η represents the learning rate.
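The two server-side formulas (average gradient and gradient update) can be checked with a tiny numeric sketch; the scalar gradients and sample values below are illustrative assumptions, not data from the patent:

```python
N = 3
noisy_grads = [1.5, -0.5, 2.0]     # noise-added first gradients (scalars, for clarity)
R = 0.9                            # noise value sum reported by the noise-sum contract

avg_grad = (sum(noisy_grads) - R) / N          # average gradient: (1/N)(sum - R)

eta, W_t = 0.1, 1.0                            # learning rate and current weight
W_next = W_t - eta * avg_grad                  # gradient update: W_{t+1} = W_t - eta * avg
```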
The functions of the blockchain-based privacy protection system resisting federal learning gradient attacks according to the present embodiment can be explained with reference to the above blockchain-based privacy protection method resisting federal learning gradient attacks; details not repeated in this embodiment can be found in the above method embodiments.
Further experiments prove the technical effect of the invention.
The experiment uses the COCO dataset, an image-recognition dataset containing 80 object categories that is suitable for object detection, segmentation and image description. In the experiment, a vertical (longitudinal) federal learning method is adopted.
Training data were input into an existing federal learning model and into the federal learning model trained by the method of the embodiment of the invention, and gradient-recovery attacks were performed on both; the results are shown in fig. 4 and fig. 5. As can be seen from fig. 4, after 50 and 70 rounds of iteration the existing federal learning model allows an attacker to obtain partial information of the original content; as can be seen from fig. 5, after 70 rounds of iteration the federal learning model trained by the method of the embodiment of the invention leaves the attacker unable to obtain effective information.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of the above description, will appreciate that other embodiments are contemplated within the scope of the invention as described herein. The disclosure of the present invention is intended to be illustrative, but not limiting, of the scope of the invention, which is defined by the appended claims.

Claims (8)

1. A blockchain-based privacy protection method resisting federal learning gradient attacks, characterized in that gradient updating is performed in the training process of a federal learning model according to the following process, so as to protect the privacy of the local training data of each participant:
the blockchain system receives a first gradient and a first hash value uploaded by each participant; the first gradient is obtained by each participant through training with its local training data according to the model parameters shared by the central server; the first hash value is obtained by calculation from the first gradient;
for each first gradient, the blockchain system calculates a second hash value corresponding to the first gradient; generates a first noise value with the second hash value as a random-number seed, and adds the first noise value to the first gradient to obtain a noise-added first gradient;
for each first hash value, the blockchain system generates a second noise value with the first hash value as a random-number seed, and subtracts the second noise value from the noise-added first gradient to obtain a second gradient; calculates the hash value of the second gradient and judges whether it is the same as the first hash value, sameness indicating that the first gradient uploaded by the participant has not been changed; and divides the second noise value corresponding to each participant into a plurality of parts;
the blockchain system sums the plurality of second noise values corresponding to the plurality of participants and transmits the obtained noise value sum to a central server;
the central server calculates an average gradient according to the noise value sum and the noise-added first gradients, and performs a gradient update according to the average gradient.
2. The blockchain-based privacy protection method against federal learning gradient attack of claim 1, wherein the first gradient is calculated according to the formula:
∇W_{t,i} = ∂L(f(x_{t,i}; W_t), y_{t,i}) / ∂W_t

wherein W_t represents the model weight parameters of the t-th round in the training process of the federal learning model; x_{t,i} and y_{t,i} respectively represent the data input of participant i at the t-th round and the corresponding label; f(·) represents the predicted value of the input data under the current model parameters; and L(·) represents the loss function constructed using the predicted value and the real label.
3. The blockchain-based privacy protection method against federal learning gradient attack of claim 2, wherein the average gradient is calculated according to the formula:
ḡ_t = (1/N) · (Σ_{i=1}^{N} ∇W′_{t,i} − R)

wherein N represents the total number of participants; ∇W′_{t,i} represents the noise-added first gradient; and R represents the noise value sum.
4. A blockchain-based privacy protection method against federal learning gradient attack as in claim 3, wherein the formula for gradient update according to the average gradient is:
W_{t+1} = W_t − η · ḡ_t

where η represents the learning rate.
5. A blockchain-based privacy protection system resisting federal learning gradient attacks, characterized by comprising a plurality of participants, a central server and a blockchain system; each of the plurality of participants trains with its local training data according to the model parameters shared by the central server to obtain a first gradient; obtains a first hash value by calculation from the first gradient; and uploads the first gradient and the first hash value to the blockchain system;
the blockchain system comprises a noise-adding smart contract module, a shared-noise-value smart contract module and a noise-sum smart contract module, wherein:
the noise-adding smart contract module is used for: receiving the first gradient and the first hash value uploaded by each participant; for each first gradient, calculating a second hash value corresponding to the first gradient; and generating a first noise value with the second hash value as a random-number seed and adding the first noise value to the first gradient to obtain a noise-added first gradient;
the shared-noise-value smart contract module is used for: for each first hash value, generating a second noise value with the first hash value as a random-number seed, and subtracting the second noise value from the noise-added first gradient to obtain a second gradient; calculating the hash value of the second gradient and judging whether it is the same as the first hash value, sameness indicating that the first gradient uploaded by the participant has not been changed; and dividing the second noise value corresponding to each participant into a plurality of parts;
the noise-sum smart contract module is used for: summing the plurality of second noise values corresponding to the plurality of participants and transmitting the obtained noise value sum to the central server;
the central server is used for calculating an average gradient according to the noise value sum and the noise-added first gradients, and performing a gradient update according to the average gradient.
6. The blockchain-based privacy protection system against federal learning gradient attack of claim 5, wherein the first gradient is calculated as:
∇W_{t,i} = ∂L(f(x_{t,i}; W_t), y_{t,i}) / ∂W_t

wherein W_t represents the model weight parameters of the t-th round in the training process of the federal learning model; x_{t,i} and y_{t,i} respectively represent the data input of participant i at the t-th round and the corresponding label; f(·) represents the predicted value of the input data under the current model parameters; and L(·) represents the loss function constructed using the predicted value and the real label.
7. The blockchain-based privacy protection system against federal learning gradient attack of claim 6, wherein the mean gradient is calculated as:
ḡ_t = (1/N) · (Σ_{i=1}^{N} ∇W′_{t,i} − R)

wherein N represents the total number of participants; ∇W′_{t,i} represents the noise-added first gradient; and R represents the noise value sum.
8. The blockchain-based privacy protection system against federal learning gradient attack of claim 7, wherein the formula for gradient update based on average gradient is:
W_{t+1} = W_t − η · ḡ_t

where η represents the learning rate.
CN202211516395.1A 2022-11-30 2022-11-30 Block chain-based privacy protection method and system for resisting federal learning gradient attack Pending CN116340986A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211516395.1A CN116340986A (en) 2022-11-30 2022-11-30 Block chain-based privacy protection method and system for resisting federal learning gradient attack

Publications (1)

Publication Number Publication Date
CN116340986A true CN116340986A (en) 2023-06-27

Family

ID=86888188

Country Status (1)

Country Link
CN (1) CN116340986A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116828453A (en) * 2023-06-30 2023-09-29 华南理工大学 Unmanned aerial vehicle edge computing privacy protection method based on self-adaptive nonlinear function
CN116828453B (en) * 2023-06-30 2024-04-16 华南理工大学 Unmanned aerial vehicle edge computing privacy protection method based on self-adaptive nonlinear function


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination