CN114036566A - Verifiable federal learning method and device based on block chain and lightweight commitment - Google Patents

Verifiable federal learning method and device based on block chain and lightweight commitment

Info

Publication number
CN114036566A
CN114036566A (application CN202111387073.7A)
Authority
CN
China
Prior art keywords
gradient
result
noise
commitment
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111387073.7A
Other languages
Chinese (zh)
Inventor
高胜
罗靖杰
朱建明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central University of Finance and Economics
Original Assignee
Central University of Finance and Economics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central University of Finance and Economics
Priority to CN202111387073.7A priority Critical patent/CN114036566A/en
Publication of CN114036566A publication Critical patent/CN114036566A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 - Protecting data
    • G06F21/62 - Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 - Protecting access to data via a platform, e.g. using keys or access control rules, to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 - Protecting personal data, e.g. for financial or medical purposes
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computer Security & Cryptography (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Storage Device Security (AREA)

Abstract

The application relates to the technical field of network information security, and in particular to a verifiable federated learning method and device based on a blockchain and a lightweight commitment. The method comprises the following steps: initializing all hyper-parameters, training a deep learning model with a local data set, and generating a noisy gradient by adding noise to the gradient vector based on a masking method; performing commitment processing on the noisy gradient, recording the commitment in the blockchain, and uploading the noisy gradient to a parameter server; and receiving the aggregation result sent by the parameter server after it aggregates all noisy gradients, and updating the model based on the aggregation result after successful verification. This solves problems in the related art such as federated learning schemes relying heavily on the semi-honest assumption of the parameter server and being unable to guarantee the correctness of the aggregation result.

Description

Verifiable federal learning method and device based on block chain and lightweight commitment
Technical Field
The application relates to the technical field of network information security, in particular to a verifiable federal learning method and a verifiable federal learning device based on a block chain and a lightweight commitment.
Background
In the big data era, deep learning technology has developed rapidly. Machine learning models trained on massive data achieve better generalization ability and unprecedented performance across tasks. Deep learning applications in daily life, transportation, healthcare, and social networking have penetrated every aspect of society and profoundly changed how people live and work. The performance of a deep learning model depends to a great extent on the amount of training data, but data owners cannot directly share their data because of data security concerns, industry competition, laws and regulations, and various approval processes, which results in data islands.
Federated Learning (FL) offers a possible solution to the data-island and user-privacy problems because it can train deep learning models collaboratively without directly sharing data. In federated learning, each client holding sensitive data trains a model locally, computes a gradient from its local data, and uploads the gradient to a parameter server; the parameter server performs an aggregation operation and returns the aggregated gradient to the clients, which update their parameters locally after receiving it and then carry out the next round of training. Training models with federated learning can simultaneously satisfy requirements such as user privacy protection, improved model accuracy, and enhanced generalization ability.
The federated learning schemes in the related art rely heavily on the semi-honest assumption about the parameter server, i.e., the assumption that the parameter server responsible for aggregating the parties' parameters returns the correct aggregation result. However, a central server acting as an outsourced service provider may, in its own interest, save computation overhead and not perform the aggregation completely, and a malicious server may tamper with the aggregation result to mount privacy attacks. The semi-honest assumption in the related art therefore cannot guarantee the correctness of the aggregation result.
Disclosure of Invention
The application provides a verifiable federated learning method and device based on a blockchain and a lightweight commitment, a client, and a storage medium, so as to solve problems in the related art such as federated learning schemes relying heavily on the semi-honest assumption of the parameter server and being unable to guarantee the correctness of the aggregation result.
An embodiment of the first aspect of the present application provides a verifiable federated learning method based on a blockchain and a lightweight commitment, which includes the following steps: initializing all hyper-parameters, training a deep learning model with a local data set, and generating a noisy gradient by adding noise to the gradient vector based on a masking method; performing commitment processing on the noisy gradient, recording the commitment in the blockchain, and uploading the noisy gradient to a parameter server; and receiving the aggregation result sent by the parameter server after it aggregates all noisy gradients, and updating the model based on the aggregation result after successful verification.
Further, performing commitment processing on the noisy gradient includes: performing an irreversible linear transformation on the gradient using an irreversible matrix to obtain a transformed result; and generating, from the transformed result, a commitment that is sent to the other clients.
Further, before the model is updated based on the successfully verified aggregation result, the method further includes: receiving the random vectors broadcast by the other participants and summing them locally to obtain a summation result; calculating a verification value from the commitment, and calculating a coefficient vector from the summation result of the random vectors and the irreversible matrix; and verifying whether the aggregation result is correct according to the verification value and the coefficient vector.
Further, calculating the coefficient vector from the random vector and the irreversible matrix includes: multiplying the grouped row vectors of the summed random vector by the irreversible matrix to generate the corresponding coefficients.
Further, the verification formula is:

∑_{i=1}^{N} V_i = N · (S · g)

where N represents the number of participants, i represents the i-th participant among the N participants, V_i represents the verification value, S represents the coefficient vector, and g represents the aggregation result.
An embodiment of the second aspect of the present application provides a verifiable federated learning device based on a blockchain and a lightweight commitment, including: a noising module, configured to initialize all hyper-parameters, train a deep learning model with a local data set, and generate a noisy gradient by adding noise to the gradient vector based on a masking method; a processing module, configured to perform commitment processing on the noisy gradient, record the commitment in the blockchain, and upload the noisy gradient to a parameter server; and an updating module, configured to receive the aggregation result sent by the parameter server after it aggregates all noisy gradients, and update the model based on the aggregation result after successful verification.
Further, the processing module is further configured to perform an irreversible linear transformation on the gradient using the irreversible matrix to obtain a transformed result, and to generate, from the transformed result, a commitment that is sent to the other clients.
Further, the device includes: a verification module, configured to, before the model is updated based on the successfully verified aggregation result, receive the random vectors broadcast by the other participants and sum them locally to obtain a summation result; calculate a verification value from the commitments, and calculate a coefficient vector from the summation result of the random vectors and the irreversible matrix; and verify whether the aggregation result is correct according to the verification value and the coefficient vector. Calculating the coefficient vector from the random vector and the irreversible matrix includes: multiplying the grouped row vectors of the summed random vector by the irreversible matrix to generate the corresponding coefficients, where the verification formula is:
∑_{i=1}^{N} V_i = N · (S · g)

where N represents the number of participants, i represents the i-th participant among the N participants, V_i represents the verification value, S represents the coefficient vector, and g represents the aggregation result.
An embodiment of a third aspect of the present application provides a client, including: a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor executing the program to implement the verifiable federated learning method based on blockchains and lightweight commitments as described in the embodiments above.
An embodiment of a fourth aspect of the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the verifiable federated learning method based on blockchains and lightweight commitments as described in the embodiments above.
Therefore, the application has at least the following beneficial effects:
the irreversible matrix is multiplied by the grouped gradients, the gradient vector is subjected to lightweight commitment, privacy protection of the original gradient is achieved, the consistency of all participants receiving commitments is guaranteed by combining the block chain, and credible aggregation verification is achieved, so that verifiable federal learning is achieved based on the block chain and the lightweight commitment, the correctness of the server aggregation result can be verified, and the correctness of aggregation and the safety of a model are effectively guaranteed. Therefore, the technical problems that the federal learning scheme in the related technology seriously depends on the semi-honest assumption of the parameter server, the correctness of the aggregation result cannot be guaranteed and the like are solved.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a schematic view of an application scenario of verifiable federal learning based on blockchains and lightweight commitments, provided in an embodiment of the present application;
FIG. 2 is a flow diagram of a verifiable federated learning method based on blockchains and lightweight commitments provided in accordance with an embodiment of the present application;
FIG. 3 is a flow diagram of a verifiable federated learning method based on blockchains and lightweight commitments provided according to one embodiment of the present application;
fig. 4 is a flow chart of a gradient commitment phase in which a client makes a commitment to its gradient according to an embodiment of the present application;
fig. 5 is a flowchart of an aggregation verification phase for verifying an aggregation result according to all commitments by a client according to an embodiment of the present application;
fig. 6 is a block diagram illustrating a verifiable federated learning device based on blockchains and lightweight commitments provided in accordance with an embodiment of the present application;
fig. 7 is a block diagram of a client according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.
To address the problems mentioned in the background, namely that federated learning schemes in the related art rely heavily on the semi-honest assumption of the parameter server and cannot guarantee the correctness of the aggregation result, a verifiable federated learning method based on a blockchain and a lightweight commitment is needed. The method is applied to each client's verification of the parameter server's aggregation result during federated learning, guaranteeing the correctness of the aggregation result.
Therefore, a federated learning method that supports clients in verifying the server's aggregation result is provided, which can guarantee the correctness of the result returned by the parameter server. The method of the embodiments of the present application can be applied to federated learning scenarios in various fields. In the following embodiments, taking federated learning under an untrusted parameter server as an example, an aggregation verification method based on a blockchain and a lightweight commitment is applied to federated learning, realizing the clients' verification of the server's aggregation result and guaranteeing the correctness of aggregation and the security of the model. A schematic diagram of verifiable federated learning based on a blockchain and a lightweight commitment is shown in fig. 1; the entities involved in the scenario include a parameter server, clients, and a blockchain network, and the execution subject of the verifiable federated learning method based on a blockchain and a lightweight commitment in the embodiments of the present application is the client.
A verifiable federal learning method, apparatus, client, and storage medium based on blockchains and lightweight commitments according to embodiments of the present application will be described below with reference to the accompanying drawings. Specifically, fig. 2 is a flowchart illustrating a verifiable federal learning method based on blockchains and lightweight commitments according to an embodiment of the present application.
As shown in fig. 2, the verifiable federal learning method based on blockchains and lightweight commitments includes the following steps:
in step S101, all hyper-parameters are initialized, a deep learning model is trained by using a local data set, and after a noise addition process is performed on the gradient vector based on a masking method, a noise-added gradient is generated.
It can be understood that, in the embodiment of the present application, all hyper-parameters are initialized first, and after the initialization, the client performs local model training and noise adding.
Specifically, as shown in fig. 3, step S101 includes:
s01, initializing all hyper-parameters
Before training begins, all clients jointly negotiate two non-zero irreversible (i.e., non-invertible) matrices, to ensure that clients cannot cheat during the verification process. One matrix, Mat, is of size M × M; the other, Mat_last, is of size (d mod M) × (d mod M), where d is the dimension of the gradient vector and M is a hyper-parameter agreed upon by the clients in advance and used for grouping the gradient vectors. The participants also negotiate pairwise seeds for a pseudo-random number generator; specifically, participants P_i and P_j negotiate a seed s_{i,j} used for the subsequent noising of the gradient.
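As an illustrative sketch of this setup (the construction below, the NumPy names, and the sizes are assumptions for demonstration; the patent only requires the matrices to be non-zero and non-invertible):

```python
import numpy as np

def make_irreversible_matrix(m, rng):
    """Build a non-zero m x m matrix with no inverse by forcing the last
    row to equal the sum of the other rows: the rows become linearly
    dependent, so the determinant is zero."""
    mat = rng.integers(1, 100, size=(m, m)).astype(float)
    mat[-1] = mat[:-1].sum(axis=0)
    assert abs(np.linalg.det(mat)) < 1e-6  # sanity check: non-invertible
    return mat

rng = np.random.default_rng(2021)  # seed agreed by all clients beforehand
M, d = 4, 10                       # illustrative group size and gradient dimension
Mat = make_irreversible_matrix(M, rng)
Mat_last = make_irreversible_matrix(d % M, rng)  # assumes d % M != 0
```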
S02, the client performs local model training and adds noise to the gradient
Each client trains the deep learning model with its own local data set. Client P_i randomly selects a certain number of samples from its local data set D_i and inputs them into the model for training. Differentiating the loss function yields the direction of the next round of parameter update, i.e., the gradient g_i. Specifically, client P_i selects a random subset of its local data set to form a sample set D′_i and computes the loss function L_f and the gradient g_i as follows:

L_f(D′_i, ω_t) = (1/|D′_i|) ∑_{(X_i, Y_i)∈D′_i} (Y_i - f(X_i, ω_t))²

where (X_i, Y_i) is a sample and its label, and f(x, ω_t) is the deep learning model with parameters ω_t. From the global model ω_t of the t-th round and the selected sample set D′_i, client P_i computes the squared difference (Y_i - f(X_i, ω_t))² between the predicted value and the true label of each sample point; the resulting average is the loss L_f(D′_i, ω_t) of the current round. The goal of model training is to minimize the difference between the predicted and true values, i.e., to minimize the loss, so the optimization direction of client P_i's parameters in this round is

g_i = ∇_{ω_t} L_f(D′_i, ω_t)

i.e., the gradient g_i.
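As a minimal sketch of this local computation, assuming a simple linear model f(X, ω) = X·ω so that the gradient has a closed form (the patent does not fix a model architecture):

```python
import numpy as np

def local_loss_and_gradient(X, Y, omega_t, batch_size, rng):
    """Sample a random subset D'_i, then compute the squared-error loss
    L_f(D'_i, omega_t) and its gradient g_i w.r.t. omega_t."""
    idx = rng.choice(len(X), size=batch_size, replace=False)
    Xb, Yb = X[idx], Y[idx]
    residual = Yb - Xb @ omega_t                  # Y_i - f(X_i, omega_t)
    loss = np.mean(residual ** 2)                 # average squared distance
    grad = -2.0 * (Xb.T @ residual) / batch_size  # derivative of the loss
    return loss, grad
```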
To avoid the privacy risk of uploading gradients directly, each client uses a masking technique to add noise to its gradient vector. Most existing noising mechanisms affect the accuracy of the final aggregation result and reduce data availability; the masking technique avoids these limitations. Specifically, participant P_i processes the gradient g_i containing its private information as follows:

g′_i = g_i + ∑_{j∈P, j>i} PRG(s_{i,j}) - ∑_{j∈P, j<i} PRG(s_{i,j})

The noised gradients have the following property: the aggregate of the noised values, ∑ g′_i, equals the aggregate of the true values, ∑ g_i, while each individual noised gradient g′_i differs from its true value g_i. By noising the gradient values in this way, the server cannot obtain client P_i's original private gradient g_i, while the accuracy of the final aggregation result is unaffected.
In step S102, commitment processing is performed on the noisy gradient, the commitment is recorded in the blockchain, and the noisy gradient is uploaded to the parameter server.
It will be appreciated that after generating the noisy gradient, the client makes a commitment to its true gradient, and then uploads the noisy gradient to the parameter server.
Specifically, as shown in fig. 3, step S102 includes:
s03, the client makes a commitment to the real gradient and records the commitment in the block chain
The embodiments of the present application design a lightweight commitment method: the client generates the corresponding commitments by multiplying the commitment matrices (all non-zero irreversible matrices) with the gradient vectors. To avoid the d × d matrix that would be required when the gradient dimension d is too large occupying too much storage space, the commitments are generated after the gradient vectors are grouped. When a client has sent its commitment to the other clients and received the commitments from all other clients, it uploads its own noisy gradient to the server.
S04, uploading the noise gradient to a parameter server by the client
The client uploads its noisy gradient vector g′_i to the parameter server. Client P_i uploads g′_i only after receiving the commitments C_j sent by all other clients.
In this embodiment, the commitment processing on the noise gradient includes: carrying out irreversible linear transformation on the gradient by using an irreversible matrix to obtain a transformed result; and generating a commitment sent to other clients according to the transformed result.
Specifically, the verifiability of verifiable federated learning based on a blockchain and a lightweight commitment includes a gradient commitment phase in which a client makes a commitment to its gradient. As shown in fig. 4, the specific process of the gradient commitment phase is as follows:
To prevent the commitment from revealing user privacy, the client first performs an irreversible linear transformation on its own gradient g_i using the irreversible matrix Mat; the irreversibility of the transformation ensures that a party receiving the commitment cannot derive the original gradient from it. The transformed result is sent to the other clients as the commitment. To prevent an overly large gradient dimension d from occupying too much memory and increasing computation and storage costs, the client groups the gradients and then commits with small-dimension irreversible matrices. Specifically, client P_i first groups the gradients, dividing the gradient values into groups of M dimensions each: the d-dimensional gradient vector g_i = (g_i(1), g_i(2), ..., g_i(d)) is grouped by taking M elements from it in order each time to form a new vector. For example, the first group takes the 1st to M-th elements of the original gradient vector, (g_i(1), g_i(2), ..., g_i(M)), as the first vector g_i,[1]. The last group of gradient values is allowed to have fewer than M elements and is formed from the elements remaining after grouping; its number of elements is the original gradient dimension d modulo the group size M, i.e., d mod M. The grouped gradient values are shown below:
g_i,[1] = (g_i(1), g_i(2), ..., g_i(M))
g_i,[2] = (g_i(M+1), g_i(M+2), ..., g_i(2M))
g_i,[k] = (g_i(kM-M+1), g_i(kM-M+2), ..., g_i(kM))
g_i,[last] = (g_i(d - (d mod M) + 1), ..., g_i(d))
where g_i(l) represents the l-th dimension gradient value of the original gradient vector g_i, and g_i,[k] represents the k-th group obtained by grouping g_i. Based on the grouping result, the client uses the irreversible matrix Mat to generate the commitments:

C_i,[k] = Mat · g_i,[k]
For the last group of gradients, whose number of gradient values may be less than M, the additional irreversible matrix Mat_last is needed to generate the corresponding commitment:

C_i,[last] = Mat_last · g_i,[last]

The dimension of Mat_last corresponds to the number of elements in the last group of gradients; Mat_last is of size (d mod M) × (d mod M). The client concatenates the generated groups of commitments into a vector with the same dimension as the gradient,

C_i = (C_i,[1], C_i,[2], ..., C_i,[last]),

and shares it with the other participants.
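A sketch of the grouping-and-commit step, reusing the Mat and Mat_last assumed in the setup sketch above:

```python
import numpy as np

def commit(grad, Mat, Mat_last, M):
    """Split the d-dimensional gradient into groups of M values, multiply
    each group by the non-invertible matrix Mat (the short tail group by
    Mat_last), and concatenate the results into the commitment C_i, which
    has the same dimension as the gradient."""
    d = grad.shape[0]
    full = d - d % M
    groups = [Mat @ grad[k:k + M] for k in range(0, full, M)]
    if d % M:
        groups.append(Mat_last @ grad[full:])
    return np.concatenate(groups)
```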
In step S103, the aggregation result sent by the parameter server after it aggregates all noisy gradients is received, and the model is updated based on the aggregation result after successful verification.
It can be understood that after the clients upload their noisy gradients to the parameter server, the server is responsible for aggregating all of them and returning the aggregation result to the clients; each client then verifies the aggregation result and updates its local model.
Specifically, as shown in fig. 3, step S103 includes:
s05, the parameter server aggregates all gradients and sends the aggregation result to each client
After receiving the gradients uploaded by all clients, the server sums them, divides by the number of submitting clients to obtain the average gradient, and returns the result to each client. If the server performs the aggregation honestly, the returned result is

g = (1/N) ∑_{i=1}^{N} g′_i = (1/N) ∑_{i=1}^{N} g_i

otherwise the returned result satisfies

g ≠ (1/N) ∑_{i=1}^{N} g_i
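The honest server-side step reduces to an average over the masked uploads; a sketch:

```python
import numpy as np

def aggregate(noisy_gradients):
    """Sum the clients' noisy gradients and divide by their number.
    The pairwise masks cancel in the sum, so this equals the average
    of the true gradients."""
    return np.sum(noisy_gradients, axis=0) / len(noisy_gradients)
```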
S06, verifying the aggregation result by the client
The client verifies whether the aggregation result is correct using the commitments received before training. Each client generates a random vector R_i, sends it to the other clients, and sums the vectors received from all other clients to obtain the aggregated random vector R. The correctness of the aggregation result g is verified using R and the commitments. If verification fails, the parameter server is required to re-execute S05; when the result returned by the parameter server passes verification, the local model is updated.
S07, the client updates the local model
The client adjusts the local parameters ω_t using the verified gradient g to obtain the local model ω_{t+1} for the next round of training. Specifically, the client updates the model parameters according to the pre-agreed hyper-parameter η: ω_{t+1} = ω_t - η · g.
In this embodiment, before the model is updated based on the successfully verified aggregation result, the method further includes: receiving the random vectors broadcast by the other participants and summing them locally to obtain a summation result; calculating a verification value from the commitments, and calculating a coefficient vector from the summation result of the random vectors and the irreversible matrix; and verifying whether the aggregation result is correct according to the verification value and the coefficient vector. Calculating the coefficient vector from the random vector and the irreversible matrix includes: multiplying the grouped row vectors of the summed random vector by the irreversible matrix to generate the corresponding coefficients.
Specifically, the verifiability of verifiable federated learning based on a blockchain and a lightweight commitment further includes an aggregation verification phase in which the client verifies the aggregation result according to all commitments. As shown in fig. 5, the specific process of the aggregation verification phase is as follows:
each participant generates a random row vector R with the same dimension as the original gradientiAnd broadcast is performed. After receiving random vectors from other participants, each participant locally sums the random vectors to obtain a result
Figure BDA0003367481690000073
Figure BDA0003367481690000074
This way of generating R ensures that R cannot be manipulated by the adversary as long as at least one client does not collude with the adversary. On this basis, every other client can compute the verification value of P_i from its published commitment: V_i = R · C_i. Meanwhile, any participant can compute a coefficient vector S from the random vector R and the irreversible matrices; this coefficient vector is used to verify whether the aggregation result is correct. Specifically, the coefficient vector S is computed by multiplying the grouped row vectors R_[k] with the irreversible matrices to produce the corresponding coefficients. The random vector R is grouped in the same way as the gradient vectors, i.e., M elements are taken from it in order each time. The grouped random vectors are represented as follows:
R_[1] = (R(1), R(2), ..., R(M))
R_[2] = (R(M+1), R(M+2), ..., R(2M))
R_[last] = (R(d - (d mod M) + 1), ..., R(d))
Further, each grouped random vector R_[k] generates the corresponding coefficient vector by S_[k] = R_[k] · Mat; for the last group of random vectors, which has only d mod M elements, Mat_last must be used to generate the corresponding coefficient vector:

S_[last] = R_[last] · Mat_last

The client concatenates the groups of coefficient vectors in order into the complete coefficient vector

S = (S_[1], S_[2], ..., S_[last])
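A sketch of deriving R and the coefficient vector S on the client, with grouping identical to the commit step:

```python
import numpy as np

def coefficient_vector(random_vectors, Mat, Mat_last, M):
    """Sum the broadcast random vectors R_i into R, then compute
    S_[k] = R_[k] · Mat per group (Mat_last for the short tail) and
    concatenate the groups into the full coefficient vector S."""
    R = np.sum(random_vectors, axis=0)
    d = R.shape[0]
    full = d - d % M
    parts = [R[k:k + M] @ Mat for k in range(0, full, M)]
    if d % M:
        parts.append(R[full:] @ Mat_last)
    return R, np.concatenate(parts)
```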
Further, any participant can verify whether the aggregation result is correct from the coefficient vector S and the commitments received from each client before aggregation. Specifically, each client locally verifies whether the following equation holds:

∑_{i=1}^{N} V_i = N · (S · g)

where N represents the number of participants (there are N participants in total), i represents the i-th participant among the N participants, V_i represents the verification value, S represents the coefficient vector, and g represents the aggregation result. The aggregation result is considered correct, and the local model update is performed, if and only if the equation holds.
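The check itself is then a single dot-product comparison; a sketch (floating-point tolerance stands in for exact equality):

```python
import numpy as np

def verify_aggregation(commitments, R, S, g):
    """Accept the server's averaged result g iff sum_i V_i == N * (S · g),
    where V_i = R · C_i is computed from client i's commitment."""
    V_sum = sum(R @ C_i for C_i in commitments)
    return np.isclose(V_sum, len(commitments) * (S @ g))
```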
Against a server forging the aggregation result, the random vector can detect whether the aggregation result g contains any modification made by the adversary. Specifically, if the server modifies the true aggregation result g and attempts to return an erroneous result g + Δg, the result returned by the malicious server must satisfy

∑_{i=1}^{N} V_i = N · (S · (g + Δg))

Because

∑_{i=1}^{N} V_i = N · (S · g)

always holds, the adversary's modification must satisfy

S · Δg = 0

i.e., S_1·Δg_1 + S_2·Δg_2 + … + S_d·Δg_d = 0. Since every group of elements in the coefficient vector S is produced by multiplying a random vector R_[k] with a non-zero constant matrix, S is itself a random vector. Moreover, since S is generated after the server publishes Δg, the adversary cannot construct a Δg satisfying this equation in advance. For any tampering Δg returned by the adversary, the probability of passing verification equals the probability that an arbitrary point (Δg_1, Δg_2, ..., Δg_d) in d-dimensional space falls on the specific hyperplane S_1·x_1 + S_2·x_2 + … + S_d·x_d = 0; that is, the probability that an erroneous aggregation result passes verification approaches 0.
For the user privacy issue, the commitment generated with the irreversible matrix ensures that the adversary cannot recover the original gradient from the commitment. Specifically, every commitment is generated as C_i,[k] = Mat · g_i,[k], and Mat has no inverse matrix, so the adversary cannot derive g_i,[k] from C_i,[k].
In summary, the verifiable federated learning method based on a blockchain and a lightweight commitment provided by the embodiments of the present application offers a way to verify the correctness of the server's aggregation result, realizing verifiable federated learning. At the same time, a lightweight commitment method based on irreversible matrices is designed, which ensures that the commitment does not reveal the original gradient and realizes efficient, privacy-preserving verification of the aggregation result. Specifically, the method is as follows: the client feeds its local data into the local model and computes the update direction of the model parameters, i.e., the gradient; it commits to the gradient to be uploaded via the irreversible matrices, publishes the commitments to the blockchain network, and uploads the gradient to the parameter server after receiving the commitments sent by all other participants; the parameter server aggregates the gradients from all clients and returns the result to the clients; after downloading the aggregation result, the client verifies it using the commitments, stores the verified result in the blockchain, and then updates the local model.
According to the verifiable federated learning method based on a blockchain and a lightweight commitment, multiplying the grouped gradients by the irreversible matrix performs a lightweight commitment on the gradient vector, achieving privacy protection of the original gradient; combining this with the blockchain guarantees that all participants receive consistent commitments and enables trusted aggregation verification. Verifiable federated learning is thus realized based on the blockchain and the lightweight commitment: the correctness of the server's aggregation result can be verified, effectively guaranteeing the correctness of aggregation and the security of the model.
Next, a verifiable federal learning device based on blockchains and lightweight commitments proposed according to an embodiment of the present application is described with reference to the drawings.
Fig. 6 is a block diagram illustrating a verifiable federal learning device based on blockchains and lightweight commitments in an embodiment of the present application.
As shown in fig. 6, the verifiable federal learning device 10 based on blockchains and lightweight commitments includes: a noise adding module 100, a processing module 200 and an updating module 300.
The noising module 100 is configured to initialize all hyper-parameters, train a deep learning model with a local data set, and generate a noisy gradient by adding noise to the gradient vector based on a masking method; the processing module 200 is configured to perform commitment processing on the noisy gradient, record the commitment in the blockchain, and upload the noisy gradient to the parameter server; the updating module 300 is configured to receive the aggregation result sent by the parameter server after it aggregates all noisy gradients, and update the model based on the aggregation result after successful verification.
Further, the processing module 200 is further configured to perform an irreversible linear transformation on the gradient using the irreversible matrix to obtain a transformed result, and to generate, from the transformed result, a commitment that is sent to the other clients.
Further, the device includes: a verification module, configured to, before the model is updated based on the successfully verified aggregation result, receive the random vectors broadcast by the other participants and sum them locally to obtain a summation result; calculate a verification value from the commitments, and calculate a coefficient vector from the summation result of the random vectors and the irreversible matrix; and verify whether the aggregation result is correct according to the verification value and the coefficient vector. Calculating the coefficient vector from the random vector and the irreversible matrix includes: multiplying the grouped row vectors of the summed random vector by the irreversible matrix to generate the corresponding coefficients, where the verification formula is:
∑_{i=1}^{N} V_i = N · (S · g)

where N represents the number of participants, i represents the i-th participant among the N participants, V_i represents the verification value, S represents the coefficient vector, and g represents the aggregation result.
It should be noted that the explanation of the foregoing embodiment of the verifiable federal learning method based on a blockchain and a lightweight commitment is also applicable to the verifiable federal learning apparatus based on a blockchain and a lightweight commitment of the embodiment, and details thereof are not repeated here.
According to the verifiable federated learning device based on a blockchain and a lightweight commitment, multiplying the grouped gradients by the irreversible matrix performs a lightweight commitment on the gradient vector, achieving privacy protection of the original gradient; combining this with the blockchain guarantees that all participants receive consistent commitments and enables trusted aggregation verification. Verifiable federated learning is thus realized based on the blockchain and the lightweight commitment: the correctness of the server's aggregation result can be verified, effectively guaranteeing the correctness of aggregation and the security of the model.
Fig. 7 is a schematic structural diagram of a client according to an embodiment of the present application. The client may include:
memory 701, processor 702, and a computer program stored on memory 701 and executable on processor 702.
The processor 702, when executing the program, implements the verifiable federated learning method based on blockchains and lightweight commitments provided in the embodiments described above.
Further, the client further comprises:
a communication interface 703 for communication between the memory 701 and the processor 702.
A memory 701 for storing computer programs operable on the processor 702.
The memory 701 may comprise high speed RAM memory and may also include non-volatile memory, such as at least one disk memory.
If the memory 701, the processor 702 and the communication interface 703 are implemented independently, the communication interface 703, the memory 701 and the processor 702 may be connected to each other through a bus and perform communication with each other. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 7, but this is not intended to represent only one bus or type of bus.
Optionally, in a specific implementation, if the memory 701, the processor 702, and the communication interface 703 are integrated on a chip, the memory 701, the processor 702, and the communication interface 703 may complete mutual communication through an internal interface.
The processor 702 may be a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits configured to implement embodiments of the present Application.
Embodiments of the present application also provide a computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the verifiable federated learning method based on blockchains and lightweight commitments as above.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or N embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first", "second" and "first" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "N" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or N executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of implementing the embodiments of the present application.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the N steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a programmable gate array, a field programmable gate array, or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.

Claims (10)

1. A verifiable federated learning method based on blockchains and lightweight commitments is characterized by comprising the following steps:
initializing all hyper-parameters, training a deep learning model by using a local data set, and generating a noisy gradient after adding noise to the gradient vector based on a masking method;
performing commitment processing on the noisy gradient, recording the commitment in the blockchain, and uploading the noisy gradient to a parameter server; and
receiving the aggregation result sent by the parameter server after it aggregates all the noisy gradients, and updating the model based on the successfully verified aggregation result.
2. The method of claim 1, wherein committing the noisy gradient comprises:
carrying out irreversible linear transformation on the gradient by using an irreversible matrix to obtain a transformed result;
and generating a commitment sent to other clients according to the transformed result.
3. The method of claim 2, further comprising, before updating the model based on the aggregated result after the successful verification:
receiving random vectors broadcasted by other participants, and locally adding the random vectors to obtain a sum result;
calculating a verification value according to the commitment, and calculating a coefficient vector according to a summation result of the random vectors and an irreversible matrix;
and verifying whether the aggregation result is correct or not according to the verification value and the coefficient vector.
4. The method of claim 3, wherein computing the coefficient vector from the random vector and the irreversible matrix comprises:
and multiplying the row vectors grouped by the addition result of the random vector by the irreversible matrix to generate a corresponding coefficient.
5. The method according to claim 3 or 4, wherein the verification formula is:
∑_{i=1}^{N} V_i = N · (S · g)

wherein N represents the number of participants, i represents the i-th participant among the N participants, V_i represents the verification value, S represents the coefficient vector, and g represents the aggregation result.
6. A verifiable federated learning device based on blockchains and lightweight commitments, comprising:
a noising module, configured to initialize all hyper-parameters, train a deep learning model by using a local data set, and generate a noisy gradient after adding noise to the gradient vector based on a masking method;
a processing module, configured to perform commitment processing on the noisy gradient, record the commitment in the blockchain, and upload the noisy gradient to a parameter server; and
an updating module, configured to receive the aggregation result sent by the parameter server after it aggregates all the noisy gradients, and update the model based on the aggregation result after successful verification.
7. The apparatus of claim 6, wherein the processing module is further configured to perform an irreversible linear transformation on the gradient by using an irreversible matrix to obtain a transformed result, and to generate a commitment sent to the other clients according to the transformed result.
8. The apparatus of claim 7, further comprising:
the verification module is used for receiving random vectors broadcasted by other participants and summing the random vectors locally to obtain a summed result before updating the model based on the successfully verified aggregated result; calculating a verification value according to the commitment, and calculating a coefficient vector according to a summation result of the random vectors and an irreversible matrix; verifying whether the aggregation result is correct or not according to the verification value and the coefficient vector;
wherein the computing a coefficient vector from the random vector and the irreversible matrix comprises:
multiplying the row vectors grouped by the addition result of the random vector by the irreversible matrix to generate a corresponding coefficient, wherein the verification formula is as follows:
∑_{i=1}^{N} V_i = N · (S · g)

wherein N represents the number of participants, i represents the i-th participant among the N participants, V_i represents the verification value, S represents the coefficient vector, and g represents the aggregation result.
9. A client, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor executing the program to implement a verifiable federal learning methodology based on blockchains and lightweight commitments as claimed in any of claims 1-5.
10. A computer-readable storage medium having stored thereon a computer program for execution by a processor to implement a verifiable federal learning method based on blockchains and lightweight commitments as defined in any of claims 1-5.
CN202111387073.7A 2021-11-22 2021-11-22 Verifiable federal learning method and device based on block chain and lightweight commitment Pending CN114036566A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111387073.7A CN114036566A (en) 2021-11-22 2021-11-22 Verifiable federal learning method and device based on block chain and lightweight commitment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111387073.7A CN114036566A (en) 2021-11-22 2021-11-22 Verifiable federal learning method and device based on block chain and lightweight commitment

Publications (1)

Publication Number Publication Date
CN114036566A true CN114036566A (en) 2022-02-11

Family

ID=80138420

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111387073.7A Pending CN114036566A (en) 2021-11-22 2021-11-22 Verifiable federal learning method and device based on block chain and lightweight commitment

Country Status (1)

Country Link
CN (1) CN114036566A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115174046A (en) * 2022-06-10 2022-10-11 湖北工业大学 Federal learning bidirectional verifiable privacy protection method and system on vector space
CN115174046B (en) * 2022-06-10 2024-04-30 湖北工业大学 Federal learning bidirectional verifiable privacy protection method and system in vector space
CN117436078A (en) * 2023-12-18 2024-01-23 烟台大学 Bidirectional model poisoning detection method and system in federal learning
CN117436078B (en) * 2023-12-18 2024-03-12 烟台大学 Bidirectional model poisoning detection method and system in federal learning

Similar Documents

Publication Publication Date Title
CN110419053B (en) System and method for information protection
CN113095510B (en) Federal learning method and device based on block chain
CN114036566A (en) Verifiable federal learning method and device based on block chain and lightweight commitment
CN113221183B (en) Method, device and system for realizing privacy protection of multi-party collaborative update model
CN113077060A (en) Federal learning system and method aiming at edge cloud cooperation
JP7471445B2 (en) Privacy-preserving machine learning for content delivery and analytics
CN114930357A (en) Privacy preserving machine learning via gradient boosting
Keuffer et al. Efficient proof composition for verifiable computation
CN112199706A (en) Tree model training method and business prediction method based on multi-party safety calculation
CN116777294A (en) Crowd-sourced quality safety assessment method based on federal learning under assistance of blockchain
CN116187471A (en) Identity anonymity and accountability privacy protection federal learning method based on blockchain
Tsaloli et al. DEVA: Decentralized, verifiable secure aggregation for privacy-preserving learning
CN111385096A (en) Block chain network, signature processing method, terminal and storage medium
CN110874481B (en) GBDT model-based prediction method and GBDT model-based prediction device
US20230274183A1 (en) Processing of machine learning modeling data to improve accuracy of categorization
CN113361618A (en) Industrial data joint modeling method and system based on federal learning
CN116489637B (en) Mobile edge computing method oriented to meta universe and based on privacy protection
CN112668016B (en) Model training method and device and electronic equipment
JP2023158097A (en) Computer-implemented system and method for controlling processing step of distributed system
CN117216788A (en) Video scene identification method based on federal learning privacy protection of block chain
CN114640463B (en) Digital signature method, computer equipment and medium
CN112417478B (en) Data processing method, device, equipment and storage medium
CN112668037B (en) Model training method and device and electronic equipment
CN114519191A (en) Medical data management method and device
CN114841355A (en) Joint learning method and system based on attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination