CN112949865A - Sigma protocol-based federal learning contribution degree evaluation method - Google Patents
- Publication number
- CN112949865A (application CN202110292470.XA)
- Authority
- CN
- China
- Prior art keywords
- gradient
- model
- participant
- contribution degree
- ciphertext
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/217—Validation; Performance evaluation; Active pattern learning techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/602—Providing cryptographic facilities or services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
Abstract
The invention discloses a Sigma protocol-based method for evaluating contribution in federated learning. The model trainer sends the model, a trusted execution program, a non-interactive Sigma protocol, and related parameters to each participant. Each participant trains the model on its local data set to obtain a gradient, then runs the trusted execution program, which extracts the participant's gradient, updates the model, runs a test module to measure the accuracy of the new model, and computes the contribution of the gradient. The participant encodes and encrypts the gradient according to the protocol's encryption algorithm and sends the ciphertext to the trainer. The participant then generates a random value, encrypts it with the same algorithm, inputs all ciphertexts produced so far into a hash-function sandbox, obtains the output hash value, and computes a commitment. Finally, the participant uploads the commitment, the ciphertexts, and the contribution to the trainer; the trainer recomputes the hash value and verifies the commitment, and if verification passes, the gradient ciphertext and its contribution are bound and recorded in the database. The method achieves gradient proof without revealing privacy.
Description
Technical Field
The invention relates to the fields of federated learning and cryptography, and in particular to a Sigma protocol-based method for evaluating contribution in federated learning.
Background
Google first proposed federated learning in 2016. In this paradigm, a model trainer delegates a training task to multiple participants holding local data sets; each participant trains the model to be trained, generates gradients, and uploads them to the trainer, which aggregates the gradients to update the model. The advantage of federated learning is that every participant's local data set is fully used for global model training without the trainer directly acquiring data sets that may contain private information. The problem, however, is that gradients are correlated with the training data, so the model trainer can infer information about a participant's training data set from the uploaded gradients. How the trainer can obtain the final aggregated gradient without gradient information being revealed, while ensuring that the uploaded gradients are correct, is therefore an active research topic. CN111552986B proposes a blockchain-based federated modeling method that hides uploaded gradients with homomorphic encryption and audits the training data, but it does not establish a correspondence between gradients and contributions, so low-quality participants may unintentionally degrade the aggregated model. CN111797142A proposes a real-time auditing scheme based on smart contracts on a blockchain, which requires an interactive process and introduces unnecessary workload.
In summary, the prior art has two problems: 1) federated learning lacks a contribution evaluation method that does not disclose privacy; 2) the auditing process is mostly interactive, which imposes a heavy auditing workload.
Disclosure of Invention
To address the deficiencies of the prior art, the invention provides a Sigma protocol-based method for evaluating contribution in federated learning. The model trainer constructs a trusted execution program in advance and distributes it to each participant; the contribution of a gradient is evaluated locally before the gradient is encrypted, and, to prevent the uploaded gradient and the reported contribution from being inconsistent, the participant follows a non-interactive Sigma protocol to additionally generate a commitment to the gradient so that the gradient and its contribution can be bound and recorded.
The purpose of the invention is realized by the following technical scheme:
A Sigma protocol-based federated learning contribution evaluation method comprises the following steps:
Step one: the model trainer sends the current batch's model, the model parameters, the trusted execution program, the non-interactive Sigma protocol, and related parameters to all participants in the model training; the trusted execution program comprises a test module and a hash-function sandbox H(x).
Step two: the participant trains the model with its local data set to obtain a gradient and runs the trusted execution program, which extracts the participant's gradient, updates the model with the gradient, runs the test module to test the accuracy of the new model, and then calculates the contribution of the gradient.
Step three: the participant encodes the gradient extracted by the trusted execution program according to the encryption algorithm of the non-interactive Sigma protocol to obtain x_i, encrypts it separately under the two generators of the encryption algorithm to obtain x_i·G and x_i·H, and sends them to the model trainer.
Step four: the participant generates a random value and encodes it to obtain v_i, then encrypts it again under the two generators of the encryption algorithm in the protocol to obtain v_i·G and v_i·H. The participant inputs all ciphertexts generated so far into the hash-function sandbox H(x) in the trusted execution program, the sandbox outputs a hash value c, and the participant generates the commitment Com_i using the formula Com_i = v_i - x_i·c.
Step five: participant uploads acceptance Comi、viG、viH and the degree of contribution of the gradient to a model trainer, and the model trainer uses the currently received xiG、xiH、viG、viH and the backup hash function sandbox H (x) calculate the hash value c. Verification equation viG=rG+c(xiG) And viH=rH+c(xiH) If the judgment result is positive, the model training party passes the verification, the ciphertext gradient uploaded by the participant and the gradient plaintext locally participating in updating and testing are indicated to be corresponding, and the gradient ciphertext and the contribution degree of the gradient ciphertext are bound and recorded in a database; if the verification fails, the ciphertext does not correspond to the plaintext, the gradient is abandoned and recorded, and an error is fed back to the participant.
To ensure that the Sigma protocol does not reveal participant gradients, the protocol uses an elliptic curve encryption algorithm or a discrete logarithm encryption algorithm, and two generators G and H of the encryption algorithm are included in the protocol.
To measure the change in model accuracy before and after training, the test module contains a test data set and stores the current accuracy of the model to be trained; after the updated model's accuracy is tested, the change in accuracy before and after the update is computed and recorded as the contribution of the corresponding gradient.
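As a minimal sketch of this accuracy-delta rule (the function name and the example accuracies are assumptions for illustration, not taken from the patent):

```python
def contribution_degree(accuracy_before: float, accuracy_after: float) -> float:
    """Contribution of a gradient = change in held-out test accuracy it produces."""
    return accuracy_after - accuracy_before

# A gradient that raises test accuracy from 0.85 to 0.87 contributes 0.02;
# one that lowers accuracy yields a negative contribution the trainer can flag.
print(round(contribution_degree(0.85, 0.87), 4))   # prints 0.02
```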
To ensure the program runs normally, the trusted execution program has a signature function: when a module finishes running, its result is signed, and a third party can confirm whether the program ran normally by querying the signature.
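A hypothetical sketch of such a signature function. The patent fixes no algorithm, so HMAC-SHA256 with a key provisioned by the model trainer is assumed here purely for illustration:

```python
import hashlib
import hmac

def sign_module_result(key: bytes, module: str, result: bytes) -> str:
    """Sign one module's output so a third party can audit that it ran."""
    return hmac.new(key, module.encode() + b"|" + result, hashlib.sha256).hexdigest()

def audit_module_result(key: bytes, module: str, result: bytes, sig: str) -> bool:
    """Third-party check: recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign_module_result(key, module, result), sig)

key = b"trainer-provisioned-key"            # placeholder key material
tag = sign_module_result(key, "test_module", b"accuracy=0.87")
assert audit_module_result(key, "test_module", b"accuracy=0.87", tag)
assert not audit_module_result(key, "test_module", b"accuracy=0.99", tag)
```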
To ensure that commitments cannot be forged, the input to the hash-function sandbox H(x) is all ciphertexts participating in the verification operation, and the output is a random value; different inputs produce different output values.
The invention has the following beneficial effects:
(1) Gradient proof without revealing privacy: the non-interactive Sigma protocol generates gradient proof information, allowing the model trainer to verify the authenticity of a gradient without the gradient plaintext being revealed.
(2) A time-saving gradient verification procedure: when any third party needs to verify the correctness of a party's gradient or query its contribution, it only needs to check the equations on the commitment Com_i; no interaction is required.
Drawings
Fig. 1 is a schematic diagram of the federal learning contribution evaluation method based on the SIGMA protocol of the present invention.
Fig. 2 is a flowchart of the federal learning contribution evaluation method based on the SIGMA protocol of the present invention.
Detailed Description
The present invention is described in detail below with reference to the accompanying drawings and preferred embodiments to make its objects and effects clearer. It should be understood that the specific embodiments described here are merely illustrative and are not intended to limit the invention.
Consider horizontal federated learning, for example a word-habit inference model for an input method. To provide a better input experience, the input-method provider trains a sentence prediction model so that, for a given input word, the user receives recommended next words that are as accurate as possible. To protect user privacy, however, the service provider cannot directly obtain each user's input content; instead, the terminals jointly participate in training the prediction model with their respective local data, and model gradients must be exchanged continually during training. Homomorphic encryption is commonly used during gradient exchange to keep gradient information private, but because many participants are not fully trustworthy and the encrypted gradient ciphertext is unreadable, the contribution of a gradient ciphertext cannot be evaluated, and verifying the authenticity of the ciphertext remains an open problem. This scheme is designed to ensure both accurate evaluation of gradient contribution and the authenticity and trustworthiness of the ciphertext.
As shown in fig. 1, the scheme involves a model trainer and participants. The model trainer builds a test module and a hash-function sandbox H(x), packages them into a trusted execution program, and sends the program together with the model to be trained to the participants. Each participant receives the model and the trusted execution program, runs the program to complete contribution evaluation and the correctness commitment, and submits the gradient to the model trainer. On receiving a participant's gradient, the model trainer verifies its correctness and records the gradient and its contribution.
As shown in fig. 2, the specific process is as follows:
The first step: the model trainer sends the current batch's model, the model parameters, the trusted execution program, the non-interactive Sigma protocol, and related parameters to all participants in the model training; the trusted execution program comprises a test module and a hash-function sandbox H(x).
The second step: the participant trains the model with its local data set to obtain a gradient and runs the trusted execution program, which extracts the participant's gradient, updates the model with the gradient, runs the test module to test the accuracy of the new model, and then calculates the contribution of the gradient. As one implementation, the test module comprises a test data set and stores the current accuracy of the model to be trained; after the updated model's accuracy is tested, the change in accuracy before and after the update is computed and recorded as the contribution of the corresponding gradient.
The third step: the participant encodes the gradient extracted by the trusted execution program according to the encryption algorithm of the non-interactive Sigma protocol to obtain x_i, encrypts it separately under the two generators of the encryption algorithm to obtain x_i·G and x_i·H, and sends them to the model trainer. As one implementation, the encryption algorithm is an elliptic curve encryption algorithm or a discrete logarithm encryption algorithm, and two generators G and H of the encryption algorithm are included in the protocol.
The fourth step: the participant generates a random value and encodes it to obtain v_i, then encrypts it again under the two generators of the encryption algorithm in the protocol to obtain v_i·G and v_i·H. The participant inputs all ciphertexts generated so far into the hash-function sandbox H(x) in the trusted execution program, the sandbox outputs a hash value c, and the participant generates the commitment Com_i using the formula Com_i = v_i - x_i·c.
The fifth step: the participant uploads the commitment Com_i, v_i·G, v_i·H, and the contribution of the gradient to the model trainer. The model trainer computes the hash value c from the currently received x_i·G, x_i·H, v_i·G, v_i·H and its copy of the hash-function sandbox H(x), then verifies the equations v_i·G = Com_i·G + c(x_i·G) and v_i·H = Com_i·H + c(x_i·H). If both hold, verification passes, indicating that the ciphertext gradient uploaded by the participant corresponds to the gradient plaintext that locally participated in updating and testing, and the gradient ciphertext and its contribution are bound and recorded in the database; if verification fails, the ciphertext does not correspond to the plaintext, the gradient is discarded, the failure is recorded, and an error is returned to the participant.
Similarly, in the financial sector, multiple banks can cooperate to train a risk-control model on their local data sets, mitigating credit risk and planning interest rates fairly. In the medical sector, multiple hospitals can cooperate deeply using patient data, improving a drug-efficacy evaluation model from large numbers of medical and efficacy records. The contribution evaluation method can be used in both settings.
It will be understood by those skilled in the art that the foregoing describes only preferred embodiments and is not intended to limit the invention. Although the invention has been described in detail with reference to the foregoing examples, those skilled in the art may still modify the described embodiments or substitute equivalents for some of their features. All modifications and equivalents that fall within the spirit and principles of the invention are intended to be included within its scope.
Claims (5)
1. A Sigma protocol-based federated learning contribution evaluation method, characterized by comprising the following steps:
step one: the model trainer sends the current batch's model, the model parameters, the trusted execution program, the non-interactive Sigma protocol, and related parameters to all participants in the model training; the trusted execution program comprises a test module and a hash-function sandbox H(x);
step two: the participant trains the model with its local data set to obtain a gradient and runs the trusted execution program, which extracts the participant's gradient, updates the model with the gradient, runs the test module to test the accuracy of the new model, and then calculates the contribution of the gradient;
step three: the participant encodes the gradient extracted by the trusted execution program according to the encryption algorithm of the non-interactive Sigma protocol to obtain x_i, encrypts it separately under the two generators of the encryption algorithm to obtain x_i·G and x_i·H, and sends them to the model trainer;
step four: the participant generates a random value and encodes it to obtain v_i, encrypts it again under the two generators of the encryption algorithm in the protocol to obtain v_i·G and v_i·H, inputs all ciphertexts generated so far into the hash-function sandbox H(x) in the trusted execution program, which outputs a hash value c, and generates the commitment Com_i using the formula Com_i = v_i - x_i·c;
Step five: participant uploads acceptance Comi、viG、viH and the degree of contribution of the gradient to a model trainer, and the model trainer uses the currently received xiG、xiH、viG、viH and the backup hash function sandbox H (x) calculate the hash value c. Verification equation viG=rG+c(xiG) And viH=rH+c(xiH) If the judgment result is positive, the model training party passes the verification, the ciphertext gradient uploaded by the participant and the gradient plaintext locally participating in updating and testing are indicated to be corresponding, and the gradient ciphertext and the contribution degree of the gradient ciphertext are bound and recorded in a database; if the verification fails, the ciphertext does not correspond to the plaintext, the gradient is abandoned and recorded, and an error is fed back to the participant.
2. The Sigma protocol-based federated learning contribution evaluation method of claim 1, wherein the Sigma protocol uses an elliptic curve encryption algorithm or a discrete logarithm encryption algorithm, and two generators G and H of the encryption algorithm are also included in the protocol.
3. The Sigma protocol-based federated learning contribution evaluation method of claim 1, wherein the test module comprises a test data set and stores the current accuracy of the model to be trained; after the updated model's accuracy is tested, the change in accuracy before and after the update is computed and recorded as the contribution of the corresponding gradient.
4. The Sigma protocol-based federated learning contribution evaluation method of claim 1 or 2, wherein the trusted execution program has a signature function: when a module finishes running, its result is signed, and a third party can confirm whether the program ran normally by querying the signature.
5. The Sigma protocol-based federated learning contribution evaluation method of claim 2, wherein the input to the hash-function sandbox H(x) is all ciphertexts participating in the verification operation, the output is a random value, and different inputs produce different output values.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110292470.XA CN112949865B (en) | 2021-03-18 | 2021-03-18 | Joint learning contribution degree evaluation method based on SIGMA protocol |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110292470.XA CN112949865B (en) | 2021-03-18 | 2021-03-18 | Joint learning contribution degree evaluation method based on SIGMA protocol |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112949865A true CN112949865A (en) | 2021-06-11 |
CN112949865B CN112949865B (en) | 2022-10-28 |
Family
ID=76227002
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110292470.XA Active CN112949865B (en) | 2021-03-18 | 2021-03-18 | Joint learning contribution degree evaluation method based on SIGMA protocol |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112949865B (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113421251A (en) * | 2021-07-05 | 2021-09-21 | 海南大学 | Data processing method and system based on lung CT image |
CN113469371A (en) * | 2021-07-01 | 2021-10-01 | 建信金融科技有限责任公司 | Federal learning method and device |
CN114912136A (en) * | 2022-07-14 | 2022-08-16 | 之江实验室 | Competition mechanism based cooperative analysis method and system for medical data on block chain |
CN115292738A (en) * | 2022-10-08 | 2022-11-04 | 豪符密码检测技术(成都)有限责任公司 | Method for detecting security and correctness of federated learning model and data |
WO2024066042A1 (en) * | 2022-09-27 | 2024-04-04 | 深圳先进技术研究院 | Electronic letter-of-guarantee value prediction method and apparatus based on privacy computing |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110443063A (en) * | 2019-06-26 | 2019-11-12 | 电子科技大学 | The method of the federal deep learning of self adaptive protection privacy |
CN111241580A (en) * | 2020-01-09 | 2020-06-05 | 广州大学 | Trusted execution environment-based federated learning method |
US20200218940A1 (en) * | 2019-01-08 | 2020-07-09 | International Business Machines Corporation | Creating and managing machine learning models in a shared network environment |
CN111950739A (en) * | 2020-08-13 | 2020-11-17 | 深圳前海微众银行股份有限公司 | Data processing method, device, equipment and medium based on block chain |
CN112329940A (en) * | 2020-11-02 | 2021-02-05 | 北京邮电大学 | Personalized model training method and system combining federal learning and user portrait |
US20210073677A1 (en) * | 2019-09-06 | 2021-03-11 | Oracle International Corporation | Privacy preserving collaborative learning with domain adaptation |
2021
- 2021-03-18: application CN202110292470.XA, patent CN112949865B, status Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200218940A1 (en) * | 2019-01-08 | 2020-07-09 | International Business Machines Corporation | Creating and managing machine learning models in a shared network environment |
CN110443063A (en) * | 2019-06-26 | 2019-11-12 | 电子科技大学 | The method of the federal deep learning of self adaptive protection privacy |
US20210073677A1 (en) * | 2019-09-06 | 2021-03-11 | Oracle International Corporation | Privacy preserving collaborative learning with domain adaptation |
CN111241580A (en) * | 2020-01-09 | 2020-06-05 | 广州大学 | Trusted execution environment-based federated learning method |
CN111950739A (en) * | 2020-08-13 | 2020-11-17 | 深圳前海微众银行股份有限公司 | Data processing method, device, equipment and medium based on block chain |
CN112329940A (en) * | 2020-11-02 | 2021-02-05 | 北京邮电大学 | Personalized model training method and system combining federal learning and user portrait |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113469371A (en) * | 2021-07-01 | 2021-10-01 | 建信金融科技有限责任公司 | Federal learning method and device |
CN113421251A (en) * | 2021-07-05 | 2021-09-21 | 海南大学 | Data processing method and system based on lung CT image |
CN114912136A (en) * | 2022-07-14 | 2022-08-16 | 之江实验室 | Competition mechanism based cooperative analysis method and system for medical data on block chain |
CN114912136B (en) * | 2022-07-14 | 2022-10-28 | 之江实验室 | Competition mechanism based cooperative analysis method and system for medical data on block chain |
WO2024066042A1 (en) * | 2022-09-27 | 2024-04-04 | 深圳先进技术研究院 | Electronic letter-of-guarantee value prediction method and apparatus based on privacy computing |
CN115292738A (en) * | 2022-10-08 | 2022-11-04 | 豪符密码检测技术(成都)有限责任公司 | Method for detecting security and correctness of federated learning model and data |
CN115292738B (en) * | 2022-10-08 | 2023-01-17 | 豪符密码检测技术(成都)有限责任公司 | Method for detecting security and correctness of federated learning model and data |
Also Published As
Publication number | Publication date |
---|---|
CN112949865B (en) | 2022-10-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112949865B (en) | Joint learning contribution degree evaluation method based on SIGMA protocol | |
CN108616539B (en) | A kind of method and system of block chain transaction record access | |
CN110189192B (en) | Information recommendation model generation method and device | |
US20230078061A1 (en) | Model training method and apparatus for federated learning, device, and storage medium | |
CN113204787B (en) | Block chain-based federated learning privacy protection method, system, device and medium | |
EP4113345A1 (en) | Data processing method and system based on node group, and device and medium | |
CN113159327B (en) | Model training method and device based on federal learning system and electronic equipment | |
KR102145701B1 (en) | Prevent false display of input data by participants in secure multi-party calculations | |
CN111931950A (en) | Method and system for updating model parameters based on federal learning | |
CN109547477A (en) | A kind of data processing method and its device, medium, terminal | |
CN112347500B (en) | Machine learning method, device, system, equipment and storage medium of distributed system | |
CN113111124B (en) | Block chain-based federal learning data auditing system and method | |
Leontiadis et al. | PUDA–privacy and unforgeability for data aggregation | |
CN113992360A (en) | Block chain cross-chain-based federated learning method and equipment | |
CN111340494A (en) | Asset type consistency evidence generation, transaction and transaction verification method and system | |
CN113344222A (en) | Safe and credible federal learning mechanism based on block chain | |
CN111553443B (en) | Training method and device for referee document processing model and electronic equipment | |
CN111027981A (en) | Method and device for multi-party joint training of risk assessment model for IoT (Internet of things) machine | |
CN115455476A (en) | Longitudinal federal learning privacy protection method and system based on multi-key homomorphic encryption | |
CN112434026A (en) | Secure intellectual property pledge financing method based on Hash chain | |
CN111914281B (en) | Bayesian model training method and device based on blockchain and homomorphic encryption | |
CN113221153B (en) | Graph neural network training method and device, computing equipment and storage medium | |
Tran et al. | An efficient privacy-enhancing cross-silo federated learning and applications for false data injection attack detection in smart grids | |
CN117094773A (en) | Online migration learning method and system based on blockchain privacy calculation | |
CN116260662A (en) | Tracing storage method, tracing storage system and tracing system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |