CN114338045A - Information data verifiability safety sharing method and system based on block chain and federal learning - Google Patents

Publication number
CN114338045A
Authority
CN
China
Prior art keywords
gradient
user
model
committee
block chain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210040143.XA
Other languages
Chinese (zh)
Other versions
CN114338045B (en)
Inventor
郭渊博
方晨
王一丰
马佳利
李勇飞
尹安琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information Engineering University of PLA Strategic Support Force
Original Assignee
Information Engineering University of PLA Strategic Support Force
Application filed by Information Engineering University of PLA Strategic Support Force
Priority to CN202210040143.XA
Publication of CN114338045A
Application granted
Publication of CN114338045B
Legal status: Active

Landscapes

  • Storage Device Security (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of network security, and in particular relates to a verifiable and secure intelligence data sharing method and system based on blockchain and federated learning. A user trains on local data to obtain the gradient of a local intrusion detection model, encrypts the gradient with a mask, compresses it, and sends it together with a digital signature to a neighboring blockchain node. The blockchain node verifies the signature on the gradient data and puts valid model gradients into a transaction pool. A leader elected among the blockchain nodes aggregates the valid gradients in the transaction pool, adds the redundant masks of abnormal users recovered by a backup committee to obtain the global gradient, and creates a new block that delivers the global gradient to a verification committee. The verification committee broadcasts the new block to the whole network once it passes verification. Each user downloads the latest block and obtains the global gradient from it to update the local intrusion detection model. Through multi-party training, the invention obtains a converged intelligence data sharing model that is deployed locally for network anomaly detection, improving the defensive performance of intrusion detection.

Description

Information data verifiability safety sharing method and system based on block chain and federal learning
Technical Field
The invention belongs to the technical field of network security, and in particular relates to a verifiable and secure intelligence data sharing method and system based on blockchain and federated learning.
Background
With network security incidents occurring frequently, intelligence data has become an important basis for detecting cybercrime, intrusions and other anomalies. Training an artificial intelligence model on intelligence data to detect network anomalies has become an important means of building a network security perimeter. The accuracy of such a model is closely tied to the amount of training data. At present, each institutional user obtains intelligence data only through its own collection channels or by purchasing it at high cost from third-party organizations. Because intelligence data may contain sensitive information about the user, and because data is now regarded as a factor of production, intelligence data is an important asset for every department. Most departments are therefore reluctant to share their intelligence data with one another, which produces severe data silos, and each department struggles to build an effective intrusion detection model because it lacks sufficient intelligence data.
Federated learning is a distributed machine learning framework that turns the sharing of raw data into the sharing of model parameters, while a blockchain can establish a trustless, decentralized data transaction mode among distributed users. Combining the two can reduce the risk of data privacy leakage, mitigate single-point-of-failure attacks and lack of trust, and make the whole data sharing process verifiable, traceable and auditable. In recent years researchers have combined blockchain and federated learning for secure data sharing, but the following problems remain. (1) Verifiability of the data sharing result: communication base stations usually serve as blockchain nodes that collect the model parameters uploaded by different users. If a malicious base station tampers with the model parameters before putting them into the transaction pool (that is, before the data goes on-chain), the blockchain reaches consensus on wrong model parameters, and joint modeling ultimately yields a wrong data sharing model. (2) Availability and privacy of shared data: to strengthen privacy protection, existing work typically relies on differential privacy or secure multi-party computation, which reduce data availability and increase computational overhead, respectively. How to balance data availability and privacy at low computational cost still needs study.
(3) High communication overhead: the training process of federated learning is inherently communication-heavy, and the message broadcasting mechanism of the blockchain raises the overhead of the combined scheme further, which limits its use in bandwidth-constrained scenarios. How to guarantee the verifiability of the data sharing result is therefore also a problem that data sharing methods based on blockchain and federated learning must solve.
Disclosure of Invention
To this end, the invention provides a verifiable and secure intelligence data sharing method and system based on blockchain and federated learning. A converged intelligence data sharing model is obtained through multi-party training and deployed at each user's local end as an intrusion detection model for network anomaly detection. This addresses the confidentiality, result verifiability and high privacy-protection overhead problems of existing data sharing processes, and provides effective technical support for data circulation and sharing among different institutions.
According to the design scheme provided by the invention, the verifiable and secure intelligence data sharing method based on blockchain and federated learning is used for joint modeling of an intrusion detection model by multiple users in network security defense, and the joint modeling process comprises the following:
a trusted authority distributes a public/private key pair to each user and each blockchain node and sends it over a secure channel; each user secret-shares its private key to a backup committee, which consists of several blockchain nodes;
each user trains on local data to obtain the gradient of its local intrusion detection model, encrypts the gradient by adding a mask, compresses the encrypted gradient, and sends the compressed encrypted gradient together with a digital signature to its associated neighboring blockchain node;
the blockchain nodes verify the signatures on the uploaded gradient data and put the valid model gradients that pass verification into a transaction pool; a leader elected among the blockchain nodes aggregates the valid gradients in the transaction pool, adds the redundant masks of abnormal users recovered by the backup committee to obtain the global gradient, creates a new block recording the global gradient and other key parameters, and sends the new block to a verification committee;
the verification committee verifies the correctness of the global gradient and broadcasts the new block that passes verification to the whole network to reach consensus; each user downloads the latest block, obtains the latest global gradient from it, and updates its local intrusion detection model.
Further, in the method of the invention, the global gradient in joint modeling is obtained iteratively, round by round, under a preset model convergence condition, so that the users' local intrusion detection models are updated synchronously in each iteration; the model convergence condition is a maximum number of iteration rounds.
Further, in the method of the invention, a reputation value is assigned to each user and each blockchain node as an incentive to participate in the joint modeling of the intrusion detection model; the leader, the backup committee and the verification committee are elected from the blockchain nodes according to their reputation values; and a blacklist restricts the participation rights of users and blockchain nodes whose reputation value falls below a threshold.
Further, in the method of the invention, in the mask encryption of the model gradient, each user computes a shared key with every other user from its own private key and the other user's public key using the Diffie-Hellman protocol, uses the shared key as the seed of a random number generator to generate a random mask, and encrypts its local model gradient with the random masks. Using verifiable secret sharing, the user embeds its private key in a selected polynomial and constructs a polynomial commitment, splits the polynomial into n secret shares, and sends the secret shares, the secret share witnesses and the polynomial commitment to the backup committee, so that when a user drops offline or fails signature verification the redundant random mask can be recovered through key reconstruction; a secret share witness is used to verify that a secret share belongs to the committed polynomial.
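The pairwise masking idea can be illustrated with a minimal Python sketch. It is not the patent's implementation: the toy Diffie-Hellman parameters `P` and `G`, the mask modulus `Q`, the use of `random.Random` as a stand-in for a cryptographic generator, and the sign rule (add the mask toward higher-indexed users, subtract toward lower-indexed ones) are all assumptions chosen so that the masks cancel when every user's masked gradient is summed.

```python
import hashlib
import random

# Toy parameters (assumptions for illustration only, not secure choices).
P = 2**127 - 1   # prime modulus for the Diffie-Hellman exchange
G = 3            # illustrative base
Q = 2**32        # gradients and masks are integers modulo Q (assumed encoding)

def keygen(rng):
    """Return a (private key, public key) pair."""
    sk = rng.randrange(2, P - 1)
    return sk, pow(G, sk, P)

def shared_seed(sk_i, pk_j):
    """Both users derive the same seed from G^(sk_i * sk_j) mod P."""
    secret = pow(pk_j, sk_i, P)
    return hashlib.sha256(str(secret).encode()).digest()

def prg_mask(seed, length):
    """Expand a seed into a mask vector (random.Random is only a stand-in)."""
    rng = random.Random(seed)
    return [rng.randrange(Q) for _ in range(length)]

def mask_gradient(i, grad, sk_i, pks):
    """Add the mask shared with each higher-indexed user, subtract the others."""
    masked = list(grad)
    for j, pk_j in pks.items():
        if j == i:
            continue
        mask = prg_mask(shared_seed(sk_i, pk_j), len(grad))
        sign = 1 if j > i else -1
        masked = [(x + sign * m) % Q for x, m in zip(masked, mask)]
    return masked

def aggregate(masked_grads):
    """Sum masked gradients element-wise; pairwise masks cancel modulo Q."""
    return [sum(col) % Q for col in zip(*masked_grads)]
```

If one user's upload is missing, the masks that the remaining users added for that user no longer cancel; this is the redundant mask that the backup committee helps remove by reconstructing the absent user's private key.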
Further, in the method of the invention, the encrypted model gradient is compressed using the Chinese Remainder Theorem (CRT). The compression proceeds as follows: first, the user's encrypted model gradient is divided evenly into $r$ segments, where
$$r = \lceil l / k \rceil,$$
$l$ is the length of the encrypted model gradient and $k$ is a preset segment length; then each segment is compressed into a single element, namely the solution of the system of $k$ congruence equations formed from that segment, and the elements of all segments together form the compression result of the whole model gradient.
Further, in the method of the invention, the leader is elected from the blockchain nodes by lot using a reputation-based consistent hashing protocol. The election proceeds as follows: a hash ring is set up and each blockchain node is allocated a portion of the ring according to its reputation value; the SHA-256 hash value of the current latest block is hashed and the result is mapped onto the hash ring; and the blockchain node whose portion of the ring contains the mapped point is the elected leader.
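A minimal Python sketch of such a reputation-weighted draw is given below. The ring size `RING_SIZE`, the integer reputation values, the ordering of nodes on the ring, and the function name `elect_leader` are assumptions made for illustration; the patent text does not specify them.

```python
import hashlib
from bisect import bisect_right
from itertools import accumulate

RING_SIZE = 2**32  # assumed size of the hash ring

def elect_leader(reputations, latest_block_hash):
    """Pick the node whose arc of the ring contains the hashed block hash.

    reputations: dict mapping node id -> non-negative integer reputation.
    latest_block_hash: bytes, the SHA-256 hash of the current latest block.
    """
    nodes = sorted(n for n, r in reputations.items() if r > 0)
    total = sum(reputations[n] for n in nodes)
    if total == 0:
        raise ValueError("no node with positive reputation")
    # Upper bound of each node's arc; arc length is proportional to reputation.
    bounds = [c * RING_SIZE // total
              for c in accumulate(reputations[n] for n in nodes)]
    # Hash the latest block hash again and map the digest onto the ring.
    digest = hashlib.sha256(latest_block_hash).digest()
    point = int.from_bytes(digest, "big") % RING_SIZE
    return nodes[bisect_right(bounds, point)]
```

Because every node can recompute the same point from the latest block, all honest nodes agree on the leader without extra communication, and a node with a larger reputation owns a larger arc and is therefore drawn more often.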
Further, in the method of the invention, let U denote the set of all users and V the set of abnormal users, an abnormal user being one that has dropped offline or whose signature is invalid. The process by which the leader aggregates all valid gradient data in the transaction pool is expressed as:
[Equation image BDA0003469852790000032 not recovered]
where CRT denotes the compression operation, the symbol shown in image BDA0003469852790000033 (not recovered) denotes the mask-encrypted model gradient of user i in the transaction pool, and $\Delta w'_i$ is the compressed model gradient of user i.
Further, in the method of the invention, for each abnormal user in the abnormal user set, several blockchain nodes of the backup committee first submit their secret shares of that user, the shares are verified for correctness, and the polynomial and the abnormal user's private key are recovered by the interpolation theorem; the redundant random masks are then computed from the shared keys between the other users and the abnormal user, and the global gradient is recovered.
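The interpolation step can be sketched in Python as plain Shamir secret sharing over a prime field. This is a simplified illustration: the field modulus `P`, the placement of the private key in the constant term of the polynomial, and the helper names are assumptions, and the witness check against the polynomial commitment is omitted.

```python
import random

P = 2**127 - 1  # assumed prime field modulus

def make_shares(secret, t, n, rng):
    """Split `secret` into n shares; any t + 1 of them reconstruct it."""
    # Degree-t polynomial with the secret as its constant term (assumption).
    coeffs = [secret] + [rng.randrange(P) for _ in range(t)]

    def phi(x):
        return sum(c * pow(x, j, P) for j, c in enumerate(coeffs)) % P

    return [(i, phi(i)) for i in range(1, n + 1)]

def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the field of integers mod P."""
    secret = 0
    for i, y_i in shares:
        num, den = 1, 1
        for j, _ in shares:
            if j != i:
                num = num * (-j) % P
                den = den * (i - j) % P
        secret = (secret + y_i * num * pow(den, -1, P)) % P
    return secret
```

Once the abnormal user's private key has been rebuilt this way, each pairwise shared key, and hence each redundant mask, can be recomputed from that private key and the other users' public keys.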
Further, in the method of the invention, correctness verification uses the additive homomorphism of the polynomial commitment to check whether the model gradients in the transaction pool have been tampered with; if they have not, the new block created by the leader is deemed valid. The new block passes verification when the proportion of blockchain nodes in the verification committee that deem it valid reaches a preset value; otherwise an invalid empty block is generated.
Further, the invention also provides a verifiable and secure intelligence data sharing system based on blockchain and federated learning, used for joint modeling of an intrusion detection model by multiple users in network security defense, comprising: user nodes that perform local model training in the joint modeling; blockchain nodes that run consensus over the users' local training parameters; a trusted authority that distributes public/private key pairs to the user nodes and blockchain nodes; and a backup committee and a verification committee, each consisting of several blockchain nodes, wherein each user secret-shares its private key to the backup committee so that the private key can be recovered when the user becomes abnormal, and the new block carrying the aggregated global gradient is verified for correctness by the verification committee;
each user trains on local data to obtain the gradient of its local intrusion detection model, encrypts the gradient by adding a mask, compresses the encrypted gradient, and sends the compressed encrypted gradient together with a digital signature to its associated neighboring blockchain node;
the blockchain nodes verify the signatures on the uploaded gradient data and put the valid model gradients that pass verification into a transaction pool; a leader elected among the blockchain nodes aggregates the valid gradients in the transaction pool, adds the redundant masks of abnormal users recovered by the backup committee to obtain the global gradient, creates a new block recording the global gradient and other key parameters, and sends the new block to the verification committee;
the verification committee verifies the correctness of the global gradient and broadcasts the new block that passes verification to the whole network to reach consensus; each user downloads the latest block, obtains the latest global gradient from it, and updates its local intrusion detection model.
The invention has the beneficial effects that:
an artificial intelligence model is trained through joint modeling and used to build an intrusion detection system, which reduces the risk of data privacy leakage, mitigates single-point-of-failure attacks and lack of trust, makes the whole intelligence data sharing process verifiable, traceable and auditable, and suits data sharing among multiple departments or organizations; gradients are encrypted quickly by adding masks, which resists privacy attacks such as model inversion and model extraction, and because the masks cancel to 0 during gradient aggregation the accuracy of the federated learning model is unaffected; and gradient verification based on polynomial commitments is integrated into the consensus process of joint modeling, which resists tampering attacks by malicious blockchain nodes. The invention thereby addresses the confidentiality, result verifiability and high privacy-protection overhead problems of data sharing, provides an effective technical means for data circulation and sharing among different departments or institutions, improves local intrusion detection performance and network security defense, and has good application prospects.
Description of the drawings:
FIG. 1 is a flow diagram of the verifiable and secure intelligence data sharing method based on blockchain and federated learning in an embodiment;
FIG. 2 is a schematic diagram of the architecture for verifiable and secure intelligence data sharing in an embodiment;
FIG. 3 is a schematic diagram of one training round in the iterative training for verifiable and secure intelligence data sharing in an embodiment;
FIG. 4 is a schematic diagram of the gradient masking and compression process in an embodiment;
FIG. 5 is an illustration of the consistent hashing protocol in an embodiment;
FIG. 6 is a schematic diagram of a new block created by the leader in an embodiment;
FIG. 7 is a schematic diagram of the resistance to collusion attacks at different backup committee sizes in an embodiment.
The specific implementation mode is as follows:
in order to make the objects, technical solutions and advantages of the present invention clearer and more obvious, the present invention is further described in detail below with reference to the accompanying drawings and technical solutions.
In addressing the data silo problem in Industry 4.0, one existing approach builds a decentralized cognitive computing platform by combining blockchain and federated learning, but stores users' model parameters directly on the blockchain; once an attacker or a malicious data sharing participant obtains these parameters, information about a user's raw data can be inferred through a model inversion attack. Another approach encrypts users' model parameters with the Paillier algorithm before uploading them to the blockchain and, after the model is updated, has a subset of users cooperate to decrypt, which consumes a large amount of computation and communication overhead. For secure data sharing in industrial Internet scenarios, local differential privacy can be applied: noise is added to the raw data before feature extraction and sharing, which prevents privacy-stealing attacks but sacrifices part of the data's utility. Enhancing data security with homomorphic encryption or differential privacy therefore has drawbacks, and how to balance data availability and privacy with lower computation and communication overhead still needs study. In addition, these approaches do not consider the verifiability of the data sharing result. Base stations typically act as blockchain nodes that collect the model parameters uploaded by users in different areas; if a malicious base station tampers with the model parameters before putting them into the transaction pool (that is, before the data goes on-chain), the blockchain reaches consensus on wrong model parameters, and joint modeling ultimately yields a wrong data sharing model, which undermines the practical application of data sharing.
To this end, an embodiment of the present invention provides a verifiable and secure intelligence data sharing method based on blockchain and federated learning, used for joint modeling of an intrusion detection model by multiple users in network security defense. As shown in FIG. 1, the joint modeling process comprises the following:
S101, a trusted authority distributes a public/private key pair to each user and each blockchain node and sends it over a secure channel; each user secret-shares its private key to a backup committee, which consists of several blockchain nodes;
S102, each user trains on local data to obtain the gradient of its local intrusion detection model, encrypts the gradient by adding a mask, compresses the encrypted gradient, and sends the compressed encrypted gradient together with a digital signature to its associated neighboring blockchain node;
S103, the blockchain nodes verify the signatures on the uploaded gradient data and put the valid model gradients that pass verification into a transaction pool; a leader elected among the blockchain nodes aggregates the valid gradients in the transaction pool, adds the redundant masks of abnormal users recovered by the backup committee to obtain the global gradient, creates a new block recording the global gradient and other key parameters, and sends the new block to a verification committee, which consists of several blockchain nodes;
S104, the verification committee verifies the correctness of the global gradient and broadcasts the new block that passes verification to the whole network to reach consensus; each user downloads the latest block, obtains the latest global gradient from it, and updates its local intrusion detection model.
In this embodiment, each user participating in intelligence data sharing turns the data sharing problem into a model gradient sharing problem through federated training at its local end, and adds a mask to the gradient to encrypt it quickly. To reduce communication overhead, the user compresses the encrypted gradient before uploading it to the associated blockchain node. All valid gradients are then aggregated on the blockchain to obtain the global gradient, whose correctness is verified before a valid block is generated and agreed upon by the whole network. Finally, each user downloads the newly generated block from the blockchain, obtains the global gradient, and updates its local model. Users may be departments, local enterprises, organizations and the like that participate in intelligence data sharing. They have limited intelligence data and computing capacity, and wish to keep their intelligence data local while modeling jointly with other users, so as to obtain a more accurate anomaly detection model for building a network protection system. In this embodiment, the users participating in data sharing are assumed to be semi-honest: they execute the protocol honestly, but may use the information they obtain to infer other users' intelligence data. Blockchain nodes, such as communication base stations and servers, are generally equipped with a certain amount of computing and communication resources and are responsible for operations such as parameter verification, aggregation and consensus. It is assumed that some blockchain nodes, once compromised by an attacker, may tamper with user-uploaded data before putting it into the transaction pool, and may provide false secret shares in the secret reconstruction stage.
A transaction records data exchanged between blockchain nodes; in this embodiment, a transaction records a model gradient and the related training information. Applying blockchain together with federated learning to intelligence data sharing addresses the confidentiality, result verifiability and high privacy-protection overhead problems of the data sharing process, and provides an effective technical means for opening up the circulation and sharing of intelligence data among different departments.
Further, in the method of this embodiment, the global gradient in joint modeling is obtained iteratively, round by round, under a preset model convergence condition, so that the users' local intrusion detection models are updated synchronously in each iteration; the model convergence condition is a maximum number of iteration rounds. Further, a reputation value is assigned to each user and each blockchain node as an incentive to participate in the joint modeling of the intrusion detection model; the leader, the backup committee and the verification committee are elected from the blockchain nodes according to their reputation values; and a blacklist restricts the participation rights of users and blockchain nodes whose reputation value falls below a threshold.
In this embodiment, each user and each blockchain node participating in data sharing is given an initial reputation value. For a user, the reputation value increases if the user stays online throughout data sharing and its uploaded gradient signature is verified as valid, and decreases otherwise. For a blockchain node, the reputation value increases if the node provides a correct secret share, generates a valid new block, or participates in new block verification, and decreases if it provides a false secret share. A party whose reputation value drops to 0 is blacklisted and no longer allowed to participate in intelligence data sharing. It is assumed that at least 70% of the reputation in the system is held by honest parties at any time, so that the blockchain consensus protocol operates correctly. Referring to FIG. 2, the architecture consists of a blockchain and m distributed users. The blockchain is maintained by several nodes equipped with a certain amount of computing and communication resources; in practice, the blockchain nodes may be base stations equipped with servers and the like. The users may be departments, organizations, enterprises and the like with limited computing and communication capabilities. Each user holds a local intelligence dataset $D_i$ ($1 \le i \le m$), maps the raw intelligence data to a model gradient through machine learning training, and uploads the gradient to its associated blockchain node over a wired or wireless network; federated learning is completed under the coordination of the blockchain, thereby achieving intelligence data sharing.
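The reputation rules above can be sketched as a small Python ledger. The initial value, the step size, and the class and method names are assumptions for illustration; the text only states the direction of each update and that reaching 0 leads to blacklisting.

```python
from dataclasses import dataclass, field

@dataclass
class ReputationLedger:
    initial: int = 10          # assumed initial reputation value
    step: int = 1              # assumed size of each increase or decrease
    scores: dict = field(default_factory=dict)
    blacklist: set = field(default_factory=set)

    def register(self, party_id):
        self.scores[party_id] = self.initial

    def _adjust(self, party_id, good):
        if party_id in self.blacklist:
            return  # blacklisted parties no longer participate
        delta = self.step if good else -self.step
        self.scores[party_id] = max(0, self.scores[party_id] + delta)
        if self.scores[party_id] == 0:
            self.blacklist.add(party_id)

    def record_user_round(self, user, online, signature_valid):
        # A user gains reputation only if it stayed online and its
        # uploaded gradient signature verified; otherwise it loses some.
        self._adjust(user, online and signature_valid)

    def record_node_action(self, node, honest):
        # honest=True: correct secret share, valid new block, or taking
        # part in block verification; honest=False: false secret share.
        self._adjust(node, honest)

    def may_participate(self, party_id):
        return party_id not in self.blacklist
```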
Further, in the method of this embodiment, in the mask encryption of the model gradient, each user computes a shared key with every other user from its own private key and the other user's public key using the Diffie-Hellman protocol, uses the shared key as the seed of a random number generator to generate a random mask, and encrypts its local model gradient with the random masks. Using verifiable secret sharing, the user embeds its private key in a selected polynomial and constructs a polynomial commitment, splits the polynomial into n secret shares, and sends the secret shares, the secret share witnesses and the polynomial commitment to the backup committee, so that when a user drops offline or fails signature verification the redundant random mask can be recovered through key reconstruction; a secret share witness is used to verify that a secret share belongs to the committed polynomial. Further, the encrypted model gradient is compressed using the Chinese Remainder Theorem (CRT). The compression proceeds as follows: first, the user's encrypted model gradient is divided evenly into $r$ segments, where
$$r = \lceil l / k \rceil,$$
$l$ is the length of the encrypted model gradient and $k$ is a preset segment length; then each segment is compressed into a single element, namely the solution of the system of $k$ congruence equations formed from that segment, and the elements of all segments together form the compression result of the whole model gradient.
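The segment-wise packing can be sketched in Python as follows. This is an illustration under stated assumptions, not the patent's implementation: the pairwise coprime moduli (three Mersenne primes), the zero-padding of the last segment, and the function names are chosen here; every value, and every aggregated sum, must stay below the corresponding modulus for unpacking to be exact.

```python
from math import prod

# Assumed pairwise coprime moduli (Mersenne primes); k = len(MODULI).
MODULI = [2**31 - 1, 2**61 - 1, 2**89 - 1]

def crt_pack(segment, moduli=MODULI):
    """Solve x = a_j (mod m_j) for all j and return the unique x mod M."""
    padded = list(segment) + [0] * (len(moduli) - len(segment))
    M = prod(moduli)
    x = 0
    for a, m in zip(padded, moduli):
        M_j = M // m
        x += a * M_j * pow(M_j, -1, m)  # modular inverse (Python 3.8+)
    return x % M

def compress(vec, moduli=MODULI):
    """Pack a length-l vector into r = ceil(l / k) elements."""
    k = len(moduli)
    return [crt_pack(vec[s:s + k], moduli) for s in range(0, len(vec), k)]

def decompress(packed, length, moduli=MODULI):
    """Recover the original entries as residues of each packed element."""
    out = []
    for x in packed:
        out.extend(x % m for m in moduli)
    return out[:length]
```

Note that this packing reduces the number of transmitted elements from l to r, and that it is additive: adding two packed elements modulo the product of the moduli corresponds to adding the underlying segments entry by entry, which is what allows aggregation to be carried out on compressed gradients.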
Cryptographic commitments are an important class of cryptographic primitives, and generally involve a committer and a verifier. In the commitment generation stage, the committer selects a message m, computes a commitment c in ciphertext form, and sends c to the verifier; from this point on the committer can no longer change m. In the commitment opening stage, the committer publishes the plaintext message m and the secret key, and the verifier computes the commitment c' corresponding to m in the same way; if c' = c the verification passes, otherwise it fails. A commitment protocol has the following properties: (1) hiding: the commitment value c reveals no information about the message m; (2) binding: the committer cannot open the commitment c to a message other than m and still pass verification. A commitment protocol can therefore be used to ensure that the ciphertext form of private data has a unique interpretation.
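The commit-then-open flow can be illustrated with a minimal hash-based commitment in Python. This is a different and much simpler construction than the polynomial commitment used in this scheme, shown only to make the two stages concrete; the function names are illustrative, and the hiding and binding properties rest on standard assumptions about SHA-256.

```python
import hashlib
import hmac
import secrets

def commit(message: bytes):
    """Commitment generation: return (c, key); only c is sent at first."""
    key = secrets.token_bytes(32)  # fresh random opening key
    c = hashlib.sha256(key + message).digest()
    return c, key

def verify(c: bytes, message: bytes, key: bytes) -> bool:
    """Commitment opening: recompute c' from (message, key), compare to c."""
    c_prime = hashlib.sha256(key + message).digest()
    return hmac.compare_digest(c, c_prime)
```

A committer would send `c` first, and later reveal `message` and `key`; the verifier accepts only if `verify` returns true, so opening the same `c` to a different message fails.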
A polynomial commitment is a commitment protocol that is additively homomorphic, and it is often used to construct zero-knowledge proofs, verifiable secret sharing, and the like. The construction of verifiable secret sharing (VSS) can be described as follows:
(1) Initialization Setup($1^{\kappa}$, t): let $\mathbb{G}$ and $\mathbb{G}_T$ be groups of prime order p, let g be a generator of $\mathbb{G}$, and let $e: \mathbb{G} \times \mathbb{G} \to \mathbb{G}_T$ be a symmetric bilinear pairing for which the t-strong Diffie-Hellman (t-SDH) assumption holds. Select $\alpha \in \mathbb{Z}_p^*$ as the private key SK; the public key is $PK = \langle \mathbb{G}, \mathbb{G}_T, e, g, g^{\alpha}, \ldots, g^{\alpha^{t}} \rangle$.
(2) Commitment generation Commitment(PK, φ(x)): for a polynomial
$\phi(x) = \sum_{j=0}^{\deg(\phi)} \phi_j x^j$
its commitment can be calculated as:
$\mathrm{COMM}(\phi(x)) = g^{\phi(\alpha)} = \prod_{j=0}^{\deg(\phi)} \left(g^{\alpha^{j}}\right)^{\phi_j}$ (1)
(3) Commitment opening VerifyPoly(PK, COMM(φ(x)), φ(x)): given a polynomial
$\phi(x) = \sum_{j=0}^{\deg(\phi)} \phi_j x^j$
and a commitment value COMM, if
$\mathrm{COMM} = \prod_{j=0}^{\deg(\phi)} \left(g^{\alpha^{j}}\right)^{\phi_j}$
then the commitment was indeed generated from the polynomial φ(x); otherwise it was not.
(4) Secret sharing CreateWitness(PK, φ(x), i): to perform (n, t)-secret sharing among n users, the secret share $\langle i, \phi(i), w_i \rangle$ sent to user i (1 ≤ i ≤ n) contains the value φ(i) of the polynomial φ(x) at index i and the witness $w_i = \mathrm{COMM}(\psi_i(x))$, where
$\psi_i(x) = \dfrac{\phi(x) - \phi(i)}{x - i}$
and $\mathrm{COMM}(\psi_i(x))$ is calculated in the same way as in formula (1).
(5) Secret verification VerifyEval(PK, COMM(φ(x)), i, φ(i), $w_i$): if formula (2) holds, the secret share $\langle i, \phi(i), w_i \rangle$ of user i comes from the committed polynomial COMM(φ(x)); otherwise it does not.
$e(\mathrm{COMM}(\phi(x)), g) = e(w_i, g^{\alpha} / g^{i}) \cdot e(g, g)^{\phi(i)}$ (2)
(6) Secret reconstruction Recover(i, φ(i)): any t + 1 or more users present their verified secret shares $\langle i, \phi(i) \rangle$, and the original polynomial φ(x) is then recovered using the interpolation theorem.
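The sharing and reconstruction steps (4) and (6) can be sketched as below. This is a minimal illustration of the underlying Shamir-style sharing only: the pairing-based witness and the VerifyEval check are omitted, and the modulus `P`, the function names, and the use of Python's `random` module are assumptions made for the example (a real deployment would use a cryptographically secure source of randomness).

```python
import random

P = 2**127 - 1  # a Mersenne prime used as the field modulus (assumption)

def split_secret(secret, n, t, rng):
    """Split `secret` into n shares <i, phi(i)>; any t + 1 shares recover it."""
    # phi has degree t and constant term phi(0) = secret
    coeffs = [secret % P] + [rng.randrange(P) for _ in range(t)]

    def phi(x):
        return sum(c * pow(x, j, P) for j, c in enumerate(coeffs)) % P

    return [(i, phi(i)) for i in range(1, n + 1)]

def recover_secret(shares):
    """Lagrange interpolation of phi at x = 0 over Z_P."""
    secret = 0
    for i, y_i in shares:
        num, den = 1, 1
        for j, _ in shares:
            if j != i:
                num = num * (-j) % P
                den = den * (i - j) % P
        secret = (secret + y_i * num * pow(den, -1, P)) % P
    return secret

shares = split_secret(123456789, n=5, t=2, rng=random.Random(0))
assert recover_secret(shares[:3]) == 123456789   # t + 1 = 3 shares suffice
assert recover_secret(shares[2:]) == 123456789   # any other 3 shares work too
```

Interpolating at x = 0 returns only the constant term, which is where the secret is stored; recovering the whole polynomial would use the same Lagrange basis evaluated symbolically.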
In addition, the polynomial commitments also satisfy additive homomorphism:
$\mathrm{COMM}(\phi_1(x) + \phi_2(x)) = \mathrm{COMM}(\phi_1(x)) \cdot \mathrm{COMM}(\phi_2(x))$ (3)
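The additive homomorphism can be checked numerically in a toy setting. The sketch below assumes the commitment has the standard form of a product of published powers raised to the polynomial coefficients, and it replaces the pairing group with the multiplicative group modulo a prime `Q`; the values of `Q`, `G`, `ALPHA`, and `T` are hypothetical, and the trapdoor is visible only because this is a toy setup.

```python
Q = 2**127 - 1      # toy prime modulus (assumption; real schemes use pairing groups)
G = 3               # toy base element
ALPHA = 123456789   # trapdoor; in a real setup it is discarded after key generation
T = 3               # maximum polynomial degree

# Public key: g^(alpha^j) for j = 0..T
PK = [pow(G, pow(ALPHA, j, Q - 1), Q) for j in range(T + 1)]

def comm(coeffs):
    """COMM(phi) = prod_j (g^(alpha^j))^(phi_j) mod Q, computed from PK only."""
    c = 1
    for pk_j, phi_j in zip(PK, coeffs):
        c = c * pow(pk_j, phi_j, Q) % Q
    return c

phi1 = [5, 7, 0, 2]                               # coefficients phi_0..phi_3
phi2 = [1, 4, 9, 3]
phi_sum = [a + b for a, b in zip(phi1, phi2)]     # coefficients of phi1 + phi2

assert comm(phi_sum) == comm(phi1) * comm(phi2) % Q
```

The equality holds because adding coefficients adds exponents, and adding exponents multiplies the group elements.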
The Chinese Remainder Theorem (CRT) is a method for solving systems of linear congruences. Suppose $m_1, m_2, \ldots, m_k$ are pairwise coprime positive integers, and let $M = m_1 \cdot m_2 \cdots m_k$. Then the following system has exactly one solution in $\mathbb{Z}_M$:
$x \equiv a_i \pmod{m_i}, \quad i = 1, 2, \ldots, k$
The solution is
$x = \left( \sum_{i=1}^{k} a_i M_i M_i^{-1} \right) \bmod M$
where $M_i = M / m_i$ and $M_i^{-1}$ is the inverse of $M_i$ in $\mathbb{Z}_{m_i}$.
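The CRT solution can be written directly in code. The following is a minimal sketch; the function name `crt` and the sample residues and moduli are illustrative choices, not values from this document.

```python
from math import prod

def crt(residues, moduli):
    """Return the unique x in [0, M) with x ≡ residues[i] (mod moduli[i])."""
    M = prod(moduli)
    x = 0
    for a_i, m_i in zip(residues, moduli):
        M_i = M // m_i
        x += a_i * M_i * pow(M_i, -1, m_i)   # pow(v, -1, m) is the inverse of v mod m
    return x % M

x = crt([2, 3, 2], [3, 5, 7])
print(x)  # 23
assert [x % m for m in (3, 5, 7)] == [2, 3, 2]
```

Reducing the solution modulo each modulus returns the original residues, which is the decompression step used later in the scheme.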
Assume that all users have registered in the system and have each been assigned a public key, a private key, and an ordered ID. As shown in fig. 3, one round of training can then be designed to include the following steps. Before training begins, each user uses verifiable secret sharing (VSS) to secret-share its own private key with a backup committee consisting of several blockchain nodes, so that a user dropping offline in subsequent training does not disrupt the normal training process (step0). In formal training, each user computes the model gradient locally (step1) and adds a mask to prevent privacy leakage (step2). To save communication overhead, the user compresses the encrypted gradient using the Chinese Remainder Theorem (CRT) and sends it to the neighboring blockchain node together with the commitment value of the original gradient (step3 and step4). After verifying the data signature, the node puts valid gradients into the transaction pool; the nodes stop receiving data after a specified time and elect a leader to perform the subsequent gradient aggregation (step5). If the gradients of all users are present in the transaction pool, the leader adds them directly to obtain the global gradient; if some users' gradients are missing from the transaction pool (that is, the user dropped out or the signature was found invalid), the leader computes the global gradient with the help of the secret shares provided by the backup committee (step6). The leader then creates a new block that packages the relevant gradient information and sends the block to the verification committee for verification and broadcast (step7). Finally, the user downloads the latest global gradient from the blockchain and updates the local model (step8). If a user dropped out or failed signature verification in a training round, it is assigned a new private key for the next round and executes step0-step8; otherwise it executes step1-step8.
The iteration is repeated until the model converges or the maximum number of training rounds is reached. Note that in each round of training, the reputation values of the users, leaders, and backup committee members identified as legitimate all increase, to encourage them to contribute more to the data sharing system.
In the initialization stage (step0) before training, the trusted authority generates public/private key pairs for all users and blockchain nodes. Other public information is stored in the genesis block (the first block in the blockchain), which the trusted authority sends to all participants through a secure channel so that they can perform the initialization task. The genesis block mainly contains the following:
a) Model initialization parameters $w_0$, learning rate η, and total number of training rounds T
b) The public key PK used to generate polynomial commitments
c) k pairwise coprime positive integers $m_1, m_2, \ldots, m_k$
d) A pseudo-random number generator PRG(·): when its input is a uniformly random seed drawn from
Figure BDA0003469852790000098
it outputs pseudo-random numbers distributed over the space $[0, R)^l$
e) An initial random seed $seed_0$; the seed parameter $seed_i$ of the i-th training round is generated from $seed_{i-1}$ of the previous round, mainly to ensure the randomness of leader election
f) Initial reputation values for all users and blockchain nodes, and the reputation update function
In addition, considering that some users may be disconnected during training, all users are made to use VSS to split their private keys into secret shares and send the secret shares to the backup committee before formal training begins.
Local training phase (step1-step4): in each round of training, each user i (1 ≤ i ≤ m) obtains a model gradient $\Delta w_i$ from its local intelligence data and then encrypts $\Delta w_i$ by adding a mask, giving
Figure BDA0003469852790000091
to enhance privacy protection. To reduce communication overhead, in this embodiment the Chinese Remainder Theorem (CRT) can be used to compress the encrypted gradient
Figure BDA0003469852790000092
Assuming that $\Delta w_i$ has length l, user i (1 ≤ i ≤ m) first divides
Figure BDA0003469852790000093
evenly into $\lceil l/k \rceil$ segments, i.e.
Figure BDA0003469852790000095
where the symbol $\lceil \cdot \rceil$ denotes rounding up. If l is not evenly divisible by k, zeros are used for padding. Suppose that the j-th segment is
Figure BDA0003469852790000097
Then user i (1 ≤ i ≤ m) solves the following system of congruences:
Figure BDA0003469852790000101
According to the Chinese Remainder Theorem, this system has the unique solution
Figure BDA0003469852790000102
It follows that each gradient vector segment of length k,
Figure BDA0003469852790000103
is compressed by CRT into a single element Δw_ij, so the whole gradient vector
Figure BDA0003469852790000104
can be compressed into
Figure BDA0003469852790000105
whose length is 1/k of the original length. The entire gradient masking and compression process is shown in fig. 4. User i (1 ≤ i ≤ m) computes the commitment value COMM($\Delta w_i$) of the original gradient $\Delta w_i$ and sends ⟨Δw′_i, COMM($\Delta w_i$)⟩, together with a digital signature, to the associated blockchain node.
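The segment-wise compression can be sketched as follows. The example assumes that the masked gradient has already been quantized to non-negative integers smaller than every modulus; the moduli in `MODULI` and the function names are hypothetical.

```python
from math import ceil, prod

MODULI = [97, 101, 103, 107]   # k = 4 pairwise coprime moduli (hypothetical values)
K = len(MODULI)
M = prod(MODULI)

def crt(residues):
    """Solve x ≡ residues[t] (mod MODULI[t]) for the unique x in [0, M)."""
    return sum(a * (M // m) * pow(M // m, -1, m)
               for a, m in zip(residues, MODULI)) % M

def compress(grad):
    """Pad with zeros to a multiple of k, then pack each k-element segment."""
    r = ceil(len(grad) / K)
    padded = grad + [0] * (r * K - len(grad))
    return [crt(padded[j * K:(j + 1) * K]) for j in range(r)]

def decompress(packed, length):
    """Recover the elements by reducing each packed value modulo m_1..m_k."""
    flat = [x % m for x in packed for m in MODULI]
    return flat[:length]   # drop the zero padding

grad = [12, 7, 90, 33, 5, 61]          # l = 6, so r = ceil(6 / 4) = 2 segments
packed = compress(grad)
assert len(packed) == 2
assert decompress(packed, len(grad)) == grad
```

The number of transmitted elements drops by a factor of k, although each packed element is an integer below M and therefore needs more bits than a single gradient entry.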
Aggregation stage (step5-step6): when a blockchain node receives data uploaded by a user, it first checks whether the signature is valid. If it is, the data is put into the transaction pool. After a certain time, all nodes stop receiving data and then compete to become the leader, which obtains the right to generate the new block. In this embodiment, a consistent hash protocol based on reputation values can be used as the lottery algorithm for selecting the leader; the process is shown in fig. 5. Specifically, a hash ring is given, and its space is allocated to the blockchain nodes in proportion to their reputation values. Starting from the SHA-256 hash value of the current latest block, the hash is computed repeatedly, and the hash value obtained by each computation is mapped onto the hash ring; the blockchain node that owns the space in which the mapped result falls is selected. Note that in this embodiment the leader of a training round can be selected with a single hash computation, whereas the backup committee and the verification committee, which consist of several nodes, require multiple hash computations to select their member nodes. This lottery process is similar to the Algorand protocol: the probability that a blockchain node is drawn is proportional to its reputation value. Let U denote the set of all users and V the set of abnormal users who have dropped out or whose signatures are invalid. The selected leader aggregates all user gradients in the transaction pool according to equation (6):
Figure BDA0003469852790000106
For two values compressed by CRT (as in formula (5)),
Figure BDA0003469852790000107
Figure BDA0003469852790000108
it follows that:
Figure BDA0003469852790000109
This formula shows that CRT compression is additively homomorphic. Using this property, equation (6) can be converted to:
Figure BDA00034698527900001010
Using the modulo operations in equation (9), the leader then decompresses
Figure BDA0003469852790000111
into
Figure BDA0003469852790000112
and then computes the global gradient $\Delta w_g$:
Figure BDA0003469852790000113
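The additive homomorphism of CRT compression can be checked with a small example. The sketch assumes that every element-wise sum stays below the corresponding modulus, so that reducing the aggregated value modulo each modulus returns the exact sums; the moduli and helper names are hypothetical.

```python
from math import prod

MODULI = [97, 101, 103, 107]   # pairwise coprime moduli (hypothetical values)
M = prod(MODULI)

def pack(segment):
    """CRT-compress one segment of k residues into a single integer in [0, M)."""
    return sum(a * (M // m) * pow(M // m, -1, m)
               for a, m in zip(segment, MODULI)) % M

def unpack(x):
    """Decompress by taking the packed value modulo each m_t."""
    return [x % m for m in MODULI]

u = [10, 20, 30, 40]                      # segment from one user
v = [1, 2, 3, 4]                          # segment from another user
aggregated = (pack(u) + pack(v)) % M      # the leader adds compressed values only

assert unpack(aggregated) == [11, 22, 33, 44]
```

If a sum reached or exceeded its modulus, the decompressed entry would wrap around, so the moduli have to be chosen larger than the largest possible aggregate.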
Further, in this embodiment, for each abnormal user in the abnormal user set, several blockchain nodes in the backup committee first submit the abnormal user's secret shares. After the secret shares have been verified as correct, the polynomial and the private key of the abnormal user are recovered using the interpolation theorem, and the redundant random masks are then computed from the shared keys between the other users and the abnormal user, so that the global gradient is recovered. Further, the additive homomorphism of the polynomial commitment is used to confirm whether the model gradients in the transaction pool have been tampered with; if they have not, the new block created by the leader is confirmed to be valid. Verification passes when the proportion of verifiers in the verification committee that confirm the new block as valid reaches a preset value; if the proportion is smaller than the preset value, an invalid empty block is generated.
In the block generation and broadcast phase, the leader creates a new block and broadcasts it to the verification committee for verification. As shown in fig. 6, a block in this embodiment consists of a block header and a block body: the block header contains the meta information of the block and a pointer (i.e., a hash value) to the previous block, and the block body contains a series of transactions. Unlike conventional blockchains, this embodiment stores the relevant training parameters as transactions, which may include: (1) the random seed parameter $seed_{t+1}$ for the next round of training; (2) the proof generated when electing the leader in the aggregation phase; (3) the global gradient $\Delta w_g$ of this round; and (4) the commitment values of the legitimate users' gradients. The key parameter information of the whole training process is thus recorded in the blockchain in a tamper-proof manner, so that, compared with traditional federated learning algorithms, the training process of this algorithm is auditable.
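A block with the four kinds of transaction content listed above might be modeled as in the following sketch. All field names, the JSON serialization, and the use of SHA-256 for the block hash are assumptions made for illustration; the source does not specify an encoding.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field

@dataclass
class Block:
    prev_hash: str          # header: pointer (hash value) to the previous block
    next_seed: str          # (1) random seed parameter for the next round
    leader_proof: str       # (2) proof generated when the leader was elected
    global_gradient: list   # (3) global gradient of this round
    commitments: list = field(default_factory=list)  # (4) COMM values of valid users

    def hash(self) -> str:
        """Hash of the serialized block, used as the pointer in the next block."""
        payload = json.dumps(asdict(self), sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()

genesis = Block("0" * 64, "seed-1", "proof-0", [0.0])
nxt = Block(genesis.hash(), "seed-2", "proof-1", [0.5, -0.25], ["c1", "c2"])

# Changing any recorded parameter changes the block hash, which breaks the chain link.
tampered = Block(nxt.prev_hash, "seed-2", "proof-1", [0.5, 0.25], ["c1", "c2"])
assert nxt.prev_hash == genesis.hash()
assert tampered.hash() != nxt.hash()
```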
In the prior art, the plaintext local gradients of all users are stored directly in the block; once an attacker or a semi-honest user obtains the gradients of other users, privacy attacks such as model inversion attacks and model extraction attacks can be launched. Therefore, in this embodiment only the commitment values of the gradients are stored in the block, which both protects the private information in the gradients and ensures the correctness of the global gradient obtained in each training round. Specifically, after the new block generated by the leader is broadcast to the verification committee, all verifiers check whether formula (10) holds.
$\mathrm{COMM}(\Delta w_g) = \prod \mathrm{COMM}(\Delta w_i)$ (10)
If it holds, then by the additive homomorphism of the polynomial commitment the user gradients in the transaction pool have not been tampered with, and the new block is valid. Otherwise, some blockchain nodes tampered with the user gradients they collected before putting them into the transaction pool, so the global gradient was computed incorrectly and the new block is invalid. When more than 2/3 of the verifiers determine that the new block is valid, verification passes and the new block is broadcast to the whole network to reach consensus; otherwise, an invalid empty block is generated.
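The acceptance rule for a new block can be expressed as a short sketch. It assumes that each verifier's check of formula (10) has already been reduced to a boolean vote; the function name and the use of exact fractions are illustrative choices.

```python
from fractions import Fraction

def block_accepted(votes, threshold=Fraction(2, 3)):
    """Accept the block only if strictly more than `threshold` of verifiers approve."""
    approvals = sum(1 for v in votes if v)
    return Fraction(approvals, len(votes)) > threshold

assert block_accepted([True] * 7 + [False] * 3)        # 7/10 exceeds 2/3
assert not block_accepted([True] * 6 + [False] * 3)    # exactly 2/3 is not enough
```

Exact fractions avoid floating-point rounding at the boundary, where a committee whose size is a multiple of three can land exactly on 2/3.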
In the block generation and broadcast phase (step7), the leader creates a new block and broadcasts it to the verification committee for verification. The verification committee is selected with the same lottery algorithm based on the consistent hash protocol described above, so the details are not repeated here. If verification succeeds, the verification committee broadcasts the block to all blockchain nodes in the network through the gossip protocol to reach consensus. Otherwise, an invalid empty block is created.
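The reputation-weighted lottery used for the leader and the committees can be sketched as follows. The sketch assumes integer reputation values, maps each hash onto the ring by reduction modulo the total reputation, and skips nodes that were already drawn; these details, like the function name `elect`, are assumptions and are not specified in the source.

```python
import hashlib
from bisect import bisect_right
from itertools import accumulate

def elect(reputations, latest_block_hash, count):
    """Draw `count` distinct nodes; each draw re-hashes the previous hash value."""
    eligible = [n for n in reputations if reputations[n] > 0]
    if count > len(eligible):
        raise ValueError("not enough nodes with positive reputation")
    nodes = sorted(reputations)
    # Segment end points on the ring; a node's segment length equals its reputation.
    bounds = list(accumulate(reputations[n] for n in nodes))
    total = bounds[-1]
    selected, h = [], latest_block_hash
    while len(selected) < count:
        h = hashlib.sha256(h).digest()              # repeated hash computation
        point = int.from_bytes(h, "big") % total    # position on the hash ring
        node = nodes[bisect_right(bounds, point)]   # owner of that position
        if node not in selected:
            selected.append(node)
    return selected

reps = {"node-A": 5, "node-B": 3, "node-C": 2}
seed = hashlib.sha256(b"latest block").digest()
leader = elect(reps, seed, 1)        # one hash computation selects the leader
committee = elect(reps, seed, 3)     # several computations select a committee
assert leader[0] in reps and committee[0] == leader[0]
```

Because every node evaluates the same hash chain from the same block, all honest nodes arrive at the same leader and committee without extra communication.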
Model updating stage (step8): the user downloads the latest block from its associated blockchain node, obtains the global gradient from it, and updates the local model. If abnormal users (that is, users who dropped out or whose signatures were invalid) appeared in the training round, the leader has recovered their private keys during gradient aggregation, so the trusted authority must distribute new public/private key pairs to these users before the next training round, and they must perform the secret sharing step again. After each round of training, the reputation value is increased for users who stayed online throughout, participated in data sharing, and uploaded gradients whose signatures were verified as valid; otherwise it is decreased. The reputation value is also increased for blockchain nodes that generated a valid new block or participated in verifying a new block. Users or nodes whose reputation value decreases to 0 are blacklisted and are no longer allowed to participate in intelligence data sharing.
The computations for secret sharing of the user private key, mask encryption of the model gradient, and aggregation of the global gradient can be described as follows:
gradient mask: it is assumed that each user has already obtained the public key pk of the other usersiI ∈ U, then running the Diffie-Hellman protocol can compute the shared secret s between each user pairi,j←KA.agree(ski,pkj) And generates a random mask using the key as a seed for the random number generator. Suppose that each user has now obtained a gradient Δ w of length l through local trainingi1 ≦ i ≦ m, assuming for simplicity vector Δ wiAll elements in (1) are in the field
Figure BDA0003469852790000125
Medium, gradient Δ wiCan be encrypted into
Figure BDA0003469852790000121
As shown in the following formula.
Figure BDA0003469852790000122
As shown in the formula (11), the user only needs to add a random number to the gradient to realize encryption, and when the encryption gradients of all users are equal
Figure BDA0003469852790000123
After addition, the random numbers partially cancel each other to be 0, and the global gradient can be directly obtained
Figure BDA0003469852790000124
Compared with the homomorphic encryption algorithms adopted in the prior art, this encryption method is more efficient and does not reduce data utility. However, once some users drop out or their signatures are verified as invalid, adding up the remaining gradients no longer cancels the random numbers to 0, and the global gradient cannot be obtained. The user's private key therefore needs to be backed up secretly, so that when a user drops out or a signature is invalid, the redundant random numbers can be computed from the backed-up private key and the global gradient obtained. One approach based on this idea secret-shares the user's private key among the other local users. However, key reconstruction incurs considerable computation and communication overhead, and blockchain nodes have far more computation and communication resources than local users; in this embodiment, the user's private key can therefore be secret-shared via VSS with a backup committee consisting of several blockchain nodes.
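The cancellation of pairwise masks can be demonstrated with a toy example. Several details are assumptions made for the sketch: the Diffie-Hellman group parameters `P` and `G`, the mask modulus `R`, the use of Python's `random.Random` as a stand-in for PRG(·), and the sign convention (a user adds the mask shared with higher-numbered users and subtracts the mask shared with lower-numbered users).

```python
import random

P, G = 2**127 - 1, 3      # toy Diffie-Hellman group (assumption, not a secure choice)
R, L = 2**16, 4           # mask modulus R and gradient length l (hypothetical values)

def prg(seed):
    """Stand-in for PRG(.): expands a seed into l values in [0, R)."""
    rng = random.Random(seed)
    return [rng.randrange(R) for _ in range(L)]

users = [1, 2, 3]
sk = {u: 1000 + 7 * u for u in users}          # toy private keys
pk = {u: pow(G, sk[u], P) for u in users}      # public keys g^sk

def agree(i, j):
    """s_ij = pk_j^sk_i = g^(sk_i * sk_j); identical for (i, j) and (j, i)."""
    return pow(pk[j], sk[i], P)

def mask(i, grad):
    """Add PRG(s_ij) for j > i and subtract it for j < i (assumed sign convention)."""
    out = list(grad)
    for j in users:
        if j != i:
            sign = 1 if i < j else -1
            out = [(x + sign * y) % R for x, y in zip(out, prg(agree(i, j)))]
    return out

grads = {1: [1, 2, 3, 4], 2: [10, 20, 30, 40], 3: [5, 5, 5, 5]}
masked = {i: mask(i, g) for i, g in grads.items()}
total = [sum(col) % R for col in zip(*masked.values())]

assert total == [16, 27, 38, 49]   # masks cancel; only the sum of gradients remains
```

Each pair of users contributes the same mask once with a plus sign and once with a minus sign, which is why the sum is exact as long as every user's masked gradient is present.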
Private key sharing: assume that the backup committee consists of n blockchain nodes (elected by the lottery algorithm based on the consistent hash protocol). User i (1 ≤ i ≤ m) first selects a polynomial $\phi_i(x)$ and sets its private key $sk_i$ as the constant term of $\phi_i(x)$, i.e., $\phi_i(0) = sk_i$, and then makes a commitment $\mathrm{COMM}(\phi_i(x))$ to the polynomial. Next, using verifiable secret sharing, the polynomial $\phi_i(x)$ is split into n secret shares $\langle k, \phi_i(k) \rangle$, 1 ≤ k ≤ n, and $\langle k, \phi_i(k), w_{i,k}, \mathrm{COMM}(\phi_i(x)) \rangle$ is sent to blockchain node k (1 ≤ k ≤ n), where
Figure BDA0003469852790000131
is the witness of the secret share. It can be used to verify that the secret share indeed belongs to the polynomial $\phi_i(x)$ committed to in $\mathrm{COMM}(\phi_i(x))$, which prevents malicious blockchain nodes from supplying false shares during key reconstruction.
Gradient aggregation: suppose the leader has obtained, by decompression via equation (9),
Figure BDA0003469852790000132
If all user gradients have been uploaded to the blockchain and their signatures are valid (i.e., in equation (9)
Figure BDA0003469852790000133
), then the leader obtains the global gradient directly through a simple addition, as shown in the following equation:
Figure BDA0003469852790000134
If some users dropped out or their signatures were verified as invalid (denote this set of abnormal users by V), more than t blockchain nodes in the backup committee first submit the secret shares $\langle k, \phi_i(k), w_{i,k}, \mathrm{COMM}(\phi_i(x)) \rangle$ of these abnormal users i ∈ V. After the correctness of the secret shares has been verified, the polynomial $\phi_i(x)$ and the private key $sk_i$, i ∈ V, are recovered by the interpolation theorem. The shared keys between the other users and the abnormal users, $s_{i,m} = \mathrm{KA.agree}(sk_i, pk_m)$, i ∈ V, m ∈ U − V, are then computed, and finally the global gradient is calculated by the following formula.
Figure BDA0003469852790000135
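The recovery of the global gradient after a dropout can be sketched as below. As in any toy example of this kind, the Diffie-Hellman parameters, the modulus `R`, the `random.Random` stand-in for PRG(·), and the sign convention of the masks are assumptions; the private key of the dropped user is simply read from a dictionary here, whereas in the scheme it would be reconstructed from the backup committee's shares.

```python
import random

P, G, R, L = 2**127 - 1, 3, 2**16, 3   # toy DH group, mask modulus, gradient length

def prg(seed):
    rng = random.Random(seed)
    return [rng.randrange(R) for _ in range(L)]

def signed_mask(i, j, key):
    """Mask that user i applies with respect to user j (+ if i < j, - otherwise)."""
    sign = 1 if i < j else -1
    return [sign * y for y in prg(key)]

users = [1, 2, 3]
sk = {u: 1000 + u for u in users}                   # toy private keys
pk = {u: pow(G, sk[u], P) for u in users}
grads = {1: [1, 2, 3], 2: [4, 5, 6], 3: [7, 8, 9]}

def masked(i):
    out = list(grads[i])
    for j in users:
        if j != i:
            m = signed_mask(i, j, pow(pk[j], sk[i], P))
            out = [(x + y) % R for x, y in zip(out, m)]
    return out

dropped = {3}                                       # abnormal user set V
alive = [u for u in users if u not in dropped]      # U - V
total = [sum(col) % R for col in zip(*(masked(u) for u in alive))]

# Leader side: sk_i of each dropped user i has been reconstructed from the shares.
for i in dropped:
    for m in alive:
        key = pow(pk[m], sk[i], P)                  # s_{i,m} = KA.agree(sk_i, pk_m)
        leftover = signed_mask(m, i, key)           # mask user m applied for user i
        total = [(x - y) % R for x, y in zip(total, leftover)]

assert total == [5, 7, 9]   # sum of the surviving users' gradients
```

Masks between two surviving users still cancel on their own; only the masks that survivors shared with dropped users are left over and have to be removed by the leader.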
In this embodiment, the user's private key is secret-shared with the backup committee in the initialization stage, a mask is then added to the user's original gradient in the local training stage, and finally the global gradient is computed in the aggregation stage.
Further, based on the above method, the present invention also provides a verifiable secure sharing system for intelligence data based on blockchain and federated learning, used for joint modeling of an intrusion detection model by multiple users in network security defense, comprising: user nodes that participate in local model training in the joint modeling; blockchain nodes that perform consensus operations on the local model training parameters of the user nodes; a trusted authority that distributes public/private key pairs to the user nodes and blockchain nodes; and a backup committee and a verification committee, each consisting of several blockchain nodes, wherein each user secret-shares its own private key with the backup committee so that the user's private key information can be recovered in abnormal situations, and the correctness of the new block containing the aggregated global gradient is verified by the verification committee;
a user obtains a local intrusion detection model gradient by training on local data, encrypts the model gradient by adding a mask, compresses the encrypted model gradient, and sends the compressed and encrypted model gradient together with a digital signature to the associated neighboring blockchain node;
the blockchain nodes verify the signatures of the uploaded model gradient data and put valid model gradients that pass signature verification into a transaction pool; a leader selected from among the blockchain nodes aggregates the valid gradients in the transaction pool and adds the redundant masks, recovered by the backup committee, that were generated by users in abnormal situations, to obtain a global gradient; the leader creates a new block that records the global gradient and other key parameters and sends the new block to the verification committee;
the verification committee verifies the correctness of the global gradient and broadcasts the new block that passes verification to the whole network to reach consensus; the user updates the local intrusion detection model by downloading the latest block and obtaining the global gradient from it.
To verify the validity of the protocol, the following further explanation is made with reference to the test data:
Privacy analysis: if pairwise uniform random masks are added to a user's local gradient (as shown in equation (11)), and the masks cancel one another to 0 when all user gradients are added, then the masked user gradients can be regarded as uniformly random; that is, the pairwise masks protect the gradient privacy of each individual user.
Theorem 1: Given m, l, R, U, and $\{\Delta w_i\}_{i \in U}$, where m is the number of users, l is the length of the user gradient $\Delta w_i$, and U is the set of all users, assume that the gradients $\Delta w_i$, i ∈ U, of all users satisfy
Figure BDA0003469852790000141
Then
Figure BDA0003469852790000142
Where the symbol "≡" indicates that the two distributions are the same.
Proof: the theorem is proved by induction.
(1) When m = |U| = 1,
Figure BDA0003469852790000143
since it has already been assumed that
Figure BDA0003469852790000148
Then there is
Figure BDA0003469852790000144
In addition,
Figure BDA0003469852790000145
Therefore, expression (14) holds when m = 1.
(2) When m = |U| = k, assume that Theorem 1 holds; then from equation (14) we obtain
Figure BDA0003469852790000146
Figure BDA0003469852790000147
(3) When m = k + 1, define a new set U′ = U ∪ {k + 1}; then
Figure BDA0003469852790000151
Since formula (16) already gives
Figure BDA0003469852790000152
formula (17) can be written as
Figure BDA0003469852790000153
Since it has already been assumed that
Figure BDA0003469852790000154
Then it can be obtained
Figure BDA0003469852790000155
Thus, formula (18) can be written as
Figure BDA0003469852790000156
On the other hand,
Figure BDA0003469852790000157
According to formula (15), we have
Figure BDA0003469852790000158
Since it has been assumed that
Figure BDA0003469852790000159
it can be deduced that
Figure BDA00034698527900001510
In addition, since it has been assumed that
Figure BDA00034698527900001511
it can be deduced that
Figure BDA00034698527900001512
Based on equations (21) and (22), equation (20) can be written as:
Figure BDA00034698527900001513
Combining equations (19) and (23), Theorem 1 holds when m = k + 1. This completes the proof.
Resistance to Sybil attacks: in the private key sharing computation, the user's private key is secret-shared with the backup committee in order to tolerate user dropout. However, if an attacker can control more than t blockchain nodes in the backup committee through a Sybil attack, the attacker can generate enough false secret shares and witnesses to pass VSS verification and finally reconstruct a false private key, thereby disrupting gradient aggregation. The following analyzes how large the threshold t in VSS must be, at minimum, if the probability of an attacker breaking secret reconstruction is to be kept below a given value.
Since the backup committee is produced by the lottery algorithm based on the consistent hash protocol, the probability of each blockchain node being elected is proportional to its reputation value. Thus, the probability p that an attacker controls more than t blockchain nodes in the backup committee can be calculated as:
Figure BDA0003469852790000161
where n is the number of members in the backup committee and s is the proportion of reputation controlled by the attacker. Since the present invention assumes that at least 70% of the reputation value in the system is held by honest parties, s = 0.3. Equation (24) assumes that the number of attacker-controlled members follows a binomial distribution. However, the binomial distribution corresponds to sampling with replacement, whereas here each blockchain node can be elected only once, so p can be regarded as an upper bound on the probability that an attacker controls more than t nodes in the backup committee. By exhaustive search, the minimum value of the threshold t in VSS for which p is below a given probability is calculated; fig. 7 shows the minimum threshold t for p below 0.01, 0.05 and 0.001. p can be bounded at different levels according to the actual training situation to ensure the security of the method. For example, the number of training rounds of this scheme on the intelligence data set is usually within 100, so p should be less than 0.01. When the backup committee has 10 nodes, the minimum value of the threshold t is 8, so with high probability an attacker cannot disrupt the key reconstruction process of VSS through a Sybil attack.
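The exhaustive search for the minimum threshold can be sketched as follows. The sketch assumes that p is the binomial tail probability that at least t of the n committee members are attacker-controlled; this convention is an assumption, chosen because it reproduces the value t = 8 stated above for n = 10, s = 0.3, and p < 0.01.

```python
from math import comb

def tail(n, t, s):
    """P(X >= t) for X ~ Binomial(n, s)."""
    return sum(comb(n, i) * s**i * (1 - s)**(n - i) for i in range(t, n + 1))

def min_threshold(n, s, p_max):
    """Smallest t with P(X >= t) < p_max, found by exhaustive search."""
    for t in range(n + 1):
        if tail(n, t, s) < p_max:
            return t
    return None   # no threshold up to n meets the bound

print(min_threshold(10, 0.3, 0.01))   # 8
```

For n = 10 and s = 0.3 the tail probability is about 0.0106 at t = 7 and about 0.0016 at t = 8, which is why 8 is the first threshold that falls below 0.01.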
Based on the above experimental data, the secure aggregation scheme based on gradient masks and verifiable secret sharing enhances privacy and security when sharing intelligence data without reducing data utility; based on the binding and hiding properties of polynomial commitments, the new blockchain structure integrates federated learning model verification into the consensus process, so that tampering attacks by malicious blockchain nodes can be resisted; and gradient compression effectively reduces communication overhead, which facilitates application in practical scenarios.
Unless specifically stated otherwise, the relative steps, numerical expressions, and values of the components and steps set forth in these embodiments do not limit the scope of the present invention.
Based on the foregoing method and/or system, an embodiment of the present invention further provides a server, including: one or more processors; a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method described above.
Based on the above method and/or system, the embodiment of the invention further provides a computer readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the above method.
In all examples shown and described herein, any particular value should be construed as merely exemplary, and not as a limitation, and thus other examples of example embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
Finally, it should be noted that the above embodiments are only specific embodiments of the present invention, used to illustrate its technical solutions rather than to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone skilled in the art can, within the technical scope disclosed herein, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features; such modifications, changes, or substitutions do not depart from the spirit and scope of the embodiments of the present invention, and they shall all be construed as falling within its protection scope. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A verifiable secure sharing method for intelligence data based on blockchain and federated learning, used for joint modeling of an intrusion detection model by multiple users in network security defense, characterized in that the joint modeling process comprises the following:
a trusted authority distributes a public/private key pair to each user and each blockchain node and sends the key pairs to the users and blockchain nodes through a secure channel; each user secret-shares its own private key with a backup committee so that the user's private key data can be recovered if the user becomes abnormal, the backup committee consisting of several blockchain nodes;
a user obtains a local intrusion detection model gradient by training on local data, encrypts the model gradient by adding a mask, compresses the encrypted model gradient, and sends the compressed and encrypted model gradient together with a digital signature to the associated neighboring blockchain node;
the blockchain nodes verify the signatures of the uploaded model gradient data and put valid model gradients that pass signature verification into a transaction pool; a leader selected from among the blockchain nodes aggregates the valid gradients in the transaction pool and adds the redundant masks, recovered by the backup committee, that were generated by users in abnormal situations, to obtain a global gradient; the leader creates a new block that records the global gradient and other key parameters and sends the new block to a verification committee, the verification committee consisting of several blockchain nodes;
the verification committee verifies the correctness of the global gradient and broadcasts the new block that passes verification to the whole network to reach consensus; the user updates the local intrusion detection model by downloading the latest block and obtaining the latest global gradient from it.
2. The intelligence data verifiable security sharing method based on blockchain and federal learning of claim 1, wherein the global gradient in the joint modeling is iteratively obtained by setting a model convergence condition in an iteration round to synchronously iteratively update the user local intrusion detection model, wherein the model convergence condition is a maximum iteration round.
3. The method for verifiable secure sharing of intelligence data based on blockchain and federated learning according to claim 1 or 2, wherein users and blockchain nodes are incentivized to participate in the joint modeling of the intrusion detection model by assigning a reputation value to each user and each blockchain node; blockchain nodes are elected according to their reputation values to serve as the leader, the backup committee and the verification committee; and a blacklist is used to manage and restrict the participation rights, in the joint modeling, of users and blockchain nodes whose reputation values are below a threshold.
4. The method for verifiable secure sharing of intelligence data based on blockchain and federated learning according to claim 1, wherein, in the mask encryption of the model gradient data, a shared key between the user and each other user is calculated with the Diffie-Hellman protocol from the user's own private key and the other user's public key; the shared key is used as the seed of a random number generator to generate a random mask, and the user's local model gradient is encrypted with the random mask; the user embeds its private key in a selected polynomial and constructs a polynomial commitment using a verifiable secret sharing technique, splits the polynomial into n secret shares, and sends the secret shares of the polynomial, the secret share witnesses and the polynomial commitment to the backup committee, so that when the user goes offline or produces an illegal signature, the backup committee recovers the redundant random masks through key reconstruction, the secret share witnesses being used to verify that a secret share belongs to the committed polynomial.
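The pairwise masking described in claim 4 can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the toy Diffie-Hellman parameters `P` and `G`, the integer encoding modulo `Q`, the use of SHA-256 to derive a seed, Python's `random.Random` as a stand-in for a cryptographic generator, and the sign convention (add the mask toward peers with a larger identifier, subtract toward peers with a smaller one). The claim itself does not fix any of these choices.

```python
import hashlib
import random

# Toy parameters, assumed for illustration only (not secure in practice).
P = 2**127 - 1      # prime modulus for Diffie-Hellman
G = 3               # base element
Q = 2**32           # gradients are treated as integers modulo Q

def keygen(rng):
    """Return a (private, public) Diffie-Hellman key pair."""
    sk = rng.randrange(2, P - 1)
    return sk, pow(G, sk, P)

def pairwise_mask(sk_i, pk_j, length):
    """Expand the shared key of users i and j into a mask vector."""
    shared = pow(pk_j, sk_i, P)  # both users compute the same value
    digest = hashlib.sha256(str(shared).encode()).digest()
    prg = random.Random(int.from_bytes(digest, "big"))  # stand-in for a secure PRG
    return [prg.randrange(Q) for _ in range(length)]

def mask_gradient(i, grad, sk_i, public_keys):
    """Add the mask for peers with a larger id, subtract it for smaller ids."""
    out = list(grad)
    for j, pk_j in public_keys.items():
        if j == i:
            continue
        sign = 1 if i < j else -1
        mask = pairwise_mask(sk_i, pk_j, len(grad))
        out = [(x + sign * m) % Q for x, m in zip(out, mask)]
    return out
```

Under this convention the masks of every pair of users cancel when all masked gradients are summed modulo `Q`, so an aggregator learns only the sum. If one user's upload is missing, its masks remain in the other users' contributions, which is the redundant mask that the backup committee has to recover.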
5. The method of claim 4, wherein the encrypted model gradient is compressed using the Chinese remainder theorem (CRT), the compression process comprising: first, evenly dividing the user's encrypted model gradient into r segments, where
Figure FDA0003469852780000021
l is the length of the encrypted model gradient and k is a preset segment length; then compressing each model gradient segment into a single element using the solution of the system of k congruence equations, and obtaining the compression result of the whole model gradient from the elements corresponding to the segments.
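The compression step of claim 5 can be sketched as follows. This is a minimal illustration under stated assumptions: the moduli are hypothetical pairwise-coprime integers larger than any gradient entry, the segment count is taken as ceil(l / k) with zero padding of the last segment (the formula image is not reproduced in this text, so this is an assumption), and the function names are invented for the example.

```python
from math import prod

def crt_pack(values, moduli):
    """Solve x = values[t] (mod moduli[t]) for all t and return x mod M."""
    M = prod(moduli)
    x = 0
    for a, m in zip(values, moduli):
        Mi = M // m
        x += a * Mi * pow(Mi, -1, m)  # pow(..., -1, m) is the modular inverse
    return x % M

def crt_unpack(x, moduli):
    """Recover the k packed values from a single element."""
    return [x % m for m in moduli]

def compress(gradient, moduli):
    """Split a gradient into segments of k entries and pack each segment."""
    k = len(moduli)
    padded = list(gradient) + [0] * (-len(gradient) % k)  # zero-pad the last segment
    return [crt_pack(padded[s:s + k], moduli) for s in range(0, len(padded), k)]
```

With moduli `[101, 103, 107]`, a gradient of length 7 is packed into 3 elements. Packing is additive: the sum of two packed elements modulo M unpacks to the element-wise sum, provided each sum stays below its modulus, which is what allows aggregation to run directly on compressed gradients.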
6. The method for verifiable secure sharing of intelligence data based on blockchain and federated learning according to claim 4, wherein the blockchain nodes elect the leader using a reputation-based consistent hashing protocol, the election process comprising: setting up a hash ring; allocating to each blockchain node a portion of the hash ring space according to its reputation value; hashing the initial SHA-256 hash value of the current latest block and mapping the resulting hash value onto the hash ring; and determining, as the elected leader, the blockchain node whose hash ring space contains the mapped value.
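The election in claim 6 can be sketched as a reputation-weighted hash ring. The ring size of 2^256, the ordering of nodes by identifier, and the function names are assumptions made for this example; the claim only states that ring space is allocated according to reputation and that the hash of the latest block selects the leader.

```python
import hashlib
from bisect import bisect_right

RING_SIZE = 2**256  # assumed ring size, matching the SHA-256 output range

def build_ring(reputations):
    """Give each node an arc of the ring proportional to its reputation."""
    total = sum(reputations.values())
    boundaries, nodes = [], []
    acc = 0
    for node in sorted(reputations):  # fixed order so every node builds the same ring
        acc += reputations[node]
        boundaries.append(RING_SIZE * acc // total)  # end of this node's arc
        nodes.append(node)
    return boundaries, nodes

def elect_leader(latest_block_hash, reputations):
    """Hash the latest block hash onto the ring and return the owner of that arc."""
    boundaries, nodes = build_ring(reputations)
    point = int.from_bytes(hashlib.sha256(latest_block_hash).digest(), "big")
    return nodes[bisect_right(boundaries, point)]
```

Because every node derives the same ring from the same reputation table and the same block hash, all nodes arrive at the same leader without further communication, and a node with a higher reputation owns a larger arc and is therefore selected more often.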
7. The method for verifiable secure sharing of intelligence data based on blockchain and federated learning according to claim 6, wherein, with the set of all users denoted U and the set of abnormal users that have gone offline or produced illegal signatures denoted V, the process by which the leader aggregates the legal gradient data in the transaction pool is expressed as:
Figure FDA0003469852780000022
where CRT denotes the compression operation,
Figure FDA0003469852780000023
is the mask-encrypted model gradient of user i in the transaction pool, and Δw'_i is the compressed model gradient of user i.
8. The method for verifiable secure sharing of intelligence data based on blockchain and federated learning according to claim 1 or 7, wherein, for a user in an abnormal situation, a plurality of blockchain nodes in the backup committee first submit their secret shares of the abnormal user and the correctness of the shares is verified; the polynomial and the private key of the abnormal user are recovered by interpolation; and the redundant random masks are then calculated from the shared keys between the other users and the abnormal user, so that the global gradient is recovered.
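The key recovery in claim 8 rests on polynomial interpolation. The sketch below uses plain Shamir secret sharing with Lagrange interpolation at zero; the prime field, the threshold parameter `t`, and the function names are assumptions for illustration, and the verification of shares against witnesses and the polynomial commitment is deliberately left out.

```python
import random

PRIME = 2**127 - 1  # assumed field modulus

def split_secret(secret, n, t, rng):
    """Hide the secret as the constant term of a random degree t-1 polynomial."""
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(t - 1)]

    def f(x):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation
            y = (y * x + c) % PRIME
        return y

    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Recover f(0) from at least t shares by Lagrange interpolation."""
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

Any `t` committee members can pool their shares to rebuild the abnormal user's private key, after which the pairwise shared keys with that user, and hence the redundant masks, can be recomputed from the public keys of the remaining users.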
9. The method for verifiable secure sharing of intelligence data based on blockchain and federated learning according to claim 8, wherein, in the correctness verification, whether the model gradients in the transaction pool have been tampered with is determined from the additive homomorphism of the polynomial commitment; if they have not been tampered with, the new block created by the leader is judged legal; verification passes when the proportion of blockchain nodes in the verification committee that judge the new block legal reaches a preset value; and an invalid empty block is generated if the proportion is smaller than the preset value.
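The check in claim 9 relies on commitments being additively homomorphic. The sketch below does not implement a polynomial commitment; it swaps in a much simpler exponentiation commitment, g^m mod p, purely to show the homomorphic property and the committee vote. The parameters `P` and `G`, the function names, and the example threshold are all assumptions, and this toy commitment does not hide the committed value.

```python
P = 2**127 - 1  # assumed prime modulus
G = 3           # assumed base element

def commit(value):
    """Toy commitment g^value mod p (binding sketch only, not hiding)."""
    return pow(G, value, P)

def gradient_untampered(user_commitments, aggregated_value):
    """The product of the commitments must equal the commitment to the sum."""
    product = 1
    for c in user_commitments:
        product = product * c % P
    return product == commit(aggregated_value)

def block_accepted(votes, threshold):
    """Accept the block when the share of approving verifiers reaches the threshold."""
    return sum(votes) / len(votes) >= threshold
```

A verifier multiplies the users' published commitments and compares the result with the commitment to the leader's aggregate; a mismatch indicates that some gradient was altered. Each verifier then casts a vote, and the block is accepted only if the approving proportion reaches the preset value.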
10. A system for verifiable secure sharing of intelligence data based on blockchain and federated learning, used for joint modeling of an intrusion detection model by multiple users in network security defense, characterized by comprising: user nodes for performing local model training in the joint modeling; blockchain nodes for performing consensus operations on the local model training parameters of the user nodes; a trusted authority for distributing public-private key pairs to the user nodes and the blockchain nodes; and a backup committee and a verification committee, each consisting of a plurality of blockchain nodes, wherein each user secret-shares its own private key with the backup committee so that the user's private key information can be recovered in abnormal situations, and the correctness of the aggregated global gradient is verified by the verification committee;
each user obtains a local intrusion detection model gradient by training on local data, encrypts the model gradient by adding a mask, compresses the encrypted model gradient, and sends the compressed, encrypted model gradient together with a digital signature to its associated adjacent blockchain node;
the blockchain nodes verify the signatures of the uploaded model gradient data and place the legal model gradients that pass signature verification into a transaction pool; a leader elected from among the blockchain nodes aggregates the legal gradients in the transaction pool and adds the redundant masks of abnormal users, as recovered by the backup committee, to obtain a global gradient; the leader creates a new block recording the global gradient and other key parameters and sends the global gradient to the verification committee in that block;
the verification committee verifies the correctness of the global gradient and broadcasts the new block that passes verification to the whole network to reach consensus; each user updates its local intrusion detection model by downloading the latest block and obtaining the latest global gradient from it.
CN202210040143.XA 2022-01-14 2022-01-14 Information data safe sharing method and system based on block chain and federal learning Active CN114338045B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210040143.XA CN114338045B (en) 2022-01-14 2022-01-14 Information data safe sharing method and system based on block chain and federal learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210040143.XA CN114338045B (en) 2022-01-14 2022-01-14 Information data safe sharing method and system based on block chain and federal learning

Publications (2)

Publication Number Publication Date
CN114338045A true CN114338045A (en) 2022-04-12
CN114338045B CN114338045B (en) 2023-06-23

Family

ID=81025878

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210040143.XA Active CN114338045B (en) 2022-01-14 2022-01-14 Information data safe sharing method and system based on block chain and federal learning

Country Status (1)

Country Link
CN (1) CN114338045B (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114726551A (en) * 2022-06-06 2022-07-08 广州优刻谷科技有限公司 Meta-universe credit assessment method and device based on federal management
CN114760023A (en) * 2022-04-19 2022-07-15 光大科技有限公司 Model training method and device based on federal learning and storage medium
CN115021905A (en) * 2022-05-24 2022-09-06 北京交通大学 Method for aggregating parameters of local model for federated learning
CN115549901A (en) * 2022-09-29 2022-12-30 江苏大学 Batch aggregation method for federal learning in Internet of vehicles
CN116016610A (en) * 2023-03-21 2023-04-25 杭州海康威视数字技术股份有限公司 Block chain-based Internet of vehicles data secure sharing method, device and equipment
CN116402169A (en) * 2023-06-09 2023-07-07 山东浪潮科学研究院有限公司 Federal modeling verification method, federal modeling verification device, federal modeling verification equipment and storage medium
CN116489637A (en) * 2023-04-25 2023-07-25 北京交通大学 Mobile edge computing method oriented to meta universe and based on privacy protection
CN116822661A (en) * 2023-08-30 2023-09-29 山东省计算中心(国家超级计算济南中心) Privacy protection verifiable federal learning method based on double-server architecture
CN116828453A (en) * 2023-06-30 2023-09-29 华南理工大学 Unmanned aerial vehicle edge computing privacy protection method based on self-adaptive nonlinear function
CN116895375A (en) * 2023-09-08 2023-10-17 南通大学附属医院 Medical instrument management traceability method and system based on data sharing
CN117272389A (en) * 2023-11-14 2023-12-22 信联科技(南京)有限公司 Non-interactive verifiable joint safety modeling method
WO2024016548A1 (en) * 2022-07-20 2024-01-25 天津科技大学 Blockchain-based ai model training method
CN117521151A (en) * 2024-01-05 2024-02-06 齐鲁工业大学(山东省科学院) Block chain-based decentralization federation learning data sharing method

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190295073A1 (en) * 2018-03-22 2019-09-26 Via Science, Inc. Secure data processing transactions
CN111552986A (en) * 2020-07-10 2020-08-18 鹏城实验室 Block chain-based federal modeling method, device, equipment and storage medium
CN112217626A (en) * 2020-08-24 2021-01-12 中国人民解放军战略支援部队信息工程大学 Network threat cooperative defense system and method based on intelligence sharing
CN112395640A (en) * 2020-11-16 2021-02-23 国网河北省电力有限公司信息通信分公司 Industry Internet of things data lightweight credible sharing technology based on block chain
CN112434280A (en) * 2020-12-17 2021-03-02 浙江工业大学 Block chain-based federal learning defense method
CN113095510A (en) * 2021-04-14 2021-07-09 深圳前海微众银行股份有限公司 Block chain-based federal learning method and device
CN113704810A (en) * 2021-04-01 2021-11-26 华中科技大学 Federated learning oriented chain-crossing consensus method and system
CN113794675A (en) * 2021-07-14 2021-12-14 中国人民解放军战略支援部队信息工程大学 Distributed Internet of things intrusion detection method and system based on block chain and federal learning
CN113873534A (en) * 2021-10-15 2021-12-31 重庆邮电大学 Block chain assisted federal learning active content caching method in fog calculation
CN113886817A (en) * 2021-10-19 2022-01-04 国网山东省电力公司济宁供电公司 Host intrusion detection method and device, electronic equipment and storage medium

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190295073A1 (en) * 2018-03-22 2019-09-26 Via Science, Inc. Secure data processing transactions
CN111552986A (en) * 2020-07-10 2020-08-18 鹏城实验室 Block chain-based federal modeling method, device, equipment and storage medium
CN112217626A (en) * 2020-08-24 2021-01-12 中国人民解放军战略支援部队信息工程大学 Network threat cooperative defense system and method based on intelligence sharing
CN112395640A (en) * 2020-11-16 2021-02-23 国网河北省电力有限公司信息通信分公司 Industry Internet of things data lightweight credible sharing technology based on block chain
CN112434280A (en) * 2020-12-17 2021-03-02 浙江工业大学 Block chain-based federal learning defense method
CN113704810A (en) * 2021-04-01 2021-11-26 华中科技大学 Federated learning oriented chain-crossing consensus method and system
CN113095510A (en) * 2021-04-14 2021-07-09 深圳前海微众银行股份有限公司 Block chain-based federal learning method and device
CN113794675A (en) * 2021-07-14 2021-12-14 中国人民解放军战略支援部队信息工程大学 Distributed Internet of things intrusion detection method and system based on block chain and federal learning
CN113873534A (en) * 2021-10-15 2021-12-31 重庆邮电大学 Block chain assisted federal learning active content caching method in fog calculation
CN113886817A (en) * 2021-10-19 2022-01-04 国网山东省电力公司济宁供电公司 Host intrusion detection method and device, electronic equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
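YUWEI SUN et al.: "Blockchain-Based Federated Learning Against End-Point Adversarial Data Corruption", 2020 19TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA) *
刘俊旭; 孟小峰: "Survey on Privacy-Preserving Machine Learning", Journal of Computer Research and Development, no. 02 *
董业; 侯炜; 陈小军; 曾帅: "Efficient and Secure Federated Learning Based on Secret Sharing and Gradient Selection", Journal of Computer Research and Development, no. 10 *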

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114760023A (en) * 2022-04-19 2022-07-15 光大科技有限公司 Model training method and device based on federal learning and storage medium
CN115021905A (en) * 2022-05-24 2022-09-06 北京交通大学 Method for aggregating parameters of local model for federated learning
CN114726551A (en) * 2022-06-06 2022-07-08 广州优刻谷科技有限公司 Meta-universe credit assessment method and device based on federal management
CN114726551B (en) * 2022-06-06 2022-08-16 广州优刻谷科技有限公司 Meta-universe credit assessment method and device based on federal management
WO2024016548A1 (en) * 2022-07-20 2024-01-25 天津科技大学 Blockchain-based ai model training method
CN115549901B (en) * 2022-09-29 2024-03-22 江苏大学 Batch aggregation method for federal learning in Internet of vehicles environment
CN115549901A (en) * 2022-09-29 2022-12-30 江苏大学 Batch aggregation method for federal learning in Internet of vehicles
CN116016610B (en) * 2023-03-21 2024-01-09 杭州海康威视数字技术股份有限公司 Block chain-based Internet of vehicles data secure sharing method, device and equipment
CN116016610A (en) * 2023-03-21 2023-04-25 杭州海康威视数字技术股份有限公司 Block chain-based Internet of vehicles data secure sharing method, device and equipment
CN116489637A (en) * 2023-04-25 2023-07-25 北京交通大学 Mobile edge computing method oriented to meta universe and based on privacy protection
CN116489637B (en) * 2023-04-25 2023-11-03 北京交通大学 Mobile edge computing method oriented to meta universe and based on privacy protection
CN116402169A (en) * 2023-06-09 2023-07-07 山东浪潮科学研究院有限公司 Federal modeling verification method, federal modeling verification device, federal modeling verification equipment and storage medium
CN116402169B (en) * 2023-06-09 2023-08-15 山东浪潮科学研究院有限公司 Federal modeling verification method, federal modeling verification device, federal modeling verification equipment and storage medium
CN116828453A (en) * 2023-06-30 2023-09-29 华南理工大学 Unmanned aerial vehicle edge computing privacy protection method based on self-adaptive nonlinear function
CN116828453B (en) * 2023-06-30 2024-04-16 华南理工大学 Unmanned aerial vehicle edge computing privacy protection method based on self-adaptive nonlinear function
CN116822661B (en) * 2023-08-30 2023-11-14 山东省计算中心(国家超级计算济南中心) Privacy protection verifiable federal learning method based on double-server architecture
CN116822661A (en) * 2023-08-30 2023-09-29 山东省计算中心(国家超级计算济南中心) Privacy protection verifiable federal learning method based on double-server architecture
CN116895375B (en) * 2023-09-08 2023-12-01 南通大学附属医院 Medical instrument management traceability method and system based on data sharing
CN116895375A (en) * 2023-09-08 2023-10-17 南通大学附属医院 Medical instrument management traceability method and system based on data sharing
CN117272389A (en) * 2023-11-14 2023-12-22 信联科技(南京)有限公司 Non-interactive verifiable joint safety modeling method
CN117272389B (en) * 2023-11-14 2024-04-02 信联科技(南京)有限公司 Non-interactive verifiable joint safety modeling method
CN117521151A (en) * 2024-01-05 2024-02-06 齐鲁工业大学(山东省科学院) Block chain-based decentralization federation learning data sharing method
CN117521151B (en) * 2024-01-05 2024-04-09 齐鲁工业大学(山东省科学院) Block chain-based decentralization federation learning data sharing method

Also Published As

Publication number Publication date
CN114338045B (en) 2023-06-23

Similar Documents

Publication Publication Date Title
CN114338045B (en) Information data safe sharing method and system based on block chain and federal learning
KR102627049B1 (en) Computer-implemented method for generating threshold vaults
KR102627039B1 (en) Threshold digital signature method and system
Miao et al. Privacy-preserving Byzantine-robust federated learning via blockchain systems
CN112600675B (en) Electronic voting method and device based on group signature, electronic equipment and storage medium
CN110599164B (en) Supervision-capable quick payment method for any payee under chain
CN115037477A (en) Block chain-based federated learning privacy protection method
Neji et al. Distributed key generation protocol with a new complaint management strategy
CN110830244A (en) Anti-quantum computing vehicle networking method and system based on identity secret sharing and alliance chain
CN117201132A (en) Multi-committee attribute base encryption method capable of achieving complete decentralization and application of multi-committee attribute base encryption method
CN110740034B (en) Method and system for generating QKD network authentication key based on alliance chain
CN111340488A (en) Method and device for generating monitorable secret transaction amount
Wang et al. Dynamic threshold changeable multi‐policy secret sharing scheme
CN114553883A (en) Cloud edge terminal cooperative data acquisition and privacy protection method and system based on block chain
Ma et al. Toward data authenticity and integrity for blockchain-based mobile edge computing
Tornos et al. Optimizing ring signature keys for e-voting
CN110929872B (en) Anti-quantum computing private key backup, loss reporting and recovery method and system
CN116633560B (en) Privacy protection and supervision method for block chain multicast transaction mode
CN110999207B (en) Computer-implemented method of generating a threshold library
Wang et al. Towards Efficient and Secure Verifiable Aggregation for Federated Learning
Sharma et al. A Usable Enhanced Dynamic BFT Protocol
Cheng et al. A blockchain-enabled decentralized access control scheme using multi-authority attribute-based encryption for edge-assisted Internet of Things
Sathya et al. Quantum Protocols for Hash‐Based Blockchain
CN114417419A (en) Outsourcing cloud storage medical data aggregation method with security authorization and privacy protection
Sharma et al. Multisecret‐sharing scheme with two‐level security and its applications in blockchain

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant