CN111915294A

CN111915294A - Safety, privacy protection and tradable distributed machine learning framework based on block chain technology

Info

Publication number: CN111915294A
Application number: CN202010496847.9A
Authority: CN
Inventors: 曹向辉; 梁伦
Original assignee: Southeast University
Current assignee: Southeast University
Priority date: 2020-06-03
Filing date: 2020-06-03
Publication date: 2020-11-10
Anticipated expiration: 2040-06-03
Also published as: CN111915294B

Abstract

The invention discloses a blockchain-based secure, privacy-preserving and tradable distributed machine learning framework comprising the following parts: a certificate authority (CA), which issues and revokes digital certificates for blockchain nodes and manages node permissions; blockchain nodes, which maintain the machine learning model and participate in machine learning model transactions; smart contracts, which define the operating rules of distributed machine learning and divide revenue among the nodes according to their model contribution; a distributed ledger, which records model data and model transaction data during machine learning model training; and data providers, which collect local data and upload it to the blockchain node server.

Description

Safety, privacy protection and tradable distributed machine learning framework based on block chain technology

Technical Field

The invention relates to a blockchain-based secure, privacy-preserving and tradable distributed machine learning framework, and in particular to a framework that uses blockchain (alliance chain) technology to counter Byzantine attacks in distributed machine learning, uses differential privacy to protect the data set privacy of each participant, and supports trading of the trained machine learning model. It belongs to the fields of artificial intelligence, blockchain and information security.

Background

In the parameter server framework commonly used in distributed machine learning, multiple working nodes each train on local data with the current global model to obtain a local model and send it to the parameter server, which aggregates all local models and updates the global model. This process has security problems, because both the working nodes and the parameter server node may suffer Byzantine attacks. Specifically, a working node under Byzantine attack sends a wrong local gradient to the parameter server and thereby degrades the finally trained model; a parameter server node under Byzantine attack aggregates a wrong global model and renders the previous training useless. In recent years, researchers have applied blockchain to fields such as the Internet of Things, healthcare and finance to solve security and transaction problems in those fields, because a blockchain is tamper-resistant, traceable, stored in a distributed manner and jointly maintained.

To date, research on Byzantine attacks in distributed machine learning has achieved some success. However, the following problems remain: 1) existing distributed machine learning algorithms do not consider that the parameter server itself may be under Byzantine attack when aggregating models; 2) it is unclear how to handle detected Byzantine nodes so that they cannot keep interfering with model training; 3) it is unclear how to implement an incentive mechanism in a blockchain system combined with distributed machine learning so that the system runs more efficiently. A new solution to these technical problems is therefore urgently needed.

Disclosure of Invention

The invention aims to solve the following problems for distributed machine learning: an algorithm is needed to handle Byzantine attacks on both the working nodes and the parameter server node; if blockchain technology is introduced, the consensus problem in the blockchain must be solved; and an effective incentive mechanism is needed so that the blockchain system operates effectively and sustainably.

To solve the above technical problem, the present invention provides a blockchain-based secure, privacy-preserving and tradable distributed machine learning framework, which comprises: part 1, a certificate authority (CA), responsible for issuing and revoking digital certificates for blockchain nodes and managing node permissions; part 2, blockchain nodes, consisting of user nodes and transaction nodes, which are respectively responsible for maintaining the machine learning model and participating in machine learning model transactions; part 3, smart contracts, consisting of a machine learning smart contract (MLSC) and a model contribution smart contract (MCSC), which respectively define the operating rules of distributed machine learning and divide revenue among the nodes according to their model contribution; part 4, a distributed ledger, which records model data (including the local models and the global model) and model transaction data during machine learning model training; and part 5, data providers, responsible for collecting local data and uploading it to the blockchain node server. In this scheme, the CA reviews, supervises and manages the permissions of every node that applies to join the system, which prevents malicious nodes from joining to a certain extent and helps secure the system. Both transaction nodes and user nodes that join later must pay an entry fee (the model transaction fee). A transaction node exits the system after synchronizing the block information.
If a user node is identified as malicious, it is removed from the system; its entry fee is not refunded and it receives no share of subsequent model transaction fees, which penalizes malicious nodes. The rules of the smart contracts are open to all user nodes, and their content is difficult for malicious nodes to tamper with. The distributed ledger records model data and model transaction data during machine learning model training, which guarantees data traceability: all malicious data is recorded, so the security of the system is ensured to a certain extent. If the nodes of the system do not need data set privacy protection, Gaussian noise need not be added to the local gradient; moreover, many methods exist for protecting data set privacy, and the framework can switch to another one if it is more suitable.

An operating method for the blockchain-based secure, privacy-preserving and tradable distributed machine learning framework comprises the following steps:

step 1, alliance chain initialization stage: the CA server issues digital certificates to the initial nodes of the alliance chain, and all participants establish connections and reach some initial consensus;

step 2, parameter initialization stage: all user nodes reach consensus on the neural network model and synchronize the test set data of the system;

step 3, local gradient calculation stage: the user nodes take turns serving as the master node in ascending order of id, and endorsement nodes are chosen from the remaining nodes; each node then computes a local gradient from its local data and the current model, adds Gaussian noise to the gradient so that it satisfies differential privacy, and finally sends the local gradient to the master node and the endorsement nodes;

step 4, global model updating stage: the master node computes a global gradient from the local gradients of the nodes using a Byzantine-fault-tolerant gradient aggregation algorithm; the system then runs the IPBFT consensus algorithm, and if the system reaches consensus on the global gradient, the global model is updated and the related information is written into a block;

step 5, training termination stage: when the trained model meets the expected requirements, the system stops training, and its subsequent role is to maintain model transactions.

As an improvement of the invention, step 1: the alliance chain initialization stage specifically comprises the following steps:

the CA server issues digital certificates to the initial nodes of the alliance chain, and all participants establish connections and reach some initial consensus: a. a unified standard for the participants' data sets; b. a unified standard for the model transaction fee; c. unified selection rules for the master node and the endorsement nodes.

As an improvement of the present invention, step 2, the parameter initialization stage, is as follows: all user nodes reach consensus on the neural network model, including its network structure, the batch size B, the number of training iterations T, the learning rate η_t, the initial weight w₀, the clipping threshold C, the noise scale σ and other parameters. Meanwhile, the blockchain nodes issue the data set standard to the data providers, who collect the training sets and upload them to the blockchain nodes. When both the neural network model and the data sets are ready, all user nodes contribute test data and the system's test set is unified. The entire system can then begin neural network model training.

As an improvement of the present invention, step 3: the local gradient calculation stage is as follows:

first, all user nodes in the blockchain determine the master node and the endorsement nodes: if the id of the master node is i, the ids of the endorsement nodes are i+1, i+2, …, i+m. Each node then computes a local gradient from its own data set and the current model, applies differential privacy to the local gradient, and sends it to the master node and the endorsement nodes;

the specific calculation process is as follows: suppose that in the t-th iteration the k-th node draws B training samples,

the global model weight is w_t, the clipping threshold is C, and the noise scale is σ;

in the t-th iteration, the local gradient of each sample of the k-th working node is

where the model prediction is

and l() is the loss function;

the local gradient is then clipped and Gaussian noise is added, so that the local gradient g_k(w_t) of the k-th node is finally obtained as

Finally, each node sends its local gradient to the master node and the endorsement nodes.
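The formulas for this step are not reproduced in the text. As a sketch only, the standard clip-then-add-Gaussian-noise construction used in differentially private SGD matches the quantities named here (B, w_t, C, σ, l()); the sample notation x_i, y_i, the prediction function f and the exact placement of the 1/B factor are assumptions and may differ from the original formulas:

```latex
% Per-sample gradient of the k-th node in iteration t (x_i, y_i, f are assumed notation)
g_k(x_i) = \nabla_{w_t}\, l\big(f(x_i; w_t),\, y_i\big), \qquad i = 1, \dots, B

% Clipping to L2 norm at most C
\bar{g}_k(x_i) = g_k(x_i) \Big/ \max\!\Big(1,\ \frac{\lVert g_k(x_i) \rVert_2}{C}\Big)

% Noisy averaged local gradient sent to the master node and the endorsement nodes
g_k(w_t) = \frac{1}{B} \Big( \sum_{i=1}^{B} \bar{g}_k(x_i) + \mathcal{N}\big(0,\ \sigma^2 C^2 I\big) \Big)
```

Clipping bounds the influence of any single sample by C, which is what allows Gaussian noise with standard deviation σC to provide a differential privacy guarantee.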

As an improvement of the present invention, step 4, the global model updating stage, is as follows: after receiving the local gradients of the nodes, the master node runs a Byzantine-fault-tolerant gradient aggregation algorithm to aggregate the local gradients into a global gradient and update the model, while a moments accountant tracks the privacy loss. The system then runs the IPBFT consensus algorithm: the master node first writes the aggregation result (including the master node id, the aggregated gradient, the differential privacy loss, the ids of the selected nodes and the local gradient information) into a block block_t and sends block_t to the endorsement nodes for verification; if verification passes, block_t is broadcast to all blockchain nodes and is successfully appended to the blockchain.

In step 4, the block chain consensus algorithm IPBFT can effectively verify the gradient aggregation result and effectively identify malicious nodes, and meanwhile, the algorithm is applicable to an alliance chain, and compared with a public chain consensus algorithm (such as PoW, PoS, PoET and the like), the algorithm can complete transaction confirmation in a shorter time and has lower communication complexity.

Compared with the prior art, the invention has the following advantages: 1) the blockchain-based distributed machine learning framework is highly practical and can be used with any gradient-descent-based distributed machine learning algorithm; 2) the invention uses the CA to manage the permissions of blockchain nodes (both transaction nodes and user nodes) effectively: for a transaction node, the CA can collect its machine learning model transaction fee and control the validity period of its permission; for a malicious node, the CA can revoke its user permission, which prevents it from damaging the machine learning model; 3) the proposed IPBFT consensus algorithm protects the aggregation process of the parameter server node against Byzantine attacks while identifying and removing malicious nodes, which makes the system more secure; 4) the invention implements an incentive mechanism on the blockchain: smart contracts deployed on the blockchain distribute the model transaction fee fairly; 5) the invention adds differential privacy to distributed machine learning and can thus protect the data set privacy of the system participants.

Drawings

FIG. 1 is a block chain technique based distributed machine learning framework proposed by the present invention;

FIG. 2 is a diagram of the CA framework of the present invention;

FIG. 3 is a flow chart of the operation of the present invention;

FIG. 4 is a schematic diagram of a normal condition;

FIG. 5 is a schematic diagram of the abnormal condition in which the master node is malicious;

FIG. 6 is a schematic diagram of the abnormal condition in which the master node is honest but too few endorsement nodes are honest;

FIG. 7 is a schematic diagram of an extremely malicious situation;

fig. 8 is a comparison graph of the number of nodes along with the change of the number of iterations when 20 of 100 nodes in a block chain are attacked by byzantine attack and an IPBFT algorithm and a PoW algorithm are respectively run in a gradient aggregation process according to a second embodiment of the present invention.

Fig. 9 is a schematic diagram illustrating comparison of accuracy of test sets of models obtained by different aggregation methods after 8 nodes in 20 nodes in a blockchain are subjected to a byzantine attack when local gradient calculation (without introducing differential privacy) is performed according to a second embodiment of the present invention.

Fig. 10 is a schematic diagram illustrating comparison of accuracy of test sets of models obtained by different aggregation methods after 8 nodes in 20 nodes in a blockchain are subjected to a byzantine attack when local gradient calculation (introducing differential privacy) is performed according to a second embodiment of the present invention.

Detailed Description

The following detailed description of the embodiments of the present invention will be provided with reference to the drawings and examples, so that how to apply the technical means to solve the technical problems and achieve the technical effects can be fully understood and implemented. It should be noted that, as long as there is no conflict, the embodiments and the features of the embodiments of the present invention may be combined with each other, and the technical solutions formed are within the scope of the present invention.

Example 1: FIG. 1 shows the blockchain-based secure, tradable distributed machine learning framework proposed by the present invention. The components of the framework are described in detail below with reference to FIG. 1.

A secure, privacy-preserving, tradable distributed machine learning framework based on blockchain techniques, the framework comprising:

part 1: a certificate authority CA;

The CA is responsible for issuing and revoking digital certificates for the blockchain nodes and for managing node permissions. It must be trusted by all blockchain nodes and is also supervised by them. Its structure is shown in FIG. 2. For security, the CA uses the common certificate chain arrangement of a root CA and intermediate CAs. The root CA does not issue certificates to servers directly; instead it generates two intermediate CAs (a user CA and a trader CA), which act on behalf of the root CA in issuing certificates to applying clients and thereby reduce the administrative burden on the root CA.

Part 2: a block chain node;

Within the framework of the invention there are two types of blockchain nodes: transaction nodes and user nodes.

A transaction node is a temporary node that joins the blockchain network on behalf of an external user who wants to obtain the trained model. After the transaction node obtains permission from the CA to join the blockchain, it performs block synchronization once; when synchronization finishes, its digital certificate is revoked and the node exits the network.

The user nodes are the main components of the blockchain network. They maintain and train the machine learning model and package data into the distributed ledger of the blockchain. Each user node provides local gradient calculation, global model aggregation, accounting, block verification and related functions.

Part 3: an intelligent contract;

In the framework of the invention there are two smart contracts, namely the Machine Learning Smart Contract (MLSC) and the Model Contribution Smart Contract (MCSC).

The MLSC specifies the operating rules of distributed machine learning, including local gradient calculation, global model calculation, the IPBFT consensus mechanism, and so on.

The MCSC calculates the model contribution of each node by inspecting the ledger information in the blockchain and divides the model transaction fee according to contribution. The accounting node that writes the transaction information into the blockchain receives an accounting commission.

The contribution C_i of the i-th node is calculated as follows:

C_i = c₁·l_i + c₂·g_i,

where l_i is the number of times the node participated in global gradient computation, g_i is the number of times the node contributed a local gradient, and c₁ and c₂ are the contribution coefficients of global gradient computation and local gradient computation, respectively.

The model transaction fee F equals the accounting commission R plus the sum of the model contribution revenues R_i of all nodes. The model contribution revenue R_i of each node is therefore calculated as follows:

where K is the total number of user nodes.
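The revenue formula itself is not reproduced in the text. A minimal sketch, assuming the pool F − R is split in proportion to each node's contribution, that is R_i = (F − R)·C_i / (C_1 + … + C_K), is given below. The proportional rule, the function names and the example numbers are illustrative assumptions; only C_i = c₁·l_i + c₂·g_i and F = R + ΣR_i come from the description.

```python
from typing import Dict, Tuple


def contribution(l_i: int, g_i: int, c1: float, c2: float) -> float:
    """C_i = c1 * l_i + c2 * g_i, as defined in the description."""
    return c1 * l_i + c2 * g_i


def split_fee(fee: float, accounting_commission: float,
              counts: Dict[str, Tuple[int, int]],
              c1: float, c2: float) -> Dict[str, float]:
    """Split (F - R) among nodes in proportion to C_i (assumed rule).

    counts maps a node id to (l_i, g_i): the number of global gradient
    computations and the number of local gradients the node contributed.
    """
    scores = {node: contribution(l, g, c1, c2) for node, (l, g) in counts.items()}
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("no contribution recorded in the ledger")
    pool = fee - accounting_commission
    return {node: pool * score / total for node, score in scores.items()}


# Hypothetical ledger summary: three user nodes, F = 100, R = 10.
payout = split_fee(100.0, 10.0,
                   {"n1": (2, 10), "n2": (1, 10), "n3": (0, 10)},
                   c1=5.0, c2=1.0)
print(payout)  # n1 -> 40.0, n2 -> 30.0, n3 -> 20.0
```

With this rule the payouts always sum to F − R, so the identity F = R + ΣR_i stated in the description holds by construction.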

Part 4: a distributed ledger;

The distributed ledger records model data (including the local models and the global model) and model transaction data during machine learning model training. This guarantees data traceability: all malicious data is recorded, so the security of the system is ensured to a certain extent.

Part 5: a data provider;

The data provider is responsible for collecting local data and uploading it to the blockchain node server.

Example 2: a running method of a safe, privacy-protecting and transactable distributed machine learning framework based on a block chain technology comprises the following steps:

fig. 3 is a flow chart of the operation of the framework of the invention, and each stage of the operation of the system is explained in detail below with reference to fig. 3.

Step 1: alliance chain initialization stage;

The CA server issues digital certificates to the initial nodes of the alliance chain, and all participants establish connections and reach some initial consensus: a. a unified standard for the participants' data sets (e.g., all images must follow the MNIST handwritten digit data set format); b. a unified standard for the model transaction fee; c. unified selection rules for the master node and the endorsement nodes (the master node is selected cyclically in ascending order of node id, and the m nodes whose ids follow the master node id are selected as endorsement nodes; if fewer than m nodes have ids larger than the master node id, the remainder are taken in order starting from the smallest id).
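The selection rule in item c can be sketched as a round-robin over the sorted node ids with wrap-around. The function name `select_roles` and the use of a 0-based round counter are illustrative assumptions; the description only fixes the ordering and the wrap-around behaviour.

```python
from typing import List, Tuple


def select_roles(node_ids: List[int], round_index: int, m: int) -> Tuple[int, List[int]]:
    """Pick the master node and m endorsement nodes for one consensus round.

    The master is chosen cyclically in ascending id order; the endorsers are
    the next m ids after the master, wrapping around to the smallest id.
    """
    ordered = sorted(node_ids)
    k = len(ordered)
    if not 0 <= m < k:
        raise ValueError("m must be between 0 and the number of other nodes")
    pos = round_index % k
    master = ordered[pos]
    endorsers = [ordered[(pos + j) % k] for j in range(1, m + 1)]
    return master, endorsers


# Five user nodes with ids 1..5 and m = 2 endorsement nodes per round.
for r in range(5):
    print(r, select_roles([1, 2, 3, 4, 5], r, m=2))
```

In the fourth round (index 3) the master is node 4 and only one larger id remains, so the second endorser wraps around to node 1.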

Step 2: a parameter initialization stage;

In the parameter initialization stage, all user nodes reach consensus on the neural network model, including its network structure, the batch size B, the number of training iterations T, the learning rate η_t, the initial weight w₀, the clipping threshold C, the noise scale σ and other parameters. Meanwhile, the blockchain nodes issue the data set standard to the data providers, who collect the training sets and upload them to the blockchain nodes.

When the neural network model and the data sets are ready, all user nodes contribute test data and the system's test set is unified. The entire system can then begin neural network model training.

Step 3: local gradient calculation stage;

First, all user nodes in the blockchain determine the master node and the endorsement nodes: if the id of the master node is i, the ids of the endorsement nodes are i+1, i+2, …, i+m. Each node then computes a local gradient from its own data set and the current model, adds Gaussian noise to the local gradient so that it satisfies differential privacy, and finally sends it to the master node and the endorsement nodes.

The global model weight is w_t, the clipping threshold is C, and the noise scale is σ.

where the model prediction is

and l() is the loss function.
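The clip-and-add-noise step can be sketched in plain Python as follows. This follows the usual differentially private SGD recipe (clip each per-sample gradient to L2 norm C, sum, add Gaussian noise with standard deviation σ·C per coordinate, divide by B); that recipe, the function names and the `rng` parameter are assumptions for illustration and not necessarily the exact formula of the invention. Per-sample gradients are taken as given lists of floats.

```python
import math
import random
from typing import List, Sequence


def clip(grad: Sequence[float], c: float) -> List[float]:
    """Scale grad so that its L2 norm is at most c."""
    norm = math.sqrt(sum(x * x for x in grad))
    scale = max(1.0, norm / c)
    return [x / scale for x in grad]


def private_local_gradient(per_sample_grads: Sequence[Sequence[float]],
                           c: float, sigma: float,
                           rng: random.Random) -> List[float]:
    """Clip each per-sample gradient, sum, add Gaussian noise, average over B."""
    b = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    total = [0.0] * dim
    for grad in per_sample_grads:
        for j, x in enumerate(clip(grad, c)):
            total[j] += x
    # Noise standard deviation sigma * c per coordinate (assumed convention).
    return [(total[j] + rng.gauss(0.0, sigma * c)) / b for j in range(dim)]


# Two per-sample gradients; the first has norm 5 and is clipped to norm 1.
batch = [[3.0, 4.0], [0.3, 0.4]]
noisy = private_local_gradient(batch, c=1.0, sigma=1.1, rng=random.Random(0))
noiseless = private_local_gradient(batch, c=1.0, sigma=0.0, rng=random.Random(0))
print(noisy, noiseless)
```

Setting σ to 0 disables the noise, which corresponds to the case mentioned in the description where the nodes do not need data set privacy protection.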

Step 4: global model updating stage;

After receiving the local gradients of the nodes, the master node runs a Byzantine-fault-tolerant gradient aggregation algorithm (such as multi-Krum, l-nearest aggregation, and the like) to aggregate the local gradients into a global gradient and update the model, while a moments accountant tracks the privacy loss. Next, the system runs the IPBFT consensus algorithm: the master node first writes the aggregation result (including the master node id, the aggregated gradient, the differential privacy loss, the ids of the selected nodes and the local gradient information) into a block block_t and sends block_t to the endorsement nodes for verification; if verification passes, block_t is broadcast to all blockchain nodes and is successfully appended to the blockchain.
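As an illustration of one of the aggregation rules named above, the following is a minimal pure-Python sketch of multi-Krum as it is commonly described: each gradient is scored by the sum of squared distances to its n − f − 2 nearest neighbours, and the m lowest-scoring gradients are averaged. The parameter names `f` (assumed upper bound on Byzantine nodes) and `m` (number of gradients kept, unrelated to the number of endorsement nodes) are assumptions, and the sketch does not cover l-nearest aggregation.

```python
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two gradients."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def multi_krum(grads: List[Sequence[float]], f: int, m: int) -> List[float]:
    """Average the m gradients with the lowest Krum scores.

    f is the assumed maximum number of Byzantine gradients; the usual
    analysis of Krum additionally requires n > 2f + 2.
    """
    n = len(grads)
    closest = n - f - 2
    if closest < 1 or not 1 <= m <= n:
        raise ValueError("need n - f - 2 >= 1 and 1 <= m <= n")
    scores = []
    for i, gi in enumerate(grads):
        dists = sorted(sq_dist(gi, gj) for j, gj in enumerate(grads) if j != i)
        scores.append((sum(dists[:closest]), i))
    chosen = [i for _, i in sorted(scores)[:m]]
    dim = len(grads[0])
    return [sum(grads[i][d] for i in chosen) / m for d in range(dim)]


# Four honest gradients near (1, 1) and one random-gradient attacker.
grads = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.2], [50.0, -50.0]]
agg = multi_krum(grads, f=1, m=3)
print(agg)
```

Because the attacker's gradient is far from every other gradient, its score is the largest and it is excluded from the average, so the aggregate stays close to the honest gradients.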

IPBFT: as shown in FIGS. 4, 5, 6 and 7, the consensus process of the IPBFT algorithm consists of 8 stages, namely request-1 (R-1), pre-preparation-1 (Pp-1), preparation-1 (P-1), commit-1 (C-1), request-2 (R-2), pre-preparation-2 (Pp-2), preparation-2 (P-2) and commit-2 (C-2). All user nodes are divided into a master node (L), endorsement nodes (E) and general nodes (G). Under normal conditions, as shown in FIG. 4, the system only needs to perform the 4 steps R-1, Pp-1, P-1 and C-1 to reach consensus. Under the abnormal conditions shown in FIGS. 5 and 6, the system additionally performs the 4 steps R-2, Pp-2, P-2 and C-2. The time at which the system starts running IPBFT is defined as 0. If the system reaches consensus before time t₁, a new master node is selected and the next consensus round begins; otherwise, IPBFT determines whether the master node is malicious. If the system has still not reached consensus at time t₂, the master node of this consensus round is considered malicious and is removed from the system. FIG. 7 shows an extremely abnormal situation in which consensus is reached on a wrong aggregation result. In this system, however, malicious nodes are removed continuously, and in an alliance chain the CA makes it unlikely that a node acts maliciously, so this extremely malicious situation is a low-probability event that almost never occurs. Moreover, such a wrong aggregation result, even if introduced in the initial stage of training, does not affect the final trained model.

As shown in fig. 4, under normal circumstances, the master node is honest and the number of honest endorsement nodes is not less than

m, then the consensus process for IPBFT at this time is as follows:

1) R-1: each user node sends its local gradient to the master node and the endorsement nodes.

2) Pp-1: the master node computes block_t and sends it to the endorsement nodes for verification.

3) P-1: if block_t passes verification by endorsement node E_i, that endorsement node sends a valid approval voucher Vote(block_t, E_i) to the master node.

4) C-1: in this case, the master node will receive at least

m approval vouchers and then generates a block certificate Cert(block_t). The master node then sends block_t and the block certificate Cert(block_t) to the other user nodes for block synchronization.

As shown in fig. 5, in this exceptional case, the master node is malicious and the honest endorsement node number is not less than

m, then the consensus process for IPBFT at this time is as follows:

3) P-1: block_t, produced by the malicious master node, cannot pass verification by the honest endorsement nodes, so they send no approval vouchers to the master node. The number of approval vouchers received by the master node is therefore less than

m, and the master node cannot generate a block certificate Cert(block_t).

4) R-2: in this abnormal situation the system does not reach consensus on block_t before time t₁, so all user nodes send their local gradients to the remaining user nodes.

5) Pp-2: the master node broadcasts block_t to all other user nodes for verification. In this abnormal situation, however, the number of approval vouchers received by the master node is less than

K (K is the number of user nodes), so the system does not reach consensus on block_t. Because the system also fails to reach consensus before time t₂, the master node is considered malicious and is removed from the system.

As shown in fig. 6, in this exceptional case, the master node is honest and the number of honest endorsement nodes is less than

m, then the consensus process for IPBFT at this time is as follows:

3) P-1: if block_t passes verification by endorsement node E_i, that endorsement node sends a valid approval voucher Vote(block_t, E_i) to the master node. In this case, however, the number of valid approval vouchers is less than

m, so the master node cannot generate a block certificate.

5) Pp-2: the master node broadcasts block_t to all other user nodes for verification.

6) P-2: if block_t passes verification by user node P_i, that user node sends a valid approval voucher Vote(block_t, P_i) to the master node.

7) C-2: in this case, the number of approval vouchers received by the master node is not less than

K, so it can generate a block certificate Cert(block_t). The master node then sends block_t and the block certificate Cert(block_t) to the other user nodes for block synchronization.

As shown in fig. 7, in such an extremely malicious case, the master node is malicious, and the number of endorsement nodes that are malicious and collude with the master node is not less than

m, then the consensus process for IPBFT at this time is as follows:

2) Pp-1: the malicious master node computes a wrong aggregation result and the corresponding block_t and sends block_t to the endorsement nodes for verification.

3) P-1: in this case, block_t passes verification by each malicious endorsement node E_i colluding with the master node, and that endorsement node sends an approval voucher Vote(block_t, E_i) to the master node.

4) C-1: in this case, the master node receives at least

m approval vouchers and generates a block certificate Cert(block_t). The master node then sends block_t and the block certificate Cert(block_t) to the other user nodes for block synchronization.

It can be seen that in the extremely abnormal situation of FIG. 7 the master node and some endorsement nodes are malicious and collude, which has an extremely small probability of occurring in this system: as training progresses the system gradually removes malicious nodes, and in an alliance chain the CA makes it very unlikely that nodes act maliciously.
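The four cases of FIGS. 4 to 7 can be summarized as a two-phase decision. The sketch below is a simplified model rather than the protocol itself: the quorum sizes are passed in as parameters because the exact thresholds (the fractions of m and K) are not reproduced in the text, the timeouts t₁ and t₂ are modelled simply as "phase 1 failed" and "phase 2 failed", and all names are hypothetical.

```python
from enum import Enum
from typing import Sequence


class Outcome(Enum):
    COMMITTED_FAST = "consensus reached in R-1, Pp-1, P-1, C-1"
    COMMITTED_FALLBACK = "consensus reached in R-2, Pp-2, P-2, C-2"
    MASTER_REMOVED = "no consensus by t2, master node removed"


def ipbft_round(endorser_votes: Sequence[bool],
                all_node_votes: Sequence[bool],
                endorser_quorum: int,
                global_quorum: int) -> Outcome:
    """Decide the outcome of one IPBFT round from the approval vouchers.

    endorser_votes: approvals from the m endorsement nodes (phase 1).
    all_node_votes: approvals from all other user nodes (phase 2, only
    consulted if phase 1 does not produce a block certificate).
    """
    if sum(endorser_votes) >= endorser_quorum:
        return Outcome.COMMITTED_FAST
    if sum(all_node_votes) >= global_quorum:
        return Outcome.COMMITTED_FALLBACK
    return Outcome.MASTER_REMOVED


# Assumed quorums for illustration: 3 of 4 endorsers, 7 of 9 other nodes.
fig4 = ipbft_round([True] * 4, [], 3, 7)                                   # normal case
fig5 = ipbft_round([False, False, False, True], [False] * 7 + [True] * 2, 3, 7)  # malicious master
fig6 = ipbft_round([True, False, False, False], [True] * 7 + [False] * 2, 3, 7)  # honest master
fig7 = ipbft_round([True, True, True, False], [], 3, 7)                    # colluding endorsers
print(fig4, fig5, fig6, fig7)
```

The last call shows why the FIG. 7 case is dangerous: from the vouchers alone it is indistinguishable from the normal case, so the protocol relies on CA admission control and the ongoing removal of malicious nodes to keep it rare.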

Table 1 compares the performance of related consensus algorithms when applied in the distributed machine learning framework proposed by the present invention. The proposed consensus algorithm IPBFT can identify malicious nodes, whereas PBFT and PoW cannot. In addition, PBFT and PoW require all nodes to send their local gradients to one another, so their communication complexity is O(K²), where K is the number of user nodes. With IPBFT, malicious nodes are gradually eliminated as training proceeds, and a user node only needs to send its local gradient to 1 master node and m endorsement nodes, so the communication complexity of IPBFT is O(mK) in the general case; only in the two abnormal cases of FIGS. 5 and 6 is its communication complexity O(K²). IPBFT therefore has lower communication complexity than PBFT and PoW.

TABLE 1 comparison of related consensus algorithms
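To make the O(mK) versus O(K²) comparison concrete, the following sketch counts gradient messages per round under a simple model: every user node sends one message to each recipient, and a node also counted as its own recipient gives a slight over-count. The function names and the example values K = 100, m = 4 are illustrative assumptions, not figures from the experiments.

```python
def gradient_messages_ipbft(k: int, m: int) -> int:
    """Upper bound: each of K user nodes sends to 1 master and m endorsers."""
    return k * (m + 1)


def gradient_messages_all_to_all(k: int) -> int:
    """Each of K user nodes sends its gradient to the K - 1 other nodes."""
    return k * (k - 1)


k, m = 100, 4
print(gradient_messages_ipbft(k, m))      # 500
print(gradient_messages_all_to_all(k))    # 9900
```

For a fixed number of endorsement nodes, the normal-case cost grows linearly in K, while the all-to-all exchange grows quadratically.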

Step 5: training termination stage;

When the trained model meets the expected requirements (its accuracy reaches the target, or its privacy loss exceeds the privacy budget), the system stops training. From then on, the main role of the blockchain is to maintain transactions of the machine learning model; if new data is added or the model algorithm needs to be improved, machine learning training can be restarted.

Example 2:

As can be seen from fig. 8, as the system runs, the IPBFT algorithm finds 20 malicious nodes and removes the malicious nodes from the system, while the malicious nodes in the system running the PoW algorithm are always present.

As can be seen from fig. 9, in the case where differential privacy is not introduced, after the node is subjected to a byzantine attack (random gradient attack), the multi-Krum algorithm has better aggregation effect than the median algorithm, and is closer to the ideal situation.

As can be seen from fig. 10, in the case of introducing differential privacy, after the node is subjected to a byzantine attack (random gradient attack), the median algorithm has better aggregation effect than the multi-Krum algorithm and is closer to the ideal condition.

From the experimental results, the framework proposed by the present invention can effectively handle the case where both the parameter server and the working nodes in distributed machine learning are subjected to Byzantine attacks; meanwhile, the framework can reward contributing nodes and eliminate malicious nodes, so that the system operates better. In addition, the framework can also apply other Byzantine fault-tolerant aggregation algorithms to optimize the model effect.

Although the embodiments of the present invention have been described above, the above descriptions are only for the convenience of understanding the present invention, and are not intended to limit the present invention. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A secure, privacy preserving, tradable distributed machine learning framework based on blockchain techniques, comprising:

part 1, a Certificate Authority (CA) is responsible for issuing and revoking digital certificates for block chain nodes and carrying out authority management on the nodes;

part 2, the block chain nodes are composed of user nodes and transaction nodes, which are respectively responsible for maintaining the machine learning model and participating in machine learning model transactions;

part 3, the intelligent contract is composed of a machine learning intelligent contract (MLMC) and a model contribution intelligent contract (MCMC), which respectively define the operation rules of distributed machine learning and divide the profit among the nodes according to their model contribution degree;

part 4, the distributed account book records model data (including local model and global model conditions) and model transaction data in the machine learning model training process;

and part 5, the data provider is responsible for collecting local data and uploading the local data to the blockchain node server.

2. A method for operating the secure, privacy-preserving, tradable distributed machine learning framework based on blockchain technology of claim 1, the method comprising the following steps:

step 3, local gradient calculation stage: all user nodes select the main node in rotation in ascending order of id, and the m nodes following the id of the main node are the endorsement nodes; each node then calculates a local gradient using its local data and the current model, adds Gaussian noise to the gradient so that the local gradient satisfies the differential privacy mechanism, and finally sends the local gradient to the main node and the endorsement nodes;

3. The method of claim 2 for operating the secure, privacy-preserving, tradable distributed machine learning framework based on blockchain technology, wherein step 1, the alliance chain initialization stage, specifically comprises:

the CA server issues digital certificates to the initial nodes of the alliance chain, all participants establish connections, and some initial consensus is reached: a. unifying the standard by which all participants build their data sets; b. unifying the standard for the model transaction fee, wherein the transaction fee can increase as the model improves; c. unifying the selection rules for the main node and the endorsement nodes, wherein the main node is selected in rotation in ascending order of node id, and the m nodes following the id of the main node are the endorsement nodes.

4. The method of claim 2 for operating the secure, privacy-preserving, tradable distributed machine learning framework based on blockchain technology, wherein step 2, the parameter initialization stage, is as follows: in the parameter initialization stage, all user nodes reach consensus on the neural network model, including determining the network structure of the neural network model, the batch size B, the number of training iterations T, the learning rate η_t, the initial weight w₀, the clipping threshold C, the noise scale σ, and other parameters; meanwhile, the blockchain nodes issue the data set standard to the data providers, and the data providers collect training sets and upload them to the blockchain nodes; after the neural network model and the data sets are prepared, all user nodes contribute to a test set and unify the test set data of the system. The entire system can then begin neural network model training.

5. The method of claim 2 for operating the secure, privacy-preserving, tradable distributed machine learning framework based on blockchain technology, wherein step 3, the local gradient calculation stage, is as follows:

first, all user nodes in the blockchain determine the main node and the endorsement nodes: if the id of the main node is i, the ids of the endorsement nodes are i+1, i+2, …, i+m; each node then obtains a local gradient from its own data set and the current model, adds Gaussian noise to the local gradient so that the local gradient satisfies the differential privacy mechanism, and finally sends the local gradient to the main node and the endorsement nodes;

Wherein the model predicts as

l () is a loss function;

6. The method of claim 2 for operating the secure, privacy-preserving, tradable distributed machine learning framework based on blockchain technology, wherein step 4, the global model updating stage, specifically comprises: after receiving the local gradients of the nodes, the main node runs a Byzantine fault-tolerant gradient aggregation algorithm to aggregate the local gradients, obtain a global gradient and update the model, while a moments accountant method is used to track the privacy loss; the system then runs the IPBFT consensus algorithm: the main node writes the result of the aggregation calculation into block_t and sends block_t to the endorsement nodes for verification; if the verification passes, block_t is broadcast to all blockchain nodes and the block is successfully added to the blockchain.
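By way of illustration only, the role selection rule and the noisy local gradient recited in claims 3 and 5 can be sketched as follows. Several points are assumptions not stated in the claims: the endorsement node ids wrap around modulo the number of nodes, the gradient is clipped to an L2 norm of at most C before noise is added, and the Gaussian noise has standard deviation σ·C (a common convention in differentially private SGD). The function names are hypothetical.

```python
import math
import random


def select_roles(round_index, node_ids, m):
    """Pick the main node in rotation over ascending ids; the next m ids endorse.

    Assumption: ids wrap around after the largest id.
    """
    ids = sorted(node_ids)
    k = len(ids)
    if not 0 <= m < k:
        raise ValueError("m must satisfy 0 <= m < number of nodes")
    pos = round_index % k
    main_node = ids[pos]
    endorsers = [ids[(pos + j) % k] for j in range(1, m + 1)]
    return main_node, endorsers


def privatize_gradient(grad, clip_c, sigma, rng):
    """Clip grad to L2 norm <= clip_c, then add Gaussian noise per coordinate.

    Assumption: noise standard deviation is sigma * clip_c.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_c / norm) if norm > 0 else 1.0
    return [g * scale + rng.gauss(0.0, sigma * clip_c) for g in grad]


if __name__ == "__main__":
    main_node, endorsers = select_roles(round_index=3, node_ids=[1, 2, 3, 4], m=2)
    print(main_node, endorsers)
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    print(privatize_gradient([3.0, 4.0], clip_c=1.0, sigma=0.5, rng=rng))
```

Each user node would send the returned noisy gradient to the selected main node and endorsement nodes; the privacy accounting itself is outside this sketch.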