CN113822758B - Self-adaptive distributed machine learning method based on blockchain and privacy - Google Patents


Info

Publication number
CN113822758B
CN113822758B (application number CN202110889794.1A)
Authority
CN
China
Prior art keywords
node
local
global
parameters
representing
Prior art date
Legal status
Active
Application number
CN202110889794.1A
Other languages
Chinese (zh)
Other versions
CN113822758A (en)
Inventor
张延华
赵学慧
杨睿哲
李萌
司鹏搏
于非
Current Assignee
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date
Filing date
Publication date
Application filed by Beijing University of Technology
Priority to CN202110889794.1A
Publication of CN113822758A
Application granted
Publication of CN113822758B
Legal status: Active

Classifications

    • G06F18/214 Pattern recognition: generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F21/602 Providing cryptographic facilities or services
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • G06N20/00 Machine learning
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Business, Economics & Management (AREA)
  • Computer Security & Cryptography (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Finance (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Accounting & Taxation (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Development Economics (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an adaptive distributed machine learning method based on blockchain and privacy, comprising the following steps: a blockchain-based distributed machine learning system model with privacy protection is established, and the interaction between nodes is completed through blockchain consensus. By analyzing in detail the computational complexity of the local nodes in the training process and in the consensus process, an optimization method for computing-resource allocation under energy-consumption constraints is derived, and on this basis an adaptive aggregation method based on resource-allocation optimization is proposed. Simulation results show that the method carries out a privacy-preserving training process between nodes based on distributed consensus; it optimizes the allocation of computing resources on the nodes under an energy-consumption constraint on the one hand, and adaptively adjusts the global aggregation frequency on the other, thereby improving the utilization of the total system energy and, in turn, the convergence performance of the distributed learning process.

Description

Self-adaptive distributed machine learning method based on blockchain and privacy
Technical Field
The invention belongs to the field of aggregation frequency and resource allocation in distributed machine learning. It relates in particular to a computing-resource optimization method for distributed machine learning based on blockchain consensus and privacy protection, and further to an adaptive aggregation method based on computing-resource allocation optimization.
Background
Currently, people and Internet devices are producing data at an unprecedented scale. Machine learning, as a method of data analysis from which decisions can be learned, identified and made, is an important component of artificial intelligence. To fully exploit the value of the data, the most straightforward approach is to collect and store the data in a central server and then process it centrally. However, data is typically generated by multiple parties and stored in a geographically distributed manner, making it difficult to gather large-scale, geographically distributed data into a single data store. As a result, distributed machine learning, which distributes the learning workload to the data owners, is receiving increasing attention as an alternative to the centralized architecture.
Although distributed machine learning can learn without sharing data, the interaction and message passing between decentralized local nodes (datasets) can still compromise the security and privacy of the data. In addition, every local update and global aggregation consumes computing resources of the network; the amount of resources consumed may vary over time, and there is a complex relationship between resource allocation, the frequency of global aggregation, and the convergence performance of the model.
Disclosure of Invention
The invention aims to solve the technical problem of providing an adaptive distributed machine learning method based on blockchain and privacy. While protecting privacy and ensuring security in the distributed machine learning process, the method combines an energy-consumption formula to derive an optimization strategy for the allocation of computing resources across the distributed nodes, and continuously adjusts the frequency of global aggregation under a fixed system energy budget, thereby maximizing the utilization of system energy and obtaining the best learning effect.
In order to solve the problems, the invention adopts the following technical scheme:
an adaptive distributed machine learning method based on blockchain and privacy includes the steps of:
step 1, establishing a distributed machine learning system model with privacy protection based on block chain
The computing party C and the participants P construct a distributed environment among nodes by means of a blockchain network; the local-update and global-aggregation processes of distributed machine learning are completed using linear regression and gradient descent, and a partially homomorphic encryption technique is introduced to protect the model parameters during training. A consensus process is introduced to verify the correctness of the model parameters. Finally, the distributed nodes interact through the consensus process, and only ciphertext parameters of the model can be received during the interaction.
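The patent specifies only that a partially homomorphic scheme protects the model parameters; as an illustration, the following minimal sketch uses the additively homomorphic Paillier cryptosystem via the python-paillier (`phe`) package. The key pair, parameter values, and gradient shown are assumptions for the demo, not values from the patent.

```python
# Minimal sketch of the privacy-preserving parameter exchange, assuming the
# additively homomorphic Paillier cryptosystem via the `phe` package
# (pip install phe). All concrete values are illustrative.
from phe import paillier

# The computing party C generates and holds the key pair.
public_key, private_key = paillier.generate_paillier_keypair()

# C encrypts and publishes the global model parameters w_g(t).
w_global = [0.5, -1.2, 3.0]
enc_w_global = [public_key.encrypt(v) for v in w_global]

# A participant P_i updates parameters in the ciphertext domain: the additive
# homomorphism supports ciphertext + scalar and ciphertext * scalar, which is
# exactly what one gradient-descent step w - eta * g needs.
eta = 0.01
local_gradient = [0.2, -0.4, 1.0]           # illustrative plaintext gradient
enc_w_local = [cw - eta * g for cw, g in zip(enc_w_global, local_gradient)]

# Only C, holding the private key, can recover plaintext parameters.
w_recovered = [private_key.decrypt(cw) for cw in enc_w_local]
print(w_recovered)                           # approx. [0.498, -1.196, 2.99]
```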
Step 2, completing the distributed consensus process among the nodes in combination with smart contracts
To ensure the credibility of the learning process and confirm the correctness of the learning parameters, a distributed consensus process is formed between the computing party and the participants; it comprises five transaction types: ELW, ELP, EGW, EGP, and CGP. The training parameters are transferred in the form of transactions by means of smart contracts and recorded in blocks.
Step 3, performance analysis of training process and consensus process
Step 3.1, training procedure
Step 3.2, consensus process
Step 4, self-adaptive global aggregation based on resource allocation optimization
Two processes are involved in the blockchain- and privacy-protection-based distributed machine learning system model: a distributed machine learning process and a blockchain consensus process. Under this system model we consider N+1 nodes: N local nodes representing the participants and one computing node representing the computing party. The computing power of each node is denoted f_i (CPU cycles per second).
In addition, μ₁ denotes the average number of CPU cycles required to complete a one-step ciphertext computation, and μ₂ the average number required for a one-step plaintext computation. Under PBFT consensus, at most f = (N−1)/3 faulty nodes can be tolerated; generating or verifying a signature requires β CPU cycles, generating or verifying a MAC requires θ CPU cycles, and the computational tasks verified by driving the smart contract require α CPU cycles.
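As a quick illustration of these constants, the following sketch computes the PBFT fault tolerance and a per-transaction verification cost; the numeric values of μ₁, μ₂, α, β, θ are taken from the simulation settings given later in the document and are otherwise assumptions.

```python
# Illustrative cost constants, borrowed from the simulation settings later in
# the patent (values in CPU cycles; "M" = 1e6). Assumed values for the demo.
MU1   = 0.1e6    # avg cycles per one-step ciphertext computation
MU2   = 0.05e6   # avg cycles per one-step plaintext computation
ALPHA = 0.2e6    # cycles to verify a computational task via the smart contract
BETA  = 0.8e6    # cycles to generate or verify one signature
THETA = 0.005e6  # cycles to generate or verify one MAC

def pbft_max_faulty(num_nodes: int) -> int:
    """PBFT tolerates at most f = (N - 1) / 3 faulty nodes."""
    return (num_nodes - 1) // 3

def verify_cost(num_transactions: int) -> float:
    """Cycles to check the signature and MAC of each transaction in a block."""
    return num_transactions * (BETA + THETA)

print(pbft_max_faulty(4))        # 1 faulty node tolerated among N = 4
print(verify_cost(10) / 1e6)     # 8.05 M cycles for 10 transactions
```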
According to the invention, a blockchain-based distributed machine learning model with privacy protection is established; the computational complexity of each node in the different processes is analyzed in detail; the resource allocation of the nodes is optimized in combination with an energy-consumption formula; and the constraint conditions of the optimization function are formulated by introducing the energy formula, yielding the final objective function for adaptive aggregation under optimized energy allocation.
Simulation results show that the proposed algorithm outperforms the conventional algorithm (fixed aggregation interval τ and evenly allocated computing resources).
Drawings
FIG. 1 is a system model;
FIG. 2 is a flow chart of a PBFT consensus protocol;
FIG. 3 shows the trend of the loss function value with the total energy of the system (N = 3, 4, 5).
FIG. 4 shows the trend of the loss function value with the number of nodes (E = 0.5×10⁵ and 1.5×10⁵, τ = 10).
Detailed Description
The invention is further described below with reference to the drawings and examples.
Step 1, establishing a blockchain-based distributed machine learning system model with privacy protection
FIG. 1 illustrates the system model of the invention. The blockchain-based distributed machine learning process with privacy protection can be described as follows: the local nodes and the computing node are deployed in the blockchain to form a secure distributed environment; the local nodes are responsible for the local-update tasks of the training process, the computing node is responsible for the global-aggregation tasks, and each local node uses a gradient-descent algorithm to complete a linear-regression learning process. To ensure the privacy of the model parameters, homomorphic encryption is introduced into the training process, and each node completes its parameter updates in the ciphertext state by exploiting the homomorphic property. In addition, to ensure the credibility of the training process, a distributed consensus based on the blockchain network is introduced between nodes at each global aggregation, so that ciphertext model parameters are transmitted and updated between nodes in the form of transactions by means of smart contracts.
For an input vector $x_j$ and output $y_j$ in the machine learning model, the best-fit equation of linear regression can be expressed as:

$$y_j = w_0 + w_1 x_{j,1} + w_2 x_{j,2} + \cdots = \mathbf{w}^{T} \mathbf{x}_j$$
Its corresponding loss function F(w) is the mean square error; the aim is to solve for the optimal parameter vector w that minimizes F(w).
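For concreteness, a plaintext reference version of this linear-regression training loop might look as follows (a minimal numpy sketch; the synthetic data and learning rate are assumptions, and the patent performs these same updates on ciphertexts):

```python
# Plaintext reference implementation of the linear-regression / gradient-descent
# learning that each local node performs (the patent runs these same updates in
# the ciphertext domain via homomorphic encryption).
import numpy as np

def mse_loss(w: np.ndarray, X: np.ndarray, y: np.ndarray) -> float:
    """F(w): mean square error of the best-fit equation y_j = w^T x_j."""
    return float(np.mean((X @ w - y) ** 2))

def gradient(w: np.ndarray, X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Gradient of the MSE loss: (2/|D|) * sum_j (w^T x_j - y_j) x_j."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
X = np.hstack([np.ones((100, 1)), rng.normal(size=(100, 2))])  # bias + 2 features
w_true = np.array([1.0, 2.0, -3.0])
y = X @ w_true + 0.01 * rng.normal(size=100)

w, eta = np.zeros(3), 0.05
for t in range(500):                        # local gradient-descent iterations
    w -= eta * gradient(w, X, y)
print(w, mse_loss(w, X, y))                 # w approaches w_true, loss near 0
```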
In the distributed network, the node set consists of the participants P and the computing party C, i.e. $I = \{P, C\}$, $|I| = N+1$. The set of N participants is denoted $P = \{P_1, P_2, \ldots, P_N\}$, where $P_i$ ($i = 1, 2, \ldots, N$) denotes a participant in possession of a sub-dataset, $D_i$ denotes the sub-dataset owned by participant $P_i$, and the total dataset is $D = \{D_1, D_2, \ldots, D_N\}$. $(x_{ij}, y_{ij}) \in D_i$ is the $j$-th data sample in $D_i$.
In the system model proposed by the invention, a local update occurs in every iteration, i.e. for $t = 1, 2, \ldots, T$; a global aggregation occurs only when the iteration index satisfies $t = \Gamma\tau$, $\Gamma = 1, 2, \ldots$.
In addition, the computing party C holds the key pair used to protect the model parameters, so encryption and decryption operations can be carried out on the model parameters at any time during operation, while the participants P only ever hold ciphertext model parameters. The homomorphic-encryption-based distributed machine learning process can be described as follows:

1. Issuing global parameters: after each global aggregation, C issues the ciphertext parameters $[\![w_g(t)]\!]$. The participants see only the ciphertext and cannot learn $w_g(t)$, which guarantees the privacy of the global model parameters. With $\hat{w}_i(t)$ denoting the locally updated model parameters, the interaction of local and global parameters can be described as:

$$w_i(t) = \begin{cases} \hat{w}_i(t), & t \neq \Gamma\tau \\ w_g(t), & t = \Gamma\tau \end{cases}$$
2. Local parameter update: according to the homomorphic property of the homomorphic encryption algorithm, a participant completes the local update in the ciphertext state. For the $i$-th participant, the local parameter update can be expressed as:

$$w_{i,k}(t) = w_{i,k}(t-1) - \eta\, g_{i,k}(t-1)$$

evaluated element-wise in the ciphertext domain via the additive homomorphism, where $w_{i,k}(t)$ denotes the $k$-th element of the local model parameters $w_i(t)$, $g_{i,k}(t)$ denotes the $k$-th element of the local gradient computed by $P_i$ at iteration $t$ and is defined over the gradients of the single data samples $(x_{ij}, y_{ij})$, and $x_{ij,k}$ denotes the $k$-th element of the input vector; then:

$$g_{i,k}(t) = \frac{1}{|D_i|} \sum_{j=1}^{|D_i|} 2\,\bigl(w_i(t)^{T} x_{ij} - y_{ij}\bigr)\, x_{ij,k}$$
3. Global parameter update: $P_i$ submits $[\![w_i(\Gamma\tau)]\!]$ after every $\tau$ local updates. C obtains the $[\![w_i(\Gamma\tau)]\!]$ from all participants and then performs the global aggregation of the local parameters in the ciphertext state according to the following formula, updating the global parameters:

$$[\![w_g(\Gamma\tau)]\!] = \sum_{i=1}^{N} \frac{|D_i|}{|D|}\,[\![w_i(\Gamma\tau)]\!]$$
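A minimal sketch of this ciphertext-state aggregation, again assuming the Paillier scheme from the earlier sketch: the data-size weights are public scalars, so the weighted sum needs only the additive homomorphism.

```python
# Global aggregation in the ciphertext state: a data-size-weighted sum of the
# participants' encrypted parameters. The parameter values and dataset sizes
# are illustrative assumptions.
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair()

# Encrypted local parameters submitted by N = 3 participants.
local_params = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sizes = [10, 30, 60]                                   # |D_i|; |D| = 100
enc_locals = [[public_key.encrypt(v) for v in w] for w in local_params]

total = sum(sizes)
enc_global = []
for k in range(2):                                     # per parameter element
    agg = sum((sizes[i] / total) * enc_locals[i][k] for i in range(3))
    enc_global.append(agg)

# Only the computing party C, holding the private key, can read the result.
print([private_key.decrypt(c) for c in enc_global])    # approx. [4.0, 5.0]
```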
Step 2, completing the distributed consensus process among the nodes in combination with smart contracts
In the distributed machine learning model based on blockchain and privacy protection, the PBFT consensus protocol is used to form a distributed consensus among the nodes, ensuring the reliability of the learning process and confirming the correctness of the model parameters. The model parameters are updated and exchanged between nodes in the form of transactions through smart contracts, and are certified on-chain. The workflow of the PBFT consensus protocol is shown in FIG. 2.
The consensus process provided by the invention comprises five transaction types, namely: the transaction in which the computing party issues the ciphertext of the global parameters (EGW); the transaction in which a participant feeds back the ciphertext of its local parameters (ELW); the transaction in which a participant submits the ciphertext of the intermediate local variables it has computed (ELP); the ciphertext transaction in which the computing party computes the global intermediate parameters from the local and global parameters (EGP); and the plaintext transaction in which the computing party computes the optimization parameters from the decrypted ELP and EGP (CGP). In the consensus process, smart contracts are used to drive the transactions and to verify and chain the blocks. Immediately upon receiving an EGW transaction, a participant performs the local-parameter update computation on its local dataset in the ciphertext state; the local parameter updates are carried out before aggregation, and upon completion the participant submits the ELW and ELP transactions. The computing party performs ciphertext operations on the received ELW transactions to obtain the EGW transaction and further obtains the EGP transaction in the ciphertext state; it then decrypts the ELP and EGP transactions and, after computation, obtains the CGP transaction.
During the transaction process, the computing party, acting as the master node, packs the transactions into a block and performs consensus verification; that is, each participant, acting as a slave node, verifies the transaction process according to the public keys, covering the signatures, the MACs, and the computational relations of the transactions submitted by each node.
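To make the five transaction types concrete, a small data model such as the following could represent them; the field names, structure, and per-round transaction count are illustrative assumptions, not prescribed by the patent.

```python
# Illustrative data model for the five transaction types exchanged via smart
# contracts. Field names and structure are assumptions for clarity only.
from dataclasses import dataclass
from enum import Enum

class TxType(Enum):
    EGW = "computing party issues encrypted global parameters"
    ELW = "participant feeds back encrypted local parameters"
    ELP = "participant submits encrypted intermediate local variables"
    EGP = "computing party's encrypted global intermediate parameters"
    CGP = "computing party's plaintext optimization parameters"

@dataclass
class Transaction:
    tx_type: TxType
    sender: str            # node id, e.g. "C" or "P3"
    round_index: int       # global aggregation index Gamma
    payload: bytes         # ciphertext (or plaintext for CGP)
    signature: bytes       # costs beta CPU cycles to generate or verify
    mac: bytes             # costs theta CPU cycles to generate or verify

# Assuming one aggregation round yields 1 EGW, N ELW, N ELP, 1 EGP, 1 CGP:
def transactions_per_round(num_participants: int) -> int:
    return 2 * num_participants + 3

print(transactions_per_round(4))   # 11 transactions for N = 4
```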
Step 3, performance analysis of training process and consensus process
In the system model provided by the invention, the participant nodes and the computing node jointly complete the local updates and global aggregations of the training process using the blockchain network, and a consensus process is introduced into the global aggregation to guarantee the correctness of the model parameters. The training process and the consensus process involve the five transaction types, whose correspondence is summarized in the following table:

Transaction  Issued by          Content                                        Process
EGW          Computing party C  Ciphertext of the global parameters            Global aggregation
ELW          Participant P_i    Ciphertext of the local parameters             Local update
ELP          Participant P_i    Ciphertext of intermediate local variables     Local update / global aggregation
EGP          Computing party C  Ciphertext of global intermediate parameters   Global aggregation
CGP          Computing party C  Plaintext optimization parameters              Global aggregation
the performance of the training process and the consensus process is respectively analyzed, and the method comprises the following steps:
step 3.1 training procedure
For the training process, it consists of local updates and global aggregations and contains the five transactions driven by smart contracts. The computational cost, measured by algorithmic complexity, and the computation time correspond to the computational process of each transaction:

(1) Local update: the local node $P_{i'}$ ($i' \in I$, $i' = 1, \ldots, N$) updates its local ciphertext parameters according to the global ciphertext parameters issued by the computing node and delivers them in the blockchain in the form of an ELW transaction. In the local-update step the computational complexity is $O(|w|(2|D_{i'}|+1))$, so the computational cost $c_{i'}^{(1)}$ and computation time $t_{i'}^{(1)}$ of $P_{i'}$ are

$$c_{i'}^{(1)} = \mu_1 |w|\bigl(2|D_{i'}|+1\bigr), \qquad t_{i'}^{(1)} = \frac{c_{i'}^{(1)}}{f_{i'}^{(1)}}$$

where $f_i^{(l)}$, $l = 1, 2$, denotes the computing power obtained by node $i$ during the training process (local update and global aggregation, respectively).

(2) Global aggregation: first, the local node $P_{i'}$ ($i' \in I$, $i' = 1, \ldots, N$) computes, from its updated local ciphertext parameters, the intermediate parameters of the local ciphertext variables (the ELP transaction), at cost $O(|w||D_{i'}|)$. Then the computing node C ($C = i'' \in I$, $i'' = N+1$) gathers the ELW transactions from the $P_{i'}$ and updates the global model parameters in the ciphertext state (the EGW transaction), at cost $O(N|w|)$. At the same time, the computing node collects the ELP transactions from the participants and updates the intermediate parameters used to compute the global model variables in the ciphertext state (the EGP transaction), at cost $O\bigl(\sum_{i'}(2|D_{i'}| + |w||D_{i'}|)\bigr)$. Finally, since partially homomorphic encryption cannot handle ciphertext multiplication, in order to obtain the parameters ρ and δ ultimately used to optimize the model, the computing node collects the ELP transactions from the participants, decrypts the ciphertext parameters with the private key in combination with the EGP transaction, and computes the optimization parameters in the plaintext state, at cost $O(N)$. In the global-aggregation step, the computational cost $c_{i'}^{(2)}$ and computation time $t_{i'}^{(2)}$ of $P_{i'}$ are

$$c_{i'}^{(2)} = \mu_1 |w||D_{i'}|, \qquad t_{i'}^{(2)} = \frac{c_{i'}^{(2)}}{f_{i'}^{(2)}}$$

and the computational cost $c_C^{(2)}$ and computation time $t_C^{(2)}$ of C are

$$c_C^{(2)} = \mu_1 \Bigl(N|w| + \sum_{i'}\bigl(2|D_{i'}| + |w||D_{i'}|\bigr)\Bigr) + \mu_2 N, \qquad t_C^{(2)} = \frac{c_C^{(2)}}{f_C^{(2)}}$$
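Under this cost model (cycles = μ × operation count, time = cycles / allocated frequency), the per-step quantities can be computed as in the sketch below; the composition of the computing party's aggregate cost from the stated O(·) terms is a reconstruction, not verbatim from the patent.

```python
# Computation cost (CPU cycles) and time (seconds) per training step, following
# the cost model c = mu * ops and t = c / f. The composition of the computing
# party's cost from the stated O(.) terms is an assumed reconstruction.
MU1, MU2 = 0.1e6, 0.05e6        # cycles per ciphertext / plaintext operation

def local_update_cost(w_len: int, d_i: int) -> float:
    """Participant P_i', local update (ELW): O(|w| * (2|D_i'| + 1))."""
    return MU1 * w_len * (2 * d_i + 1)

def participant_agg_cost(w_len: int, d_i: int) -> float:
    """Participant P_i', intermediate variables (ELP): O(|w| * |D_i'|)."""
    return MU1 * w_len * d_i

def computing_party_agg_cost(w_len: int, d_sizes: list[float]) -> float:
    """C: EGW O(N|w|) + EGP O(sum(2|D_i|+|w||D_i|)) ciphertext, CGP O(N) plaintext."""
    n = len(d_sizes)
    ciphertext_ops = n * w_len + sum(2 * d + w_len * d for d in d_sizes)
    return MU1 * ciphertext_ops + MU2 * n

f_alloc = 1e9                                    # allocated cycles/second (assumed)
c = local_update_cost(w_len=14, d_i=100)         # e.g. 13 features + bias
print(c / 1e6, "M cycles;", c / f_alloc, "s")    # 281.4 M cycles; ~0.28 s
```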
Step 3.2 consensus procedure
For the consensus process introduced in the global aggregation, the PBFT consensus protocol comprises five steps:
(1) Request / Pre-prepare: the computing party C acts as the master node $i'$ ($i' = N+1$), verifies the signatures and MACs of all transactions within the aggregation (β and θ cycles per signature and MAC, respectively), and packs the transactions into a new block. In this step the computational cost is $c_C^{(1)}$ and the computation time is $t_C^{(1)} = c_C^{(1)} / f_C^{(1)}$, where $f_i^{(s)}$, $s = 1, \ldots, 5$, denotes the computing power of node $i$ in consensus step $s$.

(2) Pre-prepare: a local node, acting as verification node $i'' \ne i'$ ($i'' = 1, \ldots, N$), receives the new block with the Pre-prepare message; it first verifies the signature and MAC of the block, then verifies the signature and MAC of each transaction, and finally verifies the results according to the transaction computations in the smart contract (α cycles per verification task). In this step the computational cost is $c_{i''}^{(2)}$ and the computation time is $t_{i''}^{(2)} = c_{i''}^{(2)} / f_{i''}^{(2)}$.

(3) Prepare / Commit: each node receives and checks the Prepare messages to ensure consistency with the Pre-prepare message. When 2f Prepare messages have been received from other nodes, the node sends a Commit message to all other nodes. In this step the computational cost is $c_i^{(3)}$ and the computation time is $t_i^{(3)} = c_i^{(3)} / f_i^{(3)}$.

(4) Commit / Reply: each node receives and checks the Commit messages to ensure consistency with the Prepare messages. Once a node has received 2f Commit messages from other nodes, it passes a Reply message to the master node. In this step the computational cost is $c_i^{(4)}$ and the computation time is $t_i^{(4)} = c_i^{(4)} / f_i^{(4)}$.

(5) Reply / chaining: the master node receives and checks the Reply messages. When the master node has received 2f Reply messages, the new block takes effect and is added to the blockchain. In this step the computational cost is $c_C^{(5)}$ and the computation time is $t_C^{(5)} = c_C^{(5)} / f_C^{(5)}$.
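The message pattern above can be sanity-checked with a small counter: every phase waits for 2f matching messages, with f = (N−1)/3. The honest-case walk-through below is an assumed simplification (message counts only, no cryptography):

```python
# Minimal walk-through of the five PBFT steps used in the patent's consensus
# process. Message counts only; crypto costs would be beta/theta/alpha cycles.
def pbft_round(num_nodes: int) -> bool:
    f = (num_nodes - 1) // 3          # max tolerable faulty nodes
    quorum = 2 * f                    # matching messages required per phase

    # (1) Request/Pre-prepare: master packs and signs the block.
    # (2) Pre-prepare: N replicas verify block, transactions, contract results.
    prepare_votes = num_nodes - 1     # honest-case Prepare messages received
    if prepare_votes < quorum:
        return False
    # (3) Prepare -> Commit: nodes that saw 2f Prepares broadcast Commit.
    commit_votes = num_nodes - 1
    if commit_votes < quorum:
        return False
    # (4) Commit -> Reply: nodes that saw 2f Commits reply to the master.
    reply_votes = num_nodes - 1
    # (5) Reply -> chain: master needs 2f Replies for the block to take effect.
    return reply_votes >= quorum

print(pbft_round(4))   # True: N = 4 tolerates f = 1, quorum 2f = 2 is met
```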
Step 4, self-adaptive global aggregation based on resource allocation optimization
During distributed learning with T iteration rounds, the invention obtains the model parameters $w(T,\tau)$ by performing a global aggregation every $\tau$ iterations, and introduces the ideal loss function value $F(w^*)$, where $w^*$ denotes the ideal model parameters obtainable by training on the full data. Minimizing the achievable loss function is then equivalent to minimizing the gap $F\bigl(w(T,\tau)\bigr) - F(w^*)$, and the objective function may be initially defined as:

$$\min_{f,\,T,\,\tau}\; F\bigl(w(T,\tau)\bigr) - F(w^*) \qquad \text{s.t. C1: } F_e(f,T,\tau)\le E,\ \text{C2--C5}$$

C1 limits the total energy consumption; C2 and C3 limit the computational resources; C4 limits the training time; C5 limits the consensus time; E is the total energy provided by the system; $T_{time}$ is the time limit provided. Constraints C4 and C5 keep the training process and the consensus process synchronized.
For the energy consumption, the energy consumed by node $i$ in training phase $l$ can be expressed as $e_i^{(l)} = \gamma\, c_i^{(l)} \bigl(f_i^{(l)}\bigr)^2$ and in consensus step $s$ as $e_i^{(s)} = \gamma\, c_i^{(s)} \bigl(f_i^{(s)}\bigr)^2$, where γ is a constant related to the hardware architecture, and the indicator vector $\delta_i$ (entries 1 and 2 correspond to the local update and global aggregation, entries 3 to 7 to the five consensus steps) represents whether node $i$ ($i \in I$) participates in each process. In the present invention, $\delta_{i'} = [0,1,1,0,1,1,1]$ represents the participation of the computing node in the training and consensus processes, and $\delta_{i'' \ne i'} = [1,1,0,1,1,1,0]$ represents the participation of the local nodes. Thus the energy cost of the system is expressed as

$$F_e(f,T,\tau) = T \sum_{i \in I} \delta_i^{(1)} e_i^{(1)} + \frac{T}{\tau} \sum_{i \in I} \Bigl( \delta_i^{(2)} e_i^{(2)} + \sum_{s=1}^{5} \delta_i^{(s+2)} e_i^{(s+2)} \Bigr)$$

where $f$ denotes the overall allocation of computing resources, the first term is the energy cost generated by the local-update process, and the second term is the energy cost incurred by the global-aggregation process (including consensus).
In addition, the parameters are assumed to satisfy the following conditions:
1) $\|F_i(w) - F_i(w')\| \le \rho \|w - w'\|$, i.e. each local loss function is ρ-Lipschitz;
2) a smoothness condition on the local gradients;
3) a bound on the divergence between the local and global gradients (quantified by δ);
4) $F\bigl(w(T,\tau)\bigr) - F(w^*) \ge \varepsilon$;
5)–7) auxiliary conditions relating the learning rate η, the aggregation interval τ, and the convergence gap ε.
the objective function is set to:
due to denominatorSince the value is constant, the optimum value of T is established when equation C1 takes the equal sign. Will->Substituting, the objective function can be rewritten as:
and finally, solving by using a convex optimization function algorithm.
The simulation parameter settings, simulation results, and analysis are given below:
The simulation is carried out in MATLAB, in which the system model is established.
The invention experiments with the Boston House Price Dataset and analyzes the results of the proposed algorithm. Some parameters in the simulation are set as follows: $l = 2$, $s = 5$, $\eta = 1\times10^{-6}$, $\mu_1 = 0.1$ M cycles, $\mu_2 = 0.05$ M cycles, $\alpha = 0.2$ M cycles, $\beta = 0.8$ M cycles, $\theta = 0.005$ M cycles, $\gamma = 1\times10^{-5}$, $T_{time} = 300$ s.
FIG. 3 shows the trend of the loss function value with the total energy of the system when the number of local nodes N is 3, 4 and 5, respectively. The figure shows that the loss function value decreases as the total system energy increases and that, at the same system energy, a smaller number of local nodes yields a smaller loss function value. In addition, the smaller the number of local nodes, the less total system energy is required for the loss function value to converge.
FIG. 4 shows the trend of the loss function value with the number of local nodes when the total system energy is E = 0.5×10⁵ and E = 1.5×10⁵, respectively. The figure shows that the loss function value grows as the number of local nodes participating in the distributed machine learning process increases. Compared with the conventional algorithm (evenly allocated computing resources, τ = 10), the invention on the one hand allocates computing resources rationally by analyzing the computational cost generated by each node in the transaction process, thereby making full use of the total system energy; the smaller the total system energy (i.e. the scarcer the energy), the larger the gap between the loss function value under the conventional algorithm and under the optimization algorithm, showing that the proposed resource-allocation algorithm effectively improves performance. On the other hand, building on the optimized resource allocation, the adaptive method continuously adjusts the value of τ using the optimization parameters, so that a smaller loss function value is obtained for the same number of nodes and the same system energy.
The above embodiments are only exemplary embodiments of the present invention and are not intended to limit it; the scope of the invention is defined by the claims. Those skilled in the art may make various modifications or equivalent arrangements to the invention, and such modifications and equivalents are intended to fall within its spirit and scope.

Claims (5)

1. An adaptive distributed machine learning method based on blockchain and privacy, comprising the steps of:
step 1, establishing a distributed machine learning system model with privacy protection based on block chain
The computing party C and the participants P construct a distributed environment among nodes by means of a blockchain network; the local-update and global-aggregation processes of distributed machine learning are completed using linear regression and gradient descent, and a partially homomorphic encryption technique is introduced to protect the model parameters during training; a consensus process is introduced to verify the correctness of the model parameters; finally, the distributed nodes interact by means of the consensus process, and only ciphertext parameters of the model can be received during the interaction;
step 2, combining intelligent contracts among nodes to complete distributed consensus process
In order to ensure the credibility of the learning process and confirm the correctness of the learning parameters, a distributed consensus process is formed between the computing party and the participants, comprising the five transaction types ELW, ELP, EGW, EGP and CGP; the training parameters are transferred in the form of transactions by means of smart contracts and recorded in blocks;
step 3, performance analysis of training process and consensus process
In the system model, the participant nodes and the computing node jointly complete the local updates and global aggregations of the training process using the blockchain network, and a consensus process is introduced into the global aggregation to guarantee the correctness of the model parameters; the training process and the consensus process involve five transaction types, whose correspondence is summarized in the following table:

Transaction  Issued by          Content                                        Process
EGW          Computing party C  Ciphertext of the global parameters            Global aggregation
ELW          Participant P_i    Ciphertext of the local parameters             Local update
ELP          Participant P_i    Ciphertext of intermediate local variables     Local update / global aggregation
EGP          Computing party C  Ciphertext of global intermediate parameters   Global aggregation
CGP          Computing party C  Plaintext optimization parameters              Global aggregation
the performance of the training process and the consensus process is respectively analyzed, and the method comprises the following steps:
step 3.1 training procedure
For the training process, it consists of local updates and global aggregations and contains the five transactions driven by smart contracts; the computational cost, measured by algorithmic complexity, and the computation time correspond to the computational process of each transaction:

(1) local update: the local node $P_{i'}$ ($i' \in I$, $i' = 1, \ldots, N$) updates its local ciphertext parameters according to the global ciphertext parameters issued by the computing node and delivers them in the blockchain in the form of an ELW transaction; in the local-update step the computational complexity is $O(|w|(2|D_{i'}|+1))$, so the computational cost $c_{i'}^{(1)}$ and the computation time $t_{i'}^{(1)}$ of $P_{i'}$ are

$$c_{i'}^{(1)} = \mu_1 |w|\bigl(2|D_{i'}|+1\bigr), \qquad t_{i'}^{(1)} = \frac{c_{i'}^{(1)}}{f_{i'}^{(1)}}$$

where $f_i^{(l)}$, $l = 1, 2$, denotes the computing power obtained by node $i$ during the training process;

(2) global aggregation: first, the local node $P_{i'}$ ($i' \in I$, $i' = 1, \ldots, N$) computes, from its updated local ciphertext parameters, the intermediate parameters of the local ciphertext variables (the ELP transaction), at cost $O(|w||D_{i'}|)$; then the computing node C ($C = i'' \in I$, $i'' = N+1$) gathers the ELW transactions from the $P_{i'}$ and updates the global model parameters in the ciphertext state (the EGW transaction), at cost $O(N|w|)$; at the same time, the computing node collects the ELP transactions from the participants and updates the intermediate parameters used to compute the global model variables in the ciphertext state (the EGP transaction), at cost $O\bigl(\sum_{i'}(2|D_{i'}| + |w||D_{i'}|)\bigr)$; finally, since homomorphic encryption cannot handle ciphertext multiplication, in order to obtain the parameters ρ and δ ultimately used to optimize the model, the computing node collects the ELP transactions from the participants, decrypts the ciphertext parameters with the private key in combination with the EGP transaction, and computes the optimization parameters in the plaintext state, at cost $O(N)$; in the global-aggregation step, the computational cost $c_{i'}^{(2)}$ and computation time $t_{i'}^{(2)}$ of $P_{i'}$ are

$$c_{i'}^{(2)} = \mu_1 |w||D_{i'}|, \qquad t_{i'}^{(2)} = \frac{c_{i'}^{(2)}}{f_{i'}^{(2)}}$$

and the computational cost $c_C^{(2)}$ and computation time $t_C^{(2)}$ of C are

$$c_C^{(2)} = \mu_1 \Bigl(N|w| + \sum_{i'}\bigl(2|D_{i'}| + |w||D_{i'}|\bigr)\Bigr) + \mu_2 N, \qquad t_C^{(2)} = \frac{c_C^{(2)}}{f_C^{(2)}}$$
Step 3.2 consensus procedure
For the consensus process introduced in the global aggregation, the PBFT consensus protocol comprises five steps:
(1) Request / Pre-prepare: the computing party C acts as the master node $i'$, $i' = N+1$, verifies the signatures and MACs of all transactions in the aggregation, and packs the transactions into a new block; in this step the computational cost is $c_C^{(1)}$ and the computation time is $t_C^{(1)} = c_C^{(1)}/f_C^{(1)}$; β denotes the number of CPU cycles required by a node to generate or verify a signature, θ the number of CPU cycles required to generate or verify a MAC, and α the number of CPU cycles required for the computational verification tasks driven by the smart contract; $f_i^{(s)}$, $s = 1, \ldots, 5$, denotes the computing power of node $i$ in consensus step $s$;

(2) Pre-prepare: a local node, acting as verification node $i'' \ne i'$, $i'' = 1, \ldots, N$, receives the new block with the Pre-prepare message, first verifies the signature and MAC of the block, then verifies the signature and MAC of each transaction, and finally verifies the results according to the transaction computations in the smart contract; in this step the computational cost is $c_{i''}^{(2)}$ and the computation time is $t_{i''}^{(2)} = c_{i''}^{(2)}/f_{i''}^{(2)}$;

(3) Prepare / Commit: each node receives and checks the Prepare messages to ensure consistency with the Pre-prepare message; when 2f Prepare messages have been received from other nodes, the node sends a Commit message to all other nodes; in this step the computational cost is $c_i^{(3)}$ and the computation time is $t_i^{(3)} = c_i^{(3)}/f_i^{(3)}$;

(4) Commit / Reply: each node receives and checks the Commit messages to ensure consistency with the Prepare messages; once a node has received 2f Commit messages from other nodes, it passes a Reply message to the master node; in this step the computational cost is $c_i^{(4)}$ and the computation time is $t_i^{(4)} = c_i^{(4)}/f_i^{(4)}$;

(5) Reply / chaining: the master node receives and checks the Reply messages; when the master node has received 2f Reply messages, the new block takes effect and is added to the blockchain; in this step the computational cost is $c_C^{(5)}$ and the computation time is $t_C^{(5)} = c_C^{(5)}/f_C^{(5)}$;
Step 4, self-adaptive global aggregation based on resource allocation optimization
Model parameters obtained by performing a global aggregation every τ iterations during distributed learning with T iteration rounds are $w(T,\tau)$, and an ideal loss function value $F(w^*)$ is introduced, where $w^*$ denotes the ideal model parameters obtainable by training on the full data; minimizing the achievable loss function is equivalent to minimizing $F\bigl(w(T,\tau)\bigr) - F(w^*)$ subject to constraints C1 to C5;
the problem is solved using a convex-optimization algorithm.
2. The adaptive distributed machine learning method based on blockchain and privacy according to claim 1, wherein in the distributed network the node set consists of the participants P and the computing party C, $I = \{P, C\}$, $|I| = N+1$; the set of N participants is denoted $P = \{P_1, P_2, \ldots, P_N\}$, where $P_i$ ($i = 1, 2, \ldots, N$) denotes a participant in possession of a sub-dataset, $D_i$ denotes the sub-dataset owned by participant $P_i$, and the total dataset is $D = \{D_1, D_2, \ldots, D_N\}$; $(x_{ij}, y_{ij}) \in D_i$ is the $j$-th data sample in $D_i$; the system model comprises the three steps of issuing the global parameters, updating the local gradients, and updating the global parameters, wherein a local update occurs in every iteration, i.e. for $t = 1, 2, \ldots, T$, and a global aggregation occurs only when the iteration index satisfies $t = \Gamma\tau$, $\Gamma = 1, 2, \ldots$.
3. The blockchain and privacy-based adaptive distributed machine learning method of claim 2, wherein the homomorphic encryption-based distributed machine learning process can be described as:
1) Issuing global parameters: after each global aggregation, C issues the ciphertext parameters $[\![w_g(t)]\!]$; the participants see only the ciphertext and cannot learn $w_g(t)$, guaranteeing the privacy of the global model parameters; with $\hat{w}_i(t)$ denoting the locally updated model parameters, the interaction of local and global parameters is described as:

$$w_i(t) = \begin{cases} \hat{w}_i(t), & t \neq \Gamma\tau \\ w_g(t), & t = \Gamma\tau \end{cases}$$

2) Local parameter update: a participant completes the local update in the ciphertext state according to the homomorphic property of the homomorphic encryption algorithm; for the $i$-th participant, the local parameter update is expressed as:

$$w_{i,k}(t) = w_{i,k}(t-1) - \eta\, g_{i,k}(t-1)$$

evaluated element-wise in the ciphertext domain via the additive homomorphism, where $w_{i,k}(t)$ denotes the $k$-th element of the local model parameters $w_i(t)$, $g_{i,k}(t)$ denotes the $k$-th element of the local gradient computed by $P_i$ at iteration $t$ and is defined over the gradients of the single data samples $(x_{ij}, y_{ij})$, and $x_{ij,k}$ denotes the $k$-th element of the input vector, so that:

$$g_{i,k}(t) = \frac{1}{|D_i|} \sum_{j=1}^{|D_i|} 2\,\bigl(w_i(t)^{T} x_{ij} - y_{ij}\bigr)\, x_{ij,k}$$

3) Global parameter update: $P_i$ submits $[\![w_i(\Gamma\tau)]\!]$ after every τ local updates; C obtains the $[\![w_i(\Gamma\tau)]\!]$ and then performs the global aggregation of the local parameters in the ciphertext state according to the following formula, updating the global parameters:

$$[\![w_g(\Gamma\tau)]\!] = \sum_{i=1}^{N} \frac{|D_i|}{|D|}\,[\![w_i(\Gamma\tau)]\!]$$
4. The blockchain and privacy-based adaptive distributed machine learning method of claim 1, wherein the blockchain-based verifiable computing system model includes two processes: a distributed machine learning process and a blockchain consensus process; under the system model, N+1 nodes are considered, comprising N local nodes representing the participants and one computing node representing the computing party, the computing power of each node being denoted $f_i$ (CPU cycles per second);
in addition, $\mu_1$ denotes the average number of CPU cycles required to complete a one-step ciphertext computation and $\mu_2$ the average number required for a one-step plaintext computation; under PBFT consensus there exist at most $f = (N-1)/3$ faulty nodes, each node requires β and θ CPU cycles to generate or verify a signature and to generate or verify a MAC, respectively, and α CPU cycles are required for the computational tasks verified by driving the smart contract.
5. The blockchain and privacy-based adaptive distributed machine learning method of claim 1, wherein in the objective function that is set up, C1 limits the total energy consumption, C2 and C3 limit the computational resources, C4 limits the training time, C5 limits the consensus time, E is the total energy provided by the system, $T_{time}$ is the time limit provided, and constraints C4 and C5 keep the training process and the consensus process synchronized;
for the energy consumption, the energy consumed by node $i$ in process $v$ is expressed as $e_i^{(v)} = \gamma\, c_i^{(v)} \bigl(f_i^{(v)}\bigr)^2$ for both the training phases and the consensus steps, where γ is a constant related to the hardware architecture and the indicator vector $\delta_i$ represents whether node $i$ ($i \in I$) participates in each process $v$; $\delta_{i'} = [0,1,1,0,1,1,1]$ represents the participation of the computing node in each process $v$, and $\delta_{i'' \ne i'} = [1,1,0,1,1,1,0]$ the participation of the local nodes; the energy cost of the system is expressed as

$$F_e(f,T,\tau) = T \sum_{i \in I} \delta_i^{(1)} e_i^{(1)} + \frac{T}{\tau} \sum_{i \in I} \Bigl( \delta_i^{(2)} e_i^{(2)} + \sum_{s=1}^{5} \delta_i^{(s+2)} e_i^{(s+2)} \Bigr)$$

where $f$ denotes the overall computing resources, the first term is the energy cost generated by the local-update process, and the second term is the energy cost incurred by the global-aggregation process.
CN202110889794.1A 2021-08-04 2021-08-04 Self-adaptive distributed machine learning method based on blockchain and privacy Active CN113822758B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110889794.1A CN113822758B (en) 2021-08-04 2021-08-04 Self-adaptive distributed machine learning method based on blockchain and privacy


Publications (2)

Publication Number Publication Date
CN113822758A (en) 2021-12-21
CN113822758B (en) 2023-10-13

Family

ID=78912826

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110889794.1A Active CN113822758B (en) 2021-08-04 2021-08-04 Self-adaptive distributed machine learning method based on blockchain and privacy

Country Status (1)

Country Link
CN (1) CN113822758B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114915429B * 2022-07-19 2022-10-11 Beijing University of Posts and Telecommunications Communication perception calculation integrated network distributed credible perception method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111800274A (en) * 2020-07-03 2020-10-20 北京工业大学 Verifiable calculation energy consumption optimization method based on block chain
CN111915294A (en) * 2020-06-03 2020-11-10 东南大学 Safety, privacy protection and tradable distributed machine learning framework based on block chain technology
CN113114496A (en) * 2021-04-06 2021-07-13 北京工业大学 Block chain expandability problem solution based on fragmentation technology


Also Published As

Publication number Publication date
CN113822758A (en) 2021-12-21


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant