CN112738035B

CN112738035B - Block chain technology-based vertical federal model stealing defense method

Info

Publication number: CN112738035B
Application number: CN202011494407.6A
Authority: CN
Inventors: 黄方蕾; 谢杨洁; 汪小益; 吴琛; 胡麦芳; 张珂杰; 匡立中; 詹士潇; 邱炜伟; 张帅; 蔡亮; 李伟
Original assignee: Hangzhou Qulian Technology Co Ltd
Current assignee: Hangzhou Qulian Technology Co Ltd
Priority date: 2020-12-17
Filing date: 2020-12-17
Publication date: 2022-04-29
Anticipated expiration: 2040-12-17
Also published as: CN112738035A

Abstract

The invention discloses a block chain technology-based vertical federal model stealing defense method, which comprises the following steps: (1) in accordance withSelecting 2 block nodes from the block chain as edge terminals P according to the workload certification_AAnd P_BIs P_AAnd P_BSeparately assigning sample sets D_AAnd D_BEdge model M_AAnd M_B；(2)P_AAccording to D_ATo M_ATraining, P_BAccording to D_BTo M_BTraining, P_ASending the characteristic data generated in the training process to P_B，P_BCalculating a loss function, P, using the received characteristic data_AAnd P_BEncrypting the respective loss function masks, recording the encrypted loss function masks into an account book, and distributing the account book to an edge terminal with large workload certification for storage; (3) edge terminal pair M for keeping account book_AAnd M_BIs decrypted and aggregated to obtain M_AAnd M_BAnd returning the gradient information to P_AAnd P_BTo update the edge model network parameters. The model can be stolen by a steal attacker.

Description

Block chain technology-based vertical federal model stealing defense method

Technical Field

The invention belongs to the technical field of model safety, and particularly relates to a vertical federal model stealing defense method based on a block chain technology.

Background

Federal learning is an effective technical means for solving the problems of data islanding and privacy disclosure in the process of model training and application. In federal learning, the edge end trains local data and uploads a model to the server end, then the server end aggregates the model to obtain overall parameters, and a deep learning model is trained through local training and parameter transmission of the edge end. Federal learning is roughly divided into three categories according to different situations of data distribution: horizontal federal learning, vertical federal learning, and federal migratory learning. The horizontal federated learning refers to that under the condition that data features are overlapped more and users are overlapped less among different data sets, the data sets are segmented according to user dimensions, and the data with the same data features and not identical users is extracted for training. Longitudinal federated learning refers to that under the condition that users overlap more and data features overlap less among different data sets, the data sets are segmented according to data feature dimensions, and the data with the same users and the data features which are not identical are extracted for training. Federal transfer learning refers to the situation where users of multiple data sets have little overlap with data features, data is not segmented, but transfer learning is utilized to overcome data or tag deficiencies.

In recent years, bitcoin has been developed and applied in volts, but the blockchain technology, which is one of the underlying bitcoin technologies, has been gaining attention. In the process of forming the bitcoin, the blocks are storage units one by one, and all communication information of each block node within a certain time is recorded. The blocks are linked through a Hash algorithm, the next block contains the Hash value of the previous block, one block is connected with one block in sequence along with the expansion of information exchange, and the formed result is called a block chain.

Block chains are mainly classified into three categories: public block chains, federated block chains, and private block chains. Public blockchain means that any individual or group in the world can send a transaction, and the transaction can be effectively confirmed by the blockchain, and any person can participate in the consensus process. The united block chain is formed by a group of designated preselected nodes as bookkeepers, the generation of each block is jointly determined by all the preselected nodes, other access nodes can participate in transactions, but the billing process is not queried, and any other person can perform limited query through an API opened by the block chain. The private block chain refers to the block chain general ledger technology for accounting, and can be a company or an individual, and an accountant has a unique write right of the block chain.

Federal learning aims at realizing 'data invisible' privacy protection technology, a blockchain aims at ensuring that transaction records cannot be tampered, and a consensus algorithm and a distributed ledger technology are utilized to solve the problem of double payment in a decentralized network. The blockchain technique provides a trusted mechanism for each edge end of federal learning. Through the authorization mechanism and the identity management of the block chain, the untrusted edge end users can be integrated, and a safe and trusted cooperation mechanism is established.

Although federated learning can realize privacy protection of 'data invisibility' and improve the training efficiency of the model, because the edge ends come from different organizations or organizations and mutual trust relationship is not established between the edge ends, whether a malicious attacker exists in the edge segment cannot be determined. Secondly, federate learning carries out model aggregation at a server side, and when the server fails or privacy is revealed, serious safety problems are caused. In addition, the malicious edge terminal obtains models of other edge terminals by stealing the intermediate training result.

Disclosure of Invention

The invention provides a vertical federal model stealing defense method based on a block chain technology, which aims to establish an edge end mutual trust mechanism under a vertical federal, improve the safety of a vertical federal lower edge end model and prevent a malicious attacker from stealing the edge end model.

The technical scheme of the invention is as follows:

a vertical federal model stealing defense method based on a block chain technology comprises the following steps:

(1) selecting 2 block nodes from the block chain as edge terminals P for vertical federal learning according to workload certification_AAnd P_BAnd is an edge terminal P_AAnd P_BSeparately assigning sample sets D_AAnd D_BEdge model M_AAnd M_B；

(2) Edge terminal P_AAccording to sample set D_AFor edge model M_ATraining is performed, edge terminal P_BAccording to sample set D_BFor edge model M_BTraining is performed, edge terminal P_ASending the characteristic data generated in the training process to P_B，P_BComputing a loss function using the received characteristic data, the edge terminal P_AAnd P_BEncrypting the respective loss function masks, recording the encrypted loss function masks into an account book, and distributing the account book to an edge terminal with large workload certification for storage;

(3) edge model M taking edge terminal for keeping account book as temporary service end_AAnd M_BAfter the loss function mask is decrypted and the loss functions are aggregated, the aggregated loss functions are solved to obtain M_AAnd M_BAnd returning the gradient information to the edge terminal P_AAnd P_BTo update the edge model network parameters.

Compared with the prior art, the invention has the beneficial effects that at least:

in the vertical federation model stealing defense method based on the block chain technology, in the model training process, a computationally intensive block node is selected as an edge terminal to train an edge terminal model, and a distributed ledger is adopted to replace a server of a polymerization model. The edge terminal carries out workload verification through a consensus algorithm of the block chain, and achieves the purpose of defending model stealing attacks by utilizing the defect that a model stealing attacker lacks computing power.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.

Fig. 1 is a flowchart of a vertical federal model theft defense method based on a block chain technology according to an embodiment of the present invention;

fig. 2 is a schematic diagram of training of a vertical federal model provided in an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the detailed description and specific examples, while indicating the scope of the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.

Aiming at the model safety problem under the federal scene, the block chain technology is adopted for decentralization, the problem caused by the fault or privacy leakage of the server side is prevented, meanwhile, the block chain accounting technology is adopted, and the edge side user selects the packing nodes through calculation competition to resist the malicious model stealing attackers.

Fig. 1 is a flowchart of a vertical federal model theft defense method based on a block chain technique according to an embodiment of the present invention. As shown in fig. 1, the method for protecting against theft under a vertical federal model based on a block chain technology provided in an embodiment includes the following steps:

step 1, selecting 2 edge terminals P from the block chain as the edge terminals for vertical federal learning according to the workload certification_AAnd P_B。

In an embodiment, the edge terminals are selected by competing with the power of the block nodes in the block chain. In the block chain technology, each block node is used for jointly solving the SHA-256 security coding algorithm problem based on respective computer computing power mutual competition, and the accounting right of the next block is obtained through workload certification.

The SHA-256 encoding algorithm is essentially a hash function, a technique to create a small digital "fingerprint" from any kind of data. The SHA-256 compresses the information data into a summary, so that the data volume is reduced, and the format of the data is fixed. The function mixes the data in a scrambling mode and recreates a fingerprint called a hash value. For any length of message, SHA-256 will generate a 256-bit long hash value, called a message digest. The block nodes need to be solved through continuous operation, and only the block nodes with considerable calculation can obtain the accounting right.

The block determines the input data, using the block header as a workload proof input data. The chunk header has 80 bytes including version, parent chunk hash value, Merkle root, timestamp, difficulty target, and counter (Nonce). In the process of constructing the block, a hash value is generated by using a Merkle tree algorithm on a list of information to be contained in the block, and the hash value is used as a value of a Merkle root of the block head. The Merkle tree is a complete binary tree, which is converted into the Merkle tree by the calculation of the hash value.

The difficulty value is used for measuring the difficulty of the block workload proving algorithm, and the greater the difficulty value is, the more complex the workload algorithm is, the greater the difficulty is. The difficulty value determines how many hash operations are required to generate a valid block. Under the condition that the full network computing power is changed constantly, one block needs to be formed by maintaining for ten minutes on average, and the difficulty value needs to be adjusted according to the change of the full network computing power. The adjustment formula is as follows:

n_new＝n*(time₂₀₁₆/2016miutes)

wherein n is_newRepresenting the new difficulty value, n representing the old difficulty value, time₂₀₁₆Representing the length of time it took for 2016 blocks in the past.

In the embodiment, two block nodes with larger workload proofs are selected from the block chain as the edge terminals P according to the workload proofs of each block node_AAnd P_BFor vertical federal learning.

Step 2, edge terminal P_AAnd P_BSeparately assigning sample sets D_AAnd D_BEdge model M_AAnd M_B。

In an embodiment, the federally learned data set includes the MNIST data set, the Fashinon-MNIST and the CIFAR-10 data set. Each data set comprises a training set and a testing set, wherein the training set of the MNIST data set comprises ten types, 6000 samples of each type, ten types of the testing set and 1000 samples of each type; the fast ion-MNIST data set comprises ten types of training sets, 6000 samples of each type, ten types of testing sets and 1000 samples of each type; the training set of the CIFAR-10 data set comprises ten types, 5000 samples of each type, ten types of the test set and 1000 samples of each type.

Under a vertical federal scene, each edge terminal data has the same sample space and different feature spaces, so that image samples need to be cut to construct a sample set D_AAnd D_BThe method comprises the following specific operations:

dividing each sample in the data set into two parts to form a sample set D_AAnd sample set D_BAnd only the sample set D_BContaining a sample tag;

dividing the sample to obtain a sample set D_AAnd sample set D_BThen, the sample set D is also needed_AAnd sample set D_BIn which the partial samples derived from the same sample are aligned, i.e. the edge model M is guaranteed_AAnd edge model M_BThe partial samples of the same input are derived from the same sample.

In the embodiment, in a vertical federal scenario, user groups of different edge terminals are not completely the same, and in order to ensure that different feature spaces correspond to the same sample space, entity alignment needs to be performed on data, and it is also required to ensure that different edge terminals do not expose their own data to each other, so that an encryption-based user ID alignment technology is adopted to protect the local data privacy security of each edge terminal.

And 3, the edge terminal trains respective edge models by using respective sample sets, encrypts respective loss functions by masks, and records the loss functions into an account book.

In an embodiment, according to sample set D_AFor edge terminal P_BEdge model M of_ATraining is carried out according to a sample set D_BFor edge terminal P_BEdge model M of_BTraining is performed, edge terminal P_ASending the characteristic data generated in the training process to P_B，P_BComputing a loss function using the received characteristic data, the edge terminal P_AAnd P_BAnd the respective loss function mask is encrypted and then recorded into the account book.

Aiming at different data sets, two edge terminals are trained by using the same model structure, and for the ImageNet data set, the model pre-trained by ImageNet is used for training and setting unified hyper-parameters: using a random gradient descent (SGD), adam optimizer, learning rate of η, regularization parameter of λ, data set

Where i denotes a certain sample data, y_iThe original label representing the corresponding sample,

and

the feature spaces respectively representing data, and the model parameters related to the feature spaces are represented by theta_AAnd Θ_BThe model training target is expressed as:

in particular, according to the sample set D_AFor edge model M_AIn training, the edge model M_ALoss function Loss of_AComprises the following steps:

wherein, theta_ARepresenting an edge model M_AThe model parameters of (a) are determined,

represents the ith sample belonging to the sample set A, | · | | non-calculation²Representing the square of the norm of L1.

According to sample set D_BFor edge model M_BIn training, the edge model M_BTotal Loss function Loss of_sumComprises the following steps:

loss_sum＝loss_B+loss_AB

therein, loss_BRepresenting an edge model M_BLoss of_ABDenotes the common loss, Θ_BRepresenting an edge model M_BThe model parameters of (a) are determined,

denotes the i-th sample, y, belonging to the sample set B_iTo represent

Corresponding label, | · | | non-conducting phosphor²Represents the square of the norm of L1 and i represents the sample index.

And 4, distributing the account book according to the workload certificate, namely distributing the account book to the edge terminal with large workload certificate for storage.

In the embodiment, a block chain accounting technology is adopted, the training loss function of the edge terminal is distributed to the edge terminal with strong calculation capacity for storage, the problem caused by the fault or privacy disclosure of the third-party server can be prevented, and decentralization is realized.

Step 5, the edge terminal for keeping accounts is used as a temporary service end to the edge terminal P_AAnd P_BAfter the uploaded loss function mask is decrypted, the aggregation loss function obtains gradient informationAnd returns to the edge terminal P_AAnd P_BTo update the edge model network parameters.

In the embodiment, the edge terminal for keeping accounts is used as a temporary server to the edge model M_AAnd M_BAfter the loss function mask is decrypted and the loss functions are aggregated, the aggregated loss functions are solved to obtain M_AAnd M_BAnd returning the gradient information to the edge terminal P_AAnd P_BTo update the edge model network parameters. Specifically, the temporary server side adopts random gradient descent to solve gradient information of the aggregated loss function. The Loss function Loss of the temporary server aggregation is as follows:

Loss＝loss_B++loss_AB+loss_A

M_Aand M_BRespectively is

And

edge terminal P_AAnd P_BAfter gradient information returned by the temporary server is received, updating M of each edge model according to the gradient information_AAnd M_BBased on the updated new network parameters, the training is resumed.

According to the vertical federal model stealing defense method based on the block chain technology, the block chain technology is adopted for decentralization, the problem caused by failure or privacy disclosure of a server end is prevented, meanwhile, the block chain accounting technology is adopted, and edge terminal users select packing nodes through calculation competition so as to resist malicious model stealing attackers.

The above-mentioned embodiments are intended to illustrate the technical solutions and advantages of the present invention, and it should be understood that the above-mentioned embodiments are only the most preferred embodiments of the present invention, and are not intended to limit the present invention, and any modifications, additions, equivalents, etc. made within the scope of the principles of the present invention should be included in the scope of the present invention.

Claims

1. A vertical federal model stealing defense method based on a block chain technology is characterized by comprising the following steps:

(1) selecting 2 block nodes with large workload proofness from the block chain according to the workload proofness as edge terminals P for vertical federal learning_AAnd P_BAnd is an edge terminal P_AAnd P_BSeparately assigning sample sets D_AAnd D_BEdge model M_AAnd M_B；

(3) edge terminal for keeping account book as temporary service end to edge terminal P_AAnd P_BAfter the uploaded loss function mask is decrypted and the loss functions are aggregated, the aggregated loss functions are solved to obtain an edge model M_AAnd M_BAnd returning the gradient information to the edge terminal P_AAnd P_BTo update the edge model network parameters.

2. The method for vertical federal model theft defense based on blockchain technology as claimed in claim 1, wherein the assigned sample set D is_AAnd D_BThe method comprises the following steps:

dividing each sample in the data set into two parts to form a sample set D_AAnd sample set D_BAnd only the sample set D_BContaining the sample tag.

3. As claimed in claim 2The vertical federal model stealing defense method based on the block chain technology is characterized in that samples are divided to obtain a sample set D_AAnd sample set D_BThen, the sample set D is also needed_AAnd sample set D_BIn which the partial samples derived from the same sample are aligned, i.e. the edge model M is guaranteed_AAnd edge model M_BThe partial samples of the same input are derived from the same sample.

4. The method for vertical federal model theft defense based on block chain technology as claimed in claim 1, wherein the edge terminal P is a terminal P_AAccording to sample set D_AFor edge model M_AIn training, the edge model M_ALoss function Loss of_AComprises the following steps:

5. The method for vertical federal model theft defense based on blockchain technology as claimed in claim 1, wherein the edge terminal P is a peer terminal P_BAccording to sample set D_BFor edge model M_BTraining, edge model M_BTotal Loss function Loss of_sumComprises the following steps:

loss_sum＝loss_B+loss_AB

denotes the i-th sample, y, belonging to the sample set B_iTo represent

6. The method for vertical federal model theft defense based on the block chain technology as claimed in claim 1, wherein the Loss function Loss of temporary server aggregation is:

Loss＝loss_B++loss_AB+loss_A

M_Aand M_BRespectively is

And

7. the method for vertical federal model theft defense based on blockchain technology as claimed in claim 1, wherein the edge terminal P is a peer terminal P_AAnd P_BAfter gradient information returned by the temporary server is received, updating M of each edge model according to the gradient information_AAnd M_BBased on the updated new network parameters, the training is resumed.

8. The vertical federation model stealing prevention method based on the blockchain technology of claim 1, wherein the temporary server uses stochastic gradient descent to solve gradient information of the aggregated loss function.