CN114492828A - Blockchain-based vertical federated learning malicious node detection and reinforcement method and application

Publication number
CN114492828A
Authority
CN
China
Prior art keywords
participant
model
federated learning
node
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111489023.XA
Other languages
Chinese (zh)
Inventor
张延楠
尚璇
张帅
谢逸俊
李伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Qulian Technology Co Ltd
Original Assignee
Hangzhou Qulian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Qulian Technology Co Ltd
Priority to CN202111489023.XA
Publication of CN114492828A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes

Abstract

The invention discloses a blockchain-based method for detecting malicious nodes in vertical federated learning and reinforcing the learning system, together with an application of the method. The method comprises the following steps: establishing a smart contract on the blockchain for the participants in vertical federated learning; carrying out vertical federated learning between the participants and the active participant that initiated it; having the active computing node corresponding to the active participant perform inference verification on the participants' intermediate calculation results using the central server model, and calculating the cumulative contribution of each participant from the inference verification results; establishing a verification committee that generates adversarial sample data from the intermediate calculation results with a high cumulative contribution, constructs a sample detection model, and uses the sample detection model to detect malicious nodes among the participants; and having the active computing node adversarially train the central server model with the adversarial sample data generated by the verification committee, so as to reinforce the central server model against adversarial attacks. The method can detect malicious nodes and reinforce the vertical federated learning system.

Description

Blockchain-based vertical federated learning malicious node detection and reinforcement method and application
Technical Field
The invention belongs to the field of federated learning, and particularly relates to a blockchain-based vertical federated learning malicious node detection and reinforcement method and an application thereof.
Background
In recent years, with the development of artificial intelligence, deep learning has attracted attention in fields such as image processing, natural language processing, and graph representation learning, and more manpower and material resources are being invested in practical applications based on deep learning. Deep learning is an important component of artificial intelligence technology, but its excellent performance often depends on a large amount of high-quality training data. In practice, however, the data owned by a single small enterprise often cannot support the training of a large deep model, while the data owned by large companies is difficult to share openly because of privacy and commercial interests, so data islands form. To break this isolation, the concept of federated learning was proposed: multiple participants use their local data to jointly train and maintain a global model by exchanging intermediate calculation results or model parameters.
To protect the participants' data privacy, technologies such as secure multi-party computation, differential privacy, and homomorphic encryption are introduced into the communication process of federated learning, so that a high-quality global model can be trained jointly without the data leaving its local site. However, like ordinary deep learning models, federated learning also faces security risks; in particular, it may face adversarial attacks in the inference phase.
Vertical federated learning is more susceptible to such attacks than horizontal federated learning, because a vertical federated learning system relies on the computed data provided by every participant during the inference phase, and all participants obtain the same inference result. If the central server in the vertical federated learning system is attacked by a malicious participant, the other participants will be led to wrong inference results. In a real scenario such as a recommendation system for social network platforms, different platforms analyze user relationships and attributes with local models and upload intermediate calculation results to jointly train the central server model of the vertical federated system, which is a precondition for a high-performance recommendation algorithm. If an unfairly competing platform, acting as a malicious node, applies carefully designed perturbations to tamper with the connection relationships or attributes of users in the original test data (that is, mounts an adversarial attack), it can cause the central server model to infer incorrectly. The recommendation performance of the central server model then degrades and the services of benign platforms suffer, while the malicious node may profit from the unfair competition. How to detect malicious nodes is therefore a key problem in improving the security and usability of vertical federated learning systems.
Blockchain technology provides a shared distributed ledger and database that is tamper-resistant, publicly transparent, and collectively maintained. It establishes a reliable cooperation mechanism in which all participants authenticate data through a consensus mechanism, which effectively reduces the risk of data tampering. Blockchain technology can therefore compensate for the lack of trust between the parties in a federated learning system, and is a good complement to federated learning.
Disclosure of Invention
In view of the foregoing, a first object of the present invention is to provide a blockchain-based method for detecting malicious nodes in vertical federated learning and reinforcing the vertical federated learning system, such that the resulting central server model can resist adversarial sample attacks.
To achieve the first object, an embodiment provides a blockchain-based vertical federated learning malicious node detection and reinforcement method, comprising the following steps:
step 1, establishing, on the blockchain, a smart contract for the participants in vertical federated learning, wherein establishing the smart contract comprises registering and authenticating the identity information of the participants, binding the participants to computing nodes, and having the participants sign the smart contract;
step 2, each participant deploys a local training task based on the signed smart contract, trains a local model with its sample data, and uploads the resulting intermediate calculation result to its corresponding computing node, which encrypts the intermediate calculation result and uploads it to a block;
step 3, the participant that initiates the vertical federated learning acts as the active participant; the active computing node corresponding to the active participant downloads all encrypted intermediate calculation results and uses them to jointly train a central server model, computes the model gradient of each participant from that participant's intermediate calculation result and uploads the gradients to the block, and each participant's computing node downloads that participant's model gradient from the block for the next round of training;
step 4, the active computing node performs inference verification on the intermediate calculation result of each participant using the central server model, and calculates the cumulative contribution of each participant from the inference verification results;
step 5, a verification committee is established; the verification committee generates adversarial sample data from the intermediate calculation results with a high cumulative contribution, constructs a sample detection model, and uses the sample detection model to detect malicious nodes among the participants;
step 6, the active computing node adversarially trains the central server model with the adversarial sample data generated by the verification committee, so as to reinforce the central server model against adversarial attacks.
In one embodiment, when the identity information of the participants is registered and authenticated, the registration information comprises the size and format of the uploaded data and the computing power of the device; each authenticated participant is assigned an identity code (ID), and the cumulative contribution of the participant is initialized to 0. When participants are bound to computing nodes, each computing node is bound to at least one participant.
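As a rough illustration of the registration and binding just described, the following Python sketch keeps an in-memory registry. The names `Participant`, `Registry`, `register`, and `bind` are hypothetical and do not come from the patent, and authentication is simply assumed to succeed; an actual system would place this logic in the smart contract on the chain.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    participant_id: int
    data_format: str                      # size/format of the uploaded data
    compute_capacity: float               # device computing power (arbitrary units)
    cumulative_contribution: float = 0.0  # initialized to 0 after authentication


@dataclass
class Registry:
    participants: dict = field(default_factory=dict)  # ID -> Participant
    bindings: dict = field(default_factory=dict)      # node name -> list of IDs
    next_id: int = 1

    def register(self, data_format: str, compute_capacity: float) -> int:
        """Record a participant (authentication assumed to succeed) and assign an ID."""
        pid = self.next_id
        self.next_id += 1
        self.participants[pid] = Participant(pid, data_format, compute_capacity)
        return pid

    def bind(self, node: str, pid: int) -> None:
        """Bind a registered participant to a computing node; one node may hold several."""
        if pid not in self.participants:
            raise KeyError(f"participant {pid} is not registered")
        self.bindings.setdefault(node, []).append(pid)


registry = Registry()
alice = registry.register("64-dim float32 vectors", 2.0)
bob = registry.register("64-dim float32 vectors", 1.0)
registry.bind("node-1", alice)
registry.bind("node-1", bob)
```

Here one computing node (`node-1`) is bound to two participants, which corresponds to the one-to-many case; a one-to-one binding would simply use a distinct node name per participant.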
In one embodiment, the participant initiating the vertical federated learning task formulates the smart contract, which specifies a digital signature encryption mechanism, model structure information, data structure information, and a contribution mechanism, wherein the data structure information comprises participant attributes and the dimensions of the intermediate calculation results.
In one embodiment, the digital signature encryption mechanism is as follows:
the participants of vertical federated learning apply a homomorphic encryption mechanism to the data: two prime numbers $p$ and $q$ of equal length are selected arbitrarily, and $n = pq$ and $\lambda = \mathrm{lcm}(p-1, q-1)$ are computed, where $\mathrm{lcm}(\cdot)$ returns the least common multiple;
a positive integer $z$ less than $n^2$ is selected at random such that $\gcd(L(z^{\lambda} \bmod n^2), n) = 1$, where
$$L(u) = \frac{u - 1}{n},$$
$\bmod$ is the remainder operation, $\gcd(\cdot)$ returns the greatest common divisor, and $\mu = \left(L(z^{\lambda} \bmod n^2)\right)^{-1} \bmod n$;
$(n, z)$ is the public key and $(\lambda, \mu)$ is the private key.
In one embodiment, the computing node encrypts the intermediate calculation result $m$ with the public key $(n, z)$ to obtain the ciphertext $c = z^{m} r^{n} \bmod n^2$ and uploads the ciphertext to the block, where $0 < r < n$ and $r$ and $n$ are coprime.
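The key generation and encryption above follow the pattern of the Paillier cryptosystem, and a minimal Python sketch may make the arithmetic concrete. It is an illustration under stated assumptions, not the patent's implementation: the toy primes 61 and 53 are far too small for real use, the plaintext is assumed to be a single non-negative integer smaller than $n$, the extra check that $z$ is coprime to $n$ is added so that $L$ is well defined, and the `decrypt` function uses the standard Paillier decryption rule $m = L(c^{\lambda} \bmod n^2) \cdot \mu \bmod n$, which the text does not spell out.

```python
import math
import random


def L(u: int, n: int) -> int:
    """L(u) = (u - 1) / n, defined for u congruent to 1 modulo n."""
    return (u - 1) // n


def keygen(p: int, q: int, rng: random.Random):
    """Return ((n, z), (lam, mu)) for two primes p and q of equal bit length."""
    n = p * q
    n2 = n * n
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    while True:
        z = rng.randrange(2, n2)
        if math.gcd(z, n) != 1:
            continue  # assumption: require z to be invertible modulo n^2
        l_val = L(pow(z, lam, n2), n)
        if math.gcd(l_val, n) == 1:
            mu = pow(l_val, -1, n)  # modular inverse (Python 3.8+)
            return (n, z), (lam, mu)


def encrypt(public_key, m: int, rng: random.Random) -> int:
    """c = z^m * r^n mod n^2 with 0 < r < n and gcd(r, n) = 1."""
    n, z = public_key
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(z, m, n2) * pow(r, n, n2)) % n2


def decrypt(public_key, private_key, c: int) -> int:
    """Standard Paillier decryption: m = L(c^lam mod n^2) * mu mod n."""
    n, _ = public_key
    lam, mu = private_key
    return (L(pow(c, lam, n * n), n) * mu) % n


rng = random.Random(0)
public_key, private_key = keygen(61, 53, rng)  # toy primes for illustration only
c1 = encrypt(public_key, 42, rng)
c2 = encrypt(public_key, 100, rng)
```

Decrypting `c1` recovers 42, and multiplying two ciphertexts modulo $n^2$ yields an encryption of the sum of the plaintexts, which is the additive homomorphic property that makes this kind of scheme suitable for combining encrypted intermediate results.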
In one embodiment, the active computing node jointly trains the central server model as follows:
$$c_{global} = \eta_1 \cdot c_1 + \eta_2 \cdot c_2 + \dots + \eta_k \cdot c_k$$
where $\eta_1, \dots, \eta_k$ are adaptive weights, the symbol $\cdot$ denotes element-wise multiplication, $c_1, \dots, c_k$ are the encrypted intermediate calculation results uploaded by the participants, and $c_{global}$, the weighted sum of the calculation results of all participants' data, is used as the input to the central server model.
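The weighted sum can be sketched in a few lines of Python. This is only an illustration on plaintext floating-point vectors: the function name `aggregate` is hypothetical, each weight $\eta_i$ is assumed to be a vector of the same length as $c_i$ (consistent with element-wise multiplication), and how the weights are adapted is not covered because the text does not specify it.

```python
def aggregate(weights, results):
    """Return c_global = sum_i (eta_i * c_i), with * taken element-wise."""
    if not results or len(weights) != len(results):
        raise ValueError("need one weight vector per intermediate result")
    dim = len(results[0])
    c_global = [0.0] * dim
    for eta, c in zip(weights, results):
        if len(eta) != dim or len(c) != dim:
            raise ValueError("all vectors must share the same dimension")
        for d in range(dim):
            c_global[d] += eta[d] * c[d]
    return c_global


# Two participants, two-dimensional intermediate results (made-up numbers).
c_global = aggregate(
    weights=[[0.5, 0.5], [1.0, 2.0]],
    results=[[2.0, 4.0], [1.0, 1.0]],
)
```

With these made-up inputs the result is `[2.0, 4.0]`: the first component is 0.5·2.0 + 1.0·1.0 and the second is 0.5·4.0 + 2.0·1.0.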
In one embodiment, the active computing node computes the cumulative contribution of each participant according to the inference verification result as:
Figure RE-GDA0003527265410000042
Figure RE-GDA0003527265410000043
Figure RE-GDA0003527265410000044
where $i$ and $j$ are participant indices, $acc_i$ is the accuracy of the central server model trained on the intermediate calculation result of participant $i$ alone, $m$ is the total number of participants, $\alpha$ is a contribution scale factor, and $x$ is the normalized accuracy of participant $i$ after scaling by the scale factor in the $l$-th round of training, namely
Figure RE-GDA0003527265410000051
$C_i^l$ is the contribution of participant $i$ in the $l$-th round of training, and
Figure RE-GDA0003527265410000052
is the cumulative contribution of participant $i$ in the $(l-1)$-th round of training.
In one embodiment, the verification committee comprises a verification computing node and an adversarial sample generator bound to it. The verification computing node downloads the intermediate calculation result with the highest cumulative contribution from the block and passes it to the adversarial sample generator; the adversarial sample generator takes the received intermediate calculation result as positive sample data, constructs adversarial sample data from the positive sample data by adding perturbations with a generative adversarial network, and uploads the adversarial sample data to the verification computing node;
the verification computing node trains, on the positive sample data and the adversarial sample data, a sample detection model that determines whether sample data is normal, and then uses the sample detection model to check the intermediate calculation results of the participants; when an intermediate calculation result is detected as an adversarial sample, the participant that produced it is treated as a malicious node.
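The selection and flagging logic of the verification committee might be sketched as follows. Everything here is illustrative: the function names are hypothetical, the numbers are made up, and `toy_detector` is only a stand-in that flags vectors with a large Euclidean norm, whereas the method described above trains a dedicated sample detection model.

```python
import math


def top_contributor(cumulative):
    """Return the ID of the participant with the highest cumulative contribution."""
    return max(cumulative, key=cumulative.get)


def flag_malicious(results_by_id, is_adversarial):
    """Return the sorted IDs whose intermediate result the detector marks as adversarial."""
    return sorted(pid for pid, result in results_by_id.items() if is_adversarial(result))


def toy_detector(result, limit=3.0):
    """Hypothetical stand-in detector: flags vectors with Euclidean norm above `limit`."""
    return math.sqrt(sum(v * v for v in result)) > limit


cumulative = {1: 0.9, 2: 1.4, 3: 0.2}                     # made-up contributions
results = {1: [0.5, 1.0], 2: [1.0, 1.0], 3: [4.0, 3.0]}   # made-up intermediate results

positive_source = top_contributor(cumulative)   # supplies the positive sample data
malicious = flag_malicious(results, toy_detector)
```

In this toy run participant 2 supplies the positive sample data, and participant 3 is flagged because its vector has norm 5.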
In one embodiment, the verification computing node of the verification committee uploads the adversarial sample data to an adversarial sample database in the block; the active computing node downloads adversarial sample data from that database and, combining it with the intermediate calculation results of the participants, adversarially trains the central server model to reinforce it against adversarial attacks.
A second object of the invention is to provide a method for constructing an interest recommendation model based on vertical federated learning, in the field of social network data mining. The interest recommendation model is constructed with the above blockchain-based vertical federated learning malicious node detection and reinforcement method, so that it can defend against adversarial attacks on the social network, the security of the participants' local social networks is ensured, and the robustness of its recommendations is improved.
To achieve the second object, an embodiment provides a method for constructing an interest prediction model for social networks. The interest prediction model is used to predict interests and hobbies and is constructed with the above blockchain-based vertical federated learning malicious node detection and reinforcement method; that is, the sample data owned by each participant is a social network in which the nodes are bloggers and the links represent social relationships, and the bloggers' interests are the labels of the sample data. The central server model constructed by vertical federated learning on the participants' social networks and labels is used as the interest prediction model;
in application, the social network is preprocessed and input into the interest prediction model, which computes and outputs the interest prediction result.
The technical concept of the invention is as follows: to address the potential vulnerability of vertical federated learning systems to adversarial attacks, a blockchain-based vertical federated learning malicious node detection and reinforcement method, and its application in social network data mining, are established using the ledger, traceability, and database functions of blockchain technology. A cumulative contribution is recorded for each participant using the ledger function of the blockchain, and a verification committee is established; the verifier selects benign sample data according to the contributions, and the adversarial sample generator generates adversarial samples from it, so that a malicious node detector can be trained. Furthermore, the generated adversarial samples are stored as an adversarial sample database using the database function of the blockchain and provided to the active party for adversarial training, thereby reinforcing the vertical federated system.
The invention has the following beneficial effects: (1) because the ledger function of the blockchain is tamper-resistant, the active party in the vertical federated system assigns a contribution to each participant according to the training results, which gives the verification computing node a basis for selecting benign samples; (2) a sample detector is constructed from the data uploaded to the block and can effectively detect adversarial samples uploaded by malicious participants, thereby determining whether a participant is a malicious node; (3) using the database function of the blockchain, an adversarial sample database is provided for the established vertical federated system, from which the active participant can select adversarial samples for adversarial training, reinforcing the system and improving the robustness of the central server model to adversarial samples.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a blockchain-based vertical federated learning malicious node detection and reinforcement method according to an embodiment;
Fig. 2 is a schematic block diagram of a blockchain-based vertical federated learning malicious node detection and reinforcement method according to an embodiment;
Fig. 3 is a schematic diagram of the sample detection model generated by the verification committee and of the sample detection process according to an embodiment;
Fig. 4 is a diagram of the process by which the adversarial sample generator of the verification committee generates adversarial sample data according to an embodiment.
Detailed Description
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit its scope.
To address the potential vulnerability of vertical federated learning systems to adversarial attacks in fields such as social network data mining, the embodiment establishes a blockchain-based vertical federated learning malicious node detection and reinforcement method using the ledger function and tamper resistance of the blockchain. Because vertical federated learning is applied in industry on a large scale, a method that addresses the robustness of vertical federated learning systems has strong practical significance for improving the security and usability of vertical federated models. The embodiment also provides an application of the method to building an interest prediction model for social networks.
In the embodiment, the steps of the blockchain-based vertical federated learning malicious node detection and reinforcement method are explained in detail by applying the method to the construction of an interest prediction model. Fig. 1 is a flowchart of the method according to an embodiment, and Fig. 2 is a framework diagram of the method according to an embodiment. As shown in Fig. 1, the blockchain-based vertical federated learning malicious node detection and reinforcement method of the embodiment comprises the following steps:
Step 1, establishing, on the blockchain, a smart contract for the participants in vertical federated learning.
In the embodiment, each participant locally holds social network data for training the interest prediction model; each node of the social network is a blogger, the edges represent social relationships, and the bloggers' interests are the labels of the sample data. The social network data may be, for example, the BlogCatalog dataset. The active participant that initiates vertical federated learning is one of the participants and, like the others, holds social network data for training the interest prediction model.
Establishing the smart contract for the participants comprises registering and authenticating the participants' identity information, binding the participants to computing nodes, and having the participants sign the smart contract.
For identity information registration and authentication, every training participant must register and authenticate its identity; the registration information includes the size and format of the uploaded data, the computing power of the device, and so on. After authentication, each participant is assigned an identity code (ID) to facilitate verification in subsequent node detection, and the participant's cumulative contribution is initialized to $C_{total} = 0$.
For the binding of participants to computing nodes, a computing node is a node or device with a certain amount of computing power that receives the calculation results uploaded by participants, uploads them to the block, and downloads verification data. Depending on the scenario, the relationship between computing nodes and participants may be one-to-one or one-to-many.
The active participant, that is, the active party of the social network vertical federated system, initiates a joint training task whose goal is to train an interest prediction model that predicts users' interests, so that a downstream recommendation algorithm can make accurate recommendations. The smart contract is established when the joint training task is initiated. It specifies a digital signature encryption mechanism, model structure information, data structure information, and a contribution mechanism. When interests are predicted from a social network, the model structure may be based on a GCN (graph convolutional network); the data structure information includes participant attributes such as computing power and data size, as well as the dimensions of the intermediate calculation results.
In an embodiment, the digital signature encryption mechanism comprises:
the participants of vertical federated learning apply a homomorphic encryption mechanism to the data: two prime numbers $p$ and $q$ of equal length are selected arbitrarily, and $n = pq$ and $\lambda = \mathrm{lcm}(p-1, q-1)$ are computed, where $\mathrm{lcm}(\cdot)$ returns the least common multiple; a positive integer $z$ less than $n^2$ is selected at random such that $\gcd(L(z^{\lambda} \bmod n^2), n) = 1$, where
$$L(u) = \frac{u - 1}{n},$$
$\bmod$ is the remainder operation, $\gcd(\cdot)$ returns the greatest common divisor, and $\mu = \left(L(z^{\lambda} \bmod n^2)\right)^{-1} \bmod n$; $(n, z)$ is the public key and $(\lambda, \mu)$ is the private key.
Step 2, each participant deploys a local training task based on the signed smart contract, trains a local model with its sample data, and the resulting intermediate calculation result is encrypted and uploaded to the block.
In the embodiment, the participant trains its local model with local data and uploads the resulting intermediate calculation result to the computing node; the computing node encrypts the intermediate calculation result with the public key $(n, z)$, according to the digital signature encryption mechanism specified by the smart contract, and uploads the ciphertext to the block for broadcasting and recording. The plaintext $m$ is encrypted into the ciphertext $c$ as $c = z^{m} r^{n} \bmod n^2$, where $0 < r < n$ and $r$ and $n$ are coprime.
Step 3, the active participant jointly trains the central server model from the intermediate calculation results, computes the model gradient of each participant from that participant's intermediate calculation result and uploads the gradients to the block, and each participant downloads its own model gradient for the next round of training.
In an embodiment, the active computing node corresponding to the active participant downloads all encrypted intermediate calculation results from the block (including the intermediate calculation result uploaded by the active participant) and uses them to jointly train the central server model stored on the active computing node. The joint training is performed as follows:
$$c_{global} = \eta_1 \cdot c_1 + \eta_2 \cdot c_2 + \dots + \eta_k \cdot c_k$$
where $\eta_1, \dots, \eta_k$ are adaptive weights, the symbol $\cdot$ denotes element-wise multiplication, $c_1, \dots, c_k$ are the encrypted intermediate calculation results uploaded by the participants, and $c_{global}$, the weighted sum of the calculation results of all participants' data, is used as the input to the central server model.
The active computing node also computes the model gradient of each participant from the intermediate calculation result uploaded by that participant, packs the gradients by participant ID, and uploads them to the block. Each computing node downloads the corresponding model gradient and sends it to its participants, which continue training their local models; this is iterated until the central server model converges.
Step 4, the active computing node performs inference verification on the intermediate calculation result of each participant using the central server model, and calculates the cumulative contribution of each participant from the inference verification results.
Testing is performed after each training epoch. Each participant inputs its training samples into its local model to obtain an intermediate calculation result and uploads it to its computing node, which packs it and uploads it to the block. The active computing node downloads the intermediate calculation results of all participants from the block, combines them with the intermediate calculation result of the active participant, and inputs them into the central server model to obtain the inference result. The active computing node then calculates each participant's contribution from the quality of the data it uploaded, as follows:
Figure RE-GDA0003527265410000101
Figure RE-GDA0003527265410000102
where i and j represent the index of the participant, acciRepresenting the accuracy of the central server model independently trained by the intermediate calculation result of the participant i, m being the total number of the participants, alpha being a contribution scale factor, and x representing the normalized accuracy of the participant i after scaling by the scale factor during the first round of training, namely
Figure RE-GDA0003527265410000111
The contribution of the current round is added to the participant's historical contribution to give the cumulative contribution, so the cumulative contribution of participant i in the l-th round is
Figure RE-GDA0003527265410000112
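The accumulation described above can be sketched as follows. The formula images are not reproduced in this text, so the per-round normalization used here (scaled accuracy divided by the sum of all accuracies) is only an assumed stand-in and not the patent's formula; the function names and the sample accuracies are likewise hypothetical. Only the accumulation step (current contribution added to the historical contribution, starting from 0) follows the text directly.

```python
import numpy as np

def round_contribution(acc, alpha=1.0):
    """Per-round contribution of each participant.

    ASSUMPTION: the normalization alpha * acc_i / sum_j acc_j is a stand-in,
    because the original formula is only available as an image.
    """
    acc = np.asarray(acc, dtype=float)
    return alpha * acc / acc.sum()

def update_cumulative(cumulative, acc, alpha=1.0):
    """Add this round's contribution to the historical (cumulative) contribution."""
    return np.asarray(cumulative, dtype=float) + round_contribution(acc, alpha)

# Three hypothetical participants; cumulative contribution is initialized to 0.
cumulative = np.zeros(3)
for acc in ([0.9, 0.6, 0.3], [0.8, 0.7, 0.1]):  # made-up accuracies for two rounds
    cumulative = update_cumulative(cumulative, acc)
```

A participant whose uploads repeatedly yield low accuracy ends up with a low cumulative value, which is the quantity the verification committee later uses to choose positive samples.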
Step 5: A verification committee is established, which generates adversarial sample data on the basis of the intermediate calculation result with a high cumulative contribution and constructs a sample detection model.
In an embodiment, a verification computing node is introduced into the vertical federated learning system and bound to an adversarial sample generator to form the verification committee. The verification committee constructs the sample detection model as follows: the verification computing node downloads, as positive sample data, the intermediate calculation result of the participant with the highest contribution; the adversarial sample generator generates adversarial sample data from the positive sample data; and a classifier P based on a multilayer perceptron model is then constructed from the positive sample data and the adversarial sample data:
P(x) = softmax(W_k · ρ(... ρ(W_1 · x)))
where x is the sample input to the detector, ρ is the nonlinear activation function (the ReLU function is used here), and W_1, ..., W_k are the weight matrices of the layers of the detector model.
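As an illustration, the forward pass of such a detector can be sketched with NumPy. The layer sizes, the random weights, and the function names are assumptions made for the example; the text only fixes the structure (ReLU hidden layers followed by a softmax output).

```python
import numpy as np

def relu(a):
    return np.maximum(a, 0.0)

def softmax(a):
    # Subtract the row maximum for numerical stability.
    a = a - a.max(axis=-1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=-1, keepdims=True)

def detector_forward(x, weights):
    """P(x) = softmax(W_k · ρ(... ρ(W_1 · x))) for a batch x of shape (batch, features).

    Each W has shape (out_features, in_features); no bias terms, as in the formula.
    """
    h = x
    for W in weights[:-1]:
        h = relu(h @ W.T)
    return softmax(h @ weights[-1].T)

rng = np.random.default_rng(0)
# Hypothetical sizes: 8-dimensional intermediate results, two hidden layers, 2 classes.
weights = [0.1 * rng.normal(size=(16, 8)),
           0.1 * rng.normal(size=(16, 16)),
           0.1 * rng.normal(size=(2, 16))]
x = rng.normal(size=(4, 8))
probs = detector_forward(x, weights)  # each row: [P(real), P(adversarial)]
```

Each output row is a probability distribution over the two classes, so a sample can be flagged by comparing the second column with a threshold.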
The adversarial sample generator constructs adversarial sample data from the positive sample data by means of a generative adversarial network. Specifically, the intermediate calculation results and label information that the verification computing node downloads from the block form the original data set S_ori, and a generative adversarial network is constructed. The generative adversarial network has two parts, an attack generator network G and a discriminator network D. The adversarial samples are generated as follows:
A k-dimensional Gaussian noise vector is randomly sampled from a Gaussian distribution and input into the attack generator, which maps the noise to a perturbation of the same size as the original data; the perturbation is added to the current positive sample data to obtain the adversarial sample data. The generated adversarial sample data is fed into the discriminator, and the cross entropy between the predicted class probability and the positive sample label is used as the loss function of the generator, so that the generated adversarial samples become more similar to the positive samples and confuse the discriminator. The loss function is:
Figure RE-GDA0003527265410000121
where Data_ori is the original data set, θ_D denotes the discriminator network parameters, θ_G denotes the attack generator network parameters, and
Figure RE-GDA0003527265410000124
denotes the expectation over the distribution indicated by its subscript.
The role of the discriminator is to distinguish, as far as possible, the adversarial samples generated by the attack generator from the positive samples. The samples are the input of the discriminator, and the cross entropy between the predicted class probability and the sample label (0 for original samples, 1 for adversarial samples) is the loss function of the discriminator, which makes it robust to the adversarial samples produced by the generator. The loss function is:
Figure RE-GDA0003527265410000122
where Data_adv is the set of adversarial samples generated by the attack generator and θ_D denotes the discriminator network parameters.
The attack generator and the discriminator are trained until the loss function of the attack generator falls to a low value. At that point the adversarial samples generated by the attack generator strongly confuse the discriminator, and this adversarial sample data is used as the benchmark adversarial data set S_adv.
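The generation step and the two cross-entropy losses can be sketched as below. This is a minimal illustration under stated assumptions: the generator is a single linear map followed by tanh, the discriminator is a logistic regression, and the perturbation bound `scale`, the dimensions, and all names are hypothetical. The exact loss expressions of the patent are only available as images, so the binary cross entropy used here follows the verbal description (positive label 0, adversarial label 1) rather than those formulas, and no parameter update is shown.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def bce(p, label, eps=1e-12):
    """Mean binary cross entropy between probabilities p and a constant label."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(np.mean(-(label * np.log(p) + (1.0 - label) * np.log(1.0 - p))))

def generate(x_pos, W_g, rng, scale=0.1):
    """Map k-dimensional Gaussian noise to a perturbation and add it to the positive samples."""
    z = rng.normal(size=(x_pos.shape[0], W_g.shape[0]))  # Gaussian noise, k = W_g.shape[0]
    perturbation = scale * np.tanh(z @ W_g)              # same shape as x_pos, bounded by scale
    return x_pos + perturbation

def discriminate(x, w_d, b_d):
    """Probability that a sample is adversarial (label 1)."""
    return sigmoid(x @ w_d + b_d)

rng = np.random.default_rng(0)
d, k = 8, 4                                # hypothetical feature and noise dimensions
x_ori = rng.normal(size=(32, d))           # stands in for the positive samples in S_ori
W_g = 0.1 * rng.normal(size=(k, d))        # generator parameters (theta_G)
w_d, b_d = 0.1 * rng.normal(size=d), 0.0   # discriminator parameters (theta_D)

x_adv = generate(x_ori, W_g, rng)
# Generator: wants its samples to be classified like positive samples (label 0).
loss_g = bce(discriminate(x_adv, w_d, b_d), 0.0)
# Discriminator: original samples labelled 0, adversarial samples labelled 1.
loss_d = bce(discriminate(x_ori, w_d, b_d), 0.0) + bce(discriminate(x_adv, w_d, b_d), 1.0)
```

In a full implementation the two losses would be minimized alternately with respect to θ_G and θ_D until the generator loss is low, as the paragraph above describes.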
The original data set S_ori and the benchmark adversarial data set S_adv are combined into S_L to train the sample detection model. The training objective is defined on the model's predicted probability that a sample is adversarial and the sample's label (1 for adversarial samples, 0 for real samples):
Figure RE-GDA0003527265410000123
where Y is the true label and Y' is the label predicted by the model. In addition, the verification computing node packages the adversarial data set S_adv and uploads it to the block, uses the database function of the blockchain to establish an adversarial sample database, and updates that database in real time as the attack generator is updated, which increases the diversity of the samples it holds.
Step 6: Malicious node detection is performed on the participants using the sample detection model.
In the inference stage of the model, the active participant initiates an inference task, and all participants upload their calculation results to the block through their computing nodes. The verification computing node downloads from the block the intermediate calculation results uploaded by all participants and uses them as input to the sample detection model, whose output indicates whether each sample is an adversarial sample; this result is uploaded to the block. If the sample detection model judges a sample to be adversarial, the participant that uploaded it is treated as a malicious node and receives no incentive.
Step 7: The active computing node adversarially trains the central server model using the adversarial sample data generated by the verification committee, reinforcing the central server model against adversarial attacks.
In the training stage of the model, the active participant can download adversarial samples from the adversarial sample database in the block in real time and add them to the training set as part of the training samples of the central server model for adversarial training, which enhances the robustness of the central server model to adversarial attacks.
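The detection decision in this stage can be sketched as a simple thresholding step over the detector's outputs. The participant IDs, the probabilities, the 0.5 threshold, and the unit reward are all hypothetical values chosen for the example; the text specifies only that a participant whose upload is judged adversarial receives no incentive.

```python
def flag_malicious(adv_probability, threshold=0.5):
    """Return the IDs of participants whose uploaded result is judged adversarial."""
    return {pid for pid, p in adv_probability.items() if p >= threshold}

def settle_incentives(base_reward, malicious):
    """Withhold the incentive from flagged participants; others keep their reward."""
    return {pid: (0.0 if pid in malicious else reward)
            for pid, reward in base_reward.items()}

# Hypothetical detector outputs: probability that each upload is adversarial.
adv_probability = {"A": 0.03, "B": 0.91, "C": 0.12}
base_reward = {"A": 1.0, "B": 1.0, "C": 1.0}

malicious = flag_malicious(adv_probability)
rewards = settle_incentives(base_reward, malicious)
```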
In this embodiment, the tamper-resistance and ledger properties of blockchain technology allow the historical contribution of each participant in the federated learning system to be recorded, providing a basis for detecting malicious participants. In addition, the database function of the blockchain can be used to establish an adversarial sample database, which gives the active participant a source of adversarial samples for adversarial training of the central server model. A malicious node detection and reinforcement method for vertical federated learning systems is thereby established, improving the security and usability of the vertical federated learning model.
The above embodiments illustrate the technical solutions and advantages of the present invention. It should be understood that they are only the most preferred embodiments of the present invention and are not intended to limit it; any modification, addition, or equivalent substitution made within the scope of the principles of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A blockchain-based vertical federated learning malicious node detection and reinforcement method, characterized by comprising the following steps:
step 1, establishing, on a blockchain, a smart contract for the participants in vertical federated learning, wherein the smart contract covers registration and authentication of the participants' identity information, binding of the participants to computing nodes, and signing by the participants;
step 2, each participant deploying a local training task based on the signed smart contract, training a local model using sample data, and uploading the resulting intermediate calculation result to its corresponding computing node, which encrypts the intermediate calculation result and uploads it to a block;
step 3, taking the participant that initiates vertical federated learning as the active participant, wherein the active computing node corresponding to the active participant downloads all the encrypted intermediate calculation results and uses them to jointly train a central server model, computes the model gradient of each participant from each intermediate calculation result and uploads the model gradients to the block, and each corresponding computing node downloads its participant's model gradient from the block for the next round of training;
step 4, the active computing node performing inference verification on the intermediate calculation result of each participant using the central server model, and calculating the cumulative contribution of each participant from the inference verification result;
step 5, establishing a verification committee, the verification committee generating adversarial sample data on the basis of the intermediate calculation result with a high cumulative contribution, constructing a sample detection model, and performing malicious node detection on the participants using the sample detection model;
step 6, the active computing node adversarially training the central server model using the adversarial sample data generated by the verification committee, so as to reinforce the central server model against adversarial attacks.
2. The blockchain-based vertical federated learning malicious node detection and reinforcement method according to claim 1, characterized in that, in step 1, when the identity information of the participants is registered and authenticated, the registration information includes the size and format of the uploaded data and the computing power of the device, each authenticated participant is assigned an identity code ID, and the cumulative contribution of the participant is initialized to 0;
when the participants are bound to the computing nodes, each computing node is bound to at least one participant.
3. The blockchain-based vertical federated learning malicious node detection and reinforcement method according to claim 1, characterized in that, in step 1, the participant initiating the vertical federated learning task draws up the smart contract, and the terms of the smart contract include a digital signature encryption mechanism, model structure information, data structure information, and a contribution mechanism, wherein the data structure information includes participant attributes and the dimensions of the intermediate calculation results.
4. The blockchain-based vertical federated learning malicious node detection and reinforcement method according to claim 3, characterized in that the digital signature encryption mechanism is:
the participants of vertical federated learning use a homomorphic encryption mechanism for data: two prime numbers p and q of equal length are chosen arbitrarily, and n = pq and λ = lcm(p − 1, q − 1) are computed, where lcm(·) is the least common multiple function;
a positive integer z less than n^2 is selected at random such that gcd(L(z^λ mod n^2), n) = 1, where
Figure RE-FDA0003527265400000021
mod is the remainder operation, gcd(·) is the greatest common divisor function, and μ = (L(z^λ mod n^2))^(-1) mod n;
(n, z) is the public key and (λ, μ) is the private key.
5. The blockchain-based vertical federated learning malicious node detection and reinforcement method according to claim 1, characterized in that, in step 2, the computing node encrypts the intermediate calculation result with the public key (n, z) to obtain the ciphertext c = z^m · r^n mod n^2 and uploads the ciphertext to the block, where 0 < r < n and r and n are relatively prime.
6. The blockchain-based vertical federated learning malicious node detection and reinforcement method according to claim 1, characterized in that, in step 3, the active computing node jointly trains the central server model by:
c_global = η_1 · c_1 + η_2 · c_2 + ... + η_k · c_k
where η_1, ..., η_k are adaptive weights, the symbol · denotes element-wise multiplication, c_1, ..., c_k are the encrypted intermediate calculation results uploaded by the participants, and c_global, the weighted sum of the calculation results of all participants' data, is used as input to the central server model.
7. The blockchain-based vertical federated learning malicious node detection and reinforcement method according to claim 1, characterized in that, in step 4, the active computing node calculates the cumulative contribution of each participant from the inference verification result as follows:
Figure RE-FDA0003527265400000031
Figure RE-FDA0003527265400000032
Figure RE-FDA0003527265400000033
where i and j are participant indices, acc_i is the accuracy of the central server model trained solely on the intermediate calculation result of participant i, m is the total number of participants, α is the contribution scale factor, and x is the normalized accuracy of participant i after scaling by the scale factor in the l-th round of training, namely
Figure RE-FDA0003527265400000034
Figure RE-FDA0003527265400000036
denotes the contribution of participant i in the l-th round of training, and
Figure RE-FDA0003527265400000035
denotes the cumulative contribution of participant i in the (l-1)-th round of training.
8. The blockchain-based vertical federated learning malicious node detection and reinforcement method according to claim 1, characterized in that, in step 5, the verification committee comprises a verification computing node and an adversarial sample generator bound to the verification computing node; the verification computing node downloads the intermediate calculation result with the highest cumulative contribution from the block and transmits it to the adversarial sample generator; and the adversarial sample generator takes the received intermediate calculation result as positive sample data, constructs adversarial sample data from the positive sample data by adding perturbations in combination with a generative adversarial network, and uploads the adversarial sample data to the verification computing node;
the verification computing node trains, on the positive sample data and the adversarial sample data, a sample detection model for detecting whether sample data is normal, and then uses the sample detection model to examine the intermediate calculation results of the participants; when an intermediate calculation result is detected as an adversarial sample, the participant corresponding to that intermediate calculation result is taken as a malicious node.
9. The blockchain-based vertical federated learning malicious node detection and reinforcement method according to claim 1, characterized in that, in step 6, the verification computing node included in the verification committee uploads the adversarial sample data to the adversarial sample database of the block, and the active computing node downloads the adversarial sample data from the adversarial sample database of the block and combines it with the intermediate calculation results of the participants to adversarially train the central server model, so as to reinforce the central server model against adversarial attacks.
10. A method for constructing an interest and hobby prediction model applied to a social network, characterized in that the interest and hobby prediction model is used to predict interests and hobbies and is constructed by the blockchain-based vertical federated learning malicious node detection and reinforcement method according to any one of claims 1 to 9; that is, the sample data owned by each participant is a social network whose nodes are bloggers and whose edges represent social relationships, the interests and hobbies of the bloggers in the social network are the labels of the sample data, and the central server model constructed by vertical federated learning from the participants' social networks and labels is the interest and hobby prediction model;
in application, the social network is processed and input into the interest and hobby prediction model, which computes and outputs the interest and hobby prediction result.
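The key generation of claim 4 and the encryption of claim 5 match the structure of the Paillier cryptosystem, and a toy sketch under that assumption is given below. The definition L(u) = (u − 1)/n is the standard Paillier one and is assumed here because the claim gives it only as an image; the decryption rule m = L(c^λ mod n^2) · μ mod n is likewise the standard Paillier rule and is not recited in the claims. The small primes, the function names, and the integer plaintext are illustrative only; real intermediate results would first have to be encoded as integers smaller than n, and real keys would use large primes.

```python
import math
import secrets

def L(u, n):
    # Standard Paillier L function, L(u) = (u - 1) / n (assumed; given as an image in claim 4).
    return (u - 1) // n

def keygen(p, q):
    """Return ((n, z), (lam, mu)) for primes p and q of equal length."""
    n = p * q
    n2 = n * n
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    while True:
        z = secrets.randbelow(n2 - 1) + 1              # 1 <= z < n^2
        if math.gcd(z, n) != 1:
            continue                                   # z must be invertible mod n^2
        l_val = L(pow(z, lam, n2), n)
        if math.gcd(l_val, n) == 1:
            mu = pow(l_val, -1, n)                     # modular inverse (Python 3.8+)
            return (n, z), (lam, mu)

def encrypt(public_key, m):
    """c = z^m * r^n mod n^2 with 0 < r < n and gcd(r, n) = 1."""
    n, z = public_key
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(z, m, n2) * pow(r, n, n2)) % n2

def decrypt(public_key, private_key, c):
    """Standard Paillier decryption: m = L(c^lam mod n^2) * mu mod n."""
    n, _ = public_key
    lam, mu = private_key
    return (L(pow(c, lam, n * n), n) * mu) % n

# Toy primes of equal bit length; far too small for real use.
public_key, private_key = keygen(101, 113)
ciphertext = encrypt(public_key, 42)
recovered = decrypt(public_key, private_key, ciphertext)
```

Because a fresh random r is drawn for every call, encrypting the same value twice generally yields different ciphertexts, while both decrypt to the same plaintext.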
CN202111489023.XA 2021-12-08 2021-12-08 Block chain technology-based vertical federal learning malicious node detection and reinforcement method and application Pending CN114492828A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111489023.XA CN114492828A (en) 2021-12-08 2021-12-08 Block chain technology-based vertical federal learning malicious node detection and reinforcement method and application

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111489023.XA CN114492828A (en) 2021-12-08 2021-12-08 Block chain technology-based vertical federal learning malicious node detection and reinforcement method and application

Publications (1)

Publication Number Publication Date
CN114492828A true CN114492828A (en) 2022-05-13

Family

ID=81492575

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111489023.XA Pending CN114492828A (en) 2021-12-08 2021-12-08 Block chain technology-based vertical federal learning malicious node detection and reinforcement method and application

Country Status (1)

Country Link
CN (1) CN114492828A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116010944A (en) * 2023-03-24 2023-04-25 北京邮电大学 Federal computing network protection method and related equipment

Similar Documents

Publication Publication Date Title
CN108683669B (en) Data verification method and secure multi-party computing system
Leng et al. Blockchain security: A survey of techniques and research directions
Lu et al. A secure and scalable data integrity auditing scheme based on hyperledger fabric
EP3420669B1 (en) Cryptographic method and system for secure extraction of data from a blockchain
Zhu et al. A round-optimal lattice-based blind signature scheme for cloud services
CN111597590B (en) Block chain-based data integrity quick inspection method
CN112132577B (en) Multi-supervision transaction processing method and device based on block chain
CN112104609B (en) Method for verifiable privacy-aware truth discovery in mobile crowd-sourcing awareness systems
CN114363043B (en) Asynchronous federal learning method based on verifiable aggregation and differential privacy in peer-to-peer network
Nikolaenko et al. Powers-of-tau to the people: Decentralizing setup ceremonies
CN111291411A (en) Safe video anomaly detection system and method based on convolutional neural network
El Kassem et al. More efficient, provably-secure direct anonymous attestation from lattices
CN116260587A (en) Quantum-resistant signature authentication method based on hash signature and having small size
Faridi et al. Blockchain in the quantum world
CN114492828A (en) Block chain technology-based vertical federal learning malicious node detection and reinforcement method and application
Khan et al. Memristive hyperchaotic system-based complex-valued artificial neural synchronization for secured communication in Industrial Internet of Things
Blum et al. Superlight–A permissionless, light-client only blockchain with self-contained proofs and BLS signatures
Srivastava et al. Integration of quantum computing and blockchain technology: a cryptographic perspective
CN117216805A (en) Data integrity audit method suitable for resisting Bayesian and hordeolum attacks in federal learning scene
CN108900310A (en) Block chain signature processing method and block chain signature processing unit
CN115118462B (en) Data privacy protection method based on convolution enhancement chain
CN110661816A (en) Cross-domain authentication method based on block chain and electronic equipment
Zhou et al. Breaking symmetric cryptosystems using the offline distributed Grover-meets-Simon algorithm
CN115310120A (en) Robustness federated learning aggregation method based on double trapdoors homomorphic encryption
Ma et al. Public key authenticated encryption with multiple keywords search using Mamdani system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination