CN112749392A - Method and system for detecting abnormal nodes in federated learning - Google Patents

Method and system for detecting abnormal nodes in federated learning

Info

Publication number
CN112749392A
CN112749392A
Authority
CN
China
Prior art keywords
model
user
server
local
aggregation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110020440.3A
Other languages
Chinese (zh)
Other versions
CN112749392B (en)
Inventor
郭晶晶
李海洋
刘玖樽
熊良成
田思怡
马建峰
高华敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN202110020440.3A priority Critical patent/CN112749392B/en
Publication of CN112749392A publication Critical patent/CN112749392A/en
Application granted granted Critical
Publication of CN112749392B publication Critical patent/CN112749392B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F 21/55 Detecting local intrusion or implementing counter-measures
    • G06F 21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F 21/562 Static detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F 21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06N 20/20 Ensemble learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Storage Device Security (AREA)
  • Computer And Data Communications (AREA)

Abstract

A method and a system for detecting abnormal nodes in federated learning are provided. The detection method comprises the following steps: system initialization, comprising user registration, system parameter generation and key agreement; generation of masked local models; malicious user detection, in which the server aggregates the local models to generate obfuscated aggregation models, each user verifies the obfuscated aggregation models and returns a verification result, and the server detects malicious users from the users' verification results; finally, the server aggregates the masked local models uploaded by non-malicious users to obtain the global model of the current iteration round and sends it to the users, and each user updates its local model according to the received global model. The detection system consists of an aggregation server, a trust authority and a plurality of users. The invention improves the credibility and accuracy of the global model while preserving the privacy of each user, realizing safe and reliable federated learning.

Description

Method and system for detecting abnormal nodes in federated learning
Technical Field
The invention belongs to the field of cyberspace security, and particularly relates to a method and a system for detecting abnormal nodes in federated learning.
Background
Federated Learning is a machine learning paradigm proposed in recent years in which multiple users collaboratively train a model on their own data under the coordination of a server. First, each user trains a model on its local data; the user then uploads the resulting local model to the server; finally, the server aggregates the received local models under some aggregation rule to obtain a global model shared by all users. Because this paradigm keeps each user's training data from being exposed to other users and to the central server, it protects the users' data privacy, and federated learning has therefore attracted great attention and developed rapidly in academia and industry in recent years.
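To make the train-upload-aggregate loop concrete, the following is a minimal sketch of plain (unprotected) federated averaging in Python; it is illustrative only, not the invention's protocol, and all function names and toy data in it are assumptions.

```python
# Minimal sketch of the plain federated-learning loop described above
# (local training -> upload -> server aggregation). Names are illustrative.
import numpy as np

def local_train(global_model, data, labels, lr=0.1, epochs=5):
    """One user's local logistic-regression update on its own data."""
    w = global_model.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-data @ w))          # sigmoid
        grad = data.T @ (preds - labels) / len(labels)   # log-loss gradient
        w -= lr * grad
    return w

def aggregate(local_models, data_sizes):
    """Server-side aggregation rule: data-volume-weighted average."""
    weights = np.array(data_sizes, dtype=float)
    weights /= weights.sum()
    return sum(w * m for w, m in zip(weights, local_models))

# One federated round with three users holding toy data
rng = np.random.default_rng(0)
users = [(rng.normal(size=(20, 5)), rng.integers(0, 2, size=20)) for _ in range(3)]
global_model = np.zeros(5)
local_models = [local_train(global_model, X, y) for X, y in users]
global_model = aggregate(local_models, [len(y) for _, y in users])
```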
Most existing federated learning algorithms assume that all participating nodes are honest and trustworthy, but in practice this assumption rarely holds. Researchers have shown that a user's private information can be recovered by analyzing the local model it uploads, and have designed privacy-preserving aggregation rules that prevent the server from directly observing any individual local model, thereby protecting the users' data and model privacy. At the same time, such protection gives malicious users (also called Byzantine nodes) more opportunity to upload abnormal (intentional or unintentional) local models: a malicious user can submit arbitrarily generated parameters to the server as its local model, disturbing the whole federated learning process and leaving the central server with an inaccurate global model. Detecting the erroneous models uploaded by malicious users and preventing them from influencing the global model is therefore an important prerequisite for the wide deployment of federated learning.
Disclosure of Invention
The invention aims to provide a method and a system for detecting abnormal nodes in federated learning, addressing the problem that in a privacy-preserving federated learning system a malicious user can send an arbitrary local model and cause the aggregation server to produce an inaccurate global model. By eliminating the influence of the malicious models uploaded by such nodes on the learning process, the credibility and accuracy of the federated learning training result are ensured.
In order to achieve the above purpose, the invention adopts the following technical solution:
a method for detecting abnormal nodes in federated learning comprises the following steps:
step one, system initialization, including user registration, system parameter generation and key agreement;
step two, masked local model generation;
step three, malicious user detection, in which the server aggregates the local models to generate obfuscated aggregation models, the users verify the obfuscated aggregation models and return verification results, and the server detects malicious users according to the users' verification results;
and step four, the server aggregates the masked local models uploaded by non-malicious users to obtain the global model of the current iteration round and sends it to the users, and each user updates its local model according to the received global model.
As a preferred embodiment of the present invention, step one specifically comprises:
(1.1) user u_i registers with the federated learning system by first sending the data [d_i, MAC, nonce] to the trust authority TA, where d_i is user u_i's local data volume, MAC is user u_i's MAC address, and nonce is a random number;
(1.2) from the received data, the trust authority generates an identity for user u_i and sends it to user u_i and the aggregation server AS; it then generates the system parameters pp = [G, p, g, H] using the KA.param algorithm and distributes the parameters [G, p, g, H] to all users;
(1.3) user u_i generates its public/private key pair (x_{u_i}, g^{x_{u_i}}) using the KA.gen(pp) algorithm, sends the public key to the aggregation server AS, and the aggregation server AS distributes all public keys it receives to the users in the system;
(1.4) user u_i generates its shared key s_{i,j} with every other user u_j using the KA.agree algorithm.
As a preferred embodiment of the present invention, step two specifically comprises:
(2.1) user u_i uses the shared seeds s_{i,j} and ss_{i,j} negotiated with each other user u_j as seeds for the pseudorandom generator PRG and generates two sets of mask vectors m_{i,j} and mm_{i,j} according to equations (1) and (2); each mask vector has the same length as the user's local model parameter vector;
m_{i,j} = PRG(s_{i,j}),  i < j   (1)
m_{i,j} = -m_{j,i},      i > j   (2)
(mm_{i,j} is generated in the same way with seed ss_{i,j});
(2.2) user u_i performs model training on its local data to obtain the local model LM_i;
(2.3) user u_i generates the masked local models using equations (5) and (6);
MLM_{i,j} = LM_i + mm_{i,j}   (5)
MLM_i = LM_i + Σ_{j≠i} m_{i,j}   (6)
(2.4) user u_i generates a masked local model list using equation (7) and sends it to the aggregation server AS;
ULM_i = [MLM_{i,1} || … || MLM_{i,i-1} || MLM_{i,i+1} || … || MLM_{i,n} || MLM_i]   (7).
As a preferred embodiment of the present invention, step three specifically comprises:
(3.1) the server generates a masked local model matrix MMA from the lists ULM_i uploaded by all users;
MMA = [ULM_1; ULM_2; …; ULM_n]   (8)
(3.2) the server generates an aggregation model list agg_model using the masked local model matrix MMA;
(3.3) the server generates random models for model obfuscation, forming a random model list random_list, where the number of random models in the list is a multiple of the length of agg_model; the server randomly mixes agg_model and random_list to obtain the obfuscated aggregation model list garble_agg_model; the position of each model of agg_model within garble_agg_model is stored in the ordered set o_position, and finally garble_agg_model is sent to all users;
(3.4) after receiving garble_agg_model, user u_i verifies each model in garble_agg_model with its local data, records the precision of each model, records the positions of the L highest-precision models within garble_agg_model in the ordered set s_position, and sends s_position to the server as its verification result;
(3.5) after the server receives the s_position sent by user u_i, it computes:
inter_position = o_position ∩ s_position;
if the length of inter_position is greater than th, user u_i is considered non-malicious; otherwise user u_i is considered a malicious user and its local model a malicious model.
As a preferred embodiment of the present invention, step four specifically comprises:
(4.1) the server updates the global model according to the malicious user detection result; first, the server forms all trusted users into a trusted user set bs = {u_i}; models uploaded by malicious users are not adopted, and the server feeds back to each malicious user a randomly generated global model;
for the trusted participants, the trusted aggregation models are weighted-averaged, taking as weight the ratio of the mean data volume of each pair of participants to the total data volume of all trusted participants, as shown in equation (9); the server then distributes the updated global model GM_k of the current round to all trusted users;
GM_k = Σ_{(u_i,u_j)⊆bs, i<j} [((d_i + d_j)/2) / Σ_{u_l∈bs} d_l] · agg_model_{i,j}   (9)
(4.2) after receiving the global model, the user compares the precision of the global model with that of its local model and selects the more precise model as its updated local model;
(4.3) steps two to four are repeated until a convergence condition is reached.
The invention also provides a system for detecting abnormal nodes in federated learning, consisting of an aggregation server, a trust authority and a plurality of users. The trust authority initializes the system, including user registration, system parameter generation and key agreement. The aggregation server receives the masked local models uploaded by the users and aggregates them under the aggregation rule to obtain a global model for the users; it also performs malicious user detection in this process, so that erroneous models sent by malicious users cannot negatively affect the global model. Each user trains a model on its own local data to obtain a local model; in addition, each user generates masks according to the agreed mask generation rule, forms a masked local model with the generated masks, and uploads the masked local model to the aggregation server.
Preferably, the aggregation server and the trust authority are honest and trusted and do not collude with other entities in the system, nor do the users in the system collude with other entities; at least two trusted users exist in the system; malicious users in the system have no local training data and upload malicious masked local models to the aggregation server by generating arbitrary local models and adding masks; a user's state during training is either online or offline.
Compared with the prior art, the invention has the following beneficial effects: it can detect malicious nodes in a privacy-preserving federated learning system and, by eliminating the influence of the malicious local models they upload on the learning process, ensures the credibility and accuracy of the training result. The invention performs anomaly detection on the masked local models uploaded by the users while the server never obtains any user's real local model, so the privacy of each user is guaranteed, the credibility and accuracy of the global model are improved, and safe and reliable federated learning is realized.
Drawings
FIG. 1 is a block diagram of the federated learning system architecture of the present invention;
FIG. 2 is a flow chart of a detection method for abnormal nodes in federated learning according to the present invention;
FIG. 3 shows the classification accuracy on the MNIST dataset for different numbers of users and different proportions of malicious users;
FIG. 4 shows the computation overhead of the malicious user detection process for different numbers of users;
FIG. 5 shows the communication overhead of the malicious user detection process for different numbers of users, on the server side and on the user side.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples.
Referring to fig. 1, a system for detecting abnormal nodes in federated learning is composed of an Aggregation Server (AS), a Trust Authority (TA) and a plurality of Users (Users).
The tasks of the system entities are respectively described as follows:
TA: mainly responsible for system initialization, including generating the parameters required by the system, user registration, key distribution and the like.
AS: mainly responsible for receiving the masked local models uploaded by the users and aggregating them under a certain aggregation rule to obtain a global model for the users; it also performs malicious user detection in this process, so that erroneous models sent by malicious users cannot negatively affect the global model.
Users: mainly responsible for training models on their own local data to obtain local models; to protect privacy, the participants do not share their data, and a malicious user may have no local data to train on and may generate a malicious local model arbitrarily. Each user also generates masks according to the agreed mask generation rule, forms a masked local model with the generated masks, and uploads it to the aggregation server. Because malicious users (e.g., m in fig. 1) exist and upload abnormal models, the precision of the aggregated global model may drop or an erroneous global model may even be produced, which in turn makes the global model err on prediction tasks and causes losses.
Referring to fig. 2, a method for detecting an abnormal node in federated learning mainly includes the following steps:
Step one: system initialization.
This mainly comprises user registration, system parameter generation and key agreement.
Step two: masked local model generation.
This mainly comprises mask generation, local model training and masked local model generation.
Step three: malicious user detection.
This mainly comprises the server aggregating the local models to generate obfuscated aggregation models, the users verifying the obfuscated aggregation models to produce verification results, and finally the server detecting malicious users according to the users' verification results.
Step four: model update.
The server aggregates the masked local models uploaded by non-malicious users to obtain the global model of the current iteration round and sends it to the users. Each user updates its local model according to the received global model.
A privacy-preserving federated learning system based on secure multi-party computation realizes the following operations:
(1) Key generation
Key generation involves two algorithms, ka.param and ka.gen.
The algorithm KA.param(k) → pp takes a security parameter k and generates the system parameters pp = (G', g, q, H), where G' is a group of order q, g is a generator of G', and H is a hash function.
The algorithm KA.gen uses the parameters generated by KA.param to generate each user's public/private key pair. Specifically: user u first selects a random number x_u from Z_q as its private key, then computes g^{x_u} as its public key.
(2) Key agreement
The algorithm KA.agree: users u and v compute s_{u,v} = (g^{x_v})^{x_u} = (g^{x_u})^{x_v} to obtain the shared secret s_{u,v} between them; the algorithm satisfies the commutative law, so both users derive the same key.
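As an illustration, the following is a minimal sketch of the KA.param / KA.gen / KA.agree primitives as textbook Diffie-Hellman key agreement; the tiny demo modulus and the helper names are assumptions, and the only point shown is the commutativity used above.

```python
# Hedged sketch of KA.param / KA.gen / KA.agree as Diffie-Hellman key
# agreement. Not a secure implementation: real systems use large groups.
import secrets

def ka_param():
    """KA.param: return toy group parameters (p, g) for Z_p^*."""
    p = 2 ** 61 - 1   # small Mersenne prime standing in for a real group
    g = 3
    return p, g

def ka_gen(p, g):
    """KA.gen: sample a private key x and compute the public key g^x mod p."""
    x = secrets.randbelow(p - 2) + 1
    return x, pow(g, x, p)

def ka_agree(x_u, pk_v, p):
    """KA.agree: shared secret s_{u,v} = (g^{x_v})^{x_u} mod p."""
    return pow(pk_v, x_u, p)

p, g = ka_param()
x_u, pk_u = ka_gen(p, g)
x_v, pk_v = ka_gen(p, g)
assert ka_agree(x_u, pk_v, p) == ka_agree(x_v, pk_u, p)  # commutative law
```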
(3) Mask generation
The privacy protection scheme of the invention adds masks to the local models. The mask generation method is as follows:
assume all users in the system are ordered and denoted u_1, u_2, …, u_n, and user u_i's local model is denoted x_{u_i}. Any pair of users (u_i, u_j) obtains a common mask vector m_{u_i,u_j} after negotiation. If u_i uploads x_{u_i} + m_{u_i,u_j} and u_j uploads x_{u_j} - m_{u_i,u_j} to the server, then the mask m_{u_i,u_j} cancels when the models uploaded by u_i and u_j are added, so the server can obtain the aggregate of u_i and u_j for the global model without learning the real local models of u_i and u_j.
The mask is generated according to equations (1) and (2):
m_{u_i,u_j} = PRG(s_{u_i,u_j}),  i < j   (1)
m_{u_i,u_j} = -m_{u_j,u_i},      i > j   (2)
In equation (1), PRG is a pseudorandom generator seeded with s_{u_i,u_j}, the shared key of u_i and u_j. As can be seen from equation (2), m_{u_i,u_j} + m_{u_j,u_i} = 0.
After obtaining the masks, user u_i generates its masked local vector y_{u_i} according to equation (3) and uploads it to the server.
y_{u_i} = x_{u_i} + Σ_{j≠i} m_{u_i,u_j}   (3)
(4) Model secure aggregation
After the server receives the masked local vectors uploaded by all users, it performs model aggregation using equation (4) to obtain the global model update of the current iteration round; since the pairwise masks cancel, z equals the sum of the users' real local models.
z = Σ_{u∈U} y_u   (4)
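The cancellation behind equations (1) to (4) can be checked with a short sketch; the seed bookkeeping and names below are assumptions.

```python
# Sketch of equations (1)-(4): pairwise masks derived from shared seeds
# cancel in the server's sum.
import numpy as np

def prg(seed, length):
    """Stand-in for a secure PRG: expand a shared seed into a mask vector."""
    return np.random.default_rng(seed).standard_normal(length)

def masked_vector(i, x_i, seeds, length):
    """Equation (3): y_i = x_i + sum over j of m_{i,j}, m_{i,j} = -m_{j,i}."""
    y = x_i.copy()
    for j, seed in seeds[i].items():
        mask = prg(seed, length)
        y += mask if i < j else -mask   # sign convention of eqs. (1)-(2)
    return y

n, length = 3, 4
rng = np.random.default_rng(7)
x = [rng.standard_normal(length) for _ in range(n)]
seeds = {i: {} for i in range(n)}
for i in range(n):
    for j in range(i + 1, n):           # symmetric shared seed s_{i,j}
        seeds[i][j] = seeds[j][i] = int(rng.integers(1, 2 ** 32))
y = [masked_vector(i, x[i], seeds, length) for i in range(n)]
z = sum(y)                              # equation (4)
assert np.allclose(z, sum(x))           # pairwise masks cancel
```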
For the system framework shown in fig. 1, the present invention makes the following assumptions:
1. Assume there are n users in the system, denoted by the set {u_1, u_2, …, u_n}. The TA and AS are honest and reliable and do not collude with other entities in the system; nor do the users in the system collude with other entities.
2. There are at least two trusted users in the system.
3. Malicious users in the system have no local training data; a malicious user uploads a malicious masked local model to the server by generating an arbitrary (possibly well-designed) local model and adding a mask.
4. The proposed scheme does not consider the problem of users dropping offline during training, but it remains effective when user disconnections occur.
The meanings of the parameters involved in the invention are as follows:
u_i: the i-th user in the system
n: the number of users
d_i: the local data volume of user u_i
s_{i,j}, ss_{i,j}: the shared seeds negotiated by users u_i and u_j
m_{i,j}, mm_{i,j}: the mask vectors between users u_i and u_j
LM_i: the local model of user u_i
MLM_{i,j}, MLM_i: the masked local models of user u_i
ULM_i: the masked local model list uploaded by user u_i
MMA: the masked local model matrix
agg_model: the aggregation model list
random_list: the random model list used for obfuscation
garble_agg_model: the obfuscated aggregation model list
o_position: the ordered set of true-model positions kept by the server
s_position: the ordered set of highest-precision positions returned by a user
L: the number of highest-precision models a user reports
th: the detection threshold
bs: the trusted user set
GM_k: the global model of the k-th round
The method for detecting abnormal nodes in federated learning provided by the embodiment of the invention specifically comprises the following steps:
(1) Step one: system initialization.
(1.1) User u_i registers with the federated learning system by first sending the data [d_i, MAC, nonce] to the TA, where d_i is user u_i's local data volume, MAC is u_i's MAC address, and nonce is a random number;
(1.2) from the received data, the TA generates an ID for user u_i and sends it to u_i and the AS; it then generates the system parameters pp = [G, p, g, H] using KA.param and distributes the parameters [G, p, g, H] to all users.
(1.3) User u_i generates its public/private key pair (x_{u_i}, g^{x_{u_i}}) using KA.gen(pp) and sends the public key to the AS, and the AS distributes all public keys it receives to the users in the system.
(1.4) User u_i generates its shared key s_{i,j} with every other user u_j using KA.agree.
(2) Step two: masked local model generation.
(2.1) User u_i uses the shared seeds s_{i,j} and ss_{i,j} as seeds for the pseudorandom generator PRG and generates two sets of mask vectors m_{i,j} and mm_{i,j} according to equations (1) and (2); each mask vector has the same length as the user's local model parameter vector.
(2.2) User u_i performs model training on its local data to obtain the local model LM_i.
(2.3) User u_i generates the masked local models using equations (5) and (6).
MLM_{i,j} = LM_i + mm_{i,j}   (5)
MLM_i = LM_i + Σ_{j≠i} m_{i,j}   (6)
(2.4) User u_i generates a masked local model list using equation (7) and sends it to the AS.
ULM_i = [MLM_{i,1} || … || MLM_{i,i-1} || MLM_{i,i+1} || … || MLM_{i,n} || MLM_i]   (7)
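Steps (2.1) to (2.4) can be summarized in a short sketch; the reconstructed forms of equations (5) and (6) and the seed names s and ss are assumptions, since the original formula images are not reproduced in this text.

```python
# Hedged sketch of step two: masks and the masked local model list ULM_i.
# s[i][j] and ss[i][j] are the two shared seeds of the pair (u_i, u_j).
import numpy as np

def prg(seed, length):
    return np.random.default_rng(seed).standard_normal(length)

def build_ulm(i, lm_i, s, ss, n):
    """ULM_i = [MLM_{i,1} || ... || MLM_{i,n} || MLM_i]  (equation (7))."""
    length = len(lm_i)
    ulm, agg_mask = {}, np.zeros(length)
    for j in range(n):
        if j == i:
            continue
        sign = 1.0 if i < j else -1.0                 # eqs. (1)-(2) sign rule
        ulm[j] = lm_i + sign * prg(ss[i][j], length)  # MLM_{i,j}, eq. (5)
        agg_mask += sign * prg(s[i][j], length)       # accumulates m_{i,j}
    ulm[i] = lm_i + agg_mask                          # MLM_i, eq. (6)
    return ulm
```

Collecting the list ULM_i from every user then yields, row by row, the masked local model matrix MMA of equation (8).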
(3) Step three: malicious user detection. This mainly comprises server-side model aggregation, obfuscated aggregation model generation, user-side model verification and malicious user identification.
(3.1) First, the server generates a masked local model matrix MMA from the lists ULM_i uploaded by all users.
MMA = [ULM_1; ULM_2; …; ULM_n]   (8)
(3.2) The server generates an aggregation model list agg_model from MMA according to Algorithm 1. (The listing of Algorithm 1 appears as an image in the original document and is not reproduced here.)
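Since the listing of Algorithm 1 survives only as an image, the following is a hedged reconstruction under one plausible reading: each entry of agg_model is the pairwise aggregate of MLM_{i,j} and MLM_{j,i}, in which the mm masks cancel and the pair's mean model appears without exposing either local model alone.

```python
# Hedged reconstruction of Algorithm 1 (the original listing is an image).
# mma is a dict of dicts as produced by build_ulm: mma[i][j] = MLM_{i,j}.
def build_agg_model(mma, n):
    """agg_model[(i, j)] ~ (LM_i + LM_j) / 2 for every user pair i < j."""
    agg_model = {}
    for i in range(n):
        for j in range(i + 1, n):
            agg_model[(i, j)] = (mma[i][j] + mma[j][i]) / 2.0
    return agg_model
```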
(3.3) The server generates random models for model obfuscation, forming a random model list random_list, where the number of random models in the list should be a multiple of the length of agg_model.
The server randomly mixes agg_model and random_list to obtain the obfuscated aggregation model list garble_agg_model. The position of each model of agg_model within garble_agg_model is stored in the ordered set o_position, and finally garble_agg_model is sent to all users.
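A sketch of this obfuscation step, with the shuffling details as assumptions:

```python
# Sketch of step (3.3): mix the true pairwise aggregates with random decoy
# models, remember where the true ones land, and ship the shuffled list.
import random
import numpy as np

def garble(agg_model, length, multiple=3, rng=None):
    rng = rng or np.random.default_rng()
    true_models = list(agg_model.values())
    n_decoys = (multiple - 1) * len(true_models)  # total = multiple * |agg_model|
    decoys = [rng.standard_normal(length) for _ in range(n_decoys)]
    pool = [(True, m) for m in true_models] + [(False, d) for d in decoys]
    random.shuffle(pool)
    garble_agg_model = [m for _, m in pool]
    o_position = sorted(k for k, (is_true, _) in enumerate(pool) if is_true)
    return garble_agg_model, o_position
```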
(3.4) After receiving garble_agg_model, user u_i verifies each model in garble_agg_model with its local data, records the precision of each model, records the positions of the L highest-precision models within garble_agg_model in the ordered set s_position, and sends s_position to the server as its verification result.
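A sketch of the user-side verification, with a toy accuracy function standing in for real model evaluation:

```python
# Sketch of step (3.4): score every received model on local data and report
# the positions of the L most accurate ones.
import numpy as np

def verify(garble_agg_model, data, labels, L):
    def accuracy(model):
        preds = (data @ model > 0).astype(int)   # toy linear classifier
        return float((preds == labels).mean())
    scores = [accuracy(m) for m in garble_agg_model]
    s_position = sorted(int(k) for k in np.argsort(scores)[-L:])
    return s_position
```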
(3.5) After the server receives the s_position sent by user u_i, it computes:
inter_position = o_position ∩ s_position
If the length of inter_position is greater than th, user u_i is considered non-malicious; otherwise user u_i is considered a malicious user and its local model a malicious model.
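A sketch of the detection rule; the experiments report th = 0.5, so the overlap is compared below as a fraction of the reported positions, which is an assumption since the text literally compares the raw length of inter_position with th:

```python
# Sketch of the detection rule of step (3.5).
def detect(o_position, s_position, th=0.5):
    """Return True if user u_i is judged non-malicious."""
    inter_position = set(o_position) & set(s_position)
    return len(inter_position) / max(len(s_position), 1) > th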
(4) Step four: model update.
(4.1) The server updates the global model according to the malicious user detection result. First, the server forms all trusted users into a trusted user set bs = {u_i}. Models uploaded by malicious users are not adopted, and the server feeds back to each malicious user a randomly generated global model. For the trusted participants, the trusted aggregation models are weighted-averaged, taking as weight the ratio of the mean data volume of each pair of participants to the total data volume of all trusted participants, as shown in equation (9); the server then distributes the updated global model GM_k of the current round to all trusted users.
GM_k = Σ_{(u_i,u_j)⊆bs, i<j} [((d_i + d_j)/2) / Σ_{u_l∈bs} d_l] · agg_model_{i,j}   (9)
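A sketch of this trusted aggregation as reconstructed in equation (9); the final renormalization of the weights is an assumption, since the original formula image is not reproduced:

```python
# Sketch of step (4.1): weighted average of the trusted pairwise aggregates.
import numpy as np

def update_global_model(agg_model, data_volume, trusted):
    """agg_model: {(i, j): pairwise aggregate}; trusted: set of user ids."""
    total = sum(data_volume[u] for u in trusted)
    pairs = [p for p in agg_model if p[0] in trusted and p[1] in trusted]
    weights = np.array([(data_volume[i] + data_volume[j]) / 2.0 / total
                        for i, j in pairs])
    weights /= weights.sum()               # make the weights sum to one
    return sum(w * agg_model[p] for w, p in zip(weights, pairs))
```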
(4.2) After receiving the global model, the user compares the precision of the global model with that of its local model and selects the more precise model as its updated local model;
(4.3) Steps (2.1) to (4.2) are repeated until a convergence condition is reached.
The effectiveness of the invention is verified through experiments.
The experimental environment comprises a DELL T7920 workstation with an Intel 4210R CPU, 160 GB of memory and the Ubuntu 18.04 operating system. The programming environment is Python 3.6.5, TensorFlow 1.12.0, Keras 2.2.4 and mpi4py 3.0.3. All experiments use the MNIST dataset, and each user trains a logistic regression model on the data it owns.
Fig. 3 shows the classification accuracy of the proposed detection method on the MNIST dataset for different numbers of users and different proportions of malicious users, where th = 0.5 and garble_model_length = 3L. As the number of training rounds increases, the accuracy gap between the system with malicious users and the system without malicious users shrinks. With different numbers of users but the same proportion of malicious users, the model precision is almost the same. This shows that the invention can effectively detect malicious users in systems of different scales.
Fig. 4 shows the computation overhead of the malicious user detection process for different numbers of users, with th = 0.5, m = 0.4 and garble_model_length = 3L. On both the server side and the user side, the share of the whole system's computation overhead spent on malicious user detection does not change with the number of users and stays below 10%.
Fig. 5 shows the communication overhead of the malicious user detection process for different numbers of users, with th = 0.5, m = 0.4 and garble_model_length = 3L. On the server side, the communication overhead of malicious user detection remains stable across different numbers of users, while on the user side it grows linearly with the number of users. On both the server side and the user side, the share of the whole system's overhead spent on malicious user detection is below 10%.
The above embodiments are only preferred embodiments of the present invention and are not intended to limit its technical solution. It should be understood by those skilled in the art that various simple modifications and substitutions can be made without departing from the spirit and principle of the present invention, and such modifications and substitutions also fall within the protection scope of the claims.

Claims (7)

1. A method for detecting abnormal nodes in federated learning, characterized by comprising the following steps:
step one, system initialization, including user registration, system parameter generation and key agreement;
step two, masked local model generation;
step three, malicious user detection, in which the server aggregates the local models to generate obfuscated aggregation models, the users verify the obfuscated aggregation models and return verification results, and the server detects malicious users according to the users' verification results;
and step four, the server aggregates the masked local models uploaded by non-malicious users to obtain the global model of the current iteration round and sends it to the users, and each user updates its local model according to the received global model.
2. The method for detecting abnormal nodes in federated learning according to claim 1, wherein step one specifically comprises:
(1.1) user u_i registers with the federated learning system by first sending the data [d_i, MAC, nonce] to the trust authority TA, where d_i is user u_i's local data volume, MAC is user u_i's MAC address, and nonce is a random number;
(1.2) from the received data, the trust authority generates an identity for user u_i and sends it to user u_i and the aggregation server AS; it then generates the system parameters pp = [G, p, g, H] using the KA.param algorithm and distributes the parameters [G, p, g, H] to all users;
(1.3) user u_i generates its public/private key pair (x_{u_i}, g^{x_{u_i}}) using the KA.gen(pp) algorithm, sends the public key to the aggregation server AS, and the aggregation server AS distributes all public keys it receives to the users in the system;
(1.4) user u_i generates its shared key s_{i,j} with every other user u_j using the KA.agree algorithm.
3. The method for detecting abnormal nodes in federated learning according to claim 1, wherein step two specifically comprises:
(2.1) user u_i uses the shared seeds s_{i,j} and ss_{i,j} as seeds for the pseudorandom generator PRG and generates two sets of mask vectors m_{i,j} and mm_{i,j} according to equations (1) and (2); each mask vector has the same length as the user's local model parameter vector;
m_{i,j} = PRG(s_{i,j}),  i < j   (1)
m_{i,j} = -m_{j,i},      i > j   (2)
(2.2) user u_i performs model training on its local data to obtain the local model LM_i;
(2.3) user u_i generates the masked local models using equations (5) and (6);
MLM_{i,j} = LM_i + mm_{i,j}   (5)
MLM_i = LM_i + Σ_{j≠i} m_{i,j}   (6)
(2.4) user u_i generates a masked local model list using equation (7) and sends it to the aggregation server AS;
ULM_i = [MLM_{i,1} || … || MLM_{i,i-1} || MLM_{i,i+1} || … || MLM_{i,n} || MLM_i]   (7).
4. The method for detecting abnormal nodes in federated learning according to claim 1, wherein step three specifically comprises:
(3.1) the server generates a masked local model matrix MMA from the lists ULM_i uploaded by all users;
MMA = [ULM_1; ULM_2; …; ULM_n]   (8)
(3.2) the server generates an aggregation model list agg_model using the masked local model matrix MMA;
(3.3) the server generates random models for model obfuscation, forming a random model list random_list, where the number of random models in the list is a multiple of the length of agg_model; the server randomly mixes agg_model and random_list to obtain the obfuscated aggregation model list garble_agg_model; the position of each model of agg_model within garble_agg_model is stored in the ordered set o_position, and finally garble_agg_model is sent to all users;
(3.4) after receiving garble_agg_model, user u_i verifies each model in garble_agg_model with its local data, records the precision of each model, records the positions of the L highest-precision models within garble_agg_model in the ordered set s_position, and sends s_position to the server as its verification result;
(3.5) after the server receives the s_position sent by user u_i, it computes:
inter_position = o_position ∩ s_position;
if the length of inter_position is greater than th, user u_i is considered non-malicious; otherwise user u_i is considered a malicious user and its local model a malicious model.
5. The method for detecting abnormal nodes in federated learning according to claim 1, wherein step four specifically comprises:
(4.1) the server updates the global model according to the malicious user detection result; first, the server forms all trusted users into a trusted user set bs = {u_i}; models uploaded by malicious users are not adopted, and the server feeds back to each malicious user a randomly generated global model;
for the trusted participants, the trusted aggregation models are weighted-averaged, taking as weight the ratio of the mean data volume of each pair of participants to the total data volume of all trusted participants, as shown in equation (9); the server then distributes the updated global model GM_k of the current round to all trusted users;
GM_k = Σ_{(u_i,u_j)⊆bs, i<j} [((d_i + d_j)/2) / Σ_{u_l∈bs} d_l] · agg_model_{i,j}   (9)
(4.2) after receiving the global model, the user compares the precision of the global model with that of its local model and selects the more precise model as its updated local model;
(4.3) steps two to four are repeated until a convergence condition is reached.
6. A system for detecting abnormal nodes in federated learning, characterized in that: the system comprises an aggregation server, a trust authority and a plurality of users; the trust authority initializes the system, including user registration, system parameter generation and key agreement; the aggregation server receives the masked local models uploaded by the users and aggregates them under the aggregation rule to obtain a global model for the users, and also performs malicious user detection in this process so that erroneous models sent by malicious users cannot negatively affect the global model; each user trains a model on its own local data to obtain a local model, generates masks according to the agreed mask generation rule, forms a masked local model with the generated masks, and uploads the masked local model to the aggregation server.
7. The system for detecting abnormal nodes in federated learning according to claim 6, characterized in that:
the aggregation server and the trust authority are honest and trusted and do not collude with other entities in the system, nor do the users in the system collude with other entities; at least two trusted users exist in the system; malicious users in the system have no local training data and upload malicious masked local models to the aggregation server by generating arbitrary local models and adding masks; a user's state during training is either online or offline.
CN202110020440.3A 2021-01-07 2021-01-07 Method and system for detecting abnormal nodes in federated learning Active CN112749392B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110020440.3A CN112749392B (en) 2021-01-07 2021-01-07 Method and system for detecting abnormal nodes in federated learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110020440.3A CN112749392B (en) 2021-01-07 2021-01-07 Method and system for detecting abnormal nodes in federated learning

Publications (2)

Publication Number Publication Date
CN112749392A true CN112749392A (en) 2021-05-04
CN112749392B CN112749392B (en) 2022-10-04

Family

ID=75650248

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110020440.3A Active CN112749392B (en) 2021-01-07 2021-01-07 Method and system for detecting abnormal nodes in federated learning

Country Status (1)

Country Link
CN (1) CN112749392B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113221105A (en) * 2021-06-07 2021-08-06 南开大学 Robustness federated learning algorithm based on partial parameter aggregation
CN113407991A (en) * 2021-06-10 2021-09-17 交通银行股份有限公司 Private data two-party security comparison method based on trusted third party
CN113407992A (en) * 2021-06-10 2021-09-17 交通银行股份有限公司 Trusted third party-based private data two-party security equality testing method
CN113537513A (en) * 2021-07-15 2021-10-22 青岛海尔工业智能研究院有限公司 Model training method, device, system, equipment and medium based on federal learning
CN113554182A (en) * 2021-07-27 2021-10-26 西安电子科技大学 Method and system for detecting Byzantine node in horizontal federal learning system
CN113591974A (en) * 2021-07-29 2021-11-02 浙江大学 Forgetting verification method based on forgetting-prone data subset in federated learning
CN113849815A (en) * 2021-08-26 2021-12-28 兰州大学 Unified identity authentication platform based on zero trust and confidential calculation
CN114254398A (en) * 2021-12-16 2022-03-29 重庆大学 Block chain-based federated learning system and parameter aggregation method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110719158A (en) * 2019-09-11 2020-01-21 南京航空航天大学 Edge calculation privacy protection system and method based on joint learning
CN111460443A (en) * 2020-05-28 2020-07-28 南京大学 Security defense method for data manipulation attack in federated learning
US20200285980A1 (en) * 2019-03-08 2020-09-10 NEC Laboratories Europe GmbH System for secure federated learning
CN111930698A (en) * 2020-07-01 2020-11-13 南京晓庄学院 Data security sharing method based on Hash diagram and federal learning

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200285980A1 (en) * 2019-03-08 2020-09-10 NEC Laboratories Europe GmbH System for secure federated learning
CN110719158A (en) * 2019-09-11 2020-01-21 南京航空航天大学 Edge calculation privacy protection system and method based on joint learning
CN111460443A (en) * 2020-05-28 2020-07-28 南京大学 Security defense method for data manipulation attack in federated learning
CN111930698A (en) * 2020-07-01 2020-11-13 南京晓庄学院 Data security sharing method based on Hash diagram and federal learning

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
GUOWEN XU et al.: "VerifyNet: Secure and Verifiable Federated Learning", IEEE Transactions on Information Forensics and Security *
NGUYEN H. TRAN et al.: "Federated Learning over Wireless Networks: Optimization Model Design and Analysis", IEEE INFOCOM 2019 - IEEE Conference on Computer Communications *
ZHOU Jun et al.: "A survey of federated learning security and privacy protection", Journal of Xihua University (Natural Science Edition) *
PAN Rusheng et al.: "Visualization of federated learning: challenges and framework", Journal of Computer-Aided Design & Computer Graphics *
CHEN Bing et al.: "A survey of federated learning security and privacy protection", Journal of Nanjing University of Aeronautics and Astronautics *
CHEN Jinyin et al.: "A survey of poisoning attacks and defenses for deep learning models", Journal of Cyber Security *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113221105A (en) * 2021-06-07 2021-08-06 南开大学 Robustness federated learning algorithm based on partial parameter aggregation
CN113407991A (en) * 2021-06-10 2021-09-17 交通银行股份有限公司 Private data two-party security comparison method based on trusted third party
CN113407992A (en) * 2021-06-10 2021-09-17 交通银行股份有限公司 Trusted third party-based private data two-party security equality testing method
CN113407992B (en) * 2021-06-10 2024-05-28 交通银行股份有限公司 Privacy data two-party safety equality testing method based on trusted third party
CN113407991B (en) * 2021-06-10 2024-05-28 交通银行股份有限公司 Privacy data two-party safety comparison method based on trusted third party
CN113537513A (en) * 2021-07-15 2021-10-22 青岛海尔工业智能研究院有限公司 Model training method, device, system, equipment and medium based on federal learning
WO2023284387A1 (en) * 2021-07-15 2023-01-19 卡奥斯工业智能研究院(青岛)有限公司 Model training method, apparatus, and system based on federated learning, and device and medium
CN113554182A (en) * 2021-07-27 2021-10-26 西安电子科技大学 Method and system for detecting Byzantine node in horizontal federal learning system
CN113554182B (en) * 2021-07-27 2023-09-19 Xidian University Method and system for detecting Byzantine nodes in a horizontal federated learning system
CN113591974A (en) * 2021-07-29 2021-11-02 浙江大学 Forgetting verification method based on forgetting-prone data subset in federated learning
CN113849815A (en) * 2021-08-26 2021-12-28 兰州大学 Unified identity authentication platform based on zero trust and confidential calculation
CN114254398A (en) * 2021-12-16 2022-03-29 重庆大学 Block chain-based federated learning system and parameter aggregation method

Also Published As

Publication number Publication date
CN112749392B (en) 2022-10-04

Similar Documents

Publication Publication Date Title
CN112749392B (en) Method and system for detecting abnormal nodes in federated learning
Lyu et al. Towards fair and privacy-preserving federated deep models
Fereidooni et al. Safelearn: Secure aggregation for private federated learning
Miao et al. Privacy-preserving Byzantine-robust federated learning via blockchain systems
Hao et al. Towards efficient and privacy-preserving federated deep learning
CN111598254B (en) Federal learning modeling method, device and readable storage medium
CN112668044B (en) Privacy protection method and device for federal learning
Hao et al. Efficient, private and robust federated learning
CN114338045A (en) Information data verifiability safety sharing method and system based on block chain and federal learning
Ma et al. Disbezant: secure and robust federated learning against byzantine attack in iot-enabled mts
Lyu et al. Towards fair and decentralized privacy-preserving deep learning with blockchain
Wang et al. Enhancing privacy preservation and trustworthiness for decentralized federated learning
CN114363043A (en) Asynchronous federated learning method based on verifiable aggregation and differential privacy in peer-to-peer network
Zhang et al. Safelearning: Enable backdoor detectability in federated learning with secure aggregation
CN116187471A (en) Identity anonymity and accountability privacy protection federal learning method based on blockchain
Zhu et al. Secure verifiable aggregation for blockchain-based federated averaging
Ye et al. VREFL: Verifiable and reconnection-efficient federated learning in IoT scenarios
Yang et al. Efficient and secure federated learning with verifiable weighted average aggregation
Malladi et al. Decentralized aggregation design and study of federated learning
Zhang et al. Safelearning: Secure aggregation in federated learning with backdoor detectability
Jiang et al. GAIN: Decentralized Privacy-Preserving Federated Learning
Behnia et al. Efficient Secure Aggregation for Privacy-Preserving Federated Machine Learning
CN113554182B (en) Method and system for detecting Byzantine nodes in a horizontal federated learning system
Lu et al. Robust and verifiable privacy federated learning
Xu et al. Fedbc: an efficient and privacy-preserving federated consensus scheme

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant