CN116957064A - Knowledge distillation-based federated learning privacy protection model training method and system - Google Patents

Knowledge distillation-based federated learning privacy protection model training method and system

Info

Publication number
CN116957064A
CN116957064A (application number CN202310519121.6A)
Authority
CN
China
Prior art keywords
client
model
encryption
central server
key
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310519121.6A
Other languages
Chinese (zh)
Inventor
刘尚东
王木森
胥熙
张嘉铭
张欣同
吴飞
季一木
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202310519121.6A
Publication of CN116957064A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 63/00 Network architectures or network communication protocols for network security
    • H04L 63/04 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L 63/0428 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • H04L 63/045 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload wherein the sending and receiving network entities apply hybrid encryption, i.e. combination of symmetric and asymmetric encryption
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/096 Transfer learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/098 Distributed learning, e.g. federated learning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L 9/30 Public key, i.e. encryption algorithm being computationally infeasible to invert or user's encryption keys not requiring secrecy
    • H04L 9/3006 Public key, i.e. encryption algorithm being computationally infeasible to invert or user's encryption keys not requiring secrecy underlying computational problems or public-key parameters
    • H04L 9/302 Public key, i.e. encryption algorithm being computationally infeasible to invert or user's encryption keys not requiring secrecy underlying computational problems or public-key parameters involving the integer factorization problem, e.g. RSA or quadratic sieve [QS] schemes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L 9/32 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L 9/3247 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials involving digital signatures
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L 9/40 Network security protocols

Abstract

The invention provides a federated learning privacy protection model training method and system based on knowledge distillation. In the method, during each round of iterative training of a horizontal federated learning global model, a central server sends the global model parameters to the clients, which update their local model parameters accordingly. Each client trains its local neural network on a preprocessed long-tail distributed data set while applying a teacher-student knowledge distillation technique, yielding processed model parameters. The processed parameters are encrypted with a hybrid encryption technique to obtain combined encryption information; the central server decrypts this information to recover the processed model parameters, aggregates them into updated global parameters, and returns the updated global parameters to each client, which then obtains an updated, trained local model. The invention reduces the amount of computation, shortens training time, and improves training efficiency, while also improving model performance, the training effect of the model, and the security of federated learning parameter transmission.

Description

Knowledge distillation-based federated learning privacy protection model training method and system
Technical Field
The invention relates to a federated learning privacy protection model training method and system based on knowledge distillation.
Background
Traditional machine learning algorithms require users to upload source data to a server for centralized training, which can easily lead to leakage of sensitive data. Federated learning is an emerging artificial intelligence technology that moves data storage and model training to local users, allowing a large number of devices to cooperatively train machine learning models while protecting user privacy, so that all participants benefit.
However, federated learning faces the challenge of long-tail data distributions. A long-tail distribution is a skewed distribution in which a few classes (head classes) contain a large number of samples while most classes (tail classes) have only very few samples. Such data sets cause deep learning networks to perform well on head classes but poorly on tail classes, significantly degrading overall recognition accuracy. If a classification and recognition system is trained directly on long-tail data, it tends to overfit the head classes, so that tail classes are ignored at prediction time and the global model generalizes poorly. These problems make federated learning model training inefficient.
Existing approaches mitigate the negative effects of data heterogeneity either by improving the local training process or by employing specific model aggregation mechanisms. While these approaches solve the non-IID (not independent and identically distributed) problem to some extent, they generally assume that the overall class distribution is balanced, which is often not the case in practice. On a local client, the number of samples of some classes (head classes) may greatly exceed that of other classes (tail classes); learning under this distribution is referred to as long-tail learning. Existing solutions for non-IID data in federated learning often perform poorly on tail classes because they do not account for the overall long-tail distribution, and the global model obtained by aggregating the clients' local models is likewise biased on tail data.
Meanwhile, when training parameters are communicated during federated learning updates, transmitting the parameters directly may leak sensitive information, so protecting the security of the transmitted parameters is also of great importance. It is therefore necessary to improve the training effect of the model and to ensure data privacy and security under uneven data distributions.
The problems described above must be considered and addressed in knowledge distillation-based federated learning privacy protection model training.
Disclosure of Invention
The invention aims to provide a federated learning privacy protection model training method and system based on knowledge distillation that solve the prior-art problems of inefficient and ineffective model training caused by abnormal data distributions such as uneven (long-tail) distribution, and that improve both the training effect of the model and data privacy and security.
The technical scheme of the invention is as follows:
A federated learning privacy protection model training method based on knowledge distillation comprises the following steps:
S1, building a horizontal federated learning global model over clients and a central server, wherein during each round of iterative training of the horizontal federated learning global model the central server sends the global model parameters to the clients participating in training, and each client updates its local model parameters according to the transmitted global model parameters;
S2, after the local model parameters of each client are updated, preprocessing the long-tail distributed data set, including denoising and dimension reduction; each client then trains its local neural network on the preprocessed long-tail distributed data set while distilling the local model with a teacher-student knowledge distillation technique during training, obtaining the client's model parameters;
S3, the client encrypts the processed model parameters with a hybrid encryption technique to obtain combined encryption information consisting of a key encryption part, a ciphertext part and a signature part, and uploads the encryption information to the central server;
S4, after receiving the combined encryption information and verifying it successfully, the central server decrypts it to obtain the processed model parameters, and aggregates and evaluates the model parameters uploaded by the participating clients to obtain updated global parameters;
S5, the central server returns the updated global parameters to each client, and the clients continue to update and train their local models until the number of iterations reaches a set threshold, thereby obtaining the updated, trained local models.
Further, in step S3, the hybrid encryption technique combines the RSA asymmetric encryption algorithm with the AES symmetric encryption algorithm.
Further, in step S3, the client encrypts the processed model parameters with the hybrid encryption technique to obtain the combined encryption information consisting of the key encryption part, the ciphertext part and the signature part, specifically as follows:
S31, the central server generates an RSA public key and an RSA private key, and distributes the RSA public key to each client participating in training;
S32, before encryption, the client randomly generates an AES key for AES encryption, and encrypts the AES key with the RSA public key shared by the central server to obtain the key encryption part, which contains the encrypted AES key;
S33, once the encrypted AES key is available, the client encrypts the plaintext to be transmitted, i.e. the processed model parameters, with the AES algorithm to obtain the ciphertext part, which contains the encrypted plaintext;
S34, the plaintext to be transmitted is processed with a hash function to generate a digest, and the client encrypts the digest with its private key to generate the signature part;
S35, the client transmits the generated combined encryption information, consisting of the key encryption part, the ciphertext part and the signature part, to the central server.
Further, in step S31, the central server generates the RSA public key and the RSA private key, specifically as follows:
S311, select two different large prime numbers p and q, and let n = p·q;
S312, compute the Euler totient φ(n) = φ(p)·φ(q) = (p-1)·(q-1), where n is a parameter of the public key used to generate the public and private keys, φ(n) is the number of positive integers smaller than n that are coprime with n, φ(p) is the number of positive integers smaller than p that are coprime with p, and φ(q) is the number of positive integers smaller than q that are coprime with q;
S313, select an integer e as the public-key exponent such that e < φ(n) and e is coprime with φ(n);
S314, compute d, the modular inverse of e modulo φ(n);
S315, (e, n) is the RSA public key, (d, n) is the RSA private key, and the plaintext m must be smaller than n.
Further, in step S4, after receiving the combined encryption information, the central server decrypts it to obtain the processed model parameters, specifically as follows:
S41, after receiving the combined encryption information, the central server decrypts the key encryption part with the RSA private key to obtain the restored AES key;
S42, the ciphertext part is decrypted with the restored AES key to obtain the plaintext, i.e. the training parameters, and the central server applies the hash function to the plaintext to obtain a first hash result, i.e. the digest computed from the decrypted ciphertext;
S43, the central server decrypts the signature part with the public key shared by the client to obtain a second hash result, i.e. the digest sent by the client;
S44, the central server compares the first hash result with the second hash result; if they are identical, verification succeeds, the information is intact, the client's identity is legitimate, and the central server uses the decrypted plaintext, i.e. the processed model parameters; otherwise verification fails, and the central server re-verifies or requests the information to be resent.
Further, in step S5, the central server encrypts the updated global parameter with the AES key received from the corresponding client, and returns the encrypted global parameter to each client.
A knowledge distillation-based federated learning privacy protection model training system employing any of the above knowledge distillation-based federated learning privacy protection model training methods comprises a central server and clients.
The central server: in each round of horizontal federated learning global model training, sends the global model parameters to the clients participating in training; after receiving and successfully verifying the combined encryption information, decrypts it to obtain the processed model parameters and aggregates and evaluates the model parameters uploaded by the clients to obtain updated global parameters; and returns the updated global parameters to each client.
The client: after its local model parameters are updated, preprocesses the long-tail distributed data set, including denoising and dimension reduction, then trains its local neural network on the preprocessed long-tail distributed data set while distilling the local model with a teacher-student knowledge distillation technique to obtain the client's model parameters; encrypts the processed model parameters with the hybrid encryption technique to obtain the combined encryption information consisting of the key encryption part, the ciphertext part and the signature part, and uploads it to the central server; and, after receiving the updated global parameters returned by the central server, continues to update and train the local model to obtain the updated, trained local model.
The beneficial effects of the invention are as follows: in the knowledge distillation-based federated learning privacy protection model training method and system, the clients share computing resources, which reduces the amount of computation, shortens training time and improves training efficiency; incorporating knowledge distillation makes the model lighter and gives it stronger generalization capability, improving both model performance and training effect; and adopting the hybrid encryption technique improves the security of federated learning parameter transmission.
Drawings
FIG. 1 is a flow chart of the knowledge distillation-based federated learning privacy protection model training method according to an embodiment of the present invention;
FIG. 2 is a model schematic diagram of the knowledge distillation-based federated learning privacy protection model training method of the embodiment;
FIG. 3 is an illustrative diagram of a single client interacting with the global model based on the knowledge distillation technique in the embodiment;
FIG. 4 is a schematic flow chart of the client encrypting the processed model parameters with the hybrid encryption technique in the embodiment.
Detailed Description
Preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
Examples
A federated learning privacy protection model training method based on knowledge distillation comprises the following steps:
S1, building a horizontal federated learning global model over clients and a central server, wherein during each round of iterative training of the horizontal federated learning global model the central server sends the global model parameters to the clients participating in training, and each client updates its local model parameters according to the transmitted global model parameters;
S2, after the local model parameters of each client are updated, preprocessing the long-tail distributed data set, including denoising and dimension reduction; each client then trains its local neural network on the preprocessed long-tail distributed data set while distilling the local model with a teacher-student knowledge distillation technique during training, obtaining the client's model parameters.
In step S2, the client processes the local model with a teacher-student knowledge distillation technique to further improve model performance. Knowledge distillation transfers the knowledge of a large, high-accuracy teacher model to a smaller, lower-complexity student model, reducing the computational burden of the model and improving its generalization performance, so that the client obtains a more accurate and efficient trained model locally. The quantity distribution of the different classes in the data set is determined by analyzing the long-tail distributed data set. Before the client trains on the long-tail distributed data set, the data exhibiting the long-tail characteristic are preprocessed to optimize the training effect: the client first denoises the long-tail training data with a filter and then reduces the dimensionality of the denoised data set, which lessens the influence of noise on training and improves training stability and convergence speed. While training the model, the client may use stochastic gradient descent to update the model parameters over repeated iterations so as to minimize the loss function on the training set and update the parameters faster and more accurately, as shown in the sketch below.
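As an illustration only (the patent does not prescribe a concrete implementation), the following sketch shows one local distillation epoch: a cross-entropy loss on the local long-tail data plus a KL-divergence term toward the teacher logits, optimized by stochastic gradient descent. PyTorch, the temperature T, the weight alpha, and the name teacher_logits_per_class (the per-class global logits returned by the central server) are assumptions introduced here.

```python
# Hedged sketch of one local training epoch with teacher-student distillation.
import torch
import torch.nn.functional as F

def local_distillation_epoch(model, loader, teacher_logits_per_class,
                             optimizer, T=2.0, alpha=0.5):
    model.train()
    for x, y in loader:                                       # preprocessed (denoised, reduced) local data
        optimizer.zero_grad()
        student_logits = model(x)
        ce = F.cross_entropy(student_logits, y)               # hard-label loss
        teacher_logits = teacher_logits_per_class[y]          # global knowledge looked up per label
        kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                      F.softmax(teacher_logits / T, dim=1),
                      reduction="batchmean") * (T * T)        # distillation loss
        loss = alpha * ce + (1.0 - alpha) * kd
        loss.backward()
        optimizer.step()                                      # stochastic gradient descent update
    # These are the "processed model parameters" that step S3 encrypts and uploads.
    return {k: v.detach().clone() for k, v in model.state_dict().items()}
```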
S3, the client encrypts the processed model parameters with a hybrid encryption technique to obtain combined encryption information consisting of a key encryption part, a ciphertext part and a signature part, and uploads the encryption information to the central server.
In step S3, the hybrid encryption technique combines the RSA asymmetric encryption algorithm with the AES symmetric encryption algorithm.
In step S3, the client encrypts the processed model parameters with the hybrid encryption technique to obtain the combined encryption information consisting of the key encryption part, the ciphertext part and the signature part, as shown in FIG. 4, specifically as follows:
S31, the central server generates an RSA public key and an RSA private key, and distributes the RSA public key to each client participating in training.
In step S31, the central server generates the RSA public key and the RSA private key, specifically as follows:
S311, select two different large prime numbers p and q, and let n = p·q;
S312, compute the Euler totient φ(n) = φ(p)·φ(q) = (p-1)·(q-1), where n is a parameter of the public key used to generate the public and private keys, φ(n) is the number of positive integers smaller than n that are coprime with n, φ(p) is the number of positive integers smaller than p that are coprime with p, and φ(q) is the number of positive integers smaller than q that are coprime with q;
S313, select an integer e as the public-key exponent such that e < φ(n) and e is coprime with φ(n);
S314, compute d, the modular inverse of e modulo φ(n);
S315, (e, n) is the RSA public key, (d, n) is the RSA private key, and the plaintext m must be smaller than n.
Here the integer e is taken as the public-key exponent and is usually a small prime. Note that the more bits chosen for the primes p and q, the higher the security of the RSA encryption algorithm, but the slower encryption and decryption become; in practice, the prime length must therefore be chosen to balance security and performance requirements. A minimal arithmetic sketch of steps S311-S315 follows.
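The sketch below illustrates steps S311-S315 with plain modular arithmetic; the toy primes are for illustration only, and a real deployment would use large randomly generated primes together with a padding scheme such as OAEP.

```python
# Hedged sketch: textbook RSA key generation following S311-S315.
from math import gcd

def rsa_keygen(p, q, e=65537):
    n = p * q                                 # S311: modulus n = p*q
    phi = (p - 1) * (q - 1)                   # S312: Euler totient phi(n) = phi(p)*phi(q)
    assert e < phi and gcd(e, phi) == 1       # S313: e < phi(n) and e coprime with phi(n)
    d = pow(e, -1, phi)                       # S314: modular inverse of e mod phi(n)
    return (e, n), (d, n)                     # S315: public key (e, n), private key (d, n)

if __name__ == "__main__":
    pub, priv = rsa_keygen(61, 53, e=17)      # toy primes, NOT secure
    m = 42                                    # plaintext m must be smaller than n
    c = pow(m, pub[0], pub[1])                # encryption: c = m^e mod n
    assert pow(c, priv[0], priv[1]) == m      # decryption: m = c^d mod n
```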
S32, before encryption, the client randomly generates an AES key for AES encryption, and encrypts the AES key with the RSA public key shared by the central server to obtain the key encryption part, which contains the encrypted AES key.
In step S32, the client encrypts the AES key with the RSA public key shared by the central server, avoiding the burden of key distribution and management inherent to the AES algorithm.
S33, once the encrypted AES key is available, the client encrypts the plaintext to be transmitted, i.e. the processed model parameters, with the AES algorithm to obtain the ciphertext part, which contains the encrypted plaintext.
S34, the plaintext to be transmitted is processed with a hash function to generate a digest, and the client encrypts the digest with its private key to generate the signature part.
S35, the client transmits the generated combined encryption information, consisting of the key encryption part, the ciphertext part and the signature part, to the central server.
In step S3, the model parameters produced by local training are encrypted with the RSA-AES hybrid encryption technique before being uploaded, which ensures data confidentiality: even if a third party intercepts the transmission, the encrypted data cannot be cracked. In the RSA-AES hybrid encryption technique, RSA is used mainly to encrypt the AES key, and AES is used to encrypt the data. The client encrypts the AES key with the RSA public key and then encrypts the data to be transmitted with the AES key; the central server decrypts the AES key with the RSA private key and then decrypts the data with the AES key. This combination of RSA asymmetric encryption and AES symmetric encryption exploits the advantages of both algorithms, compensates for their respective shortcomings, and achieves higher security. The encrypted model parameters are uploaded to the central server; during uploading, HTTPS may additionally be used to ensure confidentiality and integrity. AES is a symmetric encryption system in which both communicating parties use the same key, so decryption essentially mirrors encryption; RSA is an asymmetric algorithm with different keys for encryption and decryption, and here public-key encryption is used, so only the central server, which holds the corresponding private key, can decrypt. A hedged sketch of the client-side steps S32-S35 follows.
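The sketch below illustrates the client-side steps S32-S35. The Python cryptography package, AES-GCM as the AES mode, OAEP padding for wrapping the AES key, and PSS for the signature are assumptions made for illustration only; the patent does not name a library, cipher mode, or padding scheme, and the plaintext is assumed to be the serialized model parameters.

```python
# Hedged sketch of client-side hybrid encryption (steps S32-S35).
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def client_encrypt(server_rsa_public_key, client_rsa_private_key, plaintext: bytes) -> dict:
    # S32: randomly generate an AES key and wrap it with the server's RSA public key
    aes_key = AESGCM.generate_key(bit_length=256)
    key_encryption_part = server_rsa_public_key.encrypt(
        aes_key,
        padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                     algorithm=hashes.SHA256(), label=None))
    # S33: encrypt the plaintext (serialized model parameters) with the AES key
    nonce = os.urandom(12)
    ciphertext_part = nonce + AESGCM(aes_key).encrypt(nonce, plaintext, None)
    # S34: hash-then-sign with the client's private key (the library hashes the
    # plaintext with SHA-256 internally before signing)
    signature_part = client_rsa_private_key.sign(
        plaintext,
        padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                    salt_length=padding.PSS.MAX_LENGTH),
        hashes.SHA256())
    # S35: the combined encryption information uploaded to the central server
    return {"key": key_encryption_part, "ct": ciphertext_part, "sig": signature_part}
```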
S4, after receiving the combined encryption information and verifying it successfully, the central server decrypts it to obtain the processed model parameters, and aggregates and evaluates the model parameters uploaded by the participating clients to obtain updated global parameters.
In step S4, after receiving the combined encryption information, the central server decrypts it to obtain the processed model parameters, as shown in FIG. 4, specifically as follows:
S41, after receiving the combined encryption information, the central server decrypts the key encryption part with the RSA private key to obtain the restored AES key;
S42, the ciphertext part is decrypted with the restored AES key to obtain the plaintext, i.e. the training parameters, and the central server applies the hash function to the plaintext to obtain a first hash result, i.e. the digest computed from the decrypted ciphertext;
S43, the central server decrypts the signature part with the public key shared by the client to obtain a second hash result, i.e. the digest sent by the client;
S44, the central server compares the first hash result with the second hash result; if they are identical, verification succeeds, the information is intact, the client's identity is legitimate, and the central server uses the decrypted plaintext, i.e. the processed model parameters; otherwise verification fails, and the central server re-verifies or requests the information to be resent.
In step S4, the central server receives the fully connected layer output logits of each client, averages them to obtain global fully connected layer output logits, and returns these to each client so that the local models can continue to be updated. The logits vectors of the five most recent rounds may additionally be averaged to form the global knowledge that guides the training of each client model, as in the sketch below.
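As a sketch only (the tensor shapes and names are assumptions; the five-round buffer follows the description above), the server-side logits aggregation could look as follows.

```python
# Hedged sketch: average the clients' fully connected layer logits, then average
# over the five most recent rounds to form the global knowledge.
from collections import deque
import torch

history = deque(maxlen=5)                                    # logits of the five most recent rounds

def aggregate_logits(client_logits: list) -> torch.Tensor:
    """client_logits: per-client tensors of per-class averaged logits."""
    round_logits = torch.stack(client_logits).mean(dim=0)    # average over clients
    history.append(round_logits)
    return torch.stack(list(history)).mean(dim=0)            # average over recent rounds
```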
In step S4, the ciphertext part was encrypted with the AES algorithm, so the central server decrypts it with the restored AES key. The central server then compares the first hash result with the second hash result to verify the integrity of the information and the validity of the sender's identity: if the two digests are identical, the information is intact, the client's identity is legitimate, verification succeeds, and the central server may use the decrypted plaintext; if the digests differ, the information may have been tampered with, and the central server must re-verify or request retransmission.
In step S4, since the data uploaded by each client is encrypted, the server first decrypts the RSA-encrypted AES key with its private key and then decrypts the model parameters with the AES key to obtain the original model parameters. These parameters come from different clients, so the central server aggregates them evenly in order to evaluate model performance. To ensure the credibility of the uploaded data, the central server verifies it: the client first hashes its local model parameters and encrypts the digest with its private key to form the signature part; after receiving the signature part, the central server decrypts it with the corresponding client public key to recover the digest and compares it with the digest computed from the decrypted ciphertext. The central server collects the encrypted model parameters uploaded by the clients and, once the comparison succeeds, aggregates and evaluates them to obtain the updated global parameters. A hedged sketch of this server-side decryption and verification, the counterpart of the client-side sketch above, follows.
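The sketch below covers the server-side steps S41-S44 under the same assumptions as the client-side sketch (cryptography package, AES-GCM, OAEP, PSS).

```python
# Hedged sketch of server-side decryption and verification (steps S41-S44).
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def server_decrypt_and_verify(server_rsa_private_key, client_rsa_public_key, message: dict):
    # S41: recover the AES key from the key encryption part with the RSA private key
    aes_key = server_rsa_private_key.decrypt(
        message["key"],
        padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                     algorithm=hashes.SHA256(), label=None))
    # S42: decrypt the ciphertext part to recover the plaintext model parameters
    nonce, ct = message["ct"][:12], message["ct"][12:]
    plaintext = AESGCM(aes_key).decrypt(nonce, ct, None)
    # S43/S44: verify the signature with the client's public key; the library hashes
    # the plaintext and compares the digests, raising an error if they differ
    try:
        client_rsa_public_key.verify(
            message["sig"], plaintext,
            padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                        salt_length=padding.PSS.MAX_LENGTH),
            hashes.SHA256())
    except InvalidSignature:
        return None                  # verification failed: re-verify or request a resend
    return plaintext                 # verified model parameters, ready for aggregation
```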
In step S4, the server also evaluates the aggregated model parameters, for example with metrics such as accuracy and recall. The server collects the model parameters from all participating clients, merges them into one global model with an aggregation algorithm, and sends the global model update back to the individual clients, which can then continue training and optimizing locally with the received global model, improving its accuracy and robustness. A minimal aggregation sketch follows.
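The patent does not name the aggregation algorithm; the sketch below assumes a FedAvg-style weighted average of the decrypted client parameter dictionaries, with the (hypothetical) client sample counts as weights.

```python
# Hedged sketch: merge the decrypted client parameters into updated global parameters.
import torch

def aggregate_parameters(client_states: list, client_sizes: list) -> dict:
    total = float(sum(client_sizes))
    global_state = {}
    for name in client_states[0]:
        global_state[name] = sum(
            state[name].float() * (n / total)
            for state, n in zip(client_states, client_sizes))
    return global_state              # returned to each client (AES-encrypted) in step S5
```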
S5, the central server returns the updated global parameters to each client, and the clients continue to update and train their local models until the number of iterations reaches a set threshold, thereby obtaining the updated, trained local models.
In step S5, the central server encrypts the updated global parameters with the AES key received from the corresponding client and returns them to each client, so that each client can continue update training of its local model; this ensures the correctness and security of the updated parameters and avoids parameter inconsistency and data leakage.
In step S5, the local model is the model trained on the client during federated learning. The local models are the core of knowledge distillation-based federated learning, as they use knowledge extracted from the global model to better fit the local data.
In the knowledge distillation-based federated learning privacy protection model training method, the clients share computing resources, which reduces the amount of computation, shortens training time and improves training efficiency; incorporating knowledge distillation makes the model lighter and gives it stronger generalization capability, improving both model performance and training effect; and adopting the hybrid encryption technique improves the security of federated learning parameter transmission.
The knowledge distillation-based federated learning privacy protection model training method addresses the fact that the data distributions of the clients in traditional federated learning can differ, with long-tail and non-IID (not independent and identically distributed) data being common, so model training encounters unbalanced data distributions. Knowledge distillation is therefore used in the federated learning process: during local training, each client averages the logits vectors output by its convolutional neural network on the local data set according to the label classes, and the central server then averages the logits vectors uploaded by all clients to obtain the logits vector of the round, as shown in FIG. 3. By incorporating knowledge distillation, the invention alleviates this problem and enhances the robustness of the model.
The knowledge distillation-based federated learning privacy protection model training method uses RSA-AES hybrid encryption: RSA provides public-key encryption and digital signatures, and AES provides high-strength symmetric encryption. The hybrid encryption technique effectively combines the advantages of the two algorithms and improves the security of federated learning parameter transmission. Meanwhile, because the participants' data remain distributed across their own sites rather than being concentrated in one place, the risk of data leakage is reduced.
The embodiment also provides a knowledge distillation-based federated learning privacy protection model training system employing any of the above knowledge distillation-based federated learning privacy protection model training methods, comprising a central server and clients.
The central server: in each round of horizontal federated learning global model training, sends the global model parameters to the clients participating in training; after receiving and successfully verifying the combined encryption information, decrypts it to obtain the processed model parameters and aggregates and evaluates the model parameters uploaded by the clients to obtain updated global parameters; and returns the updated global parameters to each client.
The client: after its local model parameters are updated, preprocesses the long-tail distributed data set, including denoising and dimension reduction, then trains its local neural network on the preprocessed long-tail distributed data set while distilling the local model with a teacher-student knowledge distillation technique to obtain the client's model parameters; encrypts the processed model parameters with the hybrid encryption technique to obtain the combined encryption information consisting of the key encryption part, the ciphertext part and the signature part, and uploads it to the central server; and, after receiving the updated global parameters returned by the central server, continues to update and train the local model to obtain the updated, trained local model.
The knowledge distillation-based federated learning privacy protection model training method and system adopt a horizontal federated learning framework and realize the update and sharing of model parameters through distributed training. In each round of horizontal federated learning global model training, the central server first sends the global model parameters to each client, whose local model is updated accordingly; each client then trains on its local data set. For a local client, during the iterative training of its local model, the number of samples of some classes (head classes) may greatly exceed that of others (tail classes), and this long-tail data problem easily causes the locally trained model to ignore the minority samples; the local model is therefore processed with a teacher-student knowledge distillation technique before its parameters are obtained. The model parameters produced by local training are then encrypted with the RSA-AES hybrid encryption technique and uploaded; the central server aggregates and evaluates the model parameters uploaded by the participating clients to obtain global parameters and returns them to each client for continued local model updates. The invention thus addresses the long-tail data problem on the clients in federated learning and improves the security of data sharing through hybrid encryption.
In the knowledge distillation-based federated learning privacy protection model training method and system, participants do not need to transmit local data to the central server, so model training is performed while preserving data privacy. The client training method uses knowledge distillation to improve the generalization capability of the model: a larger model provides "knowledge" that a smaller model acquires by learning to gradually approach the larger model's predictions. Applied to federated learning, the teacher model is the large converged model on the central server, and the student model is the small model on each client. In knowledge distillation-based federated learning, the global model summarizes the local model updates and improves model performance by sharing knowledge among the local models.
The knowledge distillation-based federated learning privacy protection model training method and system require cooperation between multiple participants and the central server to ensure the effectiveness and reliability of the federated learning process.
The knowledge distillation-based federated learning privacy protection model training method and system achieve high security by combining asymmetric and symmetric encryption; the RSA-AES hybrid encryption technique effectively protects data during transmission and storage. Although it involves two encryption algorithms, in practice it is simple to use and requires little security expertise from the end user. It is also flexible: RSA and AES parameters such as key lengths and cipher modes can be adjusted to meet different security requirements.
In the knowledge distillation-based federated learning privacy protection model training method and system, the clients share computing resources, reducing the amount of computation and shortening training time, and the incorporation of knowledge distillation makes the model lighter and gives it stronger generalization capability, improving model performance.
Although the present invention has been described with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the embodiments described, or equivalents may be substituted for elements thereof, and any modifications, equivalents, improvements and changes may be made without departing from the spirit and principles of the present invention.

Claims (7)

1. A federated learning privacy protection model training method based on knowledge distillation, characterized in that it comprises the following steps:
S1, building a horizontal federated learning global model over clients and a central server, wherein during each round of iterative training of the horizontal federated learning global model the central server sends the global model parameters to the clients participating in training, and each client updates its local model parameters according to the transmitted global model parameters;
S2, after the local model parameters of each client are updated, preprocessing the long-tail distributed data set, including denoising and dimension reduction; each client then trains its local neural network on the preprocessed long-tail distributed data set while distilling the local model with a teacher-student knowledge distillation technique during training, obtaining the client's model parameters;
S3, the client encrypts the processed model parameters with a hybrid encryption technique to obtain combined encryption information consisting of a key encryption part, a ciphertext part and a signature part, and uploads the encryption information to the central server;
S4, after receiving the combined encryption information and verifying it successfully, the central server decrypts it to obtain the processed model parameters, and aggregates and evaluates the model parameters uploaded by the participating clients to obtain updated global parameters;
S5, the central server returns the updated global parameters to each client, and the clients continue to update and train their local models until the number of iterations reaches a set threshold, thereby obtaining the updated, trained local models.
2. The knowledge distillation-based federated learning privacy protection model training method according to claim 1, wherein: in step S3, the hybrid encryption technique combines the RSA asymmetric encryption algorithm with the AES symmetric encryption algorithm.
3. The knowledge distillation-based federated learning privacy protection model training method according to claim 1 or 2, wherein: in step S3, the client encrypts the processed model parameters with the hybrid encryption technique to obtain the combined encryption information consisting of the key encryption part, the ciphertext part and the signature part, specifically as follows:
S31, the central server generates an RSA public key and an RSA private key, and distributes the RSA public key to each client participating in training;
S32, before encryption, the client randomly generates an AES key for AES encryption, and encrypts the AES key with the RSA public key shared by the central server to obtain the key encryption part, which contains the encrypted AES key;
S33, once the encrypted AES key is available, the client encrypts the plaintext to be transmitted, i.e. the processed model parameters, with the AES algorithm to obtain the ciphertext part, which contains the encrypted plaintext;
S34, the plaintext to be transmitted is processed with a hash function to generate a digest, and the client encrypts the digest with its private key to generate the signature part;
S35, the client transmits the generated combined encryption information, consisting of the key encryption part, the ciphertext part and the signature part, to the central server.
4. The knowledge distillation-based federated learning privacy protection model training method according to claim 3, wherein: in step S31, the central server generates the RSA public key and the RSA private key, specifically as follows:
S311, select two different large prime numbers p and q, and let n = p·q;
S312, compute the Euler totient φ(n) = φ(p)·φ(q) = (p-1)·(q-1), where n is a parameter of the public key used to generate the public and private keys, φ(n) is the number of positive integers smaller than n that are coprime with n, φ(p) is the number of positive integers smaller than p that are coprime with p, and φ(q) is the number of positive integers smaller than q that are coprime with q;
S313, select an integer e as the public-key exponent such that e < φ(n) and e is coprime with φ(n);
S314, compute d, the modular inverse of e modulo φ(n);
S315, (e, n) is the RSA public key, (d, n) is the RSA private key, and the plaintext m must be smaller than n.
5. The knowledge distillation-based federated learning privacy protection model training method according to claim 3, wherein: in step S4, after receiving the combined encryption information, the central server decrypts it to obtain the processed model parameters, specifically as follows:
S41, after receiving the combined encryption information, the central server decrypts the key encryption part with the RSA private key to obtain the restored AES key;
S42, the ciphertext part is decrypted with the restored AES key to obtain the plaintext, i.e. the training parameters, and the central server applies the hash function to the plaintext to obtain a first hash result, i.e. the digest computed from the decrypted ciphertext;
S43, the central server decrypts the signature part with the public key shared by the client to obtain a second hash result, i.e. the digest sent by the client;
S44, the central server compares the first hash result with the second hash result; if they are identical, verification succeeds, the information is intact, the client's identity is legitimate, and the central server uses the decrypted plaintext, i.e. the processed model parameters; otherwise verification fails, and the central server re-verifies or requests the information to be resent.
6. The knowledge distillation-based federated learning privacy protection model training method according to claim 5, wherein: in step S5, the central server encrypts the updated global parameters with the AES key received from the corresponding client and returns the encrypted global parameters to each client.
7. A knowledge distillation-based federated learning privacy protection model training system employing the knowledge distillation-based federated learning privacy protection model training method of any of claims 1-6, characterized in that it comprises a central server and clients, wherein:
the central server: in each round of horizontal federated learning global model training, sends the global model parameters to the clients participating in training; after receiving and successfully verifying the combined encryption information, decrypts it to obtain the processed model parameters and aggregates and evaluates the model parameters uploaded by the clients to obtain updated global parameters; and returns the updated global parameters to each client;
the client: after its local model parameters are updated, preprocesses the long-tail distributed data set, including denoising and dimension reduction, then trains its local neural network on the preprocessed long-tail distributed data set while distilling the local model with a teacher-student knowledge distillation technique to obtain the client's model parameters; encrypts the processed model parameters with the hybrid encryption technique to obtain the combined encryption information consisting of the key encryption part, the ciphertext part and the signature part, and uploads it to the central server; and, after receiving the updated global parameters returned by the central server, continues to update and train the local model to obtain the updated, trained local model.
CN202310519121.6A 2023-05-09 2023-05-09 Knowledge distillation-based federal learning privacy protection model training method and system Pending CN116957064A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310519121.6A CN116957064A (en) 2023-05-09 2023-05-09 Knowledge distillation-based federal learning privacy protection model training method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310519121.6A CN116957064A (en) 2023-05-09 2023-05-09 Knowledge distillation-based federal learning privacy protection model training method and system

Publications (1)

Publication Number Publication Date
CN116957064A true CN116957064A (en) 2023-10-27

Family

ID=88451807

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310519121.6A Pending CN116957064A (en) 2023-05-09 2023-05-09 Knowledge distillation-based federal learning privacy protection model training method and system

Country Status (1)

Country Link
CN (1) CN116957064A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117236421A (en) * 2023-11-14 2023-12-15 湘江实验室 Large model training method based on federal knowledge distillation
CN117236421B (en) * 2023-11-14 2024-03-12 湘江实验室 Large model training method based on federal knowledge distillation
CN117521856A (en) * 2023-12-29 2024-02-06 南京邮电大学 Large model cutting federal learning method and system based on local features
CN117521856B (en) * 2023-12-29 2024-03-15 南京邮电大学 Large model cutting federal learning method and system based on local features
CN117829320A (en) * 2024-03-05 2024-04-05 中国海洋大学 Federal learning method based on graph neural network and bidirectional deep knowledge distillation

Similar Documents

Publication Publication Date Title
US11552792B2 (en) Systems and methods for generating signatures
Alexopoulos et al. {MCMix}: Anonymous Messaging via Secure Multiparty Computation
US8331568B2 (en) Efficient distribution of computation in key agreement
US9008312B2 (en) System and method of creating and sending broadcast and multicast data
US6941457B1 (en) Establishing a new shared secret key over a broadcast channel for a multicast group based on an old shared secret key
CN116957064A (en) Knowledge distillation-based federal learning privacy protection model training method and system
EP3459202A1 (en) Method and system for secure data transmission
CN107483212A (en) A kind of method of both sides&#39; cooperation generation digital signature
US9705683B2 (en) Verifiable implicit certificates
Au et al. Privacy-preserving personal data operation on mobile cloud—Chances and challenges over advanced persistent threat
US20230019301A1 (en) Attribute-based encryption (abe) method with multiple tracing attribute authorities for cloud-assisted internet-of-things (iot)
Hassan et al. An efficient outsourced privacy preserving machine learning scheme with public verifiability
CN114219483B (en) Method, equipment and storage medium for sharing block chain data based on LWE-CPBE
CN112104454B (en) Data secure transmission method and system
CN104158880A (en) User-end cloud data sharing solution
CN111931249B (en) Medical secret data statistical analysis method supporting transmission fault-tolerant mechanism
CN104967693A (en) Document similarity calculation method facing cloud storage based on fully homomorphic password technology
CN116049897B (en) Verifiable privacy protection federal learning method based on linear homomorphic hash and signcryption
CN110999202A (en) Computer-implemented system and method for highly secure, high-speed encryption and transmission of data
CN105721146B (en) A kind of big data sharing method towards cloud storage based on SMC
CN111416712B (en) Quantum secret communication identity authentication system and method based on multiple mobile devices
WO2017200791A1 (en) Method and system for secure data transmission
WO2020042023A1 (en) Instant messaging data encryption method and apparatus
CN110740034B (en) Method and system for generating QKD network authentication key based on alliance chain
CN110048852A (en) Quantum communications service station Signcryption method and system based on unsymmetrical key pond

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination