CN116882524A - Federated learning method and system for meeting personalized privacy protection requirements of participants - Google Patents

Federated learning method and system for meeting personalized privacy protection requirements of participants

Info

Publication number
CN116882524A
CN116882524A (application CN202310707082.2A)
Authority
CN
China
Prior art keywords
privacy
participants
server
budgets
parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310707082.2A
Other languages
Chinese (zh)
Inventor
韩启龙
祝永杰
宋洪涛
卢丹
刘鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Engineering University
Priority to CN202310707082.2A
Publication of CN116882524A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06N 20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/60 Protecting data
    • G06F 21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F 21/6218 Protecting access to data via a platform, e.g. using keys or access control rules, to a system of files or objects, e.g. local or distributed file system or database
    • G06F 21/6245 Protecting personal data, e.g. for financial or medical purposes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Bioethics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A federated learning method and system for meeting the personalized privacy protection requirements of participants, relating to the technical field of network and information security. The method addresses the privacy disclosure that participants face when the server is untrusted in a federated learning scenario. The method comprises the following steps: each participant selects a privacy budget, encrypts it, and sends it to the server; the server sums the received encrypted privacy budgets, jointly decrypts the result with the participants to obtain the sum of the privacy budgets, and sends the sum to the participants; each participant divides its own privacy budget by the sum of the privacy budgets to obtain its aggregation weight; the server sends the global model parameters to the participants, who train on these parameters to obtain local models; each participant multiplies its local gradient by its aggregation weight and then performs gradient clipping; the clipped parameters are perturbed and sent to the server; and the server receives the gradient parameters and aggregates them to generate a global model. The method is applied in the field of private data protection.

Description

Federated learning method and system for meeting personalized privacy protection requirements of participants
Technical Field
The invention relates to the technical field of network and information security, and in particular to a federated learning method for meeting the personalized privacy protection requirements of participants.
Background
With the rapid development of big data and artificial intelligence, artificial intelligence has entered many aspects of our lives, such as finance, medical care, and autonomous driving. The most important technology underlying artificial intelligence is machine learning, and it is the progress of machine learning that has driven the rapid development of artificial intelligence. At the same time, privacy problems in machine learning have received widespread attention. As privacy concerns grow, users become less willing to share data. Paradoxically, artificial intelligence depends on large-scale data collection and fusion; if complete and rich information cannot be obtained to train models, the development of artificial intelligence applications is severely limited.
Against the background of the growing contradiction between data silos and the need for data fusion, federated learning has emerged. Its core idea is to train a machine learning model on separate datasets distributed across different devices or parties, which protects local data privacy to some extent. In federated learning, participants upload only model parameters or gradients and do not expose potentially sensitive local data. Although federated learning is an effective means of protecting private information in machine learning, a risk of privacy disclosure remains: studies have shown that the gradients or model parameters uploaded by participants can also leak privacy, and attackers can analyze the differences between original and updated model information using attack techniques such as differential attacks and model inversion attacks to extract specific private information from the model.
To address privacy concerns in federated learning, researchers have proposed several schemes, such as federated learning based on homomorphic encryption or on secure multiparty computation. Because homomorphic encryption is computationally expensive, it is impractical for iterative model training on large-scale data. A challenge for federated learning based on secure multiparty computation is computational efficiency, since completing a single round of training in the federated framework requires substantial computational resources. Federated learning based on differential privacy has lower communication and computational overhead than these other approaches and is therefore widely used to protect privacy in federated learning. Current differential-privacy-based federated learning falls mainly into two types: 1) local differential privacy, in which the parameters are perturbed before the user uploads them; 2) centralized differential privacy, in which a central aggregation server perturbs the aggregated gradients. A drawback of local differential privacy, however, is that it provides a uniform level of privacy protection for all users.
Because users differ in cultural values, income, age, and legal, national, or professional background, a uniform level of privacy protection is impractical. In practice, when a dataset contains users with different privacy expectations, the applicability of local differential privacy is limited. One option is to set the global privacy level high, but this may introduce an unacceptable amount of noise into the analysis output, resulting in poor utility. Setting a lower privacy level, on the other hand, may fail to meet the stricter privacy expectations of some users. Furthermore, a single privacy level wastes a large amount of privacy budget for some clients, which often degrades model accuracy. Ordinary local differential privacy does not account for users' differing privacy requirements, so protecting privacy with a uniform privacy level is not practical.
Disclosure of Invention
The invention provides a federated learning method for meeting the personalized privacy protection requirements of participants, which aims to solve the problem of participant privacy leakage caused by an untrusted server in a federated learning scenario. The scheme is as follows:
A federated learning method for meeting the personalized privacy protection requirements of a participant, the method comprising:
S1: two or more participants each select a privacy budget according to their privacy requirements, encrypt it, and send the encrypted privacy budget to a server;
S2: the server receives the encrypted privacy budgets and sums them, then jointly decrypts the summed ciphertext with the participants to obtain the sum of the privacy budgets, and sends the sum to the participants;
S3: each participant divides its own privacy budget by the sum of the privacy budgets to obtain its aggregation weight;
S4: the server sends the global model parameters to the participants, and each participant performs local training based on these parameters to obtain a local model;
S5: each participant multiplies the parameters of its local model by its aggregation weight and then performs gradient clipping;
S6: personalized noise is added to the parameters clipped in step S5 as a perturbation, and the perturbed parameters are sent to the server;
S7: the server receives the parameters sent by the participants in step S6 and aggregates them to generate a global model, where the global model is used for prediction and analysis of the problem of interest in the privacy-protection scenario.
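For concreteness, the core of one training round (steps S3 and S5-S7, with the encrypted budget exchange of S1-S2 assumed already done) can be sketched in Python as follows. All names, the clipping threshold, the Laplace noise scale, and the server-side step size are illustrative assumptions rather than values fixed by the invention:

```python
import numpy as np

rng = np.random.default_rng(0)

def clip(g, sigma):
    """Clip gradient g to L2 norm at most sigma (step S5)."""
    norm = np.linalg.norm(g)
    return g if norm <= sigma else g * (sigma / norm)

def one_round(global_params, budgets, local_gradients, sigma=1.0, lr=0.1):
    """Steps S3 and S5-S7 of one round; the crypto of S1-S2 is omitted here."""
    total = sum(budgets)                      # sum of budgets, obtained via S1-S2
    perturbed = []
    for eps_i, g_i in zip(budgets, local_gradients):
        w_i = eps_i / total                   # S3: aggregation weight
        g_c = clip(w_i * g_i, sigma)          # S5: weight first, then clip
        scale = 2 * sigma / eps_i             # S6: Laplace scale from the clip bound
        perturbed.append(g_c + rng.laplace(0.0, scale, size=g_i.shape))
    # S7: the server sums the perturbed, pre-weighted updates and applies them
    return global_params - lr * np.sum(perturbed, axis=0)

# Toy usage: three participants with different budgets, a 4-parameter model
params = np.zeros(4)
budgets = [0.5, 1.0, 2.0]                     # smaller budget = stronger privacy
grads = [rng.normal(size=4) for _ in budgets]
params = one_round(params, budgets, grads)
```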
Further, in a preferred mode, step S1 specifically comprises:
S11: there are m participants {c_1, c_2, ..., c_m}, and each participant c_i holds its own original dataset d_i from {d_1, d_2, ..., d_m}; each participant selects its own privacy budget according to its own privacy requirements;
S12: each participant encrypts its privacy budget by homomorphic encryption using its own key and sends the encrypted privacy budget to the server.
Further, in a preferred mode, step S2 comprises:
S21: the server collects the participants' model parameters and aggregates them for distributed training;
S22: the server receives the encrypted privacy budgets and sums them; without learning any individual participant's privacy budget, the server jointly decrypts the sum with the participants, obtains the plaintext sum of the privacy budgets, and sends it to the participants.
Further, in a preferred mode, step S4 comprises:
the server sends the global model parameters to the participants; after receiving them, each participant performs local training, running multiple iterations locally on its local dataset with stochastic gradient descent to obtain a local model.
Based on the same inventive concept, the invention also provides a federated learning system for meeting the personalized privacy protection requirements of participants, the system comprising:
an encryption module, used for two or more participants to each select a privacy budget according to their privacy requirements, encrypt it, and send the encrypted privacy budget to the server;
a decryption module, used for the server to receive the encrypted privacy budgets and sum them, jointly decrypt the sum with the participants, and send the resulting sum of the privacy budgets to the participants;
an aggregation weight acquisition module, used for each participant to divide its own privacy budget by the sum of the privacy budgets to obtain its aggregation weight;
a training module, used for the server to send the global model parameters to the participants and for each participant to perform local training based on these parameters to obtain a local model;
a clipping module, used for each participant to multiply the parameters of its local model by its aggregation weight and then perform gradient clipping;
a parameter updating module, used for adding personalized noise to the parameters clipped by the clipping module as a perturbation and sending the perturbed parameters to the server;
an output module, used for the server to receive the parameters sent by the participants via the parameter updating module and aggregate them to generate a global model, where the global model is used for prediction and analysis of the problem of interest in the privacy-protection scenario.
Further, in a preferred mode, the encryption module specifically comprises:
there are m participants {c_1, c_2, ..., c_m}, and each participant c_i holds its own original dataset d_i from {d_1, d_2, ..., d_m}; each participant selects its own privacy budget according to its own privacy requirements;
each participant encrypts its privacy budget by homomorphic encryption using its own key and sends the encrypted privacy budget to the server.
Further, in a preferred mode, the decryption module comprises:
the server collects the participants' model parameters and aggregates them for distributed training;
the server receives the encrypted privacy budgets and sums them; without learning any individual participant's privacy budget, the server jointly decrypts the sum with the participants, obtains the plaintext sum of the privacy budgets, and sends it to the participants.
Further, in a preferred mode, the training module comprises:
the server sends the global model parameters to the participants; after receiving them, each participant performs local training, running multiple iterations locally on its local dataset with stochastic gradient descent to obtain a local model.
Based on the same inventive concept, the invention also provides a computer readable storage medium storing a computer program that executes a federated learning method for meeting the personalized privacy protection requirements of a participant according to any one of the above.
Based on the same inventive concept, the invention also provides a computer device comprising a memory and a processor, wherein the memory stores a computer program and, when the processor runs the computer program stored in the memory, the processor executes a federated learning method for meeting the personalized privacy protection requirements of a participant according to any one of the above.
The advantages of the invention are as follows:
The invention solves the problem of participant privacy disclosure caused by an untrusted server in a federated learning scenario.
The invention provides a federated learning method for meeting the personalized privacy protection requirements of participants. Addressing the shortcoming of federated learning that relies on ordinary local differential privacy, it introduces personalized privacy protection based on the participants' differing privacy requirements. During federated training, each participant selects its own privacy budget according to its own privacy requirements and adds random noise to its local model parameters to perturb them; the server then aggregates the collected parameters, and the process repeats until the model converges. The resulting model can predict and analyze the problem of interest in a privacy-protection scenario while better protecting user privacy.
The invention further provides a federated learning framework for personalized privacy protection built on local differential privacy, federated learning, and related techniques, which can prevent privacy attacks by an untrusted server. Participants can freely select their own privacy budgets, achieving personalized privacy protection; the server cannot learn any participant's specific privacy budget, which prevents an untrusted server from targeting participants with large privacy budgets. The invention also gives the server a better aggregation result and thus a better global model.
The invention addresses the fact that users differ in their attitudes toward privacy by introducing personalized privacy protection, allowing different users to configure protection according to their own privacy requirements. This not only improves data utility but also increases user enthusiasm: when users can control their privacy, they are more willing to contribute their data for analysis.
The method and system are applied in the field of private data protection.
Drawings
FIG. 1 is a flowchart of a federated learning method for meeting personalized privacy protection requirements of participants according to an embodiment;
FIG. 2 is a sequence diagram of the steps according to embodiment eleven.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments.
Embodiment one. This embodiment is described with reference to FIG. 1. The federated learning method for meeting the personalized privacy protection requirements of participants according to this embodiment comprises:
S1: two or more participants each select a privacy budget according to their privacy requirements, encrypt it, and send the encrypted privacy budget to a server;
S2: the server receives the encrypted privacy budgets and sums them, then jointly decrypts the summed ciphertext with the participants to obtain the sum of the privacy budgets, and sends the sum to the participants;
S3: each participant divides its own privacy budget by the sum of the privacy budgets to obtain its aggregation weight;
S4: the server sends the global model parameters to the participants, and each participant performs local training based on these parameters to obtain a local model;
S5: each participant multiplies the parameters of its local model by its aggregation weight and then performs gradient clipping;
S6: personalized noise is added to the parameters clipped in step S5 as a perturbation, and the perturbed parameters are sent to the server;
S7: the server receives the parameters sent by the participants in step S6 and aggregates them to generate a global model, where the global model is used for prediction and analysis of the problem of interest in the privacy-protection scenario.
The aggregation weight in this embodiment can, in general, be assigned in view of factors such as each participant's data contribution, computing power, and trust level; here it is determined by the participant's share of the total privacy budget. After the gradient parameters are multiplied by the aggregation weights, the result reflects each participant's contribution. The server can aggregate the participants' perturbed parameters into new global model parameters without knowing any participant's exact parameters. This protects data privacy and security while smoothing out differences between participants, thereby achieving a better federated learning result.
Addressing the shortcoming of federated learning that relies on ordinary local differential privacy, the method of this embodiment introduces personalized privacy protection based on the participants' differing privacy requirements. During federated training, each participant selects its own privacy budget according to its own privacy requirements and adds random noise to its local model parameters to perturb them; the server then aggregates the collected parameters, and the process repeats until the model converges. The resulting model can predict and analyze the problem of interest in a privacy-protection scenario while better protecting user privacy.
Embodiment two. This embodiment further defines the federated learning method for meeting the personalized privacy protection requirements of participants according to embodiment one, wherein step S1 specifically comprises:
S11: there are m participants {c_1, c_2, ..., c_m}, and each participant c_i holds its own original dataset d_i from {d_1, d_2, ..., d_m}; each participant selects its own privacy budget according to its own privacy requirements;
S12: each participant encrypts its privacy budget by homomorphic encryption using its own key and sends the encrypted privacy budget to the server.
Steps S11 and S12 of this embodiment protect the privacy and security of the participants' data while minimizing the risk that the data are disclosed or attacked. Specifically, the benefits of these operations include the following:
Protecting data privacy: the participants' data often contain personally sensitive information or business secrets, and exposing such data directly to a third party may cause privacy leakage. By encrypting the data with homomorphic encryption before sending it to the server, the participants protect data privacy and reduce the risk of data leakage.
Protecting data security: the participants' data may also face security threats such as hacking or malicious tampering. Homomorphic encryption secures the encryption and transmission process and prevents interception or tampering.
Improving controllability: participants can configure their privacy budgets according to their own needs, flexibly balancing utility against the degree of privacy protection, which improves their control over the federated learning process.
Protecting the privacy budgets: a privacy budget quantifies the privacy sensitivity of data and is important in federated learning. With homomorphic encryption, a participant can take part in the computation without exposing its privacy budget, preventing hackers or hostile participants from learning it.
Maintaining responsibility for data management: by encrypting the transmitted data homomorphically, each party minimizes the server's handling of its data and retains better control over its own data resources. This also helps the parties manage data risk and maintain full control over their own data.
Embodiment three. This embodiment further defines the federated learning method for meeting the personalized privacy protection requirements of participants according to embodiment one, wherein step S2 comprises:
S21: the server collects the participants' model parameters and aggregates them for distributed training;
S22: the server receives the encrypted privacy budgets and sums them; without learning any individual participant's privacy budget, the server jointly decrypts the sum with the participants, obtains the plaintext sum of the privacy budgets, and sends it to the participants.
In step S21 of this embodiment, the server aggregates the limited training information uploaded by the participants in a predetermined manner, so that each participating unit learns from its own data while the model never directly possesses the other participants' data. Step S22 applies the encryption mechanism of federated learning to decrypt the participants' encrypted privacy budgets only in aggregate, as a total amount. S22 operates within S21 and guarantees data privacy through the encryption mechanism: the aggregator computes the sum over the ciphertexts uploaded by the participants and never directly decrypts any individual encrypted value. The participants compute and upload their encrypted privacy budgets before uploading their perturbed information, which prevents the aggregator from learning any individual budget while still letting it obtain the total budget and update the model accordingly. This preserves data privacy and minimizes the damage an attacker could cause by exploiting leaked data or model information.
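As a rough illustration of the encrypted-summation idea of S21-S22 alone, the following sketch uses the additively homomorphic Paillier scheme from the `phe` library as a stand-in. Note that this is only an analogy under stated assumptions: the scheme described in embodiment eleven is lattice-based and supports joint decryption by all participants, whereas Paillier as used here has a single decryption key.

```python
from phe import paillier  # pip install phe

# One keypair stands in for the aggregated key; in the patent's scheme the
# participants' keys are combined and decryption requires all of them.
public_key, private_key = paillier.generate_paillier_keypair()

budgets = [0.5, 1.0, 2.0]                                # each participant's budget
ciphertexts = [public_key.encrypt(b) for b in budgets]   # S1: encrypted locally

encrypted_sum = sum(ciphertexts[1:], ciphertexts[0])     # S22: homomorphic sum
total = private_key.decrypt(encrypted_sum)               # stand-in for joint decryption
print(total)  # 3.5
```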
Embodiment four. This embodiment further defines the federated learning method for meeting the personalized privacy protection requirements of participants according to embodiment one, wherein step S4 comprises:
the server sends the global model parameters to the participants; after receiving them, each participant performs local training, running multiple iterations locally on its local dataset with stochastic gradient descent to obtain a local model.
In this embodiment, the global model parameters are fused into the local model, so participants can optimize their local models through local iteration, improving the performance and accuracy of the shared model. During local training, participants can adjust the learning rate, regularization, and other hyperparameters according to their own needs, which improves the controllability of the model; local training also lets participants exploit the characteristics of their local data more effectively, optimizing the local model and improving the training result. Finally, local training ensures that participants' own data are never uploaded, further strengthening data privacy protection and potentially reducing the burden of data transmission and storage.
Embodiment five. The federated learning system for meeting the personalized privacy protection requirements of participants according to this embodiment comprises:
an encryption module, used for two or more participants to each select a privacy budget according to their privacy requirements, encrypt it, and send the encrypted privacy budget to the server;
a decryption module, used for the server to receive the encrypted privacy budgets and sum them, jointly decrypt the sum with the participants, and send the resulting sum of the privacy budgets to the participants;
an aggregation weight acquisition module, used for each participant to divide its own privacy budget by the sum of the privacy budgets to obtain its aggregation weight;
a training module, used for the server to send the global model parameters to the participants and for each participant to perform local training based on these parameters to obtain a local model;
a clipping module, used for each participant to multiply the parameters of its local model by its aggregation weight and then perform gradient clipping;
a parameter updating module, used for adding personalized noise to the parameters clipped by the clipping module as a perturbation and sending the perturbed parameters to the server;
an output module, used for the server to receive the parameters sent by the participants via the parameter updating module and aggregate them to generate a global model, where the global model is used for prediction and analysis of the problem of interest in the privacy-protection scenario.
Embodiment six. This embodiment further defines the federated learning system for meeting the personalized privacy protection requirements of participants according to embodiment five, wherein the encryption module specifically comprises:
there are m participants {c_1, c_2, ..., c_m}, and each participant c_i holds its own original dataset d_i from {d_1, d_2, ..., d_m}; each participant selects its own privacy budget according to its own privacy requirements;
each participant encrypts its privacy budget by homomorphic encryption using its own key and sends the encrypted privacy budget to the server.
Embodiment seven. This embodiment further defines the federated learning system for meeting the personalized privacy protection requirements of participants according to embodiment five, wherein the decryption module comprises:
the server collects the participants' model parameters and aggregates them for distributed training;
the server receives the encrypted privacy budgets and sums them; without learning any individual participant's privacy budget, the server jointly decrypts the sum with the participants, obtains the plaintext sum of the privacy budgets, and sends it to the participants.
Embodiment eight. This embodiment further defines the federated learning system for meeting the personalized privacy protection requirements of participants according to embodiment five, wherein the training module comprises:
the server sends the global model parameters to the participants; after receiving them, each participant performs local training, running multiple iterations locally on its local dataset with stochastic gradient descent to obtain a local model.
Embodiment nine. The computer readable storage medium according to this embodiment stores a computer program that executes a federated learning method for meeting the personalized privacy protection requirements of a participant according to any one of embodiments one to four.
Embodiment ten. The computer device according to this embodiment comprises a memory and a processor; the memory stores a computer program, and when the processor runs the computer program stored in the memory, the processor executes a federated learning method for meeting the personalized privacy protection requirements of a participant according to any one of embodiments one to four.
Embodiment eleven. This embodiment is described with reference to FIG. 2. It provides a specific example of the federated learning method for meeting the personalized privacy protection requirements of participants according to embodiment one, and also serves to illustrate embodiments two to four. Specifically:
Step 1: each participant selects its own privacy budget according to its own privacy requirements, encrypts it, and sends the encrypted privacy budget to the server;
Step 1 comprises the following steps:
Step 1.1: each participant selects its own privacy budget; the smaller the privacy budget, the higher the degree of privacy protection, and vice versa. For a given security parameter λ, set the dimension n of the lattice problem, the ciphertext modulus q, the key distribution χ, and the error distribution ψ over R. Generate a random vector a ← R_q^d and publish the common parameters (n, q, χ, ψ, a). Each participant samples a secret key s_i ← χ and an error vector e_i ← ψ^d, then computes its public key b_i = -s_i·a + e_i (mod q). All participants cooperatively compute the aggregated public key b = Σ_{i=1}^{m} b_i (mod q).
Step 1.2: each participant encrypts its privacy budget under the aggregated public key. Let ε_i ∈ R be the plaintext privacy budget and set a = a[0]. The participant samples v_i ← χ together with small errors and computes the ciphertext of the privacy budget in the standard RLWE form ct_i = (c_i^0, c_i^1) = (v_i·b + e_i^0 + Δ·ε_i, v_i·a + e_i^1) (mod q), where Δ is the plaintext scaling factor, and sends ct_i to the server.
Step 2: the server sums the ciphertext privacy budgets uploaded by the participants, then jointly decrypts the result with the participants to obtain the sum of the privacy budgets, and sends the sum to the participants;
Step 2 comprises the following steps:
Step 2.1: after receiving the encrypted privacy budgets sent by the participants, the server homomorphically sums them to obtain the ciphertext of the total budget ct = Σ_{i=1}^{m} ct_i.
Step 2.2: all participants jointly decrypt the ciphertext: each participant i uses its secret key s_i to calculate a decryption share D_i and sends D_i to the server.
Step 2.3: after receiving the decryption shares sent by all participants, the server combines them to recover the plaintext sum of the privacy budgets ε = Σ_{i=1}^{m} ε_i, and sends ε to the participants.
Step 3: the participant divides the own privacy budget by the sum of the privacy budgets to obtain an aggregate weight w=epsilon i /ε。
Step 4: the server sends the global model parameters to the participators, and the participators perform local training according to the parameters sent by the server to obtain a local model;
the step 4 comprises the following steps:
step 4.1: the server sends the parameter g of the global model to the participants, and the participants perform local training after receiving the parameter g. Training with local data set in each local iteration, generating local model after training with random gradient descent method, wherein learning rate is eta, and gradient of the obtained local model is g i
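A minimal sketch of step 4.1, assuming for illustration a linear model with squared loss (the invention does not fix a model class); the learning rate, iteration count, and batch size are placeholders, and the returned update stands in for the local gradient g_i:

```python
import numpy as np

def local_train(global_params, X, y, eta=0.01, iters=50, batch=16, seed=0):
    """Step 4.1: several local SGD iterations starting from the global parameters."""
    rng = np.random.default_rng(seed)
    w = global_params.copy()
    for _ in range(iters):
        idx = rng.choice(len(X), size=batch, replace=False)  # assumes len(X) >= batch
        residual = X[idx] @ w - y[idx]
        grad = 2 * X[idx].T @ residual / batch               # squared-loss gradient
        w -= eta * grad
    return w - global_params                                 # local update, used as g_i
```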
Step 5: multiplying the parameters of the model by the aggregation weight by each participant, then performing gradient clipping, adding personalized noise to perform disturbance, and then sending to a server;
said step 5 comprises the steps of:
step 5.1: local gradient g i Multiplied by the aggregate weight g' i =g i ·(ε i Epsilon), gradient clipping of local gradients using a threshold σ of gradient clipping, g=clip (g' i ,σ)。
Step 5.2: setting the privacy budget epsilon according to the participants themselves i Adding random noise, such as Laplacian noise or Gaussian noise, and then perturbing the local gradientAnd sending the data to a server.
Step 6: the server aggregates the model parameters sent by the participants to generate a total office model, namely
While the preferred embodiments of the present disclosure have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the disclosure.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present disclosure without departing from the spirit or scope of the disclosure. Thus, the present disclosure is intended to include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.
It will be appreciated by those skilled in the art that embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present disclosure and not to limit its scope. Although the present disclosure has been described in detail with reference to the above embodiments, those of ordinary skill in the art will understand that various alterations, modifications, and equivalents of the specific embodiments, which would occur to persons skilled in the art upon reading the disclosure, are intended to fall within the scope of the appended claims.

Claims (10)

1. A federated learning method for meeting personalized privacy protection requirements of participants, the method comprising:
S1: two or more participants each select a privacy budget according to their privacy requirements, encrypt it, and send the encrypted privacy budget to a server;
S2: the server receives the encrypted privacy budgets and sums them, then jointly decrypts the summed ciphertext with the participants to obtain the sum of the privacy budgets, and sends the sum to the participants;
S3: each participant divides its own privacy budget by the sum of the privacy budgets to obtain its aggregation weight;
S4: the server sends the global model parameters to the participants, and each participant performs local training based on these parameters to obtain a local model;
S5: each participant multiplies the parameters of its local model by its aggregation weight and then performs gradient clipping;
S6: personalized noise is added to the parameters clipped in step S5 as a perturbation, and the perturbed parameters are sent to the server;
S7: the server receives the parameters sent by the participants in step S6 and aggregates them to generate a global model, where the global model is used for prediction and analysis of the problem of interest in the privacy-protection scenario.
2. The federated learning method for meeting personalized privacy protection requirements of participants according to claim 1, wherein step S1 specifically comprises:
S11: there are m participants {c_1, c_2, ..., c_m}, and each participant c_i holds its own original dataset d_i from {d_1, d_2, ..., d_m}; each participant selects its own privacy budget according to its own privacy requirements;
S12: each participant encrypts its privacy budget by homomorphic encryption using its own key and sends the encrypted privacy budget to the server.
3. The federated learning method for meeting personalized privacy protection requirements of participants according to claim 1, wherein step S2 comprises:
S21: the server collects the participants' model parameters and aggregates them for distributed training;
S22: the server receives the encrypted privacy budgets and sums them; without learning any individual participant's privacy budget, the server jointly decrypts the sum with the participants, obtains the plaintext sum of the privacy budgets, and sends it to the participants.
4. The federated learning method for meeting personalized privacy protection requirements of participants according to claim 1, wherein step S4 comprises:
the server sends the global model parameters to the participants; after receiving them, each participant performs local training, running multiple iterations locally on its local dataset with stochastic gradient descent to obtain a local model.
5. A federated learning system for meeting personalized privacy protection requirements of participants, the system comprising:
an encryption module, used for two or more participants to each select a privacy budget according to their privacy requirements, encrypt it, and send the encrypted privacy budget to the server;
a decryption module, used for the server to receive the encrypted privacy budgets and sum them, jointly decrypt the sum with the participants, and send the resulting sum of the privacy budgets to the participants;
an aggregation weight acquisition module, used for each participant to divide its own privacy budget by the sum of the privacy budgets to obtain its aggregation weight;
a training module, used for the server to send the global model parameters to the participants and for each participant to perform local training based on these parameters to obtain a local model;
a clipping module, used for each participant to multiply the parameters of its local model by its aggregation weight and then perform gradient clipping;
a parameter updating module, used for adding personalized noise to the parameters clipped by the clipping module as a perturbation and sending the perturbed parameters to the server;
an output module, used for the server to receive the parameters sent by the participants via the parameter updating module and aggregate them to generate a global model, where the global model is used for prediction and analysis of the problem of interest in the privacy-protection scenario.
6. The federated learning system for meeting personalized privacy protection requirements of participants according to claim 5, wherein the encryption module specifically comprises:
there are m participants {c_1, c_2, ..., c_m}, and each participant c_i holds its own original dataset d_i from {d_1, d_2, ..., d_m}; each participant selects its own privacy budget according to its own privacy requirements;
each participant encrypts its privacy budget by homomorphic encryption using its own key and sends the encrypted privacy budget to the server.
7. The federated learning system for meeting personalized privacy protection requirements of participants according to claim 5, wherein the decryption module comprises:
the server collects the participants' model parameters and aggregates them for distributed training;
the server receives the encrypted privacy budgets and sums them; without learning any individual participant's privacy budget, the server jointly decrypts the sum with the participants, obtains the plaintext sum of the privacy budgets, and sends it to the participants.
8. The federated learning system for meeting personalized privacy protection requirements of participants according to claim 5, wherein the training module comprises:
the server sends the global model parameters to the participants; after receiving them, each participant performs local training, running multiple iterations locally on its local dataset with stochastic gradient descent to obtain a local model.
9. A computer readable storage medium storing a computer program, wherein the computer program executes a federated learning method for meeting personalized privacy protection requirements of participants according to any one of claims 1-4.
10. A computer device, comprising a memory and a processor, wherein the memory stores a computer program and, when the processor runs the computer program stored in the memory, the processor executes a federated learning method for meeting personalized privacy protection requirements of participants according to any one of claims 1-4.
CN202310707082.2A 2023-06-14 2023-06-14 Federated learning method and system for meeting personalized privacy protection requirements of participants Pending CN116882524A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310707082.2A CN116882524A (en) 2023-06-14 2023-06-14 Federal learning method and system for meeting personalized privacy protection requirements of participants

Publications (1)

Publication Number Publication Date
CN116882524A true CN116882524A (en) 2023-10-13

Family

ID=88255805

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310707082.2A Pending CN116882524A (en) 2023-06-14 2023-06-14 Federal learning method and system for meeting personalized privacy protection requirements of participants

Country Status (1)

Country Link
CN (1) CN116882524A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117155569A (en) * 2023-10-30 2023-12-01 天清数安(天津)科技有限公司 Privacy calculation method and system for fine-tuning pre-training model
CN117155569B (en) * 2023-10-30 2024-01-09 天清数安(天津)科技有限公司 Privacy calculation method and system for fine-tuning pre-training model
CN117910047A (en) * 2024-03-20 2024-04-19 广东电网有限责任公司 Multi-key federal learning method, device, terminal equipment and medium
CN118586041A (en) * 2024-08-02 2024-09-03 国网安徽省电力有限公司信息通信分公司 Data-heterogeneity-resistant electric power federal learning privacy enhancement method and device

Similar Documents

Publication Publication Date Title
CN110399742B (en) Method and device for training and predicting federated migration learning model
Miao et al. Privacy-preserving Byzantine-robust federated learning via blockchain systems
EP3779717B1 (en) Multiparty secure computing method, device, and electronic device
Bonawitz et al. Practical secure aggregation for privacy-preserving machine learning
Han et al. A data sharing protocol to minimize security and privacy risks of cloud storage in big data era
Waziri et al. Network security in cloud computing with elliptic curve cryptography
CN116882524A (en) Federated learning method and system for meeting personalized privacy protection requirements of participants
CN112543187B (en) Industrial Internet of things safety data sharing method based on edge block chain
TW201448551A (en) Privacy-preserving ridge regression using partially homomorphic encryption and masks
CN115549888A (en) Block chain and homomorphic encryption-based federated learning privacy protection method
He et al. Privacy-preserving and low-latency federated learning in edge computing
Yan et al. Context-aware verifiable cloud computing
Al Aziz et al. Secure and efficient multiparty computation on genomic data
CN115392487A (en) Privacy protection nonlinear federal support vector machine training method and system based on homomorphic encryption
Soykan et al. A survey and guideline on privacy enhancing technologies for collaborative machine learning
CN111953483A (en) Multi-authority access control method based on criterion
Bay et al. Multi-party private set intersection protocols for practical applications
CN116561787A (en) Training method and device for visual image classification model and electronic equipment
CN118445844A (en) Federal learning data privacy protection method, federal learning data privacy protection device and readable storage medium
Kucherov et al. Homomorphic encryption methods review
Bandaru et al. Block chain enabled auditing with optimal multi‐key homomorphic encryption technique for public cloud computing environment
CN116415267A (en) Iterative updating method, device and system for joint learning model and storage medium
CN114338090A (en) Data security detection method and device and electronic equipment
Zhang et al. Efficient federated learning framework based on multi-key homomorphic encryption
CN117034287A (en) Multiparty joint modeling safety enhancement method based on privacy computing interconnection and interworking technology

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination