CN115310121B - Real-time reinforced federated learning data privacy security method based on MePC-F model in Internet of Vehicles - Google Patents


Info

Publication number: CN115310121B (application CN202210816716.3A)
Authority: CN (China)
Prior art keywords: model, mepc, data, federated, round
Legal status: Active (assumption, not a legal conclusion)
Application number: CN202210816716.3A
Other languages: Chinese (zh)
Other versions: CN115310121A (en)
Inventors
朱容波
李梦瑶
刘浩
Current Assignee: Huazhong Agricultural University
Original Assignee: Huazhong Agricultural University
Priority date (assumption, not a legal conclusion)
Filing date
Publication date
Application filed by Huazhong Agricultural University
Priority claimed from CN202210816716.3A
Publication of CN115310121A
Application granted
Publication of CN115310121B
Status: Active


Classifications

    • G06F21/62: Protecting access to data via a platform, e.g. using keys or access control rules
    • G06N20/00: Machine learning
    • H04L63/0428: Network architectures or protocols for network security wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/12: Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks
    • H04L9/008: Cryptographic mechanisms or arrangements for secret or secure communications involving homomorphic encryption
    • H04L2463/062: Additional details relating to network security, applying encryption of the keys
    • Y02T10/40: Engine management systems


Abstract

The invention discloses a real-time reinforced federated learning data privacy security method based on the MePC-F model in the Internet of Vehicles, which comprises the following steps: build multiple edge servers E_i and a cloud server CS; each edge server E_i downloads the encrypted initial type-A gradient [G_A^{k-1}] from the cloud server CS, decrypts it to G_A^{k-1}, randomly initializes the type-B gradient G_{B,i}^{k}, and carries out local model training; E_i obtains, through a decoding function, the partial gradient information G_{A,i,0}^{k} to be retained, homomorphically encrypts the remaining gradient information G_{A,i,-}^{k} as [G_{A,i,-}^{k}], and broadcasts it to all other edge servers E_j through the MePC algorithm; the class-A gradient information after all edge servers have updated and shared is G_{A,i}^{k}; all edge servers upload [G_{A,i}^{k}] to the cloud server CS, which aggregates the global parameters through the PreFLa algorithm; the above steps are repeated until a termination condition is reached. The invention prevents data leakage between terminals, realizes data privacy protection, and reduces communication overhead while preventing raw data leakage.

Description

Real-time reinforced federated learning data privacy security method based on the MePC-F model in the Internet of Vehicles

Technical Field

The present invention relates to the technical field of real-time security behavior analysis for collaborative processing by connected-vehicle users, and in particular to a real-time reinforced federated learning data privacy security method based on the MePC-F model in the Internet of Vehicles.

Background Art

With the development of the Internet of Vehicles supporting various real-time communications and services, the volume of data generated by interconnected devices such as on-board units is unprecedented. Faced with the large amount of heterogeneous vehicle-user data and the differences in device computing capability, federated learning provides an effective solution for meeting the data-security requirements of real-time network-model training: it allows different edge devices to collaboratively train a machine learning model without exposing the raw data.

The massive data of edge computing is closely tied to users' personal privacy; for example, a user's trajectory, credit card and billing data bear directly on user privacy and security, and a data leak would expose users to major risks. Federated learning can protect data to a certain extent, but the risk of information leakage remains, in four main types: 1) membership leakage; 2) unintended feature leakage; 3) leakage of classes representative of the raw data; 4) raw data leakage. The last type is the least acceptable to privacy-sensitive participants.

To protect the data privacy of mobile users and solve the above raw-data leakage problem, researchers have studied cryptography-based data protection extensively: differential privacy, homomorphic encryption, and secure multi-party computation. Differential privacy usually employs three noise-addition mechanisms: the Laplace, Gaussian and exponential mechanisms. Adding noise perturbs the contextual information and so protects data privacy, but too much noise degrades model-training performance. In homomorphic encryption, additive and multiplicative schemes are common: studies show that with Paillier additive homomorphic encryption the noise doubles per computation, while with ElGamal multiplicative homomorphic encryption it grows quadratically. To increase data usability and overcome the noise problem, researchers introduced bootstrapping, which reduces noise through threshold-triggered encryption and decryption, allowing a scheme to compute an unlimited number of operations; batching, parallel homomorphic computation, or compression by deleting pairs can also address the noise problem. Secure multi-party computation refers to the problem of multiple participants securely computing an agreed function without a trusted third party; its main aim is to keep each party's private input independent during the computation and to leak no local data. Research has shown that secure multi-party computation can solve the gradient-leakage problem in federated learning, and that exchanging information only for the first hidden layer suffices to protect data while preserving accuracy. However, its information exchange is peer-to-peer, so it incurs a high communication overhead.
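The additive homomorphism referred to above (multiplying Paillier ciphertexts adds their plaintexts, which is what lets encrypted gradients be summed without decryption) can be illustrated with a toy sketch; the tiny primes and caller-supplied randomness are insecure, and all function names here are illustrative, not from the patent:

```python
import math

def paillier_keygen(p=61, q=53):
    # Toy key generation: n = p*q, lam = lcm(p-1, q-1), g = n+1.
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)              # with g = n+1, mu = lam^-1 mod n
    return (n, n + 1), (lam, mu)

def paillier_encrypt(pub, m, r):
    # c = g^m * r^n mod n^2, with r coprime to n.
    n, g = pub
    n2 = n * n
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def paillier_decrypt(pub, priv, c):
    # m = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n.
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n) * mu % n
```

Decrypting the product of two ciphertexts yields the sum of the plaintexts modulo n, e.g. E(5) * E(12) mod n^2 decrypts to 17.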

Most cryptography-based data-security research takes a centralized approach. To address the time overhead while protecting data security: federated learning lets edge devices collaboratively train machine learning models without exposing raw data. Federated learning usually adopts a parameter-server architecture, in which the clients' local model training is synchronized by the parameter server. It is typically implemented synchronously: the central server sends the global model to multiple clients, which train on their local data and synchronously return the updated models to the central server. This can become slow because of stragglers. Since computing power and battery life are limited, availability and completion time vary from device to device, making global synchronization difficult, especially in federated learning scenarios. An asynchronous federated optimization algorithm has been proposed that solves a regularized local problem to guarantee convergence, allowing multiple devices and servers to train models collaboratively and efficiently without leaking privacy.

Although data security has been widely studied, most work is limited to the raw-data security problem. How to design, in the complex Internet of Vehicles, an effective federated learning algorithm that simultaneously targets the privacy and availability of mobile users' big data, reducing communication overhead while preventing data recovery from leaked gradients, remains an open problem.

First, in federated learning all data are stored at local nodes, which reduces the risk of raw-data leakage during transmission. But even if only gradient information is transmitted, the raw data may still be recoverable. Data interaction in secure multi-party computation lets multiple parties hold the data, lowering the chance that samples can be reconstructed after gradient information leaks. In existing secure multi-party computation, however, every user sends its information to every other user, i.e., by unicast, which incurs a high time overhead. So, when addressing vehicle users' data-security and real-time requirements, it is important to find a suitable solution that lowers the risk of data being attacked and recovered while reducing transmission delay. Second, because data and devices differ across edge servers, it is also necessary to improve the training accuracy of the whole model in a targeted way during training. Aggregating global parameters with the typical synchronous federated-averaging scheme suffers from stragglers and becomes slow. While balancing the computation and communication time overhead, personalized training of multiple models to guarantee global accuracy also matters. However, most data-security-oriented federated learning algorithms rely on synchronous aggregation, whose high latency makes meeting the real-time requirements of the Internet of Vehicles challenging. A reinforcement-learning-based federated learning algorithm is therefore necessary to reduce latency, improve accuracy and guarantee data security.

Summary of the Invention

The technical problem to be solved by the present invention is to provide, in view of the defects of the prior art, a real-time reinforced federated learning data privacy security method based on the MePC-F model in the Internet of Vehicles.

The technical solution adopted by the present invention to solve the technical problem is as follows:

The present invention provides a real-time reinforced federated learning data privacy security method based on the MePC-F model in the Internet of Vehicles, comprising the following steps:

S1. Construct multiple edge servers E_i and one cloud server CS; acquire vehicle data D = {D_1, D_2, …, D_i}; each edge server E_i obtains the corresponding vehicle data D_i.

S2. In the k-th round of the federated task, the edge server E_i downloads the encrypted initial A-type gradient [G_A^{k-1}] from the cloud server CS and decrypts it to G_A^{k-1}, and randomly initializes the B-type gradient G_{B,i}^{k}. E_i computes the gradients of local network model training by minimizing the loss function over its vehicle data D_i; the gradient information of E_i after completing T rounds of local training is recorded as G_i^{k,T}.

S3. Through a decoding function, the edge server E_i obtains from G_{A,i}^{k} the partial gradient information G_{A,i,0}^{k} that is to be retained, homomorphically encrypts the remaining gradient information G_{A,i,-}^{k} as [G_{A,i,-}^{k}], and broadcasts it to all the other edge servers E_j through the MePC algorithm. According to its decoding function, E_i obtains the corresponding partial gradient information [G_{A,j,i}^{k}] from the other edge servers E_j. The class-A gradient information after all edge servers have updated and shared is G_{A,i}^{k}, i ∈ [1, n], where n is the total number of edge servers.

S4. All edge servers upload [G_{A,i}^{k}] to the cloud server CS, and the cloud server CS aggregates the global parameters through the PreFLa algorithm; PreFLa selects the optimal parameter weight ratio a_{i,k} of each edge server E_i by maximizing the return through reinforcement learning, and the global gradient parameter G_A^{k} is aggregated according to a_{i,k}. The upload and download of parameters proceed in parallel, and all parameters are encrypted with homomorphic encryption (HE).

S5. Steps S2-S4 are repeated until a termination condition is reached; the cloud server CS computes the final global gradient parameters and distributes them to the edge servers; based on feature extraction over the data of multiple vehicles, the edge servers compute the accuracy and the optimal loss function of the MePC-F model, obtaining the trained MePC-F model, completing the whole training process, and outputting results in real time to the corresponding Internet of Vehicles services.
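The round loop of steps S1-S5 can be sketched, heavily simplified, as follows; encryption, the MePC exchange and the DQN-based weighting are replaced by plain stand-ins, so only the control flow mirrors the method, and all names are illustrative:

```python
import numpy as np

def run_rounds(n_servers, dim, K, eta=0.1, seed=0):
    # Skeleton of K federated rounds: CS holds the A-type gradient, each
    # edge server does a stand-in "local training" step, and aggregation
    # uses uniform weights in place of PreFLa's learned ratios a_{i,k}.
    rng = np.random.default_rng(seed)
    g_a = rng.normal(size=dim)                        # S1/S2: initial A-type gradient at CS
    for _ in range(K):
        locals_a = []
        for _ in range(n_servers):
            g_b = rng.normal(size=dim)                # S2: random B-type gradient
            locals_a.append(g_a - eta * (g_a + g_b))  # stand-in for T local rounds
        weights = np.full(n_servers, 1.0 / n_servers) # S4: stand-in for a_{i,k}
        g_a = sum(w * g for w, g in zip(weights, locals_a))  # global aggregation
    return g_a                                        # S5: final global gradient parameters
```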

Furthermore, in step S2 of the present invention, the specific method of training the local network model is as follows:

A deep neural network (DNN) model is adopted; the DNN performs end-to-end feature learning and classifier training by taking different vehicle data as raw input, using stochastic gradient descent as a subroutine to minimize the loss value in each round of local training.

In the k-th round of communication, E_i downloads the base-layer parameters from the cloud server CS, i.e., the encrypted initial A-type gradient [G_A^{k-1}] before decryption, decrypts it to the A-type gradient G_A^{k-1}, and randomly initializes the B-type gradient G_{B,i}^{k}. Here k ∈ [1, K], where K denotes the total number of rounds of the federated task; in the first round of the federated task, CS randomly initializes G_A^{0}. Before local training, E_i decrypts [G_A^{k-1}] to G_A^{k-1} by means of homomorphic encryption and records it as G_k.

The loss function of the local model is set as follows:

L(w_i) = l(w_i) + λ(w_{i,t} - w_{i,t+1})²

where l(·) denotes the loss of the network, the second term is the L2 regularization term, and λ is the regularization coefficient; w_i denotes the total weight information of the local model, w_{i,t} the weight information of the local model at time t, and w_{i,t+1} that at time t+1.

E_i initializes G_k, replaces the weight parameters w_i of the model, and continues local model training by minimizing the loss function as follows:

w_i = w_i - ηG_k

where η is the learning rate, and G_k is the joint representation of G_A^{k-1} and G_{B,i}^{k}, the G_{B,i}^{k} here being randomly initialized.

After the edge server E_i has completed T rounds of local training, it obtains the accuracy acc_{i,k} of each local model together with the gradient information G_i^{k,T}.
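Under the assumption of a logistic-regression local model (the patent itself uses a DNN), the T-round local update of step S2 can be sketched as follows; the lam term is one reading of the L2 regularizer λ(w_{i,t} - w_{i,t+1})², penalizing drift from the downloaded weights, and all names are illustrative:

```python
import numpy as np

def local_train(w, X, y, T=5, eta=0.1, lam=0.01):
    # One edge server's T rounds of local training (step S2) on its
    # vehicle data (X, y), starting from weights w built from G_k.
    w = np.asarray(w, dtype=float).copy()
    w_start = w.copy()                         # weights initialized from G_k
    for _ in range(T):
        p = 1.0 / (1.0 + np.exp(-X @ w))       # sigmoid predictions
        grad = X.T @ (p - y) / len(y)          # gradient of the network loss l(w)
        grad += 2.0 * lam * (w - w_start)      # L2 proximal regularization term
        w -= eta * grad                        # w_i = w_i - eta * G_k
    return w
```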

Furthermore, in step S3 of the present invention, the specific method of the MePC algorithm is as follows:

In the k-th round of the federated task, all edge servers use MePC to exchange the base-layer gradients [G_{A,1}^{k}], [G_{A,2}^{k}], …, [G_{A,n}^{k}], where [G_{A,n}^{k}] denotes the class-A encrypted data of the n-th edge server in the k-th round of the federated task, [G_{A,i}^{k}] denotes the class-A encrypted data of the i-th edge server, and [G_{A,i,-}^{k}] denotes the class-A encrypted data broadcast by the i-th edge server to the other edge servers, i.e., [G_{A,i}^{k}] with the share that E_i retains for itself removed.

To avoid the risk of the data being cracked, in each network a random proportion χ of the gradient G_{A,i}^{k} is taken as the retained part G_{A,i,0}^{k}, keeping the same random proportion χ within one federated round, and G_{A,i,0}^{k} is encrypted as [G_{A,i,0}^{k}]. In different rounds of the federated task the random proportion χ varies, χ ∈ [1/n, 1]. The remaining gradient of G_{A,i}^{k} is homomorphically encrypted as [G_{A,i,-}^{k}] and divided equally into n-1 parts; the values of G_{A,i,-}^{k} are divided as:

G_{A,i,-}^{k} = (G_{A,i,1}^{k}, G_{A,i,2}^{k}, …, G_{A,i,n-1}^{k})

Only G_{A,i,0}^{k} is retained in E_i; the other parts and the random parameter χ are broadcast to the other E_j in ciphertext form. In this way, even if part of the transmitted content is attacked, the original data G_{A,i}^{k} is not leaked.

The gradient information shared with the other edge servers E_j is:

[G_{A,i,-}^{k}] = ([G_{A,i,1}^{k}], [G_{A,i,2}^{k}], …, [G_{A,i,n-1}^{k}])

When E_i receives a data packet [G_{A,j,-}^{k}] sent by another server, it performs data verification locally.
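A minimal sketch of the χ-split described above (keep a random fraction locally, split the rest into n-1 equal shares for broadcast); homomorphic encryption is elided, and the function and variable names are assumptions, not from the patent:

```python
import numpy as np

def split_gradient(grad, n, rng):
    # Return (kept_part, shares): kept_part is the chi-fraction retained
    # by the edge server, shares are n-1 equal pieces for broadcast.
    L = len(grad)
    chi = rng.uniform(1.0 / n, 1.0)         # random retained proportion chi
    L0 = int(chi * L)                       # L0 = chi * L
    kept = grad[:L0]
    rest = grad[L0:]
    pad = (-len(rest)) % (n - 1)            # pad so the remainder splits evenly
    rest = np.concatenate([rest, np.zeros(pad)])
    shares = np.split(rest, n - 1)          # n-1 equal-length shares
    return kept, shares
```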

Furthermore, in step S3 of the present invention, the specific method of performing data verification locally is as follows:

In the k-th round of the federated task, verification is carried out with the corresponding "multiplication" method; each edge server designs two decoding functions of its own, as follows:

f_{i,k} ∈ {0,1}^{L_0}

f'_{i,k} ∈ {0,1}^{L'}

其中,L0

Figure BDA00037409514200000616
的长度,L’是
Figure BDA00037409514200000617
的长度;解码函数的下标k,表示第k轮联邦任务中的解码函数;Among them, L0 is
Figure BDA00037409514200000616
The length, L' is
Figure BDA00037409514200000617
The length of the decoding function; the subscript k of the decoding function represents the decoding function in the k-th round of federated tasks;

L_0 = χ·L

L' = (L - L_0)/(n - 1)

where L is the length of G_{A,i}^{k}, and the shares [G_{A,i,1}^{k}], …, [G_{A,i,n-1}^{k}] are all of equal length;

The decoding functions of all edge servers are required, when applied to the same data packet, to yield the all-ones vector under the bitwise "union" (OR) operation and the all-zeros vector under the bitwise "intersection" (AND) operation, i.e.:

f'_{1,k} ∨ f'_{2,k} ∨ … ∨ f'_{n,k} = (1, 1, …, 1)

f'_{1,k} ∧ f'_{2,k} ∧ … ∧ f'_{n,k} = (0, 0, …, 0)

First, the decoding functions are initialized as follows:

f'_{i,1} = (0, …, 0, 1, …, 1, 0, …, 0), where the ones occupy the L' positions of the segment destined for E_i.

The data packet [G_{A,j,-}^{k}] is multiplied with the corresponding decoding function in each of the other servers. Since the bits that are 0 in f'_{i,k} are multiplied to 0, E_i is guaranteed to obtain only its own part of the data packet; where a bit of f'_{i,k} is 1, the ciphertext of the gradient information at the corresponding position is obtained, as follows:

[G_{A,j,i}^{k}] = f'_{i,k} ⊙ [G_{A,j,-}^{k}]

E_i adds the packet arrays obtained from all other edge servers E_j at the corresponding positions, obtains all the ciphertext data, and updates the final [G_{A,i}^{k}], i.e.:

[G_{A,i}^{k}] = [G_{A,i,0}^{k}] + Σ_{j≠i} f'_{i,k} ⊙ [G_{A,j,-}^{k}]

Each time secure multi-party computation is performed, as k increases, the binary decoding function f'_{i,k} in each E_i is cyclically shifted left by m units, so as to guarantee the dynamics of sharing G_{A,i}^{k} and to divide it equally among E_1, E_2, …, E_n with no part of the data information repeated.
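The mask behaviour described above (disjoint 0/1 decoding vectors whose OR covers the whole packet and whose AND is empty, rotated each round) can be sketched as follows; encryption is omitted and all names are illustrative:

```python
import numpy as np

def make_masks(packet_len, receivers, k, m=1):
    # One disjoint 0/1 mask per receiver; the segment assigned to each
    # receiver is cyclically shifted left by k*m positions per round.
    seg = packet_len // receivers
    masks = []
    for j in range(receivers):
        mask = np.zeros(packet_len, dtype=int)
        mask[j * seg:(j + 1) * seg] = 1
        masks.append(np.roll(mask, -k * m))   # left circular shift
    return masks

def extract(packet, mask):
    # Bits that are 0 in the mask erase foreign slices ("multiplication").
    return packet * mask
```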

Furthermore, in step S4 of the present invention, the specific method of the PreFLa algorithm is as follows:

PreFLa performs adaptation via reinforcement learning (RL) to select the optimal parameter weight ratio a_{i,k} and aggregate the global parameter G_A^{k}.

In the uplink communication phase, each edge server not only trains its local model but also uploads its local parameters to the cloud server CS for joint aggregation. After executing the MePC algorithm in the k-th round of the federation, E_i uploads the parameters [G_{A,i}^{k}] and acc_{i,k} to CS through a TLS/SSL secure channel. In the aggregation phase, owing to the unbalanced distribution and data heterogeneity of each edge server, the model parameters used for aggregation have a crucial influence on the convergence speed of this phase; it is therefore necessary to consider the parameter weight ratio a_{i,k} of each participant E_i in round-k federated aggregation.

DQN-based reinforcement learning is used to predict the parameter weight ratio, with the Q function storing the information so as to avoid the curse of dimensionality of the state space. To better personalize the models and to reduce the waiting time for uploading weights in MePC-F, the DQN is used to select the optimal parameter weight ratio a_{i,k} and to aggregate and update the global parameter G_A^{k} in CS. The reinforcement learning formulation comprises: state, action, reward function and feedback.

Furthermore, in step S4 of the present invention, the specific methods of the state, action, reward function and feedback are as follows:

State: the state of round k is
Figure BDA0003740951420000082
where
Figure BDA0003740951420000083
is the accuracy difference, expressed as:

Figure BDA0003740951420000084

Action: the parameter weight ratio ai,k is the action of the kth round of the federated task; to avoid falling into a local optimum, the ε-greedy algorithm is used to optimize the action-selection process, yielding ai,k:

Figure BDA0003740951420000085

where P is the set of weight permutations, rand is a random number with rand∈[0,1], and Q(si,k, ai,k) denotes the agent's cumulative discounted return when taking action ai,k in state si,k; once the DQN has been trained to approximate Q(si,k, ai,k), during testing the DQN agent computes {Q(si,k, ai,k)|ai,k∈P} for all actions in round k; each action value represents the maximum expected return the agent can obtain by choosing the specific action ai,k in state si,k;

Reward: the reward observed at the end of the kth round of federation is set to:

Figure BDA0003740951420000086

where
Figure BDA0003740951420000087
is a positive number that ensures rk grows exponentially with the test accuracy Δacci,k; the first term motivates the agent to select devices that achieve higher test accuracy;
Figure BDA0003740951420000091
controls how rk varies as Δacci,k grows; when Δacci,k is less than 0, rk∈(-1,0);

The DQN agent is trained to maximize the expected cumulative discounted reward, as shown below:

Figure BDA0003740951420000092

where γ∈(0,1] is a factor that discounts future rewards;

After obtaining rk, the cloud server CS saves a multidimensional quadruple Bk=(si,k, ai,k, rk, si,k+1) for each round of federated tasks; the optimal action-value function Q(si,k, ai,k) is the cheat sheet the RL agent seeks, defined as the maximum expectation of the cumulative discounted return starting from si,k:

Q(si,k,ai,k)=E(ri,k+γmax Q(si,k+1,ai,k)|si,k,ai,k)

Function approximation is applied to learn a parameterized value function Q(si,k, ai,k; wk) that approximates the optimal value function Q(si,k, ai,k); rk+γmax Q(si,k+1, ai,k) is the target that Q(si,k, ai,k; wk) learns; a DNN is used to represent the function approximator; the RL learning problem becomes minimizing the MSE loss between the target and the approximator, defined as:

l(wk)=(ri,k+γmax Q(si,k+1,ai,k;wk)-Q(si,k,ai,k;wk))^2

CS updates the global parameter wk as:

Figure BDA0003740951420000093

where η≥0 is the step size;

After the cloud server CS obtains the best learned model, it obtains ai,k of the kth-round weight-ratio sequence and updates the global parameter
Figure BDA0003740951420000094
to:

Figure BDA0003740951420000095

All edge servers update the global parameter
Figure BDA0003740951420000096
and begin the next T rounds of local training.

Furthermore, the HE encryption method in the method of the present invention is specifically as follows:

The encryption schemes for the weight matrix and the bias vector follow the same idea; the additive homomorphic encryption of a real number a is denoted aE, and in additive homomorphic encryption, for any two numbers a and b, aE+bE=(a+b)E; any real number r is converted to an encoded rational fixed-point value v by:

Figure BDA0003740951420000101

Each encoded real number r in the gradient
Figure BDA0003740951420000102
can be represented as an H-bit rational number consisting of one sign bit, z integer bits, and d fractional bits; each encodable rational number is therefore defined by its H=1+z+d bits; the encoding is designed to allow multiplication, which requires an arithmetic modulus of 2^(H+2d) to avoid comparisons;

Decoding is defined as:

Figure BDA0003740951420000103

Multiplying these encoded numbers requires removing the factor 1/2^d; with Paillier additive encryption, an encoded multiplication can be computed exactly, but only one homomorphic multiplication is guaranteed; for simplicity, it is handled during decoding;

The largest encryptable integer is V-1, so the largest encryptable real number must take this into account; the integers z and d are therefore chosen such that:

V ≥ 2^(H+2d) ≥ 2^(1+z+3d).
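The constraint above can be checked with a small fixed-point sketch. Since the patent's encoding and decoding formulas are images, the sketch below is a hedged reconstruction from the surrounding text (scale by 2^d, work modulo 2^(H+2d), interpret the top half of the modulus range as negative); all names are illustrative assumptions.

```python
# Hedged sketch of the fixed-point encoding used before additive HE:
# a real r is scaled by 2^d and reduced modulo 2^(H+2d) (H = 1 + z + d);
# decoding undoes the scaling, treating the top half of the modulus
# range as negative. The exact formulas in the patent are images, so
# this reconstruction is an assumption based on the surrounding text.

Z_BITS, D_BITS = 10, 10
H_BITS = 1 + Z_BITS + D_BITS            # sign + integer + fraction bits
MODULUS = 2 ** (H_BITS + 2 * D_BITS)    # room for one encoded multiply

def encode(r):
    return round(r * 2 ** D_BITS) % MODULUS

def decode(v, mults=0):
    if v > MODULUS // 2:                # negative numbers wrap around
        v -= MODULUS
    # each multiplication adds one extra 2^d scaling factor to remove
    return v / 2 ** (D_BITS * (1 + mults))

# Additive homomorphism on encodings (what Paillier provides on ciphertexts):
s = (encode(1.5) + encode(-0.25)) % MODULUS
# One encoded multiplication is allowed; its extra factor is removed at decode:
p = (encode(1.5) * encode(2.0)) % MODULUS
```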

Furthermore, the optimal loss function in step S5 of the present invention is

Figure BDA0003740951420000104

where L(wi) denotes the loss of the Ei network.

The beneficial effects of the present invention are:

(1) A federated learning model with multi-party broadcast secure computation (MePC-F) is proposed. The model combines the MePC and PreFLa algorithms to address the training-data security and communication-overhead problems of federated learning in the Internet of Vehicles. It exploits the hybrid advantages of homomorphic encryption and secure multi-party computation to prevent end-to-end data leakage and to reduce how much of the original data can be reconstructed after an attack, maximizing data privacy protection.

(2) A secure broadcast multi-party computation, MePC, is proposed. For secure multi-party computation, sharing only the first layer's gradient information greatly reduces both the risk of data recovery and the communication volume. Broadcasting is used during sharing, and each edge server model extracts its own portion through a decoding function, which reduces the time complexity from O(n²) to O(n), preventing leakage of the original data while reducing communication overhead.

(3) A weight-ratio-based federated learning algorithm, PreFLa, is proposed. PreFLa finds the optimal gradient weight ratio for aggregating the global parameters, and the accuracy difference of each edge server is used to design the reward function, so that the action selection with the largest overall return becomes each federated round's weight ratio. An L2 regularization term is added to the loss function to promote edge-server collaboration and mitigate the latency and performance problems caused by data heterogeneity, thereby better generalizing the global model and accelerating convergence.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be further described below with reference to the accompanying drawings and embodiments, in which:

FIG. 1 is the MePC-F model according to an embodiment of the present invention;

FIG. 2 is a flow chart of the MePC-F model according to an embodiment of the present invention;

FIG. 3 is the MePC algorithm according to an embodiment of the present invention;

FIG. 4 shows the DLG results on MNIST according to an embodiment of the present invention when the first hidden layer is not hidden, for: (a) FL; (b) MePC-F; (c) PeMPC; (d) Gaussian; (e) Laplacian;

FIG. 5 shows the performance of DLG on MNIST according to an embodiment of the present invention when the gradient of the first hidden layer is replaced by four methods (Gaussian distribution, Laplacian distribution, PeMPC, and MePC-F);

FIG. 6 shows the average accuracy and loss on non-IID MNIST data according to an embodiment of the present invention;

FIG. 7 shows the average accuracy and loss on non-IID CIFAR-10 data according to an embodiment of the present invention.

DETAILED DESCRIPTION

To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are intended only to explain the present invention and not to limit it.

The parameters involved in the embodiments of the present invention are described as follows:

Table 1. Parameter description

Figure BDA0003740951420000121

where Ei denotes the current edge server, Ej denotes an edge server other than the current one, and Es denotes all edge servers.

The real-time reinforced federated learning data privacy security method based on the MePC-F model in the Internet of Vehicles according to an embodiment of the present invention comprises the following steps:

S1. Construct multiple edge servers Ei and one cloud server CS; obtain vehicle data D={D1, D2, …, Di}, with edge server Ei obtaining the corresponding vehicle data Di;

S2. In the kth round of federated tasks, edge server Ei downloads the initial type-A gradient
Figure BDA0003740951420000131
from the cloud server CS and decrypts it as
Figure BDA0003740951420000132
then randomly initializes the type-B gradient
Figure BDA0003740951420000133
Edge server Ei computes the gradients in local network-model training by minimizing the loss function over its vehicle data Di; the gradient information of Ei after completing T rounds of local training is denoted
Figure BDA0003740951420000134

S3. Edge server Ei uses the decoding function
Figure BDA0003740951420000135
to obtain from
Figure BDA0003740951420000136
the partial gradient information to be retained,
Figure BDA0003740951420000137
homomorphically encrypts the remaining gradient information
Figure BDA0003740951420000138
as
Figure BDA0003740951420000139
and broadcasts it via the MePC algorithm to all other edge servers Ej; edge server Ei uses the decoding function
Figure BDA00037409514200001310
to obtain the corresponding partial gradient information
Figure BDA00037409514200001311
from the other edge servers Ej; the updated, shared type-A gradient information of all edge servers is
Figure BDA00037409514200001312
i∈[1,n], where n is the total number of edge servers;

S4. All edge servers upload
Figure BDA00037409514200001313
to the cloud server CS, which aggregates the global parameters via the PreFLa algorithm; PreFLa selects the optimal parameter weight ratio ai,k of edge server Ei by maximizing the return through reinforcement learning, and the global parameter
Figure BDA00037409514200001314
is aggregated according to ai,k; the parameter upload and download processes run in parallel, and all parameters are HE-encrypted;

S5. Repeat steps S2-S4 until a termination condition is reached, completing the entire training process. The termination condition may be a maximum number of training epochs, convergence of the loss function, or another user-defined condition. Finally, the optimal loss function is obtained according to formula (1):

Figure BDA00037409514200001315

where L(wi) denotes the loss of the Ei network.

The specific method of local training is as follows:

In the local model stage, a deep neural network (DNN) is used to learn the cloud model and the ES models. The DNN performs end-to-end feature learning and classifier training, taking different users' data as raw input. Stochastic gradient descent is used as a subroutine of the proposed algorithm to minimize the loss value in each round of local training.

In the downlink communication phase, Ei downloads the base-layer parameters
Figure BDA0003740951420000141
from CS in the kth (k∈[1,K]) communication round and randomly initializes
Figure BDA0003740951420000142
where K denotes the total number of federated rounds. In the first federated round, CS randomly initializes
Figure BDA0003740951420000143
Before local training, Ei must decrypt
Figure BDA0003740951420000144
using homomorphic encryption (formula (4)) as
Figure BDA0003740951420000145
denoted
Figure BDA0003740951420000146

To better reflect model personalization, the loss function of the local model is set as follows:

L(wi)=l(wi)+λ(wi,t-wi,t+1)^2 (16)

where l(·) denotes the network loss, e.g., the cross-entropy loss of a classification task. The second term is an L2 regularization term, which both preserves each participant's personalization capability and improves the efficiency of collaboration with the other participants; λ is the regularization coefficient.
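A minimal sketch of formula (16)'s loss follows, with a scalar task loss standing in for the network loss l(·) and the weights held as plain lists; all names are illustrative, not from the patent.

```python
# Sketch of the personalized local loss of formula (16): the task loss
# plus an L2 term penalizing the gap between consecutive-round weights
# w_{i,t} and w_{i,t+1}. task_loss stands in for the real network loss.

def l2_penalty(w_prev, w_next, lam):
    return lam * sum((a - b) ** 2 for a, b in zip(w_prev, w_next))

def local_loss(task_loss, w_prev, w_next, lam=0.1):
    return task_loss + l2_penalty(w_prev, w_next, lam)

loss = local_loss(0.7, [1.0, 2.0], [1.5, 1.0], lam=0.1)
```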

Ei initializes Gk, replaces the model's weight parameters wi, and continues local model training as follows:

wi=wi-ηGk (17)

where η is the learning rate and Gk is the combined representation of
Figure BDA0003740951420000147
and
Figure BDA0003740951420000148
Here,
Figure BDA0003740951420000149
is randomly initialized.

After Ei completes T rounds of local training, the accuracy acci,k of each local model,
Figure BDA00037409514200001410
and
Figure BDA00037409514200001411
are obtained. Direct end-to-end sharing of user information is prohibited, and the data in each edge server must be encrypted before communication to prevent it from being attacked beforehand. This process uses HE to avoid information leakage. The process of additive HE on real numbers is shown below. The encryption schemes for the weight matrix and the bias vector follow the same idea: the additive homomorphic encryption of a real number a is denoted aE, and for any two numbers a and b, aE+bE=(a+b)E. Any real number r is converted to an encoded rational fixed-point value v by:

Figure BDA00037409514200001412

Each encoded real number r in the gradient
Figure BDA0003740951420000151
can be represented as an H-bit rational number consisting of one sign bit, z integer bits, and d fractional bits. Each encodable rational number is therefore defined by its H=1+z+d bits. The encoding is designed to allow multiplication, which requires an arithmetic modulus of 2^(H+2d) to avoid comparisons.

Decoding is defined as:

Figure BDA0003740951420000152

Multiplying these encoded numbers requires removing the factor 1/2^d. With Paillier additive encryption, an encoded multiplication can be computed exactly, but only one homomorphic multiplication is guaranteed; for simplicity, it is handled during decoding.

If only one encoded multiplication occurs, the result is correct. Because the largest encryptable integer is V-1, the largest encryptable real number must take this into account. The integers z and d must therefore be chosen such that:

V ≥ 2^(H+2d) ≥ 2^(1+z+3d) (5)

After encryption,
Figure BDA0003740951420000153
and acci,k are denoted
Figure BDA0003740951420000154
and
Figure BDA0003740951420000155
respectively.

The specific procedure of the MePC algorithm is shown in FIG. 3.

In the kth round of federated tasks, MePC is used to exchange the base-layer gradients
Figure BDA0003740951420000156
To avoid the risk of the data being cracked, a random proportion χ of
Figure BDA0003740951420000157
is taken in each network, giving the gradient
Figure BDA0003740951420000158
and the random proportion χ is kept the same within a single federated round. Across different rounds of federated tasks, the random proportion χ (χ∈[1/n,1]) varies.
Figure BDA0003740951420000159
The remaining gradient is divided equally into n-1 parts
Figure BDA00037409514200001510
As shown in FIG. 3, the values of
Figure BDA00037409514200001511
are divided into:

Figure BDA00037409514200001512

Only
Figure BDA00037409514200001513
is retained in Ei; the other parts and the random parameter χ are broadcast to the other ESs in ciphertext form. In this way, even if part of the transmitted content is attacked, the original data
Figure BDA0003740951420000161
will not leak. Specifically, an attacker wishing to obtain the data
Figure BDA0003740951420000162
must obtain all parts of
Figure BDA0003740951420000163
However,
Figure BDA0003740951420000164
and χ are both kept in ciphertext form via homomorphic encryption during communication between participant Ei and receiver Ej.

The gradient information shared with the other ESs is
Figure BDA0003740951420000165

Figure BDA0003740951420000166

When Ei receives a data packet
Figure BDA0003740951420000167
sent by another server, it performs data verification locally; specifically, it verifies using the corresponding "multiplication" method. Each edge server designs its own two decoding functions, as follows:

Figure BDA0003740951420000168

Figure BDA0003740951420000169

其中,L0

Figure BDA00037409514200001610
的长度,L’是
Figure BDA00037409514200001611
的长度。Among them, L0 is
Figure BDA00037409514200001610
length, L' is
Figure BDA00037409514200001611
Length.

L0=χ·L (9)

Figure BDA00037409514200001612

where L is the length of
Figure BDA00037409514200001613
and
Figure BDA00037409514200001614
and
Figure BDA00037409514200001615
have equal lengths.

It is required that
Figure BDA00037409514200001616
satisfy the condition that the decoding functions of all ESs, applied to the same data packet, yield all 0s under the "union" operation and all 1s under the "intersection" operation, i.e.,

Figure BDA00037409514200001617

Figure BDA00037409514200001618

First, the decoding functions are initialized as follows:

Figure BDA00037409514200001619

It should be noted that, at initialization, the data packets sent by different Ei within the same federated task are also decoded with the same function.

The data packet
Figure BDA0003740951420000171
is multiplied by the corresponding decoding function in each other server. Since the binary bits of
Figure BDA0003740951420000172
that are 0 multiply to 0, Ei is guaranteed to obtain only its own portion of the data packet. Where the binary bits of
Figure BDA0003740951420000173
are 1, the ciphertext of the gradient information at the corresponding positions is obtained, as follows:

Figure BDA0003740951420000174

Ei adds all the packet arrays obtained from the other ESs at the corresponding positions, obtains all the ciphertext data, and updates it to the final
Figure BDA0003740951420000175
i.e.,

Figure BDA0003740951420000176

Each time secure multi-party computation is performed, as k increases, the binary representation of the decoding function
Figure BDA0003740951420000177
in each Ei is circularly shifted left by m units, ensuring that the sharing of
Figure BDA0003740951420000178
remains dynamic and that it can be divided evenly among E1, E2, …, En with no repeated data information in any part.

The specific method of the PreFLa algorithm is as follows:

Data in the Internet of Vehicles is widely dispersed, imbalanced, and heterogeneous, which makes it difficult to meet real-time requirements while improving personalized service. To prevent user-privacy leakage during communication between different edge servers, HE is used to encrypt the parameters in transit. To better achieve personalized training on different users' data, the first layer is set as the base layer and trained collaboratively using existing federated learning methods, while the other layers are trained locally as personalization layers, capturing the personal information of different ES devices. In this way, after the joint training process, the globally shared base layer can be transferred to each ES to build its own personalized deep learning model with its unique personalization layers. Only the base-layer parameters
Figure BDA0003740951420000179
are downloaded from CS; the parameters of the personalization layers
Figure BDA00037409514200001710
are randomly generated and fine-tuned on local data. To meet real-time requirements and realize the personalization needs of the ESs, PreFLa uses reinforcement learning (RL) to adaptively select the optimal parameter weight ratio ai,k for aggregating the global parameters
Figure BDA00037409514200001711

In the uplink communication phase, each ES not only trains its local model but also uploads its local parameters to CS for joint aggregation. After executing the MePC algorithm in the kth round of federation, Ei uploads the parameters
Figure BDA0003740951420000181
and
Figure BDA0003740951420000182
to CS over a TLS/SSL secure channel. In the aggregation stage, because each ES has an imbalanced distribution and heterogeneous data, the model parameters it contributes to aggregation have a crucial impact on the convergence speed of this stage. It is therefore necessary to consider the parameter weight ratio ai,k of participant Ei in the kth round of federated aggregation.

In the present invention, DQN-based reinforcement learning is used to predict the parameter weight ratio, storing information through a Q function rather than the table storage of Q-Learning, to prevent the curse of dimensionality. To better personalize the model and reduce the waiting time for uploading weights in MePC-F, a DQN is used to select the optimal parameter weight ratio ai,k and aggregate the updated global parameters in CS
Figure BDA0003740951420000183
The core elements of reinforcement learning are the state, action, reward function, and feedback, defined as follows:

State: the state of round k is
Figure BDA0003740951420000184
where
Figure BDA0003740951420000185
is the accuracy difference, expressed as:

Figure BDA0003740951420000186

Action: the parameter weight ratio ai,k is the action of the kth round of the federated task. To avoid falling into a local optimum, the ε-greedy algorithm is used to optimize the action-selection process, yielding ai,k:

Figure BDA0003740951420000187

where P is the set of weight permutations, rand is a random number (rand∈[0,1]), and Q(si,k, ai,k) denotes the agent's cumulative discounted return when taking action ai,k in state si,k. Once the DQN has been trained to approximate Q(si,k, ai,k), during testing the DQN agent computes {Q(si,k, ai,k)|ai,k∈P} for all actions in round k. Each action value represents the maximum expected return the agent can obtain by choosing the specific action ai,k in state si,k.
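The ε-greedy selection above can be sketched as follows: with probability ε a random weight ratio is explored, otherwise the action with the largest estimated Q value is exploited. This is a minimal sketch; the names and the toy Q-value table are illustrative, not from the patent.

```python
import random

# Sketch of epsilon-greedy action selection: with probability epsilon
# explore a random weight ratio from P; otherwise exploit the action
# with the largest estimated Q value. Names are illustrative.

def epsilon_greedy(q_values, actions, epsilon, rng=random):
    """q_values[a] approximates Q(s, a) for each candidate action a."""
    if rng.random() < epsilon:
        return rng.choice(actions)                   # explore
    return max(actions, key=lambda a: q_values[a])   # exploit

P = [0.1, 0.2, 0.3, 0.4]                             # candidate weight ratios
Q = {0.1: 0.5, 0.2: 1.2, 0.3: 0.9, 0.4: 0.1}         # toy Q estimates
best = epsilon_greedy(Q, P, epsilon=0.0)             # pure exploitation
```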

Reward: the reward observed at the end of the kth round of federation is set to

Figure BDA0003740951420000191

where
Figure BDA0003740951420000195
is a positive number that ensures rk grows exponentially with the test accuracy Δacci,k. The first term motivates the agent to select devices that achieve higher test accuracy.
Figure BDA0003740951420000192
controls how rk varies as Δacci,k grows. In general, as machine learning training progresses, model accuracy increases at a slower rate; in federated cooperative tasks, however, model accuracy may decrease due to imbalanced data distribution and heterogeneity. Therefore, as FL enters its later stages, the exponential term is used to amplify marginal accuracy gains. The second term, -1, encourages the agent to improve model accuracy, because when Δacci,k is less than 0, rk∈(-1,0).

The DQN agent is trained to maximize the expected cumulative discounted reward, as shown below:

Figure BDA0003740951420000193

where γ∈(0,1] is a factor that discounts future rewards.

After obtaining rk, CS saves a multidimensional quadruple Bk=(si,k, ai,k, rk, si,k+1) for each round of federated tasks. The optimal action-value function Q(si,k, ai,k) is the cheat sheet the RL agent seeks, defined as the maximum expectation of the cumulative discounted return starting from si,k:

Q(si,k,ai,k)=E(ri,k+γmax Q(si,k+1,ai,k)|si,k,ai,k) (22)

然后,可以应用函数逼近技术学习一个参数化的值函数Q(si,k,ai,k;wk)逼近最优值函数Q(si,k,ai,k)。第一步的rk+γmax Q(si,k+1,ai,k)是Q(si,k,ai,k;wk)学习的目标。通常,DNN用于表示函数逼近器。RL学习问题变成最小化目标和逼近器之间的MSE损失,定义为:Then, function approximation techniques can be applied to learn a parameterized value function Q(s i, k , a i, k ; w k ) to approximate the optimal value function Q(s i, k , a i, k ). The first step r k +γmax Q(s i, k+1 , a i, k ) is the target of Q(s i, k , a i, k ; w k ) learning. Typically, DNN is used to represent the function approximator. The RL learning problem becomes minimizing the MSE loss between the target and the approximator, defined as:

l(wk)=(ri,k+γmax Q(si,k+1,ai,k;wk)-Q(si,k,ai,k;wk))2 (23)l( wk )=(ri ,k +γmax Q(si ,k+1 ,ai ,k ; wk )-Q(si ,k ,ai ,k ; wk )) 2 (23 )

CS更新全局参数wk为:CS updates the global parameter wk as:

Figure BDA0003740951420000194
Figure BDA0003740951420000194

其中,η≥0是步长。where η ≥ 0 is the step size.
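A minimal sketch of this update, using the one-step target and the squared TD error of Eq. (23); a linear approximator Q(s, a; w) = w[a]·s stands in for the patent's DNN, and all names are ours:

```python
import numpy as np

# One DQN-style learning step: build the target r + gamma * max_a Q(s', a; w),
# form the TD error, and take a semi-gradient descent step on the MSE loss.

def td_step(w, s, a, r, s_next, gamma=0.9, eta=0.01):
    """Update w[a] in place from one transition; returns the TD error."""
    target = r + gamma * max(w[b] @ s_next for b in range(len(w)))
    td_error = target - w[a] @ s
    w[a] += eta * td_error * s   # descent direction of the squared TD error
    return td_error

w = [np.zeros(2), np.zeros(2)]                 # one weight vector per action
err = td_step(w, np.array([1.0, 0.0]), 0, 1.0, np.array([0.0, 1.0]))
```

Starting from zero weights, the target is 1.0 and Q(s, a) is 0, so the TD error is 1.0 and only the visited state component of w[0] moves.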

The CS repeats the above steps to obtain the best learned model. Having obtained the k-th round weight-ratio sequence a_i,k, the CS updates the global parameter (Figure BDA0003740951420000201) to the a_i,k-weighted aggregation of the uploaded gradients:

Figure BDA0003740951420000202

All edge servers then update the global parameter (Figure BDA0003740951420000203) and start the next T rounds of local training.

Experimental test example:

To verify the effectiveness of the proposed mechanism, experimental results and analysis are given. A system with 1 cloud server and 10 edge servers is considered. The experimental learning rate α is 0.01 and the discount factor γ is 0.9; the positive integer ξ (Figure BDA0003740951420000204) is set to 3. The values of all parameters are listed in Table 2.

Table 2 Parameter settings

Figure BDA0003740951420000205

The effectiveness of the proposed model is verified on two datasets, MNIST and CIFAR-10. The performance of the proposed federated learning model MePC-F is evaluated in terms of DLG-reconstructed images, average accuracy, and average loss. First, the performance of five schemes in defending against the DLG attack is presented; then the proposed MePC-F is compared with the centralized approach and PeMPC. All results in the following scenarios are averages over 1000 independent experiments.

1) Performance of defending against DLG attacks

This section evaluates the effectiveness of MePC-F against DLG image reconstruction and compares it with FL, PeMPC, and the DP algorithms (Gaussian-distributed noise and Laplace-distributed noise). The network's public gradients are computed for a single image from the MNIST dataset; the results of the different schemes are shown in Figure 4. Since study [17] showed that hiding the first-layer gradients can reduce data reconstruction, the first-layer gradients (weight and bias terms) are replaced using four methods: the proposed MePC-F, PeMPC, Gaussian-distributed (μ=0, σ=1) noise, and Laplace-distributed (μ=0, σ=1) noise. After the first-layer gradients are hidden, DLG uses the remaining gradients to try to recover the image that produced the publicly shared gradients, and its behavior is observed.

As can be seen from Figure 4, when no method hides the first-layer gradients (FL in Figure 4(a)), the DLG process can accurately reconstruct the training data. When the first-layer gradients are protected by the proposed MePC-F, information leakage is effectively prevented (Figure 4(b)): even after 500 iteration steps, DLG still cannot construct the image. Figure 4(c) shows results similar to Figure 4(b); PeMPC can also defend against the DLG attack. As Figure 4(d) shows, with Gaussian noise added to the first layer, the reconstructed image is partially visible from round 15, the basic outline of the original image is formed by round 20, and the image is clearly recovered as the number of iterations grows to 500. The Laplace noise in Figure 4(e) behaves similarly to the Gaussian noise.

As can be seen from Figure 5, if the malicious server receives all hidden-layer gradients in plain text, the reconstruction process achieves the lowest gradient loss and image MSE (the green line in Figure 5). As the number of rounds increases, PeMPC and MePC-F do not converge to zero, and the image MSE reaches 10^7. Adding Laplace or Gaussian noise to the original gradients converges to 10^-5, and Figure 4 likewise shows that the data can be reconstructed by round 20. The larger the image MSE, the less likely the image is to be reconstructed.

These experimental results verify that adding Laplace or Gaussian noise to the original gradients can prevent partial gradient leakage in the early stage, but as the number of rounds increases the original data is still recovered through deep leakage. PeMPC and MePC-F, by contrast, are effective at preventing DLG attacks from reconstructing the original data regardless of the number of training rounds.

2) Performance comparison of average accuracy and average loss

In this section, the effectiveness of MePC-F is evaluated and compared with the centralized approach and PeMPC in terms of average accuracy and average loss on the MNIST and CIFAR-10 datasets.

Figure 6(a) shows the number of rounds each model needs to reach 98% accuracy on the MNIST dataset. The average accuracy of all three methods increases with the number of training rounds. To reach the target accuracy on MNIST, the centralized method requires 25 rounds, PeMPC requires 140 rounds, and MePC-F requires 40 rounds, so MePC-F needs 71.2% fewer training rounds than PeMPC. The reason is that the proposed reinforced federated learning algorithm PreFLa finds better aggregation parameter weights a_i,k through interaction with the environment, which copes better with non-IID data, accelerates model convergence, and reaches the target accuracy sooner. The centralized method is trained on the combination of all data, so its accuracy is higher than that of the federated learning algorithms; nevertheless, the figure shows that the converged accuracy of PeMPC almost reaches the centralized accuracy.

From Figure 6(b), the average loss of all three schemes decreases as the number of training rounds increases. For the centralized scheme, the average loss drops from 0.233 to 0.052; for PeMPC, from 0.35 to 0.084. Meanwhile, the average loss of the proposed MePC-F falls to 0.06, which is 28.6% lower than that of PeMPC. When the number of training rounds reaches 100, the proposed MePC-F almost matches the centralized loss value.

Figure 7(a) shows the number of rounds required to reach the target accuracy of 50% on CIFAR-10, with results similar to Figure 6(a). The average accuracy of all three models increases until the target value is reached. For the centralized model, the average accuracy rises from 0.42 to 0.5 in 23 rounds; for PeMPC, from 0.372 to 0.5 in 89 rounds. Meanwhile, the proposed MePC-F reaches the target accuracy at round 41, 53.9% fewer rounds than PeMPC. Figure 7(a) indicates that MePC-F updates the global model with better weights a_i,k than PeMPC, which leads to faster convergence.

As can be seen from Figure 7(b), the average loss of the three schemes decreases until it reaches a stable value. The centralized scheme, MePC-F, and PeMPC reach the minimum loss in that order, so the time efficiency of the proposed MePC-F is better than that of PeMPC.

Table 3 The highest accuracy of the three schemes within 100 rounds

              MNIST    CIFAR-10
Centralized   98.4%    51.4%
MePC-F        98.2%    51.1%
PeMPC         97.6%    49.2%

Table 3 gives the highest accuracy of the three schemes within 100 rounds. On MNIST, the average accuracy of the proposed MePC-F is 98.2%, 0.6% higher than PeMPC, while PeMPC almost reaches the accuracy of centralized training. On CIFAR-10, the average accuracy of MePC-F within 100 rounds reaches 0.511, 1.9% higher than PeMPC. This shows that MePC-F, through the optimal weight updates a_i,k, aggregates the global parameters better than PeMPC, achieving higher accuracy that is closer to the centralized accuracy.

It should be understood that those skilled in the art can make improvements or changes based on the above description, and all such improvements and changes shall fall within the scope of protection of the appended claims of the present invention.

Claims (8)

1. A real-time reinforced federated learning data privacy security method based on the MePC-F model in the Internet of Vehicles, characterized in that the method comprises the following steps:

S1. Build multiple edge servers E_i and one cloud server CS; obtain vehicle data D = {D_1, D_2, ..., D_i}, with each edge server E_i obtaining its corresponding vehicle data D_i;

S2. In the k-th round of the federated task, edge server E_i downloads the initial A-type gradient (Figure FDA0003740951410000011) from the cloud server CS and decrypts it to (Figure FDA0003740951410000012), and randomly initializes the B-type gradient (Figure FDA0003740951410000013); E_i computes the gradients in local network-model training by minimizing the loss function over its vehicle data D_i, and the gradient information after E_i completes T rounds of local training is recorded as (Figure FDA0003740951410000014);

S3. Edge server E_i uses the decoding function (Figure FDA0003740951410000015) to obtain from (Figure FDA0003740951410000016) the portion of gradient information to be retained (Figure FDA0003740951410000017), homomorphically encrypts the remaining gradient information (Figure FDA0003740951410000018) into (Figure FDA0003740951410000019), and broadcasts it to all other edge servers E_j through the MePC algorithm; E_i then uses the decoding function (Figure FDA00037409514100000110) to obtain the corresponding partial gradient information (Figure FDA00037409514100000111) from the other edge servers E_j; after updating and sharing, the A-type gradient information of all edge servers is (Figure FDA00037409514100000112), i ∈ [1, n], where n is the total number of edge servers;

S4. All edge servers upload (Figure FDA00037409514100000113) to the cloud server CS, which aggregates the global parameters through the PreFLa algorithm; PreFLa selects the optimal parameter weight ratio a_i,k of edge server E_i by maximizing the return through reinforcement learning, and the global gradient parameter (Figure FDA00037409514100000114) is aggregated according to a_i,k; the parameter upload and download processes are parallel, and all parameters are HE-encrypted;

S5. Repeat steps S2-S4 until the termination condition is reached; the cloud server CS computes the final global gradient parameters and distributes them to each edge server; based on feature extraction over the multiple vehicle data, the edge servers compute the accuracy and the optimal loss function of the MePC-F model, obtain the trained MePC-F model, complete the entire training process, and output to the corresponding Internet-of-Vehicles services in real time.
2. The real-time reinforced federated learning data privacy security method based on the MePC-F model in the Internet of Vehicles according to claim 1, characterized in that in step S2 the specific method of local network-model training is:

A deep neural network (DNN) model is adopted; the DNN performs end-to-end feature learning and classifier training by taking the different vehicle data as raw input, and uses stochastic gradient descent as a subroutine to minimize the loss value in each round of local training.

In the k-th round of communication, E_i downloads the base-layer parameters from the cloud server CS, i.e., the initial A-type gradient before decryption (Figure FDA0003740951410000021), decrypts it to the A-type gradient (Figure FDA0003740951410000022), and randomly initializes the B-type gradient (Figure FDA0003740951410000023), where k ∈ [1, K] and K is the total number of rounds of the federated task; in the first round of the federated task, CS randomly initializes (Figure FDA0003740951410000024). Before local training, E_i decrypts (Figure FDA0003740951410000025) using homomorphic encryption to (Figure FDA0003740951410000026) and records it as (Figure FDA0003740951410000027).

The loss function of the local model is set as follows:

L(w_i) = l(w_i) + λ(w_i,t - w_i,t+1)^2

where l() denotes the loss of the network, the second term is the L2 regularization term, and λ is the regularization coefficient; w_i denotes the total weight information of the local model, w_i,t the weight information of the local model at time t, and w_i,t+1 the weight information of the local model at time t+1.

E_i updates G_k, replaces the model's weight parameters w_i, and continues local model training by minimizing the loss function as follows:

w_i = w_i - ηG_k

where η is the learning rate and G_k is the joint representation of (Figure FDA0003740951410000028) and (Figure FDA0003740951410000029), the latter (Figure FDA00037409514100000210) being randomly initialized.

After reaching T rounds of local training, edge server E_i obtains the accuracy acc_i,k of each local model and (Figure FDA00037409514100000211).
3. The real-time reinforced federated learning data privacy security method based on the MePC-F model in the Internet of Vehicles according to claim 1, characterized in that in step S3 the specific method of the MePC algorithm is:

In the k-th round of the federated task, all edge servers use MePC to exchange the base-layer gradients (Figure FDA0003740951410000031), where (Figure FDA0003740951410000032) denotes the A-type encrypted data of the n-th edge server in the k-th round of the federated task, (Figure FDA0003740951410000033) denotes the A-type encrypted data of the i-th edge server in the k-th round, and (Figure FDA0003740951410000034) denotes the A-type encrypted data broadcast by the i-th edge server to the other edge servers in the k-th round, i.e., (Figure FDA0003740951410000035) is (Figure FDA0003740951410000036) with the share retained by the server itself removed.

To avoid the risk of the data being cracked, a random proportion χ of (Figure FDA0003740951410000037), namely the gradients (Figure FDA0003740951410000038), is taken in each network, keeping the random proportion χ the same within one federation round, and (Figure FDA0003740951410000039) is then encrypted into (Figure FDA00037409514100000310); across different rounds of the federated task, the random proportion χ varies, χ ∈ [1, 1/n]. The remaining gradients of (Figure FDA00037409514100000311) are homomorphically encrypted into (Figure FDA00037409514100000312) and divided equally into n-1 shares; the value of (Figure FDA00037409514100000313) is divided as:

Figure FDA00037409514100000314

Only (Figure FDA00037409514100000315) is retained in E_i; the other shares and the random parameter χ are broadcast to the other E_j in ciphertext form. In this way, even if part of the transmitted content is attacked, the original data (Figure FDA00037409514100000316) is not leaked.

The gradient information shared with the other E_j is (Figure FDA00037409514100000317):

Figure FDA00037409514100000318

When E_i receives the data packets (Figure FDA00037409514100000319) sent by the other servers, it performs data validation locally.
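As an illustrative sketch of this splitting step (plain lists stand in for encrypted gradient vectors; function and variable names are ours, and even divisibility of the remainder is assumed):

```python
# Sketch of the MePC share-splitting: edge server E_i keeps a random fraction
# chi of its gradient vector and splits the remainder into n-1 equal shares
# to broadcast to the other edge servers. Plain values stand in for the
# homomorphically encrypted gradients.

def split_gradient(grad: list[float], n: int, chi: float):
    """Return (retained chi-fraction, list of n-1 equal shares of the rest)."""
    kept = int(chi * len(grad))
    retained, rest = grad[:kept], grad[kept:]
    share_len = len(rest) // (n - 1)            # even divisibility assumed
    shares = [rest[j * share_len:(j + 1) * share_len] for j in range(n - 1)]
    return retained, shares

grad = list(range(12))                          # toy 12-dimensional gradient
retained, shares = split_gradient(grad, n=4, chi=0.25)
```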
4. The real-time reinforced federated learning data privacy security method based on the MePC-F model in the Internet of Vehicles according to claim 3, characterized in that in step S3 the specific method of performing data validation locally is:

In the k-th round of the federated task, the corresponding "multiplication" method is used for validation, and each edge server designs two decoding functions of its own, as follows:

Figure FDA0003740951410000041

Figure FDA0003740951410000042

where L_0 is the length of (Figure FDA0003740951410000043) and L' is the length of (Figure FDA0003740951410000044); the subscript k of a decoding function denotes the decoding function in the k-th round of the federated task;

L_0 = χ·L

Figure FDA0003740951410000045

where L is the length of (Figure FDA0003740951410000046), and (Figure FDA0003740951410000047) and (Figure FDA0003740951410000048) are of equal length.

It is required that (Figure FDA0003740951410000049) satisfy: the decoding functions of all edge servers, applied to the same data packet, yield all 0s under the "union" operation and all 1s under the "intersection" operation, i.e.:

Figure FDA00037409514100000410

Figure FDA00037409514100000411

First, the decoding functions are initialized as follows:

Figure FDA00037409514100000412

The data packet (Figure FDA00037409514100000413) is multiplied by the corresponding decoding functions in the other servers; since the binary 0 bits in (Figure FDA00037409514100000414) multiply to 0, E_i is guaranteed to obtain only its own portion of the data packet; where the binary bit in (Figure FDA00037409514100000415) is 1, the ciphertext of the gradient information at the corresponding position is obtained, as follows:

Figure FDA00037409514100000416

E_i adds all the packet arrays obtained from the other edge servers E_j at the corresponding positions, obtains all the ciphertext data, and updates it to the final (Figure FDA00037409514100000417), i.e.:

Figure FDA00037409514100000418

Each time secure multi-party computation is performed, as k increases, the binary form of the decoding function (Figure FDA0003740951410000051) in each E_i is circularly shifted left by m units to guarantee the dynamics of the sharing of (Figure FDA0003740951410000052), dividing it evenly among E_1, E_2, ..., E_n with no repetition of the data information in each part.
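A small sketch of the masking behavior described above, with the binary decoding functions modeled as 0/1 vectors and the per-round left circular shift (names and the toy packet are ours):

```python
# Sketch of the decoding-function mechanism: multiplying a packet by a binary
# mask keeps only the positions assigned to this edge server (0 bits zero out
# the rest), and masks are circularly shifted left by m units each round so
# the shared slices change dynamically.

def select(packet: list[int], mask: list[int]) -> list[int]:
    """Elementwise product: positions with mask bit 0 are zeroed out."""
    return [p * m for p, m in zip(packet, mask)]

def rotate_left(mask: list[int], m: int) -> list[int]:
    """Left circular shift of the binary mask by m positions."""
    m %= len(mask)
    return mask[m:] + mask[:m]

packet = [10, 20, 30, 40]            # toy ciphertext packet
mask = [1, 1, 0, 0]                  # this server's decoding function
kept = select(packet, mask)          # this server's slice of the packet
next_mask = rotate_left(mask, 1)     # decoding function for the next round
```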
5. The real-time reinforced federated learning data privacy security method based on the MePC-F model in the Internet of Vehicles according to claim 1, characterized in that in step S4 the specific method of the PreFLa algorithm is:

PreFLa adapts reinforcement learning (RL) to select the optimal parameter weight ratio a_i,k for aggregating the global parameter (Figure FDA0003740951410000053).

In the uplink communication phase, each edge server not only trains the local model but also uploads its local parameters to the cloud server CS for joint aggregation; after executing the MePC algorithm in the k-th federation round, E_i uploads the parameters (Figure FDA0003740951410000054) and (Figure FDA0003740951410000055) to CS through a TLS/SSL secure channel. In the aggregation phase, because of the imbalanced distribution and data heterogeneity of each ES, the model parameters used for aggregation have a crucial influence on the convergence speed of this phase; it is therefore necessary to consider the parameter weight ratio a_i,k of participant E_i in the k-th round of federated aggregation.

DQN-based reinforcement learning is used to predict the parameter weight ratio, storing information through the Q function to prevent the multidimensional curse of the state space; to better personalize the model and reduce the waiting time for uploading weights in MePC-F, DQN selects the optimal parameter weight ratio a_i,k and aggregates it to update the global parameter (Figure FDA0003740951410000056) in CS. The reinforcement learning comprises: state, action, reward function, and feedback.
6. The real-time reinforced federated learning data privacy security method based on the MePC-F model in the Internet of Vehicles according to claim 5, characterized in that in step S4 the specific methods of the state, action, reward function, and feedback are:

State: the state of the k-th round is (Figure FDA0003740951410000057), where (Figure FDA0003740951410000058) is the accuracy difference, expressed as:

Figure FDA0003740951410000059

Action: the parameter weight ratio a_i,k represents the action of the k-th round of the federated task; to avoid falling into a local optimum, the ε-greedy algorithm is used to optimize the action-selection process and obtain a_i,k:

Figure FDA0003740951410000061

where P is the set of weight permutations, rand is a random number with rand ∈ [0, 1], and Q(s_i,k, a_i,k) denotes the cumulative discounted return of the agent when taking action a_i,k in state s_i,k; once the DQN has been trained to approximate Q(s_i,k, a_i,k), during testing the DQN agent computes {Q(s_i,k, a_i,k) | a_i,k ∈ [P]} for all actions in the k-th round; each action value represents the maximum expected return the agent obtains by choosing the specific action a_i,k in state s_i,k.

Reward: the reward observed at the end of the k-th federation round is set to:

Figure FDA0003740951410000062

where (Figure FDA0003740951410000063) is a positive number ensuring that r_k grows exponentially with the training accuracy Δacc_i,k; the first term incentivizes the agent to select devices that achieve higher test accuracy; (Figure FDA0003740951410000065) is used to control the change of r_k as Δacc_i,k grows; when Δacc_i,k is less than 0, r_k ∈ (-1, 0).

The DQN agent is trained to maximize the expectation of the cumulative discounted reward, as follows:

Figure FDA0003740951410000064

where γ ∈ (0, 1] denotes a factor that discounts future rewards.

After obtaining r_k, the cloud server CS saves a multidimensional quadruple B_k = (s_i,k, a_i,k, r_k, s_i,k+1) for each round of the federated task; the optimal action-value function Q(s_i,k, a_i,k) is the cheat sheet sought by the RL agent, defined as the maximum expectation of the cumulative discounted return starting from s_i,k:

Q(s_i,k, a_i,k) = E(r_i,k + γ max Q(s_i,k+1, a_i,k) | s_i,k, a_i,k)

Function approximation is applied to learn a parameterized value function Q(s_i,k, a_i,k; w_k) that approximates the optimal value function Q(s_i,k, a_i,k); r_k + γ max Q(s_i,k+1, a_i,k) is the target of learning Q(s_i,k, a_i,k; w_k); a DNN is used to represent the function approximator, and the RL learning problem becomes minimizing the MSE loss between the target and the approximator, defined as:

l(w_k) = (r_i,k + γ max Q(s_i,k+1, a_i,k; w_k) - Q(s_i,k, a_i,k; w_k))^2

CS updates the global parameter w_k as:

Figure FDA0003740951410000071

where η ≥ 0 is the step size.

After the cloud server CS obtains the best learned model, it obtains the k-th round weight-ratio sequence a_i,k and updates the global parameter (Figure FDA0003740951410000072) to:

Figure FDA0003740951410000073

All edge servers update the global parameter (Figure FDA0003740951410000074) and start the next T rounds of local training.
7.根据权利要求1所述的车联网中基于MePC-F模型的实时强化联邦学习数据隐私安全方法,其特征在于,该方法中HE加密的方法具体为:7. The real-time enhanced federated learning data privacy security method based on the MePC-F model in the Internet of Vehicles according to claim 1 is characterized in that the HE encryption method in the method is specifically: 权重矩阵和偏置向量的加密方案遵循相同的思想,实数a的加法同态加密表示为aE,在加法同态加密中,对于任意两个数a和b,有aE+bE=(a+b)E;将任何实数r转换为编码的有理数不动点v的方法是:The encryption schemes for the weight matrix and the bias vector follow the same idea. The additive homomorphic encryption of a real number a is represented by a E . In additive homomorphic encryption, for any two numbers a and b, a E + b E = (a+b) E . The method to convert any real number r into an encoded rational fixed point v is:
Figure FDA0003740951410000076
Each encoded real number r in the gradient
Figure FDA0003740951410000075
can be represented as an H-bit fixed-point rational number consisting of one sign bit, z integer bits, and d fractional bits; each encodable rational number is therefore defined by its H = 1 + z + d bits. The encoding is chosen to allow multiplication, which requires an operating modulus of H + 2d bits to avoid comparisons.
Decoding is defined as:
Figure FDA0003740951410000081
Multiplication of these encoded numbers requires removing a factor of 1/2 d . With Paillier additive encryption, encoded multiplication can be computed exactly, but only one homomorphic multiplication is guaranteed; for simplicity this factor is handled at decoding time. The largest encryptable integer is V − 1, so the largest encryptable real number must take this into account; the integer bits z and fractional bits d are therefore chosen such that:
V ≥ 2 H+2d ≥ 2 1+z+3d .
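The fixed-point scheme of claim 7 can be sketched as follows under stated assumptions: the encode/decode rules (scale by 2^d, reduce mod V, wrap negatives into the top half of [0, V)) are a standard reading of the claim rather than its exact formulas, the bit widths D and Z are assumed values, and plain modular integers stand in for Paillier ciphertexts to illustrate the additive property a E + b E = (a + b) E :

```python
# Hypothetical sketch of the claim's fixed-point encoding (not the patent's
# implementation). A real r is scaled by 2**d and reduced mod V; negatives
# wrap to the top half of [0, V). Ordinary addition mod V on the encodings
# mimics what Paillier guarantees on ciphertexts: E(a)+E(b) decrypts to a+b.

D = 16                 # fractional bits d (assumed)
Z = 31                 # integer bits z (assumed)
H = 1 + Z + D          # sign bit + integer bits + fractional bits
V = 1 << (H + 2 * D)   # modulus satisfying V >= 2**(H + 2d) >= 2**(1 + z + 3d)

def encode(r: float) -> int:
    # v = round(r * 2**d) mod V; a negative r wraps around the modulus
    return round(r * (1 << D)) % V

def decode(v: int) -> float:
    # Values in the top half of [0, V) represent negatives
    if v >= V // 2:
        v -= V
    return v / (1 << D)

a, b = 3.25, -1.5
# Additive homomorphism on encodings: decode(enc(a) + enc(b)) == a + b
print(decode((encode(a) + encode(b)) % V))  # → 1.75

# One product of encodings carries an extra 2**d scale factor, removed at
# decode time by dividing by 2**(2d) instead of 2**d -- the claim's
# "only one homomorphic multiplication" caveat.
p = (encode(a) * encode(b)) % V
print((p - V if p >= V // 2 else p) / (1 << (2 * D)))  # → -4.875
```

The headroom of 2d extra modulus bits is what lets a single product of two H-bit encodings fit without overflow; a second multiplication would exceed it, matching the one-multiplication limit stated above.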
8. The real-time reinforced federated learning data privacy security method based on the MePC-F model in the Internet of Vehicles according to claim 1, wherein the optimal loss function in step S5 is
Figure FDA0003740951410000082
where L(w i ) denotes the loss of the E i network.
CN202210816716.3A 2022-07-12 2022-07-12 Real-time reinforced federal learning data privacy security method based on MePC-F model in Internet of vehicles Active CN115310121B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210816716.3A CN115310121B (en) 2022-07-12 2022-07-12 Real-time reinforced federal learning data privacy security method based on MePC-F model in Internet of vehicles

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210816716.3A CN115310121B (en) 2022-07-12 2022-07-12 Real-time reinforced federal learning data privacy security method based on MePC-F model in Internet of vehicles

Publications (2)

Publication Number Publication Date
CN115310121A CN115310121A (en) 2022-11-08
CN115310121B true CN115310121B (en) 2023-04-07

Family

ID=83857637

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210816716.3A Active CN115310121B (en) 2022-07-12 2022-07-12 Real-time reinforced federal learning data privacy security method based on MePC-F model in Internet of vehicles

Country Status (1)

Country Link
CN (1) CN115310121B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115731424B (en) * 2022-12-03 2023-10-31 北京邮电大学 Image classification model training method and system based on enhanced federal domain generalization
CN116260617A (en) * 2022-12-22 2023-06-13 湖南匡安网络技术有限公司 Power grid industrial control protocol intrusion detection method and system based on federal learning
CN116013067A (en) * 2022-12-30 2023-04-25 中国联合网络通信集团有限公司 Vehicle data processing method, processor and server
CN115860789B (en) * 2023-03-02 2023-05-30 国网江西省电力有限公司信息通信分公司 A Day-Ahead Scheduling Method of CES Based on FRL
CN116228729A (en) * 2023-03-15 2023-06-06 北京工商大学 Heterogeneous robust federal learning method based on self-step learning
CN116610958B (en) * 2023-06-20 2024-07-26 河海大学 Distributed model training method and system for reservoir water quality detection using drone swarms
CN117812564B (en) * 2024-02-29 2024-05-31 湘江实验室 A federated learning method, device, equipment and medium applied to Internet of Vehicles
CN117873402B (en) * 2024-03-07 2024-05-07 南京邮电大学 A collaborative edge cache optimization method based on asynchronous federated learning and perceptual clustering

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111160573B (en) * 2020-04-01 2020-06-30 支付宝(杭州)信息技术有限公司 Method and device for protecting business prediction model of data privacy joint training by two parties
CN111611610B (en) * 2020-04-12 2023-05-30 西安电子科技大学 Federated learning information processing method, system, storage medium, program, terminal
CN112100295A (en) * 2020-10-12 2020-12-18 平安科技(深圳)有限公司 User data classification method, device, equipment and medium based on federated learning
CN112199702B (en) * 2020-10-16 2024-07-26 鹏城实验室 Privacy protection method, storage medium and system based on federal learning
CN112015749B (en) * 2020-10-27 2021-02-19 支付宝(杭州)信息技术有限公司 Method, device and system for updating business model based on privacy protection
CN112347500B (en) * 2021-01-11 2021-04-09 腾讯科技(深圳)有限公司 Machine learning method, device, system, equipment and storage medium of distributed system
CN113037460B (en) * 2021-03-03 2023-02-28 北京工业大学 A privacy-preserving method for federated learning based on homomorphic encryption and secret sharing
CN113435472A (en) * 2021-05-24 2021-09-24 西安电子科技大学 Vehicle-mounted computing power network user demand prediction method, system, device and medium

Also Published As

Publication number Publication date
CN115310121A (en) 2022-11-08

Similar Documents

Publication Publication Date Title
CN115310121B (en) Real-time reinforced federal learning data privacy security method based on MePC-F model in Internet of vehicles
Han et al. Logistic regression on homomorphic encrypted data at scale
CN111600707B (en) A method for decentralized federated machine learning under privacy protection
Zhang et al. Dubhe: Towards data unbiasedness with homomorphic encryption in federated learning client selection
CN112949741B (en) Convolutional neural network image classification method based on homomorphic encryption
CN109104544A (en) A kind of New chaotic image encryption method synchronous based on complex network
CN117640253B (en) Privacy protection method and system for federated learning based on homomorphic encryption
CN117294469A (en) Privacy protection method for federal learning
Zhu et al. Enhanced federated learning for edge data security in intelligent transportation systems
Ghavamipour et al. Federated synthetic data generation with stronger security guarantees
CN118400089A (en) Block chain-based intelligent internet of things privacy protection federation learning method
Qiu et al. Privacy preserving federated learning using ckks homomorphic encryption
Chen et al. Lightweight privacy-preserving training and evaluation for discretized neural networks
Li et al. An Adaptive Communication‐Efficient Federated Learning to Resist Gradient‐Based Reconstruction Attacks
CN116684061A (en) A private picture encryption method and device based on an improved AES algorithm based on key expansion
CN110737907A (en) Anti-quantum computing cloud storage method and system based on alliance chain
CN114760023A (en) Model training method and device based on federal learning and storage medium
Sengupta et al. Publicly verifiable secure cloud storage for dynamic data using secure network coding
Chu et al. Random linear network coding for peer-to-peer applications
CN118740360A (en) A secure aggregation method and system for federated learning based on modular component homomorphism
Xu et al. Camel: Communication-Efficient and Maliciously Secure Federated Learning in the Shuffle Model of Differential Privacy
Zhao et al. SuperFL: Privacy-preserving federated learning with efficiency and robustness
CN111277406A (en) A method for comparing the advantages of secure two-way vector based on blockchain
CN115865302B (en) A privacy-preserving multi-matrix multiplication method
Ramezanipour et al. A secure and robust images encryption scheme using chaos game representation, logistic map and convolutional auto-encoder

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant