CN114900343B - Internet of things equipment abnormal flow detection method based on clustered federal learning - Google Patents

Internet of things equipment abnormal flow detection method based on clustered federal learning

Info

Publication number
CN114900343B
Authority
CN
China
Prior art keywords
neural network
network model
local
central server
participants
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210442394.0A
Other languages
Chinese (zh)
Other versions
CN114900343A (en)
Inventor
马卓
高佳晨
刘洋
杨易龙
刘心晶
李腾
张俊伟
马建峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lanxiang Zhilian Hangzhou Technology Co ltd
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN202210442394.0A priority Critical patent/CN114900343B/en
Publication of CN114900343A publication Critical patent/CN114900343A/en
Application granted granted Critical
Publication of CN114900343B publication Critical patent/CN114900343B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/04 Processing captured monitoring data, e.g. for logfile generation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/50 Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computing Systems (AREA)
  • Signal Processing (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computer Hardware Design (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for detecting abnormal traffic of Internet of things devices based on clustered federated learning. The implementation steps are: initializing the federated learning system; local participants perform local iterative training of the global neural network model; the central server judges whether the clustering start condition is met; the central server aggregates the neural network model and distributes it; the central server clusters all participants; the central server aggregates the neural network model within each cluster and distributes it; and all local participants perform abnormal traffic detection. With the invention, the central server can divide the Internet of things devices into different clusters according to their data distributions, and aggregating the global model within each cluster achieves model optimization and personalization when data are unevenly distributed, thereby improving the abnormal traffic detection accuracy of Internet of things devices.

Description

Internet of things equipment abnormal flow detection method based on clustered federal learning
Technical Field
The invention relates to the field of Internet of Things (IoT) security, in particular to network abnormal traffic detection, and more specifically to a distributed method for detecting abnormal traffic of Internet of Things devices based on clustered federated learning.
Background
The Internet of Things is a network that extends and expands the Internet: various information-sensing devices are combined with the Internet to form a huge network that interconnects people, machines and objects at any time and in any place. Internet of Things devices are Internet-connected devices attached to objects; they include bar codes, radio frequency identification, sensors, global positioning systems, laser scanners and the like, among which sensor devices are the most basic. Internet of Things devices therefore play an important role in present-day production and daily life and store a large amount of private information. At the same time, the means and scale of attacks against the Internet of Things keep growing and seriously threaten the normal order of society.
In order to respond quickly and accurately to abnormal conditions in a network, maintain normal network communication and improve quality of service, network abnormal traffic detection has attracted wide attention. An abnormal traffic detection system mainly models and detects abnormal behavior by technical means and warns the network administrator when a traffic anomaly is found.
Abnormal traffic detection based on machine learning offers high accuracy and a high degree of automation: through iterative training on massive data, machine learning can complete high-precision anomaly detection automatically. However, in a multi-source, heterogeneous Internet of Things scenario, high-precision machine-learning-based detection requires the support of massive data, and in a large-scale distributed environment, data silos greatly increase the difficulty of model training. Meanwhile, collecting users' massive data for model training also raises concerns about data privacy and security. To solve these problems, the federated learning framework has been applied to abnormal traffic detection in Internet of Things scenarios.
A patent document filed by the Information Engineering University of the PLA Strategic Support Force discloses a distributed Internet of Things intrusion detection method and system based on blockchain and federated learning, aimed at Internet of Things devices (application number 202110797560.4, application publication number CN113794675A). The method uses the distributed training of federated learning to improve the training efficiency and attack detection accuracy of an intrusion detection model, and uses the distributed storage of the blockchain to solve the security problem of centralized storage.
Although the method solves the data-silo problem through federated learning and thus provides massive data for machine learning, in an era in which heterogeneous networks converge and massive terminals routinely access the Internet of Things, the network traffic distributions of different devices diverge because diversified terminals have different security requirements. The method therefore suffers from two technical problems: model training cannot be optimized when data are unevenly distributed, and the globally trained model is not well suited to detecting abnormal traffic in the local network of an Internet of Things device, which leads to low detection accuracy.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an abnormal traffic detection scheme for Internet of Things devices based on multi-task clustered federated learning. It addresses two problems: model training cannot be optimized in scenarios with unevenly distributed data, and the global federated-learning model is not suitable for detecting abnormal traffic in the local network of an Internet of Things device, both of which lead to low abnormal traffic detection accuracy.
In order to achieve the purpose, the technical scheme adopted by the invention comprises the following steps:
(1) Initializing the federal learning system:
the initialization builds a federated learning system comprising a central server and N Internet of things devices M = {M_1, M_2, ..., M_n, ..., M_N} acting as local participants. The communication round number between the central server and the local participants is r, the maximum number of communication rounds is R, and the number of local iterative training rounds of each participant is T. The central server initializes a global neural network model weight parameter θ_0 for anomaly detection and distributes it to all local participants, with r = 1, where N ≥ 2, M_n denotes the n-th local participant, R ≥ 20 and T ≥ 1;
(2) Local iterative training is carried out on the global neural network model by the local participants:
each local participant M_n in the participant set M uses its own local traffic data D_n in the current round r to carry out T rounds of local iterative training on the global neural network model θ_0, and uploads the resulting global neural network model weight parameter update Δθ_n^r to the central server, where the traffic data sets of all participants M are D = {D_1, D_2, ..., D_n, ..., D_N};
(3) The central server judges whether the cluster starting condition is met:
the central server computes, over the set of neural network model weight parameter updates uploaded by all participants in the current round, Δθ_r = {Δθ_1^r, Δθ_2^r, ..., Δθ_N^r}, the maximum value Δθ_max^r and the mean value Δθ_mean^r of Δθ_r (measured by the L2 norm) together with their difference Δθ_dif^r = Δθ_max^r - Δθ_mean^r, and judges whether the difference and the preset threshold Δθ_sub satisfy Δθ_dif^r > Δθ_sub, and whether the communication round number r and the preset threshold r_c satisfy r ≥ r_c; if both are satisfied, step (5) is executed, otherwise step (4) is executed;
(4) The central server aggregates the neural network model and issues:
the central server aggregates the set Δθ_r of local neural network model weight parameters of all participants; the aggregation result is the global neural network model weight parameter θ_r of the current round. The aggregation result θ_r is distributed to every participant, and whether r = R holds is judged; if so, step (7) is executed, otherwise r = r + 1, θ_0 = θ_r, and step (2) is executed;
(5) The central server clusters all participants:
using the set Δθ_r of neural network model weight parameter updates uploaded by the participants, the central server computes the set a_r of cosine similarities between the gradient optimization directions of all participants. According to the cosine similarity set a_r, the central server divides the set M of all N local participants into two clusters C_1 and C_2 according to Equation 1:

(C_1, C_2) = arg min_{C_1 ∪ C_2 = M} ( max_{M_i ∈ C_1, M_j ∈ C_2} a_{i,j} )   (Equation 1)

where max takes the maximum value, min takes the minimum value, C_1 ∪ C_2 = M and C_1 ∩ C_2 = ∅;
(6) The central server aggregates the neural network model in the cluster and issues:
according to the clusters C_1 and C_2, the central server divides the set Δθ_r of neural network model weight parameter updates uploaded by the participants into the update set uploaded by the participants belonging to cluster C_1, Δθ_{r,1} = {Δθ_n | M_n ∈ C_1}, and the update set uploaded by the participants belonging to cluster C_2, Δθ_{r,2} = {Δθ_n | M_n ∈ C_2};
(6a) The central server aggregates the neural network model weight parameter update set Δθ_{r,1}; the aggregation result is the global neural network model weight parameter θ_{r,1} of the current round for the members of cluster C_1. The aggregation result θ_{r,1} is distributed to every participant in cluster C_1, and whether r = R holds is judged; if so, θ_0 = θ_{r,1} and step (7) is executed; otherwise r = r + 1, θ_0 = θ_{r,1}, M = C_1, and step (2) is executed;
(6b) The central server aggregates the neural network model weight parameter update set Δθ_{r,2}; the aggregation result is the global neural network model weight parameter θ_{r,2} of the current round for the members of cluster C_2. The aggregation result θ_{r,2} is distributed to every participant in cluster C_2, and whether r = R holds is judged; if so, θ_0 = θ_{r,2} and step (7) is executed; otherwise r = r + 1, θ_0 = θ_{r,2}, M = C_2, and step (2) is executed;
(7) All local participants perform abnormal traffic detection:
all local participants M use the neural network model weight parameter θ_0 distributed by the central server. The neural network model extracts features from the locally collected network traffic according to the format of the training data set D; after digitization and normalization, the features are passed through the multi-layer network structure to extract high-dimensional features, and classification is finally performed at the fully connected layer. If the classification result is an attack type, the traffic is abnormal; otherwise it is normal traffic.
Compared with the prior art, the invention has the following advantages:
The invention uses the idea of clustering: at the model aggregation stage of the central server, the participants are clustered according to the model updates they upload, and the model is then aggregated within each cluster, which reduces the negative influence between participants whose gradient optimization directions differ and thereby optimizes the model. Meanwhile, because the data distributions inside a cluster are the same or similar, the model aggregated within the cluster fits every participant in that cluster well. The invention thus achieves model optimization and personalization when traffic data are unevenly distributed, thereby improving the abnormal traffic detection accuracy of Internet of things devices.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention;
FIG. 2 is an architecture diagram of clustered federated learning as constructed by the present invention;
FIG. 3 is a flow chart of the present invention center server implementing clustering;
FIG. 4 is a graph comparing the change in accuracy for 80 rounds of Federal learning training.
Detailed Description
The invention is described in further detail below with reference to the figures and the specific embodiments.
Referring to fig. 1, the present invention includes the following steps.
(1) Initializing the federal learning system:
referring to fig. 2, the initialization builds a clustered federated learning system comprising a central server and N Internet of things devices M = {M_1, M_2, ..., M_n, ..., M_N} acting as local participants, where all participants are by default placed in the same cluster. The communication round number between the central server and the local participants is r, the maximum number of communication rounds is R, and the number of local iterative training rounds of each participant is T. The central server initializes a global neural network model weight parameter θ_0 for anomaly detection and distributes it to all local participants, with r = 1, where N ≥ 2, M_n denotes the n-th local participant, R ≥ 20 and T ≥ 1;
because the computing power of Internet of things devices is limited, a convolutional neural network with a simple structure is constructed as the global neural network model. Its structure is, in order: a first convolutional layer, an activation function layer, a max-pooling layer, a second convolutional layer, an activation function layer, a max-pooling layer and a fully connected layer. The first convolutional layer has 1 input channel, 6 output channels, a kernel size of 2, padding of 1 and a stride of 1; the second convolutional layer has 6 input channels, 16 output channels, a kernel size of 2 × 2, padding of 1 and a stride of 1; the activation function layers use the rectified linear unit (ReLU); the max-pooling layers use a 2 × 2 pooling kernel with a stride of 2; the fully connected layer has an input dimension of 64 and an output dimension of 5. In this example, the number of Internet of things devices is N = 6, the number of federated learning communication rounds is R = 80, and the number of local iterative training rounds is T = 5.
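For illustration, a minimal PyTorch sketch of such a convolutional network is given below (the simulation section of this document names PyTorch as the deep learning framework). The class name TrafficCNN is hypothetical, and the 7 × 7 input grid, i.e. a 41-feature KDD Cup 99 record zero-padded to 49 values, is an assumption chosen so that the flattened feature size works out to the stated 64; the layer hyper-parameters follow the figures above.

    import torch
    import torch.nn as nn

    class TrafficCNN(nn.Module):
        """Sketch of the simple CNN described above (hypothetical class name).

        Layers follow the text: conv(1->6, k=2, pad=1, stride=1), ReLU, 2x2 max-pool,
        conv(6->16, k=2, pad=1, stride=1), ReLU, 2x2 max-pool, fully connected layer
        with 5 outputs (the five KDD Cup 99 traffic classes). With an assumed 7x7
        input grid the flattened feature map is 16 x 2 x 2 = 64, matching the text.
        """
        def __init__(self, num_classes: int = 5):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 6, kernel_size=2, padding=1, stride=1),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=2, stride=2),
                nn.Conv2d(6, 16, kernel_size=2, padding=1, stride=1),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=2, stride=2),
            )
            self.classifier = nn.Linear(64, num_classes)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            x = self.features(x)                      # (batch, 16, 2, 2) for a 7x7 input
            return self.classifier(torch.flatten(x, 1))

    # model = TrafficCNN(); model(torch.randn(32, 1, 7, 7)).shape  # -> torch.Size([32, 5])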
(2) Local iterative training is carried out on the global neural network model by the local participants:
each local participant M_n in the participant set M uses its own local traffic data D_n in the current round r to carry out T = 5 rounds of local iterative training on the global neural network model θ_0, and uploads the resulting global neural network model weight parameter update Δθ_n^r to the central server, where the traffic data sets of all participants M are D = {D_1, D_2, ..., D_n, ..., D_N};
The participants' local traffic data are obtained by splitting the KDD Cup 99 data set, whose traffic falls into 5 classes: normal, denial of service, sniffing, remote-to-local and user-to-root. To simulate a scenario with unevenly distributed data, normal traffic is distributed evenly among all 6 participants, denial-of-service traffic is distributed evenly between the 4th and 6th participants, sniffing traffic between the 1st and 3rd participants, and remote-to-local and user-to-root traffic between the 2nd and 5th participants; for every participant the ratio of training set to test set is 8:2.
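A minimal sketch of one participant's local training round follows, assuming the TrafficCNN sketch above and a PyTorch DataLoader over the participant's local split D_n. The cross-entropy loss, the Adam optimizer and the T = 5 local epochs follow the text; the learning rate of 1e-3 and returning the trained weights themselves as the uploaded update are assumptions.

    import copy
    import torch
    import torch.nn as nn

    def local_train(global_state: dict, loader, epochs: int = 5, lr: float = 1e-3) -> dict:
        """One participant's T rounds of local training on its own traffic data D_n.

        global_state is the weight dict theta_0 distributed by the central server;
        the returned state dict is uploaded back as this participant's update.
        (lr = 1e-3 is an assumption; the text does not state the learning rate.)
        """
        model = TrafficCNN()
        model.load_state_dict(copy.deepcopy(global_state))   # start from theta_0
        optimizer = torch.optim.Adam(model.parameters(), lr=lr)
        criterion = nn.CrossEntropyLoss()
        model.train()
        for _ in range(epochs):                               # T local epochs
            for features, labels in loader:                   # batches of local traffic data
                optimizer.zero_grad()
                loss = criterion(model(features), labels)
                loss.backward()
                optimizer.step()
        return model.state_dict()                             # uploaded to the central server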
(3) The central server judges whether a cluster starting condition is met:
the central server computes, over the set of neural network model weight parameter updates uploaded by all participants in the current round, Δθ_r = {Δθ_1^r, Δθ_2^r, ..., Δθ_N^r}, the maximum value Δθ_max^r and the mean value Δθ_mean^r of Δθ_r (measured by the L2 norm) together with their difference Δθ_dif^r = Δθ_max^r - Δθ_mean^r, and judges whether the difference and the preset threshold Δθ_sub = 0.5 satisfy Δθ_dif^r > Δθ_sub, and whether the communication round number r and the preset threshold r_c = 20 satisfy r ≥ r_c; if both are satisfied, step (5) is executed, otherwise step (4) is executed;
The difference Δθ_dif^r is chosen as the judgment condition for starting clustered federated learning because, when the data are independent and identically distributed, a stationary solution of the overall federated learning optimization objective is also a stationary solution on the local data of every client. When data that are not independent and identically distributed are put into federated learning, however, a stationary solution of the global optimization objective is not necessarily a stationary solution on every client's local data. Therefore, when the difference Δθ_dif^r is larger than the threshold, the local traffic data distributions of the Internet of things devices participating in federated learning differ greatly, and clustered federated learning becomes necessary.
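The sketch below shows one way the central server could evaluate this trigger from the uploaded updates, assuming each update has been flattened into a single vector. Taking the mean as the norm of the averaged update, rather than the average of the individual norms, is an assumption, since this passage only names the maximum, the mean and their difference; the thresholds Δθ_sub = 0.5 and r_c = 20 follow the embodiment.

    import torch

    def should_cluster(updates: list, round_idx: int,
                       theta_sub: float = 0.5, r_c: int = 20) -> bool:
        """Clustering-start condition of step (3) (a sketch, not the exact patented formula).

        updates   -- flattened update vectors, one tensor per participant
        round_idx -- current communication round r
        """
        stacked = torch.stack(updates)                 # shape (N, d)
        max_norm = stacked.norm(dim=1).max()           # Delta-theta_max
        mean_norm = stacked.mean(dim=0).norm()         # Delta-theta_mean (assumed form)
        diff = max_norm - mean_norm                    # Delta-theta_dif
        return bool(diff > theta_sub) and round_idx >= r_c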
(4) The central server aggregates the neural network model and issues:
the central server aggregates the set Δθ_r of local neural network model weight parameters of all participants; the aggregation result is the global neural network model weight parameter θ_r of the current round. The aggregation result θ_r is distributed to every participant, and whether r = R = 80 holds is judged; if so, step (7) is executed, otherwise r = r + 1, θ_0 = θ_r, and step (2) is performed.
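A minimal sketch of the server-side aggregation, assuming the participants upload full state dicts as in the local_train sketch above and that the aggregation is a plain unweighted average; the exact weighting scheme is not spelled out in this passage.

    import torch

    def aggregate(updates: list) -> dict:
        """Average the participants' uploaded weight dicts into theta_r (a sketch)."""
        aggregated = {}
        for key in updates[0]:
            aggregated[key] = torch.stack([u[key] for u in updates]).mean(dim=0)
        return aggregated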
(5) The central server clusters all participants:
using the set Δθ_r of neural network model weight parameter updates uploaded by the participants, the central server computes the set a_r of cosine similarities between the gradient optimization directions of all participants. According to the cosine similarity set a_r, the central server divides the set M of all N local participants into two clusters C_1 and C_2 according to Equation 1:

(C_1, C_2) = arg min_{C_1 ∪ C_2 = M} ( max_{M_i ∈ C_1, M_j ∈ C_2} a_{i,j} )   (Equation 1)

where max takes the maximum value, min takes the minimum value, C_1 ∪ C_2 = M and C_1 ∩ C_2 = ∅.
The theoretical basis of Equation 1 is that the cosine similarity between participants in the same cluster is necessarily greater than the cosine similarity between participants in different clusters, i.e. the minimum cosine similarity within a cluster is greater than the maximum cosine similarity across clusters. On this basis, and referring to fig. 3, the central server clusters the participants by iterative bipartition, with the communication round as the unit; in this example, participants 4 and 6 are grouped into one cluster, and participants 1, 2, 3 and 5 into the other.
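A sketch of the pairwise cosine similarities and the two-way split follows. Exhaustively searching all bipartitions for the one that minimizes the maximum cross-cluster similarity is one straightforward reading of Equation 1 and is assumed here; it is feasible for the small N = 6 of this embodiment.

    from itertools import combinations
    import torch

    def bipartition(updates: list):
        """Split participants into two clusters from their flattened updates (a sketch).

        Picks the bipartition (C1, C2) that minimises the maximum cosine similarity
        a_{i,j} between members of different clusters, as in Equation 1.
        """
        n = len(updates)
        unit = torch.stack([u / u.norm() for u in updates])   # unit-length update vectors
        sim = unit @ unit.T                                    # pairwise cosine similarities
        best, best_split = float("inf"), None
        for size in range(1, n // 2 + 1):
            for c1 in combinations(range(n), size):
                c2 = tuple(i for i in range(n) if i not in c1)
                cross = max(sim[i, j].item() for i in c1 for j in c2)
                if cross < best:
                    best, best_split = cross, (set(c1), set(c2))
        return best_split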
(6) The central server aggregates the neural network model in the cluster and issues:
according to the clusters C_1 and C_2, the central server divides the set Δθ_r of neural network model weight parameter updates uploaded by the participants into the update set uploaded by the participants belonging to cluster C_1, Δθ_{r,1} = {Δθ_n | M_n ∈ C_1}, and the update set uploaded by the participants belonging to cluster C_2, Δθ_{r,2} = {Δθ_n | M_n ∈ C_2};
(6a) The central server aggregates the neural network model weight parameter update set Δθ_{r,1}; the aggregation result is the global neural network model weight parameter θ_{r,1} of the current round for the members of cluster C_1. The aggregation result θ_{r,1} is distributed to every participant in cluster C_1, and whether r = R holds is judged; if so, θ_0 = θ_{r,1} and step (7) is executed; otherwise r = r + 1, θ_0 = θ_{r,1}, M = C_1, and step (2) is executed;
(6b) The central server aggregates the neural network model weight parameter update set Δθ_{r,2}; the aggregation result is the global neural network model weight parameter θ_{r,2} of the current round for the members of cluster C_2. The aggregation result θ_{r,2} is distributed to every participant in cluster C_2, and whether r = R holds is judged; if so, θ_0 = θ_{r,2} and step (7) is executed; otherwise r = r + 1, θ_0 = θ_{r,2}, M = C_2, and step (2) is executed;
After the communication round number r exceeds the threshold r_c = 20 of traditional federated learning, the objective learned in common by the 6 participants, namely detection of normal-type traffic, has already been improved through sharing. Referring to fig. 2, at this point, in order to continue knowledge sharing between nodes with similar data while reducing the negative influence between nodes with dissimilar data, the central server aggregates the models within each cluster and sends the aggregation results to the participants of the corresponding cluster.
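Building on the aggregate sketch above, intra-cluster aggregation simply restricts the average to each cluster's members and returns, for every participant, the model of its own cluster; the dictionary-based grouping below is purely illustrative.

    def aggregate_per_cluster(updates: list, clusters: tuple) -> dict:
        """Aggregate separately inside each cluster (a sketch).

        updates  -- state dicts indexed by participant
        clusters -- (C1, C2) as returned by bipartition, sets of participant indices
        Returns a mapping: participant index -> the cluster model (theta_{r,1} or
        theta_{r,2}) that the central server sends back to that participant.
        """
        models = {}
        for members in clusters:
            cluster_model = aggregate([updates[i] for i in sorted(members)])
            for i in members:
                models[i] = cluster_model
        return models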
(7) All local participants perform abnormal traffic detection:
all local participants M use the neural network model weight parameter θ_0 distributed by the central server. Features are extracted from the locally collected network traffic according to the KDD Cup 99 data set format; after digitization, the features are linearly normalized according to

x' = (x - x_min) / (x_max - x_min)

The processed traffic data are then fed into the convolutional neural network, high-dimensional features are extracted through the two convolutional layers, and classification is finally performed at the fully connected layer. If a traffic record is classified into one of the four attack types, denial of service, sniffing, remote-to-local or user-to-root, an anomaly is detected; otherwise the traffic is normal network traffic.
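A sketch of the local detection step follows, assuming a single 41-feature KDD Cup 99 record already mapped to numeric values. The min-max vectors would in practice come from the participant's training split, the class-index assignment is an assumption, and the zero-padding to a 7 × 7 grid matches the assumption made in the TrafficCNN sketch.

    import torch
    import torch.nn.functional as F

    NORMAL_CLASS = 0   # assumed index mapping: 0 = normal, 1-4 = denial of service,
                       # sniffing, remote-to-local, user-to-root

    def detect(model, record: torch.Tensor, x_min: torch.Tensor, x_max: torch.Tensor) -> bool:
        """Return True if a single 41-feature traffic record is classified as abnormal (a sketch)."""
        x = (record - x_min) / (x_max - x_min + 1e-12)    # linear (min-max) normalization
        x = F.pad(x, (0, 49 - x.numel()))                 # zero-pad 41 features to 49 values
        x = x.view(1, 1, 7, 7)                            # reshape to the assumed 7x7 grid
        model.eval()
        with torch.no_grad():
            predicted = model(x).argmax(dim=1).item()
        return predicted != NORMAL_CLASS                  # any attack class means abnormal traffic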
The technical effect of the present invention is further described in conjunction with simulation experiments.
1. Simulation conditions are as follows:
the simulation hardware platform is as follows: the processor Intel (R) Core (TM) i5, the main frequency 3.0GHz, the internal memory 8G and the display card GeForce GT 730.
The software platform of the simulation experiment is: Windows 10 Home operating system, PyCharm 2019, the Python 3.8 development language, the scikit-learn machine learning library, the PyTorch deep learning framework, and the third-party data processing libraries pandas and numpy.
The simulation experiment data, the KDD Cup 99 data set, were collected by the Lincoln Laboratory of the Massachusetts Institute of Technology (MIT) by simulating the operation of, and multiple attacks on, a typical United States Air Force local area network, yielding 9 weeks of TCP dump data. The KDD Cup 99 data set consists of approximately 4,900,000 single connection records, each containing 41 features.
2. Simulation content and result analysis:
the simulation experiment of the invention is to respectively carry out network abnormal flow detection simulation on simulation data KDD Cup 99 by adopting the invention and a distributed Internet of things intrusion detection method and system based on block chain and federal learning.
To evaluate the effect of the simulation experiment, the accuracy of abnormal traffic detection is used as the evaluation criterion for both the invention and the prior art. The accuracy changes over 80 rounds of federated learning training on the split KDD data set are shown in fig. 4:
referring to fig. 4 (a) for the performance of the prior art, it can be clearly seen that the accuracy of the cluster 1 fluctuates to a large extent in the whole federal learning process, which is indicated that negative optimization has occurred in the traditional federal learning framework based on the KDD data set with unevenly distributed data, so that the traditional federal learning framework is not suitable for detecting network abnormal traffic in this scenario. At the same time, we can see that the accuracy of cluster 2 fluctuates around 96% all the time, and further optimization cannot be continued.
Fig. 4(b) shows the training results of the abnormal traffic detection model for Internet of things devices based on clustered federated learning. The accuracy of cluster 1 fluctuates in the first stage of multi-task clustered federated learning, but the overall trend is upward, and after entering the second stage the accuracy stays above 99% without large fluctuations. Meanwhile, the accuracy of cluster 2 stops improving at 96% in the first stage, but after the second stage begins and personalized federated learning is carried out within the cluster, the accuracy rises above 99% and is maintained there.
The abnormal traffic detection accuracies of the Internet of things devices obtained after 80 rounds of federated learning training are summarized in Table 1:
TABLE 1 Comparison of abnormal traffic detection accuracy after 80 rounds of federated learning training

Algorithm | Cluster 1 accuracy | Cluster 2 accuracy
The invention | 99.98% | 99.95%
Prior art | 98.83% | 96.79%
As Table 1 shows, the abnormal traffic detection accuracy of the invention is higher than that of the prior art in both cluster 1 and cluster 2. This demonstrates that, when data are unevenly distributed, the invention can still optimize the model and provide a personalized model for the participants of each cluster, thereby improving the abnormal traffic detection accuracy of Internet of things devices.

Claims (8)

1. An abnormal traffic detection method of Internet of things equipment based on clustered federal learning is characterized in that a federal learning system is initialized; local iterative training is carried out on the global neural network model by a local participant; the central server judges whether a clustering starting condition is met; the central server aggregates the neural network model and issues the neural network model; the central server carries out clustering on all participants; the central server aggregates the neural network model in the cluster and issues the neural network model; all local participants carry out abnormal flow detection; the method specifically comprises the following steps:
(1) Initializing the federal learning system:
the initialization builds a federated learning system comprising a central server and N Internet of things devices M = {M_1, M_2, ..., M_n, ..., M_N} acting as local participants. The communication round number between the central server and the local participants is r, the maximum number of communication rounds is R, and the number of local iterative training rounds of each participant is T. The central server initializes a global neural network model weight parameter θ_0 for anomaly detection and distributes it to all local participants, with r = 1, where N ≥ 2, M_n denotes the n-th local participant, R ≥ 20 and T ≥ 1;
(2) Local iterative training is carried out on the global neural network model by the local participants:
each local participant M_n in the participant set M uses its own local traffic data D_n in the current round r to carry out T rounds of local iterative training on the global neural network model θ_0, and uploads the resulting global neural network model weight parameter update Δθ_n^r to the central server, where the traffic data sets of all participants M are D = {D_1, D_2, ..., D_n, ..., D_N};
(3) The central server judges whether a cluster starting condition is met:
the central server computes, over the set of neural network model weight parameter updates uploaded by all participants in the current round, Δθ_r = {Δθ_1^r, Δθ_2^r, ..., Δθ_N^r}, the maximum value Δθ_max^r and the mean value Δθ_mean^r of Δθ_r (measured by the L2 norm) together with their difference Δθ_dif^r = Δθ_max^r - Δθ_mean^r, and judges whether the difference and the preset threshold Δθ_sub satisfy Δθ_dif^r > Δθ_sub, and whether the communication round number r and the preset threshold r_c satisfy r ≥ r_c; if both are satisfied, step (5) is executed, otherwise step (4) is executed;
(4) The central server aggregates the neural network model and issues:
the central server aggregates the set Δθ_r of local neural network model weight parameters of all participants; the aggregation result is the global neural network model weight parameter θ_r of the current round. The aggregation result θ_r is distributed to every participant, and whether r = R holds is judged; if so, step (7) is executed, otherwise r = r + 1, θ_0 = θ_r, and step (2) is executed;
(5) The central server clusters all participants:
using the set Δθ_r of neural network model weight parameter updates uploaded by the participants, the central server computes the set a_r of cosine similarities between the gradient optimization directions of all participants. According to the cosine similarity set a_r, the central server divides the set M of all N local participants into two clusters C_1 and C_2 according to Equation 1:

(C_1, C_2) = arg min_{C_1 ∪ C_2 = M} ( max_{M_i ∈ C_1, M_j ∈ C_2} a_{i,j} )   (Equation 1)

where max takes the maximum value, min takes the minimum value, C_1 ∪ C_2 = M and C_1 ∩ C_2 = ∅;
(6) The central server aggregates the neural network model in the cluster and issues:
according to the clusters C_1 and C_2, the central server divides the set Δθ_r of neural network model weight parameter updates uploaded by the participants into the update set uploaded by the participants belonging to cluster C_1, Δθ_{r,1} = {Δθ_n | M_n ∈ C_1}, and the update set uploaded by the participants belonging to cluster C_2, Δθ_{r,2} = {Δθ_n | M_n ∈ C_2};
(6a) The central server aggregates the neural network model weight parameter update set Δθ_{r,1}; the aggregation result is the global neural network model weight parameter θ_{r,1} of the current round for the members of cluster C_1. The aggregation result θ_{r,1} is distributed to every participant in cluster C_1, and whether r = R holds is judged; if so, θ_0 = θ_{r,1} and step (7) is executed; otherwise r = r + 1, θ_0 = θ_{r,1}, M = C_1, and step (2) is executed;
(6b) The central server aggregates the neural network model weight parameter update set Δθ_{r,2}; the aggregation result is the global neural network model weight parameter θ_{r,2} of the current round for the members of cluster C_2. The aggregation result θ_{r,2} is distributed to every participant in cluster C_2, and whether r = R holds is judged; if so, θ_0 = θ_{r,2} and step (7) is executed; otherwise r = r + 1, θ_0 = θ_{r,2}, M = C_2, and step (2) is executed;
(7) All local participants perform abnormal traffic detection:
all local participants M use the neural network model weight parameter θ_0 distributed by the central server. The neural network model extracts features from the locally collected network traffic according to the format of the training data set D; after digitization and normalization, the features are passed through the multi-layer network structure to extract high-dimensional features, and classification is finally performed at the fully connected layer. If the classification result is an attack type, the traffic is abnormal; otherwise it is normal traffic.
2. The Internet of things equipment abnormal flow detection method based on clustered federal learning according to claim 1, wherein the global neural network model in step (1) is a convolutional neural network model whose structure comprises, in order: a first convolutional layer, an activation function layer, a max-pooling layer, a second convolutional layer, an activation function layer, a max-pooling layer and a fully connected layer.
3. The method for detecting abnormal traffic of Internet of things equipment based on clustered federal learning as claimed in claim 1, wherein in step (2) each local participant M_n uses its local traffic data D_n to carry out T rounds of local iterative training on the global neural network model θ_0, comprising the following implementation steps:
(2a) let the number of training rounds t = 1, and take the global neural network model θ_0 as the local neural network model m_0;
(2b) each local participant M_n uses its local traffic data D_n in the current round as the input for training the local neural network model m_0, takes the model's predicted values and the actual values of the traffic data as the input of a cross-entropy loss function, and computes the loss value L of the model on the local traffic data D_n in the current round, where:
L = CrossEntropyLoss(m_0; D_n)
and CrossEntropyLoss is the cross-entropy loss function;
(2c) the Adam gradient descent method is adopted: the gradient ∂L/∂m_0, obtained by differentiating the loss value L with respect to the weight parameters, is used to update the weight parameters of the local neural network model m_0, giving the model m_t trained in the current round:
m_t = m_0 - η · ∂L/∂m_0
where ∂ denotes the partial-derivative operation and η denotes the learning rate;
(2d) judge whether t = T holds; if so, the global neural network model weight parameter update Δθ_n^r is obtained; otherwise let t = t + 1, m_0 = m_t, and perform step (2b), where:
Δθ_n^r = m_t
4. The method for detecting abnormal traffic of Internet of things equipment based on clustered federal learning as claimed in claim 1, wherein in step (3) the mean value Δθ_mean^r, the maximum value Δθ_max^r and the difference Δθ_dif^r of the model parameter updates are calculated as follows:
Δθ_mean^r = || (1/|M|) · Σ_{n=1}^{N} Δθ_n^r ||
Δθ_max^r = max_{M_n ∈ M} || Δθ_n^r ||
Δθ_dif^r = Δθ_max^r - Δθ_mean^r
where | · | is the cardinality, Σ represents the summation operation, and || · || is the L2 norm operation.
5. The method for detecting abnormal traffic of equipment of the internet of things based on clustered federal learning as claimed in claim 1, wherein in step (4) the global model parameter θ_r is calculated as follows:
θ_r = (1/|M|) · Σ_{n=1}^{N} Δθ_n^r
6. The method for detecting abnormal traffic of Internet of things equipment based on clustered federal learning according to claim 1, wherein in step (5) the cosine similarity a_{i,j} between participant M_i and participant M_j is calculated as follows:
a_{i,j} = ⟨Δθ_i^r, Δθ_j^r⟩ / ( ||Δθ_i^r|| · ||Δθ_j^r|| )
where ⟨·, ·⟩ is the inner product operation.
7. The method for detecting abnormal traffic of Internet of things equipment based on clustered federal learning as claimed in claim 1, wherein in step (6a) the global neural network model weight parameter θ_{r,1} of the current round for the members of cluster C_1 is calculated as follows:
θ_{r,1} = (1/|C_1|) · Σ_{M_n ∈ C_1} Δθ_n^r
8. The Internet of things equipment abnormal flow detection method based on clustered federal learning according to claim 1, wherein in step (6b) the global neural network model weight parameter θ_{r,2} of the current round for the members of cluster C_2 is calculated as follows:
θ_{r,2} = (1/|C_2|) · Σ_{M_n ∈ C_2} Δθ_n^r
CN202210442394.0A 2022-04-25 2022-04-25 Internet of things equipment abnormal flow detection method based on clustered federal learning Active CN114900343B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210442394.0A CN114900343B (en) 2022-04-25 2022-04-25 Internet of things equipment abnormal flow detection method based on clustered federal learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210442394.0A CN114900343B (en) 2022-04-25 2022-04-25 Internet of things equipment abnormal flow detection method based on clustered federal learning

Publications (2)

Publication Number Publication Date
CN114900343A CN114900343A (en) 2022-08-12
CN114900343B true CN114900343B (en) 2023-01-24

Family

ID=82717750

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210442394.0A Active CN114900343B (en) 2022-04-25 2022-04-25 Internet of things equipment abnormal flow detection method based on clustered federal learning

Country Status (1)

Country Link
CN (1) CN114900343B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112364943A (en) * 2020-12-10 2021-02-12 广西师范大学 Federal prediction method based on federal learning
CN112800461A (en) * 2021-01-28 2021-05-14 深圳供电局有限公司 Network intrusion detection method for electric power metering system based on federal learning framework
CN113139600A (en) * 2021-04-23 2021-07-20 广东安恒电力科技有限公司 Intelligent power grid equipment anomaly detection method and system based on federal learning
CN114266361A (en) * 2021-12-30 2022-04-01 浙江工业大学 Model weight alternation-based federal learning vehicle-mounted and free-mounted defense method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A Survey on Federated Learning: The Journey; Sawsan AbdulRahman et al.; IEEE; 2021-04-01; full text *
Intrusion Detection Method Based on Federated Learning and Convolutional Neural Network; Wang Rong et al.; 《信息网络安全》; 2020-04-10 (No. 04); full text *

Also Published As

Publication number Publication date
CN114900343A (en) 2022-08-12

Similar Documents

Publication Publication Date Title
CN111614690B (en) Abnormal behavior detection method and device
CN110224862B (en) Multi-agent system network intrusion tolerance capability assessment method based on multilayer perceptron
CN108709745A (en) One kind being based on enhanced LPP algorithms and the quick bearing fault recognition method of extreme learning machine
Jiang et al. Electrical-STGCN: An electrical spatio-temporal graph convolutional network for intelligent predictive maintenance
CN105791051A (en) WSN (Wireless Sensor Network) abnormity detection method and system based on artificial immunization and k-means clustering
CN111723367B (en) Method and system for evaluating service scene treatment risk of power monitoring system
CN115858675A (en) Non-independent same-distribution data processing method based on federal learning framework
Ye et al. Deep over-the-air computation
CN112200263B (en) Self-organizing federal clustering method applied to power distribution internet of things
CN105654175A (en) Part supplier multi-target preferable selection method orienting bearing manufacturing enterprises
CN111581445A (en) Graph embedding learning method based on graph elements
CN117495205B (en) Industrial Internet experiment system and method
CN114580087B (en) Method, device and system for predicting federal remaining service life of shipborne equipment
CN112905671A (en) Time series exception handling method and device, electronic equipment and storage medium
Liu et al. Intrusion detection based on parallel intelligent optimization feature extraction and distributed fuzzy clustering in WSNs
CN115859344A (en) Secret sharing-based safe sharing method for data of federal unmanned aerial vehicle group
CN114900343B (en) Internet of things equipment abnormal flow detection method based on clustered federal learning
CN105722129A (en) Wireless sensing network event detection method and system based on FSAX-MARKOV model
CN110765668B (en) Concrete penetration depth test data abnormal point detection method based on deviation index
CN108985563B (en) Electromechanical system service dynamic marking method based on self-organizing feature mapping
CN104239785B (en) Intrusion detection data classification method based on cloud model
CN111461184A (en) XGB multi-dimensional operation and maintenance data anomaly detection method based on multivariate feature matrix
CN116296396A (en) Rolling bearing fault diagnosis method based on mixed attention mechanism residual error network
CN116136897A (en) Information processing method and device
CN115936477A (en) Industrial enterprise benefit comprehensive evaluation method and system based on edge cloud cooperation

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20240617

Address after: 311100, Building 1, No. 1500 Wenyi West Road, Cangqian Street, Yuhang District, Hangzhou City, Zhejiang Province, China, 593

Patentee after: Lanxiang Zhilian (Hangzhou) Technology Co.,Ltd.

Country or region after: China

Address before: 710071 No. 2 Taibai South Road, Shaanxi, Xi'an

Patentee before: XIDIAN University

Country or region before: China