CN114900343B - Internet of things equipment abnormal flow detection method based on clustered federal learning - Google Patents

Internet of things equipment abnormal flow detection method based on clustered federal learning

Info

Publication number
CN114900343B
Authority
CN
China
Prior art keywords
neural network
network model
local
central server
participants
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210442394.0A
Other languages
Chinese (zh)
Other versions
CN114900343A (en)
Inventor
马卓
高佳晨
刘洋
杨易龙
刘心晶
李腾
张俊伟
马建峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lanxiang Zhilian Hangzhou Technology Co ltd
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN202210442394.0A priority Critical patent/CN114900343B/en
Publication of CN114900343A publication Critical patent/CN114900343A/en
Application granted granted Critical
Publication of CN114900343B publication Critical patent/CN114900343B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/04 Processing captured monitoring data, e.g. for logfile generation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/50 Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computing Systems (AREA)
  • Signal Processing (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computer Hardware Design (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for detecting abnormal traffic of Internet of things devices based on clustered federated learning. The implementation steps are: initializing the federated learning system; local participants perform local iterative training of the global neural network model; the central server judges whether the clustering start condition is met; the central server aggregates the neural network model and distributes it; the central server clusters all participants; the central server aggregates the neural network model within each cluster and distributes it; and all local participants perform abnormal traffic detection. With the invention, the central server can divide the Internet of things devices into different clusters according to their data distributions, and aggregating the global model within each cluster achieves model optimization and personalization when data are unevenly distributed, thereby improving the abnormal traffic detection accuracy of Internet of things devices.

Description

Internet of things equipment abnormal flow detection method based on clustered federal learning
Technical Field
The invention relates to the field of Internet of Things (IoT) security, in particular to network abnormal traffic detection, and more specifically to a distributed method for detecting abnormal traffic of Internet of Things devices based on clustered federated learning.
Background
The Internet of Things is a network that extends and expands the Internet: various information-sensing devices are combined with the Internet to form a huge network that interconnects people, machines and objects at any time and in any place. Internet of Things devices are Internet-connected devices attached to objects; they include bar codes, radio frequency identification, sensors, global positioning systems, laser scanners and the like, among which sensor devices are the most basic. Internet of Things devices therefore play an important role in present-day production and daily life and store a large amount of private information. At the same time, the means and scale of attacks against the Internet of Things keep growing and seriously threaten the normal order of society.
In order to respond quickly and accurately to abnormal conditions in a network, maintain normal network communication and improve quality of service, network abnormal traffic detection has attracted wide attention. An abnormal traffic detection system mainly models and detects abnormal behavior by technical means and warns the network administrator when a traffic anomaly is found.
Abnormal traffic detection based on machine learning offers high accuracy and a high degree of automation: through iterative training on massive data, machine learning can complete high-precision anomaly detection automatically. However, in a multi-source, heterogeneous Internet of Things scenario, high-precision machine-learning-based detection requires the support of massive data, and in a large-scale distributed environment, data silos greatly increase the difficulty of model training. Meanwhile, collecting users' massive data for model training also raises concerns about data privacy and security. To solve these problems, the federated learning framework has been applied to abnormal traffic detection in Internet of Things scenarios.
A patent document filed by the Information Engineering University of the PLA Strategic Support Force discloses a distributed Internet of Things intrusion detection method and system based on blockchain and federated learning, aimed at Internet of Things devices (application number 202110797560.4, application publication number CN113794675A). The method uses the distributed training of federated learning to improve the training efficiency and attack detection accuracy of an intrusion detection model, and uses the distributed storage of the blockchain to solve the security problem of centralized storage.
Although the method solves the data-silo problem through federated learning and thus provides massive data for machine learning, in an era in which heterogeneous networks converge and massive terminals routinely access the Internet of Things, the network traffic distributions of different devices diverge because diversified terminals have different security requirements. The method therefore suffers from two technical problems: model training cannot be optimized when data are unevenly distributed, and the globally trained model is not well suited to detecting abnormal traffic in the local network of an Internet of Things device, which leads to low detection accuracy.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an abnormal traffic detection scheme for Internet of Things devices based on multi-task clustered federated learning. It addresses two problems: model training cannot be optimized in scenarios with unevenly distributed data, and the global federated-learning model is not suitable for detecting abnormal traffic in the local network of an Internet of Things device, both of which lead to low abnormal traffic detection accuracy.
In order to achieve the purpose, the technical scheme adopted by the invention comprises the following steps:
(1) Initializing the federal learning system:
the initialization builds a federated learning system comprising a central server and N Internet of things devices M = {M_1, M_2, ..., M_n, ..., M_N} acting as local participants. The communication round number between the central server and the local participants is r, the maximum number of communication rounds is R, and the number of local iterative training rounds of each participant is T. The central server initializes a global neural network model weight parameter θ_0 for anomaly detection and distributes it to all local participants, with r = 1, where N ≥ 2, M_n denotes the n-th local participant, R ≥ 20 and T ≥ 1;
(2) Local iterative training is carried out on the global neural network model by the local participants:
each local participant M_n in the participant set M uses its own local traffic data D_n in the current round r to carry out T rounds of local iterative training on the global neural network model θ_0, and uploads the resulting global neural network model weight parameter update Δθ_n^r to the central server, where the traffic data sets of all participants M are D = {D_1, D_2, ..., D_n, ..., D_N};
(3) The central server judges whether the cluster starting condition is met:
the central server computes, over the set of neural network model weight parameter updates uploaded by all participants in the current round, Δθ_r = {Δθ_1^r, Δθ_2^r, ..., Δθ_N^r}, the maximum value Δθ_max^r and the mean value Δθ_mean^r of Δθ_r (measured by the L2 norm) together with their difference Δθ_dif^r = Δθ_max^r - Δθ_mean^r, and judges whether the difference and the preset threshold Δθ_sub satisfy Δθ_dif^r > Δθ_sub, and whether the communication round number r and the preset threshold r_c satisfy r ≥ r_c; if both are satisfied, step (5) is executed, otherwise step (4) is executed;
(4) The central server aggregates the neural network model and issues:
the central server aggregates the set Δθ_r of local neural network model weight parameters of all participants; the aggregation result is the global neural network model weight parameter θ_r of the current round. The aggregation result θ_r is distributed to every participant, and whether r = R holds is judged; if so, step (7) is executed, otherwise r = r + 1, θ_0 = θ_r, and step (2) is executed;
(5) The central server clusters all participants:
using the set Δθ_r of neural network model weight parameter updates uploaded by the participants, the central server computes the set a_r of cosine similarities between the gradient optimization directions of all participants. According to the cosine similarity set a_r, the central server divides the set M of all N local participants into two clusters C_1 and C_2 according to Equation 1:

(C_1, C_2) = arg min_{C_1 ∪ C_2 = M} ( max_{M_i ∈ C_1, M_j ∈ C_2} a_{i,j} )   (Equation 1)

where max takes the maximum value, min takes the minimum value, C_1 ∪ C_2 = M and C_1 ∩ C_2 = ∅;
(6) The central server aggregates the neural network model in the cluster and issues:
according to the clusters C_1 and C_2, the central server divides the set Δθ_r of neural network model weight parameter updates uploaded by the participants into the update set uploaded by the participants belonging to cluster C_1, Δθ_{r,1} = {Δθ_n | M_n ∈ C_1}, and the update set uploaded by the participants belonging to cluster C_2, Δθ_{r,2} = {Δθ_n | M_n ∈ C_2};
(6a) The central server aggregates the neural network model weight parameter update set Δθ_{r,1}; the aggregation result is the global neural network model weight parameter θ_{r,1} of the current round for the members of cluster C_1. The aggregation result θ_{r,1} is distributed to every participant in cluster C_1, and whether r = R holds is judged; if so, θ_0 = θ_{r,1} and step (7) is executed; otherwise r = r + 1, θ_0 = θ_{r,1}, M = C_1, and step (2) is executed;
(6b) The central server aggregates the neural network model weight parameter update set Δθ_{r,2}; the aggregation result is the global neural network model weight parameter θ_{r,2} of the current round for the members of cluster C_2. The aggregation result θ_{r,2} is distributed to every participant in cluster C_2, and whether r = R holds is judged; if so, θ_0 = θ_{r,2} and step (7) is executed; otherwise r = r + 1, θ_0 = θ_{r,2}, M = C_2, and step (2) is executed;
(7) All local participants perform abnormal traffic detection:
all local participants M use the neural network model weight parameter θ_0 distributed by the central server. The neural network model extracts features from the locally collected network traffic according to the format of the training data set D; after digitization and normalization, the features are passed through the multi-layer network structure to extract high-dimensional features, and classification is finally performed at the fully connected layer. If the classification result is an attack type, the traffic is abnormal; otherwise it is normal traffic.
Compared with the prior art, the invention has the following advantages:
The invention uses the idea of clustering: at the model aggregation stage of the central server, the participants are clustered according to the model updates they upload, and the model is then aggregated within each cluster, which reduces the negative influence between participants whose gradient optimization directions differ and thereby optimizes the model. Meanwhile, because the data distributions inside a cluster are the same or similar, the model aggregated within the cluster fits every participant in that cluster well. The invention thus achieves model optimization and personalization when traffic data are unevenly distributed, thereby improving the abnormal traffic detection accuracy of Internet of things devices.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention;
FIG. 2 is an architecture diagram of clustered federated learning as constructed by the present invention;
FIG. 3 is a flow chart of the present invention center server implementing clustering;
FIG. 4 is a graph comparing the change in accuracy for 80 rounds of Federal learning training.
Detailed Description
The invention is described in further detail below with reference to the figures and the specific embodiments.
Referring to fig. 1, the present invention includes the following steps.
(1) Initializing the federal learning system:
referring to fig. 2, the initialization builds a clustered federated learning system comprising a central server and N Internet of things devices M = {M_1, M_2, ..., M_n, ..., M_N} acting as local participants, where all participants are by default placed in the same cluster. The communication round number between the central server and the local participants is r, the maximum number of communication rounds is R, and the number of local iterative training rounds of each participant is T. The central server initializes a global neural network model weight parameter θ_0 for anomaly detection and distributes it to all local participants, with r = 1, where N ≥ 2, M_n denotes the n-th local participant, R ≥ 20 and T ≥ 1;
because the computing power of Internet of things devices is limited, a convolutional neural network with a simple structure is constructed as the global neural network model. Its structure is, in order: a first convolutional layer, an activation function layer, a max-pooling layer, a second convolutional layer, an activation function layer, a max-pooling layer and a fully connected layer. The first convolutional layer has 1 input channel, 6 output channels, a kernel size of 2, padding of 1 and a stride of 1; the second convolutional layer has 6 input channels, 16 output channels, a kernel size of 2 × 2, padding of 1 and a stride of 1; the activation function layers use the rectified linear unit (ReLU); the max-pooling layers use a 2 × 2 pooling kernel with a stride of 2; the fully connected layer has an input dimension of 64 and an output dimension of 5. In this example, the number of Internet of things devices is N = 6, the number of federated learning communication rounds is R = 80, and the number of local iterative training rounds is T = 5.
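For illustration, a minimal PyTorch sketch of such a convolutional network is given below (the simulation section of this document names PyTorch as the deep learning framework). The class name TrafficCNN is hypothetical, and the 7 × 7 input grid, i.e. a 41-feature KDD Cup 99 record zero-padded to 49 values, is an assumption chosen so that the flattened feature size works out to the stated 64; the layer hyper-parameters follow the figures above.

    import torch
    import torch.nn as nn

    class TrafficCNN(nn.Module):
        """Sketch of the simple CNN described above (hypothetical class name).

        Layers follow the text: conv(1->6, k=2, pad=1, stride=1), ReLU, 2x2 max-pool,
        conv(6->16, k=2, pad=1, stride=1), ReLU, 2x2 max-pool, fully connected layer
        with 5 outputs (the five KDD Cup 99 traffic classes). With an assumed 7x7
        input grid the flattened feature map is 16 x 2 x 2 = 64, matching the text.
        """
        def __init__(self, num_classes: int = 5):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 6, kernel_size=2, padding=1, stride=1),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=2, stride=2),
                nn.Conv2d(6, 16, kernel_size=2, padding=1, stride=1),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=2, stride=2),
            )
            self.classifier = nn.Linear(64, num_classes)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            x = self.features(x)                      # (batch, 16, 2, 2) for a 7x7 input
            return self.classifier(torch.flatten(x, 1))

    # model = TrafficCNN(); model(torch.randn(32, 1, 7, 7)).shape  # -> torch.Size([32, 5])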
(2) Local iterative training is carried out on the global neural network model by the local participants:
each local participant M_n in the participant set M uses its own local traffic data D_n in the current round r to carry out T = 5 rounds of local iterative training on the global neural network model θ_0, and uploads the resulting global neural network model weight parameter update Δθ_n^r to the central server, where the traffic data sets of all participants M are D = {D_1, D_2, ..., D_n, ..., D_N};
The participants' local traffic data are obtained by splitting the KDD Cup 99 data set, whose traffic falls into 5 classes: normal, denial of service, sniffing, remote-to-local and user-to-root. To simulate a scenario with unevenly distributed data, normal traffic is distributed evenly among all 6 participants, denial-of-service traffic is distributed evenly between the 4th and 6th participants, sniffing traffic between the 1st and 3rd participants, and remote-to-local and user-to-root traffic between the 2nd and 5th participants; for every participant the ratio of training set to test set is 8:2.
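A minimal sketch of one participant's local training round follows, assuming the TrafficCNN sketch above and a PyTorch DataLoader over the participant's local split D_n. The cross-entropy loss, the Adam optimizer and the T = 5 local epochs follow the text; the learning rate of 1e-3 and returning the trained weights themselves as the uploaded update are assumptions.

    import copy
    import torch
    import torch.nn as nn

    def local_train(global_state: dict, loader, epochs: int = 5, lr: float = 1e-3) -> dict:
        """One participant's T rounds of local training on its own traffic data D_n.

        global_state is the weight dict theta_0 distributed by the central server;
        the returned state dict is uploaded back as this participant's update.
        (lr = 1e-3 is an assumption; the text does not state the learning rate.)
        """
        model = TrafficCNN()
        model.load_state_dict(copy.deepcopy(global_state))   # start from theta_0
        optimizer = torch.optim.Adam(model.parameters(), lr=lr)
        criterion = nn.CrossEntropyLoss()
        model.train()
        for _ in range(epochs):                               # T local epochs
            for features, labels in loader:                   # batches of local traffic data
                optimizer.zero_grad()
                loss = criterion(model(features), labels)
                loss.backward()
                optimizer.step()
        return model.state_dict()                             # uploaded to the central server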
(3) The central server judges whether a cluster starting condition is met:
the central server computes, over the set of neural network model weight parameter updates uploaded by all participants in the current round, Δθ_r = {Δθ_1^r, Δθ_2^r, ..., Δθ_N^r}, the maximum value Δθ_max^r and the mean value Δθ_mean^r of Δθ_r (measured by the L2 norm) together with their difference Δθ_dif^r = Δθ_max^r - Δθ_mean^r, and judges whether the difference and the preset threshold Δθ_sub = 0.5 satisfy Δθ_dif^r > Δθ_sub, and whether the communication round number r and the preset threshold r_c = 20 satisfy r ≥ r_c; if both are satisfied, step (5) is executed, otherwise step (4) is executed;
The difference Δθ_dif^r is chosen as the judgment condition for starting clustered federated learning because, when the data are independent and identically distributed, a stationary solution of the overall federated learning optimization objective is also a stationary solution on the local data of every client. When data that are not independent and identically distributed are put into federated learning, however, a stationary solution of the global optimization objective is not necessarily a stationary solution on every client's local data. Therefore, when the difference Δθ_dif^r is larger than the threshold, the local traffic data distributions of the Internet of things devices participating in federated learning differ greatly, and clustered federated learning becomes necessary.
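The sketch below shows one way the central server could evaluate this trigger from the uploaded updates, assuming each update has been flattened into a single vector. Taking the mean as the norm of the averaged update, rather than the average of the individual norms, is an assumption, since this passage only names the maximum, the mean and their difference; the thresholds Δθ_sub = 0.5 and r_c = 20 follow the embodiment.

    import torch

    def should_cluster(updates: list, round_idx: int,
                       theta_sub: float = 0.5, r_c: int = 20) -> bool:
        """Clustering-start condition of step (3) (a sketch, not the exact patented formula).

        updates   -- flattened update vectors, one tensor per participant
        round_idx -- current communication round r
        """
        stacked = torch.stack(updates)                 # shape (N, d)
        max_norm = stacked.norm(dim=1).max()           # Delta-theta_max
        mean_norm = stacked.mean(dim=0).norm()         # Delta-theta_mean (assumed form)
        diff = max_norm - mean_norm                    # Delta-theta_dif
        return bool(diff > theta_sub) and round_idx >= r_c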
(4) The central server aggregates the neural network model and issues:
the central server aggregates the set Δθ_r of local neural network model weight parameters of all participants; the aggregation result is the global neural network model weight parameter θ_r of the current round. The aggregation result θ_r is distributed to every participant, and whether r = R = 80 holds is judged; if so, step (7) is executed, otherwise r = r + 1, θ_0 = θ_r, and step (2) is performed.
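A minimal sketch of the server-side aggregation, assuming the participants upload full state dicts as in the local_train sketch above and that the aggregation is a plain unweighted average; the exact weighting scheme is not spelled out in this passage.

    import torch

    def aggregate(updates: list) -> dict:
        """Average the participants' uploaded weight dicts into theta_r (a sketch)."""
        aggregated = {}
        for key in updates[0]:
            aggregated[key] = torch.stack([u[key] for u in updates]).mean(dim=0)
        return aggregated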
(5) The central server clusters all participants:
using the set Δθ_r of neural network model weight parameter updates uploaded by the participants, the central server computes the set a_r of cosine similarities between the gradient optimization directions of all participants. According to the cosine similarity set a_r, the central server divides the set M of all N local participants into two clusters C_1 and C_2 according to Equation 1:

(C_1, C_2) = arg min_{C_1 ∪ C_2 = M} ( max_{M_i ∈ C_1, M_j ∈ C_2} a_{i,j} )   (Equation 1)

where max takes the maximum value, min takes the minimum value, C_1 ∪ C_2 = M and C_1 ∩ C_2 = ∅.
The theoretical basis of Equation 1 is that the cosine similarity between participants in the same cluster is necessarily greater than the cosine similarity between participants in different clusters, i.e. the minimum cosine similarity within a cluster is greater than the maximum cosine similarity across clusters. On this basis, and referring to fig. 3, the central server clusters the participants by iterative bipartition, with the communication round as the unit; in this example, participants 4 and 6 are grouped into one cluster, and participants 1, 2, 3 and 5 into the other.
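A sketch of the pairwise cosine similarities and the two-way split follows. Exhaustively searching all bipartitions for the one that minimizes the maximum cross-cluster similarity is one straightforward reading of Equation 1 and is assumed here; it is feasible for the small N = 6 of this embodiment.

    from itertools import combinations
    import torch

    def bipartition(updates: list):
        """Split participants into two clusters from their flattened updates (a sketch).

        Picks the bipartition (C1, C2) that minimises the maximum cosine similarity
        a_{i,j} between members of different clusters, as in Equation 1.
        """
        n = len(updates)
        unit = torch.stack([u / u.norm() for u in updates])   # unit-length update vectors
        sim = unit @ unit.T                                    # pairwise cosine similarities
        best, best_split = float("inf"), None
        for size in range(1, n // 2 + 1):
            for c1 in combinations(range(n), size):
                c2 = tuple(i for i in range(n) if i not in c1)
                cross = max(sim[i, j].item() for i in c1 for j in c2)
                if cross < best:
                    best, best_split = cross, (set(c1), set(c2))
        return best_split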
(6) The central server aggregates the neural network model in the cluster and issues:
according to the clusters C_1 and C_2, the central server divides the set Δθ_r of neural network model weight parameter updates uploaded by the participants into the update set uploaded by the participants belonging to cluster C_1, Δθ_{r,1} = {Δθ_n | M_n ∈ C_1}, and the update set uploaded by the participants belonging to cluster C_2, Δθ_{r,2} = {Δθ_n | M_n ∈ C_2};
(6a) The central server aggregates the neural network model weight parameter update set Δθ_{r,1}; the aggregation result is the global neural network model weight parameter θ_{r,1} of the current round for the members of cluster C_1. The aggregation result θ_{r,1} is distributed to every participant in cluster C_1, and whether r = R holds is judged; if so, θ_0 = θ_{r,1} and step (7) is executed; otherwise r = r + 1, θ_0 = θ_{r,1}, M = C_1, and step (2) is executed;
(6b) The central server aggregates the neural network model weight parameter update set Δθ_{r,2}; the aggregation result is the global neural network model weight parameter θ_{r,2} of the current round for the members of cluster C_2. The aggregation result θ_{r,2} is distributed to every participant in cluster C_2, and whether r = R holds is judged; if so, θ_0 = θ_{r,2} and step (7) is executed; otherwise r = r + 1, θ_0 = θ_{r,2}, M = C_2, and step (2) is executed;
After the communication round number r exceeds the threshold r_c = 20 of traditional federated learning, the objective learned in common by the 6 participants, namely detection of normal-type traffic, has already been improved through sharing. Referring to fig. 2, at this point, in order to continue knowledge sharing between nodes with similar data while reducing the negative influence between nodes with dissimilar data, the central server aggregates the models within each cluster and sends the aggregation results to the participants of the corresponding cluster.
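Building on the aggregate sketch above, intra-cluster aggregation simply restricts the average to each cluster's members and returns, for every participant, the model of its own cluster; the dictionary-based grouping below is purely illustrative.

    def aggregate_per_cluster(updates: list, clusters: tuple) -> dict:
        """Aggregate separately inside each cluster (a sketch).

        updates  -- state dicts indexed by participant
        clusters -- (C1, C2) as returned by bipartition, sets of participant indices
        Returns a mapping: participant index -> the cluster model (theta_{r,1} or
        theta_{r,2}) that the central server sends back to that participant.
        """
        models = {}
        for members in clusters:
            cluster_model = aggregate([updates[i] for i in sorted(members)])
            for i in members:
                models[i] = cluster_model
        return models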
(7) All local participants perform abnormal traffic detection:
all local participants M use the neural network model weight parameter θ_0 distributed by the central server. Features are extracted from the locally collected network traffic according to the KDD Cup 99 data set format; after digitization, the features are linearly normalized according to

x' = (x - x_min) / (x_max - x_min)

The processed traffic data are then fed into the convolutional neural network, high-dimensional features are extracted through the two convolutional layers, and classification is finally performed at the fully connected layer. If a traffic record is classified into one of the four attack types, denial of service, sniffing, remote-to-local or user-to-root, an anomaly is detected; otherwise the traffic is normal network traffic.
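A sketch of the local detection step follows, assuming a single 41-feature KDD Cup 99 record already mapped to numeric values. The min-max vectors would in practice come from the participant's training split, the class-index assignment is an assumption, and the zero-padding to a 7 × 7 grid matches the assumption made in the TrafficCNN sketch.

    import torch
    import torch.nn.functional as F

    NORMAL_CLASS = 0   # assumed index mapping: 0 = normal, 1-4 = denial of service,
                       # sniffing, remote-to-local, user-to-root

    def detect(model, record: torch.Tensor, x_min: torch.Tensor, x_max: torch.Tensor) -> bool:
        """Return True if a single 41-feature traffic record is classified as abnormal (a sketch)."""
        x = (record - x_min) / (x_max - x_min + 1e-12)    # linear (min-max) normalization
        x = F.pad(x, (0, 49 - x.numel()))                 # zero-pad 41 features to 49 values
        x = x.view(1, 1, 7, 7)                            # reshape to the assumed 7x7 grid
        model.eval()
        with torch.no_grad():
            predicted = model(x).argmax(dim=1).item()
        return predicted != NORMAL_CLASS                  # any attack class means abnormal traffic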
The technical effect of the present invention is further described in conjunction with simulation experiments.
1. Simulation conditions are as follows:
the simulation hardware platform is as follows: the processor Intel (R) Core (TM) i5, the main frequency 3.0GHz, the internal memory 8G and the display card GeForce GT 730.
The software platform of the simulation experiment is: Windows 10 Home operating system, PyCharm 2019, the Python 3.8 development language, the scikit-learn machine learning library, the PyTorch deep learning framework, and the third-party data processing libraries pandas and numpy.
The simulation experiment data, the KDD Cup 99 data set, were collected by the Lincoln Laboratory of the Massachusetts Institute of Technology (MIT) by simulating the operation of, and multiple attacks on, a typical United States Air Force local area network, yielding 9 weeks of TCP dump data. The KDD Cup 99 data set consists of approximately 4,900,000 single connection records, each containing 41 features.
2. Simulation content and result analysis:
the simulation experiment of the invention is to respectively carry out network abnormal flow detection simulation on simulation data KDD Cup 99 by adopting the invention and a distributed Internet of things intrusion detection method and system based on block chain and federal learning.
To evaluate the effect of the simulation experiment, the accuracy of abnormal traffic detection is used as the evaluation criterion for both the invention and the prior art. The accuracy changes over 80 rounds of federated learning training on the split KDD data set are shown in fig. 4:
referring to fig. 4 (a) for the performance of the prior art, it can be clearly seen that the accuracy of the cluster 1 fluctuates to a large extent in the whole federal learning process, which is indicated that negative optimization has occurred in the traditional federal learning framework based on the KDD data set with unevenly distributed data, so that the traditional federal learning framework is not suitable for detecting network abnormal traffic in this scenario. At the same time, we can see that the accuracy of cluster 2 fluctuates around 96% all the time, and further optimization cannot be continued.
Fig. 4(b) shows the training results of the abnormal traffic detection model for Internet of things devices based on clustered federated learning. The accuracy of cluster 1 fluctuates in the first stage of multi-task clustered federated learning, but the overall trend is upward, and after entering the second stage the accuracy stays above 99% without large fluctuations. Meanwhile, the accuracy of cluster 2 stops improving at 96% in the first stage, but after the second stage begins and personalized federated learning is carried out within the cluster, the accuracy rises above 99% and is maintained there.
The abnormal traffic detection accuracies of the Internet of things devices obtained after 80 rounds of federated learning training are summarized in Table 1:
TABLE 1 Comparison of abnormal traffic detection accuracy after 80 rounds of federated learning training

Algorithm | Cluster 1 accuracy | Cluster 2 accuracy
The invention | 99.98% | 99.95%
Prior art | 98.83% | 96.79%
As Table 1 shows, the abnormal traffic detection accuracy of the invention is higher than that of the prior art in both cluster 1 and cluster 2. This demonstrates that, when data are unevenly distributed, the invention can still optimize the model and provide a personalized model for the participants of each cluster, thereby improving the abnormal traffic detection accuracy of Internet of things devices.

Claims (8)

1. An abnormal traffic detection method of Internet of things equipment based on clustered federal learning is characterized in that a federal learning system is initialized; local iterative training is carried out on the global neural network model by a local participant; the central server judges whether a clustering starting condition is met; the central server aggregates the neural network model and issues the neural network model; the central server carries out clustering on all participants; the central server aggregates the neural network model in the cluster and issues the neural network model; all local participants carry out abnormal flow detection; the method specifically comprises the following steps:
(1) Initializing the federal learning system:
the initialization builds a federated learning system comprising a central server and N Internet of things devices M = {M_1, M_2, ..., M_n, ..., M_N} acting as local participants. The communication round number between the central server and the local participants is r, the maximum number of communication rounds is R, and the number of local iterative training rounds of each participant is T. The central server initializes a global neural network model weight parameter θ_0 for anomaly detection and distributes it to all local participants, with r = 1, where N ≥ 2, M_n denotes the n-th local participant, R ≥ 20 and T ≥ 1;
(2) Local iterative training is carried out on the global neural network model by the local participants:
each local participant M_n in the participant set M uses its own local traffic data D_n in the current round r to carry out T rounds of local iterative training on the global neural network model θ_0, and uploads the resulting global neural network model weight parameter update Δθ_n^r to the central server, where the traffic data sets of all participants M are D = {D_1, D_2, ..., D_n, ..., D_N};
(3) The central server judges whether a cluster starting condition is met:
the central server computes, over the set of neural network model weight parameter updates uploaded by all participants in the current round, Δθ_r = {Δθ_1^r, Δθ_2^r, ..., Δθ_N^r}, the maximum value Δθ_max^r and the mean value Δθ_mean^r of Δθ_r (measured by the L2 norm) together with their difference Δθ_dif^r = Δθ_max^r - Δθ_mean^r, and judges whether the difference and the preset threshold Δθ_sub satisfy Δθ_dif^r > Δθ_sub, and whether the communication round number r and the preset threshold r_c satisfy r ≥ r_c; if both are satisfied, step (5) is executed, otherwise step (4) is executed;
(4) The central server aggregates the neural network model and issues:
the central server aggregates the set Δθ_r of local neural network model weight parameters of all participants; the aggregation result is the global neural network model weight parameter θ_r of the current round. The aggregation result θ_r is distributed to every participant, and whether r = R holds is judged; if so, step (7) is executed, otherwise r = r + 1, θ_0 = θ_r, and step (2) is executed;
(5) The central server clusters all participants:
using the set Δθ_r of neural network model weight parameter updates uploaded by the participants, the central server computes the set a_r of cosine similarities between the gradient optimization directions of all participants. According to the cosine similarity set a_r, the central server divides the set M of all N local participants into two clusters C_1 and C_2 according to Equation 1:

(C_1, C_2) = arg min_{C_1 ∪ C_2 = M} ( max_{M_i ∈ C_1, M_j ∈ C_2} a_{i,j} )   (Equation 1)

where max takes the maximum value, min takes the minimum value, C_1 ∪ C_2 = M and C_1 ∩ C_2 = ∅;
(6) The central server aggregates the neural network model in the cluster and issues:
according to the clusters C_1 and C_2, the central server divides the set Δθ_r of neural network model weight parameter updates uploaded by the participants into the update set uploaded by the participants belonging to cluster C_1, Δθ_{r,1} = {Δθ_n | M_n ∈ C_1}, and the update set uploaded by the participants belonging to cluster C_2, Δθ_{r,2} = {Δθ_n | M_n ∈ C_2};
(6a) The central server aggregates the neural network model weight parameter update set Δθ_{r,1}; the aggregation result is the global neural network model weight parameter θ_{r,1} of the current round for the members of cluster C_1. The aggregation result θ_{r,1} is distributed to every participant in cluster C_1, and whether r = R holds is judged; if so, θ_0 = θ_{r,1} and step (7) is executed; otherwise r = r + 1, θ_0 = θ_{r,1}, M = C_1, and step (2) is executed;
(6b) The central server aggregates the neural network model weight parameter update set Δθ_{r,2}; the aggregation result is the global neural network model weight parameter θ_{r,2} of the current round for the members of cluster C_2. The aggregation result θ_{r,2} is distributed to every participant in cluster C_2, and whether r = R holds is judged; if so, θ_0 = θ_{r,2} and step (7) is executed; otherwise r = r + 1, θ_0 = θ_{r,2}, M = C_2, and step (2) is executed;
(7) All local participants perform abnormal traffic detection:
all local participants M use the neural network model weight parameter θ_0 distributed by the central server. The neural network model extracts features from the locally collected network traffic according to the format of the training data set D; after digitization and normalization, the features are passed through the multi-layer network structure to extract high-dimensional features, and classification is finally performed at the fully connected layer. If the classification result is an attack type, the traffic is abnormal; otherwise it is normal traffic.
2. The Internet of things equipment abnormal flow detection method based on clustered federal learning according to claim 1, wherein the global neural network model in step (1) is a convolutional neural network model whose structure comprises, in order: a first convolutional layer, an activation function layer, a max-pooling layer, a second convolutional layer, an activation function layer, a max-pooling layer and a fully connected layer.
3. The method for detecting abnormal traffic of Internet of things equipment based on clustered federal learning as claimed in claim 1, wherein in step (2) each local participant M_n uses its local traffic data D_n to carry out T rounds of local iterative training on the global neural network model θ_0, comprising the following implementation steps:
(2a) let the number of training rounds t = 1, and take the global neural network model θ_0 as the local neural network model m_0;
(2b) each local participant M_n uses its local traffic data D_n in the current round as the input for training the local neural network model m_0, takes the model's predicted values and the actual values of the traffic data as the input of a cross-entropy loss function, and computes the loss value L of the model on the local traffic data D_n in the current round, where:
L = CrossEntropyLoss(m_0; D_n)
and CrossEntropyLoss is the cross-entropy loss function;
(2c) the Adam gradient descent method is adopted: the gradient ∂L/∂m_0, obtained by differentiating the loss value L with respect to the weight parameters, is used to update the weight parameters of the local neural network model m_0, giving the model m_t trained in the current round:
m_t = m_0 - η · ∂L/∂m_0
where ∂ denotes the partial-derivative operation and η denotes the learning rate;
(2d) judge whether t = T holds; if so, the global neural network model weight parameter update Δθ_n^r is obtained; otherwise let t = t + 1, m_0 = m_t, and perform step (2b), where:
Δθ_n^r = m_t
4. The method for detecting abnormal traffic of Internet of things equipment based on clustered federal learning as claimed in claim 1, wherein in step (3) the mean value Δθ_mean^r, the maximum value Δθ_max^r and the difference Δθ_dif^r of the model parameter updates are calculated as follows:
Δθ_mean^r = || (1/|M|) · Σ_{n=1}^{N} Δθ_n^r ||
Δθ_max^r = max_{M_n ∈ M} || Δθ_n^r ||
Δθ_dif^r = Δθ_max^r - Δθ_mean^r
where | · | is the cardinality, Σ represents the summation operation, and || · || is the L2 norm operation.
5. The method for detecting abnormal traffic of equipment of the internet of things based on clustered federal learning as claimed in claim 1, wherein in step (4) the global model parameter θ_r is calculated as follows:
θ_r = (1/|M|) · Σ_{n=1}^{N} Δθ_n^r
6. The method for detecting abnormal traffic of Internet of things equipment based on clustered federal learning according to claim 1, wherein in step (5) the cosine similarity a_{i,j} between participant M_i and participant M_j is calculated as follows:
a_{i,j} = ⟨Δθ_i^r, Δθ_j^r⟩ / ( ||Δθ_i^r|| · ||Δθ_j^r|| )
where ⟨·, ·⟩ is the inner product operation.
7. The method for detecting abnormal traffic of Internet of things equipment based on clustered federal learning as claimed in claim 1, wherein in step (6a) the global neural network model weight parameter θ_{r,1} of the current round for the members of cluster C_1 is calculated as follows:
θ_{r,1} = (1/|C_1|) · Σ_{M_n ∈ C_1} Δθ_n^r
8. The Internet of things equipment abnormal flow detection method based on clustered federal learning according to claim 1, wherein in step (6b) the global neural network model weight parameter θ_{r,2} of the current round for the members of cluster C_2 is calculated as follows:
θ_{r,2} = (1/|C_2|) · Σ_{M_n ∈ C_2} Δθ_n^r
CN202210442394.0A 2022-04-25 2022-04-25 Internet of things equipment abnormal flow detection method based on clustered federal learning Active CN114900343B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210442394.0A CN114900343B (en) 2022-04-25 2022-04-25 Internet of things equipment abnormal flow detection method based on clustered federal learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210442394.0A CN114900343B (en) 2022-04-25 2022-04-25 Internet of things equipment abnormal flow detection method based on clustered federal learning

Publications (2)

Publication Number Publication Date
CN114900343A CN114900343A (en) 2022-08-12
CN114900343B true CN114900343B (en) 2023-01-24

Family

ID=82717750

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210442394.0A Active CN114900343B (en) 2022-04-25 2022-04-25 Internet of things equipment abnormal flow detection method based on clustered federal learning

Country Status (1)

Country Link
CN (1) CN114900343B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112364943A (en) * 2020-12-10 2021-02-12 广西师范大学 Federal prediction method based on federal learning
CN112800461A (en) * 2021-01-28 2021-05-14 深圳供电局有限公司 Network intrusion detection method for electric power metering system based on federal learning framework
CN113139600A (en) * 2021-04-23 2021-07-20 广东安恒电力科技有限公司 Intelligent power grid equipment anomaly detection method and system based on federal learning
CN114266361A (en) * 2021-12-30 2022-04-01 浙江工业大学 Model weight alternation-based federal learning vehicle-mounted and free-mounted defense method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A Survey on Federated Learning: The Journey; Sawsan AbdulRahman et al.; IEEE; 2021-04-01; full text *
Intrusion Detection Method Based on Federated Learning and Convolutional Neural Network; Wang Rong et al.; 《信息网络安全》; 2020-04-10 (No. 04); full text *

Also Published As

Publication number Publication date
CN114900343A (en) 2022-08-12

Similar Documents

Publication Publication Date Title
CN111614690B (en) Abnormal behavior detection method and device
CN110224862B (en) Multi-agent system network intrusion tolerance capability assessment method based on multilayer perceptron
CN108709745A (en) One kind being based on enhanced LPP algorithms and the quick bearing fault recognition method of extreme learning machine
Jiang et al. Electrical-STGCN: An electrical spatio-temporal graph convolutional network for intelligent predictive maintenance
CN105791051A (en) WSN (Wireless Sensor Network) abnormity detection method and system based on artificial immunization and k-means clustering
CN111723367B (en) Method and system for evaluating service scene treatment risk of power monitoring system
CN115858675A (en) Non-independent same-distribution data processing method based on federal learning framework
Ye et al. Deep over-the-air computation
CN112200263B (en) Self-organizing federal clustering method applied to power distribution internet of things
CN105654175A (en) Part supplier multi-target preferable selection method orienting bearing manufacturing enterprises
CN111581445A (en) Graph embedding learning method based on graph elements
CN117495205B (en) Industrial Internet experiment system and method
CN114580087B (en) Method, device and system for predicting federal remaining service life of shipborne equipment
CN112905671A (en) Time series exception handling method and device, electronic equipment and storage medium
Liu et al. Intrusion detection based on parallel intelligent optimization feature extraction and distributed fuzzy clustering in WSNs
CN115859344A (en) Secret sharing-based safe sharing method for data of federal unmanned aerial vehicle group
CN114900343B (en) Internet of things equipment abnormal flow detection method based on clustered federal learning
CN105722129A (en) Wireless sensing network event detection method and system based on FSAX-MARKOV model
CN110765668B (en) Concrete penetration depth test data abnormal point detection method based on deviation index
CN108985563B (en) Electromechanical system service dynamic marking method based on self-organizing feature mapping
CN104239785B (en) Intrusion detection data classification method based on cloud model
CN111461184A (en) XGB multi-dimensional operation and maintenance data anomaly detection method based on multivariate feature matrix
CN116296396A (en) Rolling bearing fault diagnosis method based on mixed attention mechanism residual error network
CN116136897A (en) Information processing method and device
CN115936477A (en) Industrial enterprise benefit comprehensive evaluation method and system based on edge cloud cooperation

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20240617

Address after: 311100, Building 1, No. 1500 Wenyi West Road, Cangqian Street, Yuhang District, Hangzhou City, Zhejiang Province, China, 593

Patentee after: Lanxiang Zhilian (Hangzhou) Technology Co.,Ltd.

Country or region after: China

Address before: 710071 No. 2 Taibai South Road, Shaanxi, Xi'an

Patentee before: XIDIAN University

Country or region before: China