CN116708009A - Network intrusion detection method based on federal learning - Google Patents
Network intrusion detection method based on federal learning
- Publication number
- CN116708009A CN116708009A CN202310883184.XA CN202310883184A CN116708009A CN 116708009 A CN116708009 A CN 116708009A CN 202310883184 A CN202310883184 A CN 202310883184A CN 116708009 A CN116708009 A CN 116708009A
- Authority
- CN
- China
- Prior art keywords
- network
- model
- data
- server
- client
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1425—Traffic logging, e.g. anomaly detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/098—Distributed learning, e.g. federated learning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/40—Network security protocols
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/50—Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Theoretical Computer Science (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Computer Hardware Design (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention discloses a network intrusion detection method based on federal learning, which comprises the following steps: establishing an LSTM-based federal learning intrusion detection framework; each client processes its intrusion data set; the server (the federal learning center platform) sends the federal learning global model to each client (the network of a participating organization); each client performs federal learning training on its local data and sends the trained parameters and loss values to the server; the server performs a weighted average calculation according to each client's data size to generate a new global model. These operations are repeated until the model converges and its performance stabilizes. The trained model is then sent to the clients for real-time intrusion detection. The intrusion detection method based on federal learning not only preserves the privacy of network traffic data but also maintains deep learning accuracy, realizing collaborative model training while the data remains local.
Description
Technical Field
The invention relates to the field of federal learning and network security, in particular to a network intrusion detection method based on federal learning.
Background
With the rapid development of information technology, network security is a problem that must be faced. Frequent network attack events cause significant losses in many respects. Intrusion detection is a common network security defense technique: network traffic is analyzed by effective detection means to identify traffic whose characteristics differ from normal traffic, in particular the attack behavior of various malicious programs. Deep learning was proposed to learn features from large amounts of chaotic, unordered, high-dimensional data, and has the advantage that a learning model built with reasonable training parameters can select optimal features. Much research has applied deep learning to network intrusion detection systems for security defense. Deep learning requires a huge data set for training, and the larger the training set, the better the performance of the model. The richer the network traffic data set, the higher the accuracy of the final intrusion detection model; however, collecting network traffic centrally raises privacy concerns. Existing intrusion detection based on deep learning relies on local network traffic for model training, and different network operators and organizations typically do not share their network traffic to construct a complete intrusion detection data set.
Federal learning is a framework for machine learning on data under the premise of privacy protection and data security. The concept was first proposed by Google; its main idea is to build a machine learning model from data sets distributed over multiple devices while preventing data leakage. The traditional machine learning approach gathers the data into one data set for model training. The basic structure of federal learning consists of a server and a plurality of clients. The server does not collect data but only the parameters of the model; it coordinates the clients participating in training, and each client holds its own training data. Each client keeps its training data locally, trains a local model on that data, and then sends its parameters (encrypted or not) to the server; the server averages or weight-averages the collected parameters and broadcasts the result to each client for the next training round.
Federal learning can therefore enable various organizations to train a deep learning model for intrusion detection without sharing network traffic data. The institutions participating in federal learning may be any institutions with substantial network traffic, such as network operators and medical, educational and financial institutions. Operator networks carry the largest traffic volumes and face users directly, so intrusion detection and other network security measures are particularly important for them. Because network traffic cannot be shared between operators, deep-learning-based intrusion detection models are otherwise trained only on each operator's own traffic. Federal learning allows the operators to cooperatively train a deep learning model for intrusion detection without sharing network traffic, thereby enlarging the training data set and improving the performance of the intrusion detection model. Data from financial and medical institutions is also important and these institutions are attacked frequently; their internal network traffic is highly sensitive, and their data volume is smaller than that of an operator network, so they are even more in need of federal learning to cooperatively train a model with other institutions in the same field. These institutions act as the clients of federal learning, while the server is deployed in the cloud. Once federal learning is fully deployed, the clients and the server are linked automatically. In addition, a client can withdraw from learning at any time; federal learning does not require all clients to participate in every round simultaneously.
Disclosure of Invention
There are many network intrusion detection methods based on deep learning, and their high detection accuracy demonstrates the feasibility of applying deep learning to network intrusion detection. However, most methods still suffer from problems such as insufficient data sets. Common centralized deep learning methods require the participating organizations to gather network traffic centrally, which raises privacy concerns, while approaches such as changing the model structure or generating new data sets are complicated and difficult to apply to real network scenarios. Federal learning can solve the problem of insufficient data sets while protecting the privacy of network traffic. The invention aims to solve these technical problems and provides a network intrusion detection method based on federal learning.
The technical scheme adopted by the invention is as follows:
a network intrusion detection method based on federal learning comprises the following steps:
S1, establishing an LSTM federal learning intrusion detection framework, determining the federal learning center server, and taking each network institution that joins federal learning as a client;
S2, each client that has joined federal learning performs data processing and attack type labeling on its local network traffic so that it meets the input requirements of the LSTM model; the input of the LSTM model is network traffic data and the output is the network intrusion attack type;
S3, the federal learning center server distributes the latest LSTM model stored on the server to each client; each client locally performs one round of training on the distributed LSTM model using its own labeled network traffic data, and then sends the trained model parameters and loss function value back to the server;
S4, the server, according to the data amount of each client, performs a weighted average of the model parameters and loss function values sent back by the clients, and updates the global model parameters and the global loss function value of the LSTM model stored on the server;
S5, repeating the iterative updating of the LSTM model stored on the server according to S3 and S4 until the weighted-average global loss function value on the server converges, whereupon the server stores the latest LSTM model as the network intrusion detection model;
and S6, the server distributes the network intrusion detection model to each client; based on the most recently distributed network intrusion detection model, each client inputs its local real-time network traffic data into the model to obtain the classification result of the network intrusion attack type, thereby realizing local network intrusion detection.
Preferably, the LSTM federal learning intrusion detection framework includes a server serving as the central platform and a plurality of clients participating in federal learning; the network institutions serving as clients store network intrusion data locally; each client receives the same LSTM model from the server, trains it locally with its locally stored data, and after training uploads the trained model parameters and loss function value to the server to cooperatively update the LSTM model stored on the server, so that the network traffic data remains local to each client while cooperative training over multiple data sets is realized.
Preferably, the network institutions acting as federal learning clients may be any institutions with network traffic data; the institution types include network operators, campus networks, financial institution networks and medical institution networks, whose network traffic is sensitive data that cannot be shared; institutions participating in the same federal learning task must belong to the same type of institution.
Preferably, the output classification of the LSTM model includes normal traffic data and abnormal intrusion data, and the network intrusion attack types further subdivided within the abnormal intrusion data include brute-force FTP, brute-force SSH, DoS, Heartbleed, network attacks, penetration, botnet and DDoS.
Preferably, in S2, the network traffic data locally stored by each client consists of the attribute fields of the data packets, and the specific steps of data processing and attack type labeling are as follows:
S21, screening the network traffic data locally stored by the client to remove incomplete or invalid data;
S22, balancing the network traffic data processed in S21 by reducing the proportion of normal traffic data in the data set;
S23, converting non-numerical data in the network traffic data processed in S22 into numerical data by means of encoding;
S24, normalizing each attribute field value in the network traffic data processed in S23;
S25, converting the network intrusion attack type label in the network traffic data processed in S24 into a one-hot code, wherein the dimension of the one-hot vector is the total number of network intrusion attack types plus one, its elements corresponding respectively to normal traffic data and to each of the network intrusion attack types.
Preferably, in S24, the normalization is mean variance normalization.
Preferably, in S3, after receiving the LSTM model distributed by the server, the client collects the locally stored network traffic data of different moments and processes it until it meets the input requirements of the model, each piece of network traffic data and its label forming a training sample; each training sample is then input into the LSTM model, the cross-entropy loss between the predicted label and the ground-truth label of each training sample is calculated, the mean cross-entropy loss over all training samples is taken as the loss function value, and the model parameters of the LSTM model are optimized by a gradient descent algorithm according to this loss function value; finally, the loss function value of the current round and the optimized model parameters are uploaded to the server for model aggregation and updating on the server side.
Preferably, in S4, the server performs the aggregation calculation after receiving the loss function values and model parameters uploaded by all the clients, and updates the global model parameters and the global loss function value of the LSTM model stored on the server; wherein the global loss function value is updated as:

$$L(w)=\sum_{k=1}^{K}\frac{n_k}{n}\,l_k(w)$$

wherein: $L(w)$ represents the updated global loss function value on the server, $n_k$ represents the amount of network traffic data stored locally by the k-th client, $n$ represents the total amount of network traffic data of all clients, $l_k(w)$ represents the loss function value uploaded by the k-th client, and $K$ is the total number of clients participating in federal learning;

the update formula of the global model parameters is:

$$w_{t+1}=w_t-\eta\sum_{k=1}^{K}\frac{n_k}{n}\,g_k$$

wherein: $g_k$ represents the gradient used by the k-th client when updating its local model parameters by the gradient descent algorithm; $\eta$ is the learning rate; $w_{t+1}$ and $w_t$ represent the global model parameters after and before the update, respectively.
Preferably, after the model training is completed through convergence of the global loss function value on the server, the final LSTM model parameters are sent to each client; each client updates its local LSTM model with the received model parameters and performs intrusion detection: real-time network traffic data is input into the model, and whether abnormal network intrusion traffic occurs is judged from the output of the model.
Preferably, when a client finds that the accuracy of the network intrusion detection model is lower than a specified threshold or discovers a new type of network intrusion, it sends a message to the server, and the server initiates a new round of network intrusion federal learning according to S2-S6, thereby updating the network intrusion detection model.
Compared with the prior art, the invention has the following beneficial effects:
the invention provides an intrusion detection method based on federal learning, which ensures detection accuracy while protecting network traffic privacy. Based on the federal learning framework, the invention combines the characteristics of network traffic to provide algorithms such as federal learning framework and related data processing, model fusion and the like for intrusion detection. The invention selects the LSTM model as the model of intrusion detection in federal learning, and accords with the time sequence characteristics of network traffic. The intrusion detection method based on federal learning provided by the invention proves the feasibility of carrying out the deep learning of the shared data set on the premise of protecting the privacy of network traffic.
Drawings
FIG. 1 is a diagram of an intrusion detection framework based on federal learning in accordance with the present invention;
FIG. 2 is a flow chart of network traffic data processing based on federal learning clients in accordance with the present invention;
FIG. 3 is a flow chart of intrusion detection based on federal learning in accordance with the present invention.
Detailed Description
In order that the above objects, features and advantages of the invention will be readily understood, a more particular description of the invention will be rendered by reference to the appended drawings. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. The present invention may be embodied in many other forms than described herein and similarly modified by those skilled in the art without departing from the spirit of the invention, whereby the invention is not limited to the specific embodiments disclosed below. The technical features of the embodiments of the invention can be combined correspondingly on the premise of no mutual conflict.
The invention combines the basic characteristics of network attack traffic with the basic method of federal learning: the network traffic data is preprocessed and then used to train the intrusion detection model, thereby ensuring the accuracy of network attack detection while protecting the privacy of the network traffic.
The core of the invention is a framework comprising a platform serving as the server center and a plurality of clients participating in learning. Network intrusion data is stored locally by the institutions acting as clients. Model_global denotes the global model that each client receives from the server and trains locally, and Model_k denotes the model trained by client k and uploaded to the server. In this way the sensitive network traffic data remains local to each client while cooperative training over multiple data sets is realized. Based on this framework, the invention provides an intrusion detection preprocessing method suitable for federal learning, which mainly comprises arranging the network traffic data, balancing the data set, encoding the data, normalizing the data, and encoding the label data. The local learning process of a client is to train once on its local data after receiving the server model and then upload the trained model and loss value to the server. After receiving the models and loss values of all clients, the server performs a weighted-average calculation according to the data volumes; it first evaluates the global loss value and, if the convergence requirement is not met, sends the computed global model back to each client. Once training is complete, the server transmits the trained model to each client, and the clients perform intrusion detection using this global model.
A detailed description of a specific implementation of a federal learning-based network intrusion detection method is provided in a preferred embodiment of the present invention. The detection method comprises the following specific steps:
S1, establishing the LSTM federal learning intrusion detection framework, determining the federal learning center server, and taking each network institution that joins federal learning as a client.
As shown in fig. 1, the LSTM federal learning intrusion detection framework consists of a central server and several clients (which may be operators or other specific institutions) that own network traffic. The server and the clients can be provided with the LSTM model in advance, and the parameters of the LSTM model are transmitted to realize model updating. Each client institution automatically collects local network traffic and iteratively trains the LSTM model after local data preprocessing. After one iteration, each client's model parameters Model_k(parameter_k) and loss value l_k are uploaded to the server deployed at an authoritative location. The server performs a weighted-average calculation over all uploaded models and loss values and then sends the updated global model Model_global to each client, and each client performs a new round of training on this model. Under this framework, the data of the institutions participating in federal learning is stored locally while the models or parameters are shared: all collected network traffic data stays local, various preprocessing can be applied to the training data, and control of the data remains with each institution, which effectively protects the privacy of the network traffic. In addition, each institution can choose to exit federal learning at any time. There is no limit on the number of clients participating in federal learning, but too large a number leads to slow convergence: one reason is the growth of the data set, another is the increase in communication cost, since the server must wait for some clients to complete local training.
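By way of illustration only, the exchange described above may be organized as in the following sketch, in which the server holds Model_global and each client returns its locally trained Model_k together with its loss value l_k and data volume n_k; the class and function names are illustrative assumptions, and the parameters are represented as numpy arrays rather than an actual LSTM.

```python
from dataclasses import dataclass
from typing import Dict, List
import numpy as np

Params = Dict[str, np.ndarray]

@dataclass
class ClientUpdate:
    params: Params        # Model_k: locally trained parameters
    loss: float           # l_k(w): local loss value
    num_samples: int      # n_k: size of the local traffic data set

@dataclass
class FederatedServer:
    global_params: Params  # Model_global kept on the central platform

    def broadcast(self) -> Params:
        # send a copy of the latest global model to every participating client
        return {name: w.copy() for name, w in self.global_params.items()}

    def aggregate(self, updates: List[ClientUpdate]) -> float:
        # weighted average of parameters and losses by data volume n_k / n
        n = sum(u.num_samples for u in updates)
        for name in self.global_params:
            self.global_params[name] = sum(
                (u.num_samples / n) * u.params[name] for u in updates
            )
        return sum((u.num_samples / n) * u.loss for u in updates)
```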
It should be noted that "network institution", as used for the federal learning client of the present invention, is a generic term. The network institutions participating in federal learning may be any institutions with network traffic data; for example, the institution types may be network operators, campus networks, financial institution networks or medical institution networks, and the network traffic of all these institutions is sensitive data that cannot be shared. The institutions participating in the same intrusion detection federal learning task must belong to the same type of institution, e.g. all network operators or all campus networks. When different types of institutions exist, corresponding federal learning intrusion detection frameworks must be built and trained separately.
S2, each client added with federal learning performs data processing and attack type labeling on local network traffic, so that the local network traffic meets the input requirement of an LSTM model; the LSTM model is input as network flow data and output as network intrusion attack type.
The classification of network traffic data separates the data into normal traffic data and abnormal intrusion data. The abnormal data includes brute-force FTP, brute-force SSH, DoS, Heartbleed, network attacks, penetration, botnet, DDoS, etc., and the network traffic data consists of the fields of the data packets. Thus, in the embodiment of the invention, the output classification of the corresponding LSTM model includes normal traffic data and abnormal intrusion data, and the network intrusion attack types further subdivided within the abnormal intrusion data include brute-force FTP, brute-force SSH, DoS, Heartbleed, network attacks, penetration, botnet and DDoS. Of course, the network intrusion attack types can be further extended, and the invention is not limited in this respect.
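By way of example, the label space and the one-hot encoding later described in step S25 may be realized as in the following sketch; the ordering of the categories is an assumption, since the invention only enumerates the categories without fixing their indices.

```python
import numpy as np

# Assumed category ordering; dimension = number of attack types + 1 for normal traffic.
CLASSES = ["Normal", "Brute-force FTP", "Brute-force SSH", "DoS", "Heartbleed",
           "Network attack", "Penetration", "Botnet", "DDoS"]

def one_hot(label: str) -> np.ndarray:
    # build the one-hot vector for a given traffic label
    vec = np.zeros(len(CLASSES), dtype=np.float32)
    vec[CLASSES.index(label)] = 1.0
    return vec

print(one_hot("DDoS"))   # [0. 0. 0. 0. 0. 0. 0. 0. 1.]
```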
The specific processing of the network traffic data stored locally by each client needs to be designed according to the input requirements of the model. As shown in fig. 2, in the embodiment of the invention the network traffic data locally stored by each client consists of the attribute fields of the data packets, so the specific steps of the data processing and attack type labeling in S2 are as follows:
S21, the network traffic data locally stored by the client is screened to remove incomplete or invalid data.
S22, the network traffic data processed in S21 is balanced by reducing the proportion of normal traffic data in the data set, so as to balance the sample sizes of all categories.
S23, non-numerical data in the network traffic data processed in S22 is converted into numerical data by means of encoding.
S24, each attribute field value in the network traffic data processed in S23 is normalized.
In the embodiment of the invention, the normalization adopted here is mean-variance normalization, computed as formula (1):

$$X' = \frac{X - X_{mean}}{X_{std}} \qquad (1)$$

wherein: $X_{mean}$ represents the mean of each column and $X_{std}$ its standard deviation. The formula subtracts from the data the mean of its attribute (its column) and divides by the standard deviation; in the resulting new data, the values of each attribute (each column) are clustered around 0 with a variance of 1. The calculation is performed for each attribute (each column) separately, avoiding interference between attributes.
S25, converting the network intrusion attack type label in the network flow data processed in the S24 into a single thermal code, wherein the dimension of the single thermal code vector is one added to the total number of the network intrusion attack types (1 added corresponds to the type of the normal flow data, namely, no network intrusion attack exists), and the single thermal code vector corresponds to the normal flow data and all the network intrusion attack types respectively. And when the model accuracy is calculated, comparing the maximum value index of the vector elements output by the model with the single thermal code. As each client performs intrusion detection by training the completed model, network traffic is also classified by observing the single thermal codes output by the model.
S3, the Federal learning center server distributes the latest LSTM model stored on the server to each client, each client performs a round of training on the LSTM model distributed on the server locally by using the network flow data marked by each client, and then sends the trained model parameters and loss function values back to the server.
The LSTM model selected by the invention is a common deep learning model whose time-sequence modelling and long short-term memory match the characteristics of network traffic; its basic (forget-gate) formula is

$$f_t=\sigma(W_f\cdot[h_{t-1},x_t]+b_t)\qquad(2)$$

The model is passed between the server and the clients during federal learning on network intrusion data and is trained jointly.
In the embodiment of the invention, the client trains the LSTM model in a manner similar to conventional training: after receiving the LSTM model distributed by the server, the client collects the locally stored network traffic data of different moments and processes it until it meets the input requirements of the model, each piece of network traffic data and its label forming a training sample. Each training sample is then input into the LSTM model, the cross-entropy loss between the predicted label and the ground-truth label of each training sample is computed, the mean cross-entropy loss over all training samples is taken as the loss function value, and the model parameters of the LSTM model are optimized by a gradient descent algorithm according to this loss function value. Finally, the loss function value of the current round and the optimized model parameters are uploaded to the server for model aggregation and updating on the server side.
Specifically, in the training process after a client receives the model sent by the server, the input data of each institution's client is $x_t$, which represents a network traffic packet at a certain time t, as shown in formula (3); each of its elements represents a traffic feature of one packet field, with m features in total:

$$x_t=\{X_i\mid i=1\dots m\}\qquad(3)$$

The input label of each client is $y_t$, which represents the ground-truth intrusion category corresponding to the network traffic, including normal traffic data, as shown in formula (4); each of its elements represents one type of network intrusion, with j types in total:

$$y_t=\{Y_i\mid i=1\dots j\}\qquad(4)$$

The number of layers of the model can be chosen freely; the dimension of the hidden state is generally a multiple of 2, and the size of the input sample is consistent with the dimension of the hidden layer.

Substituting $x_t$ into formula (2) of the LSTM model and letting loss denote the value of the loss function, the loss is computed as formula (5):

$$loss=l(y_t,f_t)\qquad(5)$$

where the loss function l is the cross-entropy function.

The k-th client obtains $n_k$ pieces of network intrusion data over a period of time and computes the loss once for each piece of input data; after all data have been input, the loss function of this client is given by formula (6), where w denotes the parameters of the model function f:

$$l_k(w)=\frac{1}{n_k}\sum_{i=1}^{n_k} l\big(y_i,f(x_i;w)\big)\qquad(6)$$

The local model parameters of the client are then updated iteratively by a gradient descent algorithm that minimizes this loss, with the gradient computed as formula (7):

$$g_k=\nabla_w\, l_k(w)\qquad(7)$$

Finally, the local loss function value and the model parameters are sent to the server.
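By way of illustration, one local training round roughly following formulas (2)-(7) may be sketched as follows using PyTorch; the network depth, hidden size, learning rate and function names are illustrative choices and are not fixed by the invention.

```python
import torch
import torch.nn as nn

class TrafficLSTM(nn.Module):
    def __init__(self, n_features: int, n_classes: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                     # x: (batch, time, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])       # logits over the attack types

def local_round(model: TrafficLSTM, x: torch.Tensor, y: torch.Tensor, lr: float = 0.01):
    """Train once over the n_k local samples; return parameters and l_k(w)."""
    criterion = nn.CrossEntropyLoss()         # cross-entropy loss of formula (5)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    optimizer.zero_grad()
    loss = criterion(model(x), y.argmax(dim=1))   # mean loss over the n_k samples, formula (6)
    loss.backward()                           # gradient g_k of formula (7)
    optimizer.step()                          # local gradient-descent update
    return {k: v.detach().clone() for k, v in model.state_dict().items()}, loss.item()
```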
S4, the server, according to the data amount of each client, performs a weighted average of the model parameters and loss function values sent back by the clients, and updates the global model parameters and the global loss function value of the LSTM model stored on the server.
In the embodiment of the invention, the server performs the aggregation calculation after receiving the loss function values and model parameters uploaded by all the clients, and updates the global model parameters and the global loss function value of the LSTM model stored on the server. The calculation weights each client by its data volume, realizing a weighted-average computation.
The global loss function value is updated as shown in formula (8):

$$L(w)=\sum_{k=1}^{K}\frac{n_k}{n}\,l_k(w)\qquad(8)$$

wherein: $L(w)$ represents the updated global loss function value on the server, $n_k$ represents the amount of network traffic data stored locally by the k-th client, $n$ represents the total amount of network traffic data of all clients, $l_k(w)$ represents the loss function value uploaded by the k-th client, and $K$ is the total number of clients participating in federal learning.

The update formula of the global model parameters is shown in formula (9):

$$w_{t+1}=w_t-\eta\sum_{k=1}^{K}\frac{n_k}{n}\,g_k\qquad(9)$$

wherein: $g_k$ represents the gradient used by the k-th client when updating its local model parameters by the gradient descent algorithm; $\eta$ is the learning rate; $w_{t+1}$ and $w_t$ represent the global model parameters after and before the update, respectively.
From the above formulas it can also be seen that the aggregation of the loss function values and model parameters uses the local network traffic data volume of each client as the weighting proportion.
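By way of illustration, the aggregation of formulas (8) and (9) may be sketched as follows, with parameters and gradients represented as dictionaries of numpy arrays and illustrative function names.

```python
from typing import Dict, List, Tuple
import numpy as np

Params = Dict[str, np.ndarray]

def aggregate(w_t: Params,
              client_grads: List[Params],      # g_k from each client
              client_losses: List[float],      # l_k(w) from each client
              client_sizes: List[int],         # n_k for each client
              lr: float = 0.01) -> Tuple[Params, float]:
    n = float(sum(client_sizes))
    # formula (8): L(w) = sum_k (n_k / n) * l_k(w)
    global_loss = sum(nk / n * lk for nk, lk in zip(client_sizes, client_losses))
    # formula (9): w_{t+1} = w_t - eta * sum_k (n_k / n) * g_k
    w_next = {
        name: w_t[name] - lr * sum(nk / n * g[name]
                                   for nk, g in zip(client_sizes, client_grads))
        for name in w_t
    }
    return w_next, global_loss
```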
S5, the iterative updating of the LSTM model stored on the server is repeated according to S3 and S4 until the weighted-average global loss function value on the server converges, whereupon the server stores the latest LSTM model as the network intrusion detection model.
S6, the server distributes the network intrusion detection model to each client; based on the most recently distributed network intrusion detection model, each client inputs its local real-time network traffic data into the model to obtain the classification result of the network intrusion attack type, thereby realizing local network intrusion detection.
In the specific implementation, after model training is completed through convergence of the global loss function value on the server, the final LSTM model parameters are sent to all clients; each client updates its local LSTM model with the received model parameters and performs intrusion detection: real-time network traffic data is input into the model, and whether abnormal network intrusion traffic occurs is judged from the output of the model.
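By way of illustration, step S6 on a client may be sketched as follows; the sketch assumes the TrafficLSTM model and CLASSES list sketched earlier and treats any prediction other than normal traffic as an intrusion.

```python
import torch

@torch.no_grad()
def detect(model, flow_window: torch.Tensor, classes) -> str:
    """flow_window: tensor of shape (1, time_steps, n_features) built from live traffic."""
    model.eval()
    idx = model(flow_window).argmax(dim=1).item()   # index of the predicted one-hot class
    label = classes[idx]
    if label != "Normal":
        print(f"intrusion detected: {label}")       # hook for a local alarm / defense action
    return label
```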
S7, when a client finds that the accuracy of the network intrusion detection model falls below a specified threshold or discovers a new type of network intrusion, it sends a message to the server, and the server initiates a new round of network intrusion federal learning according to S2-S6, thereby updating the network intrusion detection model.
As shown in fig. 3, the flow of intrusion detection based on federal learning is as follows. A server serving as the federal learning center platform is first established. Each network institution can choose whether to join federal learning and may also exit midway; the joined clients and the server form the federal learning framework. Each client first processes its local network traffic data. The server transmits the initial global model to each client, each client trains the model with its processed local data, and the model and loss value are transmitted back to the server. The server performs a weighted-average calculation over the data volumes of all clients, generates a new model, and sends it to each client again; training is repeated in this way until the global model converges. The server then transmits the trained global model to each client, and each client can use this model for intrusion detection: real-time network traffic data is input into the model, and whether abnormal network intrusion traffic occurs is judged from the model output. When a client finds that the accuracy of the model has dropped or discovers a new type of network intrusion, it sends a message to the server, and the server initiates a new round of network intrusion federal learning to update the model.
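By way of illustration, the overall loop of fig. 3 may be sketched as follows; each entry of `clients` stands for the local training step sketched earlier, and the convergence tolerance and round limit are illustrative assumptions.

```python
from typing import Callable, Dict, List, Tuple
import numpy as np

Params = Dict[str, np.ndarray]

def federated_training(global_params: Params,
                       clients: List[Callable[[Params], Tuple[Params, float, int]]],
                       max_rounds: int = 100,
                       tol: float = 1e-3) -> Params:
    prev_loss = float("inf")
    for rnd in range(max_rounds):
        # each client trains locally and returns (params, l_k, n_k)
        updates = [client(global_params) for client in clients]
        n = sum(nk for _, _, nk in updates)
        global_loss = sum(nk / n * lk for _, lk, nk in updates)       # formula (8)
        global_params = {
            name: sum(nk / n * p[name] for p, _, nk in updates)       # weighted average
            for name in global_params
        }
        print(f"round {rnd}: global loss {global_loss:.4f}")
        if abs(prev_loss - global_loss) < tol:                        # convergence of L(w)
            break
        prev_loss = global_loss
    return global_params        # distributed to clients as the intrusion detection model
```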
The above embodiment is only a preferred embodiment of the present invention, but it is not intended to limit the present invention. Various changes and modifications may be made by one of ordinary skill in the pertinent art without departing from the spirit and scope of the present invention. Therefore, all the technical schemes obtained by adopting the equivalent substitution or equivalent transformation are within the protection scope of the invention.
Claims (10)
1. A network intrusion detection method based on federal learning is characterized by comprising the following steps:
S1, establishing an LSTM federal learning intrusion detection framework, determining the federal learning center server, and taking each network institution that joins federal learning as a client;
S2, each client that has joined federal learning performs data processing and attack type labeling on its local network traffic so that it meets the input requirements of the LSTM model; the input of the LSTM model is network traffic data and the output is the network intrusion attack type;
S3, the federal learning center server distributes the latest LSTM model stored on the server to each client; each client locally performs one round of training on the distributed LSTM model using its own labeled network traffic data, and then sends the trained model parameters and loss function value back to the server;
S4, the server, according to the data amount of each client, performs a weighted average of the model parameters and loss function values sent back by the clients, and updates the global model parameters and the global loss function value of the LSTM model stored on the server;
S5, repeating the iterative updating of the LSTM model stored on the server according to S3 and S4 until the weighted-average global loss function value on the server converges, whereupon the server stores the latest LSTM model as the network intrusion detection model;
and S6, the server distributes the network intrusion detection model to each client; based on the most recently distributed network intrusion detection model, each client inputs its local real-time network traffic data into the model to obtain the classification result of the network intrusion attack type, thereby realizing local network intrusion detection.
2. The federal learning-based network intrusion detection method according to claim 1, wherein the LSTM federal learning intrusion detection framework includes a server serving as the central platform and a plurality of clients participating in federal learning; the network institutions serving as clients store network intrusion data locally; each client receives the same LSTM model from the server, trains it locally with its locally stored data, and after training uploads the trained model parameters and loss function value to the server to cooperatively update the LSTM model stored on the server, so that the network traffic data remains local to each client while cooperative training over multiple data sets is realized.
3. The federal learning-based network intrusion detection method according to claim 1, wherein the network institution acting as a federal learning client is any institution having network traffic data; the institution types include network operators, campus networks, financial institution networks and medical institution networks, whose network traffic is sensitive data that cannot be shared; institutions participating in the same federal learning task must belong to the same type of institution.
4. The federal learning-based network intrusion detection method according to claim 1, wherein the output classification of the LSTM model includes normal traffic data and abnormal intrusion data, and the network intrusion attack types further subdivided within the abnormal intrusion data include brute-force FTP, brute-force SSH, DoS, Heartbleed, network attacks, penetration, botnet and DDoS.
5. The network intrusion detection method based on federal learning according to claim 1, wherein in S2 the network traffic data locally stored by each client consists of the attribute fields of the data packets, and the specific steps of data processing and attack type labeling are as follows:
S21, screening the network traffic data locally stored by the client to remove incomplete or invalid data;
S22, balancing the network traffic data processed in S21 by reducing the proportion of normal traffic data in the data set;
S23, converting non-numerical data in the network traffic data processed in S22 into numerical data by means of encoding;
S24, normalizing each attribute field value in the network traffic data processed in S23;
S25, converting the network intrusion attack type label in the network traffic data processed in S24 into a one-hot code, wherein the dimension of the one-hot vector is the total number of network intrusion attack types plus one, its elements corresponding respectively to normal traffic data and to each of the network intrusion attack types.
6. The federal learning-based network intrusion detection method according to claim 5, wherein in S24, the normalization is mean variance normalization.
7. The network intrusion detection method based on federal learning according to claim 1, wherein in S3, after receiving the LSTM model distributed by the server, the client collects the locally stored network traffic data of different moments and processes it until it meets the input requirements of the model, each piece of network traffic data and its label forming a training sample; each training sample is then input into the LSTM model, the cross-entropy loss between the predicted label and the ground-truth label of each training sample is calculated, the mean cross-entropy loss over all training samples is taken as the loss function value, and the model parameters of the LSTM model are optimized by a gradient descent algorithm according to this loss function value; finally, the loss function value of the current round and the optimized model parameters are uploaded to the server for model aggregation and updating on the server side.
8. The network intrusion detection method based on federal learning according to claim 1, wherein in S4 the server performs the aggregation calculation after receiving the loss function values and model parameters uploaded by all the clients, and updates the global model parameters and the global loss function value of the LSTM model stored on the server; wherein the global loss function value is updated as:

$$L(w)=\sum_{k=1}^{K}\frac{n_k}{n}\,l_k(w)$$

wherein: $L(w)$ represents the updated global loss function value on the server, $n_k$ represents the amount of network traffic data stored locally by the k-th client, $n$ represents the total amount of network traffic data of all clients, $l_k(w)$ represents the loss function value uploaded by the k-th client, and $K$ is the total number of clients participating in federal learning;

the update formula of the global model parameters is:

$$w_{t+1}=w_t-\eta\sum_{k=1}^{K}\frac{n_k}{n}\,g_k$$

wherein: $g_k$ represents the gradient used by the k-th client when updating its local model parameters by the gradient descent algorithm; $\eta$ is the learning rate; $w_{t+1}$ and $w_t$ represent the global model parameters after and before the update, respectively.
9. The network intrusion detection method based on federal learning according to claim 1, wherein after the model training is completed through convergence of the global loss function value on the server, the final LSTM model parameters are transmitted to each client; each client updates its local LSTM model with the received model parameters and performs intrusion detection, inputting real-time network traffic data into the model and judging from the output of the model whether abnormal network intrusion traffic occurs.
10. The network intrusion detection method according to claim 1, wherein when a client finds that the accuracy of the network intrusion detection model is lower than a specified threshold or discovers a new type of network intrusion, a message is sent to the server, and the server initiates a new round of network intrusion federal learning according to S2-S6, thereby updating the network intrusion detection model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310883184.XA CN116708009A (en) | 2023-07-18 | 2023-07-18 | Network intrusion detection method based on federal learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310883184.XA CN116708009A (en) | 2023-07-18 | 2023-07-18 | Network intrusion detection method based on federal learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116708009A true CN116708009A (en) | 2023-09-05 |
Family
ID=87835903
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310883184.XA Pending CN116708009A (en) | 2023-07-18 | 2023-07-18 | Network intrusion detection method based on federal learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116708009A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117499129A (en) * | 2023-11-15 | 2024-02-02 | 南方电网数字电网集团信息通信科技有限公司 | Rule synchronization method, device and storage medium applied to intrusion detection system |
CN117499129B (en) * | 2023-11-15 | 2024-05-03 | 南方电网数字电网集团信息通信科技有限公司 | Rule synchronization method, device and storage medium applied to intrusion detection system |
CN118018426A (en) * | 2024-01-24 | 2024-05-10 | 中科链安(南京)科技有限公司 | Training method, detecting method and device for network anomaly intrusion detection model |
CN117811846A (en) * | 2024-02-29 | 2024-04-02 | 浪潮电子信息产业股份有限公司 | Network security detection method, system, equipment and medium based on distributed system |
CN117811846B (en) * | 2024-02-29 | 2024-05-28 | 浪潮电子信息产业股份有限公司 | Network security detection method, system, equipment and medium based on distributed system |
CN118157999A (en) * | 2024-05-11 | 2024-06-07 | 中移(苏州)软件技术有限公司 | Model training method, device, terminal and storage medium |
CN118157999B (en) * | 2024-05-11 | 2024-08-16 | 中移(苏州)软件技术有限公司 | Model training method, device, terminal and storage medium |
CN118468988A (en) * | 2024-07-09 | 2024-08-09 | 浙江大学 | Terminal data leakage event prediction method and system based on horizontal federal learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN116708009A (en) | Network intrusion detection method based on federal learning | |
Xu et al. | Trust-aware service offloading for video surveillance in edge computing enabled internet of vehicles | |
CN112906903B (en) | Network security risk prediction method and device, storage medium and computer equipment | |
CN112733967B (en) | Model training method, device, equipment and storage medium for federal learning | |
Xu et al. | Anomaly traffic detection based on communication-efficient federated learning in space-air-ground integration network | |
CN112712182B (en) | Model training method and device based on federal learning and storage medium | |
He et al. | Cgan-based collaborative intrusion detection for uav networks: A blockchain-empowered distributed federated learning approach | |
CN113298268A (en) | Vertical federal learning method and device based on anti-noise injection | |
CN115208604B (en) | AMI network intrusion detection method, device and medium | |
CN114691167A (en) | Method and device for updating machine learning model | |
CN112560059B (en) | Vertical federal model stealing defense method based on neural pathway feature extraction | |
CN115841133A (en) | Method, device and equipment for federated learning and storage medium | |
CN112600697B (en) | QoS prediction method and system based on federal learning, client and server | |
CN111181930A (en) | DDoS attack detection method, device, computer equipment and storage medium | |
CN117395067B (en) | User data privacy protection system and method for Bayesian robust federal learning | |
CN115481441A (en) | Difference privacy protection method and device for federal learning | |
CN112116078A (en) | Information security baseline learning method based on artificial intelligence | |
CN114282692A (en) | Model training method and system for longitudinal federal learning | |
CN114362948B (en) | Federated derived feature logistic regression modeling method | |
CN114640498A (en) | Network intrusion cooperative detection method based on federal learning | |
CN117150566A (en) | Robust training method and device for collaborative learning | |
CN115859344A (en) | Secret sharing-based safe sharing method for data of federal unmanned aerial vehicle group | |
Li et al. | An Adaptive Communication‐Efficient Federated Learning to Resist Gradient‐Based Reconstruction Attacks | |
Liu et al. | Large-scale multiobjective federated neuroevolution for privacy and security in the Internet of Things | |
CN116319583A (en) | Encryption network traffic classification method based on GCNN and MoE |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |