CN114554491A

CN114554491A - Wireless local area network intrusion detection method based on improved SSAE and DNN models

Info

Publication number: CN114554491A
Application number: CN202210167997.4A
Authority: CN
Inventors: 王海珍; 崔志青; 葛海淼; 廉佐政; 滕艳平; 李梦歌
Original assignee: Qiqihar University
Current assignee: Qiqihar University
Priority date: 2022-02-23
Filing date: 2022-02-23
Publication date: 2022-05-27

Abstract

The invention relates to the field of wireless network security, in particular to a wireless local area network intrusion detection method based on improved SSAE and DNN models. The method uses an SSAE for feature extraction and dimension reduction and then feeds the extracted features to a DNN for classification, wherein the SSAE uses tanh as its activation function together with an L2 regularization term, the DNN comprises three hidden layers and a dropout layer for preventing overfitting, and a grid search method is used to optimize the parameters and methods of the neural network. Compared with AE and SSAE models and the traditional SSAE-DNN model, the method performs better on each class in the multi-class wireless local area network attack classification problem and achieves better precision and ROC curves.

Description

Wireless local area network intrusion detection method based on improved SSAE and DNN models

Technical Field

The invention relates to the field of wireless network security, in particular to a wireless local area network intrusion detection method based on improved SSAE and DNN models.

Background

Since its introduction, wireless local area network (WLAN) technology has become part of daily life thanks to its flexible access, low cost and strong scalability. As the number of users grows, security problems grow with it: new vulnerabilities and attacks appear constantly, wireless network crime occurs from time to time, network threats become more serious by the day, and network security has become a focus of public concern. In a wireless LAN, however, a security system is difficult to install on the wireless access point device and is generally deployed independently of the access point. An intrusion detection system can monitor the wireless network and discover its security problems, which suits this requirement well, and it plays an increasingly important role in defending wireless networks.

Intrusion detection methods are generally classified into misuse-based intrusion detection and anomaly-based intrusion detection. A misuse-based intrusion detection system identifies intrusions by matching traffic against predefined features, and traffic that matches a predefined attack feature is classified as an intrusion. It is therefore effective at detecting known attacks, with high detection accuracy and a low false alarm rate, but its performance suffers on unknown or new attacks because it is limited to the rules pre-installed in the system. Anomaly-based intrusion detection can identify unknown intrusion behavior, is well suited to detecting unknown and new attacks, and has been widely recognized in the research community. In essence, anomaly-based intrusion detection can be viewed as a classification problem that identifies network attacks by classifying network traffic as normal or anomalous, so machine learning algorithms can advance the development of intrusion detection systems. However, in machine-learning-based intrusion detection, 80% of the workload is spent on feature extraction, and deep learning can greatly improve efficiency by extracting features automatically. Network traffic data often contains multidimensional features, so deep learning has a significant impact on intrusion detection; autoencoders (AE) in particular perform very well at capturing complex multivariate distributions of the input feature space.

At present, scholars at home and abroad have proposed various deep-learning-based intrusion detection methods. For example, Shone et al. construct a deep asymmetric autoencoder network model with strong learning capability to extract features from network traffic data, and combine it with an SVM to classify network attacks. Thing constructed an SAE model and combined it with softmax. Amianto et al. extract a new feature representation from the original features of network traffic data using an SAE model, rank all original and new features by importance to select the features most relevant to network intrusion detection, and then identify and classify network attacks with a support vector machine (SVM). Hosseinzadeh et al. use an SVM for anomaly detection. Tan et al. propose a new intrusion detection method based on a deep belief network (DBN) and a particle swarm optimization algorithm. Liu et al. propose an anomaly detection method based on auxiliary feature vectors that applies density-based spatial clustering of applications with noise. Mighan et al. combine the advantages of deep learning and machine learning to train latent features and address the challenges of massive data; experiments show that the method is superior to a simple machine learning method, but its precision needs to be improved. Zhong et al. propose a big-data-based hierarchical deep learning system (BDHDLS) in which each model focuses on learning the unique data distribution within one cluster; compared with the traditional single-model approach, this strategy can improve the accuracy of attack detection, but it is too complex. Wang et al. combine SDAE-ELM with DBN-softmax to design a new intrusion detection model that overcomes long training time and low classification precision, but the deep intrusion detection model has a complex structure and many parameters. The results of Saraeian et al. in applying a CNN to DoS attacks show that deep learning has strong learning ability and high accuracy, but the detection accuracy for other attacks needs to be improved. Imrana et al. propose an intrusion detection system based on a bidirectional LSTM (BiDLSTM) to reduce the false alarm rate. Lin combines traditional intrusion detection techniques with DBNs to improve accuracy, but the network has too many parameters and the model is too complex.

In recent years, AE, SAE, DNN and their variants such as RNN and LSTM have been successfully applied to intrusion detection systems, overcoming the limitations of shallow learning and effectively improving such systems. Rao et al. propose a two-stage hybrid intrusion detection method. In the first stage, smooth L1 regularization is employed to enhance the sparsity of the autoencoder. In the second stage, a DNN is used to predict and classify attacks, and experiments show that the method achieves better accuracy and detection rate. Andresini et al. applied the AE neural network to feature extraction and anomaly detection and demonstrated its effectiveness through experimental analysis. Chen et al. propose a classification model for wireless network intrusion detection based on recurrent neural networks and long short-term memory (LSTM). The model can predict time series and process discrete data by establishing correlations; it has higher accuracy when detecting data injection attacks, but it brings no obvious improvement for other attacks and takes longer to run.

In summary, WLAN intrusion detection systems still have many problems. First, WLAN intrusion detection techniques produce both false alarms and missed detections, so detection performance needs to be improved. Second, as the volume of data in WLANs keeps growing, existing methods are not very effective at data extraction, analysis and processing, so the processing capability of the intrusion detection system needs to be enhanced.

Disclosure of Invention

In order to solve the problems, the invention provides a WLAN intrusion detection method based on improved SSAE and DNN models.

In order to achieve the purpose, the invention adopts the technical scheme that:

A wireless local area network intrusion detection method based on improved SSAE (stacked sparse autoencoder) and DNN (deep neural network) models, characterized in that an SSAE is used for feature extraction and dimension reduction and the extracted features are then used as the input of a DNN for classification, wherein the SSAE uses tanh as its activation function together with an L2 regularization term, the DNN comprises three hidden layers and a dropout layer for preventing overfitting, and a grid search method is used to optimize the parameters and methods of the neural network.

In the wireless local area network intrusion detection method based on the improved SSAE and DNN models, the SSAE stacks three sparse autoencoders: the compressed hidden-layer output of the first encoder is used as the input of the second encoder, which compresses the features further; the hidden-layer output of the second encoder is used as the input of the third encoder; and the low-dimensional features extracted by the third encoder are the final output.

In the wireless local area network intrusion detection method based on the improved SSAE and DNN models, the first encoder extracts 70-dimensional features, the second encoder extracts 50-dimensional features, and the third encoder extracts the final 20-dimensional features.

In the wireless local area network intrusion detection method based on the improved SSAE and DNN models, the parameters optimized by the grid search method comprise: activation, epochs, batch_size, init_mode, optimizer.

The invention discloses a wireless local area network intrusion detection method based on improved SSAE and DNN models, which comprises the following steps:

S1, preprocessing the original data set to complete validation, equalization, digitization and normalization of the data; specifically:

data validation

Deleting the attribute columns whose values are 80% null, so that the data set finally has 93-dimensional features;

data equalization

Randomly selecting 5% of data from the normal flow to balance the proportion of the normal flow and the abnormal flow;

data digitization

Using a LabelEncoder to assign a distinct discrete number to each distinct value of the same feature, and digitizing the labels: there are 4 label classes in total, which are mapped into four-dimensional vectors by one-hot encoding;

data normalization

Mapping the attribute values into [0,1] by min-max normalization, as shown in formula (1):

$$y_i = \frac{x_i - \min(x)}{\max(x) - \min(x)} \tag{1}$$

wherein $y_i$ is the normalized value of the i-th feature value, $x_i$ is the i-th feature value, $\max(x)$ is the maximum value in the feature column containing $x_i$, and $\min(x)$ is the minimum value in that feature column;

S2, performing feature extraction and dimension reduction with the SSAE (stacked sparse autoencoder), and then classifying with the extracted features as the input of the DNN (deep neural network).

Compared with AE and SSAE models and the traditional SSAE-DNN model, the method performs better on each class in the multi-class wireless local area network attack classification problem and achieves better precision and ROC (receiver operating characteristic) curves.

Drawings

FIG. 1 is a diagram of an improved SSAE and DNN intrusion detection model in an embodiment of the present invention.

FIG. 2 is a single SAE structure in an embodiment of the present invention.

FIG. 3 shows the loss for each activation function with the L1 and L2 regularization terms in an embodiment of the present invention.

FIG. 4 shows the accuracy of each activation function and the L1 and L2 regularization terms in an embodiment of the present invention.

FIG. 5 is a stacked SAE structure in an embodiment of the present invention.

Fig. 6 shows a DNN structure according to an embodiment of the present invention.

FIG. 7 is a graph illustrating the effect of compression dimension on the accuracy of a test model in an embodiment of the present invention.

FIG. 8 is a graph illustrating the effect of compression dimension on the loss rate of a test model in an embodiment of the present invention.

FIG. 9 is a graph showing the effect of different epochs and batch_size values on the model in an embodiment of the invention.

FIG. 10 is a diagram illustrating the influence of different weight initialization methods on a model according to an embodiment of the present invention.

FIG. 11 is a diagram illustrating the effect of the output-layer activation function (activation) on the model according to an embodiment of the present invention.

FIG. 12 shows the effect of the optimizer on the model in an embodiment of the present invention.

FIG. 13 is a ROC graph of the AE model in the embodiment of the present invention.

FIG. 14 is a ROC plot of an SSAE model in an embodiment of the present invention.

FIG. 15 is a ROC plot of SSAE-DNN in an example of the present invention.

FIG. 16 is a ROC plot of the model of the present invention.

Detailed Description

In order that the objects and advantages of the invention will be more clearly understood, the invention is further described in detail below with reference to examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

1. Introduction to data set

The wireless network attack data set used by the invention is the Aegean WiFi Intrusion Dataset (AWID), which is generated from real wireless network records and is available as a reduced set and a full set. The invention uses the training set and test set of the reduced set, namely AWID-CLS-R-Trn and AWID-CLS-R-Tst. The training set contains actual traffic packets captured over one week, 1795574 records in total: 1633189 normal, 65379 injection, 48522 impersonation and 48484 flooding. The test set contains 575642 records: 530784 normal, 20079 impersonation, 16682 injection and 8097 flooding. The training set and test set distributions are shown in Table 1.

TABLE 1 AWID reduced dataset type distribution

It can be seen from Table 1 that the amount of normal traffic in the data set is much higher than the amount of attack traffic: the ratio of normal-label samples to attack-label samples is close to 15:1, so both the training set and the test set are unbalanced. Unbalanced training data may cause the classifier to overfit the normal class and underfit the attack classes, which harms training, so the data set needs to be balanced by sampling. The data pre-processing section of the present invention performs validation, equalization, digitization and normalization on the data set.

2. Method

The overall framework of the intrusion detection model provided by the invention is shown in FIG. 1. First, the original data set is preprocessed to complete validation, equalization, digitization and normalization of the data. The model is then trained on the training set, and the trained model is saved and used to detect the test data. The model consists mainly of the SSAE design and the DNN design: the SSAE extracts features and reduces their dimension, and the extracted features are used as the input of the DNN for classification. To obtain a better detection effect, the DNN comprises three hidden layers and a dropout layer for preventing overfitting, and a grid search method is used to optimize the parameters and methods of the neural network. The design process of the model is described in detail below.

SSAE design

Data preprocessing yields 93-dimensional features. Because these features are correlated with one another, their dimension can be reduced further.

Single SAE architecture

A single SAE architecture is shown in FIG. 2. The SAE is a neural network whose input dimension equals its output dimension, and it consists of an encoder and a decoder. The original features form the input layer; the encoder maps them to lower-dimensional features through a nonlinear transformation from high dimension to low dimension, and the decoder reconstructs the original dimension from them.

The encoding process from the input layer to the hidden layer is performed as in equation (2), and the decoding process from the hidden layer to the output layer is performed as in equation (3), and the decoding completes the reconstruction of the features.

$$Y = g_a(w_a X + b_a) \tag{2}$$

$$Z = g_h(w_h Y + b_h) \tag{3}$$

where $w_a$ and $b_a$ are the weight vector and offset value of the input layer, and $w_h$ and $b_h$ are the weight vector and offset value of the hidden layer; $X = (x_1, x_2, x_3, \dots, x_n)$, $Y = (y_1, y_2, y_3, \dots, y_m)$, $Z = (z_1, z_2, z_3, \dots, z_n)$, $X, Z \in \mathbb{R}^n$, $Y \in \mathbb{R}^m$, where n is the dimension of the input layer and the output layer, and m is the dimension of the hidden layer. $g_a$ and $g_h$ are the activation functions. By comparing the effects of various activation functions, the invention finds that the tanh activation function works better, so $g_a$ and $g_h$ take the form shown in (4), where $x_t$ is the input data of the activation function:

$$g(x_t) = \tanh(x_t) = \frac{e^{x_t} - e^{-x_t}}{e^{x_t} + e^{-x_t}} \tag{4}$$
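The encoding and decoding steps of equations (2) and (3) can be sketched with NumPy as follows. This is a minimal illustration rather than the patented implementation: the random weights, the seed, and the use of mean squared error as a reconstruction measure are assumptions, while n = 93 and m = 70 are taken from the feature count after preprocessing and the size of the first encoder.

```python
import numpy as np

def encode(X, w_a, b_a):
    """Equation (2): Y = tanh(w_a X + b_a)."""
    return np.tanh(w_a @ X + b_a)

def decode(Y, w_h, b_h):
    """Equation (3): Z = tanh(w_h Y + b_h)."""
    return np.tanh(w_h @ Y + b_h)

rng = np.random.default_rng(0)          # fixed seed, illustrative only
n, m = 93, 70                           # input/output dimension n, hidden dimension m
X = rng.random(n)                       # one normalized sample in [0, 1]

# Untrained, randomly initialized parameters (assumption for the sketch).
w_a, b_a = rng.normal(0, 0.1, (m, n)), np.zeros(m)
w_h, b_h = rng.normal(0, 0.1, (n, m)), np.zeros(n)

Y = encode(X, w_a, b_a)                 # compressed representation, shape (m,)
Z = decode(Y, w_h, b_h)                 # reconstruction, shape (n,)

# Mean squared reconstruction error (assumed measure, not stated in the text).
reconstruction_error = float(np.mean((X - Z) ** 2))
```

In training, the weights would be adjusted to make Z close to X; the sketch only shows the shape of the forward computation.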

To prevent over-fitting of the self-encoder model, a regularization term is added to the model, limiting the complexity of the model, thereby balancing the model between complexity and performance. Common regularization methods include L1 regularization and L2 regularization, and the L1 regularization term and the L2 regularization term can be regarded as penalty terms of the loss function, that is, some restrictions are made on some parameters in the loss function. Expression of L1 regularization is as in equation (5).

$$J = J_0 + \alpha \lVert w \rVert_1 \tag{5}$$

where $J_0$ denotes the original loss function and $\alpha \lVert w \rVert_1$ is the L1 regularization term; $\lVert w \rVert_1$ is the sum of the absolute values of the elements of the weight vector $w$. The constraint that L1 regularization adds to the solution space is $\sum |w| \le C$. L1 regularization can produce a sparse weight matrix, that is, a sparse model, which can be used for feature selection and can also prevent overfitting. The expression of L2 regularization is given in formula (6):

$$J = J_0 + \alpha \lVert w \rVert_2^2 \tag{6}$$

where $\alpha \lVert w \rVert_2^2$ is the L2 regularization term; $\lVert w \rVert_2^2$ is the sum of the squares of the elements of the weight vector $w$. The constraint that L2 regularization adds to the solution space is $\sum w^2 \le C$. L2 regularization can prevent the model from overfitting.
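As a small worked example of the two penalty terms described above, the following sketch computes the sum of absolute values (L1) and the sum of squares (L2) for a toy weight vector. The weight values and the coefficient alpha = 0.01 are arbitrary assumptions chosen for easy arithmetic.

```python
def l1_penalty(weights, alpha):
    """alpha times the sum of absolute weight values."""
    return alpha * sum(abs(w) for w in weights)

def l2_penalty(weights, alpha):
    """alpha times the sum of squared weight values."""
    return alpha * sum(w * w for w in weights)

weights = [0.5, -0.25, 0.0, 2.0]   # illustrative weights only
alpha = 0.01                       # illustrative coefficient only

l1 = l1_penalty(weights, alpha)    # 0.01 * (0.5 + 0.25 + 0 + 2) = 0.0275
l2 = l2_penalty(weights, alpha)    # 0.01 * (0.25 + 0.0625 + 0 + 4) = 0.043125
```

The large weight 2.0 dominates the L2 penalty far more than the L1 penalty, which is why L2 tends to shrink large weights while L1 tends to push small weights to exactly zero.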

A suitable activation function and regularization term were selected experimentally. The experimental loss and accuracy are shown in FIG. 3 and FIG. 4; the comparison shows that using tanh as the activation function together with the L2 regularization term gives the better result.

Stacked SAE design

To better reduce the dimension of the data, the invention stacks three sparse autoencoders whose extracted dimensions decrease layer by layer, thereby achieving feature dimension reduction. As shown in FIG. 5, the compressed hidden-layer output of the first encoder is used as the input of the second encoder, which compresses the features further; the hidden-layer output of the second encoder is used as the input of the third encoder; and the low-dimensional features extracted by the third encoder are the final output. In the stacked structure adopted by the invention, the first layer extracts 70-dimensional features, the second layer extracts 50-dimensional features, and the third layer extracts the final 20-dimensional features, which are output.
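The layer-by-layer compression can be illustrated by chaining three tanh encoders. This is a hedged sketch with untrained random weights (an assumption); the hidden sizes 70, 50 and 20 come from the text, and the input size of 93 is taken from the preprocessing description.

```python
import numpy as np

LAYER_DIMS = [93, 70, 50, 20]   # input size, then the three encoder output sizes

def build_encoders(dims, rng):
    """Create one (weight, bias) pair per encoder; values are random placeholders."""
    return [(rng.normal(0, 0.1, (d_out, d_in)), np.zeros(d_out))
            for d_in, d_out in zip(dims[:-1], dims[1:])]

def stacked_encode(x, encoders):
    """Feed each encoder's hidden output into the next encoder."""
    sizes = []
    for w, b in encoders:
        x = np.tanh(w @ x + b)
        sizes.append(x.shape[0])
    return x, sizes

rng = np.random.default_rng(0)
encoders = build_encoders(LAYER_DIMS, rng)
features, sizes = stacked_encode(rng.random(93), encoders)
```

Here `sizes` records the dimension after each encoder, and `features` is the 20-dimensional vector that would be passed on to the classifier.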

DNN design

The DNN structure designed by the invention is shown in FIG. 6. Its input is the 20-dimensional feature obtained after dimension reduction. There are 3 hidden layers: the 1st layer has 128 neurons, the 2nd layer has 64 neurons, and the 3rd layer has 32 neurons. The layers are fully connected, that is, every neuron of the i-th layer is connected to every neuron of the (i+1)-th layer. To prevent model overfitting, a dropout layer is added, which effectively alleviates overfitting and provides a degree of regularization. Finally, a softmax classifier is connected after the DNN to solve the multi-class classification problem.
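A forward pass through a network of this shape can be sketched in NumPy as below. The layer sizes 20, 128, 64, 32 and 4 follow the text; the ReLU hidden activation, the dropout rate of 0.5, the placement of dropout after the last hidden layer, and the random weights are all assumptions made for illustration, since the text does not specify them.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def softmax(z):
    e = np.exp(z - z.max())          # subtract the max for numerical stability
    return e / e.sum()

def dropout(x, rate, rng, training):
    """Inverted dropout: zero units at random and rescale the survivors."""
    if not training:
        return x
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

def forward(x, params, rng, rate=0.5, training=False):
    *hidden, (w_out, b_out) = params
    for w, b in hidden:
        x = relu(w @ x + b)          # assumed hidden activation
    x = dropout(x, rate, rng, training)
    return softmax(w_out @ x + b_out)

dims = [20, 128, 64, 32, 4]          # input, three hidden layers, four classes
rng = np.random.default_rng(0)
params = [(rng.normal(0, 0.1, (o, i)), np.zeros(o))
          for i, o in zip(dims[:-1], dims[1:])]

probs = forward(rng.random(20), params, rng)                       # inference
probs_train = forward(rng.random(20), params, rng, training=True)  # with dropout
```

The output is a probability vector over the four traffic classes; dropout is active only when `training=True`.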

The invention uses the grid search method to automatically optimize the parameters of the deep neural network and thereby optimize the model. Grid search is an exhaustive search over specified parameter values: the possible values of each parameter are combined in every possible way to form a "grid", each combination is used for training, and its performance is evaluated by cross-validation in order to determine the best-performing configuration. The parameters optimized in this model are: activation, epochs, batch_size, init_mode, optimizer; adjusting these parameters improves the performance of the model.
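The enumeration at the heart of grid search can be sketched with the standard library alone. Only the five parameter names come from the text; the candidate values are illustrative assumptions, and `toy_score` is a hypothetical stand-in for cross-validated accuracy that trains nothing. Its target is set to the values the experiments section reports as working better, purely so that the example has a definite answer.

```python
from itertools import product

# Candidate values are assumptions for illustration.
param_grid = {
    "activation": ["softmax", "sigmoid"],
    "epochs": [10, 50],
    "batch_size": [20, 40],
    "init_mode": ["uniform", "normal"],
    "optimizer": ["RMSprop", "Adam"],
}

def iter_grid(grid):
    """Yield every combination of parameter values as a dict."""
    keys = list(grid)
    for combo in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, combo))

def grid_search(grid, score_fn):
    """Return the combination with the highest score."""
    return max(iter_grid(grid), key=score_fn)

def toy_score(params):
    """Hypothetical score: counts how many values match a fixed target."""
    target = {"activation": "softmax", "epochs": 50, "batch_size": 40,
              "init_mode": "uniform", "optimizer": "RMSprop"}
    return sum(params[k] == v for k, v in target.items())

n_combinations = sum(1 for _ in iter_grid(param_grid))
best = grid_search(param_grid, toy_score)
```

With two candidates for each of five parameters, the grid has 32 combinations; in a real run each one would be trained and scored by cross-validation, which is why grid size grows quickly with the number of parameters.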

Data pre-processing

Data validation

The original training set and test set have 155-dimensional features and contain a large amount of null data, which contributes nothing useful to the experiment. The attribute columns whose values are 80% null are therefore deleted, leaving the data set with 93-dimensional features.
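The column filter can be sketched as follows. The interpretation of the threshold (drop a column when at least 80% of its values are null), the representation of nulls as `None`, and the column names are assumptions for the example.

```python
def drop_sparse_columns(table, max_null_fraction=0.8):
    """Keep only columns whose fraction of null values is below the threshold."""
    kept = {}
    for name, values in table.items():
        null_fraction = sum(v is None for v in values) / len(values)
        if null_fraction < max_null_fraction:
            kept[name] = values
    return kept

# Hypothetical toy table: column name -> list of values.
table = {
    "feature_a": [60, 74, 1500, 60, 74],            # 0% null, kept
    "feature_b": ["aa", None, "bb", "cc", None],    # 40% null, kept
    "feature_c": [None, None, None, None, 1],       # 80% null, dropped
}
cleaned = drop_sparse_columns(table)
```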

Data equalization

The AWID dataset is an extremely unbalanced dataset, and therefore, the dataset needs to be subjected to instance selection, 5% of data is randomly selected from normal traffic, and the proportion of the normal traffic to abnormal traffic is balanced, so that a model can be trained better.
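A minimal sketch of this downsampling step is shown below, assuming each record carries its class label and that all attack records are kept. The function name, the record layout and the seed are hypothetical.

```python
import random

def balance(records, label_of, normal_label="normal", keep_fraction=0.05, seed=0):
    """Keep a random fraction of normal records and every attack record."""
    rng = random.Random(seed)
    normal = [r for r in records if label_of(r) == normal_label]
    attacks = [r for r in records if label_of(r) != normal_label]
    sampled = rng.sample(normal, int(len(normal) * keep_fraction))
    balanced = sampled + attacks
    rng.shuffle(balanced)            # avoid grouping the classes together
    return balanced

# Toy data: 1000 normal records and 30 flooding records.
records = [("normal", i) for i in range(1000)] + [("flooding", i) for i in range(30)]
balanced = balance(records, label_of=lambda r: r[0])

n_normal = sum(1 for r in balanced if r[0] == "normal")
n_attack = sum(1 for r in balanced if r[0] != "normal")
```

In the toy data the normal class shrinks from 1000 to 50 records while all 30 attack records remain.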

Data digitization

The data required for sampling must be a numerical matrix, so character-type attributes in the data set, such as MAC addresses, need to be converted into numerical data. A LabelEncoder is used to assign a distinct discrete number to each distinct value of the same feature. The labels are also digitized: there are 4 label classes in total, which are mapped into four-dimensional vectors by one-hot encoding, for example normal is 0001, injection is 0010, impersonation is 0100, and flooding is 1000.
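Both encodings can be illustrated in plain Python. The `label_encode` helper below is a hypothetical stand-in that mimics the sorted-class behaviour of scikit-learn's `LabelEncoder`; the MAC-like strings are made up, and the one-hot vectors are the ones given in the text.

```python
def label_encode(values):
    """Map each distinct value to an integer, using sorted order of the values."""
    classes = sorted(set(values))
    index = {c: i for i, c in enumerate(classes)}
    return [index[v] for v in values], classes

# One-hot vectors for the four labels, as listed in the text.
ONE_HOT = {
    "normal":        (0, 0, 0, 1),
    "injection":     (0, 0, 1, 0),
    "impersonation": (0, 1, 0, 0),
    "flooding":      (1, 0, 0, 0),
}

# Made-up string values standing in for a character-type attribute.
codes, classes = label_encode(["bb:01", "aa:02", "bb:01", "cc:03"])
targets = [ONE_HOT[label] for label in ["normal", "flooding", "injection"]]
```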

Data normalization

The value ranges of different features in the data set differ greatly, so the numerical data are normalized to eliminate the influence of this difference on the model. The invention uses min-max normalization to map the attribute values into [0,1], as shown in formula (1):

$$y_i = \frac{x_i - \min(x)}{\max(x) - \min(x)} \tag{1}$$

where $y_i$ is the normalized value of the i-th feature value, $x_i$ is the i-th feature value, $\max(x)$ is the maximum value in the feature column containing $x_i$, and $\min(x)$ is the minimum value in that feature column.
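Min-max normalization of a single feature column can be sketched as follows. Mapping a constant column to all zeros is an assumption added to avoid division by zero; the text does not say how such columns are handled.

```python
def min_max_normalize(column):
    """Scale a list of numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant column carries no information, map it to 0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

normalized = min_max_normalize([2.0, 4.0, 10.0])   # -> [0.0, 0.25, 1.0]
constant = min_max_normalize([7.0, 7.0])           # -> [0.0, 0.0]
```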

Results of the experiment

The experimental environment is Python 3.7 and TensorFlow.

SSAE experiments and results analysis

The SSAE adopts a multilayer neural network structure in which the input layer contains 92 neurons and the output layer has the same number of neurons as the input layer. The autoencoder comprises three encoders and three decoders; because the autoencoder is symmetrical, the encoder and decoder layer sizes are 70:50:20 and 20:50:70 respectively. Here 20 is the size of the innermost hidden layer, which can be regarded as the final compressed vector of the encoder, that is, the finally extracted features, which are used as input features for training the neural network in the next step. FIG. 7 and FIG. 8 show the influence of the extracted feature dimension on the model: the compression dimension has an important influence on the model test, and compression to 20 dimensions works better.

DNN experiments and results analysis

The input of the DNN is the 20-dimensional feature compressed by the hidden layers of the autoencoder. The DNN comprises three hidden layers with 128:64:32 neurons and a dropout layer for preventing overfitting; the output layer is a softmax classifier that outputs 4 categories. GridSearchCV is added to the model to search for the best value of each parameter and so improve the accuracy of the model. As can be seen from FIG. 9 and FIG. 10, changes in epochs and batch_size affect model accuracy, with better results when batch_size is 40 and epochs is 50; the model also performs better when the weight initialization method is uniform.

It can be seen from FIG. 11 and FIG. 12 that the choice of activation function (activation) and optimizer also affects model accuracy, with better results when activation is softmax and the optimizer is RMSprop.

Classification models most often measure the effectiveness of the model in terms of True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN). However, the AWID data set studied by the invention is an extremely unbalanced data set, so the measurement index selects an ROC curve, and the ROC curve is a curve which takes the false positive rate as an x axis and the true positive rate as a y axis. AUC, the area under the ROC characteristic curve, the larger the AUC, the better the classifier performance. Because the calculation method of the AUC simultaneously considers the classification capacity of the classifier on positive examples and negative examples, the classifier can be reasonably evaluated under the condition of unbalanced samples, and the problem caused by uneven samples is successfully avoided. The ROC curves for AE, SSAE, conventional SSAE-DNN and the model of the invention are shown in FIGS. 13-16 and the experimental data for their detection is shown in Table 2.
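For a single class treated one-vs-rest, the ROC curve and AUC described above can be computed as in the following sketch. The labels and scores are made-up toy values, not results from the experiments, and the trapezoidal rule is one common way to compute the area.

```python
def roc_curve(labels, scores):
    """Return (false positive rate, true positive rate) points, sweeping the threshold downward."""
    pairs = sorted(zip(scores, labels), reverse=True)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    i = 0
    while i < len(pairs):
        j = i
        # Process all samples that share the same score as one threshold step.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

labels = [1, 1, 0, 1, 0, 0]                  # 1 = attack class, 0 = rest (toy data)
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]      # classifier scores (toy data)
area = auc(roc_curve(labels, scores))        # 8/9, about 0.889
```

The value 8/9 equals the fraction of (positive, negative) pairs in which the positive sample receives the higher score, which is why AUC does not depend on the ratio between the two classes.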

TABLE 2 comparison of experimental data

As can be seen from Table 2, the model added with the grid search method can improve the effect of the model as a whole, wherein the detection effect of impersonation is improved obviously.

The foregoing is only a preferred embodiment of the present invention, and it should be noted that those skilled in the art can make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications should also be construed as the protection scope of the present invention.

Claims

1. A wireless local area network intrusion detection method based on improved SSAE and DNN models, characterized in that: an SSAE is used for feature extraction and dimension reduction, and the extracted features are then used as the input of a DNN for classification, wherein the SSAE uses tanh as its activation function together with an L2 regularization term, the DNN comprises three hidden layers and a dropout layer for preventing overfitting, and a grid search method is used to optimize the parameters and methods of the neural network.

2. The method of claim 1 for wireless local area network intrusion detection based on improved SSAE and DNN models, wherein: the SSAE stacks three sparse autoencoders, the compressed hidden-layer output of the first encoder is used as the input of the second encoder, the features are compressed by the second encoder, the hidden-layer output of the second encoder is used as the input of the third encoder, and finally the low-dimensional features extracted by the third encoder are output.

3. The method of claim 2 for wireless local area network intrusion detection based on improved SSAE and DNN models, wherein: the first encoder extracts 70-dimensional features, the second encoder extracts 50-dimensional features, and the third encoder extracts the final 20-dimensional features.

4. The method of claim 1 for wireless local area network intrusion detection based on improved SSAE and DNN models, wherein: the parameters optimized by the grid search method comprise: activation, epochs, batch_size, init_mode, optimizer.

5. The method of claim 1 for wireless local area network intrusion detection based on improved SSAE and DNN models, wherein: the method comprises the following steps:

data validation

data equalization

Randomly selecting 5% of data from the normal flow to balance the proportion of the normal flow and the abnormal flow;

data digitization

data normalization

and S2, performing feature extraction and dimension reduction with the SSAE, and then classifying with the extracted features as the input of the DNN.