CN112839034B - Network intrusion detection method based on CNN-GRU hierarchical neural network - Google Patents

Network intrusion detection method based on CNN-GRU hierarchical neural network

Info

Publication number
CN112839034B
Authority
CN
China
Prior art keywords
data
network
gru
cnn
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011590155.7A
Other languages
Chinese (zh)
Other versions
CN112839034A (en
Inventor
王梓天
朱国胜
邹洁
王泽松
刘旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hubei University
CERNET Corp
Original Assignee
Hubei University
CERNET Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hubei University and CERNET Corp
Priority to CN202011590155.7A
Publication of CN112839034A
Application granted
Publication of CN112839034B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a network intrusion detection method based on a CNN-GRU hierarchical neural network, which comprises the following steps: capturing a network flow data packet, namely a data packet to be classified, by Wireshark software; carrying out data packet marking, preprocessing and data cleaning on a data packet to be classified, further analyzing the data packet into decimal data, and converting the decimal data into a 40 × 40 single-channel gray-scale image to obtain a sample complete set; dividing a sample complete set into a training set and a testing set, taking a single-channel gray-scale map matrix as an input vector, and establishing a CNN-GRU hierarchical neural network classification model through the training set; and after the model training is finished, the data of the test set is transmitted into the model, the model predicts the input data according to the parameters obtained by training, and classifies unknown network traffic to judge whether the unknown network traffic is attack traffic. The experimental result shows that the accuracy of the method for classifying the normal flow and the attack flow reaches 99.92%.

Description

Network intrusion detection method based on CNN-GRU hierarchical neural network
Technical Field
The invention relates to the technical field of network security, in particular to a network intrusion detection method based on a CNN-GRU hierarchical neural network.
Background
With the rapid development of the Internet, a large number of devices and people have joined the Internet environment, and network traffic security problems have increased accordingly. Network attackers often exploit vulnerabilities on the Internet to disrupt networks, causing immeasurable losses to users. In the past, such attacks mainly caused economic losses to enterprises; now they also include the theft of personal privacy, which causes tremendous harm to the interests of most network users.
To avoid such problems, attack behavior needs to be detected by analyzing the traffic data generated by network users. A key challenge is how to efficiently identify traffic data that exhibits attack behavior. The traditional approach of cracking and decrypting network traffic requires additional equipment to be deployed, so its cost and deployment difficulty are high. Traditional payload-based methods can no longer handle the growing volume of encrypted traffic, so network intrusion detection often relies on traditional machine learning models. A common problem, however, is that it is difficult to find a suitable function to serve as a reference standard for the network; moreover, a machine learning model usually needs many quantifiable features as a training reference and is not suitable for classification training with ambiguous features. As a result, the accuracy of classification with machine learning methods hits a bottleneck that is difficult to overcome.
With the development of chip technology, the computing power of computers has grown greatly in recent years. Meanwhile, the development of the Internet has produced a large amount of data. Under these conditions, deep learning networks have been widely used, including for network intrusion detection. Compared with traditional machine learning methods, deep learning methods can automatically find the correlations among different pieces of traffic information and assign different weights to the features through training on massive data. Compared with manually defined features, this approach has better applicability and is better suited to implementing a network intrusion detection system.
Disclosure of Invention
The purpose of the invention is to provide a network intrusion detection method based on a CNN-GRU hierarchical neural network, which analyzes the characteristic attributes of the acquired network traffic through the CNN-GRU hierarchical neural network. Addressing the practical problem of network intrusion detection, the method collects the available original data, extracts a CNN-GRU hierarchical neural network sample complete set using the determined label data, preprocesses the original data through feature engineering, and removes part of the invalid contents of the data packets. After the sample complete set is divided into a training set and a test set in a suitable proportion, the model is trained and its effectiveness is verified on the test set, yielding a CNN-GRU hierarchical neural network classification model that accurately monitors network intrusion behavior.
In order to solve the problems, the technical scheme adopted by the invention is as follows:
a network intrusion detection method based on a CNN-GRU hierarchical neural network is characterized by comprising the following steps:
(1) capturing network traffic through Wireshark software to obtain a network traffic data packet, namely a data packet to be classified;
(2) carrying out data packet marking on the data packet to be classified, and meanwhile, preprocessing the data packet to be classified through feature engineering to remove part of invalid contents in the data packet; cleaning the data packets in all the streams, and cleaning each data stream; further analyzing the data packet into decimal data, and converting the decimal data into a 40-by-40 single-channel gray-scale graph; obtaining all picture samples required by model training, thereby obtaining a CNN-GRU layered neural network sample complete set;
(3) dividing a sample complete set into a training set and a test set according to a proper proportion, based on a CNN-GRU hierarchical neural network algorithm, taking a single-channel gray-scale map matrix as an input vector, and establishing a CNN-GRU hierarchical neural network classification model through the training set, so that the model learns how to classify samples;
(4) after the model training is finished, transmitting the data of the test set into the model, the model predicting the input data according to the parameters obtained by training and classifying unknown network traffic to judge whether it is attack traffic, or which type of attack traffic it is.
Further, the content stored in the network traffic data packets captured in step (1) is binary data.
Further, the specific process of step (2) includes:
(2.1) marking the data packets to be classified: normal traffic and attack traffic are marked as required; if the attack traffic needs to be classified, the different types of attack traffic are marked separately; the traffic type labels are stored as numbers, starting from 0;
(2.2) preprocessing the data packets to be classified through feature engineering: the captured network traffic is split into streams according to source IP address, source port and destination IP address, the splitting being performed with SplitCap software;
(2.3) cleaning the data packets in all the streams: the MAC source address, the MAC destination address and the network protocol type information are removed from each data packet, the first 160 bytes of data are extracted from each data packet, and data packets with fewer than 160 bytes are padded with zeros;
(2.4) cleaning each data stream: the first 10 data packets are extracted from each data stream, and when a data stream contains fewer than 10 data packets, all-zero 160-byte data packets are appended until there are 10 data packets;
(2.5) at this point the data of each stream is 160 × 10 = 1600 bytes; each byte is converted into a decimal value in the range 0-255, and the 1600-dimensional decimal data is converted into a 40 × 40 matrix;
(2.6) converting the values in the 40 × 40 matrix into gray levels to obtain, for each matrix, a single-channel gray-scale image of size 40 × 40, thereby obtaining all picture samples required for model training.
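The cleaning and reshaping of steps (2.3) to (2.5) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names are hypothetical, and each packet is assumed to be a raw, untagged Ethernet II frame whose first 14 bytes hold the destination MAC, the source MAC and the protocol type field.

```python
PACKET_LEN = 160        # bytes kept per packet (step 2.3)
PACKETS_PER_STREAM = 10 # packets kept per stream (step 2.4)
SIDE = 40               # 160 * 10 = 1600 = 40 * 40 (step 2.5)
ETH_HEADER_LEN = 14     # assumption: 6-byte dst MAC + 6-byte src MAC + 2-byte type


def clean_packet(frame: bytes) -> bytes:
    """Drop the assumed link-layer header, keep 160 bytes, zero-pad short packets."""
    payload = frame[ETH_HEADER_LEN:]
    return payload[:PACKET_LEN].ljust(PACKET_LEN, b"\x00")


def stream_to_matrix(frames):
    """Turn the first 10 packets of one stream into a 40 x 40 matrix of values 0..255."""
    packets = [clean_packet(f) for f in frames[:PACKETS_PER_STREAM]]
    # Append all-zero packets until the stream holds exactly 10 packets.
    packets += [bytes(PACKET_LEN)] * (PACKETS_PER_STREAM - len(packets))
    flat = list(b"".join(packets))  # iterating over bytes yields integers 0..255
    return [flat[r * SIDE:(r + 1) * SIDE] for r in range(SIDE)]


# Example: a stream with one long packet and one very short packet.
example = stream_to_matrix([bytes(14) + bytes([255]) * 200, bytes(14) + b"\x01\x02"])
```

Each cleaned packet fills four consecutive rows of the matrix, so the row position of a byte encodes which packet of the stream it came from.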
Further, the specific process of step (3) includes:
(3.1) first, the data enters an improved LeNet-5 network, which extracts the spatial features of the original network traffic data using two convolution layers and two max pooling layers; the first convolution layer uses 32 convolution kernels of size 5 × 5, followed by a max pooling operation, and the second convolution layer uses 64 convolution kernels of size 3 × 3, followed by a max pooling operation; after each convolution operation, the CNN hidden layer is first transformed with a ReLU activation function and then processed with the max pooling operation, so that the original single-channel 40 × 40 picture is converted into an 8 × 8 picture with 64 channels; flattening this output yields a 4096-dimensional vector, which is passed to the output layer of the CNN network; the output layer is a fully connected layer with 1600 neurons, so that the transformed output keeps the same dimensionality as the original data; after the features are extracted, some neurons of the fully connected layer are randomly deactivated (dropout) to avoid overfitting;
(3.2) then, the temporal features of the original stream data are automatically extracted by a GRU network, which uses two layers; each layer of the GRU network comprises 256 GRU units, and the activation function of each layer is a sigmoid function for the nonlinear operation; the last layer of the GRU network is a fully connected layer whose number of neurons equals the number of traffic classes;
(3.3) training with the training set to obtain the network intrusion detection model.
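The layer sizes quoted in step (3.1) are mutually consistent if one assumes unpadded convolutions with stride 1 and 2 × 2 max pooling with stride 2. The text does not state the padding or the pooling window, so both are assumptions; the helper names are hypothetical. A short arithmetic check:

```python
def conv_out(n, f):
    """Output side length of an unpadded, stride-1 convolution."""
    return n - f + 1


def pool_out(n, p=2):
    """Output side length of non-overlapping p x p max pooling."""
    return n // p


def cnn_feature_shape(n=40):
    """Trace a 40 x 40 input through the two convolution/pooling stages."""
    n = pool_out(conv_out(n, 5))  # 40 -> 36 -> 18, 32 channels
    n = pool_out(conv_out(n, 3))  # 18 -> 16 -> 8, 64 channels
    return n, n, 64


height, width, channels = cnn_feature_shape()
flattened = height * width * channels  # size of the vector fed to the dense layer
```

Under these assumptions the flattened length is 8 × 8 × 64 = 4096, matching the 4096-dimensional vector in the text, and the 1600-neuron output equals the 160 × 10 bytes of the original stream.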
Further, the convolution operation in step (3.1) performs a sliding convolution over a picture of size $n \times n$ using a convolution kernel $\omega$ of size $f \times f$, and each sliding step generates a new feature; let $X$ be the input to the convolution, $b$ the bias term, $c_i$ the new feature produced by the convolution at layer $i$, and $\sigma_r$ the ReLU activation function; the new feature obtained by the convolution operation is then $c_i = \sigma_r(\omega * X_i + b_i)$; after convolution, the $n \times n$ picture yields a feature map $c$ of size $(n - f + 1) \times (n - f + 1)$, the size being determined by the sliding window of the $f \times f$ convolution kernel; after convolution, max pooling is applied to the feature map $c$, taking the maximum value in the selected window as the final feature; the final feature map size is $(n - f + 1)/2$ along each side.
Further, the GRU network in step (3.2) obtains two gating states from the state $h_{t-1}$ passed on from the previous node and the input $x_t$ of the current node, where $r$ is the reset gate and $z$ is the update gate. After the gating signals are obtained, the reset gate is first applied to obtain the reset data $h_{t-1}' = h_{t-1} \odot r$; then $h_{t-1}'$ is concatenated with the input $x_t$, and a tanh activation function scales the data to the range $-1$ to $1$, giving the candidate state $h'$. The forgetting and memorizing steps are then carried out simultaneously using the previously obtained update gate $z$, which gives the final expression $h_t = (1 - z) \odot h_{t-1} + z \odot h'$.
The result output by the fully connected layer is input into a softmax regression layer, and the softmax classifier outputs the classification probability of each stream; the label with the highest probability is the classification result of the hierarchical network for that stream; the loss function used in the model is the mean-square loss function, and training uses the Adam optimizer, which performs gradient descent using adaptive moment estimation.
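The output stage described above can be illustrated with a small sketch. It assumes that the mean-square loss is computed between the softmax probabilities and a one-hot encoding of the true label, which the text does not spell out; the function names are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return the index of the class with the highest probability."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


def mse_loss(probs, label):
    """Mean-square error against a one-hot target (assumed target encoding)."""
    target = [1.0 if k == label else 0.0 for k in range(len(probs))]
    return sum((p - t) ** 2 for p, t in zip(probs, target)) / len(probs)


scores = [0.1, 2.0, -1.0]  # illustrative outputs of the final dense layer
predicted_class = predict(scores)
loss = mse_loss(softmax(scores), label=1)
```

The loss is zero only when the predicted distribution puts all of its mass on the true class.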
Further, step (4) further comprises: comparing the model's predictions with the actual results of the test set and evaluating the prediction results against specific indexes, namely accuracy, precision, recall, F1-Measure and convergence rate.
The technical scheme provided by the invention has at least the following beneficial effects. Compared with the deep learning methods for network intrusion detection that are widely used at present, the method has the following advantages:
1. the acquired traffic data comes directly from the transmitted data packets, so data acquisition costs very little and the universality of the data sources is markedly increased;
2. the one-dimensional data packet data is innovatively converted into a two-dimensional image, which fully combines the different features in the data packet and yields feature combinations that better describe the type of the data packet;
3. the GRU network is used to describe the temporal relation among the data packets; given that only some of the first 10 data packets intercepted from a stream contain attack information, the way the GRU network relates earlier and later data packets captures this situation at the level of the underlying design and is closer to how data packets are actually transmitted;
4. after the two networks are combined hierarchically, the method improves markedly on the accuracy of traditional machine learning methods; according to the experimental results, the accuracy of classifying normal traffic and attack traffic reaches 99.92% and 99.77%, a prediction accuracy that traditional methods cannot achieve;
5. compared with the traditional machine learning method or a single-network deep learning method, the model overcomes the defect of long convergence time.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a block diagram of a network intrusion detection method based on a CNN-GRU hierarchical neural network according to an embodiment of the present invention;
FIG. 2 is a flow chart of converting network traffic data into images.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As shown in fig. 1, a specific implementation of a network intrusion detection method based on a CNN-GRU hierarchical neural network is as follows:
Network traffic is captured with Wireshark software to obtain network traffic data packets, namely the data packets to be classified; at this point, the content stored in the data packets is binary data.
The data packets to be classified are marked: normal traffic and attack traffic are marked as required, and if the attack traffic needs to be classified, the different types of attack traffic are marked separately. The traffic type labels are stored as numbers, starting from 0.
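A small sketch of this labeling scheme is shown below. The class names are placeholders chosen for illustration, not the labels used in the experiments, and the function names are hypothetical.

```python
def build_label_map(class_names):
    """Assign consecutive integer ids, starting from 0, in the order given."""
    return {name: idx for idx, name in enumerate(class_names)}


def to_binary(label_id, normal_id=0):
    """Collapse every attack class to 1 and keep normal traffic as 0."""
    return 0 if label_id == normal_id else 1


# Placeholder class names; normal traffic is listed first so that it gets id 0.
label_map = build_label_map(["normal", "attack_type_a", "attack_type_b"])
binary_label = to_binary(label_map["attack_type_b"])
```

The same numeric labels can serve both the multi-class task and, after collapsing, the normal-versus-attack task.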
The data packets to be classified are preprocessed through feature engineering: the captured network traffic is split into streams according to source IP address, source port and destination IP address, the splitting being performed with SplitCap software.
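The grouping rule can be illustrated in a few lines. This sketch only mirrors the three-field key named above and does not reproduce the behavior of the splitting tool; packets are represented as plain dictionaries with hypothetical field names, and the addresses are documentation-range examples.

```python
from collections import defaultdict


def split_streams(packets):
    """Group packets, in capture order, by (source IP, source port, destination IP)."""
    streams = defaultdict(list)
    for pkt in packets:
        key = (pkt["src_ip"], pkt["src_port"], pkt["dst_ip"])
        streams[key].append(pkt)
    return dict(streams)


# Hypothetical parsed packets; "data" stands in for the raw packet bytes.
packets = [
    {"src_ip": "192.0.2.1", "src_port": 5000, "dst_ip": "192.0.2.9", "data": b"a"},
    {"src_ip": "192.0.2.2", "src_port": 5000, "dst_ip": "192.0.2.9", "data": b"b"},
    {"src_ip": "192.0.2.1", "src_port": 5000, "dst_ip": "192.0.2.9", "data": b"c"},
]
streams = split_streams(packets)
```

Because packets are appended in capture order, each stream keeps the time ordering that the later GRU stage relies on.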
The data packets in all the streams are cleaned: the MAC source address, the MAC destination address and the network protocol type information are removed from each data packet, the first 160 bytes of data are extracted from each data packet, and data packets with fewer than 160 bytes are padded with zeros.
The length of 160 bytes balances two effects. A shorter length may capture too few features to characterize the data packet well. A longer length significantly increases the training time of the model and introduces too much zero padding, which obscures the features of the data packet.
Each data stream is cleaned: the first 10 data packets are extracted from each data stream, and when a data stream contains fewer than 10 data packets, all-zero 160-byte data packets are appended until there are 10 data packets.
The number of 10 data packets is chosen because most network attack behaviors generate little traffic data; selecting more data packets would not describe the process better and could instead mix different network attacks, or attack traffic and normal traffic, in one sample.
At this point the data of each stream is 160 × 10 = 1600 bytes; each byte is converted into a decimal value in the range 0-255, and the 1600-dimensional decimal data is converted into a 40 × 40 matrix.
The values in the 40 × 40 matrix are converted into gray levels to obtain, for each matrix, a single-channel gray-scale image of size 40 × 40, thereby obtaining all picture samples required for model training.
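Because every matrix entry is already an integer between 0 and 255, it can be used directly as an 8-bit gray level. The sketch below serializes such a matrix as a binary PGM (P5) image; the choice of PGM is an assumption made for illustration, since the text does not name an image format, and the function name is hypothetical.

```python
def matrix_to_pgm(matrix):
    """Encode a matrix of 0..255 integers as a binary PGM (P5) gray-scale image."""
    height = len(matrix)
    width = len(matrix[0])
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    pixels = bytes(value for row in matrix for value in row)  # row-major order
    return header + pixels


# Illustrative 40 x 40 matrix whose values cycle through 0..255.
matrix = [[(r * 40 + c) % 256 for c in range(40)] for r in range(40)]
image_bytes = matrix_to_pgm(matrix)
```

Writing `image_bytes` to a file with a `.pgm` extension gives a picture that common image viewers can open, which is convenient for inspecting individual samples.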
20% of all processed samples are taken as test samples and the rest as training samples; the training samples are fed into the model for training so that the model learns how to classify the samples.
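An 80/20 split of this kind can be sketched as follows. The text does not say how the 20% is selected, so the seeded random shuffle here is an assumption, and the function name is hypothetical.

```python
import random


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices with a fixed seed and hold out a fraction for testing."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(samples) * test_fraction)
    test = [samples[i] for i in indices[:n_test]]
    train = [samples[i] for i in indices[n_test:]]
    return train, test


samples = list(range(100))  # stand-ins for (image, label) pairs
train_set, test_set = train_test_split(samples)
```

Fixing the seed keeps the split reproducible, so that accuracy figures from different training runs are measured on the same held-out samples.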
The training samples enter the model. First, an improved LeNet-5 network extracts the spatial features of the original network traffic data using two convolution layers and two max pooling layers. These features describe the content of each data packet; converting the data packets into pictures brings together features that were originally far apart in the data packets, so that feature combinations useful for classification are learned more easily.
The network has two layers in total: the first convolution layer uses 32 convolution kernels of size 5 × 5, followed by a max pooling operation, and the second uses 64 convolution kernels of size 3 × 3, followed by a max pooling operation. After each convolution operation, the CNN hidden layer is first transformed with the ReLU activation function and then processed with the max pooling operation, so that the original single-channel 40 × 40 picture is converted into an 8 × 8 picture with 64 channels. Flattening this output yields a 4096-dimensional vector, which is passed to the output layer of the CNN network. The output layer is a fully connected layer with 1600 neurons, so that the transformed output keeps the same dimensionality as the original data. After the features are extracted, some neurons of the fully connected layer are randomly deactivated (dropout) to avoid overfitting.
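The random deactivation of neurons can be illustrated with the common "inverted dropout" formulation. The text does not give a dropout rate or say which variant is used, so both the rate of 0.5 and the rescaling of the surviving activations are assumptions; the function name is hypothetical.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Zero each activation with probability `rate`; rescale survivors by 1 / (1 - rate)."""
    if not training or rate == 0.0:
        return list(activations)  # dropout is disabled at prediction time
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)
dense_output = [1.0] * 1600  # stand-in for the 1600-neuron layer output
dropped = dropout(dense_output, rate=0.5, rng=rng)
```

Rescaling during training keeps the expected activation unchanged, so the layer can be used without modification when the trained model classifies new traffic.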
The convolution operation performs a sliding convolution over a picture of size $n \times n$ using a convolution kernel $\omega$ of size $f \times f$, and each sliding step produces a new feature. Let $X$ be the input of the convolution, $b$ the bias term, $c_i$ the new feature produced by the convolution at layer $i$, and $\sigma_r$ the ReLU activation function. The new feature obtained by the convolution operation is then $c_i = \sigma_r(\omega * X_i + b_i)$. After the convolution operation, the $n \times n$ picture yields a feature map $c$ of size $(n - f + 1) \times (n - f + 1)$; the size is determined by the sliding window of the $f \times f$ convolution kernel. After convolution, max pooling is applied to the feature map $c$, and the maximum value in the selected window is taken as the final feature. The final feature map size is $(n - f + 1)/2$ along each side.
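A single-channel version of this convolution and pooling step can be written out directly. The sketch assumes stride 1, no padding and a 2 × 2 pooling window, and it computes the sliding product without flipping the kernel, as deep learning libraries usually do; the function names are hypothetical.

```python
def relu(x):
    return max(0.0, x)


def conv2d_valid(image, kernel, bias=0.0):
    """Slide an f x f kernel over an n x n image; the output side is n - f + 1."""
    n, f = len(image), len(kernel)
    out = n - f + 1
    return [[relu(sum(kernel[a][b] * image[i + a][j + b]
                      for a in range(f) for b in range(f)) + bias)
             for j in range(out)]
            for i in range(out)]


def max_pool_2x2(fmap):
    """Keep the maximum of each non-overlapping 2 x 2 window."""
    m = len(fmap) // 2
    return [[max(fmap[2 * i][2 * j], fmap[2 * i][2 * j + 1],
                 fmap[2 * i + 1][2 * j], fmap[2 * i + 1][2 * j + 1])
             for j in range(m)]
            for i in range(m)]


image = [[float(i * 6 + j) for j in range(6)] for i in range(6)]  # n = 6
kernel = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]      # f = 3
feature_map = conv2d_valid(image, kernel)  # side 6 - 3 + 1 = 4
pooled = max_pool_2x2(feature_map)         # side 4 / 2 = 2
```

With n = 6 and f = 3 the sides are 4 after convolution and 2 after pooling, in line with the $(n - f + 1)$ and $(n - f + 1)/2$ expressions above.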
The second level of the model is a two-layer GRU network that extracts the temporal features of the original network traffic data. These features describe the timestamp order of the data packets within the same stream, which matches how data packets are actually transmitted, so the network stream is characterized more comprehensively for distinguishing the types of data packets.
Each layer of the network contains 256 GRU units, and the activation function of each layer is a sigmoid function for the nonlinear operation. The last layer of the GRU network is a fully connected layer whose number of neurons equals the number of traffic classes.
The GRU network obtains two gating states from the state $h_{t-1}$ passed on from the previous node and the input $x_t$ of the current node, where $r$ is the reset gate and $z$ is the update gate. After the gating signals are obtained, the reset gate is first applied to obtain the reset data $h_{t-1}' = h_{t-1} \odot r$; then $h_{t-1}'$ is concatenated with the input $x_t$, and a tanh activation function scales the data to the range $-1$ to $1$, giving the candidate state $h'$. The forgetting and memorizing steps are then carried out simultaneously using the previously obtained update gate $z$, which gives the final expression $h_t = (1 - z) \odot h_{t-1} + z \odot h'$.
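One time step of this gated update can be sketched in plain Python. The weight matrices `W_r`, `W_z` and `W_h` are hypothetical parameters introduced for illustration, bias terms are omitted for brevity, and the gates are computed with a sigmoid over the concatenated state and input, which is the usual GRU formulation rather than something the text states explicitly.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gru_step(x_t, h_prev, W_r, W_z, W_h):
    """Compute h_t from x_t and h_{t-1} using the update rule in the text."""
    concat = h_prev + x_t                                   # [h_{t-1}, x_t]
    r = [sigmoid(a) for a in matvec(W_r, concat)]           # reset gate
    z = [sigmoid(a) for a in matvec(W_z, concat)]           # update gate
    h_reset = [h * g for h, g in zip(h_prev, r)]            # h'_{t-1} = h_{t-1} * r
    h_cand = [math.tanh(a) for a in matvec(W_h, h_reset + x_t)]  # candidate in (-1, 1)
    return [(1 - zi) * hp + zi * hc                         # h_t = (1-z)*h_{t-1} + z*h'
            for zi, hp, hc in zip(z, h_prev, h_cand)]


zeros = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # 2 hidden units, 1 input feature
h_t = gru_step([1.0], [0.4, -0.2], zeros, zeros, zeros)
```

With all-zero weights both gates equal 0.5 and the candidate state is zero, so the new state is simply half of the previous one, which makes the blending role of $z$ easy to see.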
The result output by the fully connected layer is input to the softmax regression layer, and the softmax classifier outputs the classification probability of each stream. The label with the highest probability is the classification result of the hierarchical network for that stream. The loss function used in the model is the mean-square loss function, and training uses the Adam optimizer, which performs gradient descent using adaptive moment estimation.
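For reference, a single Adam update for one scalar parameter looks as follows. The hyperparameter values are the commonly used defaults and are assumptions here, since the text does not state the learning rate or decay rates used in training; the function name is hypothetical.

```python
import math


def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad         # first moment: running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad  # second moment: running mean of squares
    m_hat = m / (1 - beta1 ** t)               # bias correction for the zero start
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Minimise f(x) = x^2 (gradient 2x) starting from x = 1.0.
x, m, v = 1.0, 0.0, 0.0
for step in range(1, 201):
    x, m, v = adam_step(x, 2.0 * x, m, v, step, lr=0.01)
```

Dividing by the square root of the second moment gives each parameter its own effective step size, which is the "adaptive moment estimation" the text refers to.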
After the model training is finished, the data of the test set is passed into the model; the model predicts the input data according to the parameters obtained by training and classifies unknown network traffic to judge whether it is attack traffic, or which type of traffic it is.
The model's predictions are compared with the actual results of the test set, and the prediction results are evaluated against specific indexes: Accuracy, Precision, Recall, F1-Measure and Convergence speed.
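The first four indexes can be computed from the predicted and actual labels as sketched below for the two-class case, with attack traffic treated as the positive class. The label vectors are made-up values for illustration only, not experimental results, and the function name is hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


y_true = [1, 1, 1, 0, 0, 0, 0, 0]  # illustrative ground truth (1 = attack)
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]  # illustrative model output
metrics = binary_metrics(y_true, y_pred)
```

For the multi-class task the same counts can be taken per attack type, treating each class in turn as the positive class.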
The specific embodiment is as follows:
The model is tested using the CICIDS2017 data set, which has the advantage of richer traffic types and a relatively recent release date, making it more consistent with current network practice. The data set comes from attack scenarios designed by researchers. All data collected on the first day is normal network traffic. Over the next four days, the network is under attack and the traffic information is recorded. The final result is stored in PCAP files, which include all traffic, marked as normal network traffic or as one of various network attacks. For the reliability of the training result, the first ten types of attack traffic and normal traffic are selected as our training set and test set, ensuring that each type contains at least two thousand traffic records. Since the labels given in the data set do not meet the actual requirements, we re-label the traffic data to meet the training requirements. After processing, the number and proportion of the network streams are shown in the following table:
Table 1: Number and proportion of network flows [table image not reproduced]
To make the test of the model more complete, we evaluate the model both when it only separates normal traffic from attack traffic and when it classifies each type of attack traffic, with twenty thousand training iterations:
Table 2: Results of model classification [table image not reproduced]
As can be seen from the table, the accuracy of the model prediction exceeds 99.5% in any classification mode, and the model has very high training precision.
The example can show that the method can effectively realize the accurate classification of the network intrusion detection flow and realize the network intrusion detection.
It should be understood that the specific order or hierarchy of steps in the processes disclosed is an example of exemplary approaches. Based upon design preferences, it is understood that the specific order or hierarchy of steps in the processes may be rearranged without departing from the scope of the present disclosure. The accompanying method claims present elements of the various steps in a sample order, and are not intended to be limited to the specific order or hierarchy presented.
In the foregoing detailed description, various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments of the subject matter require more features than are expressly recited in each claim. Rather, as the following claims reflect, invention lies in less than all features of a single disclosed embodiment. Thus, the following claims are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate preferred embodiment of the invention.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. Of course, the processor and the storage medium may reside as discrete components in a user terminal.
For a software implementation, the techniques described herein may be implemented with modules (e.g., procedures, functions, and so on) that perform the functions described herein. The software codes may be stored in memory units and executed by processors. The memory unit may be implemented within the processor or external to the processor, in which case it can be communicatively coupled to the processor via various means as is known in the art.
What has been described above includes examples of one or more embodiments. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the aforementioned embodiments, but one of ordinary skill in the art may recognize that many further combinations and permutations of various embodiments are possible. Accordingly, the embodiments described herein are intended to embrace all such alterations, modifications and variations that fall within the scope of the appended claims. Furthermore, to the extent that the term "includes" is used in either the detailed description or the claims, such term is intended to be inclusive in a manner similar to the term "comprising" as "comprising" is interpreted when employed as a transitional word in a claim. Furthermore, any use of the term "or" in the specification or the claims is intended to mean a "non-exclusive or".

Claims (4)

1. A network intrusion detection method based on a CNN-GRU hierarchical neural network, characterized by comprising the following steps:
(1) capturing network traffic with Wireshark software to obtain network traffic data packets, namely the data packets to be classified;
(2) labeling the data packets to be classified and, at the same time, preprocessing them through feature engineering to remove invalid content from the data packets; cleaning the data packets in all flows, and cleaning each data flow; then parsing the data packets into decimal data and converting the decimal data into a 40 × 40 single-channel grayscale image; obtaining all image samples required for model training, thereby obtaining the complete sample set for the CNN-GRU hierarchical neural network;
(3) dividing the complete sample set into a training set and a test set in a suitable proportion; based on the CNN-GRU hierarchical neural network algorithm, taking the single-channel grayscale image matrix as the input vector, and building a CNN-GRU hierarchical neural network classification model from the training set, so that the model learns how to classify samples;
(4) after model training is finished, feeding the test set data into the model, which predicts on the input data according to the trained parameters and classifies unknown network traffic to determine whether it is attack traffic, or which type of traffic it is;
the specific process of step (2) comprises the following steps:
(2.1) labeling the data packets to be classified, marking normal traffic and attack traffic as required; if attack traffic needs to be further classified, labeling the different types of attack traffic separately, wherein the traffic type labels are stored as numbers starting from 0;
(2.2) preprocessing the data packets to be classified through feature engineering: splitting the captured network traffic into flows according to source IP address, source port and destination IP address, the splitting being performed with SplitCap software;
(2.3) cleaning the data packets in all flows: removing the source MAC address, the destination MAC address and the network protocol type information from each data packet, extracting the first 160 bytes of each data packet, and padding data packets shorter than 160 bytes with 0;
(2.4) cleaning each data flow: extracting the first 10 data packets of each data flow and, where a data flow contains fewer than 10 data packets, padding with 160-byte all-zero data packets until there are 10 data packets;
(2.5) at this point the data in each flow is 160 × 10 = 1600 bytes; converting each byte into decimal to obtain a value in the range 0-255, and converting the 1600-dimensional decimal data into 40 × 40 matrix data;
(2.6) converting the values in the 40 × 40 matrix data into gray levels to obtain, for each matrix, a single-channel grayscale image of size 40 × 40, thereby obtaining all image samples required for model training;
the specific process of step (3) comprises the following steps:
(3.1) first, feeding the data into an improved LeNet-5 network, and extracting the spatial features of the original network traffic data with two convolutional layers and two max pooling layers; using 32 convolution kernels of size 5 × 5 in the first convolutional layer, followed by a max pooling operation, and 64 convolution kernels of size 3 × 3 in the second convolutional layer, followed by a max pooling operation; after each convolution operation, the CNN hidden layer is first transformed with a ReLU activation function and then processed with a max pooling operation, converting the original single-channel 40 × 40 image into a 64-channel 8 × 8 image; flattening this gives a 4096-dimensional vector, which is passed to the output layer of the CNN network; the output layer is a fully connected layer with 1600 neurons, so that the transformed data keeps the same dimensionality as the original data; after the fully connected layer, some neurons are randomly deactivated (dropout) to avoid overfitting;
(3.2) then, automatically extracting the temporal features of the original flow data with a GRU network, wherein the GRU network extracts the temporal features with a two-layer structure; each layer of the GRU comprises 256 GRU units, and the activation function of each layer performs the nonlinear operation with a sigmoid function; the last layer of the GRU network is a fully connected layer, and the number of neurons in the fully connected layer equals the number of traffic classes;
(3.3) training with the training set to obtain the network intrusion detection model;
in the convolution operation of step (3.1), a sliding convolution is performed on an image of size $n \times n$ with a convolution kernel $\omega$ of size $f \times f$, wherein each sliding convolution generates a new feature; let $X$ be the input of the convolution, $b$ the bias term, $c_i$ the new feature generated by convolution at layer $i$, and $\sigma_r$ the ReLU activation function; the new feature obtained by the convolution operation is then $c_i = \sigma_r(\omega * X_i + b_i)$; after the convolution operation, the $n \times n$ feature map yields a feature map $c$ of size $(n-f+1) \times (n-f+1)$, the size being determined by sliding a window with a convolution kernel of size $f$; after convolution, max pooling is performed on the feature map $c$, taking the maximum value in the selected window as the final feature; the final feature map size is $[(n-f+1) \times (n-f+1)]/2$.
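The size arithmetic in step (3.1) can be checked with a short sketch. It assumes stride-1 convolution without padding (so each side shrinks from n to n - f + 1) and 2 × 2 max pooling with stride 2 (so each side is halved); the claim does not state the stride, padding, or pooling window, and the function names are illustrative only.

```python
def conv_out(n: int, f: int) -> int:
    """Side length after an f x f convolution, assuming stride 1 and no padding."""
    return n - f + 1


def pool_out(n: int, window: int = 2) -> int:
    """Side length after max pooling, assuming a square window with equal stride."""
    return n // window


def trace_cnn_shapes(n: int = 40) -> tuple:
    """Follow a single-channel n x n image through the two conv/pool stages."""
    side = conv_out(n, 5)      # first layer: 5 x 5 kernels
    side = pool_out(side)      # first max pooling
    side = conv_out(side, 3)   # second layer: 3 x 3 kernels
    side = pool_out(side)      # second max pooling
    channels = 64              # number of kernels in the second layer
    return side, side * side * channels


final_side, flat_dim = trace_cnn_shapes(40)
print(final_side, flat_dim)
```

Under these assumptions the 40 × 40 input goes 40, 36, 18, 16, 8 per side, which agrees with the 64-channel 8 × 8 output and the 4096-dimensional flattened vector stated in step (3.1).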
2. The network intrusion detection method based on a CNN-GRU hierarchical neural network according to claim 1, wherein the network traffic data packets captured in step (1) store binary data.
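The conversion of binary packet data into a grayscale matrix, as described in steps (2.3) to (2.6) of claim 1, can be sketched as follows. This is a minimal illustration, assuming the flow has already been split out and that the MAC addresses and protocol type field have already been stripped from each packet; the function and constant names are hypothetical.

```python
from typing import List

PACKET_LEN = 160        # bytes kept per packet (step 2.3)
PACKETS_PER_FLOW = 10   # packets kept per flow (step 2.4)
SIDE = 40               # 160 * 10 = 1600 = 40 * 40 (step 2.5)


def flow_to_image(packets: List[bytes]) -> List[List[int]]:
    """Turn one flow (a list of raw packets) into a 40 x 40 matrix of 0-255 values."""
    rows = []
    for pkt in packets[:PACKETS_PER_FLOW]:
        trimmed = pkt[:PACKET_LEN]
        # Zero-pad packets shorter than 160 bytes.
        rows.append(trimmed + bytes(PACKET_LEN - len(trimmed)))
    # Add all-zero packets until the flow has 10 packets.
    while len(rows) < PACKETS_PER_FLOW:
        rows.append(bytes(PACKET_LEN))
    # Iterating over bytes yields integers in 0..255, i.e. the gray levels.
    flat = list(b"".join(rows))
    return [flat[r * SIDE:(r + 1) * SIDE] for r in range(SIDE)]


# Example flow: one over-long packet and one very short packet.
image = flow_to_image([b"\x01" * 200, b"\xff" * 3])
print(len(image), len(image[0]))
```

Each packet occupies four consecutive rows of the matrix, so the second packet of a flow starts at row 4.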
3. The network intrusion detection method based on a CNN-GRU hierarchical neural network according to claim 1, wherein the GRU network in step (3.2) uses the state $h_{t-1}$ passed down from the previous node and the input $x_t$ of the current node to obtain two gating states, where $r$ is the reset gate and $z$ is the update gate; after the gating signals are obtained, the reset gate is first used to obtain the reset data $h'_{t-1} = h_{t-1} \odot r$; $h'_{t-1}$ is then concatenated with the input $x_t$, and the data is scaled to the range -1 to 1 by a tanh activation function to give $h'$; finally, the two steps of forgetting and memorizing are performed simultaneously using the previously obtained update gate $z$, giving the final expression $h_t = (1 - z) \odot h_{t-1} + z \odot h'$;
the output of the fully connected layer is fed into a softmax regression layer, and the softmax classifier outputs the classification probability of each traffic class; the label with the highest probability is the hierarchical network's classification result for the traffic; the loss function used in the model is the mean squared error loss function, and training uses the Adam optimizer, which performs gradient descent using adaptive moment estimation.
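A single GRU update and the softmax output described in claim 3 can be sketched in plain Python. This is an illustrative sketch only: bias terms are omitted, the weight matrices `w_r`, `w_z` and `w_h` are hypothetical names for parameters the claim does not spell out, and each is assumed to act on the concatenation of the hidden state and the input.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def matvec(w: Matrix, v: Vector) -> Vector:
    """Multiply matrix w (rows x cols) by vector v (cols)."""
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]


def gru_step(x: Vector, h_prev: Vector,
             w_r: Matrix, w_z: Matrix, w_h: Matrix) -> Vector:
    """One GRU update following h_t = (1 - z) * h_prev + z * h' (biases omitted)."""
    concat = h_prev + x                                   # [h_{t-1}, x_t]
    r = [sigmoid(a) for a in matvec(w_r, concat)]         # reset gate
    z = [sigmoid(a) for a in matvec(w_z, concat)]         # update gate
    h_reset = [hp * ri for hp, ri in zip(h_prev, r)]      # h'_{t-1} = h_{t-1} * r
    h_cand = [math.tanh(a) for a in matvec(w_h, h_reset + x)]  # h' in (-1, 1)
    return [(1.0 - zi) * hp + zi * hc
            for zi, hp, hc in zip(z, h_prev, h_cand)]


def softmax(logits: Vector) -> Vector:
    """Numerically stable softmax over class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: hidden size 2, input size 1, all-zero weights.
zeros = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
h_next = gru_step([1.0], [0.4, -0.2], zeros, zeros, zeros)
probs = softmax([0.0, 0.0])
print(h_next, probs)
```

With all-zero weights both gates equal 0.5 and the candidate state is 0, so the new state is simply half of the previous one, which makes the update rule easy to verify by hand.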
4. The network intrusion detection method based on a CNN-GRU hierarchical neural network according to claim 1, wherein step (4) further comprises: comparing the model's predictions with the actual results of the test set and evaluating specific metrics of the prediction results, the reference metrics comprising accuracy, precision, recall, F1-Measure and convergence speed.
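The accuracy, precision, recall and F1-Measure named in claim 4 can be computed from predicted and actual labels as in the sketch below. It assumes a binary setting in which label 1 marks attack traffic; the function name and the choice of positive label are illustrative, and convergence speed is not covered because it depends on the training loop.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels: 0 = normal traffic, 1 = attack traffic.
metrics = binary_metrics([0, 1, 1, 0, 1], [0, 1, 0, 0, 1])
print(metrics)
```

For multi-class attack labels, the same function can be called once per class with a different `positive` value and the results averaged.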

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011590155.7A CN112839034B (en) 2020-12-29 2020-12-29 Network intrusion detection method based on CNN-GRU hierarchical neural network

Publications (2)

Publication Number Publication Date
CN112839034A CN112839034A (en) 2021-05-25
CN112839034B true CN112839034B (en) 2022-08-05

Family

ID=75925146

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011590155.7A Active CN112839034B (en) 2020-12-29 2020-12-29 Network intrusion detection method based on CNN-GRU hierarchical neural network

Country Status (1)

Country Link
CN (1) CN112839034B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113364787B (en) * 2021-06-10 2023-08-01 东南大学 Botnet flow detection method based on parallel neural network
CN113556328B (en) * 2021-06-30 2022-09-30 杭州电子科技大学 Encryption traffic classification method based on deep learning
CN113569992B (en) * 2021-08-26 2024-01-09 中国电子信息产业集团有限公司第六研究所 Abnormal data identification method and device, electronic equipment and storage medium
CN114157513B (en) * 2022-02-07 2022-09-13 南京理工大学 Vehicle networking intrusion detection method and equipment based on improved convolutional neural network
CN114760098A (en) * 2022-03-16 2022-07-15 南京邮电大学 CNN-GRU-based power grid false data injection detection method and device
CN114615172B (en) * 2022-03-22 2024-04-16 中国农业银行股份有限公司 Flow detection method and system, storage medium and electronic equipment
CN115001781B (en) * 2022-05-25 2023-05-26 国网河南省电力公司信息通信公司 Terminal network state safety monitoring method
CN115037535B (en) * 2022-06-01 2023-07-07 上海磐御网络科技有限公司 Intelligent recognition method for network attack behaviors
CN115102773A (en) * 2022-06-29 2022-09-23 苏州浪潮智能科技有限公司 Smuggling attack detection method, system, equipment and readable storage medium
CN115277258B (en) * 2022-09-27 2022-12-20 广东财经大学 Network attack detection method and system based on temporal-spatial feature fusion
CN115865486B (en) * 2022-11-30 2024-04-09 山东大学 Network intrusion detection method and system based on multi-layer perception convolutional neural network
CN115865534B (en) * 2023-02-27 2023-05-12 深圳大学 Malicious encryption-based traffic detection method, system, device and medium
CN117640254A (en) * 2024-01-25 2024-03-01 浙江大学 Industrial control network intrusion detection method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109034264A (en) * 2018-08-15 2018-12-18 云南大学 Traffic accident seriousness predicts CSP-CNN model and its modeling method
CN109086878A (en) * 2018-10-19 2018-12-25 电子科技大学 Keep the convolutional neural networks model and its training method of rotational invariance
CN110351244A (en) * 2019-06-11 2019-10-18 山东大学 A kind of network inbreak detection method and system based on multireel product neural network fusion

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9904893B2 (en) * 2013-04-02 2018-02-27 Patternex, Inc. Method and system for training a big data machine to defend
CN109117634B (en) * 2018-09-05 2020-10-23 济南大学 Malicious software detection method and system based on network traffic multi-view fusion
CN110619049A (en) * 2019-09-25 2019-12-27 北京工业大学 Message anomaly detection method based on deep learning
CN110597240B (en) * 2019-10-24 2021-03-30 福州大学 Hydroelectric generating set fault diagnosis method based on deep learning
CN111064678A (en) * 2019-11-26 2020-04-24 西安电子科技大学 Network traffic classification method based on lightweight convolutional neural network
CN111371806B (en) * 2020-03-18 2021-05-25 北京邮电大学 Web attack detection method and device
CN111683108B (en) * 2020-08-17 2020-11-17 鹏城实验室 Method for generating network flow anomaly detection model and computer equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
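Research on the construction and optimization of an IoT intrusion detection classification model based on the fusion of ResNet and bidirectional LSTM; Chen Hongsong (陈红松) et al.; Journal of Hunan University (Natural Sciences); 2020-08-25 (No. 08); full text *
Research on intrusion detection based on deep learning; Zhang Lulu (张露璐) et al.; Information and Computer (Theoretical Edition); 2019-06-15 (No. 11); full text *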

Similar Documents

Publication Publication Date Title
CN112839034B (en) Network intrusion detection method based on CNN-GRU hierarchical neural network
CN112398779B (en) Network traffic data analysis method and system
CN112235264B (en) Network traffic identification method and device based on deep migration learning
CN112003870A (en) Network encryption traffic identification method and device based on deep learning
EP3355547A1 (en) Method and system for learning representations of network flow traffic
CN111866024B (en) Network encryption traffic identification method and device
CN108965001B (en) Method and device for evaluating vehicle message data model
WO2022227388A1 (en) Log anomaly detection model training method, apparatus and device
CN112165484B (en) Network encryption traffic identification method and device based on deep learning and side channel analysis
CN110751222A (en) Online encrypted traffic classification method based on CNN and LSTM
WO2020238353A1 (en) Data processing method and apparatus, storage medium, and electronic apparatus
Dao et al. Stacked autoencoder-based probabilistic feature extraction for on-device network intrusion detection
CN110912908B (en) Network protocol anomaly detection method and device, computer equipment and storage medium
CN112688946B (en) Method, module, storage medium, device and system for constructing abnormality detection features
CN110738264A (en) Abnormal sample screening, cleaning and training method, device, equipment and storage medium
CN114386514A (en) Unknown flow data identification method and device based on dynamic network environment
CN113660196A (en) Network traffic intrusion detection method and device based on deep learning
Bowen et al. BLoCNet: a hybrid, dataset-independent intrusion detection system using deep learning
CN114826681A (en) DGA domain name detection method, system, medium, equipment and terminal
CN114372536A (en) Unknown network flow data identification method and device, computer equipment and storage medium
CN115630298A (en) Network flow abnormity detection method and system based on self-attention mechanism
CN115314239A (en) Analysis method and related equipment for hidden malicious behaviors based on multi-model fusion
CN111901324B (en) Method, device and storage medium for flow identification based on sequence entropy
CN113992419A (en) User abnormal behavior detection and processing system and method thereof
CN113852605A (en) Protocol format automatic inference method and system based on relational reasoning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant