CN112039906B - Cloud computing-oriented network flow anomaly detection system and method - Google Patents

Info

Publication number
CN112039906B
Authority
CN
China
Prior art keywords
flow
anomaly detection
cloud
traffic
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010916022.8A
Other languages
Chinese (zh)
Other versions
CN112039906A (en)
Inventor
莫毓昌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huaqiao University
Original Assignee
Huaqiao University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huaqiao University
Priority to CN202010916022.8A
Publication of CN112039906A
Application granted
Publication of CN112039906B
Legal status: Active
Anticipated expiration

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/20Network architectures or network communication protocols for network security for managing network security; network security policies in general

Abstract

The invention discloses a cloud computing-oriented network traffic anomaly detection system and method, wherein the detection system comprises a client, a cloud application server, a cloud platform, a cloud entry router and a traffic anomaly detection device; the detection method comprises the steps of collecting network flow, calculating flow distribution data and constructing a flow sample pool; constructing and training a flow anomaly detection convolutional neural network model by using the flow sample pool; detecting the flow in real time by using the trained flow anomaly detection convolutional neural network model; continuously updating the flow sample pool by using the real-time flow data; and retraining the traffic anomaly detection convolutional neural network model by using the updated traffic sample pool, and replacing the existing traffic anomaly detection convolutional neural network model. The advantages are that: based on the flow distribution convolution analysis, the correlation characteristics of the subdivided flow are fully considered, and the detection precision and efficiency are improved; by adopting a mode of automatically updating the sample cell and the model, the detection timeliness is improved.

Description

Cloud computing-oriented network flow anomaly detection system and method
Technical Field
The invention relates to the field of network traffic anomaly detection, in particular to a cloud computing-oriented network traffic anomaly detection system and method.
Background
The current network security situation is extremely severe, and reports of network attacks are increasing, posing serious security threats to enterprises, individuals and important government departments. An advanced persistent threat (APT) is highly targeted, well disguised and staged, and can therefore easily evade traditional detection techniques. New attack techniques emerge endlessly, so a general intrusion prevention system cannot match and identify them effectively. At the same time, every network attack is carried over the network, and the related data packets must be transmitted between the attacking host and the attacked host, which gives network security engineers a point of leverage. Analyzing and processing the traffic transmitted between networks is therefore one of the most effective ways to detect network attacks.
The network anomaly detection system needs strong timeliness, and can block network attack earlier if network anomaly is found earlier, so that loss caused by network attack can be well reduced and even avoided.
With the development of computer technology, research on algorithms such as machine learning and deep learning has matured, and these algorithms have been applied with great success in many fields. Deep learning can extract traffic features automatically, avoiding the limitation of relying on a small number of static features. Against this technical background and in the face of serious network security threats, a traffic anomaly detection method can process and analyze network traffic with deep learning, detect network traffic anomalies, and then take effective preventive measures.
Disclosure of Invention
The invention aims to provide a cloud computing-oriented network traffic anomaly detection system and method, so as to solve the problems in the prior art.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
a cloud computing-oriented network flow anomaly detection system comprises
A client; the client can be a single working computer or a proxy gateway, each client has a source IP, and a network data packet is sent to a certain cloud application server in the cloud platform through the source IP;
a cloud application server; each cloud application server provides a cloud application for a client, and each cloud application server has a target IP and carries out network communication with the client through the target IP;
a cloud platform; all cloud application servers operated by a certain cloud service provider form a cloud platform, and the cloud platform corresponds to a group of destination IPs;
a cloud entry router; receiving a network data packet sent by a client, and distributing the received network data packet to each cloud application server;
a flow abnormality detection device; receiving a network traffic mirror image sent by a cloud entry router, constructing a traffic anomaly detection convolutional neural network model according to accumulated historical traffic data, and performing real-time detection on network traffic anomaly by using the traffic anomaly detection convolutional neural network model;
the flow abnormality detection device comprises:
a flow acquisition module; used for receiving the network traffic mirror image sent by the cloud entry router, and obtaining traffic distribution data by accumulating the length information in the header of each network data packet;
a model generation module; training a convolutional neural network model by using flow samples in a flow sample pool to obtain a flow anomaly detection convolutional neural network model;
an anomaly detection implementation module; obtaining a detection result by applying the flow anomaly detection convolutional neural network model to the current flow distribution data;
a flow sample pool updating module; updating the flow sample pool by using the new flow sample, thereby ensuring the timeliness of the flow sample pool;
a model update module; and retraining the flow anomaly detection convolutional neural network model by using the updated flow sample pool, updating the model, and keeping the timeliness of the flow anomaly detection convolutional neural network model.
The invention also aims to provide a cloud computing-oriented network traffic anomaly detection method, which is realized by using the detection system of the claim 1; the detection method comprises the following steps of,
s1, collecting network flow, calculating flow distribution data, and constructing a flow sample pool;
s2, constructing and training a flow anomaly detection convolutional neural network model by using the flow sample pool;
s3, detecting the flow in real time by using the trained flow anomaly detection convolutional neural network model;
s4, continuously updating the flow sample pool by using the real-time flow data;
and S5, retraining the traffic anomaly detection convolutional neural network model by using the updated traffic sample pool, and replacing the existing traffic anomaly detection convolutional neural network model by using the retrained traffic anomaly detection convolutional neural network model.
Preferably, step S1 specifically includes the following steps,
s11, connecting the flow abnormity detection device to a network interface NIC of the cloud entry router;
s12, configuring a sniffing function for the network interface NIC;
s13, copying all network traffic passing through the cloud entry router and then sending the copied network traffic to the network interface NIC;
s14, constructing and initializing a flow distribution two-dimensional array LD by the flow abnormity detection device; each line of the flow distribution two-dimensional array LD corresponds to a source IP; each column corresponds to a destination IP; each element in the initialized flow distribution two-dimensional array LD is zero;
s15, the traffic anomaly detection device collects network traffic sent by a cloud entry router, and the network traffic comprises a plurality of network data packets;
s16, the flow abnormity detection device calculates flow distribution data according to each received network data packet; specifically, the header of each network data packet is parsed to obtain its source IP, denoted IPs, its destination IP, denoted IPd, and its length, denoted Len; the element (IPs, IPd) at row IPs and column IPd of the traffic distribution two-dimensional array LD holds the value LenSum, which is the accumulated length Len of all collected network packets whose source IP is IPs and whose destination IP is IPd;
s17, the flow anomaly detection device continuously analyzes the network flow for one minute to obtain a flow distribution two-dimensional array LD, and the flow distribution two-dimensional array LD is used as a flow sample and is placed in a flow sample pool;
s18, judging whether the preset conditions are met, if so, entering the step S2; if not, clearing the flow distribution two-dimensional array, and returning to the step S15; the preset condition is that the acquisition time of the network traffic transmitted by the cloud entry router acquired by the traffic anomaly detection device is not less than the acquisition time period D.
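Steps S14 to S16 can be illustrated with a minimal sketch. It assumes the packet headers have already been parsed into (source IP, destination IP, length) tuples; the function name `accumulate_ld`, the tuple format and the choice to skip unknown addresses are illustrative assumptions, not part of the patent text.

```python
def accumulate_ld(packets, source_ips, dest_ips):
    """Build one traffic distribution array LD from parsed packet headers.

    packets: iterable of (source_ip, dest_ip, length) tuples (hypothetical format).
    source_ips / dest_ips: ordered lists that fix the row and column layout of LD.
    """
    row_of = {ip: i for i, ip in enumerate(source_ips)}
    col_of = {ip: j for j, ip in enumerate(dest_ips)}
    # S14: every element of LD starts at zero.
    ld = [[0] * len(dest_ips) for _ in source_ips]
    # S15/S16: add each packet length Len to the (IPs, IPd) element.
    for ip_s, ip_d, length in packets:
        if ip_s in row_of and ip_d in col_of:  # assumption: skip unknown addresses
            ld[row_of[ip_s]][col_of[ip_d]] += length
    return ld


packets = [
    ("10.0.0.1", "192.0.2.10", 1500),
    ("10.0.0.1", "192.0.2.10", 400),
    ("10.0.0.2", "192.0.2.11", 60),
]
ld = accumulate_ld(packets, ["10.0.0.1", "10.0.0.2"], ["192.0.2.10", "192.0.2.11"])
print(ld)
```

In a real deployment the accumulation would run over one minute of mirrored traffic (S17) before the array is stored as a sample and cleared.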
Preferably, step S2 specifically includes the following steps,
s21, normalizing the flow samples in the flow sample pool;
s22, constructing flow sample categories, and sorting the categories of the flow samples according to the total flow amount of each category of flow samples;
s23, constructing a flow anomaly detection convolutional neural network model;
and S24, training the traffic anomaly detection convolutional neural network model.
Preferably, in step S21, min-max normalization is performed on the flow distribution two-dimensional array LD, that is, each array element a in the flow distribution two-dimensional array LD is linearly transformed so that the output result value is mapped into [0, 1]; the conversion function of the linear transformation is,
$$a' = \frac{a - \min}{\max - \min}$$
wherein max is the maximum value of all flow samples in the flow sample pool; min is the minimum value of all flow samples in the flow sample pool; and a' is the output result value.
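The normalization of step S21 can be sketched as follows, assuming the pool is held as a list of two-dimensional lists and that max and min are taken over every element of every sample. The function name `min_max_normalize` and the handling of a constant pool are assumptions made for illustration.

```python
def min_max_normalize(pool):
    """Map every element of every sample in the pool into [0, 1].

    pool: list of traffic samples, each a 2-D list of accumulated lengths.
    """
    values = [a for sample in pool for row in sample for a in row]
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Assumption: a pool with identical values everywhere maps to all zeros.
        return [[[0.0 for _ in row] for row in sample] for sample in pool]
    return [[[(a - lo) / span for a in row] for row in sample] for sample in pool]


pool = [
    [[0, 50], [100, 25]],
    [[200, 0], [150, 75]],
]
normalized = min_max_normalize(pool)
print(normalized[0])
```

The same min and max would have to be reused when normalizing a live sample in step S32, so that training and detection inputs share one scale.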
Preferably, in step S22, the flow samples in the flow sample pool are classified and combined in units of hours according to the time of flow collection, so as to obtain 24 classes, the number of the flow samples in each class is respectively counted, all array elements in all the flow samples in each class are added to obtain the total flow amount of the flow samples in each class, the total flow amount is sorted from large to small, and the classes of the flow samples are sorted.
Preferably, the flow anomaly detection convolutional neural network model comprises an input layer, two convolutional layers, two pooling layers and an output layer; wherein,
the input layer takes as input all flow samples in the flow sample pool, namely all flow distribution two-dimensional arrays LD;
the first convolutional layer extracts the features of the flow distribution two-dimensional array LD;
the first pooling layer uses the local correlation principle to reduce the amount of data to be processed by 2 × 2 neighborhood down-sampling, that is, the resulting feature data is 1/4 of its size before sampling;
the second convolutional layer extracts features from the flow feature data pooled by the first pooling layer;
the second pooling layer uses the local correlation principle to reduce the amount of data to be processed by 2 × 2 neighborhood down-sampling, that is, the resulting feature data is 1/4 of its size before sampling;
and the output layer maps the feature data pooled by the second pooling layer into the finally predicted traffic category.
Preferably, step S24 specifically includes the following steps,
a forward propagation phase; inputting any one flow sample from all flow distribution two-dimensional arrays LD in the flow sample pool into a flow anomaly detection convolutional neural network model to obtain a corresponding output result x;
a backward propagation phase; calculating the error between the output result x and the corresponding category y; judging whether the error is smaller than a set threshold ε; if yes, training is finished, the traffic anomaly detection convolutional neural network model and all its parameters are stored, and the trained traffic anomaly detection convolutional neural network model is obtained; if not, entering the next round of training; the error between the output result x and the corresponding category y is the sum of the squares of the differences x_i - y_i between corresponding elements.
Preferably, step S3 specifically includes the following steps,
s31, analyzing the network flow of one minute in real time by the flow abnormity detection device to obtain a flow distribution two-dimensional array LDC, taking the LDC as a current flow sample, and obtaining the category y of the flow sample according to the current time;
s32, normalizing the flow distribution two-dimensional array LDC, inputting the normalized flow distribution two-dimensional array LDC into a trained flow anomaly detection convolution neural network model, and obtaining a corresponding output result x;
s33, according to the category ordering, obtaining the rank position Y of the element whose value is 1 in the category y, and the rank position X of the element with the largest value in the output result x, and issuing flow anomaly detection alarms according to the following rules,
if X = Y, the detection result is that the flow is normal;
if Y ranks after X, i.e., the current flow is less than the model predicted flow, specifically,
when Y - X is 1 or 2, a light flow abnormal alarm is sent out;
when Y - X is 3 or 4, a medium flow abnormal alarm is sent out;
when Y - X is larger than 4, a heavy flow abnormal alarm is sent out;
if Y ranks before X, i.e., the current flow is larger than the model predicted flow, a suspected network attack alarm is further issued according to the network attack scenario, wherein the network attack scenarios comprise CC attack, scanning attack, APT attack and Trojan horse attack; specifically,
when X - Y is 1, a suspected APT attack alarm is sent out;
when X - Y is 2, a suspected Trojan horse attack alarm is sent out;
when X - Y is 3, a suspected scanning attack alarm is sent out;
and when X - Y is larger than 3, a suspected CC attack alarm is sent out.
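The alarm rules of step S33 amount to a lookup on the difference between the two rank positions. The sketch below is one possible encoding; the function name `classify_alarm` and the returned label strings are illustrative assumptions, while the thresholds are those stated in the rules above.

```python
def classify_alarm(rank_y, rank_x):
    """Return an alarm label from rank position Y (time-based category)
    and rank position X (category predicted by the model)."""
    if rank_x == rank_y:
        return "normal"
    if rank_y > rank_x:
        # Y ranks after X: graded traffic anomaly alarms.
        diff = rank_y - rank_x
        if diff <= 2:
            return "light traffic anomaly"
        if diff <= 4:
            return "medium traffic anomaly"
        return "heavy traffic anomaly"
    # Y ranks before X: suspected attack alarms.
    diff = rank_x - rank_y
    if diff == 1:
        return "suspected APT attack"
    if diff == 2:
        return "suspected Trojan horse attack"
    if diff == 3:
        return "suspected scanning attack"
    return "suspected CC attack"


for y_pos, x_pos in [(5, 5), (7, 5), (9, 5), (10, 5), (4, 5), (1, 5)]:
    print(y_pos, x_pos, classify_alarm(y_pos, x_pos))
```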
Preferably, in step S4, a moving queue is set in the flow sample pool, and the flow samples are arranged in the moving queue in sequence; the head of the moving queue is the oldest flow sample and the tail is the newest flow sample; when a new flow sample is added, the flow sample at the head of the moving queue is removed and the new flow sample is appended at the tail; a flow sample with a light abnormal alarm can be added to the moving queue as a normal flow sample; for a flow sample with a medium abnormal alarm, whether it can be added to the moving queue as a normal flow sample is determined according to the alarm processing feedback result; and a flow sample with a heavy abnormal alarm is forbidden from being added to the sample pool as a normal flow sample.
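The moving queue of step S4 behaves like a fixed-capacity FIFO with an admission check. The sketch below uses Python's `collections.deque` with `maxlen`, which drops the head automatically once the queue is full. The class name `SamplePool`, the alarm labels, and the decisions to admit normal samples and reject suspected-attack samples are assumptions; the text only states the rules for light, medium and heavy alarms.

```python
from collections import deque


class SamplePool:
    """Fixed-capacity moving queue: oldest sample at the head, newest at the tail."""

    def __init__(self, capacity):
        self.queue = deque(maxlen=capacity)

    def offer(self, sample, alarm, confirmed_normal=False):
        """Try to add a sample; return True if it was admitted.

        alarm: "normal", "light", "medium", "heavy" or any other label.
        confirmed_normal: alarm-handling feedback, used only for medium alarms.
        """
        if alarm in ("normal", "light"):
            admit = True
        elif alarm == "medium":
            admit = confirmed_normal
        else:
            # Heavy alarms are forbidden; other labels are rejected by assumption.
            admit = False
        if admit:
            self.queue.append(sample)  # a full deque discards its head here
        return admit


pool = SamplePool(capacity=3)
for name in ("a", "b", "c"):
    pool.offer(name, "normal")
pool.offer("d", "light")
pool.offer("e", "heavy")
pool.offer("f", "medium")
pool.offer("g", "medium", confirmed_normal=True)
print(list(pool.queue))
```

With D = 30 days of one-minute samples, the capacity would be 43200.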
The invention has the following beneficial effects: 1. A convolutional neural network is used to extract a large number of features automatically, which improves detection precision and efficiency. 2. The detection is based on convolutional analysis of the flow distribution and fully considers the correlation characteristics of the subdivided flows, which improves detection accuracy and efficiency. 3. The sample pool and the model are updated automatically, which improves detection timeliness. 4. The classification deviation of the flow prediction output is used to identify flow anomalies automatically, which greatly broadens the practicality and the applicable scenarios of anomaly detection.
Drawings
FIG. 1 is a schematic flow chart of a detection method in an embodiment of the present invention;
fig. 2 is a schematic diagram of a mobile queue in a traffic sample pool according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.
Example one
In this embodiment, a cloud computing-oriented network traffic anomaly detection system is provided, and the detection system includes
A client; the client can be a single working computer or a proxy gateway, each client has a source IP, and a network data packet is sent to a certain cloud application server in the cloud platform through the source IP;
a cloud application server; each cloud application server provides a cloud application for a client, and each cloud application server has a target IP and carries out network communication with the client through the target IP;
a cloud platform; all cloud application servers operated by a certain cloud service provider (such as Alibaba Cloud, Huawei, and the like) form a cloud platform, and the cloud platform corresponds to a group of destination IPs;
a cloud entry router; receiving a network data packet sent by a client, and distributing the received network data packet to each cloud application server;
a flow abnormality detection device; receiving a network traffic mirror image sent by a cloud entry router, constructing a traffic anomaly detection convolutional neural network model according to accumulated historical traffic data, and performing real-time detection on network traffic anomaly by using the traffic anomaly detection convolutional neural network model;
the flow abnormality detection device comprises:
a flow acquisition module; used for receiving the network traffic mirror image sent by the cloud entry router, and obtaining traffic distribution data by accumulating the length information in the header of each network data packet;
a model generation module; training a convolutional neural network model by using flow samples in a flow sample pool to obtain a flow anomaly detection convolutional neural network model;
an anomaly detection implementation module; obtaining a detection result by applying the flow anomaly detection convolutional neural network model to the current flow distribution data;
a flow sample pool updating module; updating the flow sample pool by using the new flow sample, thereby ensuring the timeliness of the flow sample pool;
a model update module; and retraining the flow anomaly detection convolutional neural network model by using the updated flow sample pool, updating the model, and keeping the timeliness of the flow anomaly detection convolutional neural network model.
Example two
As shown in fig. 1, the present embodiment provides a cloud computing-oriented network traffic anomaly detection method, which is implemented by using the detection system of claim 1; the detection method comprises the following steps of,
s1, collecting network flow, calculating flow distribution data, and constructing a flow sample pool;
s2, constructing and training a flow anomaly detection convolutional neural network model by using the flow sample pool;
s3, detecting the flow in real time by using the trained flow anomaly detection convolutional neural network model;
s4, continuously updating the flow sample pool by using the real-time flow data;
and S5, retraining the traffic anomaly detection convolutional neural network model by using the updated traffic sample pool, and replacing the existing traffic anomaly detection convolutional neural network model by using the retrained traffic anomaly detection convolutional neural network model.
In this embodiment, step S1 specifically includes the following steps,
s11, connecting the flow abnormity detection device to a network interface NIC of the cloud entry router;
s12, configuring a sniffing function for the network interface NIC;
s13, copying all network traffic passing through the cloud entry router and then sending the copied network traffic to the network interface NIC;
s14, constructing and initializing a flow distribution two-dimensional array LD by the flow abnormity detection device; each line of the flow distribution two-dimensional array LD corresponds to a source IP; each column corresponds to a destination IP; each element in the initialized flow distribution two-dimensional array LD is zero;
s15, the traffic anomaly detection device collects network traffic sent by a cloud entry router, and the network traffic comprises a plurality of network data packets;
s16, the flow abnormity detection device calculates flow distribution data according to each received network data packet; specifically, the header of each network data packet is parsed to obtain its source IP, denoted IPs, its destination IP, denoted IPd, and its length, denoted Len; the element (IPs, IPd) at row IPs and column IPd of the traffic distribution two-dimensional array LD holds the value LenSum, which is the accumulated length Len of all collected network packets whose source IP is IPs and whose destination IP is IPd;
s17, the flow anomaly detection device continuously analyzes the network flow for one minute to obtain a flow distribution two-dimensional array LD, and the flow distribution two-dimensional array LD is used as a flow sample and is placed in a flow sample pool;
s18, judging whether the preset conditions are met, if so, entering the step S2; if not, clearing the flow distribution two-dimensional array, and returning to the step S15; the preset condition is that the acquisition time of the network traffic transmitted by the cloud entry router acquired by the traffic anomaly detection device is not less than the acquisition time period D.
In step S14, if the number of clients is huge, which results in a huge number of source IPs, the grouping may be performed in a way of 32-bit binary IP address segmentation, for example, all the IP addresses with the same first 30-bit binary value are used as a group. By the method, the size of the flow distribution two-dimensional array LD can be compressed, so that the storage capacity of a subsequent flow sample pool is kept in a reasonable range, and the calculation amount of subsequent anomaly detection model training is reduced.
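The grouping by the first 30 bits described for step S14 corresponds to mapping each source address to its /30 network. A minimal sketch using the standard `ipaddress` module follows; the helper names `source_group` and `row_index` are hypothetical.

```python
import ipaddress


def source_group(ip, prefix_bits=30):
    """Return the group label of an IPv4 address: the network formed by
    its first `prefix_bits` bits (a /30 by default)."""
    network = ipaddress.ip_network(f"{ip}/{prefix_bits}", strict=False)
    return str(network)


def row_index(source_ips, prefix_bits=30):
    """Assign one LD row per group instead of one row per source IP."""
    groups = sorted({source_group(ip, prefix_bits) for ip in source_ips})
    return {group: i for i, group in enumerate(groups)}


ips = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.5"]
rows = row_index(ips)
print(rows)
```

Here four source addresses collapse into two rows. A shorter prefix would compress LD further at the cost of coarser per-source detail.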
In step S18, the collection period D may be set with reference to the stability of the network traffic; when the traffic varies over a wide range, D may take a smaller value. Typically, the collection period D may be set to 30 days; when D is 30, the number of samples in the flow sample pool is 30 × 24 × 60 = 43200.
In this embodiment, step S2 specifically includes the following steps,
s21, normalizing the flow samples in the flow sample pool;
s22, constructing flow sample categories, and sorting the categories of the flow samples according to the total flow amount of each category of flow samples;
s23, constructing a flow anomaly detection convolutional neural network model;
and S24, training the traffic anomaly detection convolutional neural network model.
In this embodiment, step S21 is specifically to perform min-max normalization on the flow distribution two-dimensional array LD, that is, to linearly transform each array element a in the flow distribution two-dimensional array LD so that the output result value is mapped into [0, 1]; the conversion function of the linear transformation is,
$$a' = \frac{a - \min}{\max - \min}$$
wherein max is the maximum value of all flow samples in the flow sample pool; min is the minimum value of all flow samples in the flow sample pool; and a' is the output result value.
In this embodiment, step S22 is specifically as follows: according to the time of traffic collection, the traffic samples in the traffic sample pool are classified and combined in units of hours, giving 24 classes (one for each of the 24 hours of the day); the number of traffic samples in each class is counted; all array elements in all the traffic samples of each class are added to obtain the total traffic of that class; and the classes are sorted by total traffic from large to small.
As before, the number of samples in the flow sample pool is 30 × 24 × 60 = 43200. Classifying the samples by hour gives 24 classes, with 30 × 60 = 1800 samples in each class. The total flow of each class is obtained by adding all elements of the flow arrays of its 30 × 60 samples. The classes are then sorted by total flow from large to small. For example, the obtained category ordering is <20,9,15,20,8, …,3,4>; this ordering states that, for the cloud platform, traffic is greatest at 8 pm (i.e., 20), followed by 9 am and 3 pm. The flow is smallest at 4 am and second smallest at 3 am.
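The hourly classification and ordering of step S22 can be sketched as below, assuming each sample is stored together with the hour (0 to 23) in which it was collected. The function name `rank_hours` and the toy data are illustrative; hours with equal totals keep ascending order, which is an arbitrary tie-breaking choice.

```python
def rank_hours(samples):
    """Return the 24 hour-classes ordered by total traffic, largest first.

    samples: iterable of (hour, ld) pairs, where ld is a 2-D list of lengths.
    """
    totals = [0] * 24
    for hour, ld in samples:
        totals[hour] += sum(sum(row) for row in ld)
    # Sorting on the negated total keeps ascending hour order for ties.
    return sorted(range(24), key=lambda h: -totals[h])


samples = [
    (20, [[600, 300]]),
    (9, [[500]]),
    (15, [[200], [100]]),
    (4, [[1]]),
]
ordering = rank_hours(samples)
# Rank position (1-based) of the 9 am class in the ordering.
position_of_9 = ordering.index(9) + 1
print(ordering[:4], position_of_9)
```

The 1-based position in this ordering is what the later detection step compares between the time-based category and the model output.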
In this embodiment, the constructed convolutional neural network (CNN) is a multilayer neural network; the flow anomaly detection convolutional neural network model comprises an input layer, two convolutional layers, two pooling layers and an output layer, and the specific network structure is as follows:
inputting all flow samples in the flow sample pool, namely all flow distribution two-dimensional arrays LD, by an input layer;
the first convolution layer C1 extracts the characteristics of the flow distribution two-dimensional array LD; consists of 6 convolution kernels of 5 × 5 size;
the first pooling layer S1 uses the local correlation principle to reduce the amount of data to be processed by 2 × 2 neighborhood down-sampling, that is, the resulting feature data is 1/4 of its size before sampling;
the second convolution layer C2 extracts the flow characteristic data after being pooled by the first pooling layer; consists of 16 convolution kernels of size 5 × 5;
the second pooling layer S2 uses the local correlation principle to reduce the amount of data to be processed by 2 × 2 neighborhood down-sampling, that is, the resulting feature data is 1/4 of its size before sampling;
and the output layer maps the characteristic data after being pooled by the second pooling layer into the finally predicted traffic category. The output layer consists of 24 neurons.
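To see how the layer sizes interact, the sketch below traces feature-map shapes through the structure just listed. The text does not state the stride or padding, so the sketch assumes stride-1 convolutions without padding and non-overlapping 2 × 2 pooling; the 32 × 32 input (32 source groups by 32 destination IPs) is likewise a hypothetical size.

```python
def conv_size(size, kernel=5):
    """Output size of a stride-1 convolution without padding (assumption)."""
    return size - kernel + 1


def pool_size(size, window=2):
    """Output size of non-overlapping 2 x 2 down-sampling."""
    return size // window


def feature_shapes(height, width):
    """Return (layer, channels, height, width) for each layer of the model."""
    shapes = [("input", 1, height, width)]
    h, w = conv_size(height), conv_size(width)
    shapes.append(("C1", 6, h, w))       # 6 kernels of size 5 x 5
    h, w = pool_size(h), pool_size(w)
    shapes.append(("S1", 6, h, w))       # 1/4 of the C1 data
    h, w = conv_size(h), conv_size(w)
    shapes.append(("C2", 16, h, w))      # 16 kernels of size 5 x 5
    h, w = pool_size(h), pool_size(w)
    shapes.append(("S2", 16, h, w))      # 1/4 of the C2 data
    shapes.append(("output", 24, 1, 1))  # one neuron per hour class
    return shapes


for layer in feature_shapes(32, 32):
    print(layer)
```

Under these assumptions LD needs at least 16 rows and 16 columns for every layer to produce a non-empty feature map.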
The activation function used by the convolutional layers is the ReLU function. Assuming that a convolutional layer obtains an input x before activation, the output after activation is ReLU(x) = max(x, 0).
The output layer calculates the probability distribution over the flow categories by Softmax regression. Assume that the output layer obtains inputs z1, z2, z3, …, z24; then
$$z_j' = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}$$
wherein K = 24; z_j (j = 1, 2, 3, …, 24) is one of the inputs, and z_j' is the output of z_j after Softmax regression.
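Both functions are short enough to write out directly. In the sketch below, subtracting the largest input before exponentiating is a common numerical-stability step that leaves the Softmax result unchanged; it is an implementation choice, not something stated in the text.

```python
import math


def relu(x):
    """ReLU activation: max(x, 0)."""
    return max(x, 0.0)


def softmax(z):
    """Softmax over a list of K inputs; the outputs are positive and sum to 1."""
    shift = max(z)  # stability shift; cancels out in the ratio
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


# 24 hypothetical output-layer inputs, one per hour class.
z = [1.0, 2.0, 3.0] + [0.0] * 21
probabilities = softmax(z)
predicted_class = probabilities.index(max(probabilities))
print(predicted_class, round(sum(probabilities), 6))
```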
In this embodiment, the convolutional neural network model is essentially an input-to-output mapping, which can learn a large number of input-to-output mapping relationships without any precise mathematical expression between the input and output, and the network has the capability of mapping between input-output pairs as long as the convolutional network is trained with a known pattern.
Before training starts, all parameters in the network model should be initialized with different small random numbers. "Small" ensures that the network does not enter a saturated state because of overly large weights, which would cause training to fail; "different" ensures that the network can learn normally.
Step S24 specifically includes the following steps (the training process is divided into two stages):
a forward propagation phase; from the 30 × 24 × 60 = 43200 flow distribution two-dimensional arrays LD in the flow sample pool, any flow sample is input into the flow anomaly detection convolutional neural network model, and the data is transformed layer by layer from the input layer to the output layer to obtain the corresponding output result x;
a backward propagation phase; calculating the error between the output result x and the corresponding category y; judging whether the error is smaller than a set threshold ε; if yes, training is finished, the traffic anomaly detection convolutional neural network model and all its parameters are stored, and the trained traffic anomaly detection convolutional neural network model is obtained; if not, entering the next round of training; the error between the output result x and the corresponding category y is the sum of the squares of the differences x_i - y_i between corresponding elements.
The specific execution process of the backward propagation stage is as follows:
(1) calculating the error between the output result x of the output layer and the corresponding category y; for example, the actually calculated output is
x = {0.008,0.77,0.012,0.011,0.009,0.01,0.01,0.01,0.01,0.012,0.001,0.017,0.001,0.019,0.01,0.01,0.01,0.011,0.009,0.01,0.01,0.01,0.015,0.005};
since the sample belongs to the second class in the category ordering, the category output is
y = {0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}; the error between the two is the sum of the squares of the differences xi - yi between corresponding elements;
(2) adjusting the parameter matrix of the output layer according to the traditional gradient descent method;
(3) calculating the errors of the convolution layers and the pooling layers according to the traditional back propagation method;
(4) adjusting the parameter matrices of the convolution layers and the pooling layers according to the traditional gradient descent method.
After each round of training, it is checked whether the error of the output layer is smaller than the set threshold ε (for example, ε = 0.5); if yes, training is finished and the model and all parameters are stored; if not, the next round of training is performed.
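The error computation and stopping check above can be worked through as a short sketch. It uses the example output x from the text and assumes a 24-element one-hot target y for the second class (K = 24); the names `squared_error` and `EPSILON` are illustrative.

```python
def squared_error(x, y):
    """Sum of squared differences between corresponding elements of x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))

# Example output from the text (24 values).
x = [0.008, 0.77, 0.012, 0.011, 0.009, 0.01, 0.01, 0.01, 0.01, 0.012, 0.001, 0.017,
     0.001, 0.019, 0.01, 0.01, 0.01, 0.011, 0.009, 0.01, 0.01, 0.01, 0.015, 0.005]

# One-hot target: the sample belongs to the second class in the category ordering.
y = [0.0] * 24
y[1] = 1.0

EPSILON = 0.5  # example threshold from the text
error = squared_error(x, y)      # approximately 0.0556
training_done = error < EPSILON  # True for this sample
```

For this single sample the error is dominated by the term (0.77 - 1)^2 = 0.0529, and it falls well below the example threshold of 0.5.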
In this embodiment, step S3 specifically includes the following steps:
S31, once the preset condition is met, the flow anomaly detection device analyzes one minute of network flow in real time to obtain a flow distribution two-dimensional array LDC, takes the LDC as the current flow sample, and obtains the category y of the flow sample according to the current time;
S32, performing on the flow distribution two-dimensional array LDC the same min-max normalization as described above, inputting the normalized LDC into the trained flow anomaly detection convolutional neural network model, and transmitting it from the input layer to the output layer through layer-by-layer transformation to obtain a corresponding output result x;
S33, according to the category ordering, obtaining the ordering position Y of the array element whose value is 1 in the category y and the ordering position X of the array element with the largest value in the output result x, and issuing flow anomaly detection alarms according to the following rules:
if X = Y, the detection result is that the flow is normal;
if Y ranks after X, i.e., the current flow is less than the model predicted flow, specifically,
when Y-X is 1 or 2, a light flow anomaly alarm is issued;
when Y-X is 3 or 4, a medium flow anomaly alarm is issued;
when Y-X is greater than 4, a heavy flow anomaly alarm is issued;
if Y ranks before X, i.e., the current flow is greater than the model predicted flow, a suspected network attack alarm is further issued according to different network attack scenarios, the network attack scenarios comprising CC attack, scanning attack, APT attack and Trojan attack; specifically,
when X-Y is 1, a suspected APT attack alarm is issued;
when X-Y is 2, a suspected Trojan attack alarm is issued;
when X-Y is 3, a suspected scanning attack alarm is issued;
and when X-Y is greater than 3, a suspected CC attack alarm is issued.
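The alarm rules of step S33 can be sketched as a small lookup. This sketch assumes that the vectors x and y are already indexed in category-sorted order and that positions are 1-based; the function names `rank_positions` and `classify_alarm` and the returned strings are illustrative, not from the patent.

```python
def rank_positions(x, y):
    """Return (X, Y): 1-based position of the largest element of x and of the 1 in y."""
    big_x = x.index(max(x)) + 1
    big_y = y.index(1) + 1
    return big_x, big_y

def classify_alarm(big_x, big_y):
    """Map the ordering positions X and Y to an alarm according to the S33 rules."""
    if big_x == big_y:
        return "normal"
    if big_y > big_x:  # Y ranks after X
        diff = big_y - big_x
        if diff <= 2:
            return "light flow anomaly"
        if diff <= 4:
            return "medium flow anomaly"
        return "heavy flow anomaly"
    diff = big_x - big_y  # Y ranks before X
    if diff == 1:
        return "suspected APT attack"
    if diff == 2:
        return "suspected Trojan attack"
    if diff == 3:
        return "suspected scanning attack"
    return "suspected CC attack"

# Hypothetical sample: the model output peaks at position 2, the time-based category is 5.
x = [0.1, 0.6, 0.1, 0.1, 0.1]
y = [0, 0, 0, 0, 1]
alarm = classify_alarm(*rank_positions(x, y))  # Y - X = 3
```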
As shown in fig. 2, in this embodiment, step S4 is performed once the preset condition is satisfied. As described above, the flow sample pool is constructed by setting the collection period D. For example, if the collection period D is 30 days, the number of samples in the flow sample pool is 30 × 24 × 60 = 43200. Because the detection system continuously collects network traffic in real time, the sample pool can be viewed as a moving queue, as in fig. 2. The head of the queue is the oldest traffic sample and the tail of the queue is the newest traffic sample. Each time a new traffic sample arrives, the sample at the head of the queue is removed and the new sample is added to the tail of the queue.
Step S4 is to set a moving queue in the flow sample pool, in which the flow samples are arranged in sequence; the head of the moving queue is the oldest flow sample and the tail of the moving queue is the newest flow sample; whenever a new flow sample is added, the flow sample at the head of the moving queue is removed and the new flow sample is added at the tail of the moving queue. It should be noted that not every sample can be added as a normal sample: a flow sample with a slight anomaly alarm can be added to the moving queue as a normal flow sample; for a flow sample with a moderate anomaly alarm, whether it can be added to the moving queue as a normal flow sample is decided according to the alarm handling feedback: if the operation and maintenance engineer confirms that the moderate alarm is a genuine anomaly, the flow sample is not added to the sample pool as a normal sample; otherwise, it can be added to the sample pool as a normal sample; and a flow sample with a severe anomaly alarm is forbidden from being added to the sample pool as a normal flow sample.
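The moving queue and its admission rules can be sketched with a bounded double-ended queue. This is only an illustration under stated assumptions: the class name `SamplePool`, the method `offer`, the alarm-level strings, and the `confirmed_abnormal` flag (standing in for the operation and maintenance engineer's feedback) are all hypothetical.

```python
from collections import deque

POOL_SIZE = 30 * 24 * 60  # 43200 one-minute samples for a 30-day collection period D

class SamplePool:
    """Moving queue of flow samples: oldest at the head, newest at the tail."""

    def __init__(self, maxlen=POOL_SIZE):
        # A deque with maxlen drops the head automatically when a sample is appended to a full queue.
        self.queue = deque(maxlen=maxlen)

    def offer(self, sample, alarm_level, confirmed_abnormal=False):
        """Add the sample if the admission rules allow it; return True if it was added."""
        if alarm_level == "severe":
            return False  # never admitted as a normal sample
        if alarm_level == "moderate" and confirmed_abnormal:
            return False  # engineer confirmed a genuine anomaly
        self.queue.append(sample)  # normal, slight, or unconfirmed moderate
        return True

# Usage with a tiny pool of size 3 so the eviction is visible.
pool = SamplePool(maxlen=3)
results = [
    pool.offer("a", "normal"),
    pool.offer("b", "slight"),
    pool.offer("c", "moderate", confirmed_abnormal=False),
    pool.offer("d", "severe"),
    pool.offer("e", "moderate", confirmed_abnormal=True),
    pool.offer("f", "normal"),  # evicts "a" from the head
]
```

Using a fixed `maxlen` keeps the pool at one collection period of samples without any explicit removal code.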
In this embodiment, step S5 is performed once the preset condition is satisfied. As described above, the flow sample pool is constructed by setting the collection period D. Therefore, under normal conditions, each time a collection period D elapses, most samples in the sample pool have been replaced; at that point, the new sample pool can be used to retrain the traffic anomaly detection convolutional neural network model, and the retrained model replaces the existing traffic anomaly detection convolutional neural network model.
A special case to be considered is the following:
when the daily confirmation rate of severe anomaly alarms is lower than 80%, or the daily confirmation rate of moderate anomaly alarms is lower than 50%, it can be considered that the network traffic of the cloud platform has changed substantially but legitimately because of a change in external conditions; for example, the access volume of the cloud platform drops sharply after an epidemic breaks out, or rises sharply during a large sales promotion. In this case, the existing traffic anomaly detection convolutional neural network model is distorted and needs to be updated immediately: the model is retrained using the partially updated sample pool before the collection period D has elapsed, and the retrained model replaces the existing traffic anomaly detection convolutional neural network model.
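The early-retraining trigger can be sketched as a simple threshold check. The function name `needs_immediate_retrain` and its parameters are hypothetical, and treating a day with no alarms of a given level as not triggering is an assumption, since the text does not define the confirmation rate in that case.

```python
SEVERE_MIN_RATE = 0.80    # threshold for severe alarms from the text
MODERATE_MIN_RATE = 0.50  # threshold for moderate alarms from the text

def needs_immediate_retrain(severe_confirmed, severe_total,
                            moderate_confirmed, moderate_total):
    """Return True if either daily confirmation rate falls below its threshold."""
    if severe_total > 0 and severe_confirmed / severe_total < SEVERE_MIN_RATE:
        return True
    if moderate_total > 0 and moderate_confirmed / moderate_total < MODERATE_MIN_RATE:
        return True
    return False

# Hypothetical day: 7 of 10 severe alarms confirmed (70% < 80%), so retrain early.
retrain_now = needs_immediate_retrain(7, 10, 6, 10)
```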
By adopting the technical scheme disclosed by the invention, the following beneficial effects are obtained:
The invention provides a cloud computing-oriented network flow anomaly detection system and method, with the following advantages over traditional flow anomaly detection methods.
Traditional methods usually rely on a single flow feature or a small number of flow features manually designed by experts; because so few features are used, anomaly detection is limited and the missed-report and false-report rates are high. This patent uses a convolutional neural network to automatically extract a large number of features, improving detection accuracy and efficiency.
Traditional methods generally apply multiple static thresholds to subdivided flows without considering the flow distribution or the correlation between subdivided flows, so abnormal flow caused by distributed cooperative attacks cannot be detected effectively. This patent is based on convolutional analysis of the flow distribution and fully considers the correlation between subdivided flows, improving detection accuracy and efficiency.
Traditional methods rarely update detection features and thresholds, or update them only manually, so the timeliness of anomaly detection is poor. This patent updates the sample pool and the model automatically, improving the timeliness of detection.
Traditional methods require data labeling, that is, abnormal data samples must be collected; in practice, labeling is costly, and labeled samples are unavailable in the initial stage of anomaly detection. This patent identifies flow anomalies automatically from the classification deviation of the flow prediction output, which greatly widens the practicability and applicable occasions of anomaly detection.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and improvements can be made without departing from the principle of the present invention, and such modifications and improvements should also be considered within the scope of the present invention.

Claims (10)

1. A cloud computing-oriented network traffic anomaly detection system, characterized by comprising:
a client; the client can be a single working computer or a proxy gateway, each client has a source IP, and network data packets are sent through the source IP to a cloud application server in the cloud platform;
a cloud application server; each cloud application server provides a cloud application for clients, and each cloud application server has a destination IP and carries out network communication with the clients through the destination IP;
a cloud platform; all cloud application servers operated by a cloud service provider form a cloud platform, and the cloud platform corresponds to a group of destination IPs;
a cloud entry router; receiving a network data packet sent by a client, and distributing the received network data packet to each cloud application server;
a flow abnormality detection device; receiving a network traffic mirror image sent by a cloud entry router, constructing a traffic anomaly detection convolutional neural network model according to accumulated historical traffic data, and performing real-time detection on network traffic anomaly by using the traffic anomaly detection convolutional neural network model;
the flow abnormality detecting device includes a flow abnormality detecting unit,
a flow acquisition module; used for receiving the network traffic mirror image sent by the cloud entry router, and obtaining traffic distribution data by accumulating the length information in the header of each network data packet;
a model generation module; training a convolutional neural network model by using flow samples in a flow sample pool to obtain a flow anomaly detection convolutional neural network model;
an anomaly detection implementation module; obtaining a detection result by using an anomaly detection implementation module for current flow distribution data;
a flow sample pool updating module; updating the flow sample pool by using the new flow sample, thereby ensuring the timeliness of the flow sample pool;
a model update module; and retraining the flow anomaly detection convolutional neural network model by using the updated flow sample pool, updating the model, and keeping the timeliness of the flow anomaly detection convolutional neural network model.
2. A cloud computing-oriented network flow anomaly detection method, characterized in that: the detection method is realized by using the detection system of claim 1; the detection method comprises the following steps,
S1, collecting network flow, calculating flow distribution data, and constructing a flow sample pool;
S2, constructing and training a flow anomaly detection convolutional neural network model by using the flow sample pool;
S3, detecting the flow in real time by using the trained flow anomaly detection convolutional neural network model;
S4, continuously updating the flow sample pool by using the real-time flow data;
and S5, retraining the traffic anomaly detection convolutional neural network model by using the updated traffic sample pool, and replacing the existing traffic anomaly detection convolutional neural network model with the retrained traffic anomaly detection convolutional neural network model.
3. The cloud-computing-oriented network traffic anomaly detection method according to claim 2, wherein: the step S1 specifically includes the following contents,
S11, connecting the flow anomaly detection device to a network interface NIC of the cloud entry router;
S12, configuring a sniffing function for the network interface NIC;
S13, copying all network traffic passing through the cloud entry router and then sending the copied network traffic to the network interface NIC;
S14, constructing and initializing a flow distribution two-dimensional array LD by the flow anomaly detection device; each row of the flow distribution two-dimensional array LD corresponds to a source IP; each column corresponds to a destination IP; each element in the initialized flow distribution two-dimensional array LD is zero;
S15, the flow anomaly detection device collects network traffic sent by the cloud entry router, the network traffic comprising a plurality of network data packets;
S16, the flow anomaly detection device calculates flow distribution data from each received network data packet; specifically, the header of each network data packet is parsed to obtain its source IP (IPs), destination IP (IPd), and length (Len); the element (IPs, IPd) at row IPs and column IPd of the flow distribution two-dimensional array LD has the value LenSum, which is the accumulated length Len of all collected network data packets whose source IP is IPs and whose destination IP is IPd;
S17, the flow anomaly detection device continuously analyzes the network flow for one minute to obtain a flow distribution two-dimensional array LD, and the flow distribution two-dimensional array LD is placed in the flow sample pool as a flow sample;
S18, judging whether the preset condition is met; if so, entering step S2; if not, clearing the flow distribution two-dimensional array and returning to step S15; the preset condition is that the time over which the flow anomaly detection device has collected network traffic sent by the cloud entry router is not less than the acquisition period D.
4. The cloud-computing-oriented network traffic anomaly detection method according to claim 3, wherein: the step S2 specifically includes the following contents,
S21, normalizing the flow samples in the flow sample pool;
S22, constructing flow sample categories, and sorting the categories of the flow samples according to the total flow amount of each category of flow samples;
S23, constructing a flow anomaly detection convolutional neural network model;
and S24, training the traffic anomaly detection convolutional neural network model.
5. The cloud-computing-oriented network traffic anomaly detection method according to claim 4, wherein: step S21 is specifically to perform min-max normalization on the flow distribution two-dimensional array LD, that is, perform linear transformation on each array element a in the flow distribution two-dimensional array LD, so that the output result value is mapped into [0,1]; the conversion function of the linear transformation is,
$$a' = \frac{a - \min}{\max - \min}$$
wherein max is the maximum value over all flow samples in the flow sample pool; min is the minimum value over all flow samples in the flow sample pool; and a' is the output result value.
6. The cloud-computing-oriented network traffic anomaly detection method according to claim 4, wherein: step S22 is specifically to group the flow samples in the flow sample pool by hour according to the time of flow collection, so as to obtain 24 categories, count the number of flow samples in each category, add up all array elements of all flow samples in each category to obtain the total flow amount of each category, and sort the categories by total flow amount from large to small.
7. The cloud-computing-oriented network traffic anomaly detection method according to claim 4, wherein: the flow anomaly detection convolutional neural network model comprises an input layer, two convolutional layers, two pooling layers and an output layer; wherein,
the input layer inputs all flow samples in the flow sample pool, namely all flow distribution two-dimensional arrays LD;
the first convolution layer extracts the features of a flow distribution two-dimensional array LD;
the first pooling layer reduces the amount of data to be processed by 2 × 2 neighborhood down-sampling, using the local correlation principle, so that the resulting feature data is 1/4 of that before sampling;
the second convolution layer extracts features from the flow feature data pooled by the first pooling layer;
the second pooling layer reduces the amount of data to be processed by 2 × 2 neighborhood down-sampling, using the local correlation principle, so that the resulting feature data is 1/4 of that before sampling;
and the output layer maps the feature data pooled by the second pooling layer into the finally predicted traffic category.
8. The cloud-computing-oriented network traffic anomaly detection method according to claim 4, wherein: the step S24 specifically includes the following contents,
a forward propagation phase; inputting any one flow sample from all flow distribution two-dimensional arrays LD in the flow sample pool into a flow anomaly detection convolutional neural network model to obtain a corresponding output result x;
a backward propagation phase; calculating the error between the output result x and the corresponding category y, and judging whether the error is smaller than a set threshold ε; if yes, training is finished, and the traffic anomaly detection convolutional neural network model and all its parameters are stored to obtain the trained traffic anomaly detection convolutional neural network model; if not, the next round of training is entered; the error between the output result x and the corresponding category y is the sum of the squares of the differences xi - yi between corresponding elements xi and yi.
9. The cloud-computing-oriented network traffic anomaly detection method according to claim 4, wherein: the step S3 specifically includes the following contents,
S31, the flow anomaly detection device analyzes one minute of network flow in real time to obtain a flow distribution two-dimensional array LDC, takes the LDC as the current flow sample, and obtains the category y of the flow sample according to the current time;
S32, normalizing the flow distribution two-dimensional array LDC, inputting the normalized flow distribution two-dimensional array LDC into the trained flow anomaly detection convolutional neural network model, and obtaining a corresponding output result x;
S33, according to the category ordering, obtaining the ordering position Y of the array element whose value is 1 in the category y and the ordering position X of the array element with the largest value in the output result x, and issuing flow anomaly detection alarms according to the following rules,
if X = Y, the detection result is that the flow is normal;
if Y ranks after X, i.e., the current flow is less than the model predicted flow, specifically,
when Y-X is 1 or 2, a light flow anomaly alarm is issued;
when Y-X is 3 or 4, a medium flow anomaly alarm is issued;
when Y-X is greater than 4, a heavy flow anomaly alarm is issued;
if Y ranks before X, i.e., the current flow is greater than the model predicted flow, a suspected network attack alarm is further issued according to different network attack scenarios, the network attack scenarios comprising CC attack, scanning attack, APT attack and Trojan attack; specifically,
when X-Y is 1, a suspected APT attack alarm is issued;
when X-Y is 2, a suspected Trojan attack alarm is issued;
when X-Y is 3, a suspected scanning attack alarm is issued;
and when X-Y is greater than 3, a suspected CC attack alarm is issued.
10. The cloud-computing-oriented network traffic anomaly detection method according to claim 9, wherein: step S4 specifically includes setting a moving queue in the flow sample pool, arranging the flow samples in the moving queue in sequence, the head of the moving queue being the oldest flow sample and the tail of the moving queue being the newest flow sample, and, whenever a new flow sample is added, removing the flow sample at the head of the moving queue and adding the new flow sample at the tail of the moving queue; a flow sample with a slight anomaly alarm can be added to the moving queue as a normal flow sample; for a flow sample with a moderate anomaly alarm, whether it can be added to the moving queue as a normal flow sample is determined according to the alarm handling feedback; and a flow sample with a severe anomaly alarm is forbidden from being added to the sample pool as a normal flow sample.
CN202010916022.8A 2020-09-03 2020-09-03 Cloud computing-oriented network flow anomaly detection system and method Active CN112039906B (en)

Publications (2)

Publication Number Publication Date
CN112039906A CN112039906A (en) 2020-12-04
CN112039906B true CN112039906B (en) 2022-03-18

Family

ID=73591862

Country Status (1)

Country Link
CN (1) CN112039906B (en)



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant