Summary of the Invention
Embodiments of the present invention provide a network traffic classification method and device based on hierarchical spatiotemporal feature learning, to address the difficulty of obtaining a feature set that accurately characterizes traffic behavior.
In a first aspect, an embodiment of the present invention provides a network traffic classification method based on hierarchical spatiotemporal feature learning, including:
obtaining spatial features of network traffic data by means of a first neural network;
obtaining temporal features of the network traffic data by means of a second neural network; and
classifying the network traffic according to the spatial features and the temporal features.
In a possible embodiment, the network traffic data is converted into data in two-dimensional image format.
In a possible embodiment, the network traffic data is split into multiple traffic data units; the number of traffic data units in each piece of network traffic data and the length of each traffic data unit are unified; and the traffic data units so unified in number and length are encoded to obtain the data in two-dimensional image format.
In a possible embodiment, the first neural network learns from the data in two-dimensional image format to obtain a data packet vector corresponding to each traffic data unit and a data packet vector sequence corresponding to the network traffic data; wherein the first neural network includes a convolutional neural network.
In a possible embodiment, the second neural network learns the temporal features of the network traffic data on the basis of the spatial features to obtain a network flow vector corresponding to the network traffic data; wherein the second neural network includes a recurrent neural network.
In a second aspect, an embodiment of the present invention provides a network traffic classification device based on hierarchical spatiotemporal feature learning, including:
a first acquisition module, configured to obtain spatial features of network traffic data by means of a first neural network;
a second acquisition module, configured to obtain temporal features of the network traffic data by means of a second neural network; and
a classification module, configured to classify the network traffic according to the spatial features and the temporal features.
In a possible embodiment, the device further includes a conversion module, configured to convert the network traffic data into data in two-dimensional image format.
In a possible embodiment, the conversion module is configured to split the network traffic data into multiple traffic data units; to unify the number of traffic data units in each piece of network traffic data and the length of each traffic data unit; and to encode the traffic data units so unified in number and length to obtain the data in two-dimensional image format.
In a possible embodiment, the first acquisition module is configured to use the first neural network to learn from the data in two-dimensional image format, obtaining a data packet vector corresponding to each traffic data unit and a data packet vector sequence corresponding to the network traffic data; wherein the first neural network includes a convolutional neural network.
In a possible embodiment, the second acquisition module is configured to use the second neural network to learn the temporal features of the network traffic data on the basis of the spatial features, obtaining a network flow vector corresponding to the network traffic data; wherein the second neural network includes a recurrent neural network.
The network traffic classification method based on hierarchical spatiotemporal feature learning provided by the embodiments of the present invention is carried out by a convolutional neural network and a recurrent neural network, which removes a large amount of feature engineering work. Drawing on the spatial feature learning ability of the convolutional neural network and the temporal feature learning ability of the recurrent neural network, the method first learns the lower-level spatial features of the network traffic and then, on that basis, learns its upper-level temporal features, so that more comprehensive and accurate traffic feature information is obtained and network traffic classification capability is effectively improved. Because a better traffic feature set helps to lower the false alarm rate, the feature set obtained with the two neural networks can effectively reduce the false alarm rate of network traffic anomaly detection or network intrusion detection.
Detailed Description of the Embodiments
To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
To aid understanding of the embodiments of the present invention, specific embodiments are further explained below with reference to the accompanying drawings; these embodiments do not limit the embodiments of the present invention.
Fig. 1 is a flowchart of a network traffic classification method based on hierarchical spatiotemporal feature learning provided by an embodiment of the present invention. As shown in Fig. 1, the method specifically includes:
S101: Convert the network traffic data into data in two-dimensional image format.
Specifically, the network traffic data is split into multiple traffic data units; the number of traffic data units in each piece of network traffic data and the length of each traffic data unit are unified; and the traffic data units so unified in number and length are encoded to obtain the data in two-dimensional image format.
Splitting the network traffic data into multiple traffic data units may include: splitting the traffic data into multiple traffic data units in the form of bidirectional network flows, where a bidirectional network flow is a unit of traffic data determined jointly by the five-tuple {source IP, source port, destination IP, destination port, transport-layer protocol} and the start time of transmission of its first data packet. Each data packet in a bidirectional network flow is required to contain the data of all protocol layers. After splitting, each traffic unit is labeled with its traffic class.
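The splitting step above can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the packet dictionaries and their field names (`ts`, `src_ip`, `src_port`, `dst_ip`, `dst_port`, `proto`, `raw`) are assumptions made for the example, packets are assumed to arrive sorted by timestamp, and flow termination (for instance a timeout after which the same five-tuple starts a new flow) is deliberately left out.

```python
def flow_key(pkt):
    """Direction-independent five-tuple key (hypothetical field names).

    Both directions of a conversation map to the same key, so the
    resulting flow is bidirectional.
    """
    a = (pkt["src_ip"], pkt["src_port"])
    b = (pkt["dst_ip"], pkt["dst_port"])
    lo, hi = sorted([a, b])          # order the two endpoints canonically
    return (lo, hi, pkt["proto"])


def split_flows(packets):
    """Group time-sorted packets into bidirectional flows.

    Returns a dict mapping (five-tuple key, start time of the first
    packet) to the list of raw packet bytes of that flow.
    """
    start_time = {}                  # five-tuple key -> time of first packet
    flows = {}
    for pkt in packets:
        key = flow_key(pkt)
        if key not in start_time:
            start_time[key] = pkt["ts"]
        flows.setdefault((key, start_time[key]), []).append(pkt["raw"])
    return flows
```

Each value in the returned dictionary is one traffic data unit; the traffic class label would be attached to it afterwards.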
Unifying the number of traffic data units in each piece of network traffic data and the length of each traffic data unit may include the following. The number of data packets in each network flow is unified to n: if the original network flow contains more than n data packets, the excess data packets are discarded; if it contains fewer than n, data packets with identical content are appended until there are n data packets. The length of each data packet is unified to m bytes: if the original data packet contains more than m bytes, the excess bytes are discarded; if it contains fewer than m, identical bytes are appended until the packet is m bytes long.
For example, the number of data packets in each network flow is unified to 6. If the original network flow contains more than 6 data packets, all data packets from the 7th onward are discarded; if it contains fewer than 6, data packets whose content is 0x00 are appended until there are 6 data packets. The length of each data packet is unified to 100 bytes. If the original data packet contains more than 100 bytes, all bytes from the 101st onward are discarded; if it contains fewer than 100, 0x00 bytes are appended until the packet is 100 bytes long.
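The truncate-or-pad rule above can be written compactly. The sketch below uses the example values n = 6, m = 100 and a 0x00 padding byte as defaults; the function name `unify_flow` is hypothetical.

```python
def unify_flow(packets, n=6, m=100, pad_byte=0x00):
    """Return exactly n packets of exactly m bytes each.

    packets: list of bytes objects, one per data packet, in arrival order.
    """
    fixed = []
    for raw in packets[:n]:                    # discard packets after the n-th
        raw = bytes(raw[:m])                   # discard bytes after the m-th
        fixed.append(raw + bytes([pad_byte]) * (m - len(raw)))
    padding_packet = bytes([pad_byte]) * m     # a packet made only of padding
    fixed.extend([padding_packet] * (n - len(fixed)))
    return fixed
```

After this step every flow has the same n x m shape, which is what allows it to be treated as a fixed-size image in the encoding step.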
Encoding the traffic data units unified in number and length to obtain the data in two-dimensional image format may include: encoding all bytes in the network flow, where the encoding is one-hot encoding, embedding encoding, or pixel encoding. One-hot encoding and embedding encoding each produce a byte vector of fixed dimension, and multiple such vectors make up a two-dimensional image; pixel encoding treats each byte as a grayscale pixel value and arranges the traffic bytes into a two-dimensional image.
S102: Obtain the spatial features of the network traffic data by means of the first neural network.
Specifically, the first neural network learns from the data in two-dimensional image format to obtain a data packet vector corresponding to each traffic data unit and a data packet vector sequence corresponding to the network traffic data; wherein the first neural network includes a convolutional neural network.
The convolutional neural network may be a one-dimensional or a two-dimensional convolutional neural network. To form the data packet vector, a single convolutional neural network may be used to generate the final data packet vector directly, or multiple convolutional neural networks with convolution kernels of different sizes may be used to generate multiple temporary data packet vectors, which are then concatenated into the final data packet vector. Between one convolutional layer and the next, 0 to n pooling layers may be added. The spatial features learned by the convolutional neural network are the spatial features of the two-dimensional traffic image.
For example:
S1021: Generate the temporary data packet vector v1. Convolution kernels of size 128 and 256 are used to learn the data packet vector, with one max pooling layer between successive convolutional layers and one fully connected layer after the last convolutional layer, finally yielding the temporary vector v1 of each data packet.
S1022: Generate the temporary data packet vector v2. Convolution kernels of size 192 and 320 are used to learn the data packet vector, with one max pooling layer between successive convolutional layers and one fully connected layer after the last convolutional layer, finally yielding the temporary vector v2 of each data packet.
S1023: Generate the data packet vector v. The two temporary data packet vectors are concatenated into the final data packet vector, that is, v = [v1, v2].
S1024: Generate the data packet vector sequence {v1, v2, …, vn}, where the subscripts here index the n data packets of the network flow rather than the temporary vectors above. The preceding three sub-steps are carried out for each data packet in the network flow, finally producing a data packet vector sequence that serves as the input to the recurrent neural network.
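The two-branch structure of S1021 to S1023 can be illustrated with a deliberately small, dependency-free sketch. It is not the trained network of the embodiment: it uses a single channel, hand-picked toy kernels supplied by the caller, and omits the fully connected layer that ends each branch. Each convolution is followed by max pooling, as in Table 1. All function names are hypothetical.

```python
import math


def conv1d_valid(x, kernel):
    """1D convolution (cross-correlation), stride 1, 'valid' padding, then tanh."""
    k = len(kernel)
    return [math.tanh(sum(x[i + j] * kernel[j] for j in range(k)))
            for i in range(len(x) - k + 1)]


def max_pool(x, size=2, stride=2):
    """Max pooling with 'valid' padding (incomplete trailing windows are dropped)."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, stride)]


def branch(x, kernels):
    """One branch: conv+tanh followed by max pooling, once per kernel."""
    for kernel in kernels:
        x = max_pool(conv1d_valid(x, kernel))
    return x


def packet_vector(x, kernels_a, kernels_b):
    """Concatenate the temporary vectors of the two branches: v = [v1, v2]."""
    return branch(x, kernels_a) + branch(x, kernels_b)
```

Applying `packet_vector` to each of the n packets of a flow gives the data packet vector sequence that is passed to the recurrent network. Using branches with different kernel sizes lets the final vector combine patterns seen at different byte-level scales.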
S103: Obtain the temporal features of the network traffic data by means of the second neural network.
Specifically, the second neural network learns the temporal features of the network traffic data on the basis of the spatial features to obtain a network flow vector corresponding to the network traffic data; wherein the second neural network includes a recurrent neural network.
For example: a recurrent neural network learns temporal features on the basis of the spatial features of the network flow. The recurrent neural network used here is a long short-term memory (LSTM) network, which scans the data packet vector sequence first in the forward and then in the reverse direction to learn the temporal features.
The recurrent neural network may be an ordinary recurrent neural network or a recurrent neural network with a special structure, such as a long short-term memory recurrent neural network or a bidirectional recurrent neural network. The temporal features learned by the recurrent neural network are the sequential features among the multiple data packet vectors.
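The bidirectional scan can be sketched with an ordinary tanh recurrent cell, which is one of the options listed above; an LSTM cell, as used in the example, would replace `rnn_last_state` but leave the two-direction structure unchanged. The weight matrices are hypothetical inputs supplied by the caller (in practice they would be learned), and bias terms are omitted for brevity.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def rnn_last_state(seq, w_in, w_rec):
    """Scan seq with h_t = tanh(W_in x_t + W_rec h_{t-1}); return the last h."""
    h = [0.0] * len(w_rec)
    for x in seq:
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, h))]
        h = [math.tanh(p) for p in pre]
    return h


def flow_vector(seq, fwd, bwd):
    """Network flow vector: forward final state concatenated with backward final state.

    fwd, bwd: (w_in, w_rec) weight pairs, one pair per scan direction.
    """
    return rnn_last_state(seq, *fwd) + rnn_last_state(seq[::-1], *bwd)
```

Because the backward scan reads the packet vectors in reverse order, the two halves of the flow vector generally differ even when both directions share the same weights.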
Fig. 2 is a flowchart of feature learning in the network traffic classification method provided by this embodiment; for details of S102 and S103, refer to Fig. 2.
S104: Classify the network traffic according to the spatial features and the temporal features.
The upper-level temporal features are built on the lower-level spatial features, and the resulting hierarchical spatiotemporal features are used to classify the network traffic. This step may be performed by a classifier, which may be a classifier inside the neural network, such as softmax, or an independent classifier, such as an SVM or a decision tree.
For example: a softmax classifier performs the final classification on the network flow vector, with one fully connected layer before the classifier. The classifier outputs the probability distribution of the input network flow over 5 target traffic classes, and the class with the highest probability is the output class. If the classification result is one of the four malicious traffic classes, a network traffic anomaly has been detected.
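The decision rule in this example can be sketched as follows, taking as input the 5 scores produced by the final fully connected layer. The class names are placeholders invented for the illustration (the text only states that four of the five classes are malicious), and the function names are hypothetical.

```python
import math

# Hypothetical labels: one benign class and four malicious classes.
CLASSES = ["normal", "malicious_1", "malicious_2", "malicious_3", "malicious_4"]


def softmax(logits):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits):
    """Return (predicted class, probability distribution, anomaly flag)."""
    probs = softmax(logits)
    label = CLASSES[probs.index(max(probs))]   # class with the highest probability
    return label, probs, label != "normal"
```

The anomaly flag is simply whether the most probable class is one of the malicious classes, which mirrors the detection rule stated above.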
It should be noted that steps S102 to S104 use a total of four convolutional layers, four pooling layers, two LSTM layers and three fully connected layers. Table 1 shows the parameters used by the convolutional neural network and the recurrent neural network.
Layer | Operation | Convolution kernels/neurons | Stride | Padding
1 | conv+tanh | 128 | 1 | valid
2 | max pool | 2 | 2 | valid
3 | conv+tanh | 256 | 1 | valid
4 | max pool | 2 | 2 | valid
5 | dense | 128 | -- | none
6 | conv+tanh | 192 | 1 | valid
7 | max pool | 2 | 2 | valid
8 | conv+tanh | 320 | 1 | valid
9 | max pool | 2 | 2 | valid
10 | dense | 128 | -- | none
11 | lstm | 92 | -- | none
12 | lstm | 92 | -- | none
13 | dense | 5 | -- | none
14 | softmax | -- | -- | none
Table 1
The network traffic classification method based on hierarchical spatiotemporal feature learning provided by the embodiments of the present invention is carried out by a convolutional neural network and a recurrent neural network, which removes a large amount of feature engineering work. Drawing on the spatial feature learning ability of the convolutional neural network and the temporal feature learning ability of the recurrent neural network, the method first learns the lower-level spatial features of the network traffic and then, on that basis, learns its upper-level temporal features, so that more comprehensive and accurate traffic feature information is obtained and network traffic classification capability is effectively improved. Because a better traffic feature set helps to lower the false alarm rate, the feature set obtained with the two neural networks can effectively reduce the false alarm rate of network traffic anomaly detection or network intrusion detection.
Fig. 3 is a structural diagram of a network traffic classification device based on hierarchical spatiotemporal feature learning provided by an embodiment of the present invention. As shown in Fig. 3, the device specifically includes:
a first acquisition module 301, configured to obtain spatial features of network traffic data by means of a first neural network;
a second acquisition module 302, configured to obtain temporal features of the network traffic data by means of a second neural network; and
a classification module 303, configured to classify the network traffic according to the spatial features and the temporal features.
Optionally, the device further includes a conversion module 304, configured to convert the network traffic data into data in two-dimensional image format.
Optionally, the conversion module 304 is configured to split the network traffic data into multiple traffic data units; to unify the number of traffic data units in each piece of network traffic data and the length of each traffic data unit; and to encode the traffic data units so unified in number and length to obtain the data in two-dimensional image format.
Optionally, the first acquisition module 301 is configured to use the first neural network to learn from the data in two-dimensional image format, obtaining a data packet vector corresponding to each traffic data unit and a data packet vector sequence corresponding to the network traffic data; wherein the first neural network includes a convolutional neural network.
Optionally, the second acquisition module 302 is configured to use the second neural network to learn the temporal features of the network traffic data on the basis of the spatial features, obtaining a network flow vector corresponding to the network traffic data; wherein the second neural network includes a recurrent neural network.
The network traffic classification device based on hierarchical spatiotemporal feature learning provided by the embodiments of the present invention operates by means of a convolutional neural network and a recurrent neural network, which removes a large amount of feature engineering work. Drawing on the spatial feature learning ability of the convolutional neural network and the temporal feature learning ability of the recurrent neural network, the device first learns the lower-level spatial features of the network traffic and then, on that basis, learns its upper-level temporal features, so that more comprehensive and accurate traffic feature information is obtained and network traffic classification capability is effectively improved. Because a better traffic feature set helps to lower the false alarm rate, the feature set obtained with the two neural networks can effectively reduce the false alarm rate of network traffic anomaly detection or network intrusion detection.
A person skilled in the art should further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. A skilled person may implement the described functions in different ways for each specific application, but such implementation should not be considered beyond the scope of the present invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The specific embodiments described above further explain the objectives, technical solutions and beneficial effects of the present invention in detail. It should be understood that the foregoing is merely specific embodiments of the present invention and is not intended to limit the protection scope of the present invention. Any modification, equivalent substitution, improvement or the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.