CN105959175B - Network traffic classification method based on a GPU-accelerated kNN algorithm - Google Patents
Network traffic classification method based on a GPU-accelerated kNN algorithm
- Publication number
- CN105959175B CN105959175B CN201610258008.7A CN201610258008A CN105959175B CN 105959175 B CN105959175 B CN 105959175B CN 201610258008 A CN201610258008 A CN 201610258008A CN 105959175 B CN105959175 B CN 105959175B
- Authority
- CN
- China
- Prior art keywords
- value
- vector
- thread
- similarity
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/02—Capturing of monitoring data
- H04L43/026—Capturing of monitoring data using flow identification
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
- H04L47/2441—Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention proposes a network traffic classification method based on a GPU-accelerated kNN algorithm. The method applies GPU acceleration to the similarity-calculation and sorting steps of the kNN algorithm, significantly improving classification performance, and selects a group of efficient flow features to build a traffic classifier. The invention also provides a process-based network traffic capture method, which serves as the basic data set of the experiments to guarantee the validity of the experimental data. The experimental results show that the peak GPU computing speed is up to 187 times that of the CPU, that classification precision generally reaches 80% or more, and that certain existing applications such as FTP and WEB reach an accuracy of 95% or more, fully demonstrating the effectiveness of the invention.
Description
Technical field
The present invention relates to the fields of high-performance computing, network traffic classification and network security, and in particular to network traffic monitoring, traffic classification and network attack detection.
Background technique
With the rapid development of Internet applications, more and more emerging Internet services are coming into being. At the same time, network operators hope that the existing IP network can carry multiple services without modification, in order to reduce the construction and operating costs of the network infrastructure. Therefore, in addition to ordinary Internet traffic and multimedia services such as video and voice, the IP network must also carry emerging services such as 3G and NGN (next-generation network). Different services place different functional and performance demands on the network; for example, video conferencing and traditional Internet services differ greatly in requirements such as data throughput and network delay. This requires network managers to classify network traffic in detail and to provide corresponding QoS (quality of service) guarantees.
Traffic classification is an important technology for improving service quality. It plays an important role in fields such as QoS, IDS, traffic billing and firewall technology, and has become one of the key technologies for enhancing network controllability. In addition, traffic classification can help network administrators understand the distribution and patterns of network traffic, and help network designers improve network planning. In recent years, with the continuous expansion of the Internet, traffic classification technology has attracted attention from both industry and academia, and has gradually formed an independent research field.
With the continuous increase of Internet link speeds, existing traffic classification methods are gradually becoming unable to cope. Since NVIDIA released the first CUDA platform in 2007, CUDA technology has developed continuously: drivers are constantly updated, more and more features are supported, and the difficulty of GPU programming keeps decreasing, a major technological innovation over OpenGL-based approaches. Among classical machine-learning methods, kNN is efficient and achieves good classification results. Although the kNN algorithm requires no modeling time, its computational complexity lies in finding the k nearest neighbors among a large number of d-dimensional vectors, so kNN classification is time-consuming. Real-time network traffic classification places very high demands on classification speed, and the classification speed achievable on a CPU can no longer satisfy real-time requirements.
Summary of the invention
The present invention aims to solve the problem that the performance of current network traffic classification methods cannot keep up with ever-increasing link speeds, and provides a CUDA-based kNN algorithm applied to the field of traffic classification. It differs from other research approaches in that the method adapts the kNN algorithm to run in a GPU environment and optimizes both the similarity-calculation and sorting steps with CUDA. Experiments show that the peak GPU computing speed is 187 times the CPU computing speed, significantly improving the speed, throughput and performance of traffic classification.

The present invention achieves high traffic classification recall and precision by building an efficient classifier; for example, FTP and WEB reach an accuracy of 95% or more. The invention also designs a process-based traffic capture method and uses the captured traffic as the experimental data set, improving the credibility of the experiments.
Technical solution of the present invention:
A network traffic classification method based on a GPU-accelerated kNN algorithm. The method uses the powerful computing capability and memory bandwidth of the GPU to accelerate the kNN algorithm, significantly improving the speed, throughput and performance of traffic classification. At the same time, a process-based traffic capture method guarantees that the data set is pure and valid, so that an efficient network traffic classifier with high precision can be built.
The method specifically includes the following steps:
Step 1: Acquire a mixed traffic data set containing various applications. The data set is either a public traffic data set from the Internet or a traffic data set captured with a self-written program. Because the captured traffic may be too large or contain noise, the traffic is filtered to obtain a data set that is within 1 GB and pure. The data set is then segmented into network flows, each flow being the set of packets sharing the same five-tuple; 90% of the network flows are randomly selected as the training set, and the remaining 10% serve as the test set to evaluate classification performance.
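As a minimal illustration of the segmentation and splitting in step 1 (the packet records and field names here are hypothetical placeholders, not the patent's capture format), packets sharing a five-tuple can be grouped into flows and the flows split 90/10:

```python
import random

def segment_into_flows(packets):
    """Group packet records into network flows keyed by their five-tuple."""
    flows = {}
    for pkt in packets:
        # The five-tuple that defines a flow in step 1.
        key = (pkt["src_ip"], pkt["dst_ip"],
               pkt["src_port"], pkt["dst_port"], pkt["proto"])
        flows.setdefault(key, []).append(pkt)
    return flows

def split_flows(flows, train_ratio=0.9, seed=42):
    """Randomly select 90% of flows as training set, 10% as test set."""
    keys = sorted(flows)  # deterministic order before shuffling
    random.Random(seed).shuffle(keys)
    cut = int(len(keys) * train_ratio)
    train = {k: flows[k] for k in keys[:cut]}
    test = {k: flows[k] for k in keys[cut:]}
    return train, test

# Synthetic capture: 100 packets spread over 10 distinct five-tuples.
packets = [
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "src_port": 1000 + i % 10,
     "dst_port": 80, "proto": "TCP", "len": 100 + i}
    for i in range(100)
]
flows = segment_into_flows(packets)
train, test = split_flows(flows)
```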
Step 2: Generate the optimal feature set using a feature-selection algorithm. After the data set has been segmented into network flows in step 1, the chosen feature values of each network flow are computed. To avoid similarity-calculation bias caused by feature values of different magnitudes, the feature values must be standardized into the same interval.

Standardizing the feature values into the same interval means scaling every feature value into (-1, 1) according to formula (1):

v' = (avg(M_i) - v) / (max(M_i) - min(M_i))   (1)

where M_i is the i-th feature vector (the values of the i-th feature), v is a value of that feature, avg(M_i) is the average of the i-th feature, max(M_i) is its maximum and min(M_i) is its minimum.
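A sketch of the standardization of formula (1), under the reading given above (numerator is the average minus the value, denominator the max-min range, as the description of the formula later in the document states), so every scaled value lands in (-1, 1):

```python
def standardize(feature_column):
    """Scale one feature's values into (-1, 1) per formula (1):
    v' = (avg - v) / (max - min)."""
    avg = sum(feature_column) / len(feature_column)
    span = max(feature_column) - min(feature_column)
    return [(avg - v) / span for v in feature_column]

values = [10.0, 20.0, 30.0, 60.0]   # avg = 30, span = 50
scaled = standardize(values)         # [0.4, 0.2, 0.0, -0.6]
```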
Step 3: Build the kNN traffic classifier using the training data set. CUDA is used to compute the similarity of each test network flow to all training flows; CUDA is then used to sort the similarities of each test flow to all training flows, and the k nearest neighbors with the highest similarity are selected. Among these k neighbors, a voting mechanism selects the type with the highest proportion, which is the final result.
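The voting among the k nearest neighbours can be sketched as a simple majority count (the class labels here are illustrative):

```python
from collections import Counter

def knn_vote(neighbor_labels):
    """Return the class with the highest proportion among the k neighbors,
    as in the voting mechanism of step 3."""
    counts = Counter(neighbor_labels)
    return counts.most_common(1)[0][0]

# k = 5 nearest neighbours; WEB holds the highest proportion.
result = knn_vote(["WEB", "FTP", "WEB", "WEB", "QQ"])
```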
The method for computing and sorting the similarities of each test network flow to all training flows with CUDA is as follows:

Let the number of training flow records be m, m ≤ 10^4, and the number of test flow records be n, n = m/9; the parameters m and n keep this meaning below. Let A = {a_1, a_2, ..., a_m} be the m training flow records and B = {b_1, b_2, ..., b_n} be the n test flow records. Each flow record in set A is represented by a feature vector u_i = {a_i1, a_i2, ..., a_id}^T with -1 < a_i1, a_i2, ..., a_id < 1; each flow record in set B is represented by a feature vector u_j = {b_j1, b_j2, ..., b_jd}^T with -1 < b_j1, b_j2, ..., b_jd < 1; after standardization the elements of the vectors u_i and u_j all lie in (-1, 1). To accelerate the similarity-calculation process, the present invention proposes a load-balanced CUDA thread algorithm, shown in formulas (2) and (3):

From = (m × n) / (k_b × k_t) × T_id   (2)
To = (m × n) / (k_b × k_t) × (T_id + 1)   (3)

where (m × n) is the total number of tasks, k_b is the total number of thread blocks in the kernel, k_t is the number of threads per block, so (k_b × k_t) is the total number of threads in the kernel, and T_id is the id identifying a thread. The result From is the starting position each thread must compute, and To is the final position each thread must compute.

Although most of the similarities between the test set and the training set have now been calculated, (m × n) mod (k_b × k_t) similarities remain uncalculated, so each thread computes one more similarity, at the index given by formula (4):

(m × n) - (m × n) % (k_b × k_t) + T_id   (4)

where (m × n) - (m × n) % (k_b × k_t) is the number of similarities already computed; adding T_id guarantees that all remaining similarities are calculated.
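The partition of formulas (2)–(4) can be sketched on the host side; the kernel geometry (k_b, k_t) and the tiny task count below are illustrative, not the patent's configuration:

```python
def thread_ranges(m, n, kb, kt):
    """For each thread id, the [From, To) range of formulas (2)-(3),
    plus the extra index of formula (4) for the leftover tasks
    (None when that index falls past the end of the task list)."""
    total = m * n
    threads = kb * kt
    chunk = total // threads
    done = total - total % threads  # tasks covered by formulas (2)-(3)
    ranges = []
    for tid in range(threads):
        frm = chunk * tid           # formula (2)
        to = chunk * (tid + 1)      # formula (3)
        extra = done + tid          # formula (4)
        ranges.append((frm, to, extra if extra < total else None))
    return ranges

# 9 x 3 = 27 similarity tasks spread over 2 blocks x 2 threads.
ranges = thread_ranges(m=9, n=3, kb=2, kt=2)
```

Every task index is covered exactly once: threads 0–3 take chunks of 6, and the 3 leftover tasks (24, 25, 26) go to threads 0–2 via formula (4).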
To accelerate the sorting step in the CUDA environment, a sorting-network algorithm is chosen. For an array {d_0, d_1, ..., d_{l-1}} with l elements, the k smallest elements must be selected, k ≤ l. Let d_{l/2} be the dividing point of the array; then the larger subsequence produced by the comparators is shown in formula (5):

s1 = {max{d_0, d_{l/2}}, max{d_1, d_{1+l/2}}, ..., max{d_{l/2-1}, d_{l-1}}}   (5)

and the smaller subsequence produced by the comparators is shown in formula (6):

s2 = {min{d_0, d_{l/2}}, min{d_1, d_{1+l/2}}, ..., min{d_{l/2-1}, d_{l-1}}}   (6)

The smallest element is necessarily in the second subsequence, which contains ⌈l/2⌉ numbers. Each comparator applied to s1 or s2 is independent, so CUDA threads can execute them simultaneously; thus the ⌈l/2⌉ comparators can run in parallel. This process is then repeated on s2 until s2 contains only one element, which is the smallest element of the current sequence. After one pass, the sequence becomes {c_0, c_1, ..., c_{l-1}}, where c_{l-1} is the first, smallest element of the sequence. The sequence {c_0, c_1, ..., c_{l-2}} is then treated as the initial array, and the above minimum-selection process is repeated, so that after k passes the k smallest elements have been selected.

Finally, in task allocation, each thread block and thread processes the feature-value matrix in a loop, and the computing tasks are distributed equally to the thread blocks and threads to achieve load balancing; the task allocation of each thread block and each thread is calculated with formulas (2), (3) and (4).
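A serial sketch of the minimum-selection rounds described above: each round halves the candidate sequence with pairwise-min comparators (formula (6)), which the invention executes in parallel with CUDA threads, until only the minimum remains; repeating this k times yields the k smallest elements. This illustrates the selection logic only, not the CUDA kernel, and the pairing of an odd tail element is an assumption.

```python
def select_k_smallest(arr, k):
    """Extract the minimum k times by repeated pairwise-min rounds,
    mirroring the sorting-network selection of formulas (5)-(6)."""
    remaining = list(arr)
    smallest = []
    for _ in range(k):
        s2 = list(remaining)
        while len(s2) > 1:
            half = (len(s2) + 1) // 2  # ceil(l/2) comparators per round
            s2 = [min(s2[i], s2[i + half]) if i + half < len(s2) else s2[i]
                  for i in range(half)]
        smallest.append(s2[0])
        remaining.remove(s2[0])        # drop the minimum just found
    return smallest

# The document's example: the 2 smallest of {4, 2, 3, 1, 0, 6, 5}.
result = select_k_smallest([4, 2, 3, 1, 0, 6, 5], k=2)
```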
The similarity of each test network flow to all training flows is computed with CUDA using the Euclidean distance, formula (7):

d(x, y) = sqrt( Σ_{i=1}^{M} (x_i - y_i)^2 )   (7)

where x_i is the i-th feature value of the first feature vector, y_i is the i-th feature value of the second feature vector, and M is the dimension of the feature vectors.
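A serial sketch of the distance computation of formula (7) filling the test-by-training result matrix (the tiny vectors and the nested-list layout are illustrative):

```python
import math

def euclidean(x, y):
    """Formula (7): square root of the summed squared feature differences."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

train = [[0.1, 0.2], [0.5, -0.5], [-0.3, 0.4]]  # m = 3 training vectors
test = [[0.1, 0.2]]                              # n = 1 test vector

# Distances of each test flow to all training flows (the matrix C).
C = [[euclidean(t, a) for a in train] for t in test]
```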
Step 4: After the traffic classifier of step 3 has been built, traffic classification can be performed in the GPU environment with the established classification model, and the classification results and performance are analyzed.
The advantages of the present invention:
The invention proposes a network traffic classification method based on a GPU-accelerated kNN algorithm. The method applies GPU acceleration to the similarity-calculation and sorting steps of the kNN algorithm, significantly improving classification performance, and selects a group of efficient flow features to build the traffic classifier.
The invention also provides a process-based network traffic capture method, which serves as the basic data set of the experiments to guarantee the validity of the experimental data.
The experimental results of the invention show that the peak GPU computing speed is 187 times that of the CPU, that classification precision generally reaches 80% or more, and that certain existing applications such as FTP and WEB reach an accuracy of 95% or more, fully demonstrating the effectiveness of the invention.
Brief description of the drawings
Fig. 1 is the Ground Truth architecture diagram.
Fig. 2 is the matrix representation of the training set and test set.
Fig. 3 is a comparator in a sorting network.
Fig. 4 is an example of a sorting network (l = 7).
Fig. 5 is the algorithm performance histogram.
Fig. 6 is the algorithm performance curve graph.
Specific embodiment
Step 1: Acquire a mixed traffic data set containing various applications with the process-based traffic capture method.
The invention proposes a GPU-accelerated kNN network traffic classification method and obtains the experimental data with a process-based traffic capture method; Fig. 1 shows the Ground Truth (GT) architecture. Each host installs a daemon client, which calls the Windows API once per second to obtain host log information, mainly including the timestamps, port numbers and process names of the sockets created. These log entries are then stored into a database according to a heartbeat protocol.
For TCP, taking the timestamp of the packet capture as the reference, the log is searched within an offset of 100 seconds before and after for a log record whose five-tuple (IP address pair, port number pair, protocol type) matches the packet. If such a record is found, the packet is labeled with that process. Because TCP performs a three-way handshake when establishing a connection, the first captured packet of a connection must be a SYN packet. Only this SYN needs to be matched and labeled with its process number; then, within the validity period of the connection, subsequent packets carrying the same or the reversed five-tuple belong to the same process number, and the packets are written into the final PCAP file. If no matching record is found, the packet is marked invalid and discarded.
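The TCP matching rule above can be sketched as follows; the log-record fields and the packet structure are hypothetical placeholders for whatever the daemon actually stores:

```python
WINDOW = 100  # seconds of offset before/after the capture timestamp (TCP)

def match_process(packet, log_records):
    """Return the process whose logged five-tuple matches the packet
    (same or reversed direction) within +/- WINDOW seconds, else None
    (the packet is then marked invalid and discarded)."""
    five_tuple = (packet["src_ip"], packet["dst_ip"],
                  packet["src_port"], packet["dst_port"], packet["proto"])
    reversed_tuple = (packet["dst_ip"], packet["src_ip"],
                      packet["dst_port"], packet["src_port"], packet["proto"])
    for rec in log_records:
        if abs(rec["ts"] - packet["ts"]) <= WINDOW and \
           rec["five_tuple"] in (five_tuple, reversed_tuple):
            return rec["process"]
    return None

log = [{"ts": 50, "five_tuple": ("1.1.1.1", "2.2.2.2", 1234, 80, "TCP"),
        "process": "web.exe"}]
pkt = {"ts": 120, "src_ip": "2.2.2.2", "dst_ip": "1.1.1.1",
       "src_port": 80, "dst_port": 1234, "proto": "TCP"}
proc = match_process(pkt, log)                 # reversed direction, in window
stale = match_process(dict(pkt, ts=300), log)  # outside the 100 s window
```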
For UDP, the Windows API that obtains UDP information cannot provide the destination port number and destination IP address, and UDP has no three-way handshake to establish a connection. Therefore, taking the timestamp of the packet capture as the reference, the log can only be searched within an offset of 10 seconds before and after for a log record whose source IP address and source port number match the packet. If such a record is found, the packet is labeled with that process and written into the final PCAP file; if not, the packet is marked invalid and discarded.
The data set obtained with the process-based GT framework is shown in Table 1. Smtp and Pop3 represent email applications, Ftp represents file-transfer applications, QQ represents instant-messaging applications, BitTorrent and Thunder represent P2P applications, YouKu represents video applications, and Web represents browser applications such as Taobao and Sina. The numbers in the table are the numbers of flows; for example, 900 Smtp training flows and 100 test flows were used in the experiment. The ratio of training flows to test flows is 9:1, that is, 90% of the data set is used as the training set and 10% as the test set. Moreover, only one group of data was selected for each application type, without cross-validation, because cross-validation also randomly selects multiple groups of data and is not essentially different from experimenting with one randomly selected group.
Table 1 Data set used in the experiment
Step 2: Generate the optimal feature set with a feature-selection algorithm, compute the feature values and scale them into the same interval; the selected feature set is shown in Table 2.

First, the protocol, i.e. whether the transport-layer protocol is TCP or UDP; the present invention currently studies only packets of TCP or UDP connections. Second, the size of the packet payloads and packets in a network flow: the packet count and packet size differ between different types of connections. For example, the packets of an Ftp connection are larger than other types of packets, because file transfer requires higher link utilization. Third, the minimum, maximum and average packet length: different types of applications clearly differ in packet length, and choosing the minimum, maximum and average reflects this difference more comprehensively. Fourth, the minimum, maximum, average and variance of the packet inter-arrival time: setting aside the influence of factors such as network delay, packet arrival time is an important feature. It can be expected that the packet inter-arrival times of instant-messaging or real-time video applications are shorter, because they require better real-time behavior.

Because network flows are bidirectional, the feature vectors of the two opposite-direction streams are combined into one feature vector representing the bidirectional flow exchanged between client and server.
Table 2 Feature vectors used in the experiment
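The Table 2 statistics (packet-length min/max/avg and inter-arrival-time min/max/avg/variance) can be sketched for one flow; the inputs and field names below are illustrative:

```python
def flow_features(pkt_lengths, timestamps):
    """Compute the per-flow statistics of step 2: packet-length
    min/max/avg and inter-arrival-time min/max/avg/variance."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mean_gap = sum(gaps) / len(gaps)
    var_gap = sum((g - mean_gap) ** 2 for g in gaps) / len(gaps)
    return {
        "len_min": min(pkt_lengths),
        "len_max": max(pkt_lengths),
        "len_avg": sum(pkt_lengths) / len(pkt_lengths),
        "iat_min": min(gaps),
        "iat_max": max(gaps),
        "iat_avg": mean_gap,
        "iat_var": var_gap,
    }

# Three packets with lengths 60/1500/720 bytes at t = 0.0, 0.1, 0.3 s.
feats = flow_features([60, 1500, 720], [0.0, 0.1, 0.3])
```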
The present invention standardizes the feature values with formula (1). The denominator of the formula is the difference between the maximum and minimum of the feature, i.e. the maximum range of the feature values; the numerator is the difference between the average and the current feature value. The numerator is always smaller in magnitude than the denominator, so every feature value is scaled proportionally into (-1, 1). Experiments show that the results after standardization improve by about 10% over the results before standardization, so the importance of standardization goes without saying.
Step 3: Build the kNN traffic classifier with the training data set, and use CUDA to compute and sort the similarities of each test network flow to all training flows.

Let the number of training flow records be m, m ≤ 10^4, and the number of test flow records be n, n = m/9; the parameters m and n keep this meaning below. Let A = {a_1, a_2, ..., a_m} be the m training flow records and B = {b_1, b_2, ..., b_n} be the n test flow records. Each flow record in set A is represented by a feature vector u_i = {a_i1, a_i2, ..., a_id}^T, -1 < a_i1, a_i2, ..., a_id < 1; each flow record in set B is represented by a feature vector u_j = {b_j1, b_j2, ..., b_jd}^T, -1 < b_j1, b_j2, ..., b_jd < 1; after standardization the elements of u_i and u_j lie in (-1, 1). The matrix representation is shown in Fig. 2.

For each test flow record u_j = {b_j1, b_j2, ..., b_jd}^T, its distance to all training flow records must be computed, so the time complexity of this step is O(nmd), which is enormous. Meanwhile, a matrix C of height m and width n must be used to store the results, so the space complexity of the algorithm is O(nm); the distance between the j-th vector of the test set and the i-th vector of the training set is stored at position (jn + i) in matrix C.
In theory, the m·n similarity-calculation tasks could be computed by m·n threads, one distance per thread, maximizing the calculation speed. However, the total number of threads p that the GPU can launch is limited by the hardware, and p is often much smaller than the task count m·n. Therefore, to balance the load on each thread, each thread computes (nm/p) distances repeatedly, and formulas (2) and (3) distribute the similarity-calculation tasks:

From = (m × n) / (k_b × k_t) × T_id   (2)
To = (m × n) / (k_b × k_t) × (T_id + 1)   (3)

Although most of the distances between the test set and the training set have now been computed, (m × n) mod (k_b × k_t) distances remain uncalculated, so each thread computes one more distance, at the index given by formula (4):

(m × n) - (m × n) % (k_b × k_t) + T_id   (4)

where (m × n) - (m × n) % (k_b × k_t) is the number of distances already computed; adding T_id guarantees that all remaining distances are calculated.
Fig. 3 shows a comparator of a sorting network. The input of the comparator is two data values x and y; after the comparator processes them, the larger value is at the top and the smaller value at the bottom. The time required to run one comparator can be regarded as one unit of time.
For an array {d_0, d_1, ..., d_{l-1}} with l elements, the k smallest elements must be selected, k ≤ l. Let d_{l/2} be the dividing point of the array; then the larger subsequence produced by the comparators is shown in formula (5):

s1 = {max{d_0, d_{l/2}}, max{d_1, d_{1+l/2}}, ..., max{d_{l/2-1}, d_{l-1}}}   (5)

and the smaller subsequence produced by the comparators is shown in formula (6):

s2 = {min{d_0, d_{l/2}}, min{d_1, d_{1+l/2}}, ..., min{d_{l/2-1}, d_{l-1}}}   (6)

The smallest element is necessarily in the second subsequence, which contains ⌈l/2⌉ numbers. Each comparator applied to s1 or s2 is independent, so CUDA threads can execute them simultaneously; thus the ⌈l/2⌉ comparators can run in parallel. This process is then repeated on s2 until s2 contains only one element, which is the smallest element of the current sequence. After one pass, the sequence becomes {c_0, c_1, ..., c_{l-1}}, where c_{l-1} is the first, smallest element of the sequence. The sequence {c_0, c_1, ..., c_{l-2}} is then treated as the initial array, and the above minimum-selection process is repeated, so that after k passes the k smallest elements have been selected.
For example, suppose the 2 smallest elements must be selected from the sequence {4, 2, 3, 1, 0, 6, 5}. The process of selecting the first smallest element is shown in Fig. 4. After the first comparison round, s2 becomes {1, 0, 3, 5}; after the second round, s2 becomes {1, 0}; after the third round, only the element 0 remains in s2, so 0 is the first smallest element. At this point the array has become {4, 2, 6, 3, 5, 1, 0}. The above process can then be continued on the remaining elements until the second smallest element, 1, is selected.
In this process, note that each comparison round uses the results of the previous round, so all threads must finish the previous round before this round begins; no thread may start the next round of comparisons before the current round has ended. The synchronization function of the GPU thread block is used for this, with one thread block sorting one column of matrix C. Although the GPU's arithmetic cores and memory capacity are powerful, the hardware resources are still limited, so the total thread count p and the number of thread blocks b in the GPU have upper limits. Ideally, n thread blocks would be used, each thread of a block performing (m/2p) comparisons, to execute the algorithm above. However, so many thread blocks and threads often cannot be created, so the task-allocation method of the distance calculation is borrowed: each thread block processes (n/b) columns of matrix C, and each thread performs (m/2p) comparisons. That is, each thread block and thread processes the array in a loop, and the computing tasks are distributed equally among the thread blocks and threads to achieve load balancing; the tasks of each thread block and each thread are calculated with formulas (2), (3) and (4).
The similarity of each test network flow to all training flows is computed with CUDA using the Euclidean distance, formula (7):

d(x, y) = sqrt( Σ_{i=1}^{M} (x_i - y_i)^2 )   (7)

where x_i is the i-th feature value of the first feature vector, y_i is the i-th feature value of the second feature vector, and M is the dimension of the feature vectors.
Step 4: After the traffic classifier of step 3 has been built, traffic classification can be performed in the GPU environment with the established classification model, and the classification results and performance are analyzed.
The present invention compares the running speed of the CPU and the GPU. The CPU is an Intel(R) Xeon(R) E5-2620 with a frequency of 2.0 GHz, 8 physical cores and 32 GB of memory. The GPU is an NVIDIA Tesla with 2880 CUDA cores, a memory bandwidth of 288 GB/s and a peak double-precision floating-point performance of 1.87 Tflops. The CPU and GPU run on the same server, whose operating system is Red Hat Enterprise Linux Server Release 6.3.
When analyzing the classification effect, two indicators measure whether the experimental results are valid: precision and recall.

Precision is defined as:

precision = TP / (TP + FP)

Recall (also called the feedback rate) is defined as:

recall = TP / (TP + FN)

The speedup ratio is defined as:

speedup = T_CPU / T_GPU

where, for a given class, TP is the number of flows correctly classified into the class, FP is the number of flows wrongly classified into the class, FN is the number of the class's flows wrongly classified elsewhere, and T_CPU and T_GPU are the classification times on the CPU and the GPU.

Under the above environment, the experiment selected 6943 training flow records and 777 test flow records, and k was chosen among the odd numbers not greater than 10. The experimental results are shown in Table 3:
Table 3 CPU and GPU experimental results
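Given the standard definitions of the two indicators and the speedup ratio, they can be computed directly (the counts below are illustrative, not Table 3's values):

```python
def precision(tp, fp):
    """Fraction of flows assigned to a class that truly belong to it."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Fraction of a class's flows that were correctly retrieved."""
    return tp / (tp + fn)

def speedup(t_cpu, t_gpu):
    """Ratio of CPU classification time to GPU classification time."""
    return t_cpu / t_gpu

p = precision(tp=90, fp=10)          # 90 of 100 assigned flows are correct
r = recall(tp=90, fn=30)             # 90 of the class's 120 flows retrieved
s = speedup(t_cpu=187.0, t_gpu=1.0)  # e.g. the patent's peak 187x figure
```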
To compare the speed of the GPU and the CPU more intuitively, the experimental results are drawn as a curve graph and a histogram, shown in Figs. 5 and 6. As can be seen from the figures, the classification time required by the GPU is only a tiny fraction of the CPU classification time; the GPU significantly improves the running speed of the kNN algorithm by using a large number of computing cores, high-speed bandwidth and powerful computing capability.

The experiments selected different applications, quantities and k values, and several groups of experiments were run; the recall and precision results are shown in Tables 4, 5 and 6.

Table 4 Flow classification results (3 classes, k=3)
Table 5 Flow classification results (5 classes, k=5)
Table 6 Flow classification results (8 classes, k=5)

From the classification point of view, the traffic classification precision and recall of the three applications WEB, FTP and BitTorrent reach 90% or more. Because a large number of protocol suites run inside QQ and Youku, their feature values are easily confused with those of other applications, so their classification accuracy is not high. Nevertheless, the overall classification effect reaches 80% or more, which basically meets the experimental expectations.
Claims (2)
1. A GPU-accelerated kNN network traffic classification method, which uses the parallel computing capability of the GPU to accelerate the kNN algorithm and uses a flow-based acquisition method to ensure the data set is pure and valid, the method comprising the following steps:
Step 1: acquire a mixed traffic data set containing multiple applications; the traffic data set is either a publicly available data set on the Internet or a data set captured by a self-written program; because the captured traffic may be too large or contain noise flows, the traffic is filtered to obtain a pure data set of at most 1 GB; the data set is then divided into network flows sharing the same five-tuple, 90% of the flows are randomly selected as the training set, and the remaining 10% serve as the test set for evaluating classification performance;
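As an illustration of the Step 1 split (not part of the claim), the following Python sketch groups packets into flows by five-tuple and draws a random 90/10 split; the packet field names (`src_ip`, `dst_ip`, `src_port`, `dst_port`, `proto`) are assumptions made for the example:

```python
import random
from collections import defaultdict

def split_flows(packets, train_ratio=0.9, seed=42):
    """Group packets into flows by five-tuple, then split the flows 90/10."""
    flows = defaultdict(list)
    for pkt in packets:
        key = (pkt["src_ip"], pkt["dst_ip"], pkt["src_port"],
               pkt["dst_port"], pkt["proto"])
        flows[key].append(pkt)
    keys = sorted(flows)                      # deterministic order before shuffling
    random.Random(seed).shuffle(keys)         # random selection of training flows
    cut = int(len(keys) * train_ratio)
    train = {k: flows[k] for k in keys[:cut]}
    test = {k: flows[k] for k in keys[cut:]}
    return train, test
```

With 10 distinct five-tuples this yields 9 training flows and 1 test flow, matching the 90/10 split of the claim.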
Step 2: generate an optimal feature set with a feature selection algorithm; after the data set is divided into network flows in Step 1, the specified feature values are computed for each network flow; to avoid similarity-calculation bias caused by feature values of different magnitudes, the feature values are standardized into the same interval;
Step 3: build the kNN traffic classifier with the training data set; use CUDA to compute the similarity between each test network flow and all training flows, using the Euclidean distance of formula (7):
d(x, y) = sqrt( (x_1 − y_1)^2 + (x_2 − y_2)^2 + ... + (x_M − y_M)^2 )   (7)
where x_i is the i-th feature value of the first feature vector, y_i is the i-th feature value of the second feature vector, and M is the dimension of the feature vectors; each feature vector consists of the feature values of one group of network flow features, which include: the transport layer protocol; the sizes of the packet payload and the packet; the minimum, maximum, and average packet length; and the minimum, maximum, average, and variance of the packet inter-arrival time;
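The Euclidean distance of formula (7) can be sketched in Python as follows (an illustration only; the claim computes it in a CUDA kernel):

```python
import math

def euclidean_distance(x, y):
    """Formula (7): Euclidean distance between two M-dimensional feature vectors."""
    assert len(x) == len(y), "both vectors must have the same dimension M"
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))
```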
Use CUDA to sort the similarities between each test flow and all training flows, and select the k neighbours with the highest similarity; among these k neighbours, the class holding the largest share under majority voting is taken as the final result;
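The majority vote over the k nearest neighbours can be sketched as follows (a plain Python illustration, not the claimed GPU implementation; here "highest similarity" corresponds to smallest Euclidean distance):

```python
from collections import Counter

def knn_vote(distances, labels, k):
    """Pick the k training flows with the smallest distance and return the
    label with the largest share among them (majority vote)."""
    order = sorted(range(len(distances)), key=distances.__getitem__)[:k]
    votes = Counter(labels[i] for i in order)
    return votes.most_common(1)[0][0]
```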
The method for computing and sorting, with CUDA, the similarities between each test network flow and all training flows is as follows:
Let the number of training flow records be m, m ≤ 10^4, and the number of test flow records be n, n = m/9; the parameters m and n keep this meaning below. Let A = {a_1, a_2, ..., a_m} be the m training flow records and B = {b_1, b_2, ..., b_n} be the n test flow records. Each flow record in set A is represented by a feature vector u_i = {a_i1, a_i2, ..., a_id}^T with −1 < a_i1, a_i2, ..., a_id < 1, where d = 4; each flow record in set B is represented by a feature vector u_j = {b_j1, b_j2, ..., b_jd}^T with −1 < b_j1, b_j2, ..., b_jd < 1, where d = 4; after standardization the elements of the vectors u_i and u_j lie in (−1, 1). To accelerate the similarity computation, a load-balanced CUDA thread algorithm is proposed, as shown in formulas (2) and (3):
From = (m × n) / (k_b × k_t) × T_id   (2)
To = (m × n) / (k_b × k_t) × (T_id + 1)   (3)
where (m × n) is the total number of tasks, k_b is the total number of thread blocks in the kernel, k_t is the number of threads per block, so (k_b × k_t) is the total number of threads in the kernel, and T_id is the id identifying a thread. Clearly, From is the starting position from which each thread must compute, and To is the end position up to which each thread must compute;
Although most of the similarities between the test set and the training set have now been computed, (m × n) mod (k_b × k_t) similarities remain uncomputed, so each thread computes one more similarity, at the index given by formula (4):
(m × n) − (m × n) % (k_b × k_t) + T_id   (4)
where (m × n) − (m × n) % (k_b × k_t) is the number of similarities already computed; adding T_id ensures that all remaining similarities are computed;
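The per-thread task split of formulas (2)-(4) can be checked with a small Python model (a sketch only; on the GPU these expressions use integer division over thread and block indices):

```python
def thread_range(m, n, kb, kt, tid):
    """Formulas (2)-(3): the [From, To) slice of the m*n similarity tasks
    assigned to thread `tid` out of kb*kt threads."""
    total, threads = m * n, kb * kt
    per_thread = total // threads
    return per_thread * tid, per_thread * (tid + 1)

def leftover_task(m, n, kb, kt, tid):
    """Formula (4): index of the one extra task thread `tid` handles to cover
    the (m*n) mod (kb*kt) remainder, or None once the remainder is exhausted."""
    total, threads = m * n, kb * kt
    idx = total - total % threads + tid
    return idx if idx < total else None
```

Iterating over all thread ids and collecting both the main slices and the leftover indices covers every one of the m × n tasks exactly once, which is the load-balancing property the claim asserts.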
To accelerate sorting in the CUDA environment, a sorting-network algorithm is chosen. For an array {d_0, d_1, ..., d_(l−1)} of l elements, the k smallest elements must be selected, k ≤ l. Let d_(l/2) be the split point of the array, so the larger subsequence produced by the comparators is given by formula (5):
s1 = {max{d_0, d_(l/2)}, max{d_1, d_(1+l/2)}, ..., max{d_(l/2−1), d_(l−1)}}   (5)
and the smaller subsequence produced by the comparators is given by formula (6):
s2 = {min{d_0, d_(l/2)}, min{d_1, d_(1+l/2)}, ..., min{d_(l/2−1), d_(l−1)}}   (6)
The smallest element is certainly in the second subsequence, which contains l/2 elements. Each comparator producing s1 or s2 is independent, so CUDA threads can run them simultaneously, and the l/2 comparisons can be performed in parallel. This process is then repeated on s2 until s2 contains only one element, which is the smallest element of the current sequence. After one such operation the sequence becomes {c_0, c_1, ..., c_(l−1)}, where c_(l−1) is the smallest element of the sequence; the sequence {c_0, c_1, ..., c_(l−2)} is then treated as the initial array and the above selection of the smallest element is repeated, so after k operations the k smallest elements have been selected;
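The comparator-network selection described above can be modelled sequentially in Python (a sketch: each pass of the inner loop corresponds to one parallel round of independent comparators; rounding the half up for odd lengths is an assumption beyond the power-of-two case implied by the claim):

```python
def select_k_smallest(arr, k):
    """Repeat k times: halve the active prefix with pairwise min-comparators
    until its minimum reaches index 0, then move it to the end of the prefix."""
    a = list(arr)
    end = len(a)                 # active prefix is a[0:end]
    result = []
    for _ in range(k):
        length = end
        while length > 1:
            half = (length + 1) // 2
            # One parallel round: smaller values go to the lower half (s2),
            # larger values to the upper half (s1).
            for i in range(length - half):
                if a[i] > a[i + half]:
                    a[i], a[i + half] = a[i + half], a[i]
            length = half        # recurse into the smaller subsequence s2
        # Minimum of the active prefix is now at a[0]; park it at the end,
        # mirroring c_(l-1) in the claim, and shrink the prefix.
        a[0], a[end - 1] = a[end - 1], a[0]
        result.append(a[end - 1])
        end -= 1
    return result
```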
Finally, in task distribution, each thread block and each thread process the feature value matrix in a loop, and the computing tasks are distributed evenly across the thread blocks and threads to achieve load balancing; the task distribution for each thread block and each thread is computed with formulas (2), (3), and (4);
Step 4: after the traffic classifier of Step 3 is built, traffic classification can be performed in the GPU environment with the resulting classification model, and the classification results and performance are analysed.
2. The method according to claim 1, wherein generating the optimal feature set with the feature selection algorithm described in Step 2 means choosing the transport layer protocol, the sizes of the packet payload and the packet, the minimum, maximum, and average packet length, and the minimum, maximum, average, and variance of the packet inter-arrival time as the feature set, and constructing the feature vectors from them;
standardizing the feature values into the same interval means mapping the feature values into (−1, 1) according to formula (1), where M_i is the i-th vector of the feature vectors, avg(M_i) is the average of the feature values of the i-th vector, max(M_i) is the maximum of the i-th vector, and min(M_i) is the minimum of the i-th vector; each feature vector consists of the feature values of one group of features; the average of the i-th vector is the average of the feature values in that vector, its maximum is the largest feature value in that vector, and its minimum is the smallest feature value in that vector.
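The image of formula (1) is not reproduced in this text. A standardization consistent with the avg/max/min definitions above and with the stated (−1, 1) range is (v − avg(M_i)) / (max(M_i) − min(M_i)); the following Python sketch uses that form as an assumption:

```python
def normalize_feature(values):
    """Assumed form of formula (1): map each feature value into (-1, 1)
    by subtracting the mean and dividing by the value range."""
    avg = sum(values) / len(values)
    span = max(values) - min(values)
    # |v - avg| < span for every v, so every result lies in (-1, 1)
    return [(v - avg) / span for v in values]
```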
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610258008.7A CN105959175B (en) | 2016-04-21 | 2016-04-21 | Net flow assorted method based on the GPU kNN algorithm accelerated |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105959175A CN105959175A (en) | 2016-09-21 |
CN105959175B true CN105959175B (en) | 2019-10-22 |
Family
ID=56915406
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610258008.7A Expired - Fee Related CN105959175B (en) | 2016-04-21 | 2016-04-21 | Net flow assorted method based on the GPU kNN algorithm accelerated |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105959175B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106789349B (en) * | 2017-01-20 | 2020-04-07 | 南京邮电大学 | Quality of experience modeling analysis and conversation flow classification based method |
CN108600246B (en) * | 2018-05-04 | 2020-08-21 | 浙江工业大学 | Network intrusion detection parallelization acceleration method based on KNN algorithm |
CN109861862A (en) * | 2019-02-03 | 2019-06-07 | 江苏深度空间信息科技有限公司 | A kind of network flow search method, device, electronic equipment and storage medium |
CN109815075B (en) * | 2019-02-28 | 2020-07-03 | 苏州浪潮智能科技有限公司 | Method and device for detecting GPGPU (general purpose graphics processing unit) link speed |
CN111698178B (en) * | 2020-04-14 | 2022-08-30 | 新华三技术有限公司 | Flow analysis method and device |
CN112380003B (en) * | 2020-09-18 | 2021-09-17 | 北京大学 | High-performance parallel implementation device for K-NN on GPU processor |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8928658B2 (en) * | 2008-09-30 | 2015-01-06 | Microsoft Corporation | Photon mapping on graphics hardware using kd-trees |
CN103021017B (en) * | 2012-12-04 | 2015-05-20 | 上海交通大学 | Three-dimensional scene rebuilding method based on GPU acceleration |
WO2015077958A1 (en) * | 2013-11-28 | 2015-06-04 | 华为技术有限公司 | Method, apparatus and system for controlling service traffic |
CN103714185B (en) * | 2014-01-17 | 2017-02-01 | 武汉大学 | Subject event updating method base and urban multi-source time-space information parallel updating method |
CN104020983A (en) * | 2014-06-16 | 2014-09-03 | 上海大学 | KNN-GPU acceleration method based on OpenCL |
- 2016-04-21 CN CN201610258008.7A patent/CN105959175B/en not_active Expired - Fee Related
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20191022 ||