CN110113338A - Encrypted traffic feature extraction method based on feature fusion - Google Patents
Encrypted traffic feature extraction method based on feature fusion
- Publication number
- CN110113338A
- Authority
- CN
- China
- Prior art keywords
- burst
- feature
- data packet
- plen
- ptime
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/142—Network analysis or design using statistical or mathematical methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
- H04L47/2441—Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
- H04L47/2483—Traffic characterised by specific attributes, e.g. priority or QoS involving identification of individual flows
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/04—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
- H04L63/0428—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Computer Security & Cryptography (AREA)
- Mathematical Optimization (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Pure & Applied Mathematics (AREA)
- Physics & Mathematics (AREA)
- Mathematical Analysis (AREA)
- General Physics & Mathematics (AREA)
- Algebra (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention relates to an encrypted traffic feature extraction method based on feature fusion and belongs to the technical fields of machine learning, network service security, and traffic identification. The method comprises the following steps: step 1, extract feature values of different dimensions from the encrypted packets in an encrypted flow; step 2, compute the feature contribution degrees and normalize them, then perform feature selection based on the contribution degrees to determine the optimal number n of features to fuse, and select the top n features as the optimal features participating in fusion; step 3, group the features by dimension according to the optimal fusion feature number n, use a kernel function to raise the dimension of and fuse the optimal features selected in step 2, and output the final feature set used for classification. The method characterizes encrypted network traffic fingerprints more precisely, captures the relationships between different features, quickly determines the number of features to fuse (which improves the efficiency of feature fusion), and achieves higher accuracy.
Description
Technical field
The invention relates to an encrypted traffic feature extraction method based on feature fusion, and in particular to raising the dimension of and fusing traffic features of different dimensions in order to provide reliable high-dimensional features for identifying encrypted traffic. It belongs to the technical fields of machine learning, network service security, and traffic identification.
Background art
Traffic is the carrier of network information transmission. To protect user privacy, existing network transport protocols transmit data in encrypted form. Analyzing and identifying encrypted network traffic gives network service providers a theoretical basis for making better routing policies and improving the data distribution efficiency of critical transmission nodes, which in turn improves the experience of network users. Existing encrypted traffic identification methods rely on network flow features of a single dimension, such as packet length, packet flag-bit information, or packet time information. Features of a single dimension provide only limited help in identifying encrypted traffic; fusing features of different dimensions can improve the classification of encrypted network traffic.
Existing traffic identification methods fall into two categories: plaintext traffic identification and encrypted traffic identification. The main techniques used in plaintext traffic identification are deep packet inspection and port detection. With the adoption of encryption and port-hopping techniques, packets are encrypted during network communication, and deep packet inspection and port detection have gradually lost their effectiveness. Current research therefore focuses mainly on encrypted traffic identification.
For the classification and identification of encrypted application network traffic, the two most closely related prior documents that could be retrieved are as follows:
(1) Existing document A proposes an encrypted network flow identification method based on Markov chains. The method uses the flag-bit information of SSL/TLS encrypted packets to build Markov fingerprints of different encrypted applications. When classifying the encrypted traffic of an unknown application, it computes the probability that the unknown application belongs to each known application and decides the class of the unknown application by maximum likelihood. The number of flag-bit states available for building the Markov fingerprint is limited, so the fingerprints of different encrypted applications can be very similar, and partial overlap between the fingerprints of different applications occurs from time to time, which limits the accuracy of this kind of method in encrypted application identification.
(2) Existing document B proposes an encrypted traffic identification method based on packet length features. The method uses packet length statistics of each encrypted flow, such as the minimum, maximum, median, and mean, 54 statistical feature values in total, to build fingerprints of different encrypted applications, and then identifies encrypted traffic with a random forest classifier. As the number and variety of traffic types to be classified grow, the classification accuracy of this kind of method drops sharply.
In summary, existing encrypted traffic classification methods build encrypted application fingerprints from features of a single dimension. As the number of applications grows, fingerprints built from single-dimension features can no longer provide enough discriminating information, which reduces the classification accuracy for encrypted applications.
Summary of the invention
The object of the invention is to overcome the technical defects of existing encrypted traffic feature extraction methods, namely a small number of features and weak feature representation power, and to provide highly discriminative traffic features for identifying encrypted applications, thereby assisting the traffic classification of encrypted applications. To this end, features of different dimensions are extracted from the traffic and fused, and the fused features are used for classification to improve the classification result. An encrypted traffic feature extraction method based on feature fusion is proposed.
An encrypted traffic feature extraction method based on feature fusion comprises the following steps:
Step 1: extract feature values of different dimensions from the encrypted packets in an encrypted flow.
Specifically, an encrypted flow containing i packets is defined by its five-tuple and denoted flow = [pkt_1, …, pkt_i];
where the five-tuple consists of the source port, destination port, source IP, destination IP, and transport protocol, and pkt_i denotes the i-th packet;
the feature values of different dimensions comprise packet length statistical features, packet time information statistical features, and packet Burst behavior statistical features.
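The five-tuple flow definition above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Packet` record, its field names, and the choice to merge both directions of a connection into one flow are assumptions made for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet record; the field names are illustrative assumptions."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str
    length: int       # packet length in bytes
    timestamp: float  # capture time in seconds


def flow_key(pkt: Packet) -> tuple:
    """Build a direction-independent five-tuple key (assumption: both
    directions of a connection belong to the same flow)."""
    a = (pkt.src_ip, pkt.src_port)
    b = (pkt.dst_ip, pkt.dst_port)
    lo, hi = sorted([a, b])
    return (lo, hi, pkt.protocol)


def group_flows(packets):
    """Return {five-tuple key: [pkt_1, ..., pkt_i]} with packets in capture order."""
    flows = defaultdict(list)
    for pkt in sorted(packets, key=lambda p: p.timestamp):
        flows[flow_key(pkt)].append(pkt)
    return dict(flows)
```

Each value of the returned dictionary plays the role of flow = [pkt_1, …, pkt_i] in the later steps.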
Step 1 comprises the following sub-steps:
Step 1.1: compute length statistical features for the captured packets.
The packet length statistical features cover three directions;
each direction has 19 statistical features, so the three directions give 57 packet length statistical features in total, denoted Plen = [[plen_1], …, [plen_57]].
The statistical features for each direction are the minimum Lminimum, maximum Lmaximum, mean Lmean, median absolute deviation Lmedian_absolute_deviation, standard deviation Lstandard_deviation, variance Lvar, skewness Lskew, kurtosis Lkurtosis, the percentiles Lpercentiles10%, Lpercentiles20%, Lpercentiles30%, Lpercentiles40%, Lpercentiles50%, Lpercentiles60%, Lpercentiles70%, Lpercentiles80%, and Lpercentiles90%, the number of packets in the sequence Lnumbers, and the sum of packet lengths Lsum.
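As an illustration of the 19 per-direction statistics listed above, the following sketch computes them for one non-empty sequence of packet lengths using only the Python standard library. The population variance, the excess-kurtosis convention, and the linearly interpolated percentiles are assumptions; the text does not specify which estimator variants are used.

```python
import math
import statistics


def percentile(sorted_vals, q):
    """Linearly interpolated q-th percentile (0-100) of an already sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def length_stats(lengths):
    """Return the 19 statistics of step 1.1 for one direction's packet lengths."""
    vals = sorted(float(v) for v in lengths)
    n = len(vals)
    mean = sum(vals) / n
    var = sum((v - mean) ** 2 for v in vals) / n  # population variance (assumption)
    std = math.sqrt(var)
    med = statistics.median(vals)
    mad = statistics.median(abs(v - med) for v in vals)
    # Standardized third and fourth moments; set to 0 for a constant sequence.
    skew = sum((v - mean) ** 3 for v in vals) / (n * std ** 3) if std else 0.0
    kurt = sum((v - mean) ** 4 for v in vals) / (n * std ** 4) - 3.0 if std else 0.0
    feats = {
        "minimum": vals[0], "maximum": vals[-1], "mean": mean,
        "median_absolute_deviation": mad, "standard_deviation": std,
        "var": var, "skew": skew, "kurtosis": kurt,
    }
    for q in range(10, 100, 10):
        feats[f"percentiles{q}%"] = percentile(vals, q)
    feats["numbers"] = n
    feats["sum"] = sum(vals)
    return feats
```

Calling `length_stats` once per direction and concatenating the three results would give the 57 values of Plen.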
Step 1.2: compute time information statistical features for the captured packets.
The packet time information statistical features cover three directions;
each direction has 18 statistical features, so the three directions give 54 packet time information statistical features in total, denoted Ptime = [[ptime_1], …, [ptime_54]].
The statistical features for each direction are the minimum Tminimum, maximum Tmaximum, mean Tmean, median absolute deviation Tmedian_absolute_deviation, standard deviation Tstandard_deviation, variance Tvar, skewness Tskew, kurtosis Tkurtosis, the percentiles Tpercentiles10%, Tpercentiles20%, Tpercentiles30%, Tpercentiles40%, Tpercentiles50%, Tpercentiles60%, Tpercentiles70%, Tpercentiles80%, and Tpercentiles90%, and the number of elements in the sequence Tnumbers.
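The text does not define the "time information" sequence or name the three directions. The sketch below assumes, purely for illustration, that the time information is the packet inter-arrival time and that the three directions are incoming, outgoing, and both combined; the direction labels `"in"` and `"out"` are hypothetical.

```python
def directional_iat(packets):
    """packets: list of (timestamp_seconds, direction), direction 'in' or 'out'.

    Returns the inter-arrival time sequence for each of three directions
    (incoming, outgoing, both). The choice of these three directions and of
    inter-arrival time as the time information is an assumption.
    """
    def gaps(times):
        times = sorted(times)
        return [b - a for a, b in zip(times, times[1:])]

    return {
        "in": gaps([t for t, d in packets if d == "in"]),
        "out": gaps([t for t, d in packets if d == "out"]),
        "both": gaps([t for t, _ in packets]),
    }
```

The 18 statistics of step 1.2 (the same list as step 1.1 without the sum) would then be computed over each of the three returned sequences.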
Step 1.3: compute Burst behavior statistical features for the captured packets.
A Burst is a run of packets sent consecutively in the same direction within a flow.
The Burst behavior statistical features cover Burst Size and Burst Length: Burst Size is the number of packets in a Burst, and Burst Length is the sum of the lengths of all packets in a Burst.
Burst Size and Burst Length are each computed for the Ingress Burst direction and the Egress Burst direction; the statistical features of these four combinations total 72 and are denoted PBurst = [[burst_1], …, [burst_72]].
The statistical features for each combination are the minimum Bminimum, maximum Bmaximum, mean Bmean, median absolute deviation Bmedian_absolute_deviation, standard deviation Bstandard_deviation, variance Bvariance, skewness Bskew, kurtosis Bkurtosis, the percentiles Bpercentiles10%, Bpercentiles20%, Bpercentiles30%, Bpercentiles40%, Bpercentiles50%, Bpercentiles60%, Bpercentiles70%, Bpercentiles80%, and Bpercentiles90%, and the number of elements in the sequence Bnumbers, 18 in total. Burst Size and Burst Length over the Ingress Burst and Egress Burst directions therefore yield 4 × 18 = 72 statistical features.
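The Burst definition above maps naturally onto run-length grouping. The following sketch splits a flow into Bursts and collects the four sequences (Ingress/Egress × Size/Length) over which the 18 statistics are taken. The `(length, direction)` tuple layout and the direction labels are assumptions for the example.

```python
from itertools import groupby


def extract_bursts(packets):
    """packets: list of (length_bytes, direction) in capture order, with
    direction 'ingress' or 'egress'.
    Returns one (direction, burst_size, burst_length) tuple per Burst."""
    bursts = []
    for direction, run in groupby(packets, key=lambda p: p[1]):
        lengths = [length for length, _ in run]
        bursts.append((direction, len(lengths), sum(lengths)))
    return bursts


def burst_sequences(packets):
    """Split the Bursts into the four sequences used for the Burst statistics."""
    seqs = {"ingress_size": [], "ingress_length": [],
            "egress_size": [], "egress_length": []}
    for direction, size, length in extract_bursts(packets):
        seqs[f"{direction}_size"].append(size)
        seqs[f"{direction}_length"].append(length)
    return seqs
```

For example, two egress packets followed by three ingress packets and one egress packet form three Bursts of sizes 2, 3, and 1.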
Step 2: compute the feature contribution degrees and normalize them, then perform feature selection based on the contribution degrees to determine the optimal number n of features to fuse, and select the top n features as the optimal features participating in fusion. This step comprises the following sub-steps:
Step 2.1: compute the feature contribution degrees.
The contribution degree VIM_i of each feature is computed using the Gini index in a random forest;
where each feature is one element of Plen = [[plen_1], …, [plen_57]], Ptime = [[ptime_1], …, [ptime_54]], or PBurst = [[burst_1], …, [burst_72]] computed in steps 1.1, 1.2, and 1.3;
i denotes the i-th feature and ranges from 1 to c, where c = 183 is the sum of 57, 54, and 72, the numbers of features in Plen, Ptime, and PBurst respectively.
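A Gini-based contribution degree in a random forest is commonly obtained by accumulating, for each feature, the Gini impurity decrease of every tree node that splits on that feature. The sketch below shows only that building block for a single hypothetical split; it is not a full random forest, and the function names are illustrative.

```python
from collections import Counter


def gini(labels):
    """Gini impurity 1 - sum_k p_k^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_decrease(values, labels, threshold):
    """Weighted Gini impurity decrease when splitting samples on value <= threshold."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted
```

A split that separates the classes perfectly yields the largest decrease, so a feature that enables many such splits receives a high contribution degree.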
Step 2.2: normalize the feature contribution degrees VIM_j computed in step 2.1 according to formula (1):
$$\mathrm{VIM}_j \leftarrow \frac{\mathrm{VIM}_j}{\sum_{i=1}^{c} \mathrm{VIM}_i} \qquad (1)$$
where c is the total number of features and VIM_i is the contribution degree of the i-th feature.
Step 2.3: compute the feature selection criterion value CFC.
Sort the normalized contribution degrees from step 2.2 in descending order and compute the feature selection criterion value CFC of each feature according to formula (2),
where CFC_j denotes the feature selection criterion value of the j-th feature and j ranges from 1 to c, with c = 183.
Step 2.4: using the CFC values computed in step 2.3, plot the CFC value against the feature index j, find the inflection point of the curve, and denote the j at this inflection point as n; this n is the optimal number of features participating in fusion.
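The selection procedure of step 2 can be sketched as follows. Two parts are assumptions and are labeled as such in the code: formula (2) is not reproduced in this text, so a cumulative sum of the sorted contributions stands in for the CFC curve, and the inflection point is located with a simple farthest-from-chord heuristic rather than by reading it off a plot.

```python
def normalize(vim):
    """Normalization of step 2.2: divide each contribution by the sum over all features."""
    total = sum(vim.values())
    return {name: v / total for name, v in vim.items()}


def cumulative_curve(norm_vim):
    """Sort contributions in descending order and accumulate them.
    ASSUMPTION: this cumulative sum is only a stand-in for the CFC of formula (2)."""
    ranked = sorted(norm_vim.items(), key=lambda kv: kv[1], reverse=True)
    curve, running = [], 0.0
    for _, v in ranked:
        running += v
        curve.append(running)
    return [name for name, _ in ranked], curve


def knee_index(curve):
    """Return the 1-based j whose point lies farthest from the chord joining the
    first and last points of the curve (ASSUMPTION: a simple knee heuristic;
    the distance is left unnormalized because only the argmax matters)."""
    m = len(curve)
    x0, y0, x1, y1 = 1, curve[0], m, curve[-1]

    def dist(j, y):
        return abs((y1 - y0) * (j - x0) - (x1 - x0) * (y - y0))

    return max(range(1, m + 1), key=lambda j: dist(j, curve[j - 1]))
```

With the ranked names and n = `knee_index(curve)`, the first n names are the features that participate in fusion.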
Step 3: group the features by dimension according to the optimal fusion feature number n, use a kernel function to raise the dimension of and fuse the optimal features selected in step 2, and output the final feature set used for classification.
Step 3 comprises the following sub-steps:
Step 3.1: group the selected features by dimension according to the optimal fusion feature number n obtained in step 2.
The features of different dimensions comprise packet length features, packet time information features, and packet Burst behavior features, numbering i, j, and k respectively. The grouped packet length features are denoted Plen = [[plen_1], …, [plen_i]], the packet time information features Ptime = [[ptime_1], …, [ptime_j]], and the packet Burst behavior features Burst = [[burst_1], …, [burst_k]].
Plen = [[plen_1], …, [plen_i]] replaces the packet length statistical features Plen = [[plen_1], …, [plen_57]] from step 1; Ptime = [[ptime_1], …, [ptime_j]] replaces Ptime = [[ptime_1], …, [ptime_54]] from step 1; and Burst = [[burst_1], …, [burst_k]] replaces the packet Burst behavior statistical features PBurst = [[burst_1], …, [burst_72]] from step 1.
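The grouping of step 3.1 can be sketched as a simple partition of the selected feature names. The sketch assumes that feature names carry a `plen_`, `ptime_`, or `burst_` prefix, in the style of names such as `plen_18` and `burst_11`; the function name is hypothetical.

```python
def group_by_dimension(selected_names):
    """Split the n selected feature names into Plen, Ptime, and Burst groups.
    ASSUMPTION: names carry a 'plen_', 'ptime_' or 'burst_' prefix."""
    groups = {"Plen": [], "Ptime": [], "Burst": []}
    prefixes = {"plen_": "Plen", "ptime_": "Ptime", "burst_": "Burst"}
    for name in selected_names:
        for prefix, group in prefixes.items():
            if name.startswith(prefix):
                groups[group].append(name)
                break
        else:
            raise ValueError(f"unknown feature dimension: {name}")
    return groups
```

The sizes of the three groups are the counts i, j, and k, and they always sum to n.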
Step 3.2: fuse the features within each single dimension using a kernel function, that is, raise the dimension of each single-dimension feature group. Specifically, let x denote the features of any one dimension in the feature set f = [Plen, Ptime, Burst]. First compute the transpose x′ of x according to (3), where x is an n×1 matrix and x′ is a 1×n matrix:
$$x' = x^{T} \qquad (3)$$
Then raise the feature dimension using the radial basis kernel function (4),
where K(x, x′) is an n×n matrix and δ ∈ (0, 1).
After step 3.2, the feature counts of Plen, Ptime, and Burst change from i, j, and k to i², j², and k² respectively.
Step 3.3: fuse the i², j², and k² features obtained after the dimension raising of step 3.2. Specifically, traverse the dimension-raised Plen, Ptime, and Burst in turn, append their elements to the Feature matrix, and return Feature as the final feature set used for classification.
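Steps 3.2 and 3.3 can be sketched together as follows. The exact form of the radial basis kernel is not reproduced in this text, so the sketch assumes the standard Gaussian form K[a][b] = exp(-(x_a - x_b)² / (2δ²)) applied to every pair of feature values, which produces the n×n matrix described above. The default `delta=0.5` and the function names are illustrative.

```python
import math


def rbf_lift(x, delta=0.5):
    """Map n feature values to the n*n matrix K[a][b] = exp(-(x_a - x_b)^2 / (2*delta^2)).
    ASSUMPTION: standard Gaussian form of the radial basis kernel, delta in (0, 1)."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * delta ** 2)) for b in x] for a in x]


def fuse(plen, ptime, burst, delta=0.5):
    """Lift each dimension separately, then append all elements to one feature list."""
    feature = []
    for group in (plen, ptime, burst):
        for row in rbf_lift(group, delta):
            feature.extend(row)
    return feature
```

With group sizes i, j, and k, the fused list has i² + j² + k² entries. Because the kernel depends on differences between raw values, scaling the features to a comparable range beforehand is likely to matter in practice.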
Beneficial effects
The invention proposes an encrypted traffic feature extraction method based on feature fusion. Compared with existing encrypted traffic feature extraction methods, it has the following beneficial effects:
1. The invention introduces statistics of packet length, packet time information, and packet Burst behavior in network flows, extracting the feature set of encrypted network traffic from multiple dimensions, which characterizes encrypted traffic fingerprints more precisely;
2. The invention uses a radial basis kernel function to increase the number of features and to capture the relationships between different features;
3. The invention designs a measure for the optimal number of fusion features, embodied in step 2.3. Selecting the features that participate in fusion reduces interference from useless features and quickly determines the number of features to fuse, improving the efficiency of feature fusion;
4. Extensive experiments show that, compared with existing encrypted network traffic classification and identification methods, a classifier using the fused features achieves higher accuracy.
Brief description of the drawings
Fig. 1 is the overall flow chart of the encrypted traffic feature extraction method based on feature fusion of the invention;
Fig. 2 is a schematic diagram of the Burst behavior in step 1 of the encrypted traffic feature extraction method based on feature fusion of the invention;
Fig. 3 is a schematic diagram of how the CFC value varies with the number of features in step 2 of the encrypted traffic feature extraction method based on feature fusion of the invention.
Detailed description of the embodiments
The process of the encrypted traffic feature extraction method based on feature fusion, together with its advantages, is further described below with reference to the drawings and embodiments. It should be noted that implementation of the invention is not limited to the following embodiments; any adaptation or variation of the invention in any form falls within its scope of protection.
Embodiment 1
This embodiment is a complete simulation of encrypted traffic feature extraction following steps 1 to 3 of the invention. The overall flow is shown in Fig. 1. Dataset Collection is the data acquisition phase, in which traffic from websites that transmit data over encrypted protocols, such as Taobao and JD.com, can be collected. Feature extraction is then performed, followed by feature selection and feature fusion, and the fused features are finally used by a machine learning classifier. Features of different dimensions are extracted and raised in dimension with a radial basis kernel function to obtain the final feature set used for classification.
Traffic transmitted over encrypted protocols by Taobao, JD.com, NetEase Cloud, Amazon, Alipay, WeChat, and others is collected and split into flows by five-tuple. Specifically:
First, the statistical feature values of packet length, packet time information, and packet Burst behavior are extracted; the detailed process is shown in Fig. 1. Suppose a captured flow is denoted F = (p_1, …, p_n). From this flow, extract the packet length statistical features Plen = [[plen_1], …, [plen_57]], the packet time information statistical features Ptime = [[ptime_1], …, [ptime_54]], and the packet Burst behavior statistical features Burst = [[burst_1], …, [burst_72]]. Burst behavior is illustrated in Fig. 2: the Bursts in a flow occur in two directions, Ingress Burst and Egress Burst; Burst Size is the number of packets in a Burst, and Burst Length is the sum of the packet lengths in a Burst.
The contribution degrees of these features are computed using the Gini index in a random forest; the contribution degrees of some features are listed in Table 1. The CFC value as a function of the number of features is computed from the contribution degrees and the feature ranks after sorting, and is plotted in Fig. 3. The inflection point of the curve is chosen as the optimal number of fusion features; in this example, 120 is chosen.
Table 1. Contribution degrees of selected features
Feature | Contribution degree | Feature | Contribution degree |
---|---|---|---|
plen_18 | 0.030011 | burst_11 | 0.016430 |
plen_38 | 0.027685 | plen_35 | 0.015731 |
plen_55 | 0.025450 | burst_17 | 0.015577 |
plen_47 | 0.018072 | plen_33 | 0.015150 |
plen_34 | 0.017442 | plen_40 | 0.014951 |
plen_42 | 0.016791 | burst_16 | 0.014811 |
The selected features are then raised in dimension and fused according to the method of step 3, and the fused features are used for traffic classification.
Embodiment 2
This embodiment feeds the traffic features extracted by the method of the invention into a machine learning classifier and compares the result with classifiers that use single-dimension features, to verify the advantages and effectiveness of the invention. The encrypted traffic feature extraction method based on feature fusion is combined with a conventional machine learning algorithm, random forest, to form the classifier of this method, denoted FFP.
The methods compared are a Markov classifier that uses only packet flag bits as features (MARK) and a random forest classifier that uses only packet length as features (APPS). The comparison metrics are classifier accuracy (Accuracy) and F1-Score; F1-Score evaluates a classifier by jointly considering precision (Precision) and recall (Recall). The comparison results are shown in Table 2.
Table 2. Classification performance compared with state-of-the-art traffic classification models
Classification method | MARK | APPS | FFP |
---|---|---|---|
Accuracy | 0.5879 | 0.8080 | 0.9181 |
F1-Score | 0.5665 | 0.7977 | 0.9175 |
Table 2 shows that the invention has a clear advantage over existing traffic classification methods: the accuracy (0.9181) and F1-Score (0.9175) of FFP are both higher than those of MARK (0.5879 and 0.5665) and APPS (0.8080 and 0.7977). The invention extracts good traffic features from traffic encrypted with cryptographic protocols, helps improve classification accuracy, and can be put into practical use for encrypted traffic classification and detection.
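The two metrics reported in Table 2 can be computed as in the sketch below. The text does not say how F1-Score is averaged over the application classes, so macro averaging is assumed here; the function names are illustrative.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean over classes of F1 = 2PR / (P + R).
    ASSUMPTION: macro averaging; the averaging mode is not stated in the text."""
    scores = []
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)
```

Both functions take the true application labels and the classifier's predicted labels for the same test flows.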
Although embodiments of this patent are described herein with reference to the drawings, a person skilled in the art may make several improvements without departing from the principle of this patent, and these improvements also fall within the scope of protection of this patent.
Claims (7)
1. An encrypted traffic feature extraction method based on feature fusion, characterized by comprising the following steps:
Step 1: extract feature values of different dimensions from the encrypted packets in an encrypted flow.
Specifically, an encrypted flow containing i packets is defined by its five-tuple and denoted flow = [pkt_1, …, pkt_i], where pkt_i denotes the i-th packet;
the feature values of different dimensions comprise packet length statistical features, packet time information statistical features, and packet Burst behavior statistical features.
Step 1 comprises the following sub-steps:
Step 1.1: compute length statistical features for the captured packets.
The packet length statistical features cover three directions;
each direction has 19 statistical features, so the three directions give 57 packet length statistical features in total, denoted Plen = [[plen_1], …, [plen_57]].
Step 1.2: compute time information statistical features for the captured packets.
The packet time information statistical features cover three directions;
each direction has 18 statistical features, so the three directions give 54 packet time information statistical features in total, denoted Ptime = [[ptime_1], …, [ptime_54]].
Step 1.3: compute Burst behavior statistical features for the captured packets.
A Burst is a run of packets sent consecutively in the same direction within a flow.
The Burst behavior statistical features cover Burst Size and Burst Length: Burst Size is the number of packets in a Burst, and Burst Length is the sum of the lengths of all packets in a Burst.
The statistical features of Burst Size and Burst Length over the Ingress Burst direction and the Egress Burst direction total 72 and are denoted PBurst = [[burst_1], …, [burst_72]].
Step 2: compute the feature contribution degrees and normalize them, then perform feature selection based on the contribution degrees to determine the optimal number n of features to fuse, and select the top n features as the optimal features participating in fusion. This step comprises the following sub-steps:
Step 2.1: compute the feature contribution degrees.
The contribution degree VIM_i of each feature is computed using the Gini index in a random forest;
where i denotes the i-th feature and ranges from 1 to c, and c = 183 is the sum of 57, 54, and 72, the numbers of features in Plen, Ptime, and PBurst respectively.
Step 2.2: normalize the feature contribution degrees VIM_j computed in step 2.1 according to formula (1):
$$\mathrm{VIM}_j \leftarrow \frac{\mathrm{VIM}_j}{\sum_{i=1}^{c} \mathrm{VIM}_i} \qquad (1)$$
where c is the total number of features and VIM_i is the contribution degree of the i-th feature.
Step 2.3: compute the feature selection criterion value CFC.
Sort the normalized contribution degrees from step 2.2 in descending order and compute the feature selection criterion value CFC of each feature according to formula (2),
where CFC_j denotes the feature selection criterion value of the j-th feature and j ranges from 1 to c, with c = 183.
Step 2.4: using the CFC values computed in step 2.3, plot the CFC value against the feature index j, find the inflection point of the curve, and denote the j at this inflection point as n; this n is the optimal number of features participating in fusion.
Step 3: group the features by dimension according to the optimal fusion feature number n, use a kernel function to raise the dimension of and fuse the optimal features selected in step 2, and output the final feature set used for classification.
Step 3 comprises the following sub-steps:
Step 3.1: group the selected features by dimension according to the optimal fusion feature number n obtained in step 2.
The features of different dimensions comprise packet length features, packet time information features, and packet Burst behavior features, numbering i, j, and k respectively. The grouped packet length features are denoted Plen = [[plen_1], …, [plen_i]], the packet time information features Ptime = [[ptime_1], …, [ptime_j]], and the packet Burst behavior features Burst = [[burst_1], …, [burst_k]].
Step 3.2: fuse the features within each single dimension using a kernel function, that is, raise the dimension of each single-dimension feature group. Specifically, let x denote the features of any one dimension in the feature set f = [Plen, Ptime, Burst]. First compute the transpose x′ of x according to (3), where x is an n×1 matrix and x′ is a 1×n matrix:
$$x' = x^{T} \qquad (3)$$
Then raise the feature dimension using the radial basis kernel function (4),
where K(x, x′) is an n×n matrix and δ ∈ (0, 1).
After step 3.2, the feature counts of Plen, Ptime, and Burst change from i, j, and k to i², j², and k² respectively.
Step 3.3: fuse the i², j², and k² features obtained after the dimension raising of step 3.2. Specifically, traverse the dimension-raised Plen, Ptime, and Burst in turn, append their elements to the Feature matrix, and return Feature as the final feature set used for classification.
2. a kind of encryption traffic characteristic extracting method based on Fusion Features according to claim 1, it is characterised in that: step
Five-tuple in rapid 1 refers to source port, destination port, source IP, destination IP and transport protocol.
3. a kind of encryption traffic characteristic extracting method based on Fusion Features according to claim 1, it is characterised in that: step
The statistical characteristics in each direction includes minimum value Lminimum, maximum value Lmaximum, average value Lmean, middle position in rapid 1.1
Number absolute deviation Lmedian_absolute_deviation, standard deviation Lstandard deviation, variance Lvar, slope
Lskew, kurtosis Lkurtosis, percentile Lpercentiles10%, Lpercentiles20%,
Lpercentiles30%, Lpercentiles40%, Lpercentiles50%, Lpercentiles60%,
Lpercentiles70%, Lpercentiles80%, Lpercentiles90%, the data packet number Lnumbers in sequence
With the sum of data packet length Lsum.
4. The encrypted traffic feature extraction method based on feature fusion according to claim 1, characterized in that: in step 1.2, the statistical features in each direction include the minimum Tminimum, maximum Tmaximum, mean Tmean, median absolute deviation Tmedian_absolute_deviation, standard deviation Tstandard_deviation, variance Tvar, skewness Tskew, kurtosis Tkurtosis, percentiles Tpercentiles10%, Tpercentiles20%, Tpercentiles30%, Tpercentiles40%, Tpercentiles50%, Tpercentiles60%, Tpercentiles70%, Tpercentiles80%, Tpercentiles90%, and the number of elements in the sequence Tnumbers.
5. The encrypted traffic feature extraction method based on feature fusion according to claim 1, characterized in that: for the Burst Size and Burst Length described in step 1.3, the statistical features of the Ingress Burst direction and the Egress Burst direction are considered; in the four directions, the statistical features in each direction include the minimum Bminimum, maximum Bmaximum, mean Bmean, median absolute deviation Bmedian_absolute_deviation, standard deviation Bstandard_deviation, variance Bvariance, skewness Bskew, kurtosis Bkurtosis, percentiles Bpercentiles10%, Bpercentiles20%, Bpercentiles30%, Bpercentiles40%, Bpercentiles50%, Bpercentiles60%, Bpercentiles70%, Bpercentiles80%, Bpercentiles90%, and the number of elements in the sequence Bnumbers, 18 in total.
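The four Burst sequences that these statistics are computed over might be built as in the sketch below. Step 1.3 is not reproduced here, so this assumes the definition common in the traffic-fingerprinting literature: a Burst is a maximal run of consecutive packets in the same direction, Burst Size is the total bytes in the run, and Burst Length is the number of packets in the run. The function name `burst_sequences`, the direction labels and the sample packets are hypothetical.

```python
from itertools import groupby


def burst_sequences(packets):
    """Split a flow into bursts and collect size/length sequences per direction.

    packets: list of (direction, length) tuples in capture order, where
    direction is "ingress" or "egress".
    """
    seqs = {
        "ingress_size": [],
        "ingress_length": [],
        "egress_size": [],
        "egress_length": [],
    }
    # groupby yields one group per run of consecutive equal directions.
    for direction, run in groupby(packets, key=lambda p: p[0]):
        lengths = [length for _, length in run]
        seqs[f"{direction}_size"].append(sum(lengths))     # Burst Size in bytes
        seqs[f"{direction}_length"].append(len(lengths))   # Burst Length in packets
    return seqs


# Hypothetical flow: two egress packets, three ingress packets, one egress packet.
flow = [
    ("egress", 100), ("egress", 200),
    ("ingress", 1500), ("ingress", 1500), ("ingress", 400),
    ("egress", 60),
]
seqs = burst_sequences(flow)
print(seqs)
```

Applying the 18 statistics of this claim to each of the four sequences gives 4 × 18 = 72 values, matching the 72-element PBurst in claim 6.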
6. The encrypted traffic feature extraction method based on feature fusion according to claim 1, characterized in that: in step 2.1, each kind of feature refers to one of Plen=[[plen_1],…,[plen_57]], Ptime=[[ptime_1],…,[ptime_54]] and PBurst=[[burst_1],…,[burst_72]] computed in step 1.1, step 1.2 and step 1.3.
7. The encrypted traffic feature extraction method based on feature fusion according to claim 1, characterized in that: in step 3.1, Plen=[[plen_1],…,[plen_i]] updates and replaces the data packet length statistical feature Plen=[[plen_1],…,[plen_57]] from step 1; Ptime=[[ptime_1],…,[ptime_j]] updates and replaces Ptime=[[ptime_1],…,[ptime_54]] from step 1; and Burst=[[burst_1],…,[burst_k]] updates and replaces the data packet Burst behavior statistical feature PBurst=[[burst_1],…,[burst_72]] from step 1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910379472.5A CN110113338B (en) | 2019-05-08 | 2019-05-08 | Encrypted flow characteristic extraction method based on characteristic fusion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110113338A (en) | 2019-08-09 |
CN110113338B (en) | 2020-06-26 |
Family
ID=67488756
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910379472.5A Active CN110113338B (en) | 2019-05-08 | 2019-05-08 | Encrypted flow characteristic extraction method based on characteristic fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110113338B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110751222A (en) * | 2019-10-25 | 2020-02-04 | 中国科学技术大学 | Online encrypted traffic classification method based on CNN and LSTM |
CN110958233A (en) * | 2019-11-22 | 2020-04-03 | 上海交通大学 | Encryption type malicious flow detection system and method based on deep learning |
CN111526100A (en) * | 2020-04-16 | 2020-08-11 | 中南大学 | Cross-network traffic identification method and device based on dynamic identification and path hiding |
CN112001452A (en) * | 2020-08-27 | 2020-11-27 | 深圳前海微众银行股份有限公司 | Feature selection method, device, equipment and readable storage medium |
CN114363061A (en) * | 2021-12-31 | 2022-04-15 | 深信服科技股份有限公司 | Abnormal flow detection method, system, storage medium and terminal |
CN116016365A (en) * | 2023-01-06 | 2023-04-25 | 哈尔滨工业大学 | Webpage identification method based on data packet length information under encrypted flow |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104135385A (en) * | 2014-07-30 | 2014-11-05 | 南京市公安局 | Method of application classification in Tor anonymous communication flow |
US20180260705A1 (en) * | 2017-03-05 | 2018-09-13 | Verint Systems Ltd. | System and method for applying transfer learning to identification of user actions |
CN108650194A (en) * | 2018-05-14 | 2018-10-12 | 南开大学 | Net flow assorted method based on K_means and KNN blending algorithms |
CN109194657A (en) * | 2018-09-11 | 2019-01-11 | 北京理工大学 | A kind of encrypting web traffic characteristic extracting method based on accumulation data packet length |
CN109286576A (en) * | 2018-10-10 | 2019-01-29 | 北京理工大学 | A kind of network agent encryption traffic characteristic extracting method of data packet frequency analysis |
Non-Patent Citations (1)
Title |
---|
KHALED AL-NAAMI et al.: "Adaptive encrypted traffic fingerprinting with bi-directional dependence", ACSAC '16: PROCEEDINGS OF THE 32ND ANNUAL CONFERENCE ON COMPUTER SECURITY * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110751222A (en) * | 2019-10-25 | 2020-02-04 | 中国科学技术大学 | Online encrypted traffic classification method based on CNN and LSTM |
CN110958233A (en) * | 2019-11-22 | 2020-04-03 | 上海交通大学 | Encryption type malicious flow detection system and method based on deep learning |
CN110958233B (en) * | 2019-11-22 | 2021-08-20 | 上海交通大学 | Encryption type malicious flow detection system and method based on deep learning |
CN111526100A (en) * | 2020-04-16 | 2020-08-11 | 中南大学 | Cross-network traffic identification method and device based on dynamic identification and path hiding |
CN111526100B (en) * | 2020-04-16 | 2021-08-24 | 中南大学 | Cross-network traffic identification method and device based on dynamic identification and path hiding |
CN112001452A (en) * | 2020-08-27 | 2020-11-27 | 深圳前海微众银行股份有限公司 | Feature selection method, device, equipment and readable storage medium |
CN112001452B (en) * | 2020-08-27 | 2021-08-27 | 深圳前海微众银行股份有限公司 | Feature selection method, device, equipment and readable storage medium |
CN114363061A (en) * | 2021-12-31 | 2022-04-15 | 深信服科技股份有限公司 | Abnormal flow detection method, system, storage medium and terminal |
CN116016365A (en) * | 2023-01-06 | 2023-04-25 | 哈尔滨工业大学 | Webpage identification method based on data packet length information under encrypted flow |
CN116016365B (en) * | 2023-01-06 | 2023-09-19 | 哈尔滨工业大学 | Webpage identification method based on data packet length information under encrypted flow |
Also Published As
Publication number | Publication date |
---|---|
CN110113338B (en) | 2020-06-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110113338A (en) | A kind of encryption traffic characteristic extracting method based on Fusion Features | |
CN111340191B (en) | Bot network malicious traffic classification method and system based on ensemble learning | |
CN112235264B (en) | Network traffic identification method and device based on deep migration learning | |
CN108768986B (en) | Encrypted traffic classification method, server and computer readable storage medium | |
Gogoi et al. | MLH-IDS: a multi-level hybrid intrusion detection method | |
CN104244035B (en) | Network video stream sorting technique based on multi-level clustering | |
CN104135385B (en) | Method of application classification in Tor anonymous communication flow | |
Wang et al. | A deep hierarchical network for packet-level malicious traffic detection | |
CN105871619B (en) | A kind of flow load type detection method based on n-gram multiple features | |
CN113364787B (en) | Botnet flow detection method based on parallel neural network | |
Ahn et al. | Explaining deep learning-based traffic classification using a genetic algorithm | |
CN110958233B (en) | Encryption type malicious flow detection system and method based on deep learning | |
CN110611640A (en) | DNS protocol hidden channel detection method based on random forest | |
Niu et al. | A heuristic statistical testing based approach for encrypted network traffic identification | |
Liu et al. | A distance-based method for building an encrypted malware traffic identification framework | |
CN109286576A (en) | A kind of network agent encryption traffic characteristic extracting method of data packet frequency analysis | |
Dowoo et al. | PcapGAN: Packet capture file generator by style-based generative adversarial networks | |
Lu et al. | A heuristic-based co-clustering algorithm for the internet traffic classification | |
CN108123962A (en) | A kind of method that BFS algorithms generation attack graph is realized using Spark | |
Zheng et al. | Two-layer detection framework with a high accuracy and efficiency for a malware family over the TLS protocol | |
Chung et al. | An effective similarity metric for application traffic classification | |
CN113254743B (en) | Security semantic perception searching method for dynamic spatial data in Internet of vehicles | |
CN107124410A (en) | Network safety situation feature clustering method based on machine deep learning | |
CN106557983A (en) | A kind of microblogging junk user detection method based on fuzzy multiclass SVM | |
Lu et al. | Cascaded classifier for improving traffic classification accuracy |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||