CN103780501B - Peer-to-peer network traffic identification method of inseparable-wavelet support vector machine - Google Patents
Peer-to-peer network traffic identification method of inseparable-wavelet support vector machine
- Publication number
- CN103780501B (application CN201410017016.3A)
- Authority
- CN
- China
- Prior art keywords
- training
- sample
- wsvm
- flow
- peer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention relates to a peer-to-peer (P2P) network traffic identification method based on an inseparable-wavelet support vector machine (SVM). The method includes the following steps: (1) selecting a feature vector, namely the three-dimensional feature vector Vector=<v1, v2, v3>, wherein v1 represents the mean-square value of the change in data packet size, v2 represents the ratio of the uplink rate to the downlink rate at a node, and v3 represents the ratio of the number of IP addresses to the number of ports; (2) selecting an appropriate kernel function; (3) selecting an incremental training algorithm; (4) applying a Boosting algorithm to wavelet-SVM P2P traffic identification, which finally obtains a strong classifier H(x) by weighted voting and uses it to identify P2P traffic. The method can identify P2P network traffic efficiently, so that countermeasures can be taken in time and P2P network traffic can be controlled effectively.
Description
Technical field
The present invention relates to a peer-to-peer network traffic identification method based on an inseparable-wavelet support vector machine, and belongs to the technical field of computer peer-to-peer networking.
Background art
Peer-to-peer computing (P2P) technology has developed very rapidly. As a new mode of network communication, P2P has been listed as one of the technologies that will shape the future development of the Internet. Together with grid computing and cloud computing, it has become a research focus in the field of distributed computing and is receiving increasing attention from researchers. P2P technology still has no precise definition, but its ideas have changed how people understand the Internet. The biggest difference between a P2P network and a traditional network is that P2P lets two users connect to each other directly to transfer and share files, which changes the server/client transmission model of traditional networks. A node that requests a resource is at the same time a provider of that resource, so the more nodes request the same resource, the faster it downloads, which significantly improves the speed and efficiency of data transfer.
The rapid development of P2P technology has also brought many problems, mainly in the following respects: (1) Heavy bandwidth consumption: P2P applications such as video sharing and high-definition video occupy a large share of network bandwidth and consume excessive network resources. This congests the network, prevents other normal network services from running, affects the rights of users of non-P2P applications, and harms the interests of ISPs. (2) Network security: as P2P applications spread, large numbers of viruses, Trojan horse programs, and unhealthy content exploit them to propagate rapidly across the Internet, giving hackers and criminals an opening and endangering the interests and security of users. (3) Copyright in P2P file sharing: according to statistics, more than 80% of the content downloaded over P2P is suspected of piracy or copyright infringement, which harms the interests of authors. With the spread of 3G networks, in 2009 the State Administration of Radio, Film and Television (SARFT) intensified its crackdown on pornographic content, piracy, and similar problems on P2P download websites.
As a result, the security and manageability of networks and the availability of traditional applications are all being challenged, and network traffic monitoring needs to be strengthened. This makes it highly desirable to understand and analyze P2P traffic and network behavior in depth, in order to provide technical support for managing and monitoring P2P networks. P2P traffic differs from traditional Web traffic and is hard to manage and control for the following reasons: (1) No fixed network protocol standard: P2P applications use their own proprietary protocols, so ordinary firewall technology cannot filter P2P traffic completely. (2) Dynamic ports: to avoid the detection that fixed ports would allow, P2P applications use dynamic ports. Typical examples are PPlive and Skype, where the user can change the default port; this flexibility in port settings makes it harder to identify P2P traffic correctly. (3) Disguise as normal traffic: when P2P applications such as Kazaa transmit traffic, their message format is disguised as HTTP traffic, which makes them even harder to identify. (4) Traffic encryption: Skype and similar applications encrypt their messages, so methods based on application-layer signature matching cannot identify the encrypted P2P traffic.
Therefore, to manage P2P traffic, the first problem to solve is identifying it. Studying the characteristics of P2P network traffic in depth, choosing a suitable identification model, identifying P2P network traffic efficiently, taking countermeasures in time, and controlling P2P network traffic effectively are of great theoretical significance and practical value.
Summary of the invention
The object of the invention is to provide a peer-to-peer network traffic identification method based on an inseparable-wavelet support vector machine that finds the optimal classification result when only a small sample with limited information is available. This avoids two shortcomings: many machine learning methods need large sample data sets, and non-linear methods need a dedicated model to be built for each specific problem. P2P network traffic can then be identified efficiently, countermeasures can be taken in time, and P2P network traffic can be controlled effectively.
To achieve this object, the technical solution of the invention is as follows.
A peer-to-peer network traffic identification method based on an inseparable-wavelet support vector machine comprises the following steps:
1. Selecting the feature vector:
Choosing a suitable feature vector is an important part of identifying P2P network traffic. Feature selection for P2P network traffic follows two principles: (1) Node traffic with different functions and different services shows different behavioral characteristics, so the behavioral characteristics of node traffic should be selected wherever possible. (2) The selected features must reflect the difference between P2P traffic and non-P2P traffic, so as to shorten training time and improve identification accuracy. A sufficiently large set of features gives the classifier more accurate discrimination, but too many features lengthen training and increase computational complexity.
For these reasons, the invention analyzes feature vectors at three levels: data packet, network flow, and node connection:
(1) Packet-level features: statistical features such as the average packet length, maximum packet length, minimum packet length, and variance.
(2) Flow-level features: flow-related statistical features derived from the raw statistics of a flow, such as start time, end time, and service type. These include the mean flow duration, the average transmission rate, the average number of bytes per flow, and the packet inter-arrival time and its variance.
(3) Node-connection-level features: statistics on connection-related features of a node, gathered from TCP connection state, including the symmetry of connections and the characteristics of IP addresses and ports.
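The packet-level and flow-level statistics listed above can be illustrated with a minimal sketch. The function names, the dictionary keys, and the choice of population variance are assumptions made for this illustration; the source does not prescribe how the statistics are computed, and the flow function here handles a single flow rather than averaging over many flows.

```python
from statistics import mean, pvariance
from typing import Dict, Sequence


def packet_level_features(sizes: Sequence[float]) -> Dict[str, float]:
    """Mean, maximum, minimum and (population) variance of packet length."""
    return {
        "mean_len": mean(sizes),
        "max_len": max(sizes),
        "min_len": min(sizes),
        "var_len": pvariance(sizes),
    }


def flow_level_features(arrival_times: Sequence[float],
                        total_bytes: float) -> Dict[str, float]:
    """Per-flow statistics from packet arrival timestamps given in seconds."""
    duration = arrival_times[-1] - arrival_times[0]
    # Inter-arrival gaps between consecutive packets of the same flow.
    gaps = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    return {
        "duration": duration,
        "avg_rate": total_bytes / duration if duration > 0 else 0.0,
        "mean_gap": mean(gaps) if gaps else 0.0,
        "var_gap": pvariance(gaps) if gaps else 0.0,
    }
```

Averaging `duration` and `total_bytes` over many flows would give the mean flow duration and the average number of bytes per flow mentioned in the text.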
The invention adopts the following three-dimensional feature vector:
Vector=<v1, v2, v3>;
where v1 represents the mean square deviation of the change in data packet size, v2 represents the ratio of the uplink rate to the downlink rate at the node, and v3 represents the ratio of the number of IP addresses to the number of ports. When network traffic is identified, the three-dimensional feature vector is used as the input vector, and the decision function generated by the SVM model can then effectively identify P2P sample data.
2. Selecting a suitable kernel function:
P2P network traffic is bursty, uncertain, and non-linear, and wavelet analysis is well suited to the local analysis of signals and the detection of abrupt changes. Using wavelet analysis, multi-scale wavelet basis functions are introduced to construct the SVM kernel function and to build a wavelet-SVM identification algorithm, which can substantially improve the identification accuracy of the SVM. A wavelet basis function used to construct the SVM kernel for P2P network traffic identification must meet two conditions: (1) It must satisfy the conditions for constructing an SVM kernel function. (2) Its computational complexity must not be too high, because too many parameters to set increase the training time for the samples.
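The source does not name the wavelet basis function it uses, so the following is only a sketch of what a wavelet kernel can look like. It assumes the Morlet-type translation-invariant kernel that is common in the wavelet SVM literature, with a single dilation parameter `a`, which keeps the parameter count low as condition (2) asks. This kernel is a product over dimensions, so it may not be the "inseparable" wavelet that the title refers to; the function names are hypothetical.

```python
import math
from typing import List, Sequence


def wavelet_kernel(x: Sequence[float], z: Sequence[float], a: float = 1.0) -> float:
    """Morlet-type wavelet kernel: prod_i cos(1.75*u_i) * exp(-u_i**2 / 2),
    where u_i = (x_i - z_i) / a and a > 0 is the dilation parameter."""
    if len(x) != len(z):
        raise ValueError("x and z must have the same dimension")
    k = 1.0
    for xi, zi in zip(x, z):
        u = (xi - zi) / a
        k *= math.cos(1.75 * u) * math.exp(-0.5 * u * u)
    return k


def gram_matrix(samples: Sequence[Sequence[float]], a: float = 1.0) -> List[List[float]]:
    """Kernel matrix over a sample set, the form an SVM solver typically consumes."""
    return [[wavelet_kernel(p, q, a) for q in samples] for p in samples]
```

A kernel of this shape depends only on the difference between the two inputs, so it is symmetric and equals 1 when both inputs coincide.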
3. Selecting an incremental training algorithm:
The idea behind the SVM incremental training algorithm is that the decision function is determined by the support vectors. All support vectors in the training set are retained and the non-support vectors are discarded, and the final result of incremental training is consistent with the result obtained without incremental learning.
The incremental training algorithm is as follows:
Step 1: Train on the initial training set to obtain the initial SVM classifier f(x); SVs1 denotes the support vector set of f(x);
Step 2: Merge SVs1 with the newly added sample set to form a new training set; training on it yields a new classifier f'(x) and a new support vector set SVs2;
Step 3: Set SVs1=SVs2 and return to Step 2.
In this incremental training algorithm, each round of incremental learning keeps only the support vectors and discards the non-support vectors. In practice, however, the non-support vectors also carry information that is useful for classification, so discarding them can reduce identification accuracy.
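Steps 1 to 3 can be sketched as a short loop. This is an illustration under stated assumptions, not the patent's implementation: `incremental_train` accepts any trainer that returns a classifier together with its support vectors, and `train_1d_margin` is a hypothetical toy stand-in for an SVM trainer that only handles separable one-dimensional data in which every negative sample lies below every positive one.

```python
from typing import Callable, Iterable, List, Tuple

Sample = Tuple[float, int]            # (1-D feature, label in {-1, +1})
Classifier = Callable[[float], int]
Trainer = Callable[[List[Sample]], Tuple[Classifier, List[Sample]]]


def train_1d_margin(samples: List[Sample]) -> Tuple[Classifier, List[Sample]]:
    """Toy max-margin trainer: the two samples nearest the boundary play the
    role of support vectors, and the threshold sits midway between them."""
    lo = max(x for x, y in samples if y == -1)
    hi = min(x for x, y in samples if y == 1)
    threshold = (lo + hi) / 2.0

    def classify(x: float) -> int:
        return 1 if x > threshold else -1

    return classify, [(lo, -1), (hi, 1)]


def incremental_train(train: Trainer,
                      initial: Iterable[Sample],
                      batches: Iterable[Iterable[Sample]]) -> Tuple[Classifier, List[Sample]]:
    # Step 1: train on the initial set and keep its support vectors (SVs1).
    clf, svs = train(list(initial))
    for batch in batches:
        # Step 2: retrain on SVs1 plus the newly added samples.
        # Step 3: the new support vectors (SVs2) replace SVs1 for the next round.
        clf, svs = train(svs + list(batch))
    return clf, svs
```

In this toy setting the incremental result matches training on all samples at once, which is the consistency property the text describes; with a real SVM on noisy data the discarded non-support vectors can make the two results differ.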
4. Boosting algorithm for wavelet-SVM P2P traffic identification:
Boosting is an iterative ensemble learning algorithm designed to deal with misclassified samples. Its core idea is to train different classifiers (weak classifiers) on the same training set and then combine these weak classifiers into a stronger final classifier (strong classifier). The algorithm works by changing the data distribution: it sets the weight of each sample according to whether that sample was classified correctly in the current round and according to the overall classification accuracy of the previous round. The data set with updated weights is passed to the next-level classifier for training, and the classifiers obtained in all rounds are finally fused into the final decision classifier. In this way the classifier concentrates on the critical training data, namely the misclassified samples, which improves identification accuracy.
The Boosting algorithm for the wavelet SVM uses the wavelet SVM as the base classifier for training on the samples. First, M samples are selected by weight from the whole P2P and non-P2P sample set S to form a training subset S_j, and training yields a base classifier WSVM_j. WSVM_j is then tested on the sample set S to obtain its classification accuracy, and the misclassified samples are given higher weights. Finally, M samples are again selected from S according to the adjusted weights to form a new training subset S_{j+1}. If S_{j+1}=S_j the procedure exits; otherwise the steps above are repeated. After t rounds of training (t≤T, where T is the number of iterations), a sequence of WSVM-based identification functions WSVM_1, ..., WSVM_t is obtained. Each WSVM_j is also given a weight, namely its accuracy in identifying the sample set S. A strong classifier H(x) is finally obtained by weighted voting and is used to identify P2P traffic.
The algorithm is described in detail as follows:
(1) Input. Base classifier WSVM; P2P and non-P2P traffic training sample set S={(x_1, y_1), ..., (x_n, y_n)}, where x_i∈R^d, y_i∈{-1, 1}, 1≤i≤n; number of training iterations T; initial sample weights D_i=0 (1≤i≤n);
(2) for(j=1; j≤T; j++)
{
1) Select M samples from the training sample set S in order of weight to obtain the training sample subset S_j;
2) If S_j=S_{j-1} (j>1), exit the loop;
3) Train on S_j with the WSVM algorithm to obtain a base classifier WSVM_j;
4) Classify the sample set S with WSVM_j to obtain the error rate e_j;
5) Record the weight of WSVM_j as α_j=1-e_j;
6) Adjust the weights of the support vectors and of the misclassified samples in S: D_i=D_i+1;
}
(3) Output. Decision function sequence h={WSVM_1, ..., WSVM_t} with weights α={α_1, ..., α_t}, t≤T. The final decision function is the weighted vote:
$$H(x)=\operatorname{sign}\Big(\sum_{j=1}^{t}\alpha_j\,\mathrm{WSVM}_j(x)\Big)$$
The beneficial effect of the invention is as follows: by providing a peer-to-peer network traffic identification method based on an inseparable-wavelet support vector machine, the invention finds the optimal classification result when only a small sample with limited information is available. This avoids two shortcomings: many machine learning methods need large sample data sets, and non-linear methods need a dedicated model to be built for each specific problem. P2P network traffic can then be identified efficiently, countermeasures can be taken in time, and P2P network traffic can be controlled effectively.
Brief description of the drawings
Fig. 1 is a diagram of the P2P application connection mode in an embodiment of the invention.
Fig. 2 is a diagram of the P2P application connection mode in an embodiment of the invention.
Fig. 3 is a flow chart of the Boosting algorithm for the wavelet SVM in an embodiment of the invention.
Detailed description
Specific embodiments of the invention are described below with reference to the drawings so that the invention can be better understood.
Embodiment
A peer-to-peer network traffic identification method based on an inseparable-wavelet support vector machine includes:
1. Selecting the feature vector:
Choosing a suitable feature vector is an important part of identifying P2P network traffic. Feature selection for P2P network traffic follows two principles: (1) Node traffic with different functions and different services shows different behavioral characteristics, so the behavioral characteristics of node traffic should be selected wherever possible. (2) The selected features must reflect the difference between P2P traffic and non-P2P traffic, so as to shorten training time and improve identification accuracy. A sufficiently large set of features gives the classifier more accurate discrimination, but too many features lengthen training and increase computational complexity. According to statistics, when a machine-learning-based algorithm identifies network traffic using all flow feature attributes, its accuracy is only 2% higher than when feature attributes are selected, while the execution cost of the algorithm is far higher. Selecting as few feature vectors as possible while preserving classifier performance is therefore an essential step in P2P traffic identification.
For these reasons, the invention analyzes feature vectors at three levels: data packet, network flow, and node connection:
(1) Packet-level features: statistical features such as the average packet length, maximum packet length, minimum packet length, and variance.
(2) Flow-level features: flow-related statistical features derived from the raw statistics of a flow, such as start time, end time, and service type. These include the mean flow duration, the average transmission rate, the average number of bytes per flow, and the packet inter-arrival time and its variance.
(3) Node-connection-level features: statistics on connection-related features of a node, gathered from TCP connection state, including the symmetry of connections and the characteristics of IP addresses and ports.
In a real network, different nodes have different functions. Some nodes act as servers and provide resource transmission services to other nodes in the network; others act as clients and receive the services that servers provide. A node in a P2P network can both serve other peer nodes as a server and, as a client, receive services provided by other peer nodes. Node traffic with different functions and different services therefore shows different behavioral characteristics, and these are analyzed below.
Fig. 1 shows the P2P application connection mode. In a P2P network, peer nodes connect differently from nodes under the traditional server/client model: every node in a P2P network plays a dual role and is called a peer node. P2P applications transfer data over connections that use random ports between 1024 and 65535. Under the TCP protocol, one source node connects to many peer nodes. Relative to the source node, the peer nodes have a large number of IP addresses and their ports are random, so the ratio of the number of peer IP addresses to the number of ports is close to 1. This differs from applications under the traditional connection mode and therefore serves as an identification feature of P2P traffic.
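The IP-to-port ratio described above can be illustrated with a small sketch. It assumes that the connections observed at a node are available as (remote IP, remote port) pairs and that the ratio is taken over distinct values; both the input format and the function name `ip_port_ratio` are assumptions made for this example.

```python
from typing import Iterable, Tuple


def ip_port_ratio(connections: Iterable[Tuple[str, int]]) -> float:
    """Ratio of distinct remote IP addresses to distinct remote ports."""
    ips = set()
    ports = set()
    for ip, port in connections:
        ips.add(ip)
        ports.add(port)
    if not ports:
        raise ValueError("no connections observed")
    return len(ips) / len(ports)
```

For example, three peers that each listen on a different random port give a ratio of 1.0, while four web servers reached on ports 80 and 443 give 2.0, which illustrates how the ratio separates the two connection patterns.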
The behavioral analysis above, carried out at the packet, network flow, and node connection levels, shows that the chosen feature vector reflects the difference between traffic in P2P networks and in traditional networks, reflects the characteristics of P2P traffic in real networks, and meets the requirements of identification. Based on the features at these three levels, the invention adopts the following three-dimensional feature vector:
Vector=<v1, v2, v3>;
where v1 represents the mean square deviation of the change in data packet size, v2 represents the ratio of the uplink rate to the downlink rate at the node, and v3 represents the ratio of the number of IP addresses to the number of ports. When network traffic is identified, the three-dimensional feature vector is used as the input vector, and the decision function generated by the SVM model can then effectively identify P2P sample data.
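A minimal sketch of assembling the three-dimensional vector follows. Several points are assumptions, because the text does not give formulas: v1 is taken as the population standard deviation of the packet sizes, v2 as uplink bytes divided by downlink bytes over the same observation window (which equals the ratio of the two rates), and v3 as the count of distinct IP addresses divided by the count of distinct ports. The function name and parameters are hypothetical.

```python
from statistics import pstdev
from typing import Sequence, Tuple


def feature_vector(packet_sizes: Sequence[float],
                   uplink_bytes: float,
                   downlink_bytes: float,
                   n_ips: int,
                   n_ports: int) -> Tuple[float, float, float]:
    """Return (v1, v2, v3) for one node over one observation window."""
    if downlink_bytes <= 0 or n_ports <= 0:
        raise ValueError("downlink_bytes and n_ports must be positive")
    v1 = pstdev(packet_sizes)            # spread of packet sizes (assumed reading of v1)
    v2 = uplink_bytes / downlink_bytes   # same window, so byte ratio equals rate ratio
    v3 = n_ips / n_ports                 # close to 1 for P2P according to the text
    return (v1, v2, v3)
```

The resulting tuple is what would be passed to the trained classifier as its input vector.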
2. Selecting a suitable kernel function:
The SVM uses kernel functions to solve the dimensionality problem of the training sample feature space while preserving good generalization ability, and it handles non-linear problems by choosing different kernel functions. At present, the radial basis function (RBF) kernel is generally used for identifying P2P network traffic, because the RBF kernel has fewer parameters than other kernel functions, is easier to compute, and can be applied to samples of any distribution. P2P network traffic is bursty, uncertain, and non-linear, and wavelet analysis is well suited to the local analysis of signals and the detection of abrupt changes. Using wavelet analysis, multi-scale wavelet basis functions are introduced to construct the SVM kernel function and to build a wavelet-SVM identification algorithm, which can substantially improve the identification accuracy of the SVM. A wavelet basis function used to construct the SVM kernel for P2P network traffic identification must meet two conditions: (1) It must satisfy the conditions for constructing an SVM kernel function. (2) Its computational complexity must not be too high, because too many parameters to set increase the training time for the samples. This is shown in Fig. 2.
3. Selecting an incremental training algorithm:
In a traditional SVM, training and classification are completed in a single pass, and training requires solving a quadratic programming problem. When the training sample set is large, this takes a large amount of memory and converges slowly. As network data keeps changing, the data sets also become imbalanced and diverse, so the identification accuracy of classifiers trained with existing single-classifier and incremental algorithms is unsatisfactory. In fact, the main factor affecting identification accuracy is the presence of a large number of misclassified samples, and the Boosting algorithm in ensemble learning is a classification method designed specifically for misclassified samples. The invention proposes applying a Boosting algorithm based on the wavelet SVM to P2P traffic identification: by focusing training on the misclassified samples during learning, it improves the generalization ability of the learner and thereby the identification accuracy.
In an SVM, the support vectors describe the characteristics of the whole sample data set. For an SVM with a given kernel function, the optimal separating surface depends only on its support vectors, so classifying the whole sample data set is equivalent to classifying the support vectors. In other words, if all vectors other than the support vectors are removed from the training set and training is repeated, the result is consistent with the result obtained on the whole training set. The idea behind the SVM incremental training algorithm is that the decision function is determined by the support vectors. All support vectors in the training set are retained and the non-support vectors are discarded, and the final result of incremental training is consistent with the result obtained without incremental learning.
The incremental training algorithm is as follows:
Step 1: Train on the initial training set to obtain the initial SVM classifier f(x); SVs1 denotes the support vector set of f(x);
Step 2: Merge SVs1 with the newly added sample set to form a new training set; training on it yields a new classifier f'(x) and a new support vector set SVs2;
Step 3: Set SVs1=SVs2 and return to Step 2.
In this incremental training algorithm, each round of incremental learning keeps only the support vectors and discards the non-support vectors. In practice, however, the non-support vectors also carry information that is useful for classification, so discarding them can reduce identification accuracy.
4. Boosting algorithm for wavelet-SVM P2P traffic identification:
Boosting is an iterative ensemble learning algorithm designed to deal with misclassified samples. Its core idea is to train different classifiers (weak classifiers) on the same training set and then combine these weak classifiers into a stronger final classifier (strong classifier). The algorithm works by changing the data distribution: it sets the weight of each sample according to whether that sample was classified correctly in the current round and according to the overall classification accuracy of the previous round. The data set with updated weights is passed to the next-level classifier for training, and the classifiers obtained in all rounds are finally fused into the final decision classifier. In this way the classifier concentrates on the critical training data, namely the misclassified samples, which improves identification accuracy.
Fig. 3 is a flow chart of the Boosting algorithm for the wavelet SVM in an embodiment of the invention. The algorithm uses the wavelet SVM as the base classifier for training on the samples. First, M samples are selected by weight from the whole P2P and non-P2P sample set S to form a training subset S_j, and training yields a base classifier WSVM_j. WSVM_j is then tested on the sample set S to obtain its classification accuracy, and the misclassified samples are given higher weights. Finally, M samples are again selected from S according to the adjusted weights to form a new training subset S_{j+1}. If S_{j+1}=S_j the procedure exits; otherwise the steps above are repeated. After t rounds of training (t≤T, where T is the number of iterations), a sequence of WSVM-based identification functions WSVM_1, ..., WSVM_t is obtained. Each WSVM_j is also given a weight, namely its accuracy in identifying the sample set S. A strong classifier H(x) is finally obtained by weighted voting and is used to identify P2P traffic.
The algorithm is described in detail as follows:
(1) Input. Base classifier WSVM; P2P and non-P2P traffic training sample set S={(x_1, y_1), ..., (x_n, y_n)}, where x_i∈R^d, y_i∈{-1, 1}, 1≤i≤n; number of training iterations T; initial sample weights D_i=0 (1≤i≤n);
(2) for(j=1; j≤T; j++)
{
1) Select M samples from the training sample set S in order of weight to obtain the training sample subset S_j;
2) If S_j=S_{j-1} (j>1), exit the loop;
3) Train on S_j with the WSVM algorithm to obtain a base classifier WSVM_j;
4) Classify the sample set S with WSVM_j to obtain the error rate e_j;
5) Record the weight of WSVM_j as α_j=1-e_j;
6) Adjust the weights of the support vectors and of the misclassified samples in S: D_i=D_i+1;
}
(3) Output. Decision function sequence h={WSVM_1, ..., WSVM_t} with weights α={α_1, ..., α_t}, t≤T. The final decision function is the weighted vote:
$$H(x)=\operatorname{sign}\Big(\sum_{j=1}^{t}\alpha_j\,\mathrm{WSVM}_j(x)\Big)$$
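The boosting loop and the weighted vote described in this section can be sketched as follows. This is a hedged illustration rather than the patent's implementation: `train_stump` is a hypothetical one-dimensional stand-in for the WSVM base learner, the weight increment of 1 and the tie-breaking by original sample order are assumptions, and only misclassified samples are re-weighted because a generic trainer does not expose its support vectors.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, int]            # (1-D feature, label in {-1, +1})
Classifier = Callable[[float], int]


def train_stump(subset: Sequence[Sample]) -> Classifier:
    """Hypothetical base learner: threshold midway between the class means."""
    neg = [x for x, y in subset if y == -1]
    pos = [x for x, y in subset if y == 1]
    if not neg or not pos:            # degenerate subset: predict the only class seen
        label = 1 if pos else -1
        return lambda x: label
    threshold = (mean(neg) + mean(pos)) / 2.0
    return lambda x: 1 if x > threshold else -1


def boost(train: Callable[[Sequence[Sample]], Classifier],
          samples: Sequence[Sample], M: int, T: int) -> Tuple[List[Classifier], List[float]]:
    n = len(samples)
    weights = [0] * n                 # initial sample weights D_i = 0
    classifiers: List[Classifier] = []
    alphas: List[float] = []
    prev_subset = None
    for _ in range(T):
        # 1) Take the M highest-weight samples (stable sort keeps input order on ties).
        order = sorted(range(n), key=lambda i: -weights[i])
        subset = sorted(order[:M])
        # 2) Stop when the training subset no longer changes.
        if subset == prev_subset:
            break
        # 3) Train a base classifier on the subset.
        clf = train([samples[i] for i in subset])
        # 4) Error rate on the full sample set.
        wrong = [i for i, (x, y) in enumerate(samples) if clf(x) != y]
        # 5) Classifier weight alpha_j = 1 - e_j.
        classifiers.append(clf)
        alphas.append(1.0 - len(wrong) / n)
        # 6) Raise the weight of every misclassified sample.
        for i in wrong:
            weights[i] += 1
        prev_subset = subset
    return classifiers, alphas


def strong_classify(classifiers: Sequence[Classifier],
                    alphas: Sequence[float], x: float) -> int:
    """Weighted vote: sign of the alpha-weighted sum of base decisions."""
    score = sum(a * c(x) for a, c in zip(alphas, classifiers))
    return 1 if score >= 0 else -1
```

Treating a zero score as the positive class is an arbitrary choice made here so that `strong_classify` always returns a label.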
The above is the preferred embodiment of the invention. It should be noted that a person of ordinary skill in the art can make improvements and modifications without departing from the principle of the invention, and these improvements and modifications are also regarded as falling within the scope of protection of the invention.
Claims (3)
1. A peer-to-peer network traffic identification method based on an inseparable-wavelet support vector machine, characterized in that it comprises the following steps:
(1) Selecting the feature vector: two principles are followed: (a) node traffic with different functions and different services shows different behavioral characteristics, so the behavioral characteristics of node traffic are selected wherever possible; (b) the selected features must reflect the difference between P2P traffic and non-P2P traffic, so as to shorten training time and improve identification accuracy; features at three levels, data packet, network flow, and node connection, are selected for the feature vector; the packet-level features include the average packet length, maximum packet length, minimum packet length, and variance statistics; the flow-level features are flow-related statistical features derived from the raw statistics of a flow, namely start time, end time, and service type, and include the mean flow duration, the average transmission rate, the average number of bytes per flow, and the packet inter-arrival time and its variance; the node-connection-level features are statistics on connection-related features of a node, gathered from TCP connection state, including the symmetry of connections and the characteristics of IP addresses and ports; the following three-dimensional feature vector is used: Vector=<v1, v2, v3>; where v1 represents the mean square deviation of the change in data packet size, v2 represents the ratio of the uplink rate to the downlink rate at the node, and v3 represents the ratio of the number of IP addresses to the number of ports; when network traffic is identified, the three-dimensional feature vector is used as the input vector;
(2) Selecting a suitable kernel function: a wavelet basis function is introduced to construct the SVM kernel function for P2P network traffic identification, and it must meet two conditions: (a) it satisfies the conditions for constructing an SVM kernel function; (b) the computational complexity of the selected wavelet basis function is low, because too many parameters to set increase the training time for the samples;
(3) Selecting an incremental training algorithm: the idea behind the SVM incremental training algorithm is that the decision function is determined by the support vectors; all support vectors in the training set are retained and the non-support vectors are discarded, and the final result of incremental training is consistent with the result obtained without incremental learning;
(4) Boosting algorithm for wavelet-SVM P2P traffic identification: the wavelet SVM is used as the base classifier for training on the samples; first, M samples are selected by weight from the whole P2P and non-P2P sample set S to form a training subset S_j, and training yields a base classifier WSVM_j; WSVM_j is then tested on the sample set S to obtain its classification accuracy; the misclassified samples are then given higher weights; finally, M samples are again selected from S according to the adjusted weights to form a new training subset S_{j+1}; if S_{j+1}=S_j the procedure exits, otherwise the steps above are repeated; after t rounds of training (t≤T, where T is the number of iterations), a sequence of WSVM-based identification functions WSVM_1, ..., WSVM_t is obtained, and each WSVM_j is also given a weight, namely its accuracy in identifying the sample set S; a strong classifier H(x) is finally obtained by weighted voting and is used to identify P2P traffic.
2. The peer-to-peer network traffic identification method based on an inseparable-wavelet support vector machine according to claim 1, characterized in that the incremental training algorithm in step (3) has the following steps:
Step 1: Train on the initial training set to obtain the initial SVM classifier f(x); SVs1 denotes the support vector set of f(x);
Step 2: Merge SVs1 with the newly added sample set to form a new training set; training on it yields a new classifier f'(x) and a new support vector set SVs2;
Step 3: Set SVs1=SVs2 and return to Step 2.
3. the peer-to-peer network method for recognizing flux of a kind of inseparable wavelet support vector machines according to claim 1, it is special
Levy and be:The Boosting arthmetic statement of the P2P flow identification of the small echo SVM in described step (4) is as follows:
(1) Input: base classifier WSVM; P2P and non-P2P traffic training sample set S = {(x_1, y_1), ..., (x_n, y_n)}, where
x_i ∈ R^d, y_i ∈ {-1, 1}, 1 ≤ i ≤ n; number of training iterations T; initial sample weights D_i = 0 (1 ≤ i ≤ n);
(2) for (j = 1; j ≤ T; j++)
{
1) select M samples in turn from the training sample set S according to weight, obtaining the training sample subset S_j;
2) if S_j = S_{j-1} (j > 1), exit the loop;
3) train on S_j with the WSVM algorithm to obtain a base classifier WSVM_j;
4) classify the sample set S with WSVM_j, obtaining the error rate e_j;
5) record the weight of WSVM_j as α_j = 1 - e_j;
6) adjust the weights of the support vectors and of the misclassified samples in S: Di=D1+i;
}
(3) Output: decision function sequence h = {WSVM_1, ..., WSVM_t} with weights α = {α_1, ..., α_t}, t ≤ T; the final
decision function is:
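The expression for the final decision function is not reproduced in this text. Under the weighted voting described in step (4) of claim 1, and assuming each base classifier returns a label in {-1, 1}, one consistent form would be the following; it is offered as an assumption, not as the patent's verbatim formula:

```latex
H(x) = \operatorname{sign}\left( \sum_{j=1}^{t} \alpha_j \, \mathrm{WSVM}_j(x) \right),
\qquad \alpha_j = 1 - e_j
```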
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410017016.3A CN103780501B (en) | 2014-01-03 | 2014-01-03 | Peer-to-peer network traffic identification method of inseparable-wavelet support vector machine |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103780501A CN103780501A (en) | 2014-05-07 |
CN103780501B true CN103780501B (en) | 2017-02-15 |
Family
ID=50572352
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410017016.3A Expired - Fee Related CN103780501B (en) | 2014-01-03 | 2014-01-03 | Peer-to-peer network traffic identification method of inseparable-wavelet support vector machine |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103780501B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104092618A (en) * | 2014-07-23 | 2014-10-08 | 湖北工业大学 | Peer-to-peer network traffic feature selection method based on cuckoo search algorithm |
CN104243245A (en) * | 2014-10-16 | 2014-12-24 | 湖北工业大学 | Method and system for identifying non-separable wavelet SVM-based peer-to-peer network traffic |
CN105184322B (en) * | 2015-09-14 | 2018-11-02 | 哈尔滨工业大学 | A kind of multidate image sorting technique based on incremental integration study |
CN106897705B (en) * | 2017-03-01 | 2020-04-10 | 上海海洋大学 | Ocean observation big data distribution method based on incremental learning |
CN108076038A (en) * | 2017-06-16 | 2018-05-25 | 哈尔滨安天科技股份有限公司 | A kind of C&C servers determination methods and system based on Service-Port |
CN109194622B (en) * | 2018-08-08 | 2020-03-31 | 西安交通大学 | Encrypted flow analysis feature selection method based on feature efficiency |
CN109327464A (en) * | 2018-11-15 | 2019-02-12 | 中国人民解放军战略支援部队信息工程大学 | Class imbalance processing method and processing device in a kind of network invasion monitoring |
CN109951444B (en) * | 2019-01-29 | 2020-05-22 | 中国科学院信息工程研究所 | Encrypted anonymous network traffic identification method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101771702A (en) * | 2010-01-05 | 2010-07-07 | 中兴通讯股份有限公司 | Method and system for defending distributed denial of service attack in point-to-point network |
CN101958843A (en) * | 2010-11-01 | 2011-01-26 | 南京邮电大学 | Intelligent routing selection method based on flow analysis and node trust degree |
CN102594836A (en) * | 2012-03-06 | 2012-07-18 | 青岛农业大学 | Flow recognition method based on wavelet energy spectrum |
Also Published As
Publication number | Publication date |
---|---|
CN103780501A (en) | 2014-05-07 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20170215 Termination date: 20180103 |