CN104348741A - Method and system for detecting P2P (peer-to-peer) traffic based on multi-dimensional analysis and decision tree - Google Patents

Info

Publication number
CN104348741A
Authority
CN
China
Prior art keywords
flow
decision tree
network
classification
doubtful
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310337914.2A
Other languages
Chinese (zh)
Inventor
戚湧
李千目
李嘉
侯君
於东军
陈俊
汪欢
侍球干
丁玲玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology Changshu Research Institute Co Ltd
Original Assignee
Nanjing University of Science and Technology Changshu Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology Changshu Research Institute Co Ltd filed Critical Nanjing University of Science and Technology Changshu Research Institute Co Ltd
Priority to CN201310337914.2A priority Critical patent/CN104348741A/en
Publication of CN104348741A publication Critical patent/CN104348741A/en
Pending legal-status Critical Current

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a method and a system for detecting P2P (peer-to-peer) traffic based on multi-dimensional analysis and a decision tree. The method combines wavelet multi-dimensional analysis with a decision tree algorithm from machine learning to detect P2P traffic: the traffic is first analyzed at multiple scales using wavelet analysis to extract suspected P2P traffic, and the suspected traffic is then classified by the decision tree algorithm. The method can detect encrypted and unknown P2P traffic, offers higher accuracy and detection efficiency, improves the effectiveness of classification and detection, and achieves a good security detection effect.

Description

P2P traffic detection method and system based on multiscale analysis and a decision tree
Technical field
The invention relates to P2P traffic detection systems, and in particular to a P2P traffic detection system that combines wavelet-based multiscale analysis with a decision tree algorithm.
Background art
With the rapid progress of science and technology, networks are flooded with traffic of all kinds. P2P traffic accounts for the overwhelming majority of it and aggressively consumes limited bandwidth resources. How to manage and control P2P traffic effectively and reasonably has become a new research focus. Traditional P2P traffic detection methods, such as those based on traffic characteristics and those based on deep packet inspection (DPI), have their own limitations: as network traffic keeps growing, the computation and storage overhead of protocol analysis and signature matching increases markedly, which places a heavy burden on network equipment and means that the security of the P2P network cannot be guaranteed.
The P2P traffic detection system of the invention, which combines wavelet-based multiscale analysis with a decision tree algorithm, extracts suspected P2P traffic through multiscale analysis of network traffic, improves the effectiveness of the decision tree classification and detection model, and can also detect encrypted and unknown P2P traffic.
Summary of the invention
1. Object of the invention.
In the prior art, as P2P traffic grows, the storage overhead of the computer increases markedly and seriously burdens network equipment. To control P2P network traffic, the invention proposes a P2P traffic detection system that combines wavelet-based multiscale analysis with a decision tree algorithm.
2. Technical solution adopted by the invention.
A P2P traffic detection method based on multiscale analysis and a decision tree is carried out in the following steps:
Step 1: Collect data at the network layer. Configure port mirroring on a designated switch so that all upstream and downstream traffic of the test machine is copied to a data acquisition server, then sample the data with a packet-capture analysis tool and perform wavelet multiscale analysis.
Step 2: Analyze the self-similarity, long-range correlation, and short-range correlation features of the network traffic, find the traffic patterns that are shared or similar, and build a P2P network traffic feature pattern library from these features.
Step 3: Perform pattern matching against the autocorrelation, long-range correlation, and short-range correlation traffic patterns in the traffic pattern library.
Step 4: Make a combined judgment on whether the traffic is suspected P2P traffic. If it is, pass it to the transport-layer traffic detection and protocol classification stage based on the decision tree; otherwise let it through.
A P2P traffic detection system based on multiscale analysis and a decision tree comprises the following units:
Network collection unit: collects data at the network layer. Port mirroring is configured on a designated switch so that all upstream and downstream traffic of the test machine is copied to a data acquisition server, and the data is sampled with a packet-capture analysis tool for wavelet multiscale analysis.
Network traffic feature library: built by analyzing the self-similarity, long-range correlation, and short-range correlation features of the network traffic, finding the traffic patterns that are shared or similar, and storing them as P2P network traffic feature patterns.
Traffic matching unit: performs pattern matching against the autocorrelation, long-range correlation, and short-range correlation traffic patterns in the traffic pattern library.
Traffic classification unit: makes a combined judgment on whether the traffic is suspected P2P traffic. If it is, the traffic enters the transport-layer traffic detection and protocol classification stage based on the decision tree; otherwise it is let through.
3. Beneficial effects of the invention.
The invention goes beyond the scope of traditional detection methods and proposes a P2P traffic detection technique that combines wavelet multiscale analysis with a decision tree algorithm from machine learning. It can detect encrypted and unknown P2P traffic, and it also offers higher accuracy and detection efficiency, achieving a good security detection effect.
The invention is described in further detail below with reference to the accompanying drawings.
Description of the drawings
Fig. 1 is a module diagram of the multiscale analysis of P2P network traffic.
Fig. 2 is a diagram of the construction of the decision tree model and the identification process.
Detailed description of the embodiments
Embodiment 1
With reference to Fig. 1, the P2P traffic detection method based on multiscale analysis and a decision tree is carried out in the following steps:
Step 1: Collect data at the network layer. Configure port mirroring on a designated switch so that all upstream and downstream traffic of the test machine is copied to a data acquisition server, then sample the data with a packet-capture analysis tool and perform wavelet multiscale analysis.
Step 2: Analyze the self-similarity, long-range correlation, and short-range correlation features of the network traffic, find the traffic patterns that are shared or similar, and build a P2P network traffic feature pattern library from these features.
Step 3: Perform pattern matching against the autocorrelation, long-range correlation, and short-range correlation traffic patterns in the traffic pattern library.
Step 4: Make a combined judgment on whether the traffic is suspected P2P traffic. If it is, pass it to the transport-layer traffic detection and protocol classification stage based on the decision tree; otherwise let it through.
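The correlation features named in the second and third steps can be illustrated with a short Python sketch. It is not taken from the patent: the function names, the lag choices, and the use of the mean absolute sample autocorrelation as a "short" and a "long" correlation feature are all assumptions made for illustration.

```python
from statistics import fmean


def autocorrelation(series, lag):
    """Sample autocorrelation of a traffic series (e.g. bytes per interval) at one lag."""
    n = len(series)
    mean = fmean(series)
    var = sum((x - mean) ** 2 for x in series)
    if var == 0 or lag >= n:
        return 0.0  # constant series or lag too large: no usable correlation
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return cov / var


def correlation_features(series, short_lags=(1, 2, 3), long_lags=(20, 40, 60)):
    """Hypothetical feature vector: mean |autocorrelation| over short and long lags."""
    return {
        "short": fmean(abs(autocorrelation(series, k)) for k in short_lags),
        "long": fmean(abs(autocorrelation(series, k)) for k in long_lags),
    }
```

A series whose "long" feature stays well above zero keeps its correlation over large lags, which is the behaviour the long-range correlation analysis looks for. The lags that are meaningful in practice depend on the sampling interval of the capture.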
Embodiment 2
A P2P traffic detection system based on wavelet multiscale analysis and a decision tree operates in the following steps:
Step 1: Collect network traffic at the network layer. Copy the switch port data to a data acquisition server.
Step 2: Perform multiscale analysis of the traffic features. Sample the data with a packet-capture analysis tool and perform wavelet multiscale analysis, including self-similarity analysis and long-range and short-range correlation analysis.
Step 3: Perform pattern matching against the traffic feature pattern library.
Step 4: Identify and extract the suspected P2P traffic, then pass it to the transport-layer traffic detection and protocol classification stage based on the decision tree.
Step 5: Train on sample data to build the decision tree classification and detection model, and use it to further classify and detect the suspected P2P traffic.
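The wavelet multiscale analysis in the second step can be sketched as follows. The patent does not name a wavelet or an estimator, so this example assumes the Haar wavelet and a commonly used log-scale energy regression, in which the slope of log2(detail energy) against scale is approximately 2H - 1 for a long-range dependent series with Hurst parameter H. All function names are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: returns (approximation, detail)."""
    s = 1 / math.sqrt(2)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def scale_energies(signal, levels):
    """Mean squared detail coefficient at scales 1..levels (coarser at each level)."""
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail) / len(detail))
    return energies


def hurst_from_energies(energies):
    """Least-squares slope of log2(energy) against scale, mapped to H = (slope + 1) / 2.

    Assumes at least two scales and strictly positive energies.
    """
    xs = list(range(1, len(energies) + 1))
    ys = [math.log2(e) for e in energies]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    return (slope + 1) / 2
```

An estimate of H between 0.5 and 1 indicates self-similar, long-range dependent traffic, while uncorrelated noise gives a value near 0.5. The series length should be a multiple of 2 raised to the number of levels so that no samples are dropped at any scale.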
Embodiment 3
On the basis of Embodiment 1 or 2, and with reference to Fig. 2, the P2P traffic classification stage comprises the following steps:
Step 1: Using the five-tuple definition of a flow, extract the bidirectional TCP/UDP traffic at the transport layer, and extract feature rules of the network data flows that are independent of port, IP address, and upper-layer protocol.
Step 2: Based on the feature rules of the sample flows, use the C4.5 method of the decision tree model to compare the information gain ratios of the different rules, build the decision tree P2P traffic classification model, and complete the network traffic classification.
Step 3: Classify new suspected P2P traffic with the decision tree model that has been built.
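The first step of the classification stage can be sketched in Python. The packet record layout (a dict with src_ip, src_port, dst_ip, dst_port, proto, size, and time) and the chosen features are assumptions for illustration; the patent does not list the features it uses. The five-tuple serves only to group packets into bidirectional flows, and the resulting features do not contain ports, addresses, or payload.

```python
from collections import defaultdict
from statistics import fmean


def flow_key(pkt):
    """Direction-independent five-tuple: both directions of a connection share one key."""
    a = (pkt["src_ip"], pkt["src_port"])
    b = (pkt["dst_ip"], pkt["dst_port"])
    return (pkt["proto"],) + (a + b if a <= b else b + a)


def group_flows(packets):
    """Group packets (assumed to be in capture order) into bidirectional flows."""
    flows = defaultdict(list)
    for pkt in packets:
        flows[flow_key(pkt)].append(pkt)
    return flows


def flow_features(pkts):
    """Statistical features that do not depend on port, IP address, or payload."""
    initiator = (pkts[0]["src_ip"], pkts[0]["src_port"])
    forward = [p for p in pkts if (p["src_ip"], p["src_port"]) == initiator]
    times = sorted(p["time"] for p in pkts)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "packets": len(pkts),
        "mean_size": fmean(p["size"] for p in pkts),
        "fwd_ratio": len(forward) / len(pkts),  # share of packets sent by the initiator
        "mean_gap": fmean(gaps) if gaps else 0.0,  # mean inter-arrival time
    }
```

Because the features are purely statistical, they remain available when the payload is encrypted or the application uses non-standard ports, which is the situation the classification stage is meant to handle.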
Embodiment 4
A P2P traffic detection system based on multiscale analysis and a decision tree comprises the following units:
Network collection unit: collects data at the network layer. Port mirroring is configured on a designated switch so that all upstream and downstream traffic of the test machine is copied to a data acquisition server, and the data is sampled with a packet-capture analysis tool for wavelet multiscale analysis.
Network traffic feature library: built by analyzing the self-similarity, long-range correlation, and short-range correlation features of the network traffic, finding the traffic patterns that are shared or similar, and storing them as P2P network traffic feature patterns.
Traffic matching unit: performs pattern matching against the autocorrelation, long-range correlation, and short-range correlation traffic patterns in the traffic pattern library.
Traffic classification unit: makes a combined judgment on whether the traffic is suspected P2P traffic. If it is, the traffic enters the transport-layer traffic detection and protocol classification stage based on the decision tree; otherwise it is let through.
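One way to picture how these units fit together is the following Python sketch. The class, its field names, and the tolerance-based matching rule are hypothetical; the patent does not specify how a match against the feature library is decided.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Features = Dict[str, float]


@dataclass
class DetectionSystem:
    """Hypothetical composition of the units described above."""

    extract_features: Callable[[Sequence[float]], Features]  # multiscale analysis
    pattern_library: List[Features]  # network traffic feature library
    classify: Callable[[Features], str]  # decision tree stage
    tolerance: float = 0.2  # assumed matching threshold

    def matches_library(self, features: Features) -> bool:
        """Traffic matching unit: true if any stored pattern is within tolerance."""
        return any(
            all(abs(features[name] - value) <= self.tolerance
                for name, value in pattern.items())
            for pattern in self.pattern_library
        )

    def process(self, series: Sequence[float]) -> str:
        """Traffic classification unit: classify suspected traffic, pass the rest."""
        features = self.extract_features(series)
        if not self.matches_library(features):
            return "pass"
        return self.classify(features)
```

The gate in process means the decision tree only ever sees traffic that already matched a stored pattern, so the more expensive classification stage runs on the suspected subset rather than on all traffic.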
Embodiment 5
On the basis of Embodiment 4, the classification stage of the traffic classification unit comprises the following steps:
a. Using the five-tuple definition of a flow, extract the bidirectional TCP/UDP traffic at the transport layer, and extract feature rules of the network data flows that are independent of port, IP address, and upper-layer protocol;
b. Based on the feature rules of the sample flows, use the C4.5 method of the decision tree model to compare the information gain ratios of the different rules, build the decision tree P2P traffic classification model, and complete the network traffic classification;
c. Classify new suspected P2P traffic with the decision tree model that has been built.
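The comparison in step b relies on the C4.5 gain ratio, which divides the information gain of a split by the entropy of the split itself. The sketch below computes it for discrete feature values only; full C4.5 also handles continuous features by searching for thresholds, which is omitted here. The function names and the row layout (one dict of feature values per flow) are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gain_ratio(rows, labels, feature):
    """C4.5 gain ratio of splitting on one discrete feature."""
    total = len(rows)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[feature], []).append(label)
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = entropy(labels) - remainder
    split_info = -sum(
        (len(p) / total) * math.log2(len(p) / total) for p in partitions.values()
    )
    return gain / split_info if split_info > 0 else 0.0


def best_feature(rows, labels, features):
    """Pick the feature with the largest gain ratio as the next tree node."""
    return max(features, key=lambda f: gain_ratio(rows, labels, f))
```

Building the tree then amounts to calling best_feature on the training flows, partitioning them by the chosen feature, and repeating on each partition until the labels in a partition agree.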
Embodiment 6
On the basis of Embodiment 4 or 5, the system further comprises a P2P traffic detection unit: after the P2P traffic has been classified, a decision tree classification and detection model is built from training sample data and used to further classify and detect the suspected P2P traffic.
The above embodiments do not limit the invention in any way; every technical solution obtained by equivalent replacement or equivalent transformation falls within the scope of protection of the invention.

Claims (6)

1. A P2P traffic detection method based on multiscale analysis and a decision tree, characterized in that it is carried out in the following steps:
Step 1: Collect data at the network layer. Configure port mirroring on a designated switch so that all upstream and downstream traffic of the test machine is copied to a data acquisition server, then sample the data with a packet-capture analysis tool and perform wavelet multiscale analysis;
Step 2: Analyze the self-similarity, long-range correlation, and short-range correlation features of the network traffic, find the traffic patterns that are shared or similar, and build a P2P network traffic feature pattern library from these features;
Step 3: Perform pattern matching against the autocorrelation, long-range correlation, and short-range correlation traffic patterns in the traffic pattern library;
Step 4: Make a combined judgment on whether the traffic is suspected P2P traffic. If it is, pass it to the transport-layer traffic detection and protocol classification stage based on the decision tree; otherwise let it through.
2. The P2P traffic detection method based on multiscale analysis and a decision tree according to claim 1, characterized in that the P2P traffic classification stage of the fourth step comprises the following steps:
a. Using the five-tuple definition of a flow, extract the bidirectional TCP/UDP traffic at the transport layer, and extract feature rules of the network data flows that are independent of port, IP address, and upper-layer protocol;
b. Based on the feature rules of the sample flows, use the C4.5 method of the decision tree model to compare the information gain ratios of the different rules, build the decision tree P2P traffic classification model, and complete the network traffic classification;
c. Classify new suspected P2P traffic with the decision tree model that has been built.
3. The P2P traffic detection method based on multiscale analysis and a decision tree according to claim 1 or 2, characterized in that, after the P2P traffic has been classified, a decision tree classification and detection model is built from training sample data and used to further classify and detect the suspected P2P traffic.
4. A P2P traffic detection system based on multiscale analysis and a decision tree, characterized in that it comprises the following units:
Network collection unit: collects data at the network layer. Port mirroring is configured on a designated switch so that all upstream and downstream traffic of the test machine is copied to a data acquisition server, and the data is sampled with a packet-capture analysis tool for wavelet multiscale analysis;
Network traffic feature library: built by analyzing the self-similarity, long-range correlation, and short-range correlation features of the network traffic, finding the traffic patterns that are shared or similar, and storing them as P2P network traffic feature patterns;
Traffic matching unit: performs pattern matching against the autocorrelation, long-range correlation, and short-range correlation traffic patterns in the traffic pattern library;
Traffic classification unit: makes a combined judgment on whether the traffic is suspected P2P traffic. If it is, the traffic enters the transport-layer traffic detection and protocol classification stage based on the decision tree; otherwise it is let through.
5. The P2P traffic detection system based on multiscale analysis and a decision tree according to claim 4, characterized in that the classification stage of the traffic classification unit comprises the following steps:
a. Using the five-tuple definition of a flow, extract the bidirectional TCP/UDP traffic at the transport layer, and extract feature rules of the network data flows that are independent of port, IP address, and upper-layer protocol;
b. Based on the feature rules of the sample flows, use the C4.5 method of the decision tree model to compare the information gain ratios of the different rules, build the decision tree P2P traffic classification model, and complete the network traffic classification;
c. Classify new suspected P2P traffic with the decision tree model that has been built.
6. The P2P traffic detection system based on multiscale analysis and a decision tree according to claim 4 or 5, characterized in that it further comprises a suspected P2P traffic detection unit: after the P2P traffic has been classified, a decision tree classification and detection model is built from training sample data and used to further classify and detect the suspected P2P traffic.
CN201310337914.2A 2013-08-06 2013-08-06 Method and system for detecting P2P (peer-to-peer) traffic based on multi-dimensional analysis and decision tree Pending CN104348741A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310337914.2A CN104348741A (en) 2013-08-06 2013-08-06 Method and system for detecting P2P (peer-to-peer) traffic based on multi-dimensional analysis and decision tree

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310337914.2A CN104348741A (en) 2013-08-06 2013-08-06 Method and system for detecting P2P (peer-to-peer) traffic based on multi-dimensional analysis and decision tree

Publications (1)

Publication Number Publication Date
CN104348741A true CN104348741A (en) 2015-02-11

Family

ID=52503576

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310337914.2A Pending CN104348741A (en) 2013-08-06 2013-08-06 Method and system for detecting P2P (peer-to-peer) traffic based on multi-dimensional analysis and decision tree

Country Status (1)

Country Link
CN (1) CN104348741A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106372670A (en) * 2016-09-06 2017-02-01 南京理工大学 Loyalty index prediction method based on improved nearest neighbor algorithm
CN106603497A (en) * 2016-11-15 2017-04-26 国家数字交换系统工程技术研究中心 Multi-granularity detection method of network space attack flow
CN108896996A (en) * 2018-05-11 2018-11-27 中南大学 A kind of Pb-Zn deposits absorbing well, absorption well water sludge interface ultrasonic echo signal classification method based on random forest
CN109768985A (en) * 2019-01-30 2019-05-17 电子科技大学 A kind of intrusion detection method based on traffic visualization and machine learning algorithm
CN110012009A (en) * 2019-04-03 2019-07-12 华南师范大学 Internet of Things intrusion detection method based on decision tree and self similarity models coupling
CN110048962A (en) * 2019-04-24 2019-07-23 广东工业大学 A kind of method of net flow assorted, system and equipment
US11425047B2 (en) 2017-12-15 2022-08-23 Huawei Technologies Co., Ltd. Traffic analysis method, common service traffic attribution method, and corresponding computer system
US11586971B2 (en) 2018-07-19 2023-02-21 Hewlett Packard Enterprise Development Lp Device identifier classification
CN117061249A (en) * 2023-10-12 2023-11-14 明阳时创(北京)科技有限公司 Intrusion monitoring method and system based on network traffic

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7058712B1 (en) * 2002-06-04 2006-06-06 Rockwell Automation Technologies, Inc. System and methodology providing flexible and distributed processing in an industrial controller environment
CN102868632A (en) * 2011-07-05 2013-01-09 句容博通科技咨询服务有限公司 P2P (peer-to-peer) traffic identification and monitoring system based on multi-dimensional vector machine
CN103078772A (en) * 2013-02-26 2013-05-01 南京理工大学常熟研究院有限公司 Depth packet inspection (DPI) sampling peer-to-peer (P2P) flow detection system based on credibility

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
郑淋等: "基于多尺度分析和决策树的P2P流量检测模型", 《电视技术》 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106372670A (en) * 2016-09-06 2017-02-01 南京理工大学 Loyalty index prediction method based on improved nearest neighbor algorithm
CN106603497A (en) * 2016-11-15 2017-04-26 国家数字交换系统工程技术研究中心 Multi-granularity detection method of network space attack flow
US11425047B2 (en) 2017-12-15 2022-08-23 Huawei Technologies Co., Ltd. Traffic analysis method, common service traffic attribution method, and corresponding computer system
CN108896996B (en) * 2018-05-11 2019-09-20 中南大学 A kind of Pb-Zn deposits absorbing well, absorption well water sludge interface ultrasonic echo signal classification method based on random forest
CN108896996A (en) * 2018-05-11 2018-11-27 中南大学 A kind of Pb-Zn deposits absorbing well, absorption well water sludge interface ultrasonic echo signal classification method based on random forest
US11586971B2 (en) 2018-07-19 2023-02-21 Hewlett Packard Enterprise Development Lp Device identifier classification
US12026597B2 (en) 2018-07-19 2024-07-02 Hewlett Packard Enterprise Development Lp Device identifier classification
CN109768985A (en) * 2019-01-30 2019-05-17 电子科技大学 A kind of intrusion detection method based on traffic visualization and machine learning algorithm
CN110012009B (en) * 2019-04-03 2021-05-28 华南师范大学 Internet of things intrusion detection method based on combination of decision tree and self-similarity model
CN110012009A (en) * 2019-04-03 2019-07-12 华南师范大学 Internet of Things intrusion detection method based on decision tree and self similarity models coupling
CN110048962A (en) * 2019-04-24 2019-07-23 广东工业大学 A kind of method of net flow assorted, system and equipment
CN117061249A (en) * 2023-10-12 2023-11-14 明阳时创(北京)科技有限公司 Intrusion monitoring method and system based on network traffic
CN117061249B (en) * 2023-10-12 2024-04-26 明阳时创(北京)科技有限公司 Intrusion monitoring method and system based on network traffic

Similar Documents

Publication Publication Date Title
CN104348741A (en) Method and system for detecting P2P (peer-to-peer) traffic based on multi-dimensional analysis and decision tree
US8797901B2 (en) Method and its devices of network TCP traffic online identification using features in the head of the data flow
CN102315974B (en) Stratification characteristic analysis-based method and apparatus thereof for on-line identification for TCP, UDP flows
CN105577679B (en) A kind of anomalous traffic detection method based on feature selecting and density peaks cluster
CN102035698B (en) HTTP tunnel detection method based on decision tree classification algorithm
CN104283897B (en) Wooden horse communication feature rapid extracting method based on multiple data stream cluster analysis
CN104244035B (en) Network video stream sorting technique based on multi-level clustering
CN102739457B (en) Network flow recognition system and method based on DPI (Deep Packet Inspection) and SVM (Support Vector Machine) technology
CN109117634A (en) Malware detection method and system based on network flow multi-view integration
CN107370752B (en) Efficient remote control Trojan detection method
CN105024993A (en) Protocol comparison method based on vector operation
CN102984269B (en) A kind of point-to-point method for recognizing flux and device
CN109151880A (en) Mobile application flow identification method based on multilayer classifier
CN104468567B (en) A kind of system and method for the identification of network multimedia Business Stream and mapping
CN107566192B (en) A kind of abnormal flow processing method and Network Management Equipment
CN104394021A (en) Network flow abnormity analysis method based on visualization clustering
CN104092588B (en) A kind of exception flow of network detection method combined based on SNMP with NetFlow
CN101841440A (en) Peer-to-peer network flow identification method based on support vector machine and deep packet inspection
CN111294342A (en) Method and system for detecting DDos attack in software defined network
CN110493235A (en) A kind of mobile terminal from malicious software synchronization detection method based on network flow characteristic
CN109413079A (en) Fast-Flux Botnet detection method and system under a kind of high speed network
CN108055227B (en) WAF unknown attack defense method based on site self-learning
CN102984131B (en) A kind of information identifying method and device
CN104657747A (en) Online game stream classifying method based on statistical characteristics
CN105871861A (en) Intrusion detection method for self-learning protocol rule

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150211