CN106953854A - A method for establishing a darknet traffic identification model based on SVM machine learning - Google Patents
- Publication number
- CN106953854A (application number CN201710156258.4A)
- Authority
- CN
- China
- Prior art keywords
- flow
- machine learning
- anonymous
- detection model
- building
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
- H04L63/1425—Traffic logging, e.g. anomaly detection
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention discloses a method for establishing a darknet traffic identification model based on SVM machine learning, comprising the following steps: building a traffic detection model based on SVM machine learning; performing machine learning on the parameters of the traffic detection model to obtain four feature values of pure anonymous traffic and pure non-anonymous traffic; and substituting the four feature values of the pure anonymous traffic and the pure non-anonymous traffic into the traffic detection model for computation to obtain the parameters of the traffic detection model. Compared with the prior art, the positive effects of the invention are as follows: the method can accurately describe a mathematical model for identifying anonymous-network data traffic; when applied to the detection of anonymous-network data traffic, it offers high detection accuracy and simple, efficient computation; and because the method uses a machine-learning-based algorithm, after an anonymous network is upgraded the new anonymous-network data traffic can be detected simply by relearning on the upgraded network.
Description
Technical field
The present invention relates to a method for establishing a darknet traffic identification model based on SVM machine learning.
Background
The analysis and control of anonymous-network (darknet) traffic, and traffic detection in particular, are still at the exploratory research stage. At present no single method can effectively detect all anonymous-network traffic; some methods may be effective only for a particular anonymous network, or even only for a particular version of it. The detection of anonymous-network traffic is therefore a long-term research topic that requires continuous follow-up work to cope with the continual upgrades of anonymous networks. The key to improving the accuracy of anonymous-network traffic detection lies in the accuracy of the traffic identification model. This method uses machine learning to establish a mathematical model for anonymous-network traffic identification that is as accurate as possible, aiming to minimize the impact of anonymous-network upgrades on detection so that anonymous-network traffic can be detected accurately.
Summary of the invention
To overcome the above shortcomings of the prior art, the invention provides a method for establishing a darknet traffic identification model based on SVM machine learning, aiming to establish a dynamically updatable and accurate mathematical model for identifying anonymous-network traffic.
The technical solution adopted by the invention to solve this technical problem is a method for establishing a darknet traffic identification model based on SVM machine learning, comprising the following steps:
Step 1: building a traffic detection model based on SVM machine learning;
Step 2: performing machine learning on the parameters of the traffic detection model to obtain four feature values of pure anonymous traffic and pure non-anonymous traffic;
Step 3: substituting the four feature values of the pure anonymous traffic and the pure non-anonymous traffic into the traffic detection model for computation to obtain the parameters of the traffic detection model.
Compared with the prior art, the positive effects of the invention are as follows:
The method can accurately describe a mathematical model for identifying anonymous-network data traffic. When applied to the detection of anonymous-network data traffic, it offers high detection accuracy and simple, efficient computation. Because the method uses a machine-learning-based algorithm, after an anonymous network is upgraded the new anonymous-network data traffic can be detected simply by relearning on the upgraded network.
Brief description of the drawings
Examples of the present invention will be described by way of reference to the accompanying drawings, wherein:
Fig. 1 is a schematic diagram of the SVM-based traffic detection model.
Embodiment
A method for establishing a darknet traffic identification model based on SVM machine learning comprises the following steps:
Step 1: Model establishment
Anonymous-network traffic detection is carried out on the basis of a mathematical model, but most current detection models are effective only for a particular anonymous network, or even only for a particular version of it. To solve this problem, cope with the continual upgrades of anonymous networks, and improve the accuracy of anonymous-network traffic detection, a new anonymous-network traffic detection model needs to be established.
In this method, the detection model is a traffic detection model based on SVM machine learning, shown in Fig. 1. In the figure, $x$ is the input feature vector and $d$ is the number of features; $x_n$ is a collected sample, which is a $d$-dimensional vector; $y_n$ is the desired output, which takes the value 1 or -1 according to whether or not the sample is the corresponding anonymous traffic. The model can be expressed equivalently as:
$y = k \cdot x + b$
where $k$ and $b$ are the parameters of the anonymous-network traffic identification model: $k$ is a $d$-dimensional weight vector and $b$ is the bias. In the machine learning stage, the values of $k$ and $b$ are computed from a large number of input pairs $x$ and $y$. Once the anonymous-network traffic identification model has been established, the traffic under test can be checked: when $y > 0$, the traffic is judged to be the corresponding anonymous traffic; when $y < 0$, it is judged not to be anonymous traffic.
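The decision rule described above can be sketched in a few lines of Python. This is only an illustration: the function names, the parameter values and the sample feature vector are hypothetical, and treating y = 0 as "not anonymous" is an assumption, since the text only covers y > 0 and y < 0.

```python
from typing import Sequence


def decision_value(k: Sequence[float], x: Sequence[float], b: float) -> float:
    """Return y = k . x + b for a d-dimensional feature vector x."""
    if len(k) != len(x):
        raise ValueError("k and x must have the same dimension d")
    return sum(kj * xj for kj, xj in zip(k, x)) + b


def is_anonymous(k: Sequence[float], x: Sequence[float], b: float) -> bool:
    """y > 0 means anonymous traffic; y = 0 is treated as not anonymous (assumption)."""
    return decision_value(k, x, b) > 0


# Hypothetical parameters and feature vector with d = 4, for illustration only.
k = [0.5, 1.0, 0.25, 0.1]
b = -2.0
x = [2.0, 1.0, 4.0, 0.0]

y = decision_value(k, x, b)
print(y, is_anonymous(k, x, b))
```

In practice k and b would come from the learning stage rather than being written by hand.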
Step 2: Parameter determination
After the traffic detection model has been selected, machine learning is needed to determine the values of its parameters. Throughout the learning process, the machine learns four features separately for the pure anonymous-network traffic of the corresponding anonymous network and for pure non-anonymous network traffic (background traffic). All collected traffic is regrouped by host profile, with each host forming one pcap file, and the parameters of the mathematical model for anonymous-network traffic identification are learned from the following four feature values: the UDP connection count, the wall-climbing (censorship-circumvention) weight, the UDP flow information entropy, and the occurrence frequency of Ping-pong similar messages in the traffic. Their definitions and computation methods are as follows:
(1) UDP connection count: the number of distinct UDP connections per unit time in each pcap file:
Count the total number K of distinct IP addresses in each Hostprofile (pcap) file, then divide K by the Hostprofile duration T to obtain the feature value;
(2) Wall-climbing (censorship-circumvention) weight: the number of resolutions of sensitive domain names, such as Amazon servers and Dynamic Networks, multiplied by their weights:
A list of sensitive DNS queries is maintained, and different domain names are assigned different weights. If a Hostprofile contains a query for a sensitive domain name, the wall-climbing weight is increased by the corresponding weight;
(3) UDP flow information entropy: the average information entropy of the UDP flows in each Hostprofile:
Compute the information entropy of each UDP flow in the Hostprofile, sum the results, and divide by the total number of UDP flows. The information entropy is defined as: (formula not reproduced in the source text)
(4) Similar-message occurrence frequency: the number of occurrences of Ping-pong similar messages:
Count the similar consecutive packets in the Hostprofile, adding 1 to the count each time consecutive packets are similar.
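The four features above can be sketched as follows. This is a minimal illustration under several assumptions that are not in the original text: pcap parsing is skipped and the inputs (IP addresses, DNS query names, UDP flow payloads, packet lengths) are assumed to be already extracted from one host profile; the entropy is assumed to be the Shannon entropy over byte values; two consecutive packets are assumed to be "similar" when their lengths differ by at most a tolerance; and the domain name and weight in the usage lines are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Sequence


def udp_connection_rate(remote_ips: Iterable[str], duration_s: float) -> float:
    """Feature 1: number of distinct IP addresses K divided by capture time T."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return len(set(remote_ips)) / duration_s


def wall_climbing_weight(dns_queries: Iterable[str], sensitive: Dict[str, float]) -> float:
    """Feature 2: add the assigned weight for every query of a sensitive domain name."""
    return sum(sensitive.get(name, 0.0) for name in dns_queries)


def shannon_entropy(payload: bytes) -> float:
    """Shannon entropy in bits per byte (assumed definition of the information entropy)."""
    if not payload:
        return 0.0
    n = len(payload)
    return -sum(c / n * math.log2(c / n) for c in Counter(payload).values())


def mean_udp_flow_entropy(flows: Sequence[bytes]) -> float:
    """Feature 3: sum of the per-flow entropies divided by the number of UDP flows."""
    if not flows:
        return 0.0
    return sum(shannon_entropy(f) for f in flows) / len(flows)


def similar_message_count(packet_lengths: Sequence[int], tolerance: int = 0) -> int:
    """Feature 4: count consecutive packets whose lengths differ by at most `tolerance` (assumed criterion)."""
    pairs = zip(packet_lengths, packet_lengths[1:])
    return sum(1 for a, b in pairs if abs(a - b) <= tolerance)


# Hypothetical inputs for a single host profile.
features = [
    udp_connection_rate(["10.0.0.1", "10.0.0.2", "10.0.0.1"], duration_s=2.0),
    wall_climbing_weight(["sensitive.example", "other.example"], {"sensitive.example": 2.0}),
    mean_udp_flow_entropy([b"\x00\x01", b"aaaa"]),
    similar_message_count([100, 100, 50, 50, 50]),
]
print(features)
```

The resulting list plays the role of one four-dimensional sample x for the model.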
Once machine learning is complete, the four feature values learned from the pure anonymous traffic and the pure non-anonymous traffic are repeatedly substituted into the anonymous-network traffic identification model for computation, finally yielding the parameters k and b of the model. Model establishment is then complete.
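The text does not say how k and b are computed from the labelled samples. One possible way, shown here purely as a sketch, is to train a linear soft-margin SVM by batch subgradient descent on the L2-regularised hinge loss. The function name, the hyperparameters `lam`, `lr` and `iterations`, and the toy data are all assumptions; the toy feature vectors are also assumed to have been standardised beforehand, which is why they contain negative values.

```python
from typing import List, Sequence, Tuple


def train_linear_svm(
    samples: Sequence[Sequence[float]],
    labels: Sequence[int],
    lam: float = 0.01,
    lr: float = 0.1,
    iterations: int = 500,
) -> Tuple[List[float], float]:
    """Fit y = k . x + b by minimising lam/2 * |k|^2 + mean hinge loss."""
    n = len(samples)
    d = len(samples[0])
    k = [0.0] * d
    b = 0.0
    for _ in range(iterations):
        grad_k = [lam * kj for kj in k]      # gradient of the regularisation term
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(kj * xj for kj, xj in zip(k, x)) + b)
            if margin < 1.0:                 # sample lies inside the margin
                for j in range(d):
                    grad_k[j] -= y * x[j] / n
                grad_b -= y / n
        k = [kj - lr * gj for kj, gj in zip(k, grad_k)]
        b -= lr * grad_b
    return k, b


# Hypothetical standardised samples: +1 = anonymous traffic, -1 = background traffic.
samples = [
    [1.0, 1.0, 1.0, 1.0],
    [2.0, 1.0, 2.0, 1.0],
    [-1.0, -1.0, -1.0, -1.0],
    [-2.0, -1.0, -2.0, -1.0],
]
labels = [1, 1, -1, -1]

k, b = train_linear_svm(samples, labels)
print(k, b)
```

Because this toy data set is symmetric about the origin, the bias stays at zero; real traffic features would generally give a nonzero b. A library solver could be used instead of this hand-written loop.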
Step 3: Model verification
A Freegate anonymous network is built, and a sufficient amount of Freegate anonymous traffic and non-Freegate background traffic is captured in that anonymous-network environment. For a given host, the four features of each kind of traffic are computed: the UDP connection count, the wall-climbing weight, the UDP flow information entropy, and the occurrence frequency of Ping-pong similar messages in the traffic. These are then substituted into the mathematical model for traffic detection, and the parameters k and b of the model are computed. The traffic detection model for that anonymous-network environment is then complete.
Using the anonymous-network traffic detection model thus built, anonymous-network traffic can be detected in real time in the Freegate anonymous-network environment. In the machine learning process, the longer the learning time and the more traffic data obtained, the more accurate the resulting traffic detection model and the more accurate the subsequent traffic detection.
Claims (5)
1. A method for establishing a darknet traffic identification model based on SVM machine learning, characterized by comprising the following steps:
Step 1: building a traffic detection model based on SVM machine learning;
Step 2: performing machine learning on the parameters of the traffic detection model to obtain four feature values of pure anonymous traffic and pure non-anonymous traffic;
Step 3: substituting the four feature values of the pure anonymous traffic and the pure non-anonymous traffic into the traffic detection model for computation to obtain the parameters of the traffic detection model.
2. The method for establishing a darknet traffic identification model based on SVM machine learning according to claim 1, characterized in that the equivalent mathematical expression of the traffic detection model is $y = k \cdot x + b$, where k and b are the parameters of the traffic detection model, k being the weight vector and b the bias.
3. The method for establishing a darknet traffic identification model based on SVM machine learning according to claim 2, characterized in that the four feature values of the pure anonymous traffic and the pure non-anonymous traffic in Step 2 are the UDP connection count, the wall-climbing weight, the UDP flow information entropy, and the similar-message occurrence frequency.
4. The method for establishing a darknet traffic identification model based on SVM machine learning according to claim 3, characterized in that the four feature values are computed as follows:
UDP connection count: the total number of distinct IP addresses in each Hostprofile file divided by the Hostprofile duration;
Wall-climbing weight: the number of resolutions of a sensitive domain name multiplied by the weight assigned to that domain name;
UDP flow information entropy: the information entropy of each UDP flow in the Hostprofile is computed and summed, and the sum is divided by the total number of UDP flows;
Similar-message occurrence frequency: the count of similar consecutive packets in the Hostprofile.
5. The method for establishing a darknet traffic identification model based on SVM machine learning according to claim 4, characterized in that, when the traffic detection model is used to check traffic under test, the traffic is judged to be the corresponding anonymous traffic if y > 0 and is judged not to be anonymous traffic if y < 0.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611157218 | 2016-12-15 | ||
CN2016111572183 | 2016-12-15 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106953854A (en) | 2017-07-14 |
CN106953854B (en) | 2019-10-18 |
Family
ID=59473479
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710156258.4A Active CN106953854B (en) | 2016-12-15 | 2017-03-16 | A method for establishing a darknet traffic identification model based on SVM machine learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106953854B (en) |
-
2017
- 2017-03-16 CN CN201710156258.4A patent/CN106953854B/en active Active
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140082725A1 (en) * | 2006-02-28 | 2014-03-20 | The Trustees Of Columbia University In The City Of New York | Systems, Methods, and Media for Outputting a Dataset Based Upon Anomaly Detection |
CN101510841A (en) * | 2008-12-31 | 2009-08-19 | 成都市华为赛门铁克科技有限公司 | Method and system for recognizing end-to-end flux |
CN101695035A (en) * | 2009-10-21 | 2010-04-14 | 成都市华为赛门铁克科技有限公司 | Flow rate identification method and device thereof |
CN102984131A (en) * | 2012-11-09 | 2013-03-20 | 华为技术有限公司 | Information recognition method and device |
CN104052639A (en) * | 2014-07-02 | 2014-09-17 | 山东大学 | Real-time multi-application network flow identification method based on support vector machine |
CN105471883A (en) * | 2015-12-10 | 2016-04-06 | 中国电子科技集团公司第三十研究所 | Tor network tracing system and tracing method based on web injection |
CN105721242A (en) * | 2016-01-26 | 2016-06-29 | 国家信息技术安全研究中心 | Information entropy-based encrypted traffic identification method |
Non-Patent Citations (5)
Title |
---|
ZHONGLIU ZHOU等: "("A multi-granularity heuristic-combining approach for censorship circumvention activity identification"", 《SECURITY AND COMMUNICATION NETWORKS》 * |
潘吴斌等: ""网络加密流量识别研究综述及展望"", 《通信学报》 * |
陈周国等: ""僵尸网络分析及其防御"", 《信息安全与通信保密》 * |
陈周国等: ""匿名网络追踪溯源综述"", 《计算机研究与发展》 * |
陈周国等: ""网络攻击追踪溯源层次分析"", 《计算机系统应用》 * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108933846A (en) * | 2018-06-21 | 2018-12-04 | 北京谷安天下科技有限公司 | A kind of recognition methods, device and the electronic equipment of general parsing domain name |
CN108933846B (en) * | 2018-06-21 | 2021-08-27 | 北京谷安天下科技有限公司 | Method and device for identifying domain name by pan-resolution and electronic equipment |
KR102129375B1 (en) * | 2019-11-01 | 2020-07-02 | (주)에이아이딥 | Deep running model based tor site active fingerprinting system and method thereof |
CN111224940A (en) * | 2019-11-15 | 2020-06-02 | 中国科学院信息工程研究所 | Anonymous service traffic correlation identification method and system nested in encrypted tunnel |
CN111224940B (en) * | 2019-11-15 | 2021-03-09 | 中国科学院信息工程研究所 | Anonymous service traffic correlation identification method and system nested in encrypted tunnel |
CN112887291A (en) * | 2021-01-20 | 2021-06-01 | 中国科学院计算技术研究所 | I2P traffic identification method and system based on deep learning |
CN113938290A (en) * | 2021-09-03 | 2022-01-14 | 华中科技大学 | Website de-anonymization method and system for user side traffic data analysis |
CN115001861A (en) * | 2022-07-20 | 2022-09-02 | 中国电子科技集团公司第三十研究所 | Method and system for detecting abnormal services of hidden network based on mixed fingerprint characteristics |
Also Published As
Publication number | Publication date |
---|---|
CN106953854B (en) | 2019-10-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||